Biological Magnetic Resonance Volume 17
Structure Computation and Dynamics in Protein NMR
A Continuation Order Plan is available for this series. A continuation order will bring delivery of each new volume immediately upon publication. Volumes are billed only upon actual shipment. For further information please contact the publisher.
Edited by
N. Rama Krishna, University of Alabama at Birmingham, Birmingham, Alabama
and
Lawrence J. Berliner, Ohio State University, Columbus, Ohio
KLUWER ACADEMIC PUBLISHERS NEW YORK, BOSTON, DORDRECHT, LONDON, MOSCOW
eBook ISBN: 0-306-47084-5
Print ISBN: 0-306-45953-1
©2002 Kluwer Academic Publishers. All rights reserved.
Automated Analysis of Resonance Assignments
Gaetano T. Montelione et al.
5. EXPERIMENTS FOR AUTOMATED ANALYSIS OF BACKBONE RESONANCE ASSIGNMENTS
The general kinds of NMR data required for AUTOASSIGN execution are shown schematically in Fig. 2. The required data can be obtained using various implementations of triple-resonance experiments that are available on the worldwide web from several academic laboratories, or from commercial NMR spectrometer vendors. In particular, the AUTOASSIGN program does not consider whether a particular pulse sequence is implemented in the “out-and-back” fashion or in the “straight-through” fashion, nor whether pulsed-field gradients have been used for water suppression. Similarly, it is not relevant whether the magnetization is detected on amide protons or on aliphatic protons, so long as the nuclei that are frequency labeled include those indicated in each schematic experiment of Fig. 2. In our experience, the specific implementations of triple-resonance experiments described below, which are in use in our laboratory, are especially suitable for generating input to the AUTOASSIGN program.
The following sections describe our current “standard” versions of these triple-resonance experiments, together with practical issues associated with their execution. All of the experiments are carried out using heteronuclear coherence selection with pulsed-field gradients (PFGs) for solvent suppression, and sensitivity enhancement by collection of both the x- and y-components that develop during the frequency-evolution period (Kay et al., 1992). In these sensitivity-enhanced experiments, the delay is set to a value between 100 and
depending on the gradient recovery properties of the specific NMR probe used in the data collection. In this regard, it is crucial to select probe and gradient amplifier hardware with excellent PFG phase recovery properties. Selective C´ or Cα decoupling indicated during specific periods of these pulse sequences is achieved using a sinc waveform with an MLEV supercycle (S. D. Emerson and G. T. Montelione, unpublished results). These pulse sequences have been implemented on Varian Inova 500 and 600 NMR spectrometers. The corresponding C source codes, parameter files, and waveforms, together with macros for automatically optimizing values of key coherence transfer delays for different protein NMR samples, are available over the worldwide web at http://www-nmr.cabm.rutgers.edu/.
5.1. HSQC
Our implementation of ¹H–¹⁵N-correlated PFG-HSQC (Kay et al., 1992; Li and Montelione, 1993) is shown schematically in Fig. 6. The single adjustable coherence transfer delay, is tuned to values slightly smaller than by arraying its value and evaluating the intensity of the first spectrum of the 2D array. In providing input to AUTOASSIGN, two 2D HSQC spectra are collected with different sweepwidths in the ¹⁵N dimension: the larger sweepwidth is sufficient to include all side-chain Arg and Lys frequencies; the smaller sweepwidth is adjusted to provide folding of these same Arg and Lys resonances. The remaining 3D spectra are collected with exactly the same ¹H and ¹⁵N sweepwidths and corresponding digital resolutions used in this second HSQC spectrum. The program uses these two HSQC spectra to identify folded Arg and Lys cross peaks in the spectrum recorded with smaller sweepwidth and in all the remaining 3D triple-resonance spectra.
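The folding behaviour that the two HSQC spectra are used to detect can be sketched in a few lines. This is only an illustration and not AUTOASSIGN's own procedure: it assumes simple wrap-around aliasing (as with complex, States-type sampling in the indirect dimension), and the carrier position, sweepwidth, and ¹⁵N shifts are hypothetical round numbers chosen for the example.

```python
def aliased_ppm(true_ppm, carrier_ppm, sw_ppm):
    """Apparent shift after wrap-around aliasing into a window of width
    sw_ppm centred on carrier_ppm (assumed aliasing model)."""
    lower_edge = carrier_ppm - sw_ppm / 2.0
    return lower_edge + (true_ppm - lower_edge) % sw_ppm


def flag_folded(peaks_ppm, carrier_ppm, narrow_sw_ppm, tol_ppm=0.05):
    """Predict where each peak of the wide spectrum appears in the narrow
    spectrum; a peak whose position moves is reported as folded."""
    result = []
    for shift in peaks_ppm:
        apparent = aliased_ppm(shift, carrier_ppm, narrow_sw_ppm)
        result.append((shift, apparent, abs(apparent - shift) > tol_ppm))
    return result


# Hypothetical 15N positions (ppm) picked in the wide-sweepwidth spectrum:
# two backbone amides, then one Arg and one Lys side-chain nitrogen.
wide_peaks = [120.5, 108.2, 85.0, 33.0]
for true, apparent, folded in flag_folded(wide_peaks, 118.0, 30.0):
    print(f"{true:6.1f} ppm -> {apparent:6.1f} ppm  folded={folded}")
```

With other acquisition schemes (for example TPPI) peaks fold by reflection rather than by wrap-around, so the position model would have to be changed accordingly.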
5.2. HNCO
Our implementation of PFG–HNCO (based on Muhandiram and Kay, 1994) is shown schematically in Fig. 7. The delay is set as described above, and the adjustable coherence transfer delay is tuned using the value optimized in the HSQC experiment (above). The coherence transfer delay is adjusted to exactly the value corresponding to a null for coherence transfer within NH₂ groups, as shown in Fig. 8. This choice provides significant (though not necessarily perfect) suppression of cross peaks from side-chain amide NH₂ groups of Asn and Gln, which otherwise are quite strong in the spectrum. The key coherence transfer function specific to this PFG–HNCO is shown for a typical effective coherence relaxation rate in Fig. 9. For proteins in the 7–20 kD range, typical optimal values range from 8.0 to 15.0 ms.
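The way relaxation shortens the optimal transfer delay can be illustrated with a generic damped-sine model. This is a minimal sketch under stated assumptions, not the function plotted in Fig. 9: the transfer is modelled as sin(πJt)·exp(−Rt) for a single active coupling, the 15 Hz coupling is an assumed typical one-bond ¹⁵N–¹³C´ value, and the relaxation times are hypothetical. Here t is the total transfer period; how it maps onto the delay named in the pulse sequence depends on the sequence and is not assumed.

```python
import math


def transfer_amplitude(t, j_hz, rate):
    """Damped-sine transfer model: sin(pi*J*t) * exp(-R*t)."""
    return math.sin(math.pi * j_hz * t) * math.exp(-rate * t)


def optimal_transfer_time(j_hz, rate):
    """Time of the first maximum, from the condition tan(pi*J*t) = pi*J/R."""
    omega = math.pi * j_hz
    if rate <= 0.0:
        return 1.0 / (2.0 * j_hz)  # no relaxation: full transfer at 1/(2J)
    return math.atan(omega / rate) / omega


J_NCO = 15.0  # Hz, assumed typical one-bond 15N-13C' coupling
for t2_ms in (20.0, 50.0, 100.0):  # hypothetical effective relaxation times
    rate = 1000.0 / t2_ms  # relaxation rate in s^-1
    t_opt = optimal_transfer_time(J_NCO, rate)
    amp = transfer_amplitude(t_opt, J_NCO, rate)
    print(f"T2 = {t2_ms:5.1f} ms: optimum {1e3 * t_opt:5.1f} ms, amplitude {amp:4.2f}")
```

The closed form follows from setting the derivative of the model to zero; faster relaxation moves the optimum to shorter times and lowers the attainable amplitude.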
5.3. HN(CA)CO
The current implementation of AUTOASSIGN can use HN(CA)CO spectral data to identify intraresidue C´ resonance frequencies. For simpler systems, these data are not necessary, but for more complex systems it has been found to be very valuable to have some intraresidue C´ frequency information (Zimmerman et al., 1997). When these data are available, they can be matched to sequential C´ resonance frequency data derived from HNCO experiments to resolve ambiguities in establishing links between GS objects. It should be noted that unless the protein sample is deuterated in the Hα position, the HN(CA)CO experiment is relatively insensitive and generally provides intraresidue C´ frequency information for only a subset of amino acid spin systems. Nonetheless, when these data are available they greatly enhance the progress of the AUTOASSIGN process.
Our current implementation of PFG–HN(CA)CO (based on Clubb et al., 1992) is shown schematically in Fig. 10. The delay is set as described above, the adjustable coherence transfer delay is set using the value optimized in the HSQC experiment (above), and the adjustable coherence transfer delay is set exactly as in the HNCO experiment, so as to suppress pathways involving side-chain NH₂ groups. The two key coherence transfer functions specific to PFG–HN(CA)CO are shown for a typical effective coherence relaxation rate in Figs. 11 and 12, respectively. During the first of these coherence transfer delays, magnetization transfer occurs via both the active one-bond ¹J(N–Cα) and two-bond ²J(N–Cα) coupling constants, giving rise to intraresidue and sequential cross peaks, respectively. AUTOASSIGN uses these sequential cross peaks to align the HN(CA)CO spectrum with HNCO data, and uses the intraresidue cross peaks to fill out the CA-ladders of GSs. Optimum values of this delay are 8.0–15.0 ms. Typically the intraresidue cross peaks are more intense than sequential cross peaks (Fig. 11), though differential relaxation effects sometimes result in sequential correlations that are more intense than the corresponding intraresidue correlations.
The coherence transfer function for the second delay is shown in Fig. 12. During this period, magnetization transfer via the active coupling constant is modulated by the passive couplings, particularly the Cα–Cβ scalar coupling interaction. Thus, the coherence transfer functions are significantly different for Gly and non-Gly residues (Fig. 12). For non-Gly residues, optimum values of this delay are typically 1.5–3.0 ms.
5.4. HNCA
Figure 13 shows the implementation of PFG–HNCA (based on Muhandiram and Kay, 1994) in current use for AUTOASSIGN analysis. The corresponding delays are set as described above for the PFG–HNCO experiment. The key coherence transfer function specific to PFG–HNCA is shown for a typical effective coherence relaxation rate in Fig. 14. During this period, magnetization transfer occurs via both the active one-bond ¹J(N–Cα) and two-bond ²J(N–Cα) coupling constants, giving rise to intraresidue and sequential cross peaks, respectively. As in the case of the HN(CA)CO experiment, AUTOASSIGN uses these sequential cross peaks to align the HNCA spectrum with CA(CO)NH data, and uses the intraresidue cross peaks to fill out the CA-ladders of GSs. Optimum values of this delay are typically 6.0–14.0 ms. As in the case of HN(CA)CO spectra, the intraresidue cross peaks in HNCA spectra are typically (though not always) more intense than sequential cross peaks (Fig. 14).
5.5. HACA(CO)NH
Figures 15 and 16 show the implementations of 3D PFG–HACA(CO)NH experiments [based on 4D experiments originally described by Boucher et al., 1992] in current use for AUTOASSIGN analysis. The two figures show related versions of the experiment that differ in which nucleus is frequency labeled in the indirect dimension (Feng et al., 1996). Conveniently, the optimum values of delays are identical for these two experiments, so that once these delays are optimized for one of the pair, the user can run both experiments with essentially the same parameter sets (except for the changes required to switch the nucleus that is detected indirectly). The corresponding delays are set as described above for the PFG–HNCO experiment, and one delay is optimized for intraresidue magnetization transfer via the active coupling constant (Fig. 11). A further delay is used for the refocusing of antiphase magnetization. Its coherence transfer function is similar to that of the curve shown for Gly cross peaks in Fig. 12; i.e., the optimum value is a bit longer than the corresponding delay in the HN(CA)CO experiment. This is because, unlike the non-Gly pathway of the HN(CA)CO experiment, the relevant magnetization of the HACA(CO)NH experiment is not modulated by the passive Cα–Cβ coupling constant during the delay.
Of special consideration in the HACA(CO)NH experiments (and in all of the experiments below in which coherence transfer pathways begin on aliphatic protons) are the coherence transfer functions associated with two further delays (Fig. 17 and Fig. 18). The optimal value of the first delay is determined by the crossover point for the coherence transfer curves for CH (non-Gly) and CH₂ (Gly) groups (Fig. 17). The optimal values of the second delay, representing a combination of the effects of active and passive couplings, range from 2.0 to 4.0 ms, depending on the effective relaxation rates. As described below, these delays can also be tuned to provide for C–H and C–C spin-topology editing (Feng et al., 1996; Rios et al., 1996).
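The crossover between CH and CH₂ curves mentioned above can be sketched with the standard refocused-INEPT multiplicity dependence (Morris, 1980), in which a CHₘ carbon gives a signal proportional to m·sin(θ)·cos^(m−1)(θ), with θ = πJτ and τ the total refocusing period. This is a hedged illustration, not the exact function plotted in Fig. 17: relaxation is ignored, the 140 Hz one-bond C–H coupling is an assumed typical value, and the chapter's own delay definition may differ from τ by a constant factor.

```python
import math


def refocused_signal(tau, j_ch_hz, m):
    """Relative in-phase carbon signal of a CH_m group after a refocusing
    period tau (textbook refocused-INEPT form, relaxation ignored)."""
    theta = math.pi * j_ch_hz * tau
    return m * math.sin(theta) * math.cos(theta) ** (m - 1)


J_CH = 140.0  # Hz, assumed typical one-bond C-H coupling

# theta = 60 degrees: CH and CH2 curves cross with the same sign ("nonphase").
tau_nonphase = 1.0 / (3.0 * J_CH)
# theta = 120 degrees: equal magnitude, opposite sign ("phase").
tau_phase = 2.0 / (3.0 * J_CH)

for label, tau in (("nonphase", tau_nonphase), ("phase", tau_phase)):
    ch = refocused_signal(tau, J_CH, 1)   # methine, non-Gly
    ch2 = refocused_signal(tau, J_CH, 2)  # methylene, Gly
    print(f"{label:8s} tau = {1e3 * tau:4.2f} ms  CH = {ch:+.3f}  CH2 = {ch2:+.3f}")
```

Because a uniform relaxation factor multiplies both curves equally, it changes the amplitudes at these points but not their positions, which is consistent with the behaviour described in the text.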
5.6. HACANH
Figure 19 shows the implementation of the 3D PFG–HA(CA)NH experiment [based on experiments originally described by Montelione and Wagner (1989)] in current use for AUTOASSIGN analysis, with frequency labeling of Hα during the evolution period. The related 3D (HA)CANH experiment (Montelione and Wagner, 1989), providing intraresidue information, can be run in place of the HNCA experiment, though the latter generally exhibits better sensitivity. With the exception of one delay, all of the delays in HACANH are optimized exactly as for HACA(CO)NH: they are set as described for the PFG–HNCO and HACA(CO)NH experiments, with one delay optimized for intraresidue magnetization transfer via the active coupling constant (Fig. 11). Coherence transfer curves corresponding to intraresidue and sequential cross peaks are shown in Fig. 20. These curves are different for Gly (or Gly-X) cross peaks than for non-Gly (or non-Gly-X) cross peaks because of the modulating effects of passive Cα–Cβ coupling on magnetization in non-Gly pathways during this period. AUTOASSIGN uses the intraresidue peaks in the HACANH spectrum to identify resonances of CA-LDDRS, and uses the sequential peaks to help align the HACANH spectra with the corresponding HACA(CO)NH data. Accordingly, data collection should be carried out so as to optimize both the intraresidue and sequential cross peaks, and typical optimum values of the delay are 3.0–5.0 ms (Fig. 20). As described below, the delays can also be tuned to provide for C–C and C–H spin-topology editing.
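The compromise between intraresidue and sequential cross peaks can be explored with a simple product-operator model. The sketch below is an illustration under stated assumptions rather than the functions plotted in Fig. 20: Cα magnetization is taken to evolve for a time t under one-bond (about 11 Hz) and two-bond (about 7 Hz) couplings to ¹⁵N, with a passive Cα–Cβ coupling (about 35 Hz) for non-Gly residues and a uniform 10 ms relaxation time. All coupling values are assumed typical numbers, and t need not coincide with the delay as defined in the pulse sequence.

```python
import math


def cn_amplitudes(t, n_cc=1, t2=0.010, j1_hz=11.0, j2_hz=7.0, j_cc_hz=35.0):
    """Return (intraresidue, sequential) transfer amplitudes after time t.

    n_cc is the number of passive one-bond C-C couplings (0 for Gly,
    1 for non-Gly in this simplified model)."""
    a1 = math.pi * j1_hz * t
    a2 = math.pi * j2_hz * t
    common = math.cos(math.pi * j_cc_hz * t) ** n_cc * math.exp(-t / t2)
    intra = math.sin(a1) * math.cos(a2) * common
    seq = math.cos(a1) * math.sin(a2) * common
    return intra, seq


def compromise_time(n_cc=1, t_max=0.030, step=0.0001):
    """Grid search for the time that maximizes the weaker of the two peaks."""
    best_t, best_val = 0.0, 0.0
    for k in range(1, int(round(t_max / step)) + 1):
        t = k * step
        val = min(cn_amplitudes(t, n_cc))
        if val > best_val:
            best_t, best_val = t, val
    return best_t


for name, n_cc in (("non-Gly", 1), ("Gly", 0)):
    t_best = compromise_time(n_cc)
    intra, seq = cn_amplitudes(t_best, n_cc)
    print(f"{name:8s} t = {1e3 * t_best:4.1f} ms  intra = {intra:.3f}  seq = {seq:.3f}")
```

Maximizing the weaker peak is one possible criterion; a weighted sum of the two amplitudes would be an equally reasonable choice.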
5.7. C–C and C–H Phase Information in HACA(CO)NH and HACANH Experiments
Much of the recent work on triple-resonance pulse sequence development has focused on obtaining information useful for classifying amino acid spin system types (Montelione et al., 1992; Bax and Grzesiek, 1993; Lyons and Montelione, 1993; Wittekind and Mueller, 1993; Yamazaki et al., 1993, 1995; Olejniczak and Fesik, 1994; Gehring and Guittet, 1995; Grzesiek and Bax, 1995; Tashiro et al., 1995; Dötsch and Wagner, 1996; Dötsch et al., 1996a, 1996b; Farmer and Venters, 1996; Feng et al., 1996; Rios et al., 1996). This information is extremely valuable for determining resonance assignments, especially when combined with characteristic chemical-shift data in automated assignment programs like AUTOASSIGN (Friedrichs et al., 1994; Meadows et al., 1994; Zimmerman and Montelione, 1995; Zimmerman et al., 1994, 1997).
Specific information about spin system topologies can be obtained by appropriate tuning of scalar coupling effects. For example, constant-time frequency-evolution periods commonly used in triple-resonance experiments for homonuclear decoupling are generally designed to combine frequency evolution and coherence defocusing–refocusing periods. During these coherence defocusing–refocusing periods, magnetization oscillates differently according to the spin system topology and the set of active and passive scalar couplings. In uniformly ¹³C-enriched molecules, proper tuning of these delay times can provide resonance phase information (i.e., positive or negative peak intensities) which depends on the number of coupled ¹³C nuclei (Santoro and King, 1992; Grzesiek and Bax, 1993; Wittekind and Mueller, 1993; Tashiro et al., 1995; Dötsch and Wagner, 1996; Dötsch et al., 1996b; Feng et al., 1996; Rios et al., 1996). We refer to these as “C–C type phase experiments.” Alternatively, proper tuning of the time period used for refocusing (or defocusing) antiphase carbon magnetization into (or from) in-phase carbon magnetization can provide resonance phase information which depends on the number of coupled ¹H nuclei (Morris, 1980; Gehring and Guittet, 1995; Dötsch et al., 1996a; Feng et al., 1996; Rios et al., 1996). We refer to these as “C–H type phase experiments.” Such phase experiments can be used to identify spin system topologies characteristic of different amino acid residue types (see, for example, Grzesiek and Bax, 1993; Tashiro et al., 1995; Feng et al., 1996; Rios et al., 1996).
While there are several varieties of side-chain spin system types, there are only two kinds of Cα carbons in a polypeptide chain composed of the 20 common naturally occurring amino acid residues, i.e., methylene Gly Cα with no directly coupled Cβ atoms and methine non-Gly Cα with a single directly coupled Cβ atom. Magnetization pathways involving Gly resonances can therefore be distinguished with either C–C type or C–H type phase information. In backbone 2D HN(CO)CA (Gehring and Guittet, 1995) and 3D HN(CA)HA–Gly (Wittekind et al., 1993) pulse sequences, selection of Gly resonances and suppression of non-Gly resonances is obtained on the basis of their different C–H coupling topologies. Unfortunately, these experiments provide only intraresidue Gly or sequential Gly-X peaks (all the other correlations are “nulled”), and are generally carried out in addition to experiments tuned for identifying intraresidue and sequential connectivity information for the remaining non-Gly spin systems.
In our efforts to develop automated methods for determining resonance assignments in proteins, we have found that it is convenient to incorporate Gly “phase labeling” directly into the standard HACA(CO)NH (Figs. 15 and 16) and HACANH (Fig. 19) experiments used for establishing intraresidue and/or sequential connectivities. In these pulse sequences, two delay periods (Fig. 17, and Figs. 18 and 20) can be adjusted to provide C–H or C–C phase information, respectively. To illustrate this point, we describe the transfer functions of the relevant pulse sequence fragments. For all of the experiments outlined in Figs. 15, 16, and 19, the transfer function for in-phase carbon magnetization after the refocusing delay is multiplied by the following term describing the refocusing of antiphase magnetization by scalar coupling:
where the coupling constant is that of the one-bond C–H coupling and m is the number of protons directly bonded to the carbon. This transfer function is identical for the HACANH and HACA(CO)NH experiments (Fig. 17). Appropriate tuning of this delay can thus be used to discriminate magnetization beginning on Gly (m = 2) and non-Gly (m = 1) nuclei based on C–H phase information. Similar considerations can be used to describe the C–C phase effects of coupling during the constant-time evolution periods (Figs. 18 and 20). For HACANH and HACA(CO)NH experiments, the transfer function is multiplied by the following terms describing the effects of scalar coupling during the constant-time period
where and are the and coupling constants, respectively, and n is the number of carbon atoms with active one-bond coupling to the Cα atom. Equations (2) and (3) describe the intraresidue and sequential transfer pathways in HACANH, respectively, while Eq. (4) describes the sequential transfer pathway in HACA(CO)NH. These transfer functions are plotted in Fig. 18 [Eq. (4)] and Fig. 20 [Eqs. (2) and (3)], respectively, for Gly and non-Gly cross peaks. Tuning of the constant-time delay can thus be used to discriminate magnetization beginning on Gly (n = 0) and non-Gly (n = 1) nuclei based on C–C phase information.
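The sign discrimination between Gly (n = 0) and non-Gly (n = 1) pathways can be illustrated with the passive-coupling factor alone. This is a minimal sketch, not a reconstruction of Eqs. (2) to (4): it keeps only a cos^n(π·J·T) term for passive one-bond C–C coupling during a constant-time period T, takes 35 Hz as an assumed typical Cα–Cβ coupling, and ignores the active couplings and relaxation.

```python
import math


def cc_phase_factor(t_const, n_cc, j_cc_hz=35.0):
    """Passive C-C modulation cos^n(pi*J*T) for a constant-time period T.

    n_cc = 0 for Gly (no C-beta), n_cc = 1 for non-Gly C-alpha."""
    return math.cos(math.pi * j_cc_hz * t_const) ** n_cc


J_CC = 35.0  # Hz, assumed typical one-bond C-alpha/C-beta coupling
for t_ms in (4.0, 14.3, 28.6):  # hypothetical constant-time periods
    t = t_ms / 1000.0
    gly = cc_phase_factor(t, 0, J_CC)
    non_gly = cc_phase_factor(t, 1, J_CC)
    print(f"T = {t_ms:4.1f} ms  Gly = {gly:+.2f}  non-Gly = {non_gly:+.2f}")
```

In this model the two classes have the same sign for short periods, the non-Gly term passes through a null near T = 1/(2J), and the signs are opposite near T = 1/J, which is the basis of the C–C phase labeling described above.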
In the case of C–H phase tuning (Fig. 17), the signal modulation is dominated by the one-bond C–H coupling constant during the refocusing delay. Optimal values of this delay for maximizing both methine and methylene magnetizations differ for nonphase spectra and for phase spectra (see discussion in Feng et al., 1996). Computer simulations of this transfer function carried out with effective uniform relaxation times (assumed to be identical for Gly and non-Gly residues) from 2 to 50 ms indicate that the positions of these optimal values are independent of relaxation (Feng et al., 1996), although of course the amplitude of the transfer function at these optimal values becomes smaller as the relaxation time becomes shorter.
In the case of C–C phase tuning (Figs. 18 and 20), the highest-frequency modulation during the constant-time period for non-Gly spin systems is due to the Cα–Cβ coupling constant. With a uniform relaxation time, the optimal values of the constant-time delay differ between the nonphase and phase versions of the HACANH experiment, and likewise for the HACA(CO)NH experiments. For both experiments, the positions of these optima (both phase and nonphase) are relatively independent of relaxation for effective uniform relaxation times as small as 15 ms, but shift to smaller values with shorter relaxation times (Feng et al., 1996).
The coherence transfer function plots in Figs. 17, 18, and 20 were simulated assuming a uniform relaxation time of 10 ms during the entire delay periods. This value is based on our experience with uniformly enriched proteins in the 7–14 kD range. Under these conditions, good signal-to-noise spectra can be obtained using nonphase, C–C phase, or C–H phase delay tunings in proton and carbon versions of the 3D PFG–HACANH and PFG–HACA(CO)NH experiments (Feng et al., 1996). The signal is modulated during these periods by the product of the appropriate C–H and C–C coherence transfer functions. This product is generally smaller for the HACANH experiment than for the HACA(CO)NH experiment; i.e., the HACA(CO)NH experiments are generally more sensitive.
Comparisons of these transfer functions provide a prediction of the best method (C–C or C–H) for obtaining backbone phase information in these two experiments. These predictions of the simulations have also been verified by experimental measurements (Feng et al., 1996). With a uniform relaxation rate, the HACANH experiments yield better signal-to-noise (S/N) ratios using C–C phase methods (i.e., a long constant-time delay), while the HACA(CO)NH experiments yield better S/N ratios using C–H phase methods (i.e., a long refocusing delay). Specifically, relative to nonphase spectra, the HACANH experiments generally provide better signal-to-noise ratios using C–C Gly phase labeling than when using C–H phase labeling. Indeed, as is evident from the curves in Fig. 20, in HACANH the signal-to-noise ratios can be better for Gly phase versions than for the nonphase version (Feng et al., 1996). This is because the longer delay values used for C–C Gly phase labeling also allow more complete transfer by the relatively small Cα–N coupling constant, resulting in better S/N ratios even in the presence of significant relaxation rates (Fig. 20). In larger proteins, the effective relaxation rate during this period is faster, and C–H phase labeling may be preferable. On the other hand, relative to nonphase spectra, the HACA(CO)NH experiments generally exhibit better signal-to-noise ratios using C–H Gly phase labeling than with C–C phase labeling. Since the Cα–C´ coupling constant is fairly large, the transfer is quite efficient, and the long periods required for obtaining C–C phase information cost signal-to-noise due to relaxation effects without much enhancement of the coherence transfer.
While not essential for the function of the AUTOASSIGN program, such “Gly phase” information can be interpreted by the program and, when available, can greatly enhance its performance.
5.8. CBCA(CO)NH
Figure 21 shows the version of the 3D PFG–CBCA(CO)NH experiment [based on experiments originally described by Grzesiek and Bax (1992a)] in current use for AUTOASSIGN analysis, with frequency labeling of Cα and Cβ during the evolution period. This experiment correlates the Cα and Cβ resonances of residue i with the amide resonances of residue i + 1 by transferring magnetization through the intervening carbonyl group. Our philosophy in setting up this experiment, and in setting up the related 3D PFG–CBCANH experiment (described below), is to select coherence transfer delays so as to optimize the intensities of cross peaks involving Cβ resonances, rather than, for example, compromising the choice of delay values to optimize intensities of both Cα and Cβ cross peaks. The resulting subset of weaker Cα cross peaks in these spectra are used to align these data with (HA)CA(CO)NH spectra, and then the Cβ resonance frequency information is used to complete the “rungs” of CO-LDDRS. The corresponding delays are set as described above for the PFG–HNCO experiment, and one delay is optimized for intraresidue magnetization transfer via the active coupling constant (Fig. 11). For nonphase versions of the experiment, the corresponding delays are set as described above for the HACA(CO)NH and HACANH experiments.
Coherence transfer functions for the two critical delays in the CBCA(CO)NH experiment are shown in Figs. 22 and 23, respectively. These delays must be optimized to maximize the transfer. Their optimum values cannot be determined by arraying them and comparing the signal intensity in the first 1D spectrum of the 3D data set, as the Cα pathways also contribute significantly to this signal. Instead, the optimal values must be determined by comparing spectra recorded with different values of these delays. For nonphase versions of the experiment, typical values are given in Table 3.
5.9. CBCANH
Figure 24 shows our implementation of 3D PFG–CBCANH [based on experiments originally described by Grzesiek and Bax (1992b)] in current use for AUTOASSIGN analysis, with frequency labeling of Cα and Cβ during the evolution period. This experiment correlates the Cα and Cβ resonances of residue i with the amide resonances of residue i and of residue i + 1. As with CBCA(CO)NH, our philosophy in setting up this CBCANH experiment is to select coherence transfer delays so as to optimize the intensity of cross peaks involving Cβ resonances, rather than compromising the delay values to optimize intensities of Cα and Cβ cross peaks. The subset of Cα cross peaks in these spectra are used to align these data with HNCA spectra, and then the Cβ resonance frequency information is used to complete the “rungs” of CA-LDDRS. The corresponding delays are set as in the PFG–HNCO experiment, one delay is optimized for intraresidue magnetization transfer via the active coupling constant (Fig. 11), and for nonphase versions of the experiment the remaining shared delays are set as described above for the HACA(CO)NH and HACANH experiments. The coherence transfer function for the first of the two critical delays is essentially identical to that of the CBCA(CO)NH experiment (Fig. 22). The coherence transfer function for the second critical delay in the CBCANH experiment is shown in Fig. 25. As for CBCA(CO)NH, optimal values of these two delays must be evaluated by comparing spectra recorded with different values of these delays. For nonphase versions of the experiment, typical values are given in Table 3.
5.10. C–C and C–H Phase Information in CBCA(CO)NH and CBCANH Experiments
Figures 17 and 22 clearly show that C–H and C–C phase versions of CBCA(CO)NH and CBCANH can be obtained using longer values of the corresponding delays (Rios et al., 1996). Such information is very valuable for distinguishing spin system types (Rios et al., 1996) and can greatly enhance the automated analysis process. However, although this represents an important future extension, AUTOASSIGN has not yet been designed to take advantage of this type of phase information.
It is also possible to distinguish Cα from Cβ (and Gly Cα) cross peaks in CBCA(CO)NH by using longer values of the relevant delay (Fig. 23), or by using suitable delay values in CBCANH (Fig. 25). With these settings, passive coupling effects during the refocusing of antiphase coherence will result in intensities for non-Gly Cα cross peaks with opposite signs relative to Gly Cα or all Cβ cross peaks, providing a means for distinguishing non-Gly Cα cross peaks (Grzesiek and Bax, 1992a, 1992b). However, such phase labeling of cross peaks costs significant sensitivity (see coherence transfer curves in Figs. 23 and 25). Our current philosophy has been to tune all delays in these CBCA(CO)NH and CBCANH experiments so as to maximize the intensities of the Cβ cross peaks without phase information, and to distinguish Cα from Cβ cross peaks by comparing the CBCA(CO)NH and CBCANH data with (HA)CA(CO)NH and HNCA spectra, respectively. Although this strategy is somewhat redundant, it has the advantage of tuning delays so as to provide the most sensitivity in these generally less sensitive CBCA(CO)NH and CBCANH spectra. This approach is particularly important for collecting CBCANH data, as this critical experiment is one of the least sensitive of the entire set that is normally collected.
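The comparison strategy described above, in which Cα and Cβ cross peaks are told apart by reference to a spectrum that contains only Cα peaks, can be sketched as a simple tolerance match. This is a hypothetical illustration and not the algorithm implemented in AUTOASSIGN: the function name, the 0.3 ppm tolerance, and the example shifts are all assumptions made for the example.

```python
def label_ca_cb(cbca_shifts_ppm, ca_shifts_ppm, tol_ppm=0.3):
    """Label each 13C shift observed at one amide root in a CBCA-type
    spectrum as 'CA' if it matches a shift seen at the same root in the
    CA-only spectrum, otherwise as 'CB'."""
    labels = []
    for shift in cbca_shifts_ppm:
        is_ca = any(abs(shift - ca) <= tol_ppm for ca in ca_shifts_ppm)
        labels.append((shift, "CA" if is_ca else "CB"))
    return labels


# Hypothetical peaks at a single (HN, 15N) root:
cbca_co_nh = [56.2, 30.1]   # from CBCA(CO)NH: C-alpha and C-beta of residue i-1
ca_co_nh = [56.3]           # from (HA)CA(CO)NH: C-alpha of residue i-1 only
print(label_ca_cb(cbca_co_nh, ca_co_nh))
```

A Gly residue would simply yield a single matched peak and no Cβ entry; real data would also need handling of missing or overlapped peaks, which is omitted here.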
6. FUTURE DEVELOPMENTS
The specific implementations of triple-resonance experiments described here together with the AUTOASSIGN software provide a robust and efficient process for determining backbone C, N, and H resonance assignments in proteins with molecular weights Future developments focus on extending the automated analysis process to include complete assignments of side-chain aliphatic and aromatic resonances and integration of these assignments in automated processes for analysis of NOESY spectra and 3D structure generation calculations. In regard to side-chain assignments, it is anticipated that phase-type spectra that provide specific information about local C–H and C–C topologies will be especially amenable to automated analysis (Tashiro et al., 1995; Feng et al., 1996; Rios et al., 1996).
NMR analysis has tremendous potential for analyzing the many gene products that are being identified in the various genomic sequencing projects. In considering “high throughput” analysis of protein structures from NMR data, another key issue is the total time required for data collection. The current process requires some 5 to 14 days of data collection to provide the necessary input for automated analysis of backbone resonance assignments. At least this much additional time will be required to collect the data needed to complete the side-chain assignments and NOESY data sets needed for structure generation calculations. Recent work in our laboratory has demonstrated that the signal-to-noise ratios of some of the triple-resonance spectra described here can be significantly enhanced, in some cases by more than a factor of 2, by replacing single quantum coherence states that evolve during specific points in the pulse sequences with multiple-quantum heteronuclear coherence states that exhibit better relaxation properties (Swapna et al., 1997; Shang et al., 1997). Efforts are in progress to evaluate the value of these multiple-quantum versions of the triple-resonance experiments described here in the overall efficiency of the assignment process. An even more efficient approach would be to carry out complete assignments of the carbon and nitrogen skeleton using ²H-decoupled triple-resonance experiments (Grzesiek et al., 1993; Yamazaki et al., 1994; Farmer and Venters, 1995, 1996) on fully (or partially) perdeuterated enriched protein samples, and then adding aliphatic H atom assignments to these using various kinds of correlation experiments (with appropriate isotope shift corrections). While originally designed for addressing assignment problems in larger proteins, the improved relaxation properties of perdeuterated proteins make them ideal for rapid collection of triple-resonance spectra of smaller proteins as well. Efforts along these lines are currently in progress in our laboratory.
ACKNOWLEDGMENTS. We thank Rebecca Klein for her expert assistance in scientific editing. This work was supported by grants from the National Institutes of Health (GM-47014 and GM-50733), the National Science Foundation (MCB9407569), a National Science Foundation Young Investigator Award (MCB9357526), and by a New Jersey Commission on Science and Technology Research Excellence Award.
REFERENCES Bartels, C., Billeter, M., Guntert, P., and Wüthrich, K., 1996, J. Biomol. NMR 7: 207–213. Bax, A., and Grzesiek, S., 1993, Accts. Chem. Res. 26: 131–138. Billeter, M., Braun, W., and Wüthrich, K., 1982, 7. Mol. Biol. 155: 321–346. Boucher, W, Laue, E. D., Campbell-Burk, S. L., and Domaille, P. J., 1992, J. Biomol. NMR 2: 631–637.
Chien, C.-Y, Tejero, R., Huang, Y, Zimmerman, D. E., Rios, C. B., Krug, R. M., and Montelione, G. T., 1997, Nature Struct. Biol. 4: 891–895. Clowes, R. T., Crawford, A., Raine, A. R. C., Smith, B. O., and Laue, E. D., 1995, Curr. Opin. Biotech. 6:81–88. Clubb, R. T., Thanabal, V., and Wagner, G., 1992, J. Magn. Reson. 97: 213–217. Delaglio, F., Grzesiek, S.; Vuister, G. W., Zhu, G., Pfeifer, J., and Bax, A., 1995, J. Biomol. NMR 6: 277–293. Dötsch, V., Oswald, R. E., and Wagner, G., 1996a, J. Magn. Reson Ser. B 110: 304–308. Dötsch, V., Oswald, R. E., and Wagner, G., 1996b, J. Magn. Reson Ser. B 110: 107–111. Dötsch, V., and Wagner, G., 1996, J. Magn. Reson Ser. B 111: 310–313. Farmer, B. T. III, and Venters, R. A., 1995, J. Am. Chem. Soc. 117: 4187–4188. Farmer, B. T. III, and Venters, R. A., 1996, J. Biomol. NMR 7: 59–71. Feng, W., Rios, C. B., and Montelione, G. T., 1996, J. Biomol. NMR 8: 98–104. Feng, W., Tejero, R., Zimmerman, D. E., Inouye, M., and Montelione, G. T., 1998, Biochemistry 37: 10881–10896. Friedrichs, M. S., Mueller, L., and Wittekind, M., 1994, J. Biomol. NMR 4: 703–726. Garrett, D. S., Powers, R., Gronenborn, A. M., and Clore, G. M., 1991, J. Magn. Reson. 95: 214–220. Gehring, K., and Guittet, E., 1995, J. Magn. Reson. Ser. B 109: 206–208. Grzesiek, S., Anglister, J., Ren, H., and Bax, A., 1993, J. Am. Chem. Soc. 115: 4369–4370. Grzesiek, S., and Bax, A., 1992a, J. Am. Chem. Soc. 114: 6291–6293. Grzesiek, S., and Bax, A., 1992b, J. Magn. Reson. 99: 201–207. Grzesiek, S., and Bax, A., 1993, J. Biomol. NMR 3: 185–204. Grzesiek, S., and Bax, A., 1995, J. Biomol. NMR 6: 335–339. Hare, B. J., and Prestegard, J. H., 1994, 7. Biomol. NMR 4: 35–46. Ikura, M., Kay, L. E., and Bax, A., 1990, Biochemistry 29: 4659–4667. Kay, L. E., Ikura, M., Tschudin, R., and Bax, A., 1990, J. Magn. Reson. 89: 496–514. Kay, L. E., Keifer, P., and Saarinen, T., 1992, J. Am. Chem. Soc. 114: 10663–10665. Kumar, V., 1992, Artif. Intell. Mag. Spring:32–44. Laity, J. 
H., Lester, C., Shimotakahara, S., Zimmerman, D. E., Scheraga, H. A., and Montelione, G. T., 1997, Biochemistry 36: 12683–12699.
Li, Y. C., and Montelione, G. T., 1993, J. Magn. Reson. Ser. B 101: 315–319. Lyons, B. A., and Montelione, G. T., 1993, J. Magn. Reson. Ser. B 101: 206–209. Lyons, B. A., Tashiro, M., Cedergren, L., Nilsson, B., and Montelione, G. T., 1993, Biochemistry 32: 7839–7845. Macworth, A. K., 1977, Artif. Intell. 8: 99–118. Marion, D., Ikura, M., Tschudin, R., and Bax, A., 1989, J. Magn. Reson. 85: 393–399. Meadows, R. P., Olejniczak, E. T., and Fesik, S.W., 1994, J. Biomol. NMR 4: 79–96. Mittard, V, Morelle, N., Brutscher, B., Simorre, J.-P., Marion, D., Stein, M., Jacquot, J.-P, Lirsac, P.-N., and Lancelin, J. -M., 1995, Eur. J. Biochem. 229: 473–485. Montelione, G. T., Lyons, B. A., Emerson, S. D., and Tashiro, M., 1992, J. Am. Chem. Soc. 114: 10,974–10,975. Montelione, G. T., and Wagner, G., 1989, J. Am. Chem. Soc. 1 1 1 : 5474–5475. Montelione, G. T., and Wagner, G., 1990, J. Magn. Reson. 87: 183–188. Morelle, N., Brutscher, B., Simorre, J. P., and Marion, D., 1995, J. Biomol. NMR 5: 154–160. Morris, G., 1980, J. Am. Chem. Soc. 102: 428–429. Moy, F. J., Seddon, A. P., Campbell, E. B., Böhlen, P., and Powers, R., 1995, J. Biomol. NMR 6: 245–254. Muhandiram, D. R., and Kay, L. E., 1994, J. Magn. Reson. Ser. B. 103: 203–216. Nagayama, K., 1986, J. Magn. Reson. 69: 508–510. Newkirk, K., Feng, W., Jiang, W., Tejero, R., Emerson, S. D., Inouye, M., and Montelione, G. T., 1994, Proc. Natl. Acad. Sci. U. S. A. 91: 5114–5118. Olejniczak, E. T., and Fesik, S.W., 1994, J. Am. Chem. Soc. 116: 2215–2216. Olson, J. B., Jr., and Markley, J. L., 1994, J. Biomol. NMR 4: 385–410. Rico, M., Bruix, M. Santoro, J., Gonzalez, C., Neira, J. L., Nieto, J. L., and Herranz, J., 1989, Eur. J. Biochem. 183: 623–638. Rios, C. B., Feng, W., Tashiro, M., Shang, Z., and Montelione, G. T, 1996, J. Biomol. NMR 8: 345–350. Robertson, A.D., Purisima, E. O., Eastman, M. A., and Scheraga, H. A., 1989, Biochemistry 28: 5930–5938. Santoro, J., Gonzalez, C., Bruix, M., Neira, J. L., Nieto, J. 
L., Herranz, J., and Rico, M., 1993, J. Mol. Biol. 229: 722–734. Santoro, J., and King, G. C., 1992, J. Magn. Reson. 97: 202–207. Shang, Z., Swapna, G. V. T., Rios, C. B., and Montelione, G. T, 1997, J. Am. Chem. Soc. 119:9274–9278. Shimotakahara, S., Rios, C. B., Laity, J. H., Zimmerman, D. E., Scheraga, H. A., and Montelione, G. T., 1997, Biochemistry 36: 6915–6929. Simorre, J.-P., Brutscher, B., Caffrey, M. S., and Marion, D., 1994, J. Biomol. NMR 4: 325–333. Swapna, G. V. T., Rios, C. B., Shang, Z., and Montelione, G. T, 1997, J. Biomol. NMR 9: 105–111. Szyperski, T., Wider, G., Bushweller, J. H., and Wüthrich, K., 1993, J. Am. Chem. Soc. 115: 9307–9308. Tashiro, M., Rios, C. B., and Montelione, G. T., 1995, J. Biomol. NMR 6: 211–216. Tashiro, M., Tejero, R., Zimmerman, D. E., Celda, B., Nilsson, B., and Montelione, G. T, 1997, J. Mol. Biol. 272: 573–590. Wagner, G., and Wüthrich, K., 1982, J. Mol. Biol. 155: 347–366. Wishart, D. S., Bigam, C. G., Yao, J., Abildgaard, F., Dyson, H. J., Oldfield, E., Markley, J. L., and Sykes, B. D., 1995, J. Biomol. NMR 6: 135–140. Wittekind, M., Metzler, W. J., and Mueller, L., 1993, J. Magn. Reson. Ser. B 101: 214–217. Wittekind, M., and Mueller, L., 1993, J. Magn. Reson. Ser. B 101: 201–205. Wlodawer, A., Svensson, L. A., Sjölin, L., and Gilliland, G. L., 1988, Biochemistry 27: 2705–2717. Wüthrich, K., 1986, NMR of Proteins and Nucleic Acids, Wiley, New York. Yamazaki, T, Forman-Kay, J. D., and Kay, L. E., 1993, J. Am. Chem. Soc. 115: 11,054–11,055. Yamazaki, T., Lee, W., Revington, M., Mattiello, D. L., Dahlquist, F. W., Arrowsmith, C. H., Kay, L. E., 1994, J. Am. Chem. Soc. 116, 6464–6465.
Yamazaki, T., Pascal, S. M., Singer, A. U., Forman-Kay, J. D., and Kay, L. E., 1995, J. Am. Chem. Soc. 117: 3556–3564. Zimmerman, D. E., Kulikowski, C. A., and Montelione, G. T., 1993, Proc. 1st Int’l Conf. Intell. Syst. Mol. Biol. 1: 447–455.
Zimmerman, D. E., Kulikowski, C. A., Wang, L. L., Lyons, B. A., and Montelione, G. T., 1994, J. Biomol. NMR 4:241–256. Zimmerman, D. E., Kulikowski, C. A., Feng, W., Tashiro, M., Powers, R., and Montelione, G. T., 1997, J. Mol. Biol. 269: 592–610. Zimmerman, D. E. and Montelione, G. T., 1995, Curr. Opin. Struct. Biol. 5: 664–673.
4
Calculation of Symmetric Oligomer Structures from NMR Data
Seán I. O’Donoghue and Michael Nilges

Seán I. O’Donoghue and Michael Nilges • European Molecular Biology Laboratory, D-69012 Heidelberg, Germany. Biological Magnetic Resonance, Volume 17: Structure Computation and Dynamics in Protein NMR, edited by Krishna and Berliner. Kluwer Academic / Plenum Publishers, New York, 1999.

1. SUMMARY

The size range of proteins amenable to NMR spectroscopy has extended to the point where protein oligomer structures are now being routinely determined. Many are symmetric; we found 36 symmetric oligomers solved by NMR in the present protein structure database: 32 dimers, 2 tetramers, 1 pentamer, and 1 hexamer. Hence, we anticipate that an increasing number of symmetric oligomer structures will be studied in the future. Since symmetry-related nuclei have degenerate chemical shift, the resonance assignment problem for symmetric oligomers is simplified compared with asymmetric molecules of similar size. However, the NOESY assignment and structure calculation are much more difficult, mainly due to difficulty in distinguishing among intra-, inter-, and comonomer (mixed) NOE signals. For dimers, this difficulty can be overcome using asymmetric labeling, but ambiguity remains for higher-order oligomers. In this chapter we focus on a calculation method, called the symmetry-ADR method, that we have developed for overcoming these difficulties. The main features of the method are the use of special restraints to specify the oligomeric symmetry, the use of ambiguous distance restraints (ADRs) to represent the ambiguous NOEs, and the use of novel annealing protocols for the structure calculation. We discuss in detail several structure calculations we have made with this method. We also briefly review the structure calculation methods used in all of the symmetric oligomers solved by NMR to date; the majority have been solved by using aspects of the symmetry-ADR method. We conclude that the symmetry-ADR method has proven to be useful and capable of producing accurate structures. However, our experience cautions us that the calculation of symmetric oligomers by NMR remains challenging, particularly for higher-order oligomers.
2. INTRODUCTION
We discuss the different classes of symmetry that can occur in protein oligomers, how symmetry complicates NMR structure determination, and why experimental methods for breaking symmetry cannot fully address the problem.
2.1. Symmetry in Macromolecular Aggregates

When identical macromolecules aggregate symmetrically, the most favorable
intermolecular contact surface is completely buried from the solvent. Hence, symmetric aggregates are usually energetically more favorable than asymmetric aggregates, in which some of the favorable contact surface must be exposed to the solvent. For in vivo protein complexes, this implies that evolution will tend toward symmetric arrangements. Thus, most protein aggregates of identical subunits we
observe are symmetric. Three different classes of symmetry can occur in macromolecular aggregates: space group, linear group, or point group symmetries. In each case, the aggregate is comprised of identical macromolecular subunits all related by geometrical transformations, which satisfy the requirements of a group, as defined in mathematical group theory. Space group symmetry occurs in crystals and is defined by rotation and translation operators, and a specification of the unit cell. In vivo, proteins rarely aggregate into crystals. More common is linear group symmetry, which occurs in protein fibers, viruses, and in filamentous phages; this symmetry is defined by rotation and translation operators. Due to the high molecular weight of protein crystals and fibers, it is generally not possible to determine
their structures at atomic resolution by NMR, although some structural properties
can be determined by solid-state NMR techniques (e.g., Facelli and Grant, 1993; Phillips et al., 1991). Most symmetric protein aggregates in vivo have point group symmetry, which
forms symmetric oligomers. Many of these oligomers are within the size range amenable to NMR structure determination. A point group is defined by specifying
one or more symmetry axes and the rotation operators that relate the monomers arranged around each axis. For example, point group 2 indicates two identical monomers related by a twofold (180°) rotation around one symmetry axis; point group 3 indicates three identical monomers related by a threefold (120°) rotation
around one symmetry axis; point group 22 (equivalently, 222) indicates a dimer of dimers, i.e., two identical dimers (of point group 2) related by a twofold rotation around another twofold symmetry axis; point group 32 indicates a trimer of dimers, i.e., three identical symmetric dimers related by a threefold symmetry axis (see Fig. 1). For symmetric macromolecular aggregates, only the following point groups are possible: n, n2, 23, 432, 532, where n is any positive integer (Weyl, 1952). In the present PDB* (protein structure database; Bernstein et al., 1977), all these are represented (Schirmer, 1978).

Some protein oligomers are quasi-symmetric; i.e., the monomers are chemically identical, but each has a slightly different conformation. Quasi-symmetries can be reliably detected by X-ray crystallography. However, in NMR spectroscopy, quasi-symmetries can only be detected when the distinct conformations are long-lived compared with the NMR time scale; when each monomer exchanges rapidly
between the different conformations, only one average conformation will be seen in the NMR spectra, and the molecule will incorrectly appear to be completely symmetric. An example of a quasi-symmetry that cannot be detected by NMR occurs in the central asparagine residue of some leucine zipper homodimers: in one crystal structure of the GCN4 leucine zipper, the molecule is entirely symmetric except for the side chain of N16 (PDB code 1tza; O’Shea et al., 1991). In another crystal
structure, the entire structure has symmetric electron density, but the density for N16 cannot be fitted by a single conformation, indicating that the side chain is disordered (Konig and Richmond, 1993). This residue occurs at the dimer interface and interacts with its symmetry mate on the other monomer. A symmetric conformation of the two side chains would lead to steric overlap; hence the side chains exchange between two asymmetric conformations. In contrast, NMR studies of GCN4 and other closely related leucine zippers show only one set of resonances for this residue (Junius et al., 1996; Atkinson et al., 1991; Saudek et al., 1991; Oas
et al., 1990), indicating that the exchange must occur rapidly on the NMR time scale. The fast exchange could be confirmed by hydrogen-exchange and relaxation measurements (MacKay et al., 1996; King, 1996; Junius et al., 1995).

Other oligomers are pseudosymmetric; i.e., the monomers are chemically distinct but are arranged nearly symmetrically. Pseudosymmetric oligomers are fairly common; the structures of several have already been determined by NMR. When the proteins have distinct sequences, pseudosymmetry is generally no problem for NMR. However, as the sequence similarity increases, there will be more chemical-shift degeneracy, and the situation approaches true symmetry; hence, determining the structure by NMR becomes more complicated.

*For the meanings of acronyms and symbols used, see the symbols list preceding the references.
2.2. The Problem: Symmetry Degeneracy in NMR Spectra

In NMR spectra of symmetric oligomers, all symmetry-related nuclei have equivalent magnetic environments and therefore are degenerate in chemical shift. Thus, only one monomer is “seen” in the spectra. We refer to this degeneracy as “symmetry degeneracy,” to distinguish it from the more familiar “dispersion degeneracy” that also occurs in asymmetric systems. To determine the number of monomers in a symmetric oligomer generally requires an independent technique, such as sedimentation equilibrium studies or chemical cross-linking.
Symmetry degeneracy greatly simplifies the resonance assignment problem since we only have to assign one monomer. Consequently, in the homonuclear case,
it is feasible to assign symmetric oligomers that are much larger than the present limit for asymmetric structures. As an extreme example, Flynn et al. (1977) recently reported the partial assignment of a symmetric oligomer of 11 monomers, total molecular weight of 91 kDa. Unfortunately, NOESY assignment and structure calculation of symmetric structures are considerably more complicated than for asymmetric structures. The central problem is that it is impossible to distinguish if an NOE cross peak is intramonomer, intermonomer, or comonomer. For trimers and higher-order oligomers there are several different classes of intermonomer NOEs that occur; again, it is impossible to distinguish these classes in symmetry-degenerate NMR spectra (Fig. 2). Hence, with traditional calculation methods, which require explicit assignment of all NOEs, no structure can be determined a priori. A related problem is the reduced number of NOE cross peaks compared to an equivalent-sized asymmetric system; in theory, this can be compensated by decreasing the degrees of freedom searched during the structure calculation, specifying the coordinates for only one monomer together with the symmetry axes. In practice, implementing this approach complicates the structure calculation. There is an additional complication when the point group symmetry is not clear. For example, a tetramer may have either point group 222 or 4, and from the NMR spectra it would not be possible to distinguish between these possibilities; hence we would not know which symmetry to apply during the structure calculation. The point group can sometimes be inferred from stability studies; e.g., for the p53 tetramer, the dimer was observed to be more stable, which suggested point group 222, rather than 4 (Lee et al., 1994).
2.3. Reducing Symmetry Degeneracy with Asymmetric Labeling

Experimentally, symmetry degeneracy is a distinct and more fundamental problem than dispersion degeneracy. While dispersion degeneracy can be reduced
using higher field strengths, better acquisition, or better sample conditions, symmetry degeneracy cannot. Several experimental approaches have been proposed to break the symmetry by mixing labeled with unlabeled monomers (“asymmetric labeling”). These experiments can specifically identify which NOEs are intermonomer by analyzing difference spectra in the case of ²H labeling (Arrowsmith et al., 1991) or using X-filtered spectroscopy for ¹³C and ¹⁵N labeling (Folmer et al., 1995a; Folkers et al., 1993). However, these approaches have several limitations. First, it is sometimes difficult to achieve full mixing of labeled and unlabeled monomers in the oligomer. Second, the difference spectra and X-filtered experiments have reduced signal-to-noise and may have strong artifacts; this may be improved through careful design of the experiment (Folmer et al., 1995a), but interpreting these spectra still requires a great deal of caution as the artifacts can lead to serious errors in the final structure
(Clore et al., 1995). The lower signal-to-noise may result in comonomer NOEs being incorrectly assigned as purely intramonomer; when there are very few purely
intermonomer NOEs, the structure determination is considerably more complicated. A final limitation is that the method cannot distinguish between the different classes of intermonomer NOEs that occur in trimers and higher-order oligomers; it can only distinguish between intermonomer and intramonomer NOEs. This is usually sufficient to enable the monomer structure to be calculated, and for dimers it is generally sufficient to enable complete structure determination. However, for higher-order oligomers the data remain highly ambiguous; determining the structure requires a special calculation method.
3. THE SYMMETRY-ADR CALCULATION METHOD
In this section we describe the symmetry-ADR method we have developed for calculating symmetric oligomer structures. The method has three main features: the use of symmetry restraint terms to enforce correct symmetry; the use of ambiguous distance restraints to describe the ambiguity in the NOEs arising from symmetry
degeneracy; and finally, the use of specific annealing protocols to actually run the structure calculation.
3.1. Symmetry Restraint Terms

Throughout the structure calculation, each monomer is represented by a
separate set of coordinates; the symmetry is enforced using two restraint terms. The first term forces the monomers to be (very nearly) identical and uses the NCS (“non-crystallographic symmetry”) restraint option in X-PLOR (Brünger, 1993). This restrains each atom to the average position over all monomers, using the following potential:

$$E_{\mathrm{NCS}} = \sum_{m=1}^{M} \sum_{a=1}^{A} \left| \mathbf{x}_{am} - \bar{\mathbf{x}}_{a} \right|^{2} \qquad (1)$$

where $\mathbf{x}_{am}$ are the Cartesian coordinates of the $a$th atom on the $m$th monomer after superimposing onto the first monomer, $\bar{\mathbf{x}}_{a}$ are the averages of the superimposed coordinates, $A$ is the total number of atoms in one monomer, and $M$ is the number of monomers.

The second restraint term ensures a symmetric arrangement of the identical
monomers using distance symmetry (DSYM) restraints (O’Donoghue et al., 1996; Nilges, 1993). In this potential, we specify some number, $S$, of atom pairs, $i$ and $j$, chosen from one monomer; then, considering all symmetry-related atoms, we restrain all equivalent intermonomer distances to be equal. Which distances need to be included depends on the point group. For a dimer, we need only restrain two intermonomer distances for each pair of atoms; i.e., the following difference should be zero:

$$\Delta = d_{i_1 j_2} - d_{j_1 i_2} \qquad (2)$$

where $d_{i_1 j_2}$ indicates the distance between the $i$th atom on the first monomer and the $j$th atom on the second monomer (similarly for $d_{j_1 i_2}$). For higher-order oligomers, several different intermonomer distances need to be restrained for each pair of atoms. Table 1 shows the equivalent distance pairs for all oligomers up to a hexamer. When the point group is not known (e.g., for a tetramer the point group could be either 4 or 222), it would be necessary to do a separate series of structure calculations for each possible point group; the correct point group should give the lowest-energy structures.

The equivalent distance pairs are restrained using the following “soft-square” potential that switches from an initially square-well potential to asymptotic behavior for large deviations:

$$E_{\mathrm{DSYM}} = \sum_{s=1}^{S} \begin{cases} \Delta_s^{2}, & |\Delta_s| \le \sigma \\ a + b\,|\Delta_s|^{-1} + c\,|\Delta_s|, & |\Delta_s| > \sigma \end{cases} \qquad (3)$$

where $\Delta_s$ is the difference between equivalent intermonomer distances for the $s$th atom pair, $a$ and $b$ are determined by the requirement that the function is continuous and differentiable at the switching distance $\sigma$, and $c$ is the slope of the asymptote. Since the NCS restraint term keeps the monomers identical, only a small number of atom pairs need to be restrained. We use one pair of atoms per residue in the monomer, with the $i$ atoms set systematically in each residue and the $j$ atoms chosen at random.

A more efficient method would be to have only one set of monomer coordinates; however, this would require explicitly defining the symmetry axes at the beginning of the calculation. Currently, this approach is not implemented in X-PLOR. The advantage of our approach (separate coordinates for each monomer
and enforcing symmetry using the two terms above) is that the symmetry axes do not need to be defined; they evolve implicitly during the structure calculation, driven by the NOE data.
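
The two symmetry terms can be illustrated with a minimal Python sketch. This is not the X-PLOR implementation: the function names `ncs_energy` and `dsym_difference` are hypothetical, the monomers are assumed to be already superimposed for the NCS-style term, no weights are applied, and only the dimer case of the distance-symmetry difference is shown.

```python
import math

def ncs_energy(monomers):
    """Sum of squared deviations of each atom from its average position.

    `monomers` is a list of M coordinate lists, each holding A (x, y, z)
    tuples, assumed to be already superimposed onto the first monomer.
    """
    n_mono = len(monomers)
    n_atoms = len(monomers[0])
    energy = 0.0
    for a in range(n_atoms):
        # Average position of atom a over all monomers.
        mean = [sum(mono[a][k] for mono in monomers) / n_mono for k in range(3)]
        for mono in monomers:
            energy += sum((mono[a][k] - mean[k]) ** 2 for k in range(3))
    return energy

def dsym_difference(mono1, mono2, i, j):
    """Dimer distance difference d(i1, j2) - d(j1, i2); zero for exact symmetry."""
    return math.dist(mono1[i], mono2[j]) - math.dist(mono1[j], mono2[i])

# Toy dimer (hypothetical coordinates): the second monomer is the first one
# rotated by 180 degrees about the z axis.
mono_a = [(1.0, 0.5, 0.0), (2.0, -1.0, 1.5), (0.5, 2.0, -1.0)]
mono_b = [(-x, -y, z) for (x, y, z) in mono_a]
```

For the toy dimer, `dsym_difference(mono_a, mono_b, 0, 1)` vanishes because the two copies are related by an exact twofold rotation; displacing an atom in one copy makes the difference nonzero, which is what the restraint penalizes.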
3.2. Ambiguous Distance Restraints (ADRs)

The second major problem of symmetric oligomers is how to treat ambiguous NOEs that arise from symmetry degeneracy. Our approach is to use the same
formalism as for ambiguous NOEs arising from dispersion degeneracy in asymmetric systems (Nilges and O’Donoghue, 1998; Nilges, 1997, 1996, 1995), simply extending the summation to include intermonomer contributions (O’Donoghue et al., 1993; Nilges, 1993). This approach is described in detail below.

For the $n$th NOE cross peak of volume $V_n$, upper and lower distance bounds, $U_n$ and $L_n$, are calculated as follows:

$$U_n = d_{\mathrm{ref}} \left( \frac{V_n}{V_{\mathrm{ref}}} \right)^{-1/6} + \Delta^{+} \qquad (4)$$

$$L_n = d_{\mathrm{ref}} \left( \frac{V_n}{V_{\mathrm{ref}}} \right)^{-1/6} - \Delta^{-} \qquad (5)$$

where $d_{\mathrm{ref}}$ and $V_{\mathrm{ref}}$ are the distance and volume of a known reference; $\Delta^{+}$ and $\Delta^{-}$ are error estimates on the upper and lower limit bounds, respectively. The cross peak between a pair of methylene protons is often used as the reference. However, this is not the best choice as this distance is very short and fixed, whereas most of the structurally important NOEs are longer and hence are differently affected by spin diffusion and internal dynamics. A better reference is to define $d_{\mathrm{ref}}$ as the average of the characteristic backbone–backbone distances within assigned secondary structure elements, and $V_{\mathrm{ref}}$ as the arithmetic average of the corresponding NOE volumes. Once an initial set of structures has been calculated, for subsequent iterations $d_{\mathrm{ref}}$ and $V_{\mathrm{ref}}$ can be defined using all proton pairs less than, say, 6 Å apart (Nilges et al., 1997). Several definitions for $\Delta^{+}$ and $\Delta^{-}$ have been tried; empirically, we have found to be a good starting point for asymmetric structures. In practice, it is very important to ensure that the error estimates are set correctly when using our protocols. When the error bounds are too generous, the restraints do not discriminate between the correct structure and many other possibilities; the restraints may even be satisfied in a monomer. In these cases, the calculation gives an ensemble of structures with high RMSD. With error bounds that are too tight, the calculation may converge to an incorrect conformation with high energy; unfortunately, it is not easy to define what constitutes “high” energy, since it depends on the data quality and on the force field used.

For each NOE peak, we apply one ambiguous distance restraint (ADR) for each monomer in the oligomer. For the $m$th restraint from the $n$th NOE, we restrain the following “$d^{-6}$-summed distance”:

$$\bar{D}_{nm} = \left( \sum_{i} \sum_{m'=1}^{M} \sum_{j} d_{i_m j_{m'}}^{-6} \right)^{-1/6} \qquad (6)$$

where the sum over $i$ is over all dispersion-degenerate protons from the $m$th monomer, on the F1 axis; the sum over $j$ is over all dispersion-degenerate protons from the $m'$th monomer, on the F2 axis; $M$ is the number of monomers in the oligomer. Hence, the resulting distance restraint set (DRS) has $N \times M$ restraints, where $N$ is the number of NOE cross peaks. In some of our previous work (Folkers et al., 1994; Nilges, 1993), we used one restraint per NOE; this requires another summation over $m$, and division by $M$ (in X-PLOR this can be done automatically by setting the monomer parameter to M). However, we now suggest using a separate restraint for each monomer (i.e., M restraints per NOE with the monomer parameter set to 1) since it is then easier to include data from asymmetric labeling experiments: if we know that a peak is not intramonomer, the intramonomer contribution to Eq. (6) is not calculated (i.e., the sum over $m'$ excludes $m' = m$).

During refinement, the model structure is constrained to satisfy the NOE data in the DRS by restraining the distances $\bar{D}_{nm}$ to be within the corresponding upper and lower bounds, $U_n$ and $L_n$, using the “soft” potential function (Nilges et al., 1988b), which switches between flat, square, and asymptotic behavior:

$$E_{\mathrm{ADR}} = \sum_{n=1}^{N} \sum_{m=1}^{M} \begin{cases} (L_n - \bar{D}_{nm})^{2}, & \bar{D}_{nm} < L_n \\ 0, & L_n \le \bar{D}_{nm} \le U_n \\ (\bar{D}_{nm} - U_n)^{2}, & U_n < \bar{D}_{nm} \le U_n + \sigma \\ a + b\,(\bar{D}_{nm} - U_n)^{-1} + c\,(\bar{D}_{nm} - U_n), & \bar{D}_{nm} > U_n + \sigma \end{cases} \qquad (7)$$

where $a$ and $b$ are determined as described for Eq. (3), $\sigma$ is usually set to 1, and the slope of the asymptote, $c$, is usually set to 2. NOEs which are unambiguous are defined similarly (i.e., M restraints for each peak) but are treated as a separate restraint term, $E_{\mathrm{unambig}}$ (in X-PLOR, this is done by defining two NOE classes); $E_{\mathrm{unambig}}$ is usually weighted more strongly than $E_{\mathrm{ADR}}$, and the more stringent square-well potential may be used [effectively setting $\sigma$ to infinity in Eq. (7)].
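
The restraint pipeline of this section (calibrated bounds, the summed distance, and the soft potential) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the X-PLOR code: all function and parameter names are hypothetical, the asymptotic branch is assumed to have the form a + b/dev + c·dev, and the defaults `switch=1.0` and `slope=2.0` simply mirror the values quoted above as usual settings.

```python
import math

def distance_bounds(volume, ref_volume, ref_distance, err_upper, err_lower):
    """Calibrate a peak volume against a reference, then apply error estimates."""
    d = ref_distance * (volume / ref_volume) ** (-1.0 / 6.0)
    return d - err_lower, d + err_upper  # (lower bound, upper bound)

def adr_distance(protons_f1, monomers_f2, skip_monomer=None):
    """d^-6-summed distance from the F1 protons of one monomer to the F2
    protons of every monomer; `skip_monomer` drops one monomer from the sum
    (e.g. the intramonomer term when a peak is known to be intermonomer)."""
    total = 0.0
    for m2, protons_f2 in enumerate(monomers_f2):
        if m2 == skip_monomer:
            continue
        for p in protons_f1:
            for q in protons_f2:
                total += math.dist(p, q) ** -6
    return total ** (-1.0 / 6.0)

def soft_energy(dist, lower, upper, switch=1.0, slope=2.0):
    """Flat between the bounds, square just outside, linear asymptote beyond."""
    if dist < lower:
        return (lower - dist) ** 2
    if dist <= upper:
        return 0.0
    dev = dist - upper
    if dev <= switch:
        return dev ** 2
    # a and b make the value and first derivative continuous at dev == switch.
    b = (slope - 2.0 * switch) * switch ** 2
    a = 3.0 * switch ** 2 - 2.0 * slope * switch
    return a + b / dev + slope * dev
```

Because the sum runs over d⁻⁶, the shortest contributing distance dominates: two contributions at 3 Å give a summed distance of about 2.67 Å, only slightly shorter than either one alone.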
3.3. Annealing Protocols
Having defined the symmetry and ADR information, we need a method to search conformational space to find conformations that satisfy these experimental restraints. The annealing protocols we use are derived from standard “molecular dynamical simulated annealing” (MDSA) protocols developed for asymmetric structures with unambiguous distance restraints (Nilges et al., 1988a, 1988c), and
use essentially the same simplified force field (Nilges et al., 1988b). However, we have developed several modified protocols specifically for calculating symmetric oligomers (Table 2). In this section, we discuss the three main protocols.
3.3.1. Ab initio Protocols
The first protocol developed for symmetric oligomers, called MDSA-SO-RPP, was designed to begin with no assumed knowledge of the monomer structure (Nilges, 1993). The protocol begins by generating a monomer structure with a random chain (i.e., random φ, ψ angles); the other monomers are then created with exactly the same coordinates. Thus, the initial conformation trivially satisfies both of the symmetry terms. A later protocol, called MDSA-SO-RXYZ, begins with random Cartesian coordinates, and hence uses a very different weighting scheme to vary the force-field parameters. In both protocols, the calculation is done in three phases. The first phase is a high-temperature conformational search in which nonbonded interactions between atoms are greatly reduced by calculating only interactions between Cα atoms using the repel potential (Nilges et al., 1988b) with a slightly increased radius. The weights on the NCS, DSYM, and covalent geometry terms are also reduced. This allows the structure the necessary freedom to move toward a low-energy conformation. The monomers quickly separate from the initial coincident position. In the second phase, the temperature of the system is slowly
cooled, and the weights on the nonbonded, symmetry, and NOE terms are simultaneously increased. Nonbonded interactions are calculated between all atoms, switching to a smaller radius. In the final phase, the energy of the structure is minimized using weights of 1.0 for all energy terms.

In later versions of these protocols, we have tried starting structures in which the monomers are placed in the correct symmetry by rotations, keeping the center of mass for each monomer at the origin. Particularly for the random Cartesian coordinate structures, this starting orientation is completely unbiased in the initial implicit intra- and intermonomer assignments. This may be of particular advantage for solving oligomers in which the monomers are intricately interwoven.
3.3.2. Beginning with a Known Monomer

We have also developed a variation of the above protocol (called MDSA-SO-WDMR) for the case where a reasonably accurate monomer structure can be calculated before the complete oligomer structure is known (O’Donoghue et al., 1996). Such will often be the case, as asymmetric labeling techniques allow intramonomer NOEs to be unambiguously assigned. The protocol begins from a well-defined monomer structure calculated from the intramonomer NOEs with the standard MDSA-AM-RXYZ protocol (Table 2); “well-defined” means that the structure has good covalent geometry and the overall topology is approximately correct. The monomer structure is maintained throughout the oligomer protocol using higher initial weights on the covalent geometry terms, the NCS term, the intramonomer NOEs, and on the term restraining the experimentally determined dihedral angles. In this way, many assignments are implicitly done at the first stage of the protocol; many of the ADRs that correspond to intramonomer NOEs will already be satisfied by the monomer structure, and the intermonomer assignment possibilities in Eq. (6) will not contribute to the force driving the structure calculation. In contrast, most of the ADRs that correspond to intermonomer NOEs will not be satisfied, and hence a relatively large force will be applied in which both the intermonomer and intramonomer terms will contribute. The initial relative placement of the monomers is important, as it defines the initial weighting of these contributions. For this reason, the monomers are initially placed with the correct symmetry but with a randomized relative orientation of the monomers. This is done by centering the monomer at the origin and randomly rotating it; the symmetry-related monomers are then generated by applying appropriate symmetry rotations. During the structure calculation, the weight on the NCS term is kept high, forcing the monomers to move cooperatively.
Except as described above, the calculation proceeds as before. Clearly, this protocol is not as unbiased as the ab initio protocols; however, in some cases, it appears to give better initial convergence.
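
The initial placement described above (center the monomer, rotate it randomly, then generate the symmetry mates by rotation) can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, only cyclic point groups n are handled (with the symmetry axis assumed to lie along z), and composing three axis rotations with uniformly drawn angles does not sample orientations uniformly, which is acceptable for a sketch.

```python
import math
import random

def center(coords):
    """Translate coordinates so that their centroid is at the origin."""
    n = len(coords)
    cx, cy, cz = (sum(p[k] for p in coords) / n for k in range(3))
    return [(x - cx, y - cy, z - cz) for x, y, z in coords]

def rotate(coords, axis, angle):
    """Rotate about a coordinate axis (0 = x, 1 = y, 2 = z) through the origin."""
    c, s = math.cos(angle), math.sin(angle)
    u, v = [k for k in range(3) if k != axis]
    out = []
    for p in coords:
        q = list(p)
        q[u] = c * p[u] - s * p[v]
        q[v] = s * p[u] + c * p[v]
        out.append(tuple(q))
    return out

def place_cn_oligomer(monomer, n, rng=random):
    """Return n copies of `monomer` related by an n-fold axis along z."""
    coords = center(monomer)
    # Randomize the orientation of the monomer relative to the symmetry axis.
    for axis in range(3):
        coords = rotate(coords, axis, rng.uniform(0.0, 2.0 * math.pi))
    # Generate the symmetry-related copies; every centroid stays at the origin.
    return [rotate(coords, 2, 2.0 * math.pi * k / n) for k in range(n)]
```

For example, `place_cn_oligomer(monomer, 3, random.Random(1))` returns three overlapping copies related by 120° rotations; the restraints then drive the copies apart during annealing.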
3.4. Iterative Structure Calculation and Explicit Assignment of ADRs

In our experience, only a small fraction of the structures in the initial ensembles have the correct overall oligomer topology; however, the correct structures usually have the lowest energies. For the annealing method to produce high-quality structures, we require a high rate of convergence to the correct topology. The same problem occurs when using ADRs to calculate asymmetric structures from spectra with high dispersion degeneracy. Unfortunately, the many contributing terms in the ADRs introduce many additional local minima, making it much more difficult to find the correct conformation. The solution is to use the low-energy structures in the initial ensemble to partially assign the ambiguous NOEs, then calculate a new ensemble of structures using the partial assignments. In this way the convergence toward the correct topology can be iteratively improved until it is high enough so that the lowest-energy structures define a high-quality solution structure.

This iterative assignment can be done with ARIA (Nilges and O’Donoghue, 1998; Nilges et al., 1997), which was originally designed for calculating asymmetric structures. The standard criterion for assignment in ARIA is based on an estimate of the relative peak contributions of different assignment possibilities to the peak volume. For each assignment possibility, $k$, which contributes to a given NOE [i.e., each term in the summation on the right-hand side of Eq. (6)], the relative contribution, $C_k$, to the total NOE volume is estimated from the corresponding interproton distances in the ensemble of calculated structures using

$$C_k = \frac{\bar{d}_k^{-6}}{\sum_{a} \bar{d}_a^{-6}} \qquad (8)$$

where the sum over $a$ is over all pairs of protons which contribute to the NOE, and $\bar{d}_k$ is the average distance between the given proton pair in the structure ensemble. The assignment possibilities are then reordered according to the $C_k$ values, such that $C_1$ corresponds to the assignment with the largest contribution. We then find the $N_p$ largest contributions such that

$$\sum_{k=1}^{N_p} C_k > p \qquad (9)$$

where the cutoff parameter p is gradually reduced over successive iterations, usually starting from 0.999 for the first iteration and reaching a final value of 0.8 in the eighth iteration. The corresponding assignment possibilities are then written out as a new ADR, and a new round of structures calculated with the new DRS. Applying this “assignment filter,” the ambiguity can be iteratively reduced, giving progressive improvement in convergence and efficiency.
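
The assignment filter can be sketched in Python as follows. This is not ARIA's code: the function name `assignment_filter` is hypothetical, and the input is assumed to be one ensemble-averaged distance per assignment possibility of a single NOE.

```python
def assignment_filter(mean_distances, p):
    """Return the indices of the largest contributions whose sum first exceeds p.

    `mean_distances[k]` is the ensemble-averaged distance for assignment
    possibility k; its relative contribution is proportional to d**-6.
    """
    weights = [d ** -6 for d in mean_distances]
    total = sum(weights)
    # Sort the normalized contributions from largest to smallest.
    contribs = sorted(((w / total, k) for k, w in enumerate(weights)), reverse=True)
    kept, running = [], 0.0
    for c, k in contribs:
        kept.append(k)
        running += c
        if running > p:
            break
    return kept
```

With mean distances of 2, 4, and 6 Å, the first possibility alone carries about 98% of the volume, so a cutoff of 0.8 keeps only that assignment, whereas a cutoff of 0.999 keeps all three. Lowering p over successive iterations therefore discards progressively more of the weak contributions.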
3.5. Other Restraint Terms In some cases, particularly when more NOEs are comonomer than inter-
monomer, the above method can have very low convergence; here we describe some additional restraint terms that can be used to improve convergence in difficult cases. 3.5.1. Packing Restraints
During the calculation, it may happen that the monomers drift too far apart so that the intermonomer terms become negligible. To avoid this, it may be necessary to add an overall “packing” or “collapse” term. Simply restraining all atoms to the origin with a low weight is sometimes sufficient (see Sec. 4.4 and Nilges, 1995). In the case of leucine zippers, we used a “coiled-coil” packing term, which we found to be important in solving the structure (Sec. 4.3). Such packing terms should not affect the energy landscape close to the correct fold, but merely increase convergence to the correct fold by preventing dissociation of the oligomer.
3.5.2. Comonomer Restraints
NOEs involving protons close to a symmetry axis may be comonomer, i.e., arising from a mixture of several classes of interaction (intramonomer interactions
or the different classes of intermonomer interactions). When the entire interface between two monomers is close to a symmetry axis (e.g., leucine zippers, Sec. 4.3), there will be more comonomer NOEs than pure intermonomer NOEs. In such cases, the intramonomer contributions alone are almost sufficient to satisfy the ADRs, and hence only a weak force is applied between the monomers. Hence, convergence to the correct topology can be particularly low; moreover, even after many iterations using the assignment filter (Sec. 3.4), these NOEs will at best be left ambiguous, or possibly incorrectly assigned as intramonomer.
A solution to this problem is to try to specifically assign comonomer NOEs, and include comonomer restraints in the structure calculation. Our assignment criterion for comonomer NOEs was to consider both the intramonomer and intermonomer assignment possibilities involved in each NOE; if both distances are less than 5 Å in all low-energy structures in the ensemble, we consider that the NOE is comonomer. In this case, we add two additional restraints to the DRS, separately restraining the intramonomer and the intermonomer distances to be less than 5 Å. These restraints usually improve convergence.
Since asymmetric labeling experiments may lead to comonomer NOEs being incorrectly assigned as intramonomer, all NOEs assigned as intramonomer can also be checked in the above manner.
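The comonomer criterion of this section can be written as a short check. The sketch below is only an illustration: the function names, the dictionary layout of a restraint, and the proton labels are hypothetical, and it assumes that the intramonomer and intermonomer distances for a given NOE have already been measured in each low-energy structure.

```python
COMONOMER_CUTOFF = 5.0  # angstroms, the threshold quoted in the text


def is_comonomer(intra_distances, inter_distances, cutoff=COMONOMER_CUTOFF):
    """Return True if both the intramonomer and the intermonomer distance
    are below the cutoff in every low-energy structure of the ensemble.

    Each argument holds one distance (in angstroms) per structure.
    """
    if not intra_distances or not inter_distances:
        return False  # no structures to judge from
    return (all(d < cutoff for d in intra_distances)
            and all(d < cutoff for d in inter_distances))


def comonomer_restraints(proton_pair, cutoff=COMONOMER_CUTOFF):
    """Build the two additional upper-limit restraints added to the DRS for
    an NOE judged to be comonomer (hypothetical record layout)."""
    return [
        {"pair": proton_pair, "kind": "intramonomer", "upper": cutoff},
        {"pair": proton_pair, "kind": "intermonomer", "upper": cutoff},
    ]


# Illustrative distances for one NOE across four low-energy structures.
intra = [3.1, 3.4, 2.9, 3.3]
inter = [4.2, 4.8, 4.5, 4.1]
if is_comonomer(intra, inter):
    for restraint in comonomer_restraints(("HA_12", "HD1_16")):
        print(restraint)
```

The same check can be run on NOEs that were previously assigned as intramonomer, as suggested in the text.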
Calculation of Symmetric Oligomer Structures from NMR Data
3.5.3. Interface Filter
Another approach in cases where the initial convergence is very low is to attempt to identify the residues involved in intermonomer contacts. If all interface residues can be identified, we can design an “interface filter” that can be used to screen out structures that do not have the correct interface. The filter uses the following principle: each interface residue must be close to at least one interface residue on a separate monomer. We measure the summed distance from each interface residue to all other interface residues on other monomers. Structures in which this distance is greater than, say, 9 Å can then be excluded from the assignment analysis, hence improving convergence toward the correct topology. When asymmetric labeling experiments have been done, it may be possible to apply the interface filter from the beginning of the structure calculation, since in general only the interface residues will have ambiguous NOEs. In this case, we can improve convergence greatly by choosing starting conformations that satisfy the interface filter. In the absence of asymmetric labeling data, we may be able to map the interface residues after several rounds of structure calculation; if the data show some tendency to converge toward the correct topology and if the assignment filter (Sec. 3.4) is used carefully enough, only the interface residues will be left as ambiguous after several iterations. The method may work even for particularly difficult DRSs (O’Donoghue et al., 1996).
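A possible form of the interface filter is sketched below. Several points are assumptions rather than the authors' definition: each interface residue is represented by a single coordinate, the "summed distance" is taken to be an inverse-sixth-power sum of the same kind used for ADRs (so that it is dominated by the nearest partner), and the function names are hypothetical. Only the 9 Å threshold comes from the text.

```python
import math


def summed_distance(point, others):
    """Assumed r^-6 summed distance from one point to a set of points."""
    if not others:
        return math.inf  # nothing on other monomers to be close to
    total = sum(math.dist(point, other) ** -6 for other in others)
    return total ** (-1.0 / 6.0)


def passes_interface_filter(interface_coords, cutoff=9.0):
    """Return True if every interface residue is close to at least one
    interface residue on a different monomer.

    interface_coords: dict mapping a monomer id to a list of (x, y, z)
                      coordinates, one per interface residue.
    """
    for monomer, residues in interface_coords.items():
        # Interface residues belonging to all the other monomers.
        others = [coord
                  for other_id, coords in interface_coords.items()
                  if other_id != monomer
                  for coord in coords]
        for residue in residues:
            if summed_distance(residue, others) > cutoff:
                return False
    return True


# Illustrative coordinates only: a packed dimer and a dissociated one.
packed = {"A": [(0.0, 0.0, 0.0)], "B": [(5.0, 0.0, 0.0)]}
apart = {"A": [(0.0, 0.0, 0.0)], "B": [(20.0, 0.0, 0.0)]}
print(passes_interface_filter(packed), passes_interface_filter(apart))
```

Structures that fail the filter would simply be left out of the ensemble used for the next round of assignment.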
4. EXPERIENCES WITH THE SYMMETRY-ADR METHOD
In this section, we discuss the experience we have had in applying the symmetry-ADR method to calculating symmetric oligomer structures.
4.1. Initial Test Calculations
The method was first tested using three DRSs (Nilges, 1993): a model DRS derived from the crystal structure of the met repressor (1cmc; Rafferty et al., 1989), and two experimental DRSs: one measured for TNCIII, a peptide comprising one EF hand of troponin C (1cta; Kay et al., 1991), and another measured for interleukin 8 (2il8; Clore et al., 1990). These structures are shown in Fig. 3. In these calculations, all NOEs were treated as ambiguous, and all calculations used the MDSA-SO-RPP protocol (Sec. 3.3.1); however, different starting structures were used for each DRS. The simplest case was that of interleukin 8, where we used the crystal structure (3il8; Baldwin et al., 1991) as the starting structure. All calculations converged to the previously published NMR structure. The crystal structure is about 2 Å RMSD from the NMR structure of the same molecule; the structural rearrangements were
therefore minor and involved mostly a widening of the gap between the two helices by about 2 Å.
We were initially motivated to use the met repressor for testing our method because of the problems encountered by Breg et al. (1990) in solving the solution structure of the Arc repressor, a homologous protein; they were only able to solve the structure by exploiting this homology to partially assign the NOESY spectrum. In both structures, the two monomers are intricately interwoven (Fig. 3); the monomer structure can only be formed by interaction with another monomer. This fold proved to be very challenging for the MDSA-SO-RPP protocol. Calculations starting from identical, superimposed random chain monomers completely failed to converge. We then tested to see if convergence could be achieved starting from structures close to the crystal structure. Two kinds of distortions were applied to the crystal structure: rotating each secondary structure element up to 180° around its own axis, and shifting the sequence up to three residues from its correct position. In both cases, the calculation converged back to the crystal structure.
The calculations with TNCIII gave the first evidence that fully automatic ab initio calculation is feasible. Calculations started with random chains, with both monomers ideally superimposed. Eight out of 50 structures converged to low energy and correct symmetry. Most nonconverged structures showed completely dissociated monomers; due to the scarcity of intermonomer NOEs and a relatively flexible monomer, all NOEs could almost be satisfied in one monomer alone. A packing restraint might have improved the convergence. The correct dimer structure gave the lowest energy.
4.2. ssDBP Dimer
The first real application of the method was in determining the structure of the single-stranded DNA binding protein (ssDBP) encoded by gene V of the phage M13 (2gvb; Folkers et al., 1994); the structure is shown in Fig. 3. As a starting structure, we used precalculated monomer structures placed in approximately the correct orientations, and the incorrect crystal structure (2gn5; Brayer and McPherson, 1983), which is shifted by one to four residues from the correct structure but has correct topology and symmetry. All NOEs were treated as ambiguous, and data from an asymmetric labeling experiment were also used by imposing 6-Å upper limits on intermonomer distances, in addition to using the ADRs derived from the homonuclear experiment. The calculation converged convincingly; ranked in order of total energy, the best 50% had a similarly low energy and the correct topology.
4.3. Leucine Zipper Homodimers
The method was also used in determining the structure of the leucine zipper domain of the Jun homodimer (1jun; Junius et al., 1996; O’Donoghue et al., 1996).
Despite the geometric simplicity of the coiled-coil fold (Fig. 3), the leucine zippers
are a particularly difficult case for NMR structure determination. Due to repetition in sequence and structure, there is high dispersion degeneracy in addition to the symmetry degeneracy. In addition, the entire intermonomer interface is close to the symmetry axis; hence, there are more comonomer NOEs than pure intermonomer NOEs. An additional problem with the Jun DRS was that no asymmetric labeling experiments were done to distinguish intra- and intermonomer NOEs. These
experiments would have been particularly useful to identify NOEs between symmetry-related nuclei, since many of the intermonomer NOEs are of this type.
Initial calculations using the ab initio protocols (Sec. 3.3.1) showed extremely low convergence to the correct fold. Hence, we developed a protocol to exploit prior knowledge we have about the monomer structure (Sec. 3.3.2) and about the dimer symmetry. In developing the new protocol, we did extensive test calculations using model DRSs derived from the crystal structure of the GCN4 LZ homodimer (2zta; O’Shea et al., 1991) and from a model structure of the Jun LZ homodimer (O’Donoghue et al., 1993). These DRSs were designed to have complete symmetry ambiguity and the same number of NOEs per residue as in the Jun DRS, hence mimicking the experimental DRS.
Many backbone–backbone NOEs in a symmetric coiled-coil can be unambiguously assigned as intramonomer (O’Donoghue et al., 1993). Using these NOEs with the MDSA-AM-RXYZ-1.0 protocol, we generated 50 monomer structures for each distance set. These structures were completely helical, with a somewhat variable overall twist. The dimer structures were calculated starting from these monomers, as described in Sec. 3.3.2. The NOEs were classified into three distance categories with upper limits of 3.3, 4.2, and 6.0 Å. The lower limits were set to zero. A packing term was used, restraining the geometric centers of each symmetry-related heptad to be within 10.4 Å using a square-well quadratic potential (Nilges and Brünger, 1991). From trial calculations with the model DRSs, we were able to optimize the protocol specifically for the coiled-coil geometry; the final protocol, called MDSASCC-WDMR, had significantly improved convergence in the initial structure calculation round. The protocol starts with two identical monomers (α-helices) arranged in parallel.
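The restraint terms mentioned above can be illustrated with a generic square-well quadratic potential. The sketch below is only indicative: the function names, the force constant of 1.0, the category labels ("strong", "medium", "weak"), and the example coordinates are assumptions; only the upper limits of 3.3, 4.2, and 6.0 Å, the zero lower limit, and the 10.4 Å heptad packing limit come from the text.

```python
import math

# Upper limits (angstroms) for the three NOE categories; the category
# names are hypothetical labels, the values are those quoted in the text.
NOE_UPPER_LIMITS = {"strong": 3.3, "medium": 4.2, "weak": 6.0}


def square_well_energy(distance, lower, upper, k=1.0):
    """Zero inside [lower, upper], quadratic penalty outside."""
    if distance > upper:
        return k * (distance - upper) ** 2
    if distance < lower:
        return k * (lower - distance) ** 2
    return 0.0


def centroid(coords):
    """Geometric center of a list of (x, y, z) coordinates."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def heptad_packing_energy(heptad_a, heptad_b, limit=10.4, k=1.0):
    """Penalize symmetry-related heptads whose geometric centers are
    farther apart than the packing limit."""
    separation = math.dist(centroid(heptad_a), centroid(heptad_b))
    return square_well_energy(separation, 0.0, limit, k)


# A 3.0 A distance satisfies a "strong" restraint; 4.3 A violates it.
print(square_well_energy(3.0, 0.0, NOE_UPPER_LIMITS["strong"]))
print(square_well_energy(4.3, 0.0, NOE_UPPER_LIMITS["strong"]))
```

Because the potential is flat inside the allowed range, a packing term of this kind leaves structures near the correct fold unaffected and only acts when the monomers drift apart.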
Using the final protocol, we generated 50 structures for each of the model DRSs; all structures in the top 50% (ranked in order of total energy) had the correct coiled-coil interface (a and d residues in the interface). The final selected ensembles had good covalent geometry, no NOE violations greater than 0.5 Å, and superimposed closely onto the structures from which the DRSs were derived; the mainchain RMSD were Å for GCN4 and Å for Jun. These numbers give some idea of the expected accuracy of the structure calculation method. Using the experimental DRS for Jun, 50 dimer structures were calculated; again, the top 50% all had the correct coiled-coil interface. The ARIA assignment
filter [ ] was used to produce a new, less ambiguous DRS. Also, several comonomer NOEs were assigned (Sec. 3.5.2). A second round of structures was calculated with the new DRS to give the final structures. The final ensemble had no NOE violations greater than 0.5 Å and good covalent geometry. The ensemble superimposes onto the homologous region of the Fos–Jun crystal structure (Glover and Harrison, 1995) with an average RMSD of 0.9 Å, giving an independent estimate of the accuracy of the ensemble.
4.4. p53 Tetramerization Domain
The tetramerization domain of the tumor suppressor protein p53 (1pes) was solved by Lee et al. (1994) using a modification of the MDSA-SO-RPP protocol to first calculate the dimer structure, and then the tetramer, starting from symmetric random-chain structures and using manual iterative assignment. The structure was also solved by Clore et al. (1sae; 1995, 1994), using a different approach relying much more on manual assignment (described in detail by Gronenborn and Clore, 1995). Encouraged by the results of Lee et al., we have since been using p53 as the standard test for the symmetry-ADR method. Our goal has been to automate the
calculation as far as possible. From the NMR data of Clore et al. (1995), a model DRS was derived by
removing all distance restraints that could only have been obtained by asymmetric labeling, i.e., data between equivalent protons on different monomers. The corresponding NOEs would lie on the diagonal in a standard homo- or heteronuclear NOESY spectrum. Similarly, the hydrogen-bond restraints for the sheets were
removed, since they require assignment of intermonomer NOEs. In contrast, the hydrogen bonds in the helices could be used. In the first part of the calculation, intramonomer lower-bound restraints in the secondary structure elements were used to improve the definition of the helices, and avoid an incorrect fold of the β-strands into hairpins [cf. the incorrectly folded structures of the met repressor in Nilges (1993)]. In addition, a packing term restraining all atoms weakly to the origin (Nilges, 1995) was employed.
We used the protocol developed for asymmetric ambiguities (Nilges, 1995) without modification. The calculation consists of a sequence of four simulated annealing protocols. First, an approximate structure was calculated, starting from random Cartesian coordinates. This starting structure seemed appropriate for an intricately interwoven oligomer since it contains no systematic bias toward intra- or intermonomer assignment. Because of our negative experiences starting with random chains for the met repressor model study, we did not try the originally published protocol (Nilges, 1993). No chiral information is present in the first part of the protocol. Subsequently, the correct enantiomer was selected (Kuszewski et al., 1992) and the structure was regularized. The structures were then refined twice.
Out of 50 calculated structures, the six lowest-energy structures converged to the correct symmetry. The RMS difference between these structures ranges between 1 and 5 Å, and the interhelical angles range roughly between the values found in the first published structure (Clore et al., 1994) and those found in the subsequently published NMR and X-ray crystal structures (Clore et al., 1995; Cho et al., 1994; Lee et al., 1994); see Fig. 4. This result demonstrates the lack of experimental data in the dimer–dimer interface (many of the NOEs in the dimer–dimer interface are between equivalent protons and were not used in the calculation).
Hence, the data obtained without asymmetric labeling seems insufficient to uniquely determine the interface. Encouragingly, while most of the structures did not converge to the correct symmetry, many of the higher-energy structures contained essentially correct dimers (Fig. 4). Incorrect dimer topology was only found with much higher energies.
5. SYMMETRIC OLIGOMERS SOLVED BY NMR
In this section we briefly discuss the calculation methods used in all of the symmetric oligomer structures solved by NMR to date; the majority have been solved using the symmetry-ADR method or parts of this method. Table 3 lists all symmetric oligomer structures in the PDB (November 1997) which were solved using NMR spectroscopy; most have been solved in the last few years. Almost all structures are dimers. There are only two tetramers, p53 (Clore et al., 1995; Lee et al., 1994) and platelet factor 4/IL-8 chimer (1pfn; Mayo et al., 1995), both point group 22. Only in the last year have higher-order oligomers been
solved: the VTB pentamer (4ull, point group 5; Richardson et al., 1997) and the insulin hexamer (1aiy, point group 32; Chang et al., 1997); these structures are shown in Fig. 1b. Currently, there are no oligomers with point groups 3, 4, or 6, and no heptamer or higher-order oligomer.
Of the 40 reported structure determinations, 26 have used one or both of the symmetry restraint terms, while 16 have used symmetry ADRs. In many cases, the NOEs assigned as intramonomer from asymmetric labeling experiments have been used to build monomer structures. Often, these structures were used to test the remaining ambiguous assignments; those that could not be satisfied as intramonomer were then assigned as intermonomer. This is essentially doing manually what is done automatically at the early stages of the symmetry-ADR method.
In some cases, including the VTB pentamer and the insulin hexamer, the symmetry ambiguity of the NOEs was resolved by reference to previously determined crystal structures; this method is analogous to molecular replacement in X-ray crystallography. While this “molecular reference” method can be effective, it is clearly preferable if the ambiguities in the DRS can be resolved using NMR data alone. In the case of the insulin hexamer, we have recently re-calculated the structure using the symmetry-ADR method, without reference to the crystal structure. The resulting structure has the same fold as the crystal structure (O’Donoghue, S. I., Chang, X., Abseher, R., Nilges, M., and Led, J. J., in preparation). These results suggest it may be worthwhile re-calculating the other structures solved by reference to crystal structures.
In several cases, the ab initio approach (Sec. 3.3.1) was tried and found to have extremely low convergence, in agreement with our own experiences with interleukin 8 and leucine zippers, and for the model data for the met repressor. In such cases it is best to begin with monomer structures, where possible, and to apply iterative assignment and additional restraints (Secs. 3.4 and 3.5).
6. DISCUSSION
6.1. Problems of the Symmetry-ADR Method
The method has been tested on completely ambiguous NOE DRSs in several model calculations; it has also been applied to five completely ambiguous experimental DRSs to produce novel structures; in each case, where crystal structures are available, there is good agreement. In addition, the method has been used with 11 partially ambiguous experimental DRSs, with intermonomer assignments derived either manually or from asymmetric labeling experiments. The method has the appeal that it can be extended to any point group symmetry, and that all information in the spectra can be used to direct the structure calculation, including results of asymmetric labeling. Thus, we conclude that the symmetry-ADR method is a useful
general solution to the symmetric oligomer problem. However, the method has problems with certain types of symmetry. When all or most of the interfacial residues are close to a symmetry axis, as in leucine zippers, there are few purely intermonomer NOEs. In such cases, it is most important to correctly assign the comonomer NOEs—unfortunately, these usually cannot be assigned experimentally. The situation is much improved if X-filtered experiments can identify nuclei that interact with their own symmetry mates (and hence occur close to a symmetry axis). Unfortunately, however, these experiments are prone to artifacts and must be interpreted with care. Thus, during the initial rounds of structure calculation, the force driving toward the correct structure can be very weak, hence the convergence will be low, and the calculation can be quite difficult. In such cases, convergence can be improved by using packing restraints, iterative assignment, comonomer restraints, and the interface filter.
In some cases, such as the p53 DRS without the asymmetric labeling data (Sec. 4.4), the data are simply not sufficient to define a unique structure; the initial round of
structure calculation may then suggest a variety of different solutions, as also observed for the cellulose-binding domain by Xu et al. (1995), using a modified form of the symmetry-ADR method. In such cases, there is a danger that applying the iterative assignment procedure may lead to overfitting the data and may converge to the wrong fold. The question is: how do we know if the DRS is sufficient to define a structure? Clearly, we should apply measures to detect these situations, either internal measures of the information content in the data (e.g., free R-factors;
Brünger et al., 1993), agreement with other spectral information such as chemical shift (Sorimachi et al., 1996), or external criteria which judge the final structures [e.g., the PROSA program, which can recognize incorrect structures (Sippl, 1993)].
6.2. Should Symmetry Restraint Terms Be Used?
In 14 of the symmetric oligomers solved to date, no symmetry restraint terms were used (Table 3). Leaving out the DSYM term usually does not influence the final structure greatly, once many intermonomer NOEs have been unambiguously assigned, since symmetrically applied intermonomer restraints have a similar effect. The DSYM term acts more as a catalyst, increasing convergence to the right fold. A justification for leaving out NCS symmetry may be seen in the fact that, due to thermal motion, the monomers in an oligomer will rarely be exactly symmetric. In contrast, using the DSYM and NCS terms ensures that the final structures have near perfect symmetry. When complete symmetry is observed in the spectra, we argue that it is valid to fit the structure to this observation. If there are regions affected by time-averaged quasi-symmetries, these may show up as having larger NCS energies. The symmetry restraint terms also aid in improving the convergence of the method, and overcome the potential problem of overfitting the data due to the reduced degrees of freedom (Sec. 2.2). When ADRs are used to describe the symmetry-ambiguous NOEs, it is particularly important to use symmetry restraints.
Leaving out the symmetry restraint terms would require a careful investigation to determine if the data content is high enough via a free R-factor calculation. We believe that it is best to apply symmetry restraint terms during the initial structure calculation to solve the basic problems associated with symmetric oligomer structure determination. Having calculated the correct structure, and hence having assigned many intermonomer NOEs, an additional round of structure calculations can be performed without the symmetry restraint terms. Our experience with symmetric oligomers has shown that there may be a value in doing these calculations, as the ensemble of structures produced without symmetry restraints may better represent the internal dynamics within the oligomer (Abseher et al., 1998).
6.3. Alternatives to the Symmetry-ADR Method
Overall, the method shows poor convergence compared to asymmetric cases. This is likely due to strong correlations between the ambiguities of neighboring residues. A minimization method, such as simulated annealing, that moves single atoms (or rigid parts of amino acids) may not be optimal to move larger parts of the structure coherently if a whole set of NOEs needs to be implicitly
reassigned. The performance of the symmetry-ADR method may be improved by more powerful minimization techniques, such as torsion-angle dynamics, or by using
only one set of coordinates for every monomer, generating the others by strict symmetry. An alternative approach may be possible with self-correcting distance geometry (see Chapter 2 of this volume); to date, no structure determination of a symmetric oligomer has been reported using this approach. This approach is likely to have similar performance to iterative MDSA.
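The idea of keeping only one set of monomer coordinates and generating the others by strict symmetry can be illustrated as follows. This is a minimal sketch restricted to a cyclic point group, with the n-fold axis assumed to lie along the z axis; the function names and the example coordinates are hypothetical and do not correspond to any existing structure calculation program.

```python
import math


def rotate_about_z(coord, angle):
    """Rotate one (x, y, z) coordinate about the z axis by angle (radians)."""
    x, y, z = coord
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def build_cyclic_oligomer(monomer, n):
    """Generate n symmetry-related copies of a monomer.

    monomer: list of (x, y, z) coordinates for the single stored copy.
    n:       order of the assumed rotation axis (2 for a symmetric dimer).
    Copy m is the stored monomer rotated by m * 360/n degrees about z.
    """
    return [[rotate_about_z(atom, 2.0 * math.pi * m / n) for atom in monomer]
            for m in range(n)]


# Illustrative two-atom monomer expanded into a twofold-symmetric dimer.
monomer = [(1.0, 0.0, 0.0), (1.5, 0.5, 2.0)]
dimer = build_cyclic_oligomer(monomer, 2)
for copy in dimer:
    print([tuple(round(v, 3) for v in atom) for atom in copy])
```

With this representation only the atoms of one monomer are free variables, so symmetry is exact by construction rather than enforced by restraint terms.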
In conclusion, while the determination of symmetric oligomer structures still poses a challenge for NMR spectroscopy, the symmetry-ADR method is often successful, particularly in combination with data from asymmetric labeling experiments. Particularly for higher order oligomers, the use of symmetry ADRs appears to be currently the best approach, indeed often the only approach.
ACKNOWLEDGMENTS. We thank Drs. Robert Hooft and Gert Vriend for help with Sec. 2.1.
SYMBOLS

Abbreviations
ADR     ambiguous distance restraint
DRS     distance restraint set
DSYM    distance symmetry restraints
MDSA    molecular dynamical simulated annealing
NCS     noncrystallographic symmetry
NOE     nuclear Overhauser effect
PDB     Brookhaven protein data bank
RMSD    root-mean-squared deviation

Symbols
A             total number of atoms per monomer
$C_k$         relative contribution of the kth assignment possibility
d(a,b)        distance between atoms a and b
$\bar{d}_a$   ensemble-averaged distance for the ath assignment possibility
$D$           summed distance
E             energy
k             energy constant
$L_n$         lower limit restraint distance corresponding to the nth NOE peak
M             total number of monomers
N             total number of NOESY cross peaks
p             cutoff parameter used in the ARIA assignment filter
$U_n$         upper limit restraint distance corresponding to the nth NOE peak
$V_n$         volume of the nth NOESY cross peak
$x_i$         Cartesian coordinates for the ith atom
REFERENCES
Abseher, R., Horstink, L., Hilbers, C. W., and Nilges, M., 1998, Proteins 31:370.
Arrowsmith, C. H., Pachter, R., Altman, R. B., Iyer, S. B., and Jardetzky, O., 1991, Biochemistry 29:6332.
Atkinson, R. A., Saudek, V., Huggins, J. P., and Pelton, J. T., 1991, Biochemistry 30:9387.
Baker, P. J., Turnbull, A. P., Sedelnikova, S. E., Stillman, T. J., and Rice, D. W., 1995, Structure 3:693.
Baldwin, E. T., Weber, I. T., St. Charles, R., Xuan, J.-C., Appella, E., Yamada, M., Matsushima, K., Edwards, B. F. P., Clore, G. M., Gronenborn, A. M., and Wlodawer, A., 1991, Proc. Natl. Acad. Sci. USA 88:502.
Bernstein, F. C., Koetzle, T. F., Williams, G. J. B., Meyer, E. F., Brice, M. D., Rodgers, J. R., Kennard, O., Shimanouchi, T., and Tasumi, M., 1977, J. Mol. Biol. 112:535.
Bonvin, A. M. J. J., Vis, H., Breg, J. N., Burgering, M. J., Boelens, R., and Kaptein, R., 1994, J. Mol. Biol. 236:328.
Brayer, G. D., and McPherson, A., 1983, J. Mol. Biol. 169:565.
Breg, J. N., van Opheusden, J. H., Burgering, M. J., Boelens, R., and Kaptein, R., 1990, Nature 346:586.
Brünger, A. T., 1993, X-PLOR Version 3.1, User Manual, Yale University, New Haven, CT.
Brünger, A. T., Clore, M. G., Gronenborn, A. M., Saffrich, R., and Nilges, M., 1993, Science 261:328.
Burgering, M. J. M., Boelens, R., Gilbert, D. E., Breg, J. N., Knight, K. L., and Kaptein, R., 1994, Biochemistry 33:15036.
Chang, X., Jørgensen, A. M. M., Bardrum, P., and Led, J. J., 1997, Biochemistry 36:9409.
Cho, Y., Gorina, S., Jeffrey, P., and Pavletich, N. P., 1994, Science 265:346.
Chung, C. W., Cooke, R. M., Proudfoot, A. E., and Wells, T. N., 1995, Biochemistry 34:9307.
Clore, G. M., Appella, E., Yamada, M., Matsushima, K., and Gronenborn, A. M., 1990, Biochemistry 29:1689.
Clore, G. M., Omichinski, J. G., Sakaguchi, K., Zambrano, N., Appella, E., and Gronenborn, A., 1995, Science 267:1515.
Clore, G. M., Omichinski, J. G., Sakaguchi, K., Zambrano, N., Sakamoto, H., Appella, E., and Gronenborn, A. M., 1994, Science 265:386.
Drohat, A. C., Amburgey, J. C., Abildgaard, F., Starich, M. R., Baldisseri, D., and Weber, D. J., 1996, Biochemistry 35:11577.
Eberle, W., Pastore, A., Sander, C., and Rösch, P., 1991, J. Biomol. NMR 1:71.
Facelli, J. C., and Grant, D. M., 1993, Nature 365:325.
Fairbrother, W. J., Reilly, D., Colby, T. J., Hesselgesser, J., and Horuk, R., 1994, J. Mol. Biol. 242:252.
Flynn, P. F., Gollnick, P., and Wand, A. J., 1997, Keystone Symposia, Silverthorne, in Frontiers of NMR in Molecular Biology. V, (G. Wagner, S. W. Fesik, and S. J. Opella, eds.), p. 41.
Folkers, P. J. M., Folmer, R. H. A., Konings, R. N. H., and Hilbers, C. W., 1993, J. Amer. Chem. Soc. 115:3798.
Folkers, P. J. M., Nilges, M., Folmer, R. H. A., Konings, R. N. H., and Hilbers, C. W., 1994, J. Mol. Biol. 236:229.
Folmer, R. H. A., Hilbers, C. W., Konings, R. N. H., and Hallenga, K., 1995a, J. Biomol. NMR 5:427.
Folmer, R. H. A., Nilges, M., Konings, R. N. H., and Hilbers, C. W., 1995b, J. Mol. Biol. 236:229.
Fry, E., Acharya, R., and Stuart, D., 1993, Acta Crystallogr. A 49:45.
Glover, J. N. M., and Harrison, S. C., 1995, Nature 373:257.
Granier, T., Gallois, B., Dautant, A., Langlois D’Estaintot, B., and Precigoux, G., 1996, Acta Crystallogr. D 52:594.
Gronenborn, A. M., and Clore, G. M., 1995, Crit. Rev. Biochem. Mol. Biol. 30:351.
Handel, T. M., and Domaille, P. J., 1996, Biochemistry 35:6569.
Hard, T., Barnes, H. J., Larsson, C., Gustafsson, J. A., and Lund, J., 1995, Nature Struct. Biol. 2:983.
Hinck, A. P., Archer, S. J., Qian, S. W., Roberts, A. B., Sporn, M. B., Weatherbee, J. A., Tsang, M. L., Lucas, R., Zhang, B. L., Wenker, J., and Torchia, D. A., 1996, Biochemistry 35:8517.
Jia, X., Grove, A., Ivancic, M., Hsu, V. L., Geiduschek, E. P., and Kearns, D. R., 1996, J. Mol. Biol. 263:259.
Junius, F. K., Mackay, J. P., Bubb, W. A., Jensen, S. A., Weiss, A. S., and King, G. F., 1995, Biochemistry 34:6164.
Junius, F. K., O’Donoghue, S. I., Nilges, M., Weiss, A. S., and King, G. F., 1996, J. Biol. Chem. 271:13663.
Kay, L. E., Forman-Kay, J. D., McCubbin, W. D., and Kay, C. M., 1991, Biochemistry 30:4323.
Kilby, P. M., van Eldik, L. J., and Roberts, G. C., 1996, Structure 4:1041.
Kim, K. S., Clark-Lewis, I., and Sykes, B. D., 1994, J. Biol. Chem. 269:32909.
King, G. F., 1996, Biophys. J. 71:1.
Konig, P., and Richmond, T. J., 1993, J. Mol. Biol. 233:139.
Kuszewski, J., Nilges, M., and Brünger, A. T., 1992, J. Biomol. NMR 2:33.
Lawrence, M. C., Suzuki, E., Varghese, J. N., Davis, P. C., Van Donkelaar, A., Tillock, P. A., and Colman, P. M., 1990, EMBO J. 9:9.
Lee, W., Harvey, T.-S., Yin, Y., Yau, P., Litchfield, D., and Arrowsmith, C.-H., 1994, Nature Struct. Biol. 1:877.
Liang, H., Petros, A. M., Meadows, R. P., Yoon, H. S., Egan, D. A., Walter, K., Holzman, T. F., Robins, T., and Fesik, S. W., 1996, Biochemistry 35:2095.
Lodi, P. J., Ernst, J. A., Kuszewski, J., Hickman, A. B., Engelman, A., Craigie, R., Clore, G. M., and Gronenborn, A. M., 1995, Biochemistry 34:9826.
Lodi, P. J., Garrett, D. S., Kuszewski, J., Tsang, M. L., Weatherbee, J. A., Leonard, W. J., Gronenborn, A. M., and Clore, G. M., 1994, Science 263:1762.
MacKay, J. P., Shaw, G. L., and King, G. F., 1996, Biochemistry 35:4867.
MacKenzie, K. R., Prestegard, J. H., and Engelman, D. M., 1997, Science 276:131.
Manival, X., Yang, Y., Strub, M. P., Kochoyan, M., Steinmetz, M., and Aymerich, S., 1997, EMBO J. 16:5019.
Matsuo, H., Shirakawa, M., and Kyogoku, Y., 1995, J. Mol. Biol. 254:668.
Mayo, K. H., Roongta, V., Ilyina, E., Milius, R., Barker, S., Quinlan, C., La Rosa, G., and Daly, T. J., 1995, Biochemistry 34:11399.
Meunier, S., Bernassau, J.-M., Guillemot, J.-C., Ferrara, P., and Darbon, H., 1997, Biochemistry 36:4412.
Nilges, M., 1993, Proteins 17:297.
Nilges, M., 1995, J. Mol. Biol. 245:645.
Nilges, M., 1996, Curr. Opin. Struct. Biol. 6:617.
Nilges, M., 1997, Fold. Des. 2:S53.
Nilges, M., and Brünger, A. T., 1991, Protein Eng. 4:649.
Nilges, M., and O’Donoghue, S. I., 1998, Prog. NMR Spect. 32:107.
Nilges, M., Clore, G. M., and Gronenborn, A. M., 1988a, FEBS Lett. 239:129.
Nilges, M., Clore, G. M., and Gronenborn, A. M., 1988b, FEBS Lett. 229:317.
Nilges, M., Gronenborn, A. M., Brünger, A. T., and Clore, G. M., 1988c, Protein Eng. 2:27.
Nilges, M., Macias, M., O’Donoghue, S. I., and Oschkinat, H., 1997, J. Mol. Biol. 269:408.
Oas, T. G., McIntosh, L. P., O’Shea, E. K., Dahlquist, F. W., and Kim, P. S., 1990, Biochemistry 29:2891.
O’Donoghue, S. I., Junius, F. K., and King, G. F., 1993, Protein Eng. 6:557.
O’Donoghue, S. I., King, G. F., and Nilges, M., 1996, J. Biomol. NMR 8:196.
O’Shea, E. K., Klemm, J. D., Kim, P. S., and Alber, T., 1991, Science 254:539.
Pabo, C. O., and Lewis, M., 1982, Nature 298:443.
Phillips, L., Separovic, F., Cornell, B. A., Barden, J. A., and dos Remedios, C. G., 1991, Eur. J. Biophys. 19:147.
Potts, B. C., Smith, J., Akke, M., Macke, T. J., Okazaki, K., Hidaka, H., Case, D. A., and Chazin, W. J., 1995, Nature Struct. Biol. 2:790.
Rafferty, J. B., Somers, W. S., Saint-Girons, I., and Phillips, E. V., 1989, Nature 341:705.
Richardson, J. M., Evans, P. D., Homans, S. W., and Donohue-Rolfe, A., 1997, Nature Struct. Biol. 4:190.
Rico, M., Jimenez, M. A., Gonzalez, C., De Filippis, V., and Fontana, A., 1994, Biochemistry 33:14834.
Saudek, V., Pastore, A., Castiglione Morelli, M. A., Frank, R., and Gibson, T., 1991, Protein Eng. 4:519.
Schirmer, R. H., 1978, in Principles of Protein Structure (C. R. Cantor, ed.), Springer-Verlag, Berlin.
Shaw, G. S., Hodges, R. S., and Sykes, B. D., 1992, Biochemistry 31:9572.
Sippl, M., 1993, Proteins 17:355.
Skelton, N. J., Aspiras, F., Ogez, J., and Schall, T. J., 1995, Biochemistry 34:5329.
Sorimachi, K., Jacks, A. J., Le Gal-Coeffet, M. F., Williamson, G., Archer, D. B., and Williamson, M. P., 1996, J. Mol. Biol. 259:970.
Srinivasan, N., White, H. E., Emsley, J., Wood, S. P., Pepys, M. B., and Blundell, T. L., 1994, Structure 2:1017.
Starich, M. R., Sandman, K., Reeve, J. N., and Summers, M. F., 1996, J. Mol. Biol. 255:187.
Sticht, H., Auer, M., Schmitt, B., Besemer, J., Horcher, M., Kirsch, T., Lindley, I. J., and Rösch, P., 1996, Eur. J. Biochem. 235:26.
Sutcliffe, M. J., Dobson, C. M., and Oswald, R. E., 1992, Biochemistry 31:2962.
Vis, H., Mariani, M., Vorgias, C. E., Wilson, K. S., Kaptein, R., and Boelens, R., 1995, J. Mol. Biol. 254:692.
Walters, K. J., Dayie, K. T., Reece, R. J., Ptashne, M., and Wagner, G., 1997, Nature Struct. Biol. 4:744.
Calculation of Symmetric Oligomer Structures from NMR Data
161
Weyl, H., 1952, Symmetry, Princeton University Press, Princeton, NJ.
Wu, Z. R., Ebrahimian, S., Zawrotny, M. E., Thornburg, L. D., Perez-Alvarado, G. C., Brothers, P., Pollack, R. M., and Summers, M. F, 1997, Science 276:415. Xu, G. Y., Ong, E., Gilkes, N. R., Kilburn, D. G., Muhandiram, D. R., Harris-Brandts, M., Carver, J. P., Kay, L. E., and Harvey, T. S., 1995, Biochemistry 34:6993. Yamazaki, T., Hinck, A. P., Wang, Y. X., Nicholson, L. K., Torchia, D. A., Wingfield, P., Stahl, S. J., Kaufman, J. D., Chang, C. H., Domaille, P. J., and Lam, P. Y., 1996, Protein Sci. 5:495. Zhao, D., Arrowsmith, C. H., Jia, X., and Jardetzky, O., 1993, J. Mol. Biol. 229:735.
5
Hybrid–Hybrid Matrix Method for 3D NOESY–NOESY Data Refinements
Elliott K. Gozansky, Varatharasa Thiviyanathan, Nishantha Illangasekare, Bruce A. Luxon, and David G. Gorenstein

Elliott K. Gozansky, Varatharasa Thiviyanathan, Nishantha Illangasekare, Bruce A. Luxon, and David G. Gorenstein • Sealy Center for Structural Biology and Department of Human Biological Chemistry and Genetics, The University of Texas Medical Branch at Galveston, Texas 77555-1157.
Biological Magnetic Resonance, Volume 17: Structure Computation and Dynamics in Protein NMR, edited by Krishna and Berliner. Kluwer Academic / Plenum Publishers, New York, 1999.

1. INTRODUCTION

In an effort to increase the molecular size boundary imposed on structure determination by NMR spectroscopy, an experiment named 3D NOESY–NOESY was developed by Boelens et al. (1989a). The experiment, similar to its 2D counterpart, utilizes the through-space dipole–dipole coupling to correlate three protons within a pairwise 5-Å radius. In practical terms, it can be thought of as two consecutive 2D NOESY experiments resulting in a correlation between three protons (instead of two protons as found in 2D NOESY experiments). Since the experiment is homonuclear (normally proton), it has the advantage of working on unlabeled samples. Curiously, the experiment was developed long before an efficient means for quantitative data analysis was established. Several data processing methods were proposed, but they either suffered from systematic error or were computationally intensive. A method called the 3D hybrid–hybrid matrix method (Donne et al., 1995a; Zhang et al., 1995) was proposed based on the long-standing 2D hybrid matrix methodology used in 2D NOESY data analysis. Fortunately, the 3D method retains the precision and accuracy of the 2D method with nearly identical computational efficiency.

Making the last pulse of one 2D NOESY experiment the first pulse in a second 2D NOESY creates the 3D NOESY–NOESY experiment. Figure 1 is a pictorial representation of the resulting pulse sequence. There are two incremented evolution periods and two mixing periods, but still only one acquisition period. The 3D cross peak is actually a volume measured in four dimensions: three dimensions of chemical shift plus one dimension of amplitude. Provided all peaks
in a spectrum have identical line shape, the maximum amplitude of a 3D peak will be proportional to the 3D volume. In general, this is a poor assumption. The term $V_{ijk}$ denotes a 3D NOESY–NOESY volume correlating spins i, j, and k, in the order of i to j and then j to k. Interaction between k to j and then j to i (kji) as well as interactions of the type i to j and back to i (iji) (back-transfer peak), or i to i to j (iij), could also be detected. These latter two types of interactions are 2D-like in resolution characteristics.

There are a number of ways to deal mathematically with the 3D NOESY–NOESY data. One approach uses the two-spin approximation (Wüthrich, 1986):

$$V_{ijk} \propto r_{ij}^{-6}\, r_{jk}^{-6} \tag{1}$$

where $V_{ijk}$ is the 3D NOESY–NOESY volume and $r_{ab}$ is the distance between spins a and b. This method only considers the pairwise interactions, independent of any surrounding atoms. The two-spin approximation, albeit an intuitive description for dipole interactions, ignores the effects of multiple relaxation pathways (spin diffusion). Spin diffusion becomes quite significant with larger molecules. Although a little more difficult to solve, it is critical in any precise and accurate model of NOEs that the entire system be considered. Better techniques than the two-spin approximation utilize an eigenvalue–eigenvector solution to the rate
equation (Krishna et al., 1978; Bothner-By and Noggle, 1979; Keepers and James, 1984; Macura and Ernst, 1980; Meadows et al., 1991; Post et al., 1990). First, consider the description of cross peaks found in a 2D NOESY experiment. The 2D NOESY cross-peak volume matrix can be calculated as follows:

$$\mathbf{V}(\tau_m) = e^{-\mathbf{R}\tau_m}\,\mathbf{A}(0) \tag{2}$$

where $\mathbf{V}(\tau_m)$ is the 2D volume matrix and A(0) is the initial magnetization vector available for NOE transfer. The rate matrix, R, describes the rate of NOE buildup for the entire proton spin system. The intrinsic cross-relaxation rate, $\sigma_{ij}$, between protons i and j, is related to the inverse sixth power of the distance between the two protons. This relationship is the source of the two-spin approximation, given in Eq. (1). The rate matrix is a phenomenological description of the dipole interactions between every proton in the system and can be used to calculate the NOE for any single mixing time (2D NOESY) by Eq. (2). Each volume matrix element, denoted as $V_{ij}(\tau_m)$, represents the cross peak between protons i and j for a mixing time $\tau_m$, and R contains the cross-relaxation rates for all spin pairs:

$$\mathbf{R} = \begin{pmatrix} \rho_{11} & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \rho_{22} & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \rho_{nn} \end{pmatrix} \tag{3}$$

Since R is a symmetric matrix about the diagonal, cross peaks above and below the spectral diagonal, for the same pair of spins, should be equal (assuming correct experimental parameters).
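The self-relaxation rates (diagonal elements) are given by

$$\rho_{ii} = \sum_{j \neq i} \frac{q}{r_{ij}^{6}} \left[ J_0(\omega) + 3J_1(\omega) + 6J_2(\omega) \right] \tag{4}$$

and the cross-relaxation rates (off-diagonal elements) are given by

$$\sigma_{ij} = \frac{q}{r_{ij}^{6}} \left[ 6J_2(\omega) - J_0(\omega) \right] \tag{5}$$

where $q = 0.1\,\gamma^4\hbar^2$, $\omega$ is the resonance frequency, and $\gamma$ is the gyromagnetic ratio. The spectral density function

$$J_n(\omega) = \frac{\tau_c}{1 + n^2\omega^2\tau_c^2} \tag{6}$$

which describes the transition probability at frequency $n\omega$, is assumed to depend on a single, overall rotational correlation time $\tau_c$.

Provided an invertible matrix, P, exists that can diagonalize R, Eq. (2) can be rewritten as

$$\mathbf{V}(\tau_m) = \mathbf{P}\, e^{-\boldsymbol{\Lambda}\tau_m}\, \mathbf{P}^{-1}\mathbf{A}(0) \tag{7}$$

where $\boldsymbol{\Lambda} = \mathbf{P}^{-1}\mathbf{R}\mathbf{P}$ is a diagonal matrix. It is important to recall that

$$\mathbf{P}\mathbf{P}^{-1} = \mathbf{I}$$

where I is the identity matrix.

The rate equation can be easily extended to three dimensions. Consider the effects of a second mixing time and a third nucleus. Instead of an NOE between protons i and j there will be an NOE between i and j and then j and k (ijk). The initial magnetization can be represented as a column vector, which is often normalized to unity if thermal equilibrium has been reached. After the first NOE mixing period,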
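a 2D matrix is required to describe the magnetization available for the second mixing period. Thus, a 3D NOESY–NOESY cross peak can be described as

$$V_{ijk}(\tau_{m1}, \tau_{m2}) = \left[e^{-\mathbf{R}\tau_{m2}}\right]_{jk} \left[e^{-\mathbf{R}\tau_{m1}}\right]_{ij} A_i(0) \tag{8}$$

or, in diagonalized matrix form,

$$V_{ijk}(\tau_{m1}, \tau_{m2}) = \left[\mathbf{P} e^{-\boldsymbol{\Lambda}\tau_{m2}} \mathbf{P}^{-1}\right]_{jk} \left[\mathbf{P} e^{-\boldsymbol{\Lambda}\tau_{m1}} \mathbf{P}^{-1}\right]_{ij} A_i(0) \tag{9}$$

where $V_{ijk}$ is an element of the three-dimensional volume matrix produced by the 3D NOESY–NOESY experiment. This type of equation, where the NOE is considered across the whole system, is considered to be exact in describing the cross-peak volume.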
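As a minimal numerical sketch of the eigenvalue–eigenvector route, the following Python fragment evaluates exp(−Rτ) by diagonalizing a symmetric rate matrix and forms 3D volumes as the product of the two mixing-period matrices, taking A(0) = I. The three-spin rate matrix and the function names are hypothetical and purely illustrative; they are not taken from MORASS or any other program mentioned in the text.

```python
import numpy as np

def exp_neg_rate(R, tau_m):
    """exp(-R * tau_m) for a symmetric rate matrix R via eigendecomposition."""
    lam, P = np.linalg.eigh(R)               # R = P diag(lam) P^T
    return (P * np.exp(-lam * tau_m)) @ P.T  # P exp(-Lambda tau_m) P^-1

def noesy_noesy_volumes(R, tau_m1, tau_m2):
    """V[i, j, k] = [exp(-R tau_m1)]_ij * [exp(-R tau_m2)]_jk, with A(0) = I."""
    E1 = exp_neg_rate(R, tau_m1)
    E2 = exp_neg_rate(R, tau_m2)
    return E1[:, :, None] * E2[None, :, :]

# Hypothetical three-spin rate matrix in s^-1 (illustrative values only).
R = np.array([[ 2.0, -0.5, -0.1],
              [-0.5,  2.5, -0.8],
              [-0.1, -0.8,  1.8]])

V2 = exp_neg_rate(R, 0.100)                 # 2D NOESY volumes, 100 ms mixing
V3 = noesy_noesy_volumes(R, 0.100, 0.100)   # 3D volumes, equal mixing times

# Compare V_ijk with V_kji (reversal) and with V_jik (another permutation).
print(np.allclose(V3, V3.transpose(2, 1, 0)))
print(np.isclose(V3[0, 1, 2], V3[1, 0, 2]))
```

Because the rate matrix is symmetric, `numpy.linalg.eigh` returns an orthogonal eigenvector matrix, so the inverse of P is simply its transpose. The two comparisons at the end probe which index permutations leave a 3D volume unchanged when the mixing times are equal.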
In the case of equal mixing times there will be some symmetry in the data resulting from symmetry in the rate matrix. However, it is important to notice that the 3D NOESY–NOESY volumes are not equal through all permutations of the spins; in general, only

$$V_{ijk} = V_{kji} \tag{10}$$

This can be seen if one considers that A(0) can be arbitrarily set to I. Since R is symmetric,

$$\left[e^{-\mathbf{R}\tau_m}\right]_{ij}\left[e^{-\mathbf{R}\tau_m}\right]_{jk} = \left[e^{-\mathbf{R}\tau_m}\right]_{kj}\left[e^{-\mathbf{R}\tau_m}\right]_{ji}$$

However, in general, $\left[e^{-\mathbf{R}\tau_m}\right]_{jk} \neq \left[e^{-\mathbf{R}\tau_m}\right]_{ik}$. Thus, $V_{ijk} \neq V_{jik}$. Based on similar arguments the only equality that necessarily exists in the data is that described by Eq. (10).
Since matrix diagonalization can be difficult [and is required for solving Eq. (8)], an approximation can be made using a Taylor series expansion of the rate equation (Boelens et al., 1989a, 1989b; Bonvin et al., 1991a; Habazettl et al., 1992a, 1992b; Holak et al., 1991):

$$e^{-\mathbf{R}\tau_m} = \mathbf{I} - \mathbf{R}\tau_m + \frac{1}{2}\mathbf{R}^2\tau_m^2 - \cdots + \frac{(-1)^n}{n!}\mathbf{R}^n\tau_m^n + \cdots \tag{11}$$

Normally, only the first few terms are retained since it becomes intractable to carry out the summation much further. Data resulting from these methods can then be used in distance geometry (Braun and Go, 1983; Havel et al., 1983; Wüthrich, 1986) or restrained molecular dynamics structure refinement (Gorenstein et al., 1990; Nilges et al., 1988; Zuiderweg et al., 1985). Unfortunately, the series converges relatively slowly; thus, truncating it yields systematic error (examined in detail below) (Keepers and James, 1984; Post et al., 1990).

Yip and Case (1989) put forth a gradient method for quantitative analysis of 3D NOESY–NOESY data. However, it scales with the sixth power of the number of spins. Thus, even for a moderately sized system of spins (say 600), the solution becomes computationally prohibitive. Kaptein’s group created an approximation method, based on the gradient method, and successfully refined an eight-residue peptide and the lac repressor headpiece (residues 1–56) (Bonvin et al., 1991b; Slijper et al., 1995). This approximation incorporates a Taylor series expansion of the rate equation in the gradient analysis and still scales with the cube of the number of spins.
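To put the two scaling laws in perspective for the 600-spin example, a rough count can be made from the exponents alone. This ignores constant prefactors, which is an assumption, since only the exponents are stated:

$$600^{6} \approx 4.7\times10^{16}, \qquad 600^{3} = 2.16\times10^{8}, \qquad \frac{600^{6}}{600^{3}} = 600^{3} \approx 2\times10^{8}$$

On this crude measure, a method that scales with the cube of the number of spins needs roughly eight orders of magnitude fewer operations than one that scales with the sixth power.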
2. SIMULATION STUDIES DESCRIBING 3D NOESY–NOESY CROSS PEAKS, APPROXIMATE VERSUS EXACT METHODS
Three-dimensional NOESY–NOESY cross-peak volumes have been handled using the approximation methods discussed above, without critical examination of their quality. In an attempt to quantify the limitations of various approximation methods, the crystal coordinates of an oligonucleotide duplex, Dickerson’s dodecamer, were used to compare the two-spin approximation and the Taylor series expansion approximation (up to four terms) to the exact calculation based on the eigenvalue–eigenvector solution to the rate matrix equation (Donne et al., 1995a). Atomic positions for all atoms in the oligomer were built with standard B-DNA geometry using the program AMBER 3.0 (Weiner and Kollman, 1981), followed by energy minimization as described previously (Nikonowicz and Gorenstein, 1992; Post et al., 1990). Isotropic tumbling was assumed for all studies using two overall correlation times of 1.6 and 3.2 ns. RMS errors were used as a criterion for the statistical analysis of the deviation between the NOESY–NOESY volumes simulated using the “exact” method and those simulated using the approximate method.
The RMS was defined as
Comparison between the methods was examined using mixing times from 20 to 200 ms. In Fig. 2 scatterplots of approximate volumes versus “exact” volumes
are shown. Since geminal cross peaks tend to be large and grouped away from nongeminal, only nongeminal protons are displayed in the figure. In Fig. 3A, B, the RMS errors for one-term and two-term approximations for the whole data set have been plotted against mixing times. The first-order approximation, equivalent
to the two-spin approximation, reaches an RMS error of 50% at a mixing time around 60 ms (where the error increases with increasing mixing time). With the addition of more terms in the expansion, there is an improvement in the error; however, at useful mixing times the approximations yield significant systematic error. As noted in 2D NOESY simulation studies (Post et al., 1990), it is not the mixing time alone but rather the combined effect of correlation time and mixing time that determines when the approximation fails. This is more obvious in Fig. 3C, D, where the RMS errors are plotted against the product of correlation time and mixing time, Specifically, the first-order approximation failed (RMS error greater than 50%) with for both the dodecamer and the decamer. In the literature, efforts have been made to account for the spin-diffusion effect using the Taylor series approximation approach (Boelens et al., 1989a, 1989b; Kessler et al., 1991). In those approaches, it was assumed that the linear term in the expansion was the direct magnetization transfer term and the second-order term was regarded as spin diffusion through a third spin. However, the assumption cannot
explain why, in the Taylor expansion, the terms have alternate signs. If dramatic spin diffusion occurs, then all of the “spin-diffusion” terms (i.e., higher-ordered terms) in the series should contribute to the NOE volume in the same way; the second-order and all larger-order terms should have the same sign in the series. Second, note that cross relaxation will affect the NOESY and NOESY–NOESY
volumes according to an exponential relationship. Therefore, when the exponent is expanded, the direct and indirect magnetization transfers are embedded in every term of the series. Specific terms in the Taylor series (e.g., linear, second-order, and third-order) cannot represent direct magnetization transfer, spin diffusion through a third spin, and spin diffusion through two other spins, respectively. Mathematically, for an expansion approximation to be successful, the series must converge uniformly and quickly. Although the Taylor series converges uniformly, it does not converge as fast as required for NOE volume approximation (Borgias et al., 1990). In the context of NOE simulation, the series usually did not converge after two terms (this would require the values of the third- and higher-order terms, to be negligibly small). Therefore, it was inadequate to take one or two terms in the Taylor series to simulate NOESY–NOESY volumes or interpret spin diffusion. Figure 4 shows the oscillatory behavior of approximations by the Taylor series expansion. In situations like this, the exact eigenvector–eigenvalue method should be the method of choice, particularly for 3D NOESY–NOESY experiments.
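The oscillatory behavior of a truncated Taylor series can be illustrated with a small sketch. It assumes a hypothetical system of two equivalent spins, for which the exact cross peak has a closed form, and compares it with the series truncated after one to six terms. The rate constants and mixing time are illustrative only and are not the values used for Fig. 4.

```python
import numpy as np

def taylor_partial_sums(R, tau_m, n_max):
    """Partial sums of sum_n (-R tau_m)^n / n!, truncated after n = 1..n_max."""
    term = np.eye(R.shape[0])
    total = term.copy()
    sums = []
    for n in range(1, n_max + 1):
        term = term @ (-R * tau_m) / n   # next term of the series
        total = total + term
        sums.append(total.copy())
    return sums

# Hypothetical two-spin system (rates in s^-1), mixing time 200 ms.
rho, sigma, tau_m = 3.0, -1.0, 0.200
R = np.array([[rho, sigma],
              [sigma, rho]])

# Closed-form cross peak for two equivalent spins (eigenvalues rho +/- sigma).
exact = 0.5 * (np.exp(-(rho + sigma) * tau_m) - np.exp(-(rho - sigma) * tau_m))

approx = [S[0, 1] for S in taylor_partial_sums(R, tau_m, 6)]
errors = [a - exact for a in approx]

for n, (a, e) in enumerate(zip(approx, errors), start=1):
    print(f"{n} term(s): cross peak = {a:.6f}, error = {e:+.6f}")
```

In this toy case the successive truncations alternately overshoot and undershoot the exact cross peak, and the one-term (two-spin-like) estimate is far from the exact value. No single term of the series can therefore be read as "the" direct-transfer or spin-diffusion contribution.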
3. HYBRID–HYBRID RELAXATION MATRIX METHOD FOR 3D NOESY–NOESY DATA ANALYSIS

Three-dimensional NOESY–NOESY has shown great promise for the structural refinement of large biomolecules. Analysis of NOESY–NOESY data, however, has proven to be quite challenging. Several different approaches have been developed to refine structures from 3D NOESY–NOESY spectra (Berstein et al., 1993; Bonvin et al., 1991a; Habazettl et al., 1992a, 1992b; Kessler et al., 1991). Unfortunately, as shown in the previous section, the approximation methods fail even at short mixing times, and the direct volume refinement (NOE gradient refinement) methods are computationally demanding for large systems (Bonvin et al., 1991a; Dollwo and Wand, 1993; Yip, 1993; Yip and Case, 1989).

The relaxation matrix approach (Krishna et al., 1978; Bothner-By and Noggle, 1979; Keepers and James, 1984; Macura and Ernst, 1980; Meadows et al., 1991; Post et al., 1990) avoids the two-spin approximation by employing a matrix eigenvalue–eigenvector solution to the Bloch equations. Importantly, cross-relaxation rates evaluated by a matrix method include effects from multiple relaxation pathways (spin diffusion). Distances and structures derived from a matrix method are more precise and accurate (Boelens et al., 1989a, 1989b; Borgias and James, 1990; Nikonowicz et al., 1990). For two-dimensional NMR, accurate distances can be directly obtained by the complete relaxation matrix method, which diagonalizes the 2D volume matrix that represents a 2D NOESY spectrum (Borgias et al., 1990; Post et al., 1990). Unfortunately, the eigenvalue–eigenvector solutions are very sensitive to the accuracy and completeness of the NOESY volume matrix (Post et al., 1990).

A hybrid matrix solution to this problem was originally proposed by Kaptein and co-workers (Boelens et al., 1988, 1989a) and implemented in several programs, for example IRMA (Boelens et al., 1988, 1989a), MARDIGRAS (Borgias et al., 1990), and MORASS (Gorenstein et al., 1990; Meadows et al., 1989). This hybrid matrix approach combines the information from the experimental NOESY volumes and calculated volumes derived from an initial structure. The hybrid volume matrix contains the well-resolved and measurable cross peaks from the experimental NOESY spectrum, while overlapped or weak cross peaks and diagonals are calculated from the cross-relaxation rates. A complete hybrid matrix is necessary for successful matrix diagonalization. The distances derived from the complete rate matrix can then be utilized in a distance geometry or restrained molecular dynamics refinement of the structure. This process of hybridizing the volume matrix and structural refinement is repeated until a satisfactory agreement between the calculated and observed cross-peak volumes is obtained. In various structural refinements, three to six iterations are typically required to achieve convergence within a family of structures consistent with the NOE data.

We have recently demonstrated that this method can be extended to 3D NOESY–NOESY data, and the new method is called the hybrid–hybrid relaxation matrix method (Donne et al., 1995b; Zhang et al., 1995). It represents a more computationally efficient method than the current gradient methods (Bonvin et al., 1991b; Slijper et al., 1995), avoids the assumptions of the various approximation methods, and provides a method for the refinement of larger biomolecules.
3.1. Theory and Methods: Deconvolution of 2D NOESY Volumes from 3D NOESY–NOESY Volumes

To mathematically represent 3D NOESY–NOESY data, various expressions have been suggested (Boelens et al., 1989b; Bonvin et al., 1991a; Borgias et al., 1990; Donne et al., 1995a; Kessler et al., 1991). The most straightforward expression represents the 3D NOESY–NOESY peak as the product of two 2D NOESY peaks:

$$V_{ijk} = V_{ij}(\tau_{m1})\, V_{jk}(\tau_{m2}) \tag{13}$$

where $V_{ijk}$ is a single 3D volume and $V_{ab}$ are the 2D volumes, for spins a and b, during the two mixing times $\tau_{m1}$ and $\tau_{m2}$ (Boelens et al., 1989b). This equation is similar in form to the two-spin approximation, except here the effects of spin diffusion are explicitly considered.

The main advantage of a 3D NOE–NOE spectrum over a 2D NOE spectrum is the enhanced spectral resolution provided by the third frequency dimension. Molecules greater than 10 kDa generally have significant spectral overlap. This prevents the measurement of sufficient 2D NOEs to converge to a meaningful structure during refinement. Of course, this has been a major impetus in the development of new 3D and 4D experiments.

Using Eq. (13), a 3D volume matrix can be deconvoluted into two 2D matrices. Theoretically, by Eq. (8), a 2D NOE matrix can be obtained from any single plane in the 3D matrix; however, each plane in the 3D matrix is invariably incomplete. Clearly a 3D volume, $V_{ijk}$, will not be experimentally observable if the distance between spins i and j or spins j and k is greater than 5 Å. However, since each plane will be incomplete in a region different from any other plane, the full data set is recoverable. This is done by deconvoluting each of the incomplete planes in the 3D matrix into separate incomplete 2D planes representing a part of the full 2D NOE volume matrix. When more than one 2D plane contains a value for any one term, these values are averaged to give a single element $V_{ij}$. As an added advantage of this treatment, errors are minimized by averaging over many computed values derived from many 3D volumes. The hybrid–hybrid algorithm requires the calculation of the 2D volumes $V_{ij}(\tau_{m1})$ from the 3D volumes $V_{ijk}$ and the corresponding set of 2D volumes $V_{jk}(\tau_{m2})$ (for equal mixing times, we remove the distinction):

$$V_{ij} = \frac{V_{ijk}}{V_{jk}} \tag{14}$$

If there is enough spectral resolution in the 3D spectrum, $V_{jk}$ values can be obtained from the cross-diagonal volumes ($V_{jjk}$ or $V_{jkk}$), the back-transfer volumes ($V_{jkj}$), or 2D volumes measured independently from a well-resolved 2D NOESY spectrum. Often, it will not be possible to experimentally determine sufficient numbers of
these cross peaks for larger biomolecules. In such cases simulated data must be used for the divisor in Eq. (14). Once again, a hybrid matrix solution to this problem has been fashioned and implemented in a 3D version of MORASS [Multiple Overhauser Relaxation AnalySis and Simulation (Meadows et al., 1989; Post et al., 1990)].

A flowchart of the hybrid–hybrid relaxation matrix method (3D MORASS) is shown in Fig. 5. First, an initial model structure is used to calculate the rate matrix. Then the 2D NOESY and 3D NOESY–NOESY spectra are simulated. The experimental and simulated 3D NOESY–NOESY data are then scaled and merged to create a hybrid 3D data set. Due to the tremendous number of 3D elements in larger biomolecules, the disk file containing only the nonzero elements (ca. 1% to 2% of the total elements) still requires approximately 140 to 280 Mbytes for 600 spins. Therefore, only data required for the method are stored. The 3D hybrid data are then deconvoluted into a 2D volume matrix with elements

$$V_{ij} = \frac{1}{N}\sum_{k} \frac{V_{ijk}}{V_{jk}} \tag{15}$$

where $V_{ijk}$ are the experimental 3D cross peaks, nonzero $V_{jk}$ values are obtained from experimental or simulated data, and N is the number of terms in the sum. Additional experimental or simulated 2D NOESY volumes can then be scaled and merged into the deconvoluted 2D volume matrix to give a complete 2D hybrid–hybrid volume matrix. The rate matrix can then be calculated from the hybrid–hybrid volume matrix using the 2D MORASS relaxation matrix approach, or any other 2D refinement protocol can be used. Note that numerical integration methods can also be used but are generally not as computationally efficient as the relaxation matrix method (Zhao and Jardetzky, 1994). The resulting distances are taken from the cross-relaxation rates, and the distances are then utilized in a distance geometry or restrained molecular dynamics refinement of the structure. The entire process is repeated in an iterative fashion until a satisfactory agreement between the calculated and observed 3D cross-peak volumes is obtained.

Because two independent 3D data sets are merged (i.e., simulated and experimental data), the hybrid–hybrid matrix approach relies heavily on careful scaling. One way to scale the experimental and theoretical volume matrices is to match the volumes of several “markers” (Boelens et al., 1988, 1989a; Nikonowicz et al., 1990). Some of the back-transfer volumes in the 3D spectrum will hopefully be well resolved and correspond to spin pairs of fixed distance, similar to the proton pairs one would use as reference volumes in the 2D hybrid matrix or two-spin methods. We have found it best to use all experimental volumes for scaling. The ratio of the sum of these volumes gives the appropriate scale factor S:

$$S = \frac{\sum_{ijk} V_{ijk}^{\mathrm{theo}}}{\sum_{ijk} V_{ijk}^{\mathrm{exp}}} \tag{16}$$

where the summation is taken over all experimentally integrated volumes. (Various weighting factors may also be introduced.) Each refined structure from each cycle becomes a new model structure for the next iteration; S must be reevaluated for every iteration.
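The scaling and merging step can be sketched as follows. This is an illustration only, under the assumptions that S is the ratio of summed simulated to summed experimental volumes over the observed peaks and that it multiplies the experimental volumes. The function names, array sizes, and the factor of 250 are hypothetical and do not come from 3D MORASS.

```python
import numpy as np

def scale_factor(v_exp, v_sim, observed):
    """Ratio of summed simulated to summed experimental volumes over observed peaks."""
    return v_sim[observed].sum() / v_exp[observed].sum()

def merge_hybrid(v_exp, v_sim, observed):
    """Scaled experimental volumes where observed, simulated volumes elsewhere."""
    s = scale_factor(v_exp, v_sim, observed)
    return np.where(observed, s * v_exp, v_sim), s

rng = np.random.default_rng(0)
v_sim = rng.uniform(0.001, 0.05, size=(4, 4, 4))   # simulated 3D volumes
observed = rng.random((4, 4, 4)) < 0.3             # which peaks were "measured"
observed[0, 1, 2] = True                           # guarantee at least one peak
v_exp = np.where(observed, v_sim / 250.0, 0.0)     # arbitrary spectrometer units

hybrid, s = merge_hybrid(v_exp, v_sim, observed)
print(f"scale factor S = {s:.3f}")
```

In this noise-free toy case the "experimental" volumes differ from the simulated ones only by a constant factor, so the recovered scale factor restores them to the simulated scale. With real data the model structure changes from cycle to cycle, which is why S has to be recomputed at every iteration.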
3.2. Three-Dimensional Simulation Test and Effect of Added Noise
Before testing the refinement capability of the hybrid–hybrid method, it was necessary to examine the 3D data simulation routine and the effects of added noise. The correctness of the deconvolution routine was examined by comparing deconvoluted experimental 3D NOESY–NOESY data and experimental 2D NOESY data (Zhang et al., 1995). Reasonable accuracy of the deconvolution routine (data not shown) was followed by a study on the effects of added noise to the data. For these tests, a 3D NOESY–NOESY data set for a 12-mer GG mismatched duplex was simulated, based on two identical mixing times of 100 ms and a spectrometer frequency of 500 MHz. The structure of the 12-mer duplex was previously solved in our laboratory by a MORASS–restrained-MD calculation on the experimental 2D NOESY spectrum (Roongta, 1989). The refined coordinates were also used to generate a 2D NOE volume matrix at a mixing time of 200 ms. Elements of the relaxation rate matrix were calculated from the set of proton
Cartesian coordinates and a rotational correlation time of 3.6 ns. A partial set of 107 spins from the 12-mer duplex was used in the simulation, and only those volumes that could potentially be integrated, in an experiment, were included
(maximum of approximately 50,000). All other 3D matrix elements were set to zero, and only a linear table of the nonzero elements was stored on disk. Based upon the relaxation matrix, 3D volumes were generated for a given $\tau_m$ using Eq. (13). This represented the target spectrum with noise-free data. The most common types of experimental error found in multidimensional NMR, low signal-to-noise and incorrect volume integration, were added to the target spectrum to produce the “experimental” data. Random noise from a Gaussian distribution was added to all peak volumes in order to simulate a constant low-level thermal noise and a peak integration error. Noise levels proportional to the individual volume were added to each element. We used back-transfer volumes, $V_{iji}$, to calculate an initial estimate of $V_{ij}$. The “experimental” 3D spectrum was deconvoluted and the calculated average 2D volumes were compared to the exact simulated values. Figure 6 shows a plot of the %RMS deviation of the data after deconvolution as a function of the added random noise. As expected, the resulting RMS error for the “experimental” 3D volumes was about one-half the error introduced into the 3D volumes. However, when the planes were linked and the $V_{ij}$ values derived from the simulated “experimental” 3D volumes were compared to the target $V_{ij}$, the error was dramatically reduced, two- to threefold, due to the effects of averaging.
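The effect of averaging during deconvolution can be reproduced qualitatively with a toy calculation. The sketch below is not the simulation described above. It assumes a small random symmetric 2D volume matrix, 20% Gaussian noise proportional to each 3D volume, a noise-free divisor standing in for simulated 2D volumes, and a percent RMS defined as the root-mean-square relative deviation.

```python
import numpy as np

def pct_rms(reference, estimate):
    """Percent RMS of the relative deviation (an assumed definition)."""
    return 100.0 * np.sqrt(np.mean(((estimate - reference) / reference) ** 2))

rng = np.random.default_rng(0)
n = 8
E = rng.uniform(0.01, 0.10, size=(n, n))
E = 0.5 * (E + E.T)                          # symmetric toy 2D volume matrix

V3 = E[:, :, None] * E[None, :, :]           # V_ijk = V_ij * V_jk
V3_noisy = V3 * (1.0 + 0.20 * rng.standard_normal(V3.shape))  # 20% proportional noise

per_plane = V3_noisy / E[None, :, :]         # one estimate of V_ij for each k
V2_avg = per_plane.mean(axis=2)              # average the estimates over k

rms_3d = pct_rms(V3, V3_noisy)               # error put into the 3D volumes
rms_single = pct_rms(E, per_plane[:, :, 0])  # error of a single-plane estimate
rms_avg = pct_rms(E, V2_avg)                 # error after averaging n planes

print(f"3D: {rms_3d:.1f}%  single plane: {rms_single:.1f}%  averaged: {rms_avg:.1f}%")
```

Each deconvoluted element here is the mean of n independent estimates, so its error is expected to fall roughly as the square root of n relative to the error of a single 3D volume. In a real spectrum far fewer planes contribute to each element, which is consistent with the more modest two- to threefold reduction reported in the text.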
3.3. Hybrid–Hybrid Relaxation Matrix Structural Refinement of Duplex DNA from Simulated 3D NOESY–NOESY Data
In the previous section, a simulation study of the hybrid–hybrid matrix method was used to test the convergence of the deconvolution algorithm and the effects of
noise added to the data. Presented here are the results from a simulated refinement of a dodecamer DNA duplex by the 3D hybrid–hybrid matrix method. This test served to examine the performance of the entire methodology—in particular the
convergence capabilities compared to the 2D hybrid matrix method. Similar to the previous study, “experimental” 3D data sets were generated by adding noise to the known target structure. Theoretical 3D NOESY–NOESY spectra were then calculated from several model-built structures. As before, scaling was used to merge the
“experimental” 3D data with the theoretical 3D data to create a hybrid 3D NOESY– NOESY data set. This was deconvoluted into a 2D NOE volume data set and subsequently merged with the simulated 2D NOESY data. The result was a hybrid–hybrid 2D NOE volume matrix. Using this complete volume matrix to calculate a rate matrix, the distances were derived from the cross-relaxation rates and used in a restrained molecular dynamics refinement. It is worth pointing out that in the first hybridization the only volumes simulated were those needed to
completely deconvolute the 3D data and scale the experimental to the simulated data. Only the simulated data needed for deconvolution were stored in memory. For the second hybridization, a complete 2D NOE volume matrix must be established. All interactions not present in the experimental data must be calculated from the model structure; however, experimental 2D NOE data was not required for such a calculation. The complete volume matrix is used to calculate the rate matrix, which is then transformed into the interproton distances. Finally, these distances are used
in a restrained molecular dynamics calculation to produce a more refined structure. This structure is then used as the model structure for the next iteration. The process is repeated until a satisfactory agreement between the theoretical and experimental 3D volumes is obtained, as shown in Fig. 7.
3.3.1. Target Model and Data Simulation

In this simulation study, the target structure was again taken from the X-ray coordinates of Dickerson’s dodecamer duplex DNA (Dickerson and Drew, 1981). Three different NOE data sets were simulated using this experimental target model. Two 3D NOESY–NOESY data sets were simulated using Eq. (16) with a Gaussian distribution of 20% (3D20) and 50% (3D50) integration error added, respectively. A 2D NOE data set was also simulated with 20% integration error (2D20) for comparison purposes. A 0.2% random thermal noise was added to all three data sets, and nonzero 3D cross peaks above a 0.2% cutoff were saved in a linear table representing the “experimental” spectrum. (Note that the two types of experimental noise are added explicitly in these data sets.) A three-dimensional matrix was not needed since over 99% of the 3D NOESY–NOESY volumes rounded to zero. A total of 1667 “true” 3D cross peaks were deconvoluted to generate 481 2D NOE volumes. Table 1 summarizes the statistics for the simulated data sets. A 2D cutoff of 2.5% was used to limit the number of volumes to a realistic level. Note that the target structure is nonsymmetric, so the number of constraints differs slightly for each strand. Again, both mixing times, $\tau_{m1}$ and $\tau_{m2}$, were 100 ms in the 3D NOESY–NOESY simulations with a single overall isotropic correlation time, $\tau_c$, of 3.6 ns.

3.3.2. Hybrid–Hybrid Matrix Iterative Refinement Calculation
Three model structures were used as starting models for refinement. AMBER 4.0 was used to generate canonical A- and B-DNA model structures. Another model was generated from the canonical A-DNA by running an unconstrained molecular dynamics simulation for 2 ps at 1000 K and retaining the last structure. The Cartesian RMS deviations between the A-, B-, and with respect to
the target dodecamer were 3.72, 2.95, and 4.21 Å, respectively. Since the had a distorted structure, it was used to test the effectiveness of the refinement protocol. The hybrid–hybrid matrix refinement calculations were performed using the scheme in Fig. 5. In each iteration new theoretical 3D NOESY–NOESY and 2D NOESY data sets were simulated using the model refined from the previous iteration. The standard 2D MORASS–AMBER refinement procedure was followed to generate a new set of distances for the restrained molecular dynamics calculation with a flat-well penalty function (Gorenstein et al., 1990). Such iterations were continued until the theoretical 3D spectrum converged to the “experimental” spectrum. The quality of refinement was examined by comparing the Cartesian RMS deviations between the refined structure and the target structure. The final force constants for the harmonic half-wells of the flat-well potential were for all refinements. The final flat-well distances, where no energy penalty applies, were 12% to 13% for 3D data sets and 9% for the 2D data set. This choice was made based on the consideration that the signal-tonoise ratio in 2D NOESY was greater than a 3D NOESY–NOESY (which should yield better precision for the 2D data integration). A single structure after the final iteration was considered the refined structure. Often, an extended restrained MD is used to generate an average final structure. However, we found it sufficient here to use the single final structure for the test of the method since, except in dynamically averaged torsional angles, there was no major difference between such single structures and those derived from the MD averaging (Nikonowicz et al., 1990). The quality of the final structure was judged by several figures of merit. The most important one in this case was the Cartesian RMS difference with the target model. Other criteria of refinement were as follows:
[equation missing]
where * is ij for 2D data or ijk for 3D data, and a or b can be the theoretical or “experimental” volumes. Additional indicators of refinement examined were the R-factor, defined by
[equation missing]
and the […]-factor, defined as
[equation missing]
Hybrid–Hybrid Matrix Method for 3D NOESY–NOESY Data Refinements
where $\tau_m$ is the NOESY mixing time.
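As an illustration of how volume-based figures of merit of this kind behave, the sketch below compares a per-peak percentage RMS with an R-factor of the X-ray type. Both functional forms, the function names, and the volumes are assumptions made for illustration; they are not necessarily the exact definitions used in this chapter.

```python
import math

def percent_rms_volume(v_a, v_b):
    """Assumed form: RMS of the per-peak relative differences, in percent."""
    rel_sq = [((a - b) / a) ** 2 for a, b in zip(v_a, v_b)]
    return 100.0 * math.sqrt(sum(rel_sq) / len(rel_sq))

def r_factor(v_exp, v_theo):
    """Assumed form: summed absolute differences over summed experimental volumes."""
    return sum(abs(t - e) for e, t in zip(v_exp, v_theo)) / sum(v_exp)

# Hypothetical volumes: one strong (intraresidue-like) peak that fits perfectly
# and two weak (interresidue-like) peaks that are each off by 100%.
v_exp = [1.00, 0.02, 0.01]
v_theo = [1.00, 0.04, 0.02]

rms_percent = percent_rms_volume(v_exp, v_theo)
r_value = r_factor(v_exp, v_theo)
print(f"%RMS(volume) = {rms_percent:.1f}, R-factor = {r_value:.3f}")
```

With these made-up numbers the percentage RMS is large (roughly 82%) while the R-factor stays near 0.03, because the R-factor sum is dominated by the single strong peak.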
The number of distance constraints derived from the hybrid–hybrid matrix analysis of the 3D data sets was relatively conservative. The 0.20% volume threshold seems realistic from experimental considerations. For the simulated 2D data set, 359 constraints per duplex was clearly a reasonable upper limit for a molecule of this size, considering that Nerdal et al. (1989) used 310 constraints, experimentally derived from 2D NOESY, for the same 12-mer duplex. Table 1 shows that the %RMS(volume) values for the deconvoluted 2D data are only about half as large as those for the 3D data before deconvolution. This was not surprising because each deconvoluted 2D data point was obtained by averaging about three to four 3D NOESY–NOESY volumes (i.e., ca. 1700 initial cross peaks deconvoluted to ca. 500 cross peaks). As seen in this case, averaging should reduce the error by a factor of $\sqrt{n}$, where n is the number of data points used for each deconvoluted cross peak. This illustrates one of the significant advantages of the hybrid–hybrid matrix deconvolution method.
3.3.2a. Refinement Convergence. Table 2 summarizes the results for all nine refinement calculations. The results from the simulated refinement demonstrated that the proposed method is quite powerful and efficient. The quality of the refinement can be seen directly from the Cartesian RMS difference between each final structure and the target structure. All are below 1.60 Å except for the MD-derived model refined with the 2D data set constraints (1.87 Å). The A- and B-DNA models all achieved very good convergence. Starting from the A- and B-DNA models, the Cartesian RMS error obtained with the 2D NOE data set was generally lower than that obtained with the 3D NOESY–NOESY data sets. This was especially true for model-built B-DNA, which was most similar to the target duplex (basically a B-DNA structure with unfrayed ends). In this case, the final structure from the 2D NOE data set reached the lowest RMS value of all (1.10 Å). While the 3D data set with 50% added integration error (3D50) produced slightly better results than 3D20 for both models, the differences are probably not significant. For the MD-derived model, which was quite frayed and had the largest initial RMS difference from the target structure, the 3D data sets produced higher-quality structures than the 2D data set. Several different iterative refinements always produced similar results (data not shown). Satisfactory convergence of this difficult model structure demonstrates the robustness of the hybrid–hybrid matrix method. This supports the general observation that having a larger number of distance constraints can be at least as important as increasing the accuracy of the constraints
(Clore et al., 1993; Kaluarachchi et al., 1991; Meadows et al., 1991). It also indicates that the iterative deconvolution approach is quite efficient for achieving high-quality convergence. The target duplex, along with comparisons of the target duplex, starting models, and final refined structures, is shown in Fig. 8. In contrast to the structures refined using 3D data, the […] structures (labeled by starting model, refinement method, and % error in the data set) had a slight overall bend in comparison to the […] structure.
Although the individual structures converged reasonably well to the target structure (a measure of the accuracy of the method), this did not guarantee that they all converged into the same family of final structures. Table 2 lists the average RMS differences among refined structures for each data set refinement. The average RMSD was calculated by simple arithmetic averaging of the RMSDs of the refined structures with respect to the target. A second RMSD calculation involved the pairwise differences among the three structures refined with the same data set: A-2D20 and B-2D20, A-2D20 and […], and B-2D20 and […]. The three RMS values were averaged to give the final average RMS value. This procedure was repeated for the other two data groups, and the results agree well with the spread among the individual structures. The best convergence was from the 3D20 data set (RMSD spread of 1.31 Å), while the 2D20 data set gave the largest spread. This indicates that the 3D data are capable of producing better precision in convergence from different starting structures. This likely reflects, again, the observation that a larger number of less accurate constraints can lead to structures with greater precision but not necessarily greater accuracy.
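The pairwise averaging described above can be sketched as follows. This is a minimal illustration, assuming the structures are already superimposed and share the same atom ordering; the toy coordinates and the label "M-2D20" for the third structure are hypothetical placeholders, not values or names from the chapter.

```python
import math
from itertools import combinations

def rmsd(coords_a, coords_b):
    """Cartesian RMSD between two pre-superimposed structures with identical atom order."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must contain the same number of atoms")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))

def mean_pairwise_rmsd(structures):
    """Arithmetic mean of the RMSD over all pairs of structures in a data group."""
    pairs = list(combinations(structures.values(), 2))
    return sum(rmsd(a, b) for a, b in pairs) / len(pairs)

# Toy two-atom "structures" (hypothetical coordinates, in Å).
group_2d20 = {
    "A-2D20": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "B-2D20": [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
    "M-2D20": [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],  # hypothetical label
}
spread = mean_pairwise_rmsd(group_2d20)
print(f"mean pairwise RMSD = {spread:.2f} Å")
```

A real comparison would first perform a least-squares superposition of each pair; that step is omitted here to keep the sketch short.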
The better precision and accuracy of the 3D data set refinement over the 2D
data set was further confirmed by helical parameter analysis (Zhu et al., 1996). In all cases, the overall reproduction of the sequence-specific variation in the helical parameters from the 3D50 refinement was better than that from the 2D20 refinement. This was particularly true for the major groove, minor groove, helix twist, roll angles, and nearly all backbone dihedral angles (data not shown). These results were consistent with the study of the accuracy of the hybrid relaxation matrix refinement of duplex structures (Kaluarachchi et al., 1991). This also provides strong evidence that such an iterative hybrid–hybrid matrix refinement of 3D NOESY–NOESY data is capable of achieving good convergence with high accuracy and precision for both global and local structural features.
The methodology has also been tested on simulated noise-free data sets with the same 12-mer DNA target model. The results indicate that, using between 350 and 400 constraints, a 1.0% distance error in the flat-well potential function, and a harmonic force constant of 5 […], the %RMS(volume) values for the final structures were consistently below 10% (data not shown).
The iterative refinement calculations were well behaved. As examples, Tables 3 through 6 list the complete refinement parameters for […] and […], respectively. In Table 4 the %RMS(volume) values calculated both from 3D NOESY–NOESY data and from 2D NOESY data are listed. The 3D error measures were calculated between the 3D50 data set and that simulated from the model at each iteration. Likewise, the 2D values were calculated between the 2D-like data from the deconvolution and the simulated 2D data from the model. In principle, the 3D result should reflect the progress more accurately than the 2D result, since the deconvoluted 2D data depend on the intermediate models derived at each iteration cycle. However, except for the sudden large change in the quality of the refinement parameters at iteration 6 for the 2D portion, both error measures progressed quite consistently and agreed well with the decrease in the Cartesian
RMS difference between models and target structures. In all other calculations, no unusual values were observed. The progression of the R-factor and the […]-factor during the 3D and 2D versions was also well behaved. The flat-well force constants were kept quite low throughout the refinement. When higher values were tested, they did not provide any noticeable improvement. The final total energies were all very similar to that of the unrestrained structure (ca. […]). This level was representative of all the refinements.
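The flat-well restraint used in the restrained MD steps can be illustrated with a short sketch. It assumes a simple form: zero penalty inside a window of fractional half-width `tolerance` around the target distance, and a harmonic half-well outside it. The function name, the example force constant, and the distances are hypothetical, and the restraint implemented in AMBER may differ in detail.

```python
def flat_well_energy(r, r0, tolerance, k):
    """Flat-well restraint energy (assumed form).

    r          current interproton distance
    r0         target distance from the relaxation matrix analysis
    tolerance  fractional half-width of the penalty-free window (e.g. 0.12 for 12%)
    k          force constant of the harmonic half-wells (arbitrary units here)
    """
    half_width = tolerance * r0
    lower, upper = r0 - half_width, r0 + half_width
    if r < lower:
        return k * (lower - r) ** 2
    if r > upper:
        return k * (r - upper) ** 2
    return 0.0  # inside the flat region: no energy penalty

# Hypothetical example: 3.0 Å target, 12% window, k = 5 (arbitrary units).
inside = flat_well_energy(3.2, 3.0, 0.12, 5.0)
outside = flat_well_energy(3.86, 3.0, 0.12, 5.0)
print(inside, outside)
```

Narrowing `tolerance` or raising `k` tightens the restraint, which corresponds to the tightening of error bars and force constants over the course of a refinement.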
Table 7 summarizes the final refinement results for all calculations. For the 3D hybrid–hybrid matrix refinement, the calculated errors in the deconvoluted 2D NOESY volumes are also included (in parentheses). Compared with the distribution of RMSD values in Table 2, these parameters seem to behave well in most cases.
The 2D %RMS(volume) values, R-factors, and […]-factors are very consistent with the results in Table 2. The 3D %RMS(volume) values for […] seem to be quite high compared to the other error parameters.
All of the refinements based on 3D NOESY–NOESY data sets required approximately the same number of iterations to achieve convergence, one to three iterations more than the 2D hybrid matrix method. In the hybrid–hybrid matrix deconvolution approach, the dependence of the deconvolution upon the iterative-model structure is likely responsible for the slightly higher number of iterations required to reach convergence. As might be expected, the data sets with the larger integration errors generally required more iteration cycles.
3.3.2b. Goodness of Refinement Measures. A number of parameters have been proposed to monitor the progress of 2D complete relaxation matrix refinement methods and the “goodness” of the obtained structures. For 2D matrix methods, the
%RMS(volume) is a very useful parameter for this purpose since it weighs the percentage differences between theoretical and experimental volumes equally for large and small cross peaks. This is especially important since larger cross peaks often represent cross relaxation between intraresidue protons whose positions are constrained to be close by the geometry of the residue. The most important cross peaks are usually those that define interresidue distances, and they are often the weakest. The R-factor (analogous to the X-ray R-factor) generally is
regarded as a poor measure of the quality of the refined structure because its magnitude is dominated by the largest cross peaks, which are often the least important. The […]-factor better reflects the quality of the structure since it weighs the weak cross peaks more heavily than the R-factor does. Benefits of %RMS(volume) include increased sensitivity to changes in the model structure and a direct comparison between the calculated and experimental NOE data. Usually, a final refined value of %RMS(volume) comparable to the experimental error in the integrated volumes is acceptable (20% to 50%). For 2D NOE data, a %RMS(volume) value of 50% roughly corresponds to a distance error of approximately 8% to 9% (by propagation of error treatment, a factor of 1/6). As described earlier, a 3D NOESY–NOESY cross peak is the product of two 2D NOE volumes; therefore, a simple direct relationship between %RMS(volume) and RMSD is not obvious. As such, it is not surprising that the refinement quality parameters in Table 7 do not appear to have a well-defined direct correspondence to the distribution of the RMSD values in Table 2. Note that the 2D matrix method does not have a precise one-to-one correspondence either. Rather, the quality-of-refinement parameters correspond to a reasonable range of structural change. This should be kept in mind when comparing results in Tables 7 and 2. For example, in the 2D20 data set, the […] structure has the highest Cartesian RMSD in Table 2, but none of the parameters appears to be sensitive enough to reflect this. However, the difference between […] and B-2D20 usually would be observed in practice. This likely reflects a problem inherent in the structural refinement of duplex nucleic acids, where tertiary NOEs are not observed and where subtle bending or distortion of the structure can have a profound impact on the best RMS fit of the structures. For proteins, this should be much less of a problem.
All of the parameters in Table 7 are generally useful for monitoring the refinement progress. All the 2D parameters derived from the 3D data sets are quite useful for monitoring convergence and quality of the structures, although the %RMS(volume) values actually compare the simulated 2D NOE and the deconvoluted 2D-like data. The 2D version appears slightly more consistent than the 3D version of these parameters. The 3D version parameters are quite self-consistent even though they do not seem to agree as well with the results in Table 2. The […] refinement has the highest %RMS(volume) value (248%), which corresponds to the highest Cartesian RMSD (1.60 Å) in Table 2. However, the lowest %RMS(volume) value (54%) actually corresponds to the second highest RMS value (1.54 Å), for B-3D20, in Table 2. For a given refinement, though, all these parameters reflect the trend of structural changes for the model very well. Figure 9 compares the refinement progress for […] and […]. This result is quite comparable to a 2D refinement calculation. The threshold value for 3D %RMS(volume) can only be determined empirically for the reasons mentioned above. It seems that a 3D %RMS(volume) value two to five times as large as the 2D %RMS(volume) is
acceptable. It can be concluded that the 2D and/or 3D parameters can be used effectively to monitor the hybrid–hybrid matrix calculation.
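The correspondence quoted in this section between a 50% volume error and a distance error of roughly 8% to 9% for 2D NOE data can be checked with a short worked step. It assumes the isolated two-spin limit, in which the cross-peak volume $v$ varies as the inverse sixth power of the interproton distance $r$; spin diffusion, which the relaxation matrix treatment accounts for, is ignored in this estimate.

```latex
v \propto r^{-6}
\quad\Longrightarrow\quad
\frac{\delta r}{r} \approx \frac{1}{6}\,\frac{\delta v}{v},
\qquad
\frac{50\%}{6} \approx 8.3\% .
```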
4. HYBRID–HYBRID MATRIX: EXPERIMENTAL REFINEMENT TEST ON A DNA THREE-WAY JUNCTION
As an experimental test of the hybrid–hybrid relaxation matrix refinement method, a DNA three-way junction (TWJ) was refined. The TWJ, consisting of three DNA strands forming three conjoined helical duplexes, contained two unpaired bases on one strand in the junction region. Gel electrophoresis and UV melting experiments have shown that two or more unpaired bases stabilize the DNA TWJ (Leontis et al., 1991). This molecule
was chosen as a test molecule because the resonance assignments were obtained previously (Leontis et al., 1993, 1995) and a high-quality 2D NOESY data set was available. The G–C rich TWJ sequence was designed to include one A–T base pair in each helical arm to serve as a spectroscopic marker. The sequence d(GGACGTCGCAGC), which is also shown in Fig. 10, contained a unique A–T base pair in each helical arm, allowing for unequivocal assignments. For base identification, the strand was first categorized and then the bases were numbered separately for each strand. For example, S1-G1 stands for the first guanine residue on strand 1.
The NMR sample was prepared by dissolving stoichiometric amounts of the three oligonucleotide strands in buffer containing 10 mM sodium phosphate, pH 6.8 (uncorrected for deuterium), 100 mM NaCl, 10 mM […], and 0.5 mM EDTA to give a final concentration of approximately 2 mM DNA. The 3D NOESY–NOESY data were collected at 28°C at 750 MHz on a Varian UnityPlus instrument. The relaxation delay was set at 0.9 s, and a sweep width of 7100 Hz was used in all
three dimensions. Acquisition time was 72 ms. The mixing times were both 200 ms. Residual water suppression was achieved by low-power saturation at the water frequency immediately before the first 90° pulse and during the two mixing periods. No attempt was made to suppress zero-quantum interference during the mixing times. In the direct detection dimension, 512 complex points were acquired, and 128 complex points were acquired for each of the two indirect dimensions. Eight scans were acquired for each pair. Quadrature detection in F1 and F2 was achieved with the hypercomplex method (States et al., 1982). The total experiment time was 228 h.
The 3D data set (128 × 128 × 512) was processed using Felix software (Biosym, CA) to give a data matrix of 256 × 256 × 1024 real data points after zero-filling in all three dimensions. Only the real part of the final spectrum was stored. A 90°-shifted sine-bell apodization function was used in all three dimensions. The Flat routine available in the Felix program was used for baseline correction.
Assignment of proton chemical shifts was based on values previously reported by Leontis et al. (1995). Cross peaks were picked by using the automatic peak-picking option available in the Felix program. The bounding box for each peak of interest was manually adjusted to reflect the actual linewidth of the peak. A total of 6253 peaks were picked by the automatic peak-picking routine. After manually sorting through the list to remove the body diagonals, artifacts, and noise peaks, 1635 peaks were retained. A total of 912 3D peaks (ijk- and iji-type peaks) were used for the deconvolution process; the remainder were iij-type peaks. The 3D NOESY–NOESY volumes were deconvoluted into 2D NOE volumes using the procedures described above (Zhang et al., 1995) and used in the standard MORASS–restrained-MD iterative refinement cycles. From the 2D NOESY experimental data set, 78 additional volumes, not obtained from the deconvoluted 3D spectrum, were incorporated with the deconvoluted 2D volumes to form the hybrid–hybrid 2D volume matrix.
Starting model coordinates were obtained from a previous study using X-PLOR (Brünger, 1993a; Ouporov and Leontis, 1995). This model structure was then placed in a box of water containing 1720 TIP3P water molecules along with 29 sodium counterions (one sodium counterion for each phosphate group) and was subsequently equilibrated for 10 ps using the molecular dynamics program AMBER 4.1 (Weiner and Kollman, 1981). This structure was used as the starting model for all subsequent refinements. To ensure Watson–Crick base pairing, a hydrogen-bond constraint was added at each base pair in the helical arms, except for the base pairs at the stem ends. Only one hydrogen-bond constraint was applied per base pair in order to allow propeller twist during the refinement. For the MD part of each iteration, the starting structure was first energy-minimized with the NOE constraints for 3000 steps. Eight picoseconds of constrained molecular dynamics with temperature annealing was then performed on the energy-minimized model. Then the average structure from the last 3 ps of the MD
was energy-minimized again, and the resulting structure was used as the starting model for the next iteration.
Several key indicators were used to monitor the progress of the iterative refinement process. The RMS errors in the volumes were used as the first criterion for monitoring the refinements. As can be seen from Table 8, the %RMS errors in volumes start at relatively high values and gradually settle to lower values with increased percentage of volume merging between the experimental and theoretical volumes. Energy terms, such as the total potential energy and the constraint energy, were also monitored throughout the iterative process. Both the total energy and the constraint energy increased as the error bars and force constants on the constraints were tightened in the molecular dynamics refinement. The effect of the force constant was controlled by changing the error bars from liberal values of 25% to much lower values as confidence in the intermediate structures was established. As expected, the R-factor decreased as the refinement progressed.
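The deconvolution step that feeds each refinement cycle can be illustrated with a toy example. The sketch below assumes, as a simplification, that a 3D cross-peak volume is the product of two 2D volumes, V_ijk = v_ij × v_jk, and that v_jk is taken from the current model; each 2D volume is then estimated by averaging over all 3D peaks that contain it. The function name, proton labels, and numbers are hypothetical, and the published procedure (Zhang et al., 1995) includes merging and scaling steps that are not reproduced here.

```python
from collections import defaultdict

def deconvolute(volumes_3d, model_2d):
    """Estimate 2D volumes v_ij from 3D volumes V_ijk, assuming V_ijk = v_ij * v_jk.

    volumes_3d  {(i, j, k): experimental 3D volume}
    model_2d    {(j, k): 2D volume simulated from the current model}
    """
    estimates = defaultdict(list)
    for (i, j, k), v3d in volumes_3d.items():
        v_jk = model_2d.get((j, k))
        if v_jk:  # skip 3D peaks the model cannot scale (missing or zero volume)
            estimates[(i, j)].append(v3d / v_jk)
    # Average the independent estimates of each 2D volume.
    return {pair: sum(vals) / len(vals) for pair, vals in estimates.items()}

# Hypothetical input: two 3D peaks that both report on the H1-H2 contact.
model_2d = {("H2", "H3"): 0.50, ("H2", "H4"): 0.25}
volumes_3d = {("H1", "H2", "H3"): 0.10, ("H1", "H2", "H4"): 0.06}

deconvoluted = deconvolute(volumes_3d, model_2d)
print(deconvoluted)
```

Because several 3D peaks contribute to each 2D estimate, random integration errors partly average out, which is the effect noted for the simulated data.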
Figure 11 shows a plot of deconvoluted 2D NOE volumes derived from experimental 3D volumes versus the experimentally determined 2D NOESY volumes with a slope of 0.82. The random dispersion of these data points indicates a lack of systematic error introduced by the deconvolution process. The plot of theoretically calculated 2D volumes for the final structure versus the experimental (deconvoluted 2D plus the experimentally determined 2D) volumes, gives a slope
of 0.99 (Fig. 12), again with no systematic error. Figure 13 shows the number of NOE volumes measured per residue from the 2D and 3D data sets. Except for residues in the junction region, where NOE interactions are weak, the 3D NOESY–NOESY gave higher numbers of measurable NOE peaks than the 2D NOESY. A
unique tertiary contact was also observed between the methyl group of S3-T6 and the […] of S3-G11. This crucial NOE peak was well resolved in both 2D and 3D spectra (as 2D iij- and ijj-type peaks), and was very useful for determining the conformation of the S3-T6 base.
The final refined structure of the TWJ is shown in Fig. 14. As in the preliminary model, the three helical arms form two domains. Two of the helices, helix 1 and helix 2, are stacked on each other, forming one continuous helical domain. The other helical domain, formed by helix 3, extends almost perpendicularly from the axis of the first helical domain. The unpaired pyrimidine bases are extrahelical, exposed to solvent, and lie along the minor groove of helix 1. These two unpaired bases are stacked on each other. Helical parameters for all three helical arms exhibit only minor deviations from typical values for right-handed B-form DNA. Unusual values are, however, observed for the glycosidic angles of S3-T6 and S3-G8. The glycosidic bond of S3-T6 exists in an unusual syn conformation, allowing its methyl group to contact the hydrophobic surface of the minor groove of helix 1, at S3-G11.
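The slopes quoted for the volume correlation plots (Figs. 11 and 12) are the kind of quantity an ordinary least-squares line provides. The sketch below shows one way to obtain a slope and to inspect the residuals for systematic trends. The data are made up for illustration, and whether the original fits included an intercept is not stated, so the intercept here is an assumption.

```python
def least_squares_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical volumes (arbitrary units): x = reference 2D volumes,
# y = deconvoluted 2D volumes for the same proton pairs.
x = [1.0, 2.0, 3.0, 4.0]
y = [0.9, 2.1, 2.9, 4.1]

slope, intercept = least_squares_line(x, y)
residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

Residuals that scatter around zero without a trend along x are consistent with the absence of systematic error; a curved or monotonic pattern would suggest otherwise.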
5. CONCLUSIONS
In this chapter, a simple, efficient, and robust structure refinement method using 3D NOESY–NOESY data has been presented and successfully tested by simulation
studies on a 12-mer DNA model as well as an experimental refinement of a 32-nucleotide TWJ. This method uses a straightforward deconvolution scheme to obtain a hybrid–hybrid 2D NOE volume matrix from 3D NOESY–NOESY volume data. It then uses the hybrid matrix 2D MORASS (or comparable Bloch equation solution) method for the structure refinement calculation.
Simulations have shown that the highest accuracy and precision in structural refinement are achieved by increasing the number and accuracy of the constraints used (Meadows et al., 1991; Thomas et al., 1991). The use of approximate methods has led investigators to apply liberal error bars when using NMR-derived distances during constrained structural refinements. While it is true that many good-quality structures have been obtained from NMR (Wüthrich, 1989), especially with the development of 3D and 4D NMR (Clore et al., 1991; Grzesiek et al., 1992; Kay et al., 1990; Zuiderweg et al., 1991), it is clear that using larger numbers of constraints and stronger constraints will achieve greater precision (and hopefully accuracy) in the structures obtained by NMR. This appears to be generally true for proteins (Brünger et al., 1993b; Thomas et al., 1991) and nucleic acids (Gronenborn and Clore, 1989; Meadows et al., 1991; Metzler et al., 1990; Nilsson et al., 1986; Pardi et al., 1988). It has been reported that a relaxation matrix analysis for proteins may actually lead to poorer accuracy (Clore et al., 1993), although this has been disputed by further simulation studies by Zhao and Jardetzky (1994). Thus, 3D NOESY–NOESY experiments hold the promise of providing more accurate structures given the vastly increased number of resolvable 3D NOESY–NOESY volumes (Kessler et al., 1991). As has been shown, approximation methods
may not yield accurate distances at the longer mixing times required to achieve adequate magnetization transfer and signal-to-noise in large molecules (Donne et al., 1995a). Calculations based upon the complete relaxation matrix, however, are well within current computational resources, even for macromolecules containing more than 600 spins.
Competing with the hybrid–hybrid matrix method are various direct gradient–NOE refinement methods, which use the volume data directly in an attempt to match the theoretical volume data to the experimental data (Bonvin et al., 1991a; Habazettl et al., 1992a, 1992b). While this latter approach does take spin diffusion into account, it scales as the sixth power of the number of spins in the system, which is computationally prohibitive for large systems. Kaptein and co-workers reported an approximation to the NOE gradient calculation method that scales as the third power of the number of spins (Bonvin et al., 1991a); however, the hybrid–hybrid matrix method scales only with the square of the number of spins via diagonalization of the n × n volume matrix. It must be emphasized that NOE gradient refinements should utilize relaxation matrix methods for calculating 3D volumes in order to achieve the highest accuracy for the refined structure (Donne et al., 1995a).
Although our tests were conducted on nucleic acids, structural refinement of proteins should prove equally feasible. Kaptein’s gradient refinement method, carried out on the lac repressor headpiece, demonstrates the important potential of relaxation matrix methods for 3D NOESY–NOESY refinement. The hybrid–hybrid matrix method has several advantages:
(1) It does not rely on any 2D experimental data. To use this method, only 3D NOESY–NOESY experimental data and a reasonable starting model are needed. An initial model can be constructed from the two-spin method utilizing NOEs derived from 3D or 4D heteronuclear edited or filtered NOESY. This makes it particularly suitable for studying larger molecules since significant numbers of good-quality 2D NOESY cross peaks cannot be resolved for molecules much larger than 10 kDa.
(2) It is quite robust, precise, and accurate. The results from the simulation have shown that it converges well and possibly better than similar 2D methods. This seems to be especially true when using less than favorable starting models. It also involves the use of the more accurate 2D hybrid full-matrix method, which takes into account the extensive spin diffusion that occurs in larger macromolecules at useful mixing times. Furthermore, in larger macromolecules, sensitivity in the 3D NOESY–NOESY (as well as 3D–4D heteronuclear filtered and edited NOESY) experiments is often poor, so that longer NOESY mixing times must be used to increase the NOE volumes. Both larger molecular size and longer mixing times conspire to make spin diffusion a greater problem requiring a complete relaxation matrix analysis. The important combination of generating larger numbers of more
accurate distance constraints and the 2D method makes it possible to achieve quite accurate structures.
(3) The method provides a simple means to incorporate into the refinement distance constraints derived from 3D–4D heteronuclear filtered and edited NOESY experiments, since these can be added to the hybrid–hybrid volume matrix along with the deconvoluted volumes, any 2D NOESY volumes, and the simulated volumes. Of course, one would want to carefully evaluate the relative accuracy of these constraints, which are analyzed at a two-spin approximation level. These less accurate constraints can be added to the distance constraint list with appropriate increases in the error bars and decreases in the force constants. In addition, the 3D hybrid–hybrid method avoids exclusive use of the less accurate nD heteronuclear filtered and edited NOESY experiments commonly used for structural refinement. In various multidimensional heteronuclear coherence transfer NOESY methods, the NOESY volumes will be determined significantly by the degree of coherence transfer between the heteroatom and protons. This will depend on the magnitude of the coupling constant, which can vary significantly with atom and residue type. Distance constraints from these spectra can only be divided into three or four limits (e.g., short, medium, and long), which in turn limits the accuracy and precision of the structure.
(4) Software written for 2D NOESY data analysis does not need to be significantly modified, since the 2D deconvoluted matrix replaces the experimental 2D NOESY volume matrix. The analysis techniques described here are also general enough that they can be expanded to include even higher-dimensional experiments (such as a 4D NOESY–NOESY–NOESY spectrum).
(5) The method is computationally very efficient, as it does not involve the use of any 3D matrix or gradients, which, at best, scale with the cube of the number of spins (Bonvin et al., 1991a; Zhang et al., 1995) and thereby potentially limit the practical size of the systems that can be refined. Here, only the 3D peaks that are required for deconvolution and scaling are simulated and stored in memory. Once merged with the 3D experimental data, deconvoluted, and merged with the experimental 2D data, the data are stored as a simple 2D array. As a result, there is no need to create or store a three-dimensional array for the 3D data, which saves processing time and disk space. The design partially removes the extra limitations on the number of spins the computer can handle, thus making the methodology applicable to larger systems. CPU time scales as the square of the number of spins.
It is apparent that the hybrid–hybrid method proposed here can be very useful for solving the structure of quite large molecules because it does not rely on 2D NMR data. Further effort is currently underway to fully automate the protocol. In practice, one problem with its successful implementation is the low S:N and relatively poor dispersion of 3D NOESY–NOESY spectra compared to heteronuclear spectra (requiring 1–2 mM concentrations). With higher fields and newer
probes, we find this to be less of an encumbrance than previously. For instance, we have a Varian 5-mm 1H-only probe with an S:N of 1500:1 and a newly constructed Nalorac 8-mm triple-resonance probe with an S:N of 2400:1, both of which operate on a 750-MHz Varian spectrometer. The successful application of the hybrid–hybrid methodology in solving the structure of the 32-nucleotide TWJ gives us confidence that the method can be applied to proteins or nucleic acids of considerably higher molecular weight, especially with the newer probes.
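To give a feeling for the scaling behaviors compared in this section (square, cube, and sixth power of the number of spins), the following sketch computes how much the cost grows when a system is enlarged from a reference size. It is an order-of-magnitude illustration only: prefactors and implementation details are ignored, and the function name and the reference size of 100 spins are arbitrary choices.

```python
def relative_cost(n_spins, exponent, n_reference=100):
    """Cost relative to a reference system, assuming cost ~ n_spins ** exponent."""
    return (n_spins / n_reference) ** exponent

# Growth in cost on going from 100 spins to 600 spins under each scaling law.
for label, exponent in [("square", 2), ("cube", 3), ("sixth power", 6)]:
    print(f"{label:>12}: x{relative_cost(600, exponent):,.0f}")
```

For a sixfold increase in the number of spins, quadratic scaling costs 36 times more, cubic scaling 216 times more, and sixth-power scaling 46,656 times more.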
REFERENCES Berstein, R., Ross, A., Cieslar, C, and Holak, T. A., 1993, J. Magn. Reson. B101:185.
Boelens, R., Koning, T. M. G., and Kaptein, R., 1988, J. Mol. Struct. 173:299. Boelens, R., Koning, T. M. G., van der Marel, G. A., van Boom, J. H., and Kaptein, R., 1989a, J Magn. Reson. 82:290.
Boelens, R., Vuister, G., Koning, T. M. G., and Kaptein, R., 1989b, J. Am. Chem. Soc. 111:8525. Bonvin, A. M. J. J., Boelens, R., and Kaptein, R., 1991a, J. Magn. Reson. 95:626. Bonvin, A. M. J. J., Boelens, R., and Kaptein, R., 1991b, J. Biomol. NMR. 1:305.
Borgias, B. A., Gochin, M., Kerwood, D. J., and James, T. L., 1990, Prog. NMR Spectrosc. 22:83. Borgias, B. A., and James, T. L., 1990, J. Magn. Reson. 87:475. Bothner-By, A. A., and Noggle, J. H., 1979, J. Am. Chem. Soc. 101:5152. Braun, W., and Go, N., 1983, J. Mol. Biol. 186:613. Brünger, A. T., 1993a, X-PLOR, Version 3.1. A System for X-Ray Crystallography and NMR, Yale University Press, London.
Brünger, A. T., Clore, G. M., Gronenborn, A. M., Saffrich, R., and Nilges, M., 1993b, Science 261:328. Clore, G. M., Wingfield, P. T., and Gronenborn, A. M., 1991, Biochemistry 30:2315.
Clore, G. M., Robien, M. A., and Gronenborn, A. M., 1993, J. Mol. Biol. 231:82. Dickerson, R. E., and Drew, H. R., 1981, J. Mol. Biol. 149:761. Dollwo, M. J., and Wand, J., 1993, J. Biomol. NMR. 3:205. Donne, D. G., Gozansky, E. K., and Gorenstein, D. G., 1995a, J. Magn. Reson. B106:156. Donne, D. G., Gozansky, E. K., Zhu, F. Q., Zhang, Q., Luxon, B. L., and Gorenstein, D. G., 1995b, Bull. Magn. Reson. 17:61. Gorenstein, D. G., Meadows, R. P., Metz, J. T., Nikonowicz, E. P., and Post, C. B., 1990, Advances in Biophysical Chemistry (Bush, C. A., ed.), JAI Press, Greenwich, p. 47. Gronenborn, A. M., and Clore, G. M., 1989, Biochemistry 28:5978. Grzesiek, S., Dobeli, H., Gentz, R., Garotta, G., Labhardt, A. M., and Bax, A., 1992, Biochemistry. 31:8180. Habazettl, J., Ross, A., Oschkinat, H., and Holak, T. A., 1992a, J. Magn. Reson. 97:511. Habazettl, J., Schleicher, M., Otlewski, J., and Holak, T. A., 1992b, J. Mol. Biol. 228:156. Havel, T. A., Kuntz, I. D., and Crippen, G. M., 1983, Bull. Math. Biol. 45:665. Holak, T. A., Habazettl, J., Oschkinat, J., and Otlewski, J., 1991, J. Am. Chem. Soc. 113:3196. Kaluarachchi, K., Meadows, R. P., and Gorenstein, D. G., 1991, Biochemistry 30:8785. Kay, L. E., Clore, G. M., Bax, A., and Gronenborn, A. M., 1990, Science 249:411. Keepers, J. W., and James, T., 1984, J. Magn. Reson. 57:404.
Kessler, H., Seip, S., and Saulitis, J., 1991, J. Biomol. NMR 1:83. Krishna, N. R., Agresti, D. G., Glickson, J. D., and Walter, R., 1978, Biophysical J. 24:791. Leontis, N. B., Kwok, W., and Newman, J. S., 1991, Nucleic Acids Res. 19:759.
Leontis, N. B., Hills, M. T., Piotto, M., Ouporov, I. V., Malhotra, A., Nussbaum, J., and Gorenstein, D. G., 1993, J. Biomol. Struct. Dyn. 11:215.
Leontis, N. B., Hills, M. T., Piotto, M., Ouporov, I. V., Malhotra, A., and Gorenstein, D. G., 1995, Biophys. J. 68:251. Macura, S., and Ernst, R. R., 1980, Mol. Phys. 41:95.
Meadows, R. P., Post, C. B., and Gorenstein, D. G., 1989, MORASS, Purdue University, West Lafayette. Meadows, R. P., Post, C. B., Kaluarachchi, K., and Gorenstein, D. G., 1991, Bull. Magn. Reson. 13:22. Metzler, W. J., Wang, C, Kitchen, D. B., Levy, R. M., and Pardi, A., 1990, J. Mol. Biol. 214:711. Nerdal, W., Hare, D. R., and Reid, B. R., 1989, Biochemistry 28:10008. Nikonowicz, E. P., and Gorenstein, D. G., 1992, J. Am. Chem. Soc. 114:7494. Nikonowicz, E. P., Meadows, R. P., and Gorenstein, D. G., 1990, Biochemistry 29:4193. Nilges, M., Gronenborn, A. M., Brünger, A. T., and Clore, G. M., 1988, Protein Eng. 2:27. Nilsson, L., Clore, G. M., Gronenborn, A. M., Brünger, A. T., and Karplus, M., 1986, J. Mol. Biol. 188:455.
Ouporov, I. V., and Leontis, N. B., 1995, Biophys. J. 68:266. Pardi, A., Hare, D. R., and Wang, C., 1988, Proc. Natl. Acad. Sci. U.S.A. 85:8785. Post, C. B., Meadows, R. P., and Gorenstein, D. G., 1990, J. Am. Chem. Soc. 112:6796. Roongta, V. A., 1989, Ph.D. Thesis, Purdue University, West Lafayette. Slijper, M., Bonvin, A. M. J. J., Boelens, R., and Kaptein, R., 1995, J. Magn. Reson. B107:298. States, D. J., Haberkorn, R. A., and Ruben, D. J., 1982, J. Magn. Reson. 48:286.
Thomas, P. D., Basus, V. J., and James, T. L., 1991, Proc. Natl. Acad. Sci. U.S.A. 88:1237. Weiner, P. K., and Kollman, P. A., 1981, J. Comput. Chem. 2:287. Wüthrich, K., 1986, NMR of Proteins and Nucleic Acids, Wiley, New York. Wüthrich, K., 1989, Science 243:45. Yip, P. F., 1993, J. Biomol. NMR 3:361. Yip, P. F., and Case, D. A., 1989, J. Magn. Reson. 83:643. Zhang, Q., Chen, J., Gozansky, E. K., Zhu, F., Jackson, P. L., and Gorenstein, D. G., 1995, J. Magn. Reson. B106:164.
Zhao, D., and Jardetzky, O., 1994, J. Mol. Biol. 239:601. Zhu, F. Q., Donne, D. G., Gozansky, E. K., Luxon, B. L., and Gorenstein, D. G., 1996, Magn. Reson. Chem. 34:S125. Zuiderweg, E. R. P., Petros, A. M., Fesik, S. W., and Olejniczak, E. T., 1991, J. Am. Chem. Soc. 113:370.
Zuiderweg, E. R. P., Scheek, R., Boelens, R., van Gunsteren, R., and Kaptein, R., 1985, Biochemie 67:707.
6
Conformational Ensemble Calculations: Analysis of Protein and Nucleic Acid NMR Data
Anwer Mujeeb, Nikolai B. Ulyanov, Todd M. Billeci, Shauna Farr-Jones, and Thomas L. James
1. INTRODUCTION
Biological processes are generally governed by conformation-specific interactions of biomolecules. Molecular recognition is often modulated by an intrinsic flexibility of biomolecules, in particular, proteins and nucleic acids. Since conformational dynamics in some cases can be crucial, considerable effort has been devoted to developing methods to ascertain the dynamic structure of biomolecules. NMR has long been recognized as a rich means for probing the dynamics of biomolecules in solution (Jardetzky and Lefevre, 1994; Palmer, 1997). NMR parameters by which
Anwer Mujeeb, Nikolai B. Ulyanov, Todd M. Billeci, Shauna Farr-Jones, and Thomas L. James • Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California 94143-0446.
Biological Magnetic Resonance, Volume 17: Structure Computation and Dynamics in Protein NMR, edited by Krishna and Berliner. Kluwer Academic / Plenum Publishers, New York, 1999.
dynamic aspects of protein structures are characterized include spin–lattice and spin–spin relaxation rates, rotating-frame spin–lattice relaxation rates for $^{15}$N and $^{13}$C resonances, and heteronuclear $^{1}$H–$^{15}$N or $^{1}$H–$^{13}$C nuclear Overhauser effects (NOEs) to probe fast motions (Palmer, 1997; Li and Montelione, 1995; Lane, 1993; Gorenstein, 1994). Amide proton exchange rates reflect much slower motions (Koning et al., 1991). Deuterium relaxation measurements have also been used for probing side-chain dynamics in proteins. Furthermore, since biomolecules are not rigid bodies, internal motion and local flexibility must be taken into account during structural refinement (Schmitz et al., 1996; Nilges, 1996;
van Gunsteren et al., 1994). Standard methods for structure refinement, e.g., restrained molecular dynamics (rMD) and distance geometry (DG), yield a single rigid structure when used with the usual NMR structural restraints, i.e., interproton distances derived from nuclear
Overhauser effect spectroscopy (NOESY) cross-peak intensities and torsion angles derived from scalar coupling data. However, the nonlinear averaging of distances and angles that occurs with conformational fluctuations may compromise structural restraints. Some amelioration of the distortions in restraint values can be achieved by accounting for internal motions (Koning et al., 1991; Kumar et al., 1992; Liu et al., 1992), but a single rigid structure still results when using these methods. This
structure must be consequently understood as ensemble- and time-averaged. As we barely have sufficient data to define a single structure with high resolution, we certainly cannot define to high resolution each member of an ensemble of interconverting conformers. At best, we hope to improve our picture of the dynamic nature of proteins and nucleic acids. Challenges to construction of a dynamic high-resolution structure of biomolecules include
a. Calculating structural restraints with the best possible accuracy
b. Identifying internal inconsistencies in experimental NMR restraints, which may possibly arise from dynamics
c. Generating a large enough pool of molecular conformations to encompass all possible interconverting conformers
d. Assessing the pool of conformations to ascertain which are needed to account for all of the NMR data with minimal conflict
e. Evaluating how well the resulting ensemble represents the experimental NMR data
In this chapter, we outline the approaches being developed in our laboratory to address some of these challenges in generating accurate, precise, and possibly
dynamic images of biomolecules via NMR.
2. DETERMINATION OF STRUCTURAL RESTRAINTS
2.1. Interproton Distance Restraints
The preliminary requirement for structure elucidation by NMR is an accurate estimation of structural restraints. The two main sources of structurally sensitive information from NMR data are NOEs and coupling constants, NOEs being of principal importance. In homonuclear NMR, the NOE between two protons arises from the interaction between their magnetic dipole moments; the NOE can be observed if the protons are close in space. In simplest terms, the intensity $a_{ij}$ of a cross peak between protons i and j in a two-dimensional NOE (2D NOE) spectrum relates to the interproton distance $r_{ij}$ as
$$a_{ij} \propto r_{ij}^{-6} \qquad (1)$$
Interproton distances can be calculated from experimental values of NOE cross-peak intensities. These interproton distances then constitute the basis for structural refinement. However, the relationship of Eq. (1) becomes imprecise as a result of
multispin effects, so-called spin diffusion. The presence of numerous protons in a molecule constitutes a network of magnetization relaxation pathways. Each proton experiences multiple dipole–dipole interactions with neighboring protons, which is the basis of spin diffusion. Although one can use an isolated spin-pair approximation [Eq. (1)] to estimate the interproton distances, spin diffusion makes this method prone to systematic errors. In order to calculate accurate interproton distances, one must consider all relaxation pathways.
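As a rough illustration of the isolated spin-pair approximation discussed above, the following sketch scales an unknown distance against a reference cross peak of known distance using the inverse sixth-power relationship. The function name, the reference distance, and the intensities are hypothetical values chosen for illustration; they are not taken from any program described in this chapter.

```python
def ispa_distance(intensity, ref_intensity, ref_distance):
    """Estimate a distance from a cross-peak intensity by r^-6 scaling.

    Assumes the isolated spin-pair approximation: intensity ~ r**-6,
    calibrated against one reference pair of known distance.
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


# Hypothetical calibration: a fixed-distance pair at 2.45 A with intensity 0.10.
# A cross peak eight times weaker then maps to a distance larger by 8**(1/6).
r = ispa_distance(intensity=0.0125, ref_intensity=0.10, ref_distance=2.45)
print(f"estimated distance: {r:.2f} A")
```

Because the estimate depends only on the sixth root of the intensity ratio, it is fairly tolerant of integration errors, but it has no way to account for magnetization relayed through third spins, which is the systematic error the text attributes to spin diffusion.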
A matrix of NOE intensities, $\mathbf{a}$, at mixing time $\tau_m$ is related to a matrix of dipole–dipole relaxation rates, $\mathbf{R}$, by the matrix exponential equation (Keepers and James, 1984)
$$\mathbf{a}(\tau_m) = \exp(-\mathbf{R}\,\tau_m) \qquad (2)$$
where the off-diagonal relaxation rates $R_{ij}$ are proportional to the inverse sixth power of the interproton distance:
$$R_{ij} \propto r_{ij}^{-6} \qquad (3)$$
and the proportionality coefficients depend on the motional model (Ernst et al., 1987). In contrast to the relationship of Eq. (1), Eqs. (2) and (3) are exact: they hold in the presence or absence of spin diffusion. Equations (2) and (3) constitute the basis for the program CORMA (COmplete Relaxation Matrix Analysis) developed
in this laboratory (Keepers and James, 1984). CORMA calculates NOE intensities using either a full set of interproton distances or the atomic coordinates of a structure.
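The matrix exponential relationship can be illustrated with a minimal sketch for a hypothetical three-spin system. The rate matrix, the mixing time, and the helper names below are assumptions made for illustration and do not reproduce the CORMA implementation; the exponential is evaluated here with a simple scaled Taylor series rather than by matrix diagonalization.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, terms=30):
    """exp(m) by scaling and squaring with a truncated Taylor series."""
    n = len(m)
    norm = max(sum(abs(x) for x in row) for row in m)
    squarings = 0
    while norm > 0.5:          # halve until the series converges quickly
        norm /= 2.0
        squarings += 1
    scaled = [[x / 2 ** squarings for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):  # accumulate scaled**k / k!
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):  # undo the scaling: exp(m) = exp(m/2^s)^(2^s)
        result = mat_mul(result, result)
    return result


def noe_intensities(rate_matrix, mixing_time):
    """Intensity matrix a = exp(-R * tau_m)."""
    return mat_exp([[-r * mixing_time for r in row] for row in rate_matrix])


# Hypothetical rates in 1/s for a linear arrangement of protons 0-1-2.
# Protons 0 and 2 have no direct cross-relaxation term (R[0][2] = 0).
R = [[1.2, -1.0, 0.0],
     [-1.0, 2.2, -1.0],
     [0.0, -1.0, 1.2]]
a = noe_intensities(R, mixing_time=0.2)
print(f"a[0][1] = {a[0][1]:.4f}, a[0][2] = {a[0][2]:.4f}")
```

In this example protons 0 and 2 have no direct cross-relaxation term, yet the computed intensity between them is nonzero because magnetization is relayed through proton 1. This is the spin-diffusion contribution that the isolated spin-pair approximation ignores.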
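Equation (2) can be easily inverted, thus expressing a matrix of relaxation rates via the logarithm of a matrix of NOE intensities:
$$\mathbf{R} = -\frac{\ln \mathbf{a}(\tau_m)}{\tau_m} \qquad (4)$$
Together with inverted Eq. (3),
$$r_{ij} \propto R_{ij}^{-1/6} \qquad (5)$$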
one could obtain a means of calculating interproton distances from experimentally measured NOE intensities. Distances calculated in such a way would be accurate in the sense that they would not depend on the presence of spin diffusion in a molecule. Unfortunately, this is not a practical method to determine distances from the experimental NOE data. For any realistic system, the set of experimental NOE intensities is incomplete due to experimental limitations, such as peak overlap, limited spectral resolution, finite signal-to-noise ratio, and incompletely assigned
proton resonances. The algorithm MARDIGRAS (Borgias and James, 1990) developed in our laboratory solves this problem by supplementing the experimentally measured NOE intensities with theoretical intensities calculated from an initial (model) structure, thus forming a complete hybrid matrix of NOE intensities and a corresponding matrix of relaxation rates. Hybrid matrices are then modified iteratively to achieve consistency between all observed and nonobserved intensities and relaxation rates. The output consists of accurate interproton distances corresponding to experimentally observed NOE intensities. These distances are represented in the form of upper and lower bounds as well as an average value for each calculated
distance.
For both CORMA and MARDIGRAS calculations, an explicit motional model must be assumed in order to derive the proportionality coefficients in Eqs. (3) and (5). In most practical cases, it is assumed that a molecule undergoes overall isotropic tumbling with correlation time $\tau_c$, an assumption valid for roughly spherical molecules. However, in the case of larger modular proteins (e.g., those with a high long axis–to–short axis ratio) and longer nucleic acid duplexes, anisotropic tumbling should be assumed. An estimate of the overall correlation time is required to determine the distance information from 2D NOE data. The correlation time can be measured from heteronuclear relaxation data for $^{15}$N- or $^{13}$C-labeled samples, or it can be roughly estimated from experimentally measured values of spin–lattice and spin–spin relaxation rates of various protons. For small nucleic acids and proteins, the correlation time is on the order of a few nanoseconds. It is always advisable to estimate a range of correlation times and perform MARDIGRAS calculations using several values within that range. Also, one has to remember that $\tau_c$ depends on solvent viscosity and, most importantly, on the aggregation state of the molecule. Besides an overall tumbling motion, fast internal molecular motions that affect relaxation rates should also be taken into account during
relaxation matrix calculations. Rotational motions of methyl groups, flipping of aromatic rings, exchange of labile protons with bulk solvent, and effective internal motions expressed as order parameters (Lipari and Szabo, 1982) are some examples of processes that can be included in relaxation matrix calculations (Liu et al., 1992, 1993; Kumar et al., 1992).
A severe limiting factor in NMR-based structure determination can be the quality of measured 2D NOE intensities. In many cases the observed intensities are compromised by decreased signal-to-noise ratio, integration errors, and incomplete recovery of longitudinal magnetization of protons. Incomplete recovery of magnetization occurs when the delay between consecutive free induction decay (FID) acquisitions in an NMR experiment (repetition delay) is not sufficient for the longitudinal magnetization to reach its equilibrium value. Interproton distances can be biased if determined from an incompletely relaxed 2D NOE spectrum. A repetition delay about five times longer than the spin–lattice relaxation time $T_1$ is required to achieve nearly full recovery of proton magnetization. In certain cases, e.g., the H2 proton on adenine, $T_1$ values are often quite long, making it impractical to use long repetition delays. A program called SYMM was developed in our laboratory for correcting 2D NOE intensities for partial relaxation (Liu et al., 1996). SYMM uses the ratios of the below- and above-diagonal cross-peak intensities of a partially relaxed 2D NOE spectrum to calculate scaling factors that are used to correct the intensities. Alternatively, SYMM can adjust the NOE intensities by taking into account the experimentally measured $T_1$ values for individual protons and the experimentally used repetition delay value. It is desirable that NOE intensities be corrected prior to complete relaxation matrix calculations for spectra acquired with a very short repetition delay.
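The rule of thumb for the repetition delay can be checked with a short calculation. The sketch below assumes simple mono-exponential recovery of longitudinal magnetization starting from zero, which is an idealization; it is not the correction scheme used by SYMM, and the function name is hypothetical.

```python
import math


def recovered_fraction(repetition_delay, t1):
    """Fraction of equilibrium magnetization recovered after the delay.

    Assumes mono-exponential recovery from zero: 1 - exp(-delay / T1).
    """
    if t1 <= 0:
        raise ValueError("T1 must be positive")
    return 1.0 - math.exp(-repetition_delay / t1)


# Express the delay in units of T1 so the result is independent of the
# actual relaxation time.
for ratio in (1, 2, 3, 5):
    fraction = recovered_fraction(ratio, 1.0)
    print(f"delay = {ratio} x T1: {fraction:.3f} of equilibrium recovered")
```

Under this assumption a delay of five times the relaxation time recovers a little over 99% of the equilibrium magnetization, whereas a delay equal to the relaxation time recovers less than two thirds, so protons with long relaxation times are the ones whose cross peaks are most strongly attenuated at short delays.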
Quantitative errors may become increasingly significant for 2D NOE intensities measured at longer mixing times due to increased spin diffusion (Liu et al., 1995b). This is especially true for weak 2D NOE cross peaks representing large distances. In the case of proteins, large distances may involve long-range restraints that define the tertiary structure or global fold of the protein. Furthermore, because of error propagation, relatively small integration errors for strong cross peaks can also dramatically affect the distances derived from other weak NOE intensities. To overcome this problem, MARDIGRAS has recently been supplemented with an error analysis option (RANDMARDI) which simulates the effects of random spectral noise and errors in peak integration (Liu et al., 1995b). The program introduces a random error in each experimentally measured NOE intensity, and MARDIGRAS is then run for the perturbed set of intensities. The procedure is iterated (typically, 30–50 rounds) with the random errors varied within the user-defined limits. The resulting distance bounds from random-error MARDIGRAS calculations have enhanced accuracy, although the precision (i.e., the tightness of bounds) may be often decreased. This procedure has been applied to a DNA-psoralen complex (Liu et al., 1995b), a heteroduplex of an antisense DNA probe and its RNA target
sequence (Mujeeb et al., 1997), and a 17-mer RNA hairpin with a dynamic loop structure (Yao et al., 1997). In all cases RANDMARDI yielded more accurate distance restraints. Use of RANDMARDI in the case of the DNA-psoralen complex
study showed a clear advantage of this approach, affording a set of unbiased interproton distances. For the antisense DNA•RNA hybrid, analysis of interproton distances calculated with RANDMARDI permitted a successful analysis of conformational preferences of individual residues (see below). If possible, more than one starting model structure should be used in MARDIGRAS. Although MARDIGRAS shows little dependence on the starting model,
better starting structures yield somewhat more accurate distances. After initial refinements, resulting structures can be used as new starting models for the next round of distance calculations with MARDIGRAS. Further, it is often desirable
that calculations be performed on several sets of 2D NOE data collected at various mixing times. Typically, two to four 2D NOE spectra at different mixing times are
recorded, affording an increased number of distances as well as improved accuracy of the distance bounds. In the case of small molecules with short overall correlation times, NOE intensities may become small or negative. For such molecules, the rotating-frame NOE experiment, ROESY, is a more sensitive method. The CARNIVAL algorithm permits determination of interproton distances from ROESY intensities (Liu et al., 1995a) and is now part of the MARDIGRAS program. This method
corrects for the Hartmann–Hahn transfer of magnetization (HOHAHA), which otherwise leads to errors in distances calculated from ROESY data. The HOHAHA effect is accounted for by using scalar coupling constant values. In turn, the coupling constants can be experimentally determined or estimated by a Karplus relationship from a model structure.
Interproton distances calculated with MARDIGRAS should be carefully analyzed for internal disagreements. Such disagreements may arise due to experimental errors in NOE intensities such as resonance misassignments and integration errors, which may be readily identified and corrected. Alternatively, internal inconsistencies can be caused by conformational flexibility leading to dynamically averaged interproton distances (Ulyanov et al., 1995; Schmitz et al., 1996; Yao et al., 1997). Use of the complete relaxation matrix method has now been widely accepted for the calculation of interproton distances from 2D NOE intensities (Mujeeb et al., 1993; Sorensen et al., 1997; Glemarec et al., 1996; Farr-Jones et al., 1995). Besides the MARDIGRAS and CORMA programs, similar approaches are used in methods such as the hybrid matrix approach and refinement of the structure against measured intensities via back calculations (Huang et al., 1993; Zhang et al., 1995; Zhu and Reid, 1995; Görler and Kalbitzer, 1997; Fedoroff et al., 1997).
2.2. Coupling Constants and Torsion-Angle Restraints
Coupling-constant information as such or transformed into torsion-angle restraints constitutes another vital piece of NMR-derived structural data. Coupling-constant information, being vicinal in nature, provides local structural restraints. If such restraints are redundant, they may help characterize local motions. Interproton distances extracted from NOE data can be supplemented with coupling constants during the structure refinement process. Transforming vicinal coupling constants into bond torsion angles requires a well-parameterized Karplus function [see, e.g., Schmitz and James (1995)]. In the case of nucleic acids, several three-bond coupling constant values involving sugar protons are often estimated, providing insights into sugar pucker conformation and dynamics (Schmitz et al., 1990; Mujeeb et al., 1992; Conte et al., 1996). Homonuclear NMR methods such as phase-sensitive COSY, ECOSY, PECOSY, and double-quantum-filtered COSY (2QF-COSY) can be used to make direct or indirect measurements of coupling constants for small proteins, peptides, and oligonucleotide duplexes (Marion and Wüthrich, 1983; Griesinger et al., 1985; Bax and Lerner, 1988). However, line broadening due to increasing molecular size may prevent the direct measurement of coupling constants. Indirect determination of coupling constants involves fitting
experimental cross peaks to simulated ones using the SPHINX and LINSHA programs (Widmer and Wüthrich, 1987). This procedure has been successfully applied to fit 2QF-COSY peaks of DNA duplexes and a DNA•RNA hybrid (Celda et al., 1989; Schmitz et al., 1990; Mujeeb et al., 1992; Weisz et al., 1992; González et al., 1994). The program SPHINX generates a stick spectrum based on energy transitions for various COSY-type pulse sequences. Then LINSHA calculates effective 2D line shapes while accounting for experimental parameters such as
digital resolution, window functions, truncation effects, and resonance linewidths for the protons involved. Cross peaks are simulated for a set of coupling constant values and resonance linewidths and then fitted to experimental cross peaks. The
best fit provides coupling constants that are consistent with the experimental data. In the case of nucleic acids, this procedure provides an elegant method for determining sugar puckers via analysis of vicinal coupling constants using a modified Karplus function. [For a detailed description of the strategies involved in using SPHINX and LINSHA methods, see Schmitz and James (1995).]
Sugars in nucleic acids are nonflat five-membered rings whose conformers (puckers) are classified according to the atom that deviates most from the plane and according to the direction of the deviation. Conformers are called endo if the most deviating atom points toward the exocyclic C5′ position, and conformers with the opposite direction of deviation are called exo (Saenger, 1984). For example, the standard B-form model of DNA has C2′-endo puckers of deoxyriboses, and standard A-form models of both DNA and RNA have C3′-endo sugar puckers. If we assume constant bond
lengths in sugars, the sugar ring conformation will depend on four degrees of freedom. However, for many practical cases, sugar conformation can be described by just two parameters: pseudorotation phase angle P and maximum pucker amplitude (Altona and Sundaralingam, 1972; van Wijk et al., 1992). The typical range of the amplitude for undistorted rings is 30°–40°. Angle P is defined from 0° to 360° (or from –180° to 180°); C3′-endo sugar puckers have P between 0° and 36°, and C2′-endo puckers have P from 144° to 180°. Also, sugar conformations with P from –90° to 90° are often called northern, or N-conformers, and those with P from 90° to 270° are called southern, or S-conformers. Values of scalar coupling constants are directly related to the pseudorotation phase angle P and maximum pucker amplitude. In the case of rapidly interconverting sugar ring conformations, the effective coupling constant would be a population-weighted arithmetic average of coupling constants of all conformers involved. A rigid sugar conformation can be defined by only two endocyclic torsion angles. By overdefining the pucker via coupling-constant analysis, one has the opportunity to characterize the repuckering dynamics. We have used a simple two-state model where a fast jump between N- and S-conformers is modeled to fit the derived torsion angles (Schmitz et al., 1990; Mujeeb et al., 1992; Weisz et al., 1992). For DNA duplexes, a major population of S-conformer (70%–95%) has been determined using this method.
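The population-weighted averaging in the two-state model can be sketched as a small least-squares problem. The code below assumes that each observed coupling is a linear mixture of the values for pure N- and S-conformers and solves for the S fraction in closed form. The function name and all coupling values are hypothetical, chosen only so that the arithmetic is easy to follow; real limiting couplings would come from a Karplus parameterization for the sugar ring.

```python
def fit_s_fraction(j_obs, j_north, j_south):
    """Least-squares S-conformer fraction p for J_obs = p*J_S + (1 - p)*J_N.

    Each list holds one value per coupling constant, in Hz.
    The result is clipped to the physically meaningful range [0, 1].
    """
    num = sum((jo - jn) * (js - jn)
              for jo, jn, js in zip(j_obs, j_north, j_south))
    den = sum((js - jn) ** 2 for jn, js in zip(j_north, j_south))
    if den == 0:
        raise ValueError("N and S couplings are identical; fraction undefined")
    return min(1.0, max(0.0, num / den))


# Hypothetical limiting couplings (Hz) for two sugar proton pairs.
j_north = [1.0, 8.0]
j_south = [10.0, 1.0]
# Hypothetical observed values, constructed as an 80:20 S:N mixture.
j_obs = [8.2, 2.4]
p_south = fit_s_fraction(j_obs, j_north, j_south)
print(f"S-conformer fraction: {p_south:.2f}")
```

Using more couplings than unknowns is what makes the pucker overdetermined: if the observed values cannot be reproduced by any single fraction, the residual of the fit signals that the two-state assumption or the limiting couplings need revisiting.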
In contrast to the above strategy, which involves translating the J-coupling constants into torsion angles, the direct use of experimental coupling constants during restrained molecular dynamics refinement has also been explored (González et al., 1995). Experimental J-coupling constants were applied as a flat-well energy
term in addition to the AMBER 4.1 force field. The flat-well width expresses the accuracy of experimental coupling constant values with explicit upper and lower
bounds. Similarly to the distance restraints, this term adds a penalty to the total energy when the calculated coupling constant is beyond the experimental bounds. Theoretical J-coupling constants during AMBER simulations are calculated from
the trajectory coordinates using a generalized Karplus function.
Methods involving the direct measurement of coupling constants as well as fitting procedures often fail in the case of larger proteins, as the analysis is often hampered by increased natural proton linewidths and signal overlap in molecules of increasing molecular weight. For proteins, emphasis mostly has been on three-bond couplings related to the backbone dihedral angle $\phi$ and the side-chain torsion angle $\chi_1$. However, recent improvements in $^{13}$C and $^{15}$N isotopic enrichment of proteins have made it possible to measure many heteronuclear J-couplings using multidimensional heteronuclear NMR experiments. Pulse sequences for the measurement of three-bond couplings have been developed and improved (Bax et al., 1994; Case et al., 1994). Information regarding J-coupling constants in proteins helps in stereospecific assignments of prochiral groups such as methylene protons and the methyl groups of amino acids such as leucine and valine (Basus, 1989). It is also possible to use the J-coupling data for prochiral protons during
refinement or conformational search without prior stereospecific assignments (Constantine et al., 1995). In the absence of stereospecific assignments, the use of “floating” chiralities during structure refinement also enhances the quality of the calculated structures. Floating chiralities may be assigned to all nondegenerate NMR signals from methylene and isopropyl groups and their fitting can be monitored statistically (Beckman et al., 1993) or via an NOE energy term (Folmer et al., 1997). Finally, coupling-constant data not only constitute a vital part of structural information, but also provide a tool for assessing the quality of structures calculated using NOE restraints. One can make use of J-coupling information for cross-validation.
For this purpose, coupling-constant information should be excluded from the refinement process. Subsequently, theoretical coupling constants can be calculated for the resulting structure(s) and compared with experimental data (Ulyanov et al., 1995). However, a prerequisite for this approach is a sufficient number of NOE restraints which allows efficient structure refinement without torsion-angle restraints.
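The cross-validation idea can be sketched as follows: torsion angles taken from a structure refined without coupling-constant restraints are converted to theoretical couplings with a Karplus relation and compared with experiment. The coefficients, angles, and experimental values below are illustrative assumptions only; a real analysis would use a parameterization from the literature appropriate to the coupling in question, together with the correct phase offset between the torsion angle and the dihedral angle of the coupled nuclei.

```python
import math


def karplus(theta_deg, a, b, c):
    """Generic Karplus relation J = A cos^2(theta) + B cos(theta) + C (Hz).

    theta_deg is the dihedral angle between the coupled nuclei, in degrees.
    """
    t = math.radians(theta_deg)
    return a * math.cos(t) ** 2 + b * math.cos(t) + c


def rms_deviation(calculated, observed):
    """Root-mean-square deviation between two equally long value lists."""
    if len(calculated) != len(observed) or not observed:
        raise ValueError("need two non-empty lists of equal length")
    squares = sum((c - o) ** 2 for c, o in zip(calculated, observed))
    return math.sqrt(squares / len(observed))


# Illustrative coefficients in Hz (an assumption, not a recommendation).
A, B, C = 6.4, -1.4, 1.9
model_dihedrals = [-60.0, -120.0, 180.0]  # hypothetical angles from a structure
j_experimental = [4.2, 9.1, 9.5]          # hypothetical measured couplings, Hz
j_calculated = [karplus(t, A, B, C) for t in model_dihedrals]
rmsd = rms_deviation(j_calculated, j_experimental)
print(f"RMS deviation between calculated and observed J: {rmsd:.2f} Hz")
```

Because the couplings were not used as restraints, a small deviation here is independent evidence that the NOE-derived structure is reasonable, while a large deviation for specific residues may point to local errors or to conformational averaging.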
2.3. Other Types of Restraints
In certain systems, NOE intensities and coupling constants can be supplemented with other types of structural restraints. In the case of molecules containing anisotropic paramagnetic centers, such as metalloproteins and spin-labeled nucleic acids, the paramagnetic effects on chemical shifts may be used as additional restraints (Gochin and Roder, 1995; Salgueiro et al., 1997). Semiquantitative estimations of distance restraints from paramagnetic relaxation parameters can also be an important adjunct to standard NMR information (Gillespie and Shortle, 1997). Unlike conventional NMR information, the effects of paramagnetic shifts are long range, and inclusion of such information holds an obvious promise of enhanced quality of calculated structures. A systematic dependence of chemical shifts on secondary and tertiary protein structure has also been established (Wishart and Sykes, 1994a, 1994b). Efforts have been made to include this type of information directly in structure refinements (Ösapay et al., 1994), although with limited success in nonmetalloproteins. The limitations arise from the complexity of the physical interactions that determine the value of the chemical shift (Case et al., 1994). So, although for the moment direct use of chemical-shift data may not be feasible in high-resolution structure calculations, they do contain useful structural information that can possibly be used indirectly during refinement.
2.4. Indices of Agreement
After the structure (or structural ensemble) has been determined via NMR, all predicted NMR parameters must be compared with the original experimental data. A
number of NOE-based agreement indices can be calculated with the program CORMA. Among them of note is a residual index $R$, which is analogous to a crystallographic R-factor:
$$R = \frac{\sum_i \left| A_o(i) - A_c(i) \right|}{\sum_i A_o(i)} \qquad (6)$$
where $A_o(i)$ and $A_c(i)$ are observed and calculated NOE intensities, respectively, and the summation is performed over all observed cross peaks. However, due to the equal weighting of all deviations, such an R-factor is dominated by strong intensities corresponding to short distances. The term $\left| A_o(i) - A_c(i) \right|$ may be negligibly small if both calculated and observed intensities are weak and correspond to long, but nevertheless very different, distances. At the same time, such long distances can be critical for determining the correct tertiary fold of the molecule. A more sensitive index of agreement for NOE cross-peak intensities is a sixth-root-weighted factor $R^x$, which accounts for the approximate sixth-root dependence of interproton distances on NOE intensities (Thomas et al., 1991):
$$R^x = \frac{\sum_i \left| A_o(i)^{1/6} - A_c(i)^{1/6} \right|}{\sum_i A_o(i)^{1/6}} \qquad (7)$$
In addition to the indices of agreement (e.g., $R$ or $R^x$) based on all observed NOE intensities (“total” R-factors), it is often useful to split the observed intensities into several groups and calculate the indices for each group separately. For example, R-factors calculated for the subset of intensities that correspond to fixed interproton distances (such as distances between geminal protons or protons on aromatic rings) are not very sensitive to the details of a molecular structure; however, such R-factors are more sensitive to the motional model used and, in particular, to the overall rotational correlation time (see above). NOE-based R-factors are routinely split in the program CORMA into the intra- and interresidue components, the latter being more sensitive to the tertiary structure. Also, R-factors can be calculated for each residue separately. In addition to NOE-based R-factors, figures of merit can be calculated for interproton distances, scalar coupling constants (Ulyanov et al., 1993), and relaxation rates (Sec. 4.2). Analysis of such a multitude of agreement indices may seem tedious, but it can indicate potential problems with the structure refinement, and, in some cases, it can reveal possible conformational averaging.
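The difference in sensitivity between the residual index and its sixth-root-weighted variant can be seen in a small numerical sketch. The function names and the intensity values are hypothetical, and the code assumes nonnegative intensities so that the sixth roots are real; it is an illustration of the two definitions, not the CORMA implementation.

```python
def r_factor(observed, calculated):
    """Residual index: sum |A_o - A_c| / sum A_o over observed cross peaks."""
    return (sum(abs(o - c) for o, c in zip(observed, calculated))
            / sum(observed))


def sixth_root_r_factor(observed, calculated):
    """Sixth-root-weighted index: the same ratio applied to A**(1/6).

    Assumes all intensities are nonnegative.
    """
    root = 1.0 / 6.0
    return (sum(abs(o ** root - c ** root)
                for o, c in zip(observed, calculated))
            / sum(o ** root for o in observed))


# Hypothetical data: one strong peak reproduced exactly, and one weak peak
# whose calculated intensity is eight times too large (a distance error of
# about a factor of 8**(1/6), i.e. roughly 40%).
observed = [0.20, 0.001]
calculated = [0.20, 0.008]
print(f"R   = {r_factor(observed, calculated):.3f}")
print(f"R^x = {sixth_root_r_factor(observed, calculated):.3f}")
```

With these numbers the unweighted index stays small because the weak peak contributes little to either sum, whereas the sixth-root-weighted index is several times larger and flags the poorly reproduced long distance.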
3. ASSESSMENT OF CONFORMATIONAL FLEXIBILITY
The presence of multiple conformations in solution can be assessed by a variety of NMR methods. Especially well established are the methods to probe local
dynamics of the backbone and side chains in proteins using heteronuclear NMR of $^{15}$N and $^{13}$C nuclei [for a review, see, e.g., Palmer (1997)]. In the case of homonuclear NMR on protons, one of the indications of conformational averaging is a so-called exchange broadening of proton NMR lines, which can be quantified by relaxation measurements (Schmitz et al., 1992b; Lane et al., 1993; Blackledge et al., 1993; McAteer et al., 1995) and then interpreted in terms of particular motional models.
Here we will be concerned with the situation when several conformations are in fast exchange with each other. Such conformations give rise to a single set of NMR signals, and structural restraints derived from NOE intensities or scalar
coupling constants are nonlinear averages. Attempts to satisfy all these restraints in a single rigid structure may be unsuccessful or even yield a nonphysical conformation. One of the first well-documented examples of such a situation is antamanide, a cyclic decapeptide. In this case, rMD calculations were first used to prove that all experimental restraints could not be satisfied in any single structure (Kessler et al., 1988), which served later as a basis for identification of possible conformers (Blackledge et al., 1993). In another example, a model 17-residue peptide, evidence of multiple conformations was obtained after careful analysis of NOE cross-peak patterns (Merutka et al., 1993). On the one hand, a pattern of short- and medium-range NOE cross peaks showed that the peptide is highly helical; on the other hand, certain long-range cross peaks indicated that nonhelical but structured conformers exist at the same time. In yet another example, it was observed for a 17-nucleotide hairpin-loop RNA that an adenine H2 proton from the loop region had cross peaks simultaneously with protons of five(!) different residues (Yao et al., 1997). Clearly, such a situation could arise only for a highly flexible loop, but at the same time individual conformers must exist long enough to give rise to the observed NMR signals. The last example involves sugar pucker dynamics in nucleic acids. As discussed in Sec. 2.2, scalar coupling data indicate that deoxyriboses are in dynamic equilibrium for DNA duplexes and DNA•RNA hybrids. Indeed, experimental J-couplings could not be explained by any single sugar conformation, but they were consistent with an equilibrium of C2′-endo and C3′-endo puckers.
This conclusion can be corroborated with NOE data as well: some intraresidue NOE-derived interproton distances between sugar protons and base protons (H6 of pyrimidines or H8 of purines) were typical of C2′-endo sugars (as in B-DNA), while others were suggestive of C3′-endo puckers (as in A-DNA) (Ulyanov et al., 1995; Schmitz and James, 1995). From the examples discussed here, it is clear that redundant experimental restraints are required in order to reveal potential conformational averaging. This explains the scarcity of well-documented cases of conformational flexibility: solution structures are more typically underdetermined by the NMR data than overdetermined. In any case, the first logical step in assessing NMR-derived restraints is a conventional refinement assuming a single rigid structure. For example, the average structure of an antisense DNA•RNA hybrid determined by rMD with NOE-derived distance restraints displayed an O4′-endo sugar pucker for the DNA strand (intermediate between C2′-endo and C3′-endo). This was in contradiction to the experimentally derived J-coupling information, which suggested a dynamic equilibrium for deoxyriboses (González et al., 1995).
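A short worked example may help show why an averaged restraint can be incompatible with every individual conformer. It rests on an assumption not stated above: that the NOE-derived distance reflects an $r^{-6}$ average over conformers (the form usually taken when exchange is slow compared with overall tumbling). The populations $p_k$ and distances $r_k$ are hypothetical.

```latex
\begin{align*}
r_{\mathrm{eff}} &= \Bigl( \sum_k p_k \, r_k^{-6} \Bigr)^{-1/6} \\
% two equally populated conformers with r_1 = 2.5 A and r_2 = 5.0 A
r_{\mathrm{eff}} &= \Bigl[ \tfrac{1}{2} \bigl( 2.5^{-6} + 5.0^{-6} \bigr) \Bigr]^{-1/6}
                  \approx 2.80~\text{\AA}
\end{align*}
```

The apparent distance of about 2.8 Å is far from the arithmetic mean of 3.75 Å and matches neither conformer, so a tight restraint built around it cannot be satisfied by either structure alone.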
4. ENSEMBLE CALCULATIONS
4.1. Overview

In recent years, several methods have been put forward for determining structural ensembles from NMR data [reviewed by Ulyanov et al. (1998)]. One of the most popular is so-called time-averaged molecular dynamics (MDtar), developed by Torda et al. (1990). Unlike standard rMD, in which all restraints are imposed at each time step of the simulation, MDtar requires only that the experimental restraints be satisfied on a time-average basis over the whole trajectory. MDtar has better sampling properties than rMD, but most importantly it generates a trajectory that explains the experimental data with an ensemble of structures. When this method was applied to the hybrid mentioned in the previous section, the resulting structural ensemble satisfied all the restraints simultaneously, and better than any single structure did (González et al., 1995; Schmitz and James, 1995). Similarly, the MDtar trajectory calculated for the 17-nucleotide RNA better explained the multiple NOE cross peaks observed for the loop adenine (Yao et al., 1997). One of the shortcomings of this method is that it produces very large ensembles of structures. Not only are such ensembles not unique (clearly, one never has enough experimental data to define unambiguously each member of the ensemble), but it is also very difficult to analyze such trajectories and to draw structural and functional conclusions from them. To alleviate this problem, various clustering techniques can be used to select representative structures. Of note are the identification of time clusters in the MDtar trajectories (Yao et al., 1997) and hierarchical cluster analysis based on atomic root-mean-square deviations (RMSD) with the NMRCLUST program (Kelley et al., 1996). However, the agreement with experimental data can be compromised when going from large MDtar trajectories to a limited set of representative structures (Yao et al., 1997). In the next section we will discuss a method, PARSE, which is designed to determine smaller ensembles of structures that are still sufficient to explain the experimental data in the case of conformational averaging.

An important methodological development in this field is the method of “multiple-copy refinement.” This approach involves simultaneous rMD refinement of several conformations, assuming equal populations (Bonvin and Brünger, 1995; Kemmink and Scheek, 1995). Unlike standard (“single-copy”) rMD refinement, distance restraints are imposed here not on individual interproton distances, but on distances that are ensemble-averaged over all copies of a molecule. The method appears to be very promising in situations where a few conformations exist in almost equal populations. A variant of multiple-copy refinement was proposed by Fennen et al. (1995), in which the contribution of each copy is weighted by a Boltzmann factor calculated from its conformational energy.

In another group of methods, structural ensemble determination is separated into two independent parts: generation of a pool of potential conformers, and assessment of their probabilities (Brüschweiler et al., 1991; Ulyanov et al., 1995; Pearlman, 1996). One of these approaches is discussed in detail in the next section.
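The difference between single-copy and ensemble-averaged restraints can be made concrete with a small sketch. This is only an illustration, not the implementation used in any of the cited programs: it assumes $r^{-6}$ averaging across copies, and the function names, distances, energies, bounds, and the value of kT are all hypothetical choices.

```python
import math


def boltzmann_weights(energies, kT=0.593):
    """Normalized Boltzmann weights; kT in the same units as the energies
    (0.593 kcal/mol is roughly kT near 298 K; an assumption for this sketch)."""
    e_min = min(energies)  # shift by the minimum for numerical stability
    factors = [math.exp(-(e - e_min) / kT) for e in energies]
    total = sum(factors)
    return [f / total for f in factors]


def ensemble_distance(distances, weights=None, power=6):
    """Ensemble-averaged distance <r^-power>^(-1/power) over all copies."""
    if weights is None:  # equal populations, as in plain multiple-copy refinement
        weights = [1.0 / len(distances)] * len(distances)
    mean_inverse = sum(w * r ** (-power) for w, r in zip(weights, distances))
    return mean_inverse ** (-1.0 / power)


def violation(distance, lower, upper):
    """Amount by which a distance falls outside the [lower, upper] bounds."""
    if distance < lower:
        return lower - distance
    if distance > upper:
        return distance - upper
    return 0.0


# Hypothetical interproton distance (angstroms) in three copies of a molecule
copy_distances = [2.6, 3.4, 5.5]
lower, upper = 2.8, 3.2  # hypothetical NOE-derived bounds

single_copy = [violation(r, lower, upper) for r in copy_distances]
averaged = ensemble_distance(copy_distances)
weighted = ensemble_distance(copy_distances, boltzmann_weights([0.0, 0.5, 1.0]))

print("single-copy violations:", single_copy)
print("equal-population average:", round(averaged, 2))
print("Boltzmann-weighted average:", round(weighted, 2))
```

With these illustrative numbers every copy violates the bounds on its own, whereas the equally weighted ensemble average (about 3.0 Å) satisfies them; weighting by energy shifts the average toward the lowest-energy copy.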
4.2. Relaxation-Rate-Based Probability Calculations

The algorithm PARSE (Probability Assessment via Relaxation rates of a Structural Ensemble), developed in our laboratory (Ulyanov et al., 1995), splits the problem of structural ensemble determination from NMR data into two independent steps. First, a pool of potential conformers is generated; then the pool is assessed to find which conformers, with which probabilities, give the best agreement with experimental data.

For the first step, PARSE uses the idea originally implemented in the algorithm MEDUSA by Ernst and co-workers (Brüschweiler et al., 1991): the total set of experimental restraints is subdivided into several groups, and the structure is refined repeatedly against each subset. The rationale for this procedure should be clear from the previous section: in the case of conformational averaging, the experimental restraints may be internally inconsistent. Therefore, removing conflicting restraints from the total set, several at a time, and refining against the reduced set is expected to produce a distinct conformation each time. While not every conformation produced in this way necessarily represents a true solution conformer, the conformational space covered by the calculated pool of structures is expected to reflect roughly the true conformational flexibility. For small molecules with a small total number of restraints, all possible restraint subsets can be considered (Blackledge et al., 1993). For bigger molecules, a preliminary assessment of flexibility (see above) is necessary, the goal of which is the identification of conflicting restraints.

For the second step, the identity and probabilities of the conformers in the optimal ensemble are calculated with the program PDQPRO (Ulyanov et al., 1995). PDQPRO uses a quadratic programming algorithm (Fletcher, 1981) for the global constrained minimization of a relaxation-rate-based index of agreement (the constraints are that the probabilities must be nonnegative and sum to unity). This approach makes use of the fact that for rapidly exchanging conformers, the effective dipole–dipole relaxation rate is a linear population-weighted average of rates of
individual conformers. For a given theoretical conformational ensemble (i.e., a set of structures with assigned probabilities $p_k$), the effective relaxation rates are

$$R_{ij}^{\mathrm{calc}} = \sum_k p_k R_{ij}^{k}$$

where $R_{ij}^{k}$ is a relaxation rate for proton pair (i,j) in the conformer $k$, and the summation is performed over all conformers. Then the problem of ensemble determination can be formulated as fitting the theoretical rates $R_{ij}^{\mathrm{calc}}$ to the observed rates $R_{ij}^{\mathrm{obs}}$ by varying probabilities $p_k$. The PDQPRO program performs this fitting by finding the global minimum of the quadratic objective function

$$Q = \sum_{i,j} w_{ij} \left( R_{ij}^{\mathrm{calc}} - R_{ij}^{\mathrm{obs}} \right)^2 \qquad (9)$$

under the constraints $p_k \ge 0$ and $\sum_k p_k = 1$. The summation is carried out over all observed cross peaks (i,j) in Eq. (9); weights $w_{ij}$ can be used to regulate the relative contribution of different cross peaks. Theoretical relaxation rates for each potential conformer can be calculated with the program CORMA using Eq. (3), and experimental relaxation rates can be derived from NOE intensities with MARDIGRAS using Eq. (4).
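The fitting step can be sketched in a few lines. The code below is not PDQPRO and does not use Fletcher's quadratic programming algorithm; as a stand-in, it minimizes the same kind of weighted least-squares objective by projected gradient descent, keeping the probabilities nonnegative and summing to one. All function names and the synthetic rates are hypothetical, and the sketch assumes at least one nonzero rate.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {p : p_k >= 0, sum_k p_k = 1}."""
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(sorted(v, reverse=True), start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t  # keep the threshold from the last index that passes
    return [max(x - theta, 0.0) for x in v]


def objective(p, rates, observed, weights):
    """Q(p) = sum_m w_m * (sum_k p_k * rates[k][m] - observed[m])**2."""
    q = 0.0
    for m, r_obs in enumerate(observed):
        r_calc = sum(p_k * conf[m] for p_k, conf in zip(p, rates))
        q += weights[m] * (r_calc - r_obs) ** 2
    return q


def fit_populations(rates, observed, weights=None, n_iter=5000):
    """Fit conformer probabilities; rates[k][m] is the rate of proton pair m in conformer k."""
    n_conf, n_pairs = len(rates), len(observed)
    if weights is None:
        weights = [1.0] * n_pairs
    # Conservative step size: the gradient's Lipschitz constant is at most
    # 2 * sum over k, m of w_m * rates[k][m]**2
    bound = 2.0 * sum(weights[m] * rates[k][m] ** 2
                      for k in range(n_conf) for m in range(n_pairs))
    step = 1.0 / bound
    p = [1.0 / n_conf] * n_conf  # start from equal populations
    for _ in range(n_iter):
        residuals = [sum(p[k] * rates[k][m] for k in range(n_conf)) - observed[m]
                     for m in range(n_pairs)]
        gradient = [2.0 * sum(weights[m] * residuals[m] * rates[k][m]
                              for m in range(n_pairs)) for k in range(n_conf)]
        p = project_to_simplex([p[k] - step * gradient[k] for k in range(n_conf)])
    return p


# Synthetic rates (arbitrary units) for three conformers and four proton pairs
rates = [
    [1.0, 0.2, 0.5, 0.1],
    [0.1, 1.0, 0.3, 0.6],
    [0.4, 0.3, 1.2, 0.9],
]
# "Observed" rates built from a 70:30 mixture of the first and third conformers
observed = [0.7 * a + 0.3 * c for a, c in zip(rates[0], rates[2])]

populations = fit_populations(rates, observed)
print([round(p, 3) for p in populations])
```

Because the synthetic data are an exact mixture of two conformers, the fit assigns essentially zero probability to the conformer that was not used. With real data the residual would not vanish, and the selected set would depend on how well the pool samples the conformational space.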
5. EXPERIMENTAL EXAMPLES

In this section we will illustrate the two steps of the approach outlined above using preliminary data for a small peptide and a nucleic acid as examples. It is clear from the previous section that conformers’ probabilities can be assessed with PDQPRO independently of the sampling method. In the first example, we will apply PDQPRO to a very small pool of conformations calculated during conventional NMR refinement. In the second example, we will show how conformational sampling can be carried out by partitioning the distance restraints into a series of self-consistent subsets.
5.1. MVIIC

A 26-residue peptide, MVIIC, from the sea snail Conus magus is a channel blocker which binds with high affinity to voltage-sensitive channels in neurons. The peptide has three disulfide bonds: between Cys 1 and Cys 16, between Cys 8 and Cys 20, and between Cys 15 and Cys 26. The solution structure of this peptide has been determined using NMR (Farr-Jones et al., 1995). Interproton distance restraints were calculated from homonuclear 2D NOE cross-peak intensities using MARDIGRAS. The distances were input to DG, which produced 15 structures; these structures were subsequently refined with rMD simulated annealing using AMBER as described (Farr-Jones et al., 1995). The final ensemble of 15 refined structures is shown in Fig. 1 (left). This kind of structural ensemble is a typical result of NMR refinement: the global fold is well defined by the NMR restraints (the backbone atomic RMSD is 0.84 Å), but some variation in the backbone geometry and side-chain conformations (e.g., Tyr 13) is clearly seen. We must emphasize that the set of structures shown here (Fig. 1, left) is different from the structural ensembles discussed in the previous section: all 15 structures are equally valid (or invalid), each was calculated using the same protocol, and no probability is associated with any of them. The conformational variations in the 15 structures may reflect the true conformational flexibility in solution or just the degree to which the structure is left undetermined by the available NMR restraints.

The theoretical relaxation rates for the 15 structures were calculated using CORMA, and the experimental rates were derived from the original NOE intensities using MARDIGRAS. After that, the index [Eq. (9)] was optimized with PDQPRO, assuming equal weights. The structural ensemble with the minimum index consists of just seven structures with nonzero probabilities: 32.0, 30.8, 17.8, 9.4, 4.6, 3.9, and 1.5%. The resulting ensemble is shown in Fig. 1 (right) with the diameter and darkness of the backbone proportional to the probability of the structure. It is seen that the conformational variation is significantly reduced for the PDQPRO ensemble, although it has not disappeared completely. For example, the side chain of Tyr 13 still occupies two distinct positions in the conformational space, but the probability of one of them is a mere 1.5%. It is trivial that the relaxation-rate-based index improved for the PDQPRO ensemble (3.61) compared to individual structures (4.38–7.95, Table 1), because this index was optimized during the PDQPRO calculation. It is noteworthy, however, that the NOE-based R-factors also improved, especially their interresidue components, which are more sensitive to the tertiary fold (Table 1). Still, it is not yet clear to what extent the PDQPRO ensemble represents the solution conformers, because the improvement in the indices of agreement was not that dramatic. A conservative interpretation of the result is that the eight structures eliminated are not necessary to explain the existing NMR data.

5.2. Nucleic Acid Example
We have recently carried out a high-resolution proton NMR study on a hybrid duplex formed by a methylphosphonated DNA strand and a complementary RNA strand (Mujeeb et al., 1997). Here, MP stands for the methylphosphonate linkages in a pure stereoconfiguration; MP alternates with the usual phosphodiester linkage in the DNA strand. Our investigations suggested that this hybrid is dynamic, and that its average structure is different from the standard A- and B-forms. Indeed, the absence of the cross peaks for the RNA strand indicated that the corresponding scalar coupling constants were small, which is typical of A-like ribose conformations. In contrast, deoxyriboses in the MP-DNA strand exhibited strong, detectable cross peaks and nearly similar patterns for the cross peaks in the 2QF-COSY spectrum. Similar observations made for another hybrid were explained by highly flexible sugar puckers in the DNA strand (González et al., 1994, 1995; Schmitz et al., 1996). We will illustrate how the idea of PARSE can be used to calculate an ensemble of representative structures which explain the experimental data.

Interproton distance restraints were calculated using MARDIGRAS for the 2D NOE data acquired at three mixing times (50, 150, and 350 ms). The RANDMARDI approach was applied to account for spectral noise and integration errors as described in Sec. 2.1. Three initial model structures were used for MARDIGRAS calculations, which corresponded to the A-form, the B-form, and a mixed conformation with C3′-endo sugar puckers for the RNA strand and O4′-endo puckers for the MP-DNA strand (O4′-endo sugar puckers have pseudorotation phase angle P from 72° to 108°, intermediate between those for C3′-endo and C2′-endo). Comparison of the calculated distances with the three initial structures showed that calculated distances in the RNA strand were consistent with the values expected for the A-form. However, the situation was mixed for the MP-DNA strand. Some of the distances (e.g., intraresidue distances between a sugar proton and base proton H6 or H8) were typical of A-DNA, other distances (e.g., interresidue sequential distances) were consistent with B-form DNA, and yet other distances had values intermediate between A- and B-forms. Such apparently conflicting interproton distances suggested a significant degree of conformational averaging in the MP-DNA residues. A comparison of experimental NOE cross-peak intensities with NOE intensities calculated via CORMA for the three model structures (using an index of agreement computed for each residue separately; see Sec. 2.4) yielded qualitatively similar results.

We attempted to produce a set of hybrid conformations which may represent the flexibility of this molecule in solution and explain the experimental interproton distances. For that purpose, we constructed a series of distance-restraint subsets by excluding, one at a time, the conflicting restraints from the total set. Altogether, 144 subsets were generated, and the structure was refined against each of them. The refinement included a 10-ps in vacuo rMD run using standard protocols described elsewhere (Mujeeb et al., 1993). Normally, such a procedure is known as “cross validation” (Brünger et al., 1993; Weisz et al., 1994). When a molecule has one major conformation in solution, all experimental restraints must be self-consistent. In such a case, exclusion of a small percentage of the restraint set is not expected to affect the refined structure. However, if experimental restraints are subject to conformational averaging, refinement with just a few critical restraints removed may lead to a dramatically different structure (Ulyanov et al., 1995, 1998). This turned out to be the case for the MP-DNA•RNA hybrid. Refinements against the 144 distance-restraint subsets led to many distinct conformations with a variety of sugar puckers in the MP-DNA strand.

Subsequently, the PDQPRO calculations were carried out for the pool of 144 potential conformers. Sets of theoretical dipolar relaxation rates were calculated for each of the 144 conformations using CORMA. Experimental relaxation rates were calculated with MARDIGRAS for experimental 2D NOE sets at three mixing times. A fourth set of experimental relaxation rates was created by averaging the rates for the three mixing-time data sets. The PDQPRO calculations selected about five conformers with nonzero probabilities for each of the four sets of experimental rates. Altogether, 10 structures were selected, with some conformations picked in more than one set of calculations, although with different probabilities. Figure 2 shows the distribution of sugar puckers in the selected structures. In agreement with the qualitative results of the 2QF-COSY cross-peak analysis, the riboses in the RNA strand have a relatively narrow distribution around C3′-endo puckers (top), while deoxyriboses in the MP-DNA strand populate a broad range of sugar conformations (bottom). It is interesting that the flexibility of sugars in the MP-DNA strand of the hybrid surpasses that of the sugars in pure DNA•DNA duplexes; this is in agreement with the studies on another DNA•RNA hybrid (González et al., 1994, 1995; Schmitz et al., 1996).
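The construction of restraint subsets described in this section can be sketched as follows. This is a minimal illustration rather than the protocol actually used: the restraint tuples, proton labels, and the function name are hypothetical, and the refinement of each subset (an rMD run in the text) is left out.

```python
from itertools import combinations


def restraint_subsets(all_restraints, conflicting, n_removed=1):
    """Yield (removed, subset) pairs, leaving out n_removed conflicting
    restraints at a time while keeping every other restraint."""
    for removed in combinations(conflicting, n_removed):
        removed_set = set(removed)
        subset = [r for r in all_restraints if r not in removed_set]
        yield removed, subset


# Hypothetical restraints: (proton A, proton B, lower bound, upper bound) in angstroms
all_restraints = [
    ("res3:Ha", "res3:Hb", 2.0, 2.6),
    ("res3:Hc", "res3:Hd", 2.4, 3.0),
    ("res3:He", "res3:Hd", 2.8, 3.5),
    ("res3:Ha", "res4:Hd", 3.5, 4.5),
    ("res3:Hf", "res4:Hd", 2.2, 2.9),
]
# Restraints flagged as mutually inconsistent by a preliminary single-structure analysis
conflicting = [all_restraints[1], all_restraints[2], all_restraints[4]]

pool = list(restraint_subsets(all_restraints, conflicting))
for removed, subset in pool:
    # Each subset would be passed to a separate refinement run (not shown)
    print("left out:", [r[:2] for r in removed], "-> refine against", len(subset), "restraints")
```

Leaving out one conflicting restraint at a time gives as many subsets as there are conflicting restraints; raising `n_removed` corresponds to removing several at a time, at the cost of a rapidly growing number of refinements.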
6. CONCLUSIONS

A number of computational methods are currently being developed to model fast dynamics of nucleic acids and proteins at the atomic level of resolution based on NMR data. Such dynamics can be described in the form of structural ensembles. A prerequisite for this modeling is the presence of conflicting experimental restraints: distance restraints derived from NOE data and/or torsion-angle restraints derived from scalar coupling constants. This in turn requires a certain redundancy of restraints, which must overdefine the structure. The PDQPRO algorithm is capable of assessing the probabilities of structures in a given conformational pool by fitting to the experimental data. A convenient feature of this algorithm is that it can be combined with any method of conformational sampling that produces a pool of potential conformers. In all applications of the PDQPRO method that we have attempted so far, it significantly reduced the size of the conformational pool (without sacrificing the agreement with experimental data), which greatly simplifies the subsequent structural analysis of the calculated ensemble. However, the success of this approach depends critically on comprehensive prior conformational sampling (Ulyanov et al., 1998). Methods such as MEDUSA and PARSE attempt a “rational” construction of the conformational pool by identifying the conflicting restraints and then repeatedly refining the structure against reduced sets of restraints. In the case of bigger molecules, such an operation is far from trivial, and it often requires a prior hypothesis about possible solution conformers. Promising alternative approaches, which require further exploration, involve the combination of PDQPRO with sampling methods such as MDtar or multiple-copy refinement. Lastly, none of these methods should be expected to produce a unique set of solution conformers, because typically there are barely enough experimental restraints to define even a single high-resolution structure. However, as we demonstrated for certain experimental systems, it is possible to characterize areas of conformational space that a flexible molecule occupies in solution, and to generate an ensemble of representative structures satisfying the existing experimental data.
ACKNOWLEDGMENTS. We thank Dr. Uli Schmitz for many useful discussions, and Eric Pettersen for providing the script to display ensembles of structures. This work was supported by NIH grants GM39247 and RR01081. TMB was partially supported by the NIH Training Grant NS07219.
REFERENCES

Altona, C., and Sundaralingam, M., 1972, J. Am. Chem. Soc. 94:2333–2344.
Basus, V. J., 1989, Meth. Enzym. 177:132.
Bax, A., Vuister, G. W., Grzesiek, S., Delaglio, F., Wang, A. C., Tschudin, R., and Zhu, G., 1994, Meth. Enzym. 239:79.
Bax, A., and Lerner, L., 1988, J. Magn. Reson. 79:429.
Beckman, R. A., Litwin, S., and Wand, A. J., 1993, J. Biomol. NMR 3:675.
Blackledge, M. J., Brüschweiler, R., Griesinger, C., Schmidt, J. M., Xu, P., and Ernst, R. R., 1993, Biochemistry 32:10960.
Bonvin, A. M., and Brünger, A. T., 1995, J. Mol. Biol. 250:80.
Borgias, B. A., and James, T. L., 1990, J. Magn. Reson. 87:475.
Brünger, A. T., Clore, G. M., Gronenborn, A. M., Saffrich, R., and Nilges, M., 1993, Science 261:328.
Brüschweiler, R., Blackledge, M., and Ernst, R. R., 1991, J. Biomol. NMR 1:3.
Case, D. A., Dyson, H. J., and Wright, P. E., 1994, Meth. Enzym. 239:393.
Celda, B., Widmer, H., Leupin, W., Chazin, W. J., Denny, W. A., and Wüthrich, K., 1989, Biochemistry 28:1462–1470.
Constantine, K. L., Friedrichs, M. S., Mueller, L., and Bruccoleri, R. E., 1995, J. Magn. Reson. Ser. B 108:176.
Conte, M. R., Bauer, C. J., and Lane, A. N., 1996, J. Biomol. NMR 7:190.
Ernst, R. R., Bodenhausen, G., and Wokaun, A., 1987, Principles of Nuclear Magnetic Resonance in One and Two Dimensions, Oxford University Press, New York.
Farr-Jones, S., Miljanich, G. P., Nadasdi, L., Ramachandran, J., and Basus, V. J., 1995, J. Mol. Biol. 248:106.
Fedoroff, O. Y., Ge, Y., and Reid, B. R., 1997, J. Mol. Biol. 269:225.
Fennen, J., Torda, A. E., and van Gunsteren, W. F., 1995, J. Biomol. NMR 6:163.
Ferrin, T. E., Huang, C. C., Jarvis, L. E., and Langridge, R., 1988, J. Mol. Graphics 6:13.
Fletcher, R., 1981, Practical Methods of Optimization, Vol. 2, Wiley, New York.
Folmer, R. H., Hilbers, C. W., Konings, R. N., and Nilges, M., 1997, J. Biomol. NMR 9:245.
Gillespie, J. R., and Shortle, D., 1997, J. Mol. Biol. 268:170.
Glemarec, C., Kufel, J., Földesi, A., Maltseva, T., Sandström, A., Kirsebom, L. A., and Chattopadhyaya, J., 1996, Nucl. Acids Res. 24:2022.
Gochin, M., and Roder, H., 1995, Prot. Sci. 4:296.
González, C., Stec, W., Kobylanska, A., Hogrefe, R. I., Reynolds, M. A., and James, T. L., 1994, Biochemistry 33:11062.
González, C., Stec, W., Reynolds, M. A., and James, T. L., 1995, Biochemistry 34:4969.
Gorenstein, D. G., 1994, Chem. Rev. 94:1315.
Görler, A., and Kalbitzer, H. R., 1997, J. Magn. Reson. 124:177.
Griesinger, C., Sorensen, O. W., and Ernst, R. R., 1985, J. Am. Chem. Soc. 107:6394.
Huang, P., Patel, D. J., and Eisenberg, M., 1993, Biochemistry 32:3852.
Jardetzky, O., and Lefevre, J. F., 1994, FEBS Lett. 338:246.
Keepers, J. W., and James, T. L., 1984, J. Magn. Reson. 57:404–426.
Kelley, L. A., Gardner, S. P., and Sutcliffe, M. J., 1996, Prot. Eng. 9:1063.
Kemmink, J., and Scheek, R. M., 1995, J. Biomol. NMR 6:33.
Kessler, H., Griesinger, C., Lautz, J., Müller, A., van Gunsteren, W. F., and Berendsen, H. J. C., 1988, J. Am. Chem. Soc. 110:3393.
Koning, T. M. G., Boelens, R., van der Marel, G. A., van Boom, J. H., and Kaptein, R., 1991, Biochemistry 30:3787.
Kumar, A., James, T. L., and Levy, G. C., 1992, Isr. J. Chem. 32:257.
Lane, A. N., 1993, Prog. NMR Spectrosc. 25:481.
Lane, A. N., Bauer, C. J., and Frenkiel, T. A., 1993, Eur. Biophys. J. 21:425.
Li, Y. C., and Montelione, G. T., 1995, Biochemistry 34:2408.
Lipari, G., and Szabo, A., 1982, J. Am. Chem. Soc. 104:4546.
Liu, H., Thomas, P. D., and James, T. L., 1992, J. Magn. Reson. 98:163.
Liu, H., Kumar, A., Weisz, K., Schmitz, U., Bishop, K. D., and James, T. L., 1993, J. Am. Chem. Soc. 115:1590.
Liu, H., Banville, D. L., Basus, V. J., and James, T. L., 1995a, J. Magn. Reson. Ser. B 107:51.
Liu, H., Spielmann, H. P., Ulyanov, N. B., Wemmer, D. E., and James, T. L., 1995b, J. Biomol. NMR 6:390.
Liu, H., Tonelli, M., and James, T. L., 1996, J. Magn. Reson. Ser. B 111:85.
Marion, D., and Wüthrich, K., 1983, Biochem. Biophys. Res. Commun. 113:967.
McAteer, K., Ellis, P. D., and Kennedy, M. A., 1995, Nucl. Acids Res. 23:3962.
Merutka, G., Morikis, D., Brüschweiler, R., and Wright, P. E., 1993, Biochemistry 32:13089.
Mujeeb, A., Bishop, K., Peterlin, B. M., Turck, C., Parslow, T. G., and James, T. L., 1994, Proc. Natl. Acad. Sci. U.S.A. 91:8248.
Mujeeb, A., Kerwin, S. M., Egan, W., Kenyon, G. L., and James, T. L., 1992, Biochemistry 31:9325.
Mujeeb, A., Kerwin, S. M., Egan, W., Kenyon, G. L., and James, T. L., 1993, Biochemistry 32:13419.
Mujeeb, A., Reynolds, M. A., and James, T. L., 1997, Biochemistry 36:2371.
Nilges, M., 1996, Curr. Opin. Struct. Biol. 6:617.
Ösapay, K., Theriault, Y., Wright, P. E., and Case, D. A., 1994, J. Mol. Biol. 244:183.
Palmer, A. G. III, 1997, Curr. Opin. Struct. Biol. 7:732.
Pearlman, D. A., 1994, J. Biomol. NMR 4:279.
Pearlman, D. A., 1996, J. Biomol. NMR 8:49.
Pellegrini, M., Gobo, M., Rocchi, R., Peggion, E., Mammi, S., and Mierke, D. F., 1996, Biopolymers 40:561.
Saenger, W., 1984, Principles of Nucleic Acid Structure, Springer-Verlag, New York.
Salgueiro, C. A., Turner, D. L., and Xavier, A. V., 1997, Eur. J. Biochem. 15:244.
Schmitz, U., González, C., Ulyanov, N. B., Blocker, F. H., Liu, H., and James, T. L., 1996, in Biological Structure and Dynamics, Vol. 2 (R. H. Sarma and M. H. Sarma, eds.), Adenine Press, New York, p. 165.
Schmitz, U., and James, T. L., 1995, Meth. Enzym. 261:1.
Schmitz, U., Kumar, A., and James, T. L., 1992a, J. Am. Chem. Soc. 114:10654.
Schmitz, U., Sethson, I., Egan, W. M., and James, T. L., 1992b, J. Mol. Biol. 227:510.
Schmitz, U., Ulyanov, N. B., Kumar, A., and James, T. L., 1993, J. Mol. Biol. 234:373.
Schmitz, U., Zon, G., and James, T. L., 1990, Biochemistry 29:2357.
Sorensen, M. D., Bjorn, S., Norris, K., Olsen, O., Petersen, L., James, T. L., and Led, J. J., 1997, Biochemistry 36:10439.
Thomas, P. D., Basus, V. J., and James, T. L., 1991, Proc. Natl. Acad. Sci. U.S.A. 88:1237.
Torda, A. E., Scheek, R. M., and van Gunsteren, W. F., 1990, J. Mol. Biol. 214:223.
Ulyanov, N. B., Mujeeb, A., Donati, A., Furrer, P., Liu, H., Farr-Jones, S., Konerding, D. E., Schmitz, U., and James, T. L., 1998, in Molecular Modeling of Nucleic Acids (N. B. Leontis and J. Santalucia, Jr., eds.), American Chemical Society, Washington DC, p. 181.
Ulyanov, N. B., Schmitz, U., and James, T. L., 1993, J. Biomol. NMR 3:547.
Ulyanov, N. B., Schmitz, U., Kumar, A., and James, T. L., 1995, Biophys. J. 68:13.
van Gunsteren, W. F., Brunne, R. M., Gros, P., van Schaik, R. C., Schiffer, C. A., and Torda, A. E., 1994, Meth. Enzym. 239:619.
van Wijk, J., Huckriede, B. D., Ippel, J. H., and Altona, C., 1992, Meth. Enzym. 211:286.
Weisz, K., Shafer, R. H., Egan, W. M., and James, T. L., 1992, Biochemistry 31:7477.
Weisz, K., Shafer, R. H., Egan, W. M., and James, T. L., 1994, Biochemistry 33:354.
Widmer, H., and Wüthrich, K., 1987, J. Magn. Reson. 74:316.
Wishart, D. S., and Sykes, B. D., 1994a, Meth. Enzym. 239:363.
Wishart, D. S., and Sykes, B. D., 1994b, J. Biomol. NMR 4:171.
Yao, L. J., James, T. L., Kealey, J. T., Santi, D. V., and Schmitz, U., 1997, J. Biomol. NMR 9:229.
Zhang, Q., Chen, J., Gozansky, E. K., Zhu, F., Jackson, P. L., and Gorenstein, D. G., 1995, J. Magn. Reson. Ser. B 106:164.
Zhu, L., and Reid, B. R., 1995, J. Magn. Reson. Ser. B 106:227.
7

Complete Relaxation and Conformational Exchange Matrix (CORCEMA) Analysis of NOESY Spectra of Reversibly Forming Ligand–Receptor Complexes: Application to Transferred NOESY

N. Rama Krishna and Hunter N. B. Moseley

N. Rama Krishna and Hunter N. B. Moseley • Department of Biochemistry and Molecular Genetics, Comprehensive Cancer Center, The University of Alabama at Birmingham, Birmingham, Alabama 35294-2041. Biological Magnetic Resonance, Volume 17: Structure Computation and Dynamics in Protein NMR, edited by Krishna and Berliner. Kluwer Academic / Plenum Publishers, New York, 1999.

1. INTRODUCTION

The code for the three-dimensional structure of a macromolecule such as a protein is contained in its primary structure (Anfinsen, 1973). A study of the high-resolution three-dimensional structures of proteins is important in understanding protein folding pathways, and in deciphering the rules for protein structure prediction. The principal methods for studying high-resolution structures of proteins are X-ray crystallography and multidimensional NMR spectroscopy. Even though the code for the biological activity of a macromolecule is also inherent in its three-dimensional structure, it is only through the formation of complexes with other molecules (e.g., proteins, enzymes, antibodies, DNA/RNA, natural products, carbohydrates, and lipids) that macromolecules exert their important activities. Thus, a study of the three-dimensional (3D) structures of molecular complexes is vital to an understanding of the molecular basis for recognition, biological function, and mode of action. Molecular complexes can also be studied at atomic detail by crystallography and NMR spectroscopy.

The analysis of nuclear Overhauser effects (NOE) in multispin systems by the use of the isolated spin-pair approximation (ISPA) is often inadequate since it neglects multispin effects, i.e., three-spin effects for small molecules with short correlation times, and spin-diffusion effects for large molecules with longer rotational correlation times (Krishna et al., 1978). To properly account for these effects in the quantitative interpretation of 2D NOESY data on nucleic acids and proteins, complete relaxation rate matrix algorithms such as CORMA have been developed (Borgias and James, 1989, 1988; Keepers and James, 1984). Similarly, total relaxation rate matrix methods have been used in the quantitative analyses of 1D NOEs to account for these multispin effects, including spin diffusion in biomolecules (Krishna et al., 1978). Many other structure refinement procedures that employ total relaxation rate matrix analyses of NOESY intensities have since been proposed (e.g., Xu and Krishna, 1995; Xu et al., 1995a, 1995b; Bonvin et al., 1994; Sugar and Xu, 1992; Guntert et al., 1991; Mertz et al., 1991; Gorenstein et al., 1990; Borgias and James, 1988; Boelens et al., 1988). In all these treatments, there is usually the assumption of a single conformation for the macromolecule under consideration.
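The effect of spin diffusion on ISPA-derived distances can be illustrated with a toy three-spin calculation. This is a deliberately simplified sketch, not CORMA or any published algorithm: it assumes a rigid molecule in the slow-tumbling limit, where each pair contributes equal and opposite amounts to the cross- and self-relaxation terms, a single correlation time, and no external leakage. The scale factor `k`, the geometry, the mixing times, and all function names are hypothetical.

```python
import math


def rate_matrix(coords, k, leak=0.0):
    """Toy relaxation matrix: off-diagonal terms -k/r^6, diagonal terms sum of k/r^6 (+ leak)."""
    n = len(coords)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                rate = k / math.dist(coords[i], coords[j]) ** 6
                R[i][j] = -rate  # cross-relaxation term
                R[i][i] += rate  # contribution to self-relaxation
        R[i][i] += leak
    return R


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


def noe_matrix(R, tau, terms=40):
    """A(tau) = exp(-R * tau), summed as a Taylor series (adequate for small |R * tau|)."""
    n = len(R)
    M = [[-R[i][j] * tau for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms):
        term = [[x / order for x in row] for row in mat_mul(term, M)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def ispa_distance(A, pair, ref_pair, ref_distance):
    """Two-spin estimate: r = r_ref * (a_ref / a_pair)**(1/6)."""
    a_pair = A[pair[0]][pair[1]]
    a_ref = A[ref_pair[0]][ref_pair[1]]
    return ref_distance * (a_ref / a_pair) ** (1.0 / 6.0)


# Three collinear protons: 0 and 2 are 5.0 A apart, with proton 1 halfway between them
coords = [(0.0, 0.0, 0.0), (2.5, 0.0, 0.0), (5.0, 0.0, 0.0)]
k = 2.5 ** 6  # hypothetical scale: a 2.5 A pair cross-relaxes at 1 s^-1
R = rate_matrix(coords, k)

for tau in (0.0001, 0.05, 0.1):  # mixing times in seconds
    A = noe_matrix(R, tau)
    r_02 = ispa_distance(A, (0, 2), (0, 1), 2.5)
    print(f"tau = {tau} s: ISPA distance 0-2 = {r_02:.2f} A (true 5.00 A)")
```

In this model the apparent 0–2 distance approaches the true 5.0 Å only at very short mixing times and becomes progressively shorter as magnetization relayed through proton 1 builds up, which is the kind of error a full relaxation matrix treatment is meant to remove.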
1.1. Molecular Complexes and Conformational Exchange

When the biomolecular system exhibits conformational exchange as well, such exchange effects must be properly incorporated in rigorous structure refinements based on complete relaxation matrix treatments. Typical examples of biomolecular conformational exchange include a ligand exchanging between free and receptor-bound forms (e.g., Moseley et al., 1997), proteins existing in equilibrium between native, partially folded, and denatured forms (Alexandrescu et al., 1990; Dobson and Evans, 1984), or two distinct native forms (e.g., Boyd et al., 1984; Gupta et al., 1972), proteins exhibiting conformational heterogeneity in part of the sequence (Meadows et al., 1991; Driscoll et al., 1990) or slow internal rotations for some of the side chains (Fejzo et al., 1991; Campbell et al., 1976; Wüthrich and Wagner, 1975), or disulfide bond isomerization (Otting et al., 1993) or cis–trans proline isomerization (Hinck et al., 1997), and a DNA duplex existing in an equilibrium between two or more distinct conformations (Choe et al., 1991; Feigon et al., 1984). Theoretical treatments have been developed that describe complete relaxation and conformational exchange matrix analyses of NOESY of dynamical systems (Curto et al., 1996; Moseley et al., 1995; Ni and Zhu, 1994; Krishna and Lee, 1992; Lee and Krishna, 1992; London et al., 1992; Lippens et al., 1992; Ni, 1992).

The methods of analysis of NOESY data to develop molecular models of tight complexes (with small dissociation constants) with moderate molecular weights are essentially identical to those used for uncomplexed proteins, because the NOESY intensities are not influenced by the exchange rates of the interacting species between their free and bound forms. Because the exchange off-rates are very slow under these conditions, one can prepare samples containing only the ligand–receptor complex without any free ligand. High-molecular-weight complexes of proteins with tightly bound ligands still present a challenge for structure determination by NMR. This upper limit in molecular weight has been extended very recently with the introduction of TROSY (Pervushin et al., 1997).
1.2. Reversible Binding and Transferred NOESY
When the binding is reversible (during the course of an experiment), as is usually the case for larger dissociation constants, the NOESY spectrum of a sample containing a ligand and its receptor reflects the transfer of magnetization by chemical exchange between free and complexed species as well as by dipolar exchange in each state. This reversible binding has been exploited in the design of the “transferred NOE” or “transferred NOESY”
technique, as a means of studying indirectly the receptor-bound conformation of a ligand from the NOESY spectrum of a sample containing excess ligand (Albrand et al., 1979). Under fast exchange conditions, the enhanced cross relaxation due to the longer rotational correlation time can often more than compensate for the minor fractional population of the complexed ligand. It is not the purpose of this chapter to provide an exhaustive survey of literature on the transferred NOESY field; the reader is referred to Ni (1994) for a comprehensive review of this field up to 1993. Other related topics dealing with protein–ligand interactions are covered by James and Oppenheimer (1994). The current
chapter is intended to provide the authors’ own perspective and approach to the quantitative interpretation of NOESY spectra of interacting molecules in general
and the transferred NOESY in particular. Our approach, using the complete relaxation and conformational exchange matrix (CORCEMA) methodology, stresses the need to focus on the entire system involving the ligand, the receptor, their motional dynamics, and the kinetics of the binding process. It is hoped that
this chapter will exert some modest influence on the way we think about transferred NOESY experiments. In the following discussion, we will be using the words receptor and protein (or enzyme) interchangeably, though it must be understood
that the receptor can be any general macromolecule. Indeed, examples are known
where complexes form under fast reversible binding. These include drug–DNA (Pavlopoulos et al., 1995; Crenshaw et al., 1995; Wadkine and Graves, 1991), protein–DNA (Baleja et al., 1994; Dekker et al., 1993), protein–protein (Yi et al., 1994; Chen et al., 1993), nucleotide/nucleoside–enzyme (Murali et al., 1997; Jarori et al., 1994; Perlman et al., 1994; London et al., 1992), corepressor–repressor–operator (Lee et al., 1995), carbohydrate–protein (Casset et al., 1997; Bevilacqua et al., 1992), peptide–protein (Blommers et al., 1997; Ni et al., 1995; Campbell and Sykes, 1991), peptide–antibody (Anglister et al., 1995; Scherf et al., 1992), and peptide–membrane (Bersch et al., 1993; Gounarides et al., 1993) complexes, just to mention a few. In addition, the reported observation of NOEs between protein protons and bound water molecules (Otting and Wuthrich, 1989; Clore et al., 1990a) is another example of this. In all these cases, the NOESY intensities reflect both chemical and dipolar exchange of magnetization between protons. Their respective
contributions need to be properly quantitated to get meaningful quantitative structural information.
One area of special interest in studying complexes is structure-based drug design by NMR (Fesik, 1993). For complexes of proteins with tight-binding lead compounds, the methods of analysis of NOESY spectra are well established. Sometimes, however, some of the promising lead compounds may only show marginal affinities, and the chemist is faced with the prospect of designing higher-affinity analogs. Under these circumstances, a structure of the weakly binding lead compound deduced from a quantitative analysis of transferred NOESY spectra that incorporates the interactions between the ligand and the residues in the active site is of immense value. The CORCEMA method treats such ligand–receptor interactions explicitly and should be of value in structure-based drug design efforts.
Since the original introduction of the transferred NOE technique in the late 1970s (Albrand et al., 1979), the biomolecular NMR field has experienced several significant advances. Most notable among these are the introduction of multidimensional NMR and TROSY methods for studying proteins that were too large to study in the 1970s, development of isotope-filtered methods for recording subspectra (intra- and intermolecular) of interacting molecules, development of molecular biological procedures and overexpression systems for the production of proteins and other macromolecules with uniform isotopic labeling for use in these measurements, random fractional deuteration to reduce problems due to severe dipolar line broadening and spin diffusion in moderately large proteins, development of a wide variety of structure refinement algorithms that can quantitatively analyze the NOESY and other NMR data, and improvements in sensitivity through the construction of NMR systems with very high field magnets (now approaching the gigahertz range). It is clear that the transferred NOE field can be further advanced by taking advantage of these technological and methodological developments.
The theory for steady-state 1D transferred NOEs has been developed by Clore and Gronenborn (1982). It was later extended to selective saturation-based time-dependent transferred NOEs with an emphasis on an analysis of initial slopes (Clore and Gronenborn, 1983). Under these conditions, useful structural information in the bound state is obtained if the conformational exchange is fast on the relaxation rate scales. The 2D transferred NOESY (tr-NOESY) presents obvious advantages over time-dependent 1D NOE techniques in terms of generation of large data sets in an efficient manner, and is generally the preferred approach. Studies since 1991 have identified and characterized several factors that play a critical role in the quantitative analyses of 2D transferred NOESY. These are summarized below.
1.2.1. Finite Off-Rates
The importance of finite off-rates in tr-NOESY was anticipated (Choe et al., 1991), and theoretical frameworks to account for these were described by Lee and Krishna (1992) and, independently, by others (London et al., 1992; Ni, 1992; Lippens et al., 1992). The main advantage of these formulations is that the utility of the tr-NOESY technique is now extended to a much wider range of off- and on-rates rather than the restrictive regime of exchange rates faster than the cross-relaxation rates. Typical examples of reversibly forming complexes with off-rates slow on the chemical-shift scale are lysozyme–GlcNAc (Lumb et al., 1994) and Trp-repressor–operator (Lee et al., 1995) complexes. Additionally, as the molecular weight of the enzyme increases, the rotational correlation time and, hence, the cross-relaxation rates also increase, and the fast exchange approximation often employed in traditional tr-NOESY analyses may not be satisfied. All these theoretical formulations have been cast for treating multispin systems.
1.2.2. Intermolecular Ligand–Receptor Dipolar Relaxation
Many traditional analyses of tr-NOE experiments have routinely neglected ligand–protein intermolecular cross relaxation, partly because (i) until recently there was no adequate theoretical framework available to properly account for it, (ii) neglect of cross relaxation with protein protons simplified the analyses considerably, and (iii) in some instances the structure of the receptor active site and/or the identity of the residues within the site was presumably unknown and it was therefore difficult to take this cross relaxation into account. Even now, several publications continue to appear in which bound ligand conformations are being deduced without regard to the possible role of protein protons and motions at the active site on the ligand NOESY spectra. Only time will tell whether any of these published structures need revision. It is obvious, however, that any serious effort at a structure-based design of a protein-binding ligand will substantially benefit from one’s ability to explicitly
incorporate the intermolecular contacts with the protein, rather than suppressing them or ignoring them. This is because the conformation of the active site itself can
change substantially, depending upon the different chemical modifications on the
ligand. Purine nucleoside phosphorylase inhibitors are an example of this (Ealick et al., 1991). Our results (Moseley et al., 1994, 1995; Jackson et al., 1995) and those
of others (Arepalli et al., 1995; Ni and Zhu, 1994; Zheng and Post, 1993; London et al., 1992) show conclusively that the neglect of ligand–protein interactions, and in particular the neglect of protein-mediated effects, can result in misleading conclusions about the bound conformation of the ligand. The effects of ligand–protein cross relaxation on the ligand tr-NOESY spectrum can be rather complex, and are of two distinct types, depending upon the relative disposition of receptor protons in relation to ligand protons. These are (i) protein-mediated spin-diffusion effects, which can sometimes dramatically affect the initial growth portions of the ligand tr-NOESY (Jackson et al., 1995), and (ii) protein-leakage effects, which tend to affect more the decay portions of the ligand tr-NOESY. The protein-mediated spin diffusion (or protein indirect effects) deserves special consideration, since, under high ligand–receptor ratios customarily employed in tr-NOE experiments, protein-mediated spin-diffusion effects can be much more efficient than the corresponding bound-ligand-mediated spin-diffusion effects and can produce dramatic effects even for very short mixing times (Jackson et al., 1995). These protein-indirect effects on the ligand tr-NOEs, if not properly accounted for, can result in misleadingly compact structures for the bound ligand, as predicted theoretically (Moseley et al., 1995; Jackson et al., 1995; Ni and Zhu, 1994) and experimentally confirmed (Dratz et al., 1996; Arepalli et al., 1995). In contrast, the protein-leakage effects might lead to slightly expanded structures, if not properly treated.
1.2.3. Motions in the Bound State
1.2.3a. Protein Motions at the Active Site. An implicit assumption usually made is that the bound conformation deduced by the tr-NOESY technique corresponds directly to that of the ligand bound in the active site. Indeed, such an
assumption is not unreasonable (i) if the active site with and without the ligand remains essentially identical [e.g., neuraminidase (Janakiraman et al., 1994)] as in the rigid lock-and-key model (Fischer, 1894), or (ii) when the process of ligand binding to and release from the active site (into the solvent) is practically instantaneous. Complications arise, however, when the ligand-binding process does not follow the rigid lock-and-key model and the active site on the receptor exhibits distinct conformational movements following the initial binding of a ligand, and/or the occupation of the active site by a ligand is not instantaneous but takes a finite time. These motions can be fast (motion of side chains) or slow (e.g., large-scale motions of domains). Motions on a time scale have been detected in the tips of the flaps that cover the active site of HIV-1 protease (Nicholson et al., 1995). In the
case of lysozyme (adsorbed on mica), atomic force microscopy measurements have detected conformational changes of the order of 5 Å lasting for ~50 ms in the presence of a substrate (Radmacher et al., 1994). Maltodextrin binding protein exhibits a “Venus flytrap” type of rigid-body hinge-bending motion of two globular domains through an angle of 35° upon ligand binding (Sharff et al., 1992). Other examples of proteins exhibiting hinge-bending motions include thermolysin (Holland et al., 1992) and related neutral proteases, periplasmic proteins (Quiocho, 1991), yeast hexokinase (Bennett and Steitz, 1978), and adenylate kinase (Schulz et al., 1990), to name a few. The active site in human purine nucleoside phosphorylase exhibits a considerable fluidity and undergoes different conformational rearrangements, depending upon the inhibitors (Ealick et al., 1991). All these proteins can be thought of as exhibiting an “induced fit” binding mode (Koshland, 1958). It is easily appreciated that the motions at the active site occurring over a finite time (milliseconds to nanoseconds, longer than the rotational correlation time of the receptor) during the course of induced-fit binding of a ligand have the potential to modulate the intermolecular ligand–protein dipolar contacts as well as the intramolecular contacts due to the accompanying conformational changes. Ignoring such effects might result in an erroneous interpretation of tr-NOESY data
on a complex ligand–receptor system.
1.2.3b. Ligand Motions in the Bound State. The ligand itself may exhibit conformational transitions while bound to the protein (Perlman et al., 1994). Hence, any attempt to determine the so-called bound conformation of a ligand must properly address this conformational malleability of the ligand as well. The CORCEMA algorithm permits an incorporation of motions in both the protein and the ligand in the bound state (and of course in the unbound states as well).
1.2.4. Intermolecular Transferred NOESY
When the ligand–receptor ratio is not too high, it is sometimes possible to observe intermolecular ligand–receptor NOESY contacts for moderate-size proteins. These contacts are extremely valuable for properly docking the ligand within the binding pocket and, together with intramolecular tr-NOEs, for structurally refining the ligand and the active site residues.
1.2.5. Nonspecific Binding of the Ligand
In addition to binding in the active site of a receptor with high affinity and specificity, a ligand could also often associate with a receptor at nonspecific or weak
binding sites. Such nonspecific binding has been demonstrated in several systems (e.g., Murali et al., 1997; Jarori et al., 1994; Behling et al., 1988). The tr-NOEs resulting from nonspecific binding may mask tr-NOEs from specific binding, and thus can complicate structural analysis and the estimation of specifically bound ligand concentration. Hydrophobic association with surface hydrophobic residues
and electrostatic interaction with charged surface residues are presumably a couple of factors that contribute to such nonspecific binding.
2. CORCEMA THEORY
The theory for CORCEMA analysis presented here is based on the matrix algebra formulation originally developed by our laboratory to treat multistate (n-state) conformational exchange (Krishna et al., 1980), and is an extension of our early work on NOESY in exchanging systems (Lee and Krishna, 1992; Choe et al., 1991). In the following, we formulate the theory for a general multistate situation, and illustrate its application with two specific examples: a two-state model and a three-state model of ligand–enzyme interactions.
2.1. Basic Formulation
The dynamic matrix D that governs the time evolution of the peak intensities in a 2D NOESY experiment is given by (Ernst et al., 1987)
$$D = R + K \qquad (1)$$
where R is the relaxation rate matrix and K is the kinetic matrix (Ernst et al., 1987; Krishna et al., 1980). The kinetic matrix has elements (Krishna et al., 1980),
$$K_{ii} = \sum_{j \neq i} k_{ij}, \qquad K_{ij} = -k_{ji} \quad (i \neq j) \qquad (2)$$
where $k_{ij}$ is the rate of exchange from conformation i to conformation j. The elements of the R matrix for NOESY are (Krishna et al., 1978)
$$R_{ii} = \sum_{j \neq i} \left( W_0^{ij} + 2W_1^{ij} + W_2^{ij} \right) + R_{1i}, \qquad R_{ij} = W_2^{ij} - W_0^{ij} \qquad (3)$$
In the above equations, $W_0^{ij}$, $W_1^{ij}$, and $W_2^{ij}$ are the transition probabilities due to dipolar relaxation between two spin-1/2 nuclei i and j undergoing isotropic rotational diffusion, and are given by (Krishna et al., 1978)
$$W_0^{ij} = \frac{q_{ij}\,\tau_c}{1 + (\omega_i - \omega_j)^2 \tau_c^2}, \qquad W_1^{ij} = \frac{1.5\, q_{ij}\,\tau_c}{1 + \omega_i^2 \tau_c^2}, \qquad W_2^{ij} = \frac{6\, q_{ij}\,\tau_c}{1 + (\omega_i + \omega_j)^2 \tau_c^2} \qquad (4)$$
where $q_{ij} = 0.1\,\gamma_i^2 \gamma_j^2 \hbar^2 r_{ij}^{-6}$, $\omega_i$ and $\gamma_i$ are the Larmor frequency and magnetogyric ratio, respectively, for spin i, $r_{ij}$ is the internuclear distance, $\tau_c$ is the isotropic rotational correlation time for the molecule, and $R_{1i}$ is the leakage relaxation. We will refer to the off-diagonal elements $R_{ij}$ as cross-relaxation rates. Similar expressions hold for ROESY (Bax and Davis, 1985; Bothner-By et al., 1984) experiments on homonuclear systems [Eq. (5)], with the transition probabilities again given by Eq. (4). For large macromolecules that satisfy the limit $\omega\tau_c \gg 1$, the off-diagonal elements of the relaxation rate matrix have opposite signs for the NOESY and ROESY experiments, a feature that has been exploited in the design of pulse sequences for minimizing spin-diffusion contributions (Macura et al., 1994; Fejzo et al., 1992). For situations where internal motions need consideration, we can use Lipari and Szabo’s model-free expressions (1982a, 1982b) for the transition probabilities, modified in an empirical manner (Baleja et al., 1990) for NOE contact between two protons i and j:
Here $\tau_e$ is the effective correlation time for internal motion and satisfies the extreme narrowing condition (Lipari and Szabo, 1982a, 1982b). In Eq. (6), $S_i$ and $S_j$ are order parameters for nuclei i and j, respectively, and the internuclear distance term is an averaged value due to internal motion. A generalization of Eq. (6) to cases with internal correlation times on different time scales can also be made when the need arises (Clore et al., 1990b). If $p_k$ is the fractional population of molecules in conformation k, then the elements of the kinetic matrix further satisfy the relationships (Krishna et al., 1980)
$$\sum_{i} K_{ij} = 0 \qquad (8)$$
and
$$\sum_{j} K_{ij}\, p_j = 0 \qquad (9)$$
From Eqs. (8) and (9) it follows that a row vector composed of 1’s and a column vector composed of the fractional populations constitute an eigenvector pair corresponding to the zero eigenvalue of the kinetic matrix K (Krishna et al., 1980). Using this property, we have previously shown for noninteracting systems that when the conformational exchange rates are much faster than the relaxation rates in all the conformations, the effective relaxation rate (or rate matrix for coupled multispin systems) is simply a weighted average of relaxation rates (or rate matrices) in the N individual conformations (Krishna et al., 1980):
$$\langle R \rangle = \sum_{k=1}^{N} p_k R_k \qquad (10)$$
where $\langle R \rangle$ is the effective relaxation rate matrix, and $R_k$ is the relaxation rate matrix for the kth conformation with a fractional population $p_k$.
In the present formalism, which deals with interacting systems, the R and K matrices are generalized relaxation rate and kinetic matrices, respectively, and are composed of submatrices that describe each molecular species in each state. The manner in which they are defined becomes obvious from the two-state and three-state examples given below. The NOESY intensities at a mixing time $\tau_m$ are calculated from the expression
$$I(\tau_m) = \exp(-D\tau_m)\, I(0) \qquad (11)$$
where $I(\tau_m)$ is a square matrix of intensities for all the molecules, and I(0) is a diagonal matrix with elements proportional to the equilibrium concentrations in each state. It corresponds to all protons that are frequency labeled according to their chemical shifts during the evolution period. If U is the transformation matrix that diagonalizes D, then
$$U^{-1} D\, U = \lambda \qquad (12)$$
where $\lambda$ is a diagonal matrix. Then Eq. (11) becomes
$$I(\tau_m) = U \exp(-\lambda\tau_m)\, U^{-1} I(0) \qquad (13)$$
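The steps described so far can be sketched numerically for a single conformation (K = 0). The example below is only an illustration under stated assumptions, not the CORCEMA program itself: it uses the standard homonuclear dipolar forms (diagonal elements from W0 + 2W1 + W2 plus leakage, off-diagonal elements W2 − W0), writes the dipolar constant in SI units with the (μ0/4π)² prefactor, and evaluates the intensities by diagonalizing D. The function names, the three proton coordinates, the 5 ns correlation time, the 500 MHz field, and the 0.2 s⁻¹ leakage rate are all hypothetical.

```python
import numpy as np

GAMMA_H = 2.6752218744e8   # proton magnetogyric ratio, rad s^-1 T^-1
HBAR = 1.054571817e-34     # reduced Planck constant, J s
MU0_OVER_4PI = 1.0e-7      # SI dipolar prefactor (assumed unit system), T m A^-1


def spectral_density(omega, tau_c):
    """Lorentzian tau_c / (1 + omega^2 tau_c^2) for isotropic tumbling."""
    return tau_c / (1.0 + (omega * tau_c) ** 2)


def relaxation_matrix(coords_angstrom, tau_c, larmor_hz, leak=0.0):
    """Rate matrix R (s^-1) for rigid protons in one conformation."""
    xyz = np.asarray(coords_angstrom, dtype=float) * 1.0e-10  # metres
    n = xyz.shape[0]
    omega = 2.0 * np.pi * larmor_hz
    R = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            r = np.linalg.norm(xyz[i] - xyz[j])
            q = 0.1 * (MU0_OVER_4PI * GAMMA_H**2 * HBAR) ** 2 / r**6
            w0 = q * spectral_density(0.0, tau_c)          # omega_i - omega_j = 0
            w1 = 1.5 * q * spectral_density(omega, tau_c)
            w2 = 6.0 * q * spectral_density(2.0 * omega, tau_c)
            R[i, j] = w2 - w0                # cross-relaxation rate
            R[i, i] += w0 + 2.0 * w1 + w2    # dipolar part of the diagonal
    R[np.diag_indices(n)] += leak            # leakage added to the diagonal
    return R


def noesy_intensities(D, I0, tau_m):
    """I(tau_m) = U exp(-lambda tau_m) U^-1 I(0), with U diagonalizing D."""
    lam, U = np.linalg.eig(D)
    decay = np.diag(np.exp(-lam * tau_m))
    return (U @ decay @ np.linalg.inv(U) @ I0).real


# Hypothetical three-proton fragment (coordinates in angstroms).
protons = [(0.0, 0.0, 0.0), (2.5, 0.0, 0.0), (0.0, 3.0, 0.0)]
R = relaxation_matrix(protons, tau_c=5.0e-9, larmor_hz=500.0e6, leak=0.2)
D = R                                  # single conformation, so K = 0
I = noesy_intensities(D, np.eye(3), tau_m=0.1)
print(np.round(I, 4))
```

For a slowly tumbling molecule the off-diagonal elements of R are negative, so the cross peaks in I have the same sign as the diagonal peaks. When K is nonzero, D is no longer symmetric, which is why the general-purpose `np.linalg.eig` is used here rather than a symmetric eigensolver.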
2.1.1. CORCEMA Calculations for Finite Delays
In Eq. (13), the assumption was made that the magnetizations for all protons recovered to their thermal equilibrium values prior to the preparation pulse, and hence the diagonal elements of the I(0) matrix correspond directly to the equilibrium concentrations; i.e., I(0) is a diagonal matrix with elements representing the concentrations [Eq. (14)]. In practice, however, this condition is met less frequently, and the relaxation delay T between pulses (i.e., acquisition time plus the waiting period for the next preparation pulse) is somewhat comparable to the longitudinal relaxation times. Under these conditions, Eq. (13) is modified by using, in place of I(0), a diagonal matrix I(0, T) whose elements depend on the delay T [Eqs. (15) and (16)]. An important result from Eq. (16) is that the NOESY spectrum can be asymmetric due to differences in the longitudinal recoveries of coupled spins, a factor that can be exploited to generate large data sets for relaxation rate matrix analyses in general by recording the spectra at several mixing times as a function of the recycling time T. The effect of finite delays between pulses on structure refinements of nucleic acids and proteins has been addressed (Zhu and Reid, 1995; Dellwo et al., 1994).
2.2. Two-State Model of Ligand–Receptor Interactions
In the following, we present a formulation suitable for reversible binding of a ligand and a protein to form binary complexes. This model is characterized by the
free state consisting of the interacting species L (ligand) and E (enzyme) in their uncomplexed form and the bound state in which they form a complex L′E′, as shown in Fig. 1. The generalized R matrix is composed of generalized submatrices $R_F$ and $R_B$ for the free and bound states, respectively (matrices describing more than one molecular species will be referred to as generalized matrices). They are defined as follows:
$$R = \begin{pmatrix} R_F & 0 \\ 0 & R_B \end{pmatrix}, \qquad R_F = \begin{pmatrix} R_L & 0 \\ 0 & R_E \end{pmatrix}, \qquad R_B = \begin{pmatrix} R_{L'} & R_{L'E'} \\ R_{E'L'} & R_{E'} \end{pmatrix}$$
The $R_L$ and $R_E$ are the relaxation rate matrices for the ligand and the enzyme, respectively, in their uncomplexed state. The diagonal and off-diagonal terms of these matrices take into account the complete dipolar connectivities as described elsewhere (Krishna et al., 1978). In addition, any leakage terms (such as dipolar relaxation of amide protons with the nitrogen nucleus, solvent exchange rates, and relaxation due to dissolved paramagnetic oxygen) can be added to the diagonal elements (Krishna et al., 1978). The absence of cross relaxation between the ligand and enzyme in their free states is denoted by the zero off-diagonal elements of the $R_F$ matrix. In a similar fashion, $R_{L'}$ and $R_{E'}$ are the relaxation rate matrices for the complexed form of the ligand and the enzyme, respectively. The intermolecular dipolar cross relaxation in the complex (containing the cross-relaxation rates between ligand and enzyme protons) is denoted by $R_{L'E'}$ (and its transpose, $R_{E'L'}$). This matrix is, in general, rectangular because of the different number of protons in the ligand and the protein. The $R_{L'}$ and $R_{E'}$ matrices include the complete relaxation matrix elements for the intramolecular relaxation within the bound forms of the ligand and the enzyme, respectively. In addition, their diagonal elements also include the contributions associated with the intermolecular dipolar cross relaxation. In the CORCEMA algorithm, we opted to enter all equivalent protons (e.g., methyl protons) explicitly, so the R matrix is always symmetric. Such a practice is compatible with the manner in which the input files are entered in the algorithm (as PDB files for the coordinates of individual atoms). The generalized kinetic matrix K is composed of generalized kinetic submatrices built from the on- and off-rates, $k_{on}$ and $k_{off}$, that describe the reversible complex, each multiplying a unit matrix of dimension appropriate for the molecule concerned. The generalized intensity matrix has the following general form:
In this equation, the diagonal elements $I_{LL}$, $I_{EE}$, $I_{L'L'}$, and $I_{E'E'}$ (subscripts L and E denote the free ligand and enzyme, and primes denote their bound forms) represent the traditional “intramolecular NOESY spectra” for the four molecular species, except that these intensities are now influenced by the exchange process also. The $I_{LL'}$ and $I_{EE'}$ and their symmetric counterparts $I_{L'L}$ and $I_{E'E}$ refer to the traditional exchange peaks between the free and bound forms of the ligand and the protein. These include traditional exchange cross peaks, as well as exchange-mediated NOESY (Choe et al., 1991). The $I_{L'E'}$ and $I_{E'L'}$ are the “direct intermolecular” NOESY contacts in the “bound state.” As will be shown later, except for very tight binding situations, these intensities will also be influenced by the exchange process. The remaining six peaks are exchange-mediated intermolecular ligand–protein NOESY spectra; four involve a free species and a bound species ($I_{LE'}$, $I_{E'L}$, $I_{L'E}$, and $I_{EL'}$), and two ($I_{LE}$ and $I_{EL}$) involve the free ligand and the free protein. The last two arise from a double-exchange process (Curto et al., 1996).
The I(0) matrix is given by
Because of the conformational exchange matrix K, the D matrix is not generally symmetric. Even though algorithms exist that can diagonalize asymmetric matrices such as D, it is desirable to reduce this matrix to a symmetric form if one wishes to use the more standard orthogonal transformation routines meant for symmetric matrices. The D matrix can be brought into a symmetric form using a symmetrization matrix S defined as (Moseley et al., 1995)
Here [L], [E], [L′], and [E′] are the concentrations of the free and bound forms of the ligand and enzyme. This form of the symmetrization matrix is related to but slightly different from the one used by Ni (1992) in terms of separate ratios of equilibrium concentrations for ligand and enzyme in their free and bound forms, and the number of equivalent spins for each resolved resonance. As shown in the second example, our definition lends itself to an automatic extension to any arbitrary number of states.
Thus the symmetrized dynamic matrix is obtained from D by a similarity transformation with S, and it is easily verified that the symmetrized forms of D and K are now symmetric (Moseley et al., 1995). The symmetrized dynamic matrix can be put in a diagonal form by a transformation T, and the expression for NOESY intensities, Eq. (13), is then rewritten in terms of S, T, and the resulting eigenvalues.
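The symmetrization and diagonalization steps above can be sketched numerically as follows. This is a minimal illustration under explicit assumptions rather than the chapter's exact definitions: the spins are ordered [L, E, L′, E′], the kinetic matrix is built from pseudo-first-order rates (k_on[E] for the ligand, k_on[L] for the enzyme, k_off for both bound species), and S is taken as the diagonal matrix of square roots of the equilibrium concentrations, which is one choice that symmetrizes D and may differ in normalization from Eq. (22). All rate matrices, rate constants, and concentrations are hypothetical numbers.

```python
import numpy as np


def two_state_kinetic_matrix(n_lig, n_enz, k_on, k_off, conc_L, conc_E):
    """Kinetic matrix for L + E <-> L'E', spin order [L, E, L', E']."""
    on = np.diag(np.r_[np.full(n_lig, k_on * conc_E),    # ligand: free -> bound
                       np.full(n_enz, k_on * conc_L)])   # enzyme: free -> bound
    off = k_off * np.eye(n_lig + n_enz)                  # bound -> free
    return np.block([[on, -off], [-on, off]])


def symmetrized_noesy(R, K, conc, tau_m):
    """Intensities via D_s = S^-1 (R + K) S with S = diag(sqrt(conc)) (assumed)."""
    s = np.sqrt(conc)
    D_s = (R + K) * s[None, :] / s[:, None]    # element (i, j) scaled by s_j / s_i
    D_s = 0.5 * (D_s + D_s.T)                  # remove round-off asymmetry
    lam, T = np.linalg.eigh(D_s)               # orthogonal diagonalization
    core = T @ np.diag(np.exp(-lam * tau_m)) @ T.T
    # I = S T exp(-lam tau_m) T^T S^-1 I(0); with I(0) = diag(conc), S^-1 I(0) = S
    return s[:, None] * core * s[None, :]


# Hypothetical two-proton ligand and two-proton enzyme (rates in s^-1).
R_L = np.array([[1.0, -0.2], [-0.2, 1.1]])
R_E = np.array([[8.0, -3.0], [-3.0, 9.0]])
R_B = np.array([[9.0, -2.5, -1.0, -0.3],     # bound state, including the
                [-2.5, 9.5, -0.2, -1.5],     # intermolecular L'-E' block
                [-1.0, -0.2, 10.0, -3.0],
                [-0.3, -1.5, -3.0, 11.0]])
R = np.zeros((8, 8))
R[:2, :2], R[2:4, 2:4], R[4:, 4:] = R_L, R_E, R_B

k_on, k_off = 1.0e8, 1.0e4                   # M^-1 s^-1 and s^-1
conc_L, conc_E = 9.0e-4, 1.0e-5              # free concentrations, M
conc_LE = k_on * conc_L * conc_E / k_off     # complex, from the equilibrium
K = two_state_kinetic_matrix(2, 2, k_on, k_off, conc_L, conc_E)
conc = np.r_[[conc_L] * 2, [conc_E] * 2, [conc_LE] * 4]

I = symmetrized_noesy(R, K, conc, tau_m=0.1)
print(np.allclose(I, I.T))
```

With this choice of S the similarity transformation leaves R unchanged, because all spins coupled by cross relaxation share the same concentration, and it symmetrizes K because the equilibrium condition k_on[L][E] = k_off[L′E′] holds.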
2.2.1. Fast Conformational Exchange
When the conformational exchange rates are much faster than the relaxation rates in the free and bound states of the ligand and the enzyme, a simplifying result
obtains, since in this case the relaxation matrix R can be treated as a minor perturbation on the kinetic matrix K, and a perturbation theory treatment can be applied (Moseley et al., 1995; Krishna et al., 1980). For this case, it is easily shown that
where the population matrices are built from the fractional populations of the free and bound forms of the ligand and enzyme. Since the R matrix is now a minor perturbation, its contribution in first order to the dynamic matrix will be significant only to the “zero” block-diagonal element of the diagonalized form of the K matrix. An approximate solution then follows in terms of an averaged relaxation rate matrix, defined in Eq. (29).
The important result is that the pertinent generalized relaxation rate matrix that
governs the intensities in the NOESY spectrum is simply a weighted average of the generalized relaxation rate matrices for the free and the bound states of the
molecules forming a binary complex. This is a generalization to interacting systems of our earlier result (Krishna et al., 1980). Note that because of the population weighting, the averaged matrix is in general asymmetric, but it can be readily put in a symmetric form using a similarity transformation (Moseley et al., 1995). For the normal mixing times employed in NOESY experiments, the contributions that decay at the exchange rates can be neglected because of fast conformational exchange (on both relaxation and chemical-shift scales), and Eq. (13) now reduces to Eq. (30). From Eq. (30) it is clear that the NOESY spectrum is determined by a generalized relaxation rate matrix which is a weighted average of the rate matrices for the free and bound states (i.e., including the protons on the ligand and the enzyme, and the intermolecular cross relaxation in the bound form). The effect of intermolecular ligand–receptor cross relaxation on the ligand
tr-NOESY spectrum for the general case can be calculated numerically from Eq. (13). However, for the case of fast exchange on the relaxation rate scale in the two-state model, the first few terms describing the initial growth portion are obtained from a Taylor series expansion of Eq. (30):
where the leading factor is a constant of proportionality and the remainder represents higher-order terms. Most noteworthy is the dependence of the ligand tr-NOESY intensities on ligand–protein cross relaxation in the quadratic and higher-order terms in the mixing time. This has important consequences on the accuracy of bound-ligand structures (vide infra).
2.2.2. Absence of Ligand–Enzyme Cross Relaxation
In the absence of intermolecular cross relaxation (i.e., when the intermolecular blocks of the bound-state relaxation matrix vanish), the ligand and the receptor are uncoupled in spin relaxation, and one obtains from Eq. (29) the much simpler result for the effective relaxation rate matrix governing the ligand tr-NOESY:
A similar result obtains for the enzyme.
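The fast-exchange averaging for a ligand that is uncoupled from the protein can be checked numerically. The sketch below assumes a two-proton ligand with hypothetical free and bound rate matrices, a 5% bound fraction, and an off-rate of 5 × 10⁴ s⁻¹; it compares the sum of the four free/bound blocks of the full two-state calculation (what is observed when exchange is also fast on the chemical-shift scale) with the spectrum generated by the population-weighted average of the two rate matrices. The variable names and numbers are illustrative only.

```python
import numpy as np


def propagate(D, I0, tau_m):
    """I(tau_m) = exp(-D tau_m) I(0) via eigendecomposition of D."""
    lam, U = np.linalg.eig(D)
    return (U @ np.diag(np.exp(-lam * tau_m)) @ np.linalg.inv(U) @ I0).real


# Hypothetical rate matrices (s^-1): weak positive cross relaxation for the
# free ligand, strong negative cross relaxation for the bound ligand.
R_free = np.array([[0.50, 0.05], [0.05, 0.60]])
R_bound = np.array([[12.0, -6.0], [-6.0, 13.0]])

p_bound = 0.05
p_free = 1.0 - p_bound
k_off = 5.0e4                                  # s^-1, much faster than relaxation
k_on_eff = k_off * p_bound / p_free            # pseudo-first-order on-rate

one = np.eye(2)
K = np.block([[k_on_eff * one, -k_off * one],
              [-k_on_eff * one, k_off * one]])
R = np.block([[R_free, np.zeros((2, 2))],
              [np.zeros((2, 2)), R_bound]])
I0 = np.diag([p_free, p_free, p_bound, p_bound])

tau_m = 0.2
I_full = propagate(R + K, I0, tau_m)
# Fast exchange on the chemical-shift scale: free and bound peaks coincide.
I_observed = I_full[:2, :2] + I_full[:2, 2:] + I_full[2:, :2] + I_full[2:, 2:]

R_avg = p_free * R_free + p_bound * R_bound    # population-weighted average
I_averaged = propagate(R_avg, np.eye(2), tau_m)

print(np.round(I_observed, 4))
print(np.round(I_averaged, 4))
```

In this example the averaged cross-relaxation rate is negative even though only 5% of the ligand is bound, which illustrates how the bound-state cross relaxation can more than compensate for its small fractional population.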
2.2.3. Analytical Expressions for the Transferred NOESY on a Two-Spin System (A–X) in the Absence of Ligand–Protein Cross Relaxation
In the following, we present analytical expressions (Lee and Krishna, 1992; Choe et al., 1991) for the case of a ligand composed of two spin-1/2 nuclei, exchanging between two conformations corresponding to free (A–X) and bound (A′–X′) states according to the scheme in Fig. 2. We have assumed that ligand–protein cross relaxation is negligible. This example is useful in understanding the behavior of NOESY intensities as a function of correlation times, cross-relaxation rates, and forward and reverse exchange rates (which in turn determine the fractional populations). The dynamic matrix for the above case is
This matrix can be diagonalized (Lee and Krishna, 1992) using the transformation matrices U and $U^{-1}$ of Eq. (12), which yield a diagonal matrix of eigenvalues. The intensities for NOESY peaks associated with spin A are given by Eqs. (36)–(39) (Lee and Krishna, 1992). Of these, I(AA) is the diagonal peak, I(AX) corresponds to the direct NOESY peak, I(AA′) is the exchange peak, and I(AX′) is the exchange-mediated NOESY peak. They are schematically defined in Fig. 3. Similar expressions for intensities associated with the remaining three spins can be obtained by proper interchange of indices. These expressions are useful in understanding the effect of finite off-rates on each of the individual components due to transfer of magnetization between spins A and X in their two conformations. These individual components can only be observed if the exchange rate is slow on the chemical-shift scale. For fast exchange on the chemical-shift scale, the net intensity is a sum of these four components. The effect of varying equilibrium constants on these four intensities has been described by Lee and Krishna (1992). For the special case where the concentrations and the relaxation rates are identical in both states, Eqs. (36)–(39) reduce to the simpler expressions given earlier (Choe et al., 1991).
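The four components associated with spin A can also be obtained numerically instead of from the closed-form expressions. The sketch below assumes the spin ordering [A, X, A′, X′], first-order forward and reverse exchange rates, and hypothetical relaxation rates; the 4 × 4 matrix is written directly as the sum of a block-diagonal relaxation matrix and a kinetic matrix, which is an assumed layout consistent with D = R + K rather than a transcription of Eq. (34).

```python
import numpy as np


def ax_dynamic_matrix(rho, sigma, rho_b, sigma_b, k_fwd, k_rev):
    """4x4 D = R + K for spins ordered [A, X, A', X'] (assumed layout)."""
    (rA, rX), (rAb, rXb) = rho, rho_b
    return np.array([
        [rA + k_fwd, sigma,      -k_rev,       0.0],
        [sigma,      rX + k_fwd,  0.0,        -k_rev],
        [-k_fwd,     0.0,         rAb + k_rev, sigma_b],
        [0.0,       -k_fwd,       sigma_b,     rXb + k_rev],
    ])


def spin_a_peaks(D, p_free, tau_m):
    """Return I(AA), I(AX), I(AA'), and I(AX') at mixing time tau_m."""
    I0 = np.diag([p_free, p_free, 1.0 - p_free, 1.0 - p_free])
    lam, U = np.linalg.eig(D)
    I = (U @ np.diag(np.exp(-lam * tau_m)) @ np.linalg.inv(U) @ I0).real
    return {"AA": I[0, 0], "AX": I[0, 1], "AA'": I[0, 2], "AX'": I[0, 3]}


# Hypothetical parameters (all rates in s^-1).
rho_f, sig_f = (1.0, 1.0), -0.1      # free state
rho_b, sig_b = (10.0, 11.0), -5.0    # bound state
k_fwd, k_rev = 20.0, 180.0           # free -> bound and bound -> free
p_free = k_rev / (k_fwd + k_rev)     # equilibrium free fraction

peaks = spin_a_peaks(
    ax_dynamic_matrix(rho_f, sig_f, rho_b, sig_b, k_fwd, k_rev), p_free, 0.05)
no_exchange = spin_a_peaks(
    ax_dynamic_matrix(rho_f, sig_f, rho_b, sig_b, 0.0, 0.0), p_free, 0.05)

for name, value in peaks.items():
    print(f"I({name}) = {value:.5f}")
```

Setting both exchange rates to zero makes the exchange peak and the exchange-mediated NOESY peak vanish, leaving only the diagonal and direct NOESY peaks of the two isolated conformations.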
2.3. Treatment for More than Two States
As an example of treatment for more than two states we consider proteins and enzymes that exhibit hinge-bending motions (or forming encounter complexes, initially). We consider a ligand binding to an enzyme in its “open” state, followed by a hinge-bending motion on the enzyme that “closes in” the ligand in the active site (Fig. 4). The ligand could bind to the open state of the enzyme at a nonspecific or a weak binding site. We have assumed that the free ligand cannot bind directly to the active site in the closed state, due to some steric hindrance (e.g., yeast hexokinase). This assumption was made only for simplicity and to make the model more interesting, but the theoretical results given below can easily be extended to the general case where direct binding of the free ligand can take place in the closed state also. For our simulations, we chose an example provided by the hypothetical binding of Leu inhibitor to thermolysin (Fig. 5). Several other proteins showing this behavior have been mentioned earlier. In general, all examples where binding in the active site is facilitated by induced fit or hinge bending fall into this category, and may involve more than three states. For
the simplest of these, we adopt the following three-state scheme shown in Fig. 4.
In this case, the R matrix is
where and refer to generalized relaxation rate matrices for the free state, the open state, and the closed state, respectively, of the ligand–enzyme system. The matrix is identical to that for the two-state case. Similarly, the and matrices describing the open and closed states of the complex have a form similar to for the two-state case (i.e., they contain intermolecular cross-relaxation terms). The generalized kinetic matrix has the form [after a minor notational change from Moseley et al. (1995)]
where
In this example, we are interested in the special case where and which describes hinge-bending motions. The and matrices are identical to the two-state case, and consist of the on- and off-rates, respectively. The I(0) matrix is given by
where is similar to for the two-state case, and and represent the concentrations of the bound species in the open and closed states, and have definitions similar to for the two-state case. The dynamic matrix in this case can be made symmetric by a matrix S, which is an extension of Eq. (22) to the three-state case (Moseley et al., 1995).
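As a rough numerical illustration of the three-state scheme, the following Python sketch assembles a block-diagonal relaxation matrix and a kinetic matrix for the free, open, and closed states, and then forms a propagator exp(-Dτ) with D = R + K. The two-spin toy blocks, all rate values, the first-order treatment of the binding step, and every function name are assumptions made for illustration; none of them is taken from the CORCEMA code.

```python
import numpy as np

def expm(a, terms=20):
    """Matrix exponential by scaling and squaring with a Taylor series."""
    norm = np.linalg.norm(a, ord=np.inf)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    scaled = a / (2.0 ** squarings)
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result

def block_diagonal(blocks):
    """Place square blocks along the diagonal of a zero matrix."""
    size = sum(b.shape[0] for b in blocks)
    out = np.zeros((size, size))
    start = 0
    for b in blocks:
        stop = start + b.shape[0]
        out[start:stop, start:stop] = b
        start = stop
    return out

def kinetic_matrix(k12, k21, k23, k32, n_spins):
    """Linear scheme 1 <-> 2 <-> 3 with no direct 1 <-> 3 pathway (assumed)."""
    rates = np.array([[k12, -k21, 0.0],
                      [-k12, k21 + k23, -k32],
                      [0.0, -k23, k32]])
    return np.kron(rates, np.eye(n_spins))

def toy_relaxation(rho, sigma):
    """Two-spin block: autorelaxation rho, cross relaxation sigma (1/s)."""
    return np.array([[rho, sigma], [sigma, rho]])

# Hypothetical rates for the free, open, and closed states.
R = block_diagonal([toy_relaxation(0.5, 0.1),
                    toy_relaxation(4.0, -3.0),
                    toy_relaxation(5.0, -4.0)])
K = kinetic_matrix(k12=50.0, k21=200.0, k23=10.0, k32=5.0, n_spins=2)
D = R + K                  # dynamic matrix
mixing = expm(-D * 0.1)    # propagator for a 100-ms mixing time
```

Because each column of the kinetic matrix sums to zero, exchange alone only redistributes magnetization among the states; any net loss in the propagator comes from the relaxation blocks.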
2.3.1. Fast Conformational Exchange on the Relaxation Rate Scale for All Three States
Under this limit, the R matrix can be treated as a minor perturbation on K. Using the procedure outlined in Moseley et al. (1995), the expression for the intensity of the ligand–receptor system under fast-exchange conditions (fast on both the relaxation and chemical-shift scales) is shown to be
where
The generalized population matrix is analogous to that for the two-state case, and and represent population matrices for the open and closed states of the complex, respectively, and have definitions analogous to for the two-state case. They satisfy the requirement The above result for the three-state case can be generalized for any n-state fast exchange equilibrium of a bimolecular complex on the relaxation rate scale:
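One way to picture the fast-exchange limit is as a population-weighted sum of the per-state rate matrices. The Python sketch below assumes that each generalized population matrix is diagonal, carrying the ligand fraction on ligand protons and the enzyme fraction on enzyme protons, and that these matrices sum to the identity. The function names and the numerical fractions are hypothetical.

```python
import numpy as np

def population_matrix(p_ligand, p_enzyme, n_ligand, n_enzyme):
    """Diagonal matrix holding one fractional population per proton."""
    return np.diag([p_ligand] * n_ligand + [p_enzyme] * n_enzyme)

def averaged_rate_matrix(pop_matrices, rate_matrices):
    """Population-weighted sum of per-state rate matrices (assumed form)."""
    total = sum(pop_matrices)
    if not np.allclose(total, np.eye(total.shape[0])):
        raise ValueError("population matrices must sum to the identity")
    return sum(p @ r for p, r in zip(pop_matrices, rate_matrices))

# Hypothetical fractions for the free, open, and closed states
# (one ligand proton and one enzyme proton, for brevity).
pops = [population_matrix(0.80, 0.20, 1, 1),
        population_matrix(0.15, 0.60, 1, 1),
        population_matrix(0.05, 0.20, 1, 1)]
```

The same function accepts any number of states, which mirrors the n-state generalization mentioned in the text.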
2.3.2. Slow Hinge-Bending Motions
In this limit, there is fast exchange between states 1 and 2, and slow exchange between states 2 and 3. As shown in Moseley et al. (1995), one obtains
Here, we obtain an averaging of the relaxation rate matrix representing states 1 and 2 together that are in fast exchange on the relaxation rate scale. Matrices and are population matrices for states 1 and 2, but normalized such that
2.3.3. Fast Hinge-Bending Motions
For this case, we assume that there is slow exchange between states 1 and 2, and fast exchange between states 2 and 3. We obtain the following result (Moseley et al., 1995):
where and are the normalized population matrices for states 2 and 3 such that
2.4. Intermolecular Transferred NOESY
The CORCEMA algorithm can calculate the complete intensity matrix in Eq. (19), including the various intermolecular NOESY contacts between the ligand and the protein in their bound state, and the exchange-mediated peaks between states and within the free state. When the exchange is fast on the chemical-shift scale as well as the cross-relaxation rate scales, the intensity matrix collapses to Eq. (30) given previously. The terms and represent the intermolecular NOESY contacts between the ligand and the receptor. The first few terms in the expansion of the exponential term give
where represents higher-order terms, is the concentration of the bound form of the receptor, and is a constant of proportionality.
2.5. Treatment of Nonspecific Binding
An important artifact in the analysis of experimental transferred NOESY data is the binding of a ligand to a protein in a nonspecific manner at multiple locations other than the binding pocket. The origin of this binding could be the presence of some very weak binding sites, electrostatic interactions between charged groups on the ligand and the receptor, or hydrophobic association with surface-accessible hydrophobic residues. The nonspecifically bound ligand acquires the rotational correlation time of the larger macromolecule and, due to the fast-exchange condition, will significantly contribute to the transferred NOESY experiment on the free ligand. Furthermore its conformation in the nonspecific location is likely to be different from that of the bound ligand within the active site. Conceivably, since several nonspecific binding sites are simultaneously occupied by the ligand molecules, the intermolecular cross relaxation will also be nonspecific in nature, and may be reflected as an increased leakage factor for the nonspecifically bound ligand. In the presence of a significant amount of nonspecific binding, there could be serious errors in the estimation of populations of specifically bound-ligand molecules and, hence, in the complete relaxation and exchange matrix calculations. Nonspecific binding is inherently somewhat difficult to quantitate and properly correct for. In the following we will describe three approaches. An elegant approach for correcting contributions from nonspecific binding involves recording transferred NOESY spectra on a ligand of interest by performing experiments with and without a tightly binding inhibitor and subtracting one from the other (Behling et al., 1988). Here the assumption is that tightly binding inhibitor preferentially occupies the binding site, and hence any transferred NOESY of a ligand in the presence of the inhibitor reflects only nonspecific contributions. These
NOESY intensities could be subtracted from the tr-NOESY with ligand alone, to arrive at the tr-NOESY spectrum of the specifically bound ligand. This scheme should work well for most proteins and enzymes with well-defined binding pockets. It is less clear if such an approach would work satisfactorily for enzymes with large-scale domain motions with the active site forming only when two domains are closed. This is because the amount of nonspecific binding can be different in the open and closed states due to differences in accessible surface areas for the ligand, and thus the method will only correct for nonspecific binding in the closed state.
Other investigators have focused on identifying sample conditions where such nonspecific binding is minimal (Murali et al., 1997; Jarori et al., 1994). Typically, for each system under investigation, the tr-NOESY for a peak was measured as a function of absolute ligand concentration, while holding the ligand–enzyme ratio constant. For ligand concentrations in the 1- to 2-mM range, the NOE typically remained constant but dramatically increased for higher concentrations (Murali et
al., 1997). This increase at higher concentrations was interpreted to be the result of nonspecific or weak binding. Some of these effects could also be due to increased solvent viscosity associated with increasing enzyme concentration. Performing tr-NOESY at lower ligand absolute concentrations (typically 1 to 2 mM), where the NOE remains constant, reduces the nonspecific binding contributions significantly, as shown by these investigators (Murali et al., 1997; Jarori et al., 1994). Even though this approach significantly minimizes the contributions from nonspecific binding (compared to the high-ligand-concentration case), it is reasonable to
assume they will not be eliminated entirely. To that extent, corrections for residual nonspecific binding may be needed to further improve data analyses. Our solution to the nonspecific-binding problem involves treating it as an optimizable parameter to get the best fit between experimental and calculated tr-NOESY spectra. To generate the appropriate average relaxation rate matrix, we
consider the following kinetic scheme for the molecular species: state 1 corresponds to a free ligand and a free enzyme. State 2 corresponds to the enzyme with nonspecifically bound ligand only. In state 3, the active site of the enzyme from state 2 (with nonspecifically bound ligand) is occupied by a specifically bound ligand derived from the free-ligand pool in state 1 (one can include an additional
pathway where the active site can also be occupied by a molecule from the nonspecifically bound ligand pool in state 2 without altering the results). Under the fast-exchange condition for all states, it can be shown that the above scheme reduces to Eq. (30), where
where
In the above equation, and are the fractional populations of the ligand respectively in its free form (state 1), nonspecifically bound to the enzyme with unoccupied active site (state 2), nonspecifically bound to the enzyme when its active site is occupied (state 3), and specifically bound (in state 3); i.e., Similarly, and are, respectively, the fractional populations of the enzyme in its free (state 1), nonspecifically bound with unoccupied active site (state 2), and specifically bound (state 3) forms, with
Note that the enzyme in state 3 has both specifically and nonspecifically bound ligands attached to it. One can make the reasonable assumption that the conformation of the nonspecifically bound ligand does not alter from its free state. With this assumption, the relaxation rate matrix differs from that of the free-ligand rate matrix in only two respects: (i) its rotational correlation time, which is now identical to that of the enzyme complex, and (ii) it may experience leakage factors different from that in the free state. From a prior knowledge of the conformation of the free ligand, and the rotational correlation times for the free and bound forms, the optimizable parameters to correct for nonspecific contributions are reduced to the sum of fractional populations and the leakage factor, which we will assume is identical for all ligand protons in its nonspecific bound form. If the ligand molecule is sufficiently small, then the relaxation rate matrix for the nonspecifically bound enzyme can be set identical to that of the free enzyme, The current version of CORCEMA, however, does not use the simplification implied in Eqs. (50) and (51), but requires specification of the full three-state model.
3. METHODS
3.1. The CORCEMA Program
The current version of CORCEMA is designed primarily to compute NOESY and ROESY spectra for different proposed models of a dynamical system while optimizing some chosen parameters (e.g., bound and free correlation times, off-rates, order parameter, and internal correlation time) to get the best fit with the
experimental data. Figure 6 shows the CORCEMA protocol for version 1.74. Iterative optimization of the conformations of the ligand and the active site to get best fit with NMR data is planned for the future. Currently we use simulated annealing and Powell minimization to optimize nonstructural parameters like exchange rates, equilibrium constants, correlation times, etc. The program can calculate spectra for an N-state model of conformational exchange. The entire
program was written in C and has a modular architecture so that it can be modified easily. The program is compiled on a Silicon Graphics workstation with UNIX operating system. No machine-specific calls are made in the program, so it will be compatible with other computers and operating systems. The required input files are the number of states involved in the equilibrium (e.g., three states for the
hinge-bending motion), the coordinates (in PDB format) of the various molecular species in their free and bound forms, overall rotational correlation times, and magnitudes of the various conformational exchange rates (i.e., off- and on-rates, as well as the hinge-bending motion rates). The enzyme on- and off-rates could be obtained by independent methods. Next, flags are set to include internal motions
for methyl groups and aromatic rings. It is assumed that the internal rotation correlation time for the methyl groups is much shorter than the overall rotational correlation time for the ligand. The effect of internal motions on intramethyl and
external methyl (i.e., and interactions was incorporated using the model-free approach (Lipari and Szabo, 1982a, 1982b), empirically modified by assigning order parameters and to protons i and j in the internuclear vector (Baleja et al., 1990). For aromatic rings, it is assumed that the ring-flip correlation times are much longer than the rotational correlation times, but much shorter than
the cross-relaxation times;
method is used to account for modulation of internuclear distances. The third stage involves creation of generalized rate matrices for relaxation (R) and kinetics (K), based on the model under consideration. Next, the dynamic matrix D is created, symmetrized, and diagonalized using the QR factorization method (Press et al., 1992). In the prefinal stage, a file consisting of desired peak intensities (cross peaks including exchange-mediated peaks, diagonal peaks, or sums of appropriate sets of peaks in the case of fast exchange on the chemical-shift scale) is read to print the intensities and compare them to experimental values. This comparison involves a calculation of NOE R-factor (Xu et al., 1995a, 1995b; Krishna et al., 1978) obtained by optimization of nonstructural parameters to get the best fit. The implementing program for the optimization protocol in Fig. 6 computes the rates for the ligand and the enzyme in the dynamical model under consideration, as well as the concentrations of the different species under equilibrium, given a subset of exchange rates, equilibrium constants, and total ligand concentrations. These species concentrations are used to define the concentration matrix C and the symmetrization matrix S (in the current version we have normalized the concentrations with respect to one of the species and reexpressed them in terms of ratios of appropriate rates).
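The comparison step can be sketched in a few lines of Python. The R-factor below uses one commonly quoted form, the square root of the summed squared deviations divided by the summed squared experimental intensities, which is assumed here rather than taken from the cited papers. The example diagonalizes a symmetric rate matrix with NumPy's symmetric eigensolver instead of the QR routine cited in the text, and it replaces simulated annealing and Powell minimization with a plain grid scan over a single parameter. The two-spin model and all names are hypothetical.

```python
import numpy as np

def noe_r_factor(i_exp, i_calc):
    """Assumed form: sqrt(sum((exp - calc)^2) / sum(exp^2))."""
    i_exp = np.asarray(i_exp, dtype=float)
    i_calc = np.asarray(i_calc, dtype=float)
    return float(np.sqrt(np.sum((i_exp - i_calc) ** 2) / np.sum(i_exp ** 2)))

def noesy_intensities(rate_matrix, tau):
    """exp(-R * tau) for a symmetric R via eigen-decomposition."""
    w, v = np.linalg.eigh(rate_matrix)
    return v @ np.diag(np.exp(-w * tau)) @ v.T

def scan_parameter(values, model, tau, i_exp):
    """Return the grid value with the lowest R-factor, and that R-factor."""
    scores = [noe_r_factor(i_exp, noesy_intensities(model(x), tau))
              for x in values]
    best = int(np.argmin(scores))
    return values[best], scores[best]

def two_spin_model(sigma):
    """Hypothetical two-spin rate matrix with autorelaxation fixed at 1.0 1/s."""
    return np.array([[1.0, sigma], [sigma, 1.0]])

tau = 0.2                                                # mixing time in s
observed = noesy_intensities(two_spin_model(-0.5), tau)  # synthetic "data"
grid = np.linspace(-1.0, 0.0, 11)
best_sigma, best_r = scan_parameter(grid, two_spin_model, tau, observed)
```

With noise-free synthetic data the scan recovers the cross-relaxation rate used to generate it; with real data the minimum R-factor would be nonzero and the grid would be replaced by a proper optimizer.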
3.2. Calculation of Concentrations
For a two-state model, with (Moseley et al., 1995)
the concentrations are given by
where
and
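For a simple bimolecular equilibrium E + L ⇌ EL, the species concentrations follow from mass conservation and the dissociation constant through the standard quadratic solution. The Python sketch below assumes that scheme with K_d = k_off / k_on; the function name, the returned keys, and the numerical values are illustrative only and are not CORCEMA inputs.

```python
import math

def two_state_concentrations(e_total, l_total, k_on, k_off):
    """Equilibrium concentrations for E + L <-> EL (all in molar units)."""
    k_d = k_off / k_on
    s = e_total + l_total + k_d
    # Physically meaningful root of [EL]^2 - s*[EL] + e_total*l_total = 0.
    bound = 0.5 * (s - math.sqrt(s * s - 4.0 * e_total * l_total))
    return {
        "free_enzyme": e_total - bound,
        "free_ligand": l_total - bound,
        "complex": bound,
        "k_d": k_d,
    }

# Hypothetical sample: 0.1 mM enzyme, 1 mM ligand, K_d = 10 uM.
conc = two_state_concentrations(1.0e-4, 1.0e-3, k_on=1.0e8, k_off=1.0e3)
ligand_fraction_bound = conc["complex"] / 1.0e-3
```

The fractional populations needed for the population matrices are then ratios such as `ligand_fraction_bound`.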
Similarly, for a three-state model (Bernasconi, 1986), with and the concentrations are
where and
3.3. Methods for Suppressing or Identifying Protein-Mediated Effects
While we wish to highlight in this chapter the advantages of ligand–protein cross-relaxation effects (i.e., protein-mediated spin diffusion and protein-induced leakage effects) in structure-based design using transferred NOESY, we also stress the importance of undertaking proper control experiments in which the effects of protein–ligand cross relaxation on the NOESY spectra are deliberately suppressed. A comparison of this control spectrum with that in which the protein-induced
effects are felt by the ligand then provides a basis for deducing or inferring the conformation of the ligand–receptor complex. Once the ligand conformation has
been quantitatively deduced without interference from protein protons, this conformation can, in principle, be properly “docked” within the binding pocket of the protein. This docking process can be guided from direct intermolecular transferred NOESY contacts whenever they are observable (Ramesh et al., 1996; Anglister et al., 1995; Scherf and Anglister, 1993) or indirectly through protein-mediated effects on intraligand NOE intensities (Curto et al., 1996). Since both intermolecular and intraligand TrNOEs can be dependent upon the relaxation rate matrix elements involving the active site residues [Eqs. (31) and (49)], these intensities can serve as experimental constraints in CORCEMA calculations that incorporate the binding pocket residues (Curto et al., 1996). More experience is needed in these types of calculations; however, the transferred NOE studies on fibrinopeptide analogs docked into the thrombin binding pocket (Ni et al., 1995) as well as our work on
the Trp-repressor–operator system (Moseley et al., 1997) and our joint work with Professor Thomas Peters on the sLex/E-selectin system (vide infra) are encouraging and point to the feasibility of such calculations.
3.3.1. Perdeuterated Receptors
The most obvious approach to eliminate ligand–protein cross relaxation is to employ perdeuterated receptors. By eliminating ligand–protein intermolecular cross relaxation altogether, the ligand transferred NOESY spectra are completely free of both protein-mediated spin diffusion and protein-leakage effects. This
approach has the added benefit of minimizing background signals from the receptor protons so that traditional 2D NMR methods will be adequate to probe the bound-ligand conformation. Typical tr-NOESY examples in literature include conformational studies of substrates interacting with perdeuterated yeast phosphoglycerate kinase (Shibata et al., 1995), honey bee venom melittin complexed to perdeuterated phosphatidylcholine vesicles (Okada et al., 1994), bound to perdeuterated lipid (Gounarides et al., 1993), and senktide, a neurokinin analog, bound to perdeuterated vesicles (Bersch et al., 1993). Selective deuteration of residues within the binding pocket of the receptor is useful in suppressing ligand– protein cross relaxation and identifying intermolecular contacts (Scherf and Anglister, 1993).
3.3.2. NMR Pulse Methods
Since perdeuteration of receptors is not always feasible or economical, one could exploit a large number of pulse sequences that in effect suppress protein-mediated spin-diffusion effects, which make dominant contributions during the early and mid-range portions of the NOESY time-course curves. In some of these methods, however, since the protein protons still contribute autorelaxation terms to the diagonal elements of the ligand relaxation rate matrix [terms of the type in in Eq. (3)], protein-leakage effects will persist and affect the decay portions of the ligand NOESY cross peaks. These methods are different
from subtraction methods (Andersen et al., 1987) that minimize baseline artifacts in the ligand NOESY spectrum due to broad protein resonances or due to intermolecular contacts with protein protons that resonate at a given ligand proton signal.
A inserted prior to also is effective in eliminating broad protein signals in the ligand tr-NOESY spectrum (Scherf and Anglister, 1993).
3.3.2a. Transferred NOESY with Short Mixing Times. One-dimensional transient NOE experiments (Wagner and Wüthrich, 1979; Gordon and Wüthrich, 1978; Krishna et al., 1978) obtained by the use of selective inversion or selective progressive saturation of chosen resonances have been proposed as a way of measuring direct NOEs between two protons before spin-diffusion effects become dominant. Similarly, 2D NOESY with very short mixing times will accomplish the same objective, at least in principle. In the presence of strong protein-mediated spin diffusion, this suggestion can be very difficult to realize in practice for two reasons. First, for very short mixing times the NOESY spectrum or its equivalent 1D transient NOE will suffer from a poor signal-to-noise ratio. Second, transferred NOESY simulations (Jackson et al., 1995; Moseley et al., 1995) on model systems with a correlation time of s under the fast-exchange condition showed that protein-mediated spin diffusion can become significant within the first 50 to 60 ms (Fig. 18). Thus, recording transferred NOESY spectra with extremely short mixing
times does not appear to be a universal solution, though it might still work for some specific examples, especially when protein-indirect effects are weak.
3.3.2b. Radio-Frequency Pulse Saturation of Receptor Resonances. Continuous rf irradiation within the protein envelope (Clore and Gronenborn, 1983) and the 2D MINSY experiment (Massefski and Redfield, 1988), in which the protein signals are selectively saturated during the mixing time, are among the simplest implementations to suppress protein-mediated spin diffusion. In both these methods it is important to ensure that the receptor protons within the binding pocket remain saturated. This could be ensured if the spectrum of the protein without the ligand could be recorded first, and some exploratory irradiation experiments carried out with radio frequency centered at different regions of the broad protein resonance envelope. For complexes where the ligand protons do not overlap with signals from the receptor binding site, the MINSY experiment might be attractive (e.g., if the receptor binding pocket is predominantly composed of aromatic residues and the ligand does not have any aromatic protons).
3.3.2c. Two-Dimensional ROESY. The 2D ROESY experiment (or more correctly the transferred ROESY) offers a very attractive method for separately identifying “direct” and “indirect” NOE contacts in the spectrum of a ligand reversibly binding to a protein. In the NOESY spectrum of a large molecule with both direct and indirect NOE contacts have the same sign as the diagonal, thus misleading the unwary experimentalist. In contrast, in the ROESY experiment, an expansion of the term as a series shows that all terms odd in have negative sign while even terms have positive sign with respect to a positive diagonal. Thus for short spin-lock times, the direct NOE contact has a sign opposite to that of the
diagonal, while the first indirect NOE contact peak has the same sign. (Some authors mistakenly ascribe the indirect effects in ROESY to spin diffusion, in
analogy to NOESY. This is a misnomer since the magnetization equation in the rotating frame cannot be converted to a diffusion equation because of the positive sign of Multispin effect is a more appropriate name for this magnetization transfer, in analogy to small molecules). Thus, a comparison of the NOESY and ROESY spectra should readily identify all protein-mediated indirect pathways. It is not uncommon for several NOESY peaks to be reduced in intensity or disappear altogether in a ROESY spectrum. This is due to a cancellation effect from direct and indirect contributions. A dramatic demonstration of the application of trROESY to identify protein-indirect effects has been given by Arepalli et al. (1995) in their study of the bound conformation of a disaccharide bound to a monoclonal antibody. Figure 7 shows the 2D transferred NOESY spectrum of the disaccharide in the presence of the Fab. Most noteworthy in this figure are the two intense peaks between H4 and (the H4 shows up as a doublet due to coupling with Based on the observation of this cross peak in a previous study (Glaudemans et al., 1990), these authors proposed a conformational change for the disaccharide. This conclusion, however, has been revised in the more recent study by Arepalli et al. (1995),
where they performed a ROESY experiment and concluded that the originally observed cross peak between H4 and hydrogens was due to protein-indirect effects. Their ROESY spectra on the free disaccharide and in the presence of the Fab are shown in Fig. 8.
3.3.2d. Network Editing Sequences. A number of network editing pulse sequences have been proposed to suppress spin diffusion during mixing time, and hence hold promise for suppressing protein-mediated spin diffusion in tr-NOESY experiments (Hoogstraten et al., 1995; Macura et al., 1994; Fejzo et al., 1991). These sequences exploit the differences in the signs of the off-diagonal elements of the relaxation rate matrix R for NOESY and ROESY
for large correlation times. In the direct NOESY (D.NOESY) method (Macura et al., 1994), spin diffusion is allowed to take place during the experiment, but its effects on the spectrum are eliminated by the addition of properly normalized NOESY and ROESY spectra. A number of editing sequences have also been proposed that restrict relaxation to direct effects only (Macura et al., 1992). In the selective NOESY (S.NOESY) method (Fejzo et al., 1992), selective 180° pulses that invert a selected band of spins are inserted on either side of a 90°–spin-lock–90° element, which replaces the normal NOESY mixing period, to retain only direct effects between the inverted spins and noninverted spins and to suppress all other indirect effects due to spin diffusion. There is also a loss of sensitivity in this method due to the application of a long
series of pulses to the resonances. More practical experience with these network editing techniques is needed to evaluate them for tr-NOESY applications.
3.3.2e. QUIET-NOESY and Related Methods. Pulse sequences such as QUIET-NOESY, QUIET-BIRD-NOESY, QUIET-EXSY, and their variants (Vincent et al., 1996a, 1996b; Zwahlen et al., 1994) involve selective excitation of resonances from a chosen pair of protons to monitor direct cross relaxation between them while suppressing spin-diffusion contributions from the intervening protons. An important advantage of some of these methods is that neither the chemical shifts of the intervening protons nor the existence of such protons needs to be known. These methods should also find applications in tr-NOESY to suppress the effects of protein-mediated spin diffusion on the ligand resonances. Since the smaller ligand molecules, when present in excess (over the receptor) and in fast exchange, generally yield sharp well-resolved signals, they are easily amenable to these
selective excitation schemes. By using a labeled ligand and an unlabeled receptor, the QUIET-BIRD-NOESY can, in principle, give one-step magnetization transfer connectivities between all ligand protons attached to the labeled heteronuclei.
3.4. Methods for Observing Intermolecular Transferred NOESY Contacts
Because the intermolecular tr-NOE contacts between a reversibly binding ligand and its receptor are very useful in properly docking the ligand within the active site, and in quantitative CORCEMA calculations, it is worthwhile to explore
optimal experimental methods for observing them. The many limitations associated with observing intramolecular NOEs in high-molecular-weight systems, such as line broadening and spin diffusion, also apply, albeit in a somewhat less severe manner, to the observation of intermolecular tr-NOEs. For example, because the line shapes for intermolecular ligand–receptor tr-NOEs in the fast-exchange limit reflect the ligand in one dimension and the receptor in the second dimension, these
peaks are inherently somewhat sharper than the intrareceptor NOEs, and hence are comparatively easier to observe. The reported intermolecular tr-NOESY cross peaks in a 37-kDa ligand-protein–DNA complex (Lee et al., 1995) and in ~ 50-kDa peptide–antibody complexes (Arepalli et al., 1995; Scherf et al., 1992) are reasonably sharp and suggest the feasibility of observing these highly informative cross peaks in other similar systems. In larger-molecular-weight systems it may be worthwhile to explore methods such as random fractional deuteration (LeMaster, 1989) and reverse protonation of an otherwise deuterated receptor as a way of reducing dipolar broadening and severe spin-diffusion problems. With random fractional deuteration, however, determination of proper concentrations of different protons within the protein for CORCEMA calculations might become a problem.
Experimental observations (Anglister et al., 1993; Anglister and Zilber, 1990; Glasel, 1989; James, 1976) and theoretical calculations (Curto et al., 1996; Moseley
et al., 1995) suggest that the intermolecular tr-NOESY contacts are easier to observe at lower ligand–receptor ratios, typically Indeed, it is not uncommon to observe intermolecular tr-NOE contacts during routine tr-NOESY measurements, especially when the ligand–receptor ratio is maintained relatively small At high ligand–receptor ratios they may be usually too weak to observe, especially for very high molecular weight receptors. Many times, however, the requirement of using lower ligand–receptor ratios also means a tr-NOESY spectrum that is dominated by broad featureless resonances from the high-molecular-weight receptor signals (e.g., Fig. 9), thus making it difficult to resolve the
somewhat sharper and interesting intermolecular tr-NOESY peaks. Luckily, a number of techniques can be used to suppress these broad receptor signals to reveal tr-NOESY signals of interest. We will briefly summarize some of them. Since intermolecular tr-NOEs build up and decay comparatively faster than the intraligand tr-NOEs, a second requirement for optimal observation of intermolecular tr-NOEs is that the mixing time be kept relatively short (Curto et al., 1996; Arepalli et al., 1995). Crystallographic data can sometimes aid the assignment of these intermolecular tr-NOESY peaks.
3.4.1. Two-Dimensional Transferred NOESY Difference Spectroscopy
Anglister and co-workers have successfully used a 2D tr-NOE difference
spectroscopy method to suppress the broad resonances from the antibody receptor and to identify specific intermolecular NOE contacts between the peptide ligand and the antibody protons (Anglister et al., 1993; Scherf and Anglister, 1993; Scherf et al., 1992; Anglister and Zilber, 1990). Typically, NOESY spectra under identical
conditions are collected on two samples of the peptide–antibody mixtures, one in which the peptide is in excess (typically four- to fivefold excess over the antibody), and a second sample in which the ratio is 1:1. A 2D tr-NOE difference spectrum is
obtained by subtracting the second spectrum from the first one. Figure 9 shows the remarkable effectiveness of this method in suppressing the broad resonances of the high-molecular-weight receptor while retaining the important intraligand (cholera toxin peptide CTP3) and intermolecular ligand–receptor NOE contacts for the
cholera toxin peptide (CTP3) bound to the 50-kDa Fab fragment. A number of well-resolved peaks arising from intermolecular peptide–antibody NOE contacts are readily visible in this spectrum, in addition to intraligand tr-NOEs. The amino acid types in the antibody-combining site contributing intermolecular tr-NOEs with
the peptide ligand were identified by perdeuteration and partial deuteration of suspected amino acids (e.g., Phe, Tyr, Trp) and the concomitant disappearance of the cross peaks. In addition, these investigators have also successfully identified
interactions associated with a specific chain in the antibody by specifically labeling the heavy chain or the light chain. Typical results are shown in Fig. 10. Employing an antibody-combining site model based on crystallographic data from other
antibodies, these authors docked the peptide ligand (Fig. 11 for CTP3 bound to the combining site of TE33) into the binding site using distance restraints from intra- and intermolecular tr-NOE data together with energy minimization and molecular dynamics.
3.4.2. $T_{1\rho}$ Spin-Lock and $T_2$ Spin-Echo Relaxation Filters
$T_{1\rho}$ spin-lock and $T_2$ spin-echo relaxation filters take advantage
of the different spin relaxation properties of the ligand and the receptor arising from their vastly different sizes. Typically, large proteins have much shorter transverse
relaxation times, and hence their signals decay faster in the transverse plane than the signals from the low-molecular-weight ligands. Hence, a $T_{1\rho}$ or $T_2$ relaxation filter can effectively suppress the broad signals from the receptor while retaining only the signals from the ligand. Depending on whether these filters are located at the beginning of the $t_1$ or $t_2$ periods, the broad receptor
signals will be filtered from that particular dimension. Figure 12 shows a spectrum obtained by Arepalli et al. (1995), who employed a 20-ms spin-echo following the observed pulse to identify the intermolecular tr-NOESY contacts between a disaccharide [methyl O-β-D-galactopyranosyl-(1→6)-4-deoxy-2-deuterio-4-fluoro-β-D-galactopyranoside] and the Fab derived from the antibody
X24. This part of the spectrum clearly identifies contacts between specific protons on the disaccharide and some aromatic residues within the binding pocket of the
Fab. Anglister and co-workers employed a filter with a 20-ms spin-lock pulse at the beginning of the period to observe the intermolecular tr-NOEs between a cholera toxin peptide and Fab (Scherf and Anglister, 1993). Similarly,
Casset et al. (1997) demonstrated the separate use of spin-lock and spin-echo relaxation filters to observe intermolecular tr-NOEs between the nonreducing disaccharide moiety of Forssman pentasaccharide reversibly binding to Dolichos biflorus lectin. Typical results are shown in Fig. 13. The spin-lock filters can result in a slight loss in sensitivity (Scherf and Anglister, 1993) as well as in some artifacts, and methods to overcome these have been suggested (Ni, 1994).
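The selectivity of such relaxation filters can be illustrated with a rough numerical sketch. It assumes simple mono-exponential decay of transverse magnetization during the filter; the two relaxation times below are hypothetical values chosen only for illustration and are not taken from the studies cited above. For a spin-lock filter the same expression would apply with $T_{1\rho}$ in place of $T_2$.

```python
import math

def surviving_fraction(filter_ms: float, t2_ms: float) -> float:
    """Fraction of transverse magnetization left after a filter of length
    filter_ms, assuming mono-exponential decay with time constant t2_ms."""
    return math.exp(-filter_ms / t2_ms)

# Hypothetical relaxation times, chosen only for illustration.
RECEPTOR_T2_MS = 5.0    # assumed: broad lines of a large receptor
LIGAND_T2_MS = 200.0    # assumed: sharp lines of a small ligand in excess
FILTER_MS = 20.0        # filter length quoted in the text

receptor_left = surviving_fraction(FILTER_MS, RECEPTOR_T2_MS)
ligand_left = surviving_fraction(FILTER_MS, LIGAND_T2_MS)

print(f"receptor signal remaining: {receptor_left:.3f}")
print(f"ligand signal remaining:   {ligand_left:.3f}")
```

With these assumed values the receptor signal is reduced to a few percent while most of the ligand signal survives, which is the qualitative behavior the filters rely on.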
3.4.3. Isotope-Selected/Filtered Methods
By selectively labeling one of the partners in the interacting pair with a suitable isotope, one can use double-half-filters to observe the different subspectra associated with the ligand–receptor complex (Wider et al., 1990). For
example, the intermolecular tr-NOESY spectrum can be observed without interference from the intraligand and intrareceptor NOESY spectra using selected–X or its reverse double-half-filter pulse sequences (Wider et al., 1990). Several other isotope-filtered pulse sequences useful in identifying intermolecular NOESY contacts have also been described (Folkers et al., 1993; Ikura and Bax, 1992; Gemmecker et al., 1992; Ikura et al., 1992). Table 1 of Lian et al. (1994) gives a summary of these methods. Even though most of these methods have been widely used for tight-binding complexes, they should be applicable to identifying transferred intermolecular
NOEs in reversibly binding complexes as well. To overcome the limitations associated with tedious phase cycling and sensitivity loss due to transverse magnetization decay in these earlier sequences, a pulsed-field-gradient-based isotope-filtered 3D HMQC–NOESY sequence (Lee et al., 1994) and a 2D isotope-edited NOESY sequence (Lee et al., 1995) have been developed for identifying intermolecular ligand–receptor NOE contacts. Figure 14 is an example of the application of a 2D NOESY pulse sequence to identify intermolecular transferred NOEs between a tryptophan and its 37-kDa repressor–operator complex. In addition to contacts between bound forms of the ligand and the receptor, this figure also shows exchange-mediated intermolecular tr-NOEs between the free ligand and the bound form of the receptor (Curto et al., 1996).
3.5. Structure Refinement Calculations
A large number of structures from tr-NOESY studies have been published in which the structure refinements were limited to the ligand only. Implicit in such
calculations is that the ligand–protein intermolecular cross relaxation has a negligible influence on intraligand tr-NOESY, and that one is dealing with a two-state model in the fast-exchange situation. In the following we restrict our discussion to those cases where one or both of these assumptions is not valid. For discussion purposes, the ligand–receptor complexes will be divided into two groups—small to medium-size systems where complete NMR structural determination of the entire system is possible, and larger complexes where the receptor may be too large for solution NMR structure determination with high resolution. In all these structure refinement calculations, it is highly desirable to obtain independent estimates of as many parameters (e.g., correlation times, off-rates, binding constants, hinge-bending rates, etc.) as possible to reduce the dimension of
the search surface for the remaining variables to be optimized and to facilitate the search for the global minimum (Moseley et al., 1995).
3.5.1. Small to Medium-Size Complexes
We will assume that the receptor macromolecule is amenable to recombinant expression, and by virtue of its smaller size one could employ the complete battery
of modern 3D and 4D NMR techniques (Clore and Gronenborn, 1991) to determine the assignments and conformation of residues within and surrounding the active
site, with and without the bound ligand. As previously mentioned, it is useful to undertake tr-NOESY measurements at several ligand–receptor ratios. Ligand–receptor intermolecular tr-NOE contacts as well as intraligand tr-NOEs can be measured without interference from the receptor signals by isotope-filtered and -edited NMR on complexes where only one of the molecules is labeled (Ramesh et al., 1996; Lee
et al., 1995). The tr-NOE data [i.e., intraligand, intrareceptor (active site), and intermolecular contacts] can be supplemented by torsion-angle constraints from vicinal coupling-constant data on the bound ligand. This latter information can be deduced under fast-exchange conditions by measuring the ligand vicinal coupling constants as a function of increasing bound-ligand concentration, and then extrapolating them to the limit of 100% bound ligand (Campbell et al., 1992; Campbell and Sykes, 1991). These data could be used in a variety of refinement methods to deduce the ligand–receptor (active site) conformation. We briefly summarize some of them.
3.5.1a. Testing between Several Models. The simplest method of analysis involves testing between several alternative models for the complex by predicting tr-NOESY intensities using CORCEMA (Moseley et al., 1997, 1998). In the current version of CORCEMA, the algorithm computes the predicted intra-tr-NOESY as well as inter-tr-NOESY spectra for each proposed model (including the crystallographic structure, if available), and compares them with experimental data using NOE R-factor analysis (Xu et al., 1995a, 1995b; Krishna et al., 1978). Some
parameters, such as correlation times, leakage factors, and off-rates, etc., could be optimized to get the best fit in each case. If the agreement is not satisfactory,
alternative models for the complex could be proposed, and the model that gives the best fit can be identified (Krishna et al., 1978). If the intrareceptor NOESY spectra are observed experimentally, this data can be included in this procedure to directly determine the ligand-induced structural perturbations. This method is useful if one is testing among several proposed models (including crystallographic structures) to see which one is most compatible with the NMR data. It may also be used for a manual refinement of the structure of the complex, although this can be a rather
tedious task if there are too many variables or if the active site is flexible.
3.5.1b. Distance-Constrained Methods. In this approach, which does not use CORCEMA analysis, one first classifies the intraligand and intermolecular ligand–receptor tr-NOESY intensities (as well as intrareceptor tr-NOEs) into distance constraints with upper and lower bounds (Kuntz et al., 1989; Scheek et al., 1989; Nilges et al., 1988; Braun, 1987; Clore et al., 1986) using a qualitative
isolated spin-pair approximation (ISPA). Using intraligand distance constraints [and torsion-angle constraints from coupling-constant measurements (Campbell et al., 1992)], suitable models for the bound-ligand conformation are generated using standard distance geometry–restrained molecular dynamics–restrained energy minimization procedures. Next, using a known or an approximate conformation for the active site (e.g., from NMR or crystallography), one could first approximately dock the ligand within the active site (from the known intermolecular contacts and any additional information such as hydrogen bonds). This starting structure for the ligand–active site complex can be refined using intermolecular distance constraints, while holding the active site residues fixed using distance constraints. This procedure
exploits the sensitivity of the intra-tr-NOESY and inter-tr-NOESY to proton–proton
distances (Curto et al., 1996; Ramesh et al., 1996). An advantage of distance-constrained methods such as distance geometry is that they are relatively less CPU intensive, and hence fast. A disadvantage is the loss of information (e.g., spin diffusion, protein leakage, finite off-rates, and internal motions) associated with a classification of intensities into distances using strong, medium, weak criteria. For example, in the presence of strong protein-mediated spin diffusion between a pair of ligand protons, the corresponding intraligand tr-NOEs at short mixing times can be very intense even when the distances are large (Jackson et al., 1995). Thus, standard distance geometry type of calculations using distance constraints based on intensities will result in compact structures by this approach since multispin effects are not properly accounted for in these methods. Similarly, if the fast-exchange condition is not satisfied, the magnitude of transferred NOEs can be smaller and can potentially result in slightly expanded distance geometry structures.
3.5.1c. Intensity-Restrained Refinement. The above limitation with distance-constrained methods involving a loss of information about multispin effects is lifted by the total relaxation matrix treatments (Keepers and James, 1984; Krishna et al., 1978). CORCEMA can be used in intensity-based refinement procedures (Xu et al., 1995; Mertz et al., 1991; Borgias and James, 1988) to iteratively optimize a target function consisting of experimentally measurable intensities only (e.g., intraligand tr-NOESY and inter-tr-NOESY in our case). The target function can be constructed to be simple (Yip and Case, 1989; Borgias and James, 1988) or variable (Xu et al., 1995a, 1995b; Guntert et al., 1991; Mertz et al., 1991; Braun, 1987). The optimization can be efficiently performed either with a least-squares refinement (Borgias and James, 1988), numerical or analytical gradient-based intensity-restrained refinement (Xu et al., 1995a, 1995b; Mertz et al., 1991; Yip and Case, 1989), or simulated-annealing-based methods (Xu and Krishna, 1995; Bonvin et al., 1994), or a combination of these.
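The contrast between the isolated spin-pair approximation and a total relaxation matrix treatment can be sketched numerically. The example below is a minimal illustration and not the CORCEMA algorithm: it treats a single rigid species with no exchange, keeps only the zero-frequency spectral density (slow-tumbling limit), and uses an assumed correlation time, mixing time, and three-proton geometry. The function names are hypothetical.

```python
import numpy as np

# Dipolar constant (1/10)*(mu0/4pi)^2 * hbar^2 * gamma_H^4, in A^6 s^-2.
Q = 5.7e10

def rate_matrix(coords, tau_c, leak=0.0):
    """Relaxation rate matrix in the slow-tumbling limit:
    sigma_ij = -Q*tau_c/r_ij^6 and rho_i = sum_j |sigma_ij| + leak."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    R = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            r = np.linalg.norm(coords[i] - coords[j])
            R[i, j] = R[j, i] = -Q * tau_c / r**6
    # Diagonal is still zero here, so the row sums contain only sigma terms.
    R[np.diag_indices(n)] = -R.sum(axis=1) + leak
    return R

def noesy(R, tau_m):
    """Normalized intensity matrix exp(-R*tau_m), via diagonalization of R."""
    lam, V = np.linalg.eigh(R)
    return (V * np.exp(-lam * tau_m)) @ V.T

TAU_C = 10e-9   # assumed correlation time (s)
TAU_M = 0.1     # assumed mixing time (s)
# Three protons on a line, 2.5 A apart, so protons 0 and 2 are 5.0 A apart.
line = [(0.0, 0.0, 0.0), (2.5, 0.0, 0.0), (5.0, 0.0, 0.0)]

full = noesy(rate_matrix(line, TAU_C), TAU_M)
pair_only = noesy(rate_matrix([line[0], line[2]], TAU_C), TAU_M)

i_far_full = full[0, 2]       # 0-2 peak with the relay proton present
i_far_pair = pair_only[0, 1]  # 0-2 peak for the isolated pair
# ISPA distance for the 0-2 pair, calibrated on the 0-1 peak at 2.5 A.
r_ispa = 2.5 * (full[0, 1] / i_far_full) ** (1.0 / 6.0)

print(f"0-2 peak, full matrix: {i_far_full:.4f}")
print(f"0-2 peak, isolated pair: {i_far_pair:.4f}")
print(f"ISPA distance for the 5.0 A pair: {r_ispa:.2f} A")
```

With these assumed values the relayed cross peak is several times stronger than the isolated-pair estimate, and the ISPA distance comes out near 3.5 Å for a true separation of 5.0 Å, which is the kind of artificially compact result described above. Each call to `noesy` requires one diagonalization, which is why intensity-restrained refinements become expensive as the number of protons grows.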
Intensity-restrained refinements can prove to be computer intensive due to repetitive diagonalizations of the dynamic matrix during optimization, but may serve as attractive methods if the total number of protons on the ligand and the active site residues under consideration is relatively small. These procedures typically use data at several mixing times, including long mixing times where spin diffusion from unobservable protons (e.g., active site residues of a high-molecular-weight enzyme) also can influence the observable intensities, and hence are amenable to refinement to some extent. Recent experimental work from our laboratory has confirmed this prediction (Xu et al., 1995b). In the presence of significant intermolecular cross relaxation, a rigorous characterization of the ligand–receptor complex requires a knowledge of the relaxation rate matrix for the macromolecule (active site residues), and can be deduced for moderate-size receptors in principle from intrareceptor NOESY. Even if the intra-NOESY spectrum for the active site residues is not amenable for direct observation (e.g., for high-molecular-weight complexes), because the intraligand tr-NOESY [Eq. (31)] and the inter-tr-NOESY [Eq. (49)] both depend upon the
relaxation rate matrix for the bound form of the receptor, they have the potential to serve as experimental constraints in the refinement of the active site conformation. In these refinements, it is helpful to take into account external leakage factors
as accurately as possible to calculate the intensity profiles. Some typical leakage factors arise due to weak interactions with paramagnetic oxygen in solution, exchange of amide protons with bulk solvent, and dipolar interaction of amide hydrogens with the quadrupolar nitrogen or the labeled heteronucleus (Dellwo et al., 1994; Liu et al., 1993; Krishna et al., 1978).
3.5.1d. Iterative Refinement Employing Distance Restraints and CORCEMA Back-Calculation of Transferred NOESY Spectra. The above limitations about loss of information in purely distance-constrained methods such as distance geometry or restrained molecular dynamics can be lifted by complementing them with a CORCEMA back-calculation of tr-NOESY spectra to properly account for multispin effects and internal motions. This type of procedure is an integral part of some standard hybrid-matrix-based algorithms used in refinement to a single rigid structure (Gorenstein et al., 1990; Borgias and James, 1990, 1989; Boelens et al., 1988). In principle, the relaxation rate matrix and, hence, the distances can be back-calculated directly from the experimental NOESY spectrum (Olejniczak et al., 1986). In practice, however, this is not always possible since many experimental intensities are not accessible due to overlap or other reasons (e.g., in our case, the intrareceptor NOEs for high-molecular-weight receptors or some intermolecular
tr-NOEs may not always be observable). In the traditional hybrid-matrix method, the missing elements in the experimental intensity matrix are supplemented with intensities back-calculated for an initial trial structure using a complete relaxation rate matrix algorithm. For transferred NOESY analysis, this last step can be accomplished using CORCEMA and a reasonable trial structure and parameters for the ligand–receptor complex (e.g., from crystallography). The back transformation is to an average relaxation rate matrix for fast exchange or into the dynamic matrix for the general case. After reconciling the various elements of the dynamic matrix in a manner analogous with some existing algorithms (Borgias and James, 1990), one can use the distance information in a distance geometry or restrained molecular dynamics procedure to generate a new trial structure. The corresponding full tr-NOESY spectrum (including intrareceptor and intermolecular tr-NOESY) for this new trial structure can be computed using CORCEMA, and the next cycle of optimization can be started. Since, in the presence of strong ligand–protein cross relaxation, both intraligand tr-NOEs [Eq. (31)] and intermolecular tr-NOEs [Eq. (49)] depend upon the elements of the relaxation rate matrix for the bound receptor, they can potentially serve as experimental constraints on the orientation of active site residues. If the intrareceptor NOEs are directly observable they can be directly included. After a few cycles of refinement, a self-consistent structure for the ligand–protein (active site) complex may be generated, as determined from a comparison of experimental and calculated intensities at several mixing times. Whether the procedures described in
Secs. 3.5.1c and 3.5.1d can be realized in practice can only be answered after further extensive work. Ni et al. (1992, 1995) used a slightly different iterative procedure involving distance geometry and spectral back-calculation. Section 5.1 contains a description of this procedure.
3.5.2. High-Molecular-Weight Complexes
Here we consider strategies for receptors that are too large for standard NMR structural determination. For such systems, since a quantitative
interpretation of tr-NOESY requires a knowledge of the active site residues and their coordinates, one has to rely on the crystallographic structure of this protein or
of a related homologous protein to serve as a starting structure. Even though intrareceptor NOEs will not be amenable to direct observation due to line broadening and spin-diffusion problems (and low concentrations of the receptor), intermolecular tr-NOEs may be observable, perhaps up to about 50 kDa. Random fractional deuteration (LeMaster, 1989) of the receptor or the TROSY implementation (Pervushin et al., 1997) may alleviate some of these problems and yield better-quality inter-tr-NOESY data which can be used for structure refinement purposes (however, estimation of precise concentrations of different receptor protons for use in CORCEMA calculations could be a potential problem due to nonuniformity in labeling). In those few instances where inter-tr-NOEs are observable, some tentative assignments for these could be made based on crystallographic
data with some reasonable assumptions. These assignments could be tested for self-consistency (see chapter 2 by Xu et al., in this series). Once high-quality intraligand tr-NOEs, and with some luck, some inter-tr-NOEs, have been recorded at several mixing times and several ligand–receptor ratios, some of the procedures listed above for low-molecular-weight complexes may still be applicable, with the important difference that experimental intrareceptor NOEs are usually not available. In the presence of significant ligand–receptor cross relaxation, the intraligand tr-NOEs (and inter-tr-NOEs when observable) are dependent upon the relaxation rate matrix for the bound form of the receptor, and hence may serve as experimental constraints on the active site conformation, at least in principle. Whether such optimizations are feasible or not, in practice, can only be judged by trying them out, and more studies and experience are needed in this area. However, the results from the work on thrombin-bound structures of human fibrinopeptide analogs using an iterative refinement procedure that consisted of a combination of distance geometry and tr-NOESY spectrum back-calculations (Ni et al., 1995), as well as our own work with sialyl tetrasaccharide bound to E-selectin (Moseley et al., 1998) that used a manual refinement, suggest that the above refinement methods are not unreasonable.
3.5.3. Normalization of Calculated and Experimental Intensities
In comparing the calculated intensities with respect to the experimental intensities, it is a common practice to use a scaling factor S to normalize them (Lian et al., 1994; Brünger, 1992; Borgias and James, 1988). This scaling factor is usually defined as where and are the calculated and experimental cross-peak intensities respectively, and the summation runs over all the cross peaks
observed experimentally, though in some optimizations the summation has been restricted to only some well-defined cross peaks (Lian et al., 1994; Moseley et al., 1997). Though this kind of normalization works reasonably well and we have used it (Moseley et al., 1997, 1998), it has the drawback that the fit (or lack of a fit) between a calculated and experimental NOESY curve will be less intuitive to interpret since any factor (e.g., protein-mediated spin diffusion) that seriously affects one or a few of the values used in the normalization also will affect all the remaining normalized calculated NOEs, including those that should not be affected by the factor. A better normalization procedure, we believe, is one where the cross peaks are referenced with respect to the corresponding diagonal peak intensities at zero mixing time (which, for long recycle delays makes them independent of the
model), and these diagonal peaks in turn are normalized between experiment and calculation (Xu et al., 1995b).
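The drawback of a single global scaling factor can be seen in a small numerical sketch. The definitions used here are assumptions made for illustration only: the scaling factor is taken as the ratio of summed experimental to summed calculated intensities, and the R-factor as a normalized sum of absolute deviations. Other definitions appear in the literature, and the intensities are invented.

```python
def scale_factor(calc, expt):
    """Assumed definition: S = sum(expt) / sum(calc) over the chosen peaks."""
    return sum(expt) / sum(calc)

def r_factor(calc, expt):
    """Assumed definition: sum|expt - S*calc| / sum|expt|."""
    s = scale_factor(calc, expt)
    return sum(abs(e - s * c) for c, e in zip(calc, expt)) / sum(abs(e) for e in expt)

# Hypothetical intensities (arbitrary units) for four cross peaks.
calc = [0.10, 0.20, 0.05, 0.15]
expt_ok = [1.0, 2.0, 0.5, 1.5]          # matches calc up to a scale of 10
expt_perturbed = [1.0, 2.0, 0.5, 3.0]   # last peak inflated by some extra pathway

s_ok = scale_factor(calc, expt_ok)
s_bad = scale_factor(calc, expt_perturbed)

# The first peak is unchanged experimentally, yet its scaled calculated value moves.
scaled_first_ok = s_ok * calc[0]
scaled_first_bad = s_bad * calc[0]

print(f"S = {s_ok:.2f} -> scaled first peak {scaled_first_ok:.2f}")
print(f"S = {s_bad:.2f} -> scaled first peak {scaled_first_bad:.2f}")
print(f"R-factor with perturbed peak: {r_factor(calc, expt_perturbed):.3f}")
```

Inflating one experimental peak changes the scaled calculated value of every other peak, including peaks that agreed perfectly before. Referencing each cross peak to its diagonal peak avoids this coupling.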
4. CHARACTERIZATION OF SOME CRITICAL FACTORS USING SIMULATED TRANSFERRED NOESY DATA
We have examined the role of several factors (vide supra) critical in tr-NOESY analysis using simulated data on a hypothetical ligand–enzyme system based on
the published X-ray structure of thermolysin with an irreversible inhibitor bound in the active site (Holland et al., 1992). Since our primary interest is to simulate the tr-NOESY results for different forward and reverse rates, we replaced the covalent bond between the inhibitor and the enzyme with a hydrogen to allow for reversible binding of this hypothetical inhibitor in our models. We have also drastically changed the orientation of the putative enzyme flap in order to better test effects of protein-mediated spin diffusion, as discussed below. Only the active site residues (a total of nine residues, consisting of N-112, A-113, F-114, W-115, N-116, E-143,
H-146, R-203, and H-231) of thermolysin were included in our models to expedite calculations. To simulate the effect of hinge-bending motions, it was assumed that Ala 113 of thermolysin is farther from the inhibitor in the open state, and is closer in the closed state, as shown in Fig. 5. A correlation time of somewhat shorter than normal, was deliberately assumed for the free ligand. For the bound form, a value of was chosen. For the methyl group internal rotation correlation time, a value of was chosen for the free and bound states.
4.1. Finite Receptor Off-Rates
Many traditional tr-NOE analyses have used the so-called fast-exchange approximation; i.e., the exchange rates are much faster than the cross-relaxation rates in the complex. As the enzyme off-rates become comparable to cross-relaxation rates in the bound state, one can intuitively anticipate that the magnitude of the tr-NOE will also diminish. If this effect is not properly taken into account, the diminished intensities might be misinterpreted in terms of wrong structures for the bound ligand. To demonstrate this effect, we have calculated the behavior of the NOESY cross-peak intensity connecting the two geminal protons (separated by a fixed distance of 1.8 Å) in the inhibitor in Fig. 5. They will be referred to as A–X in the following. For notational purposes, we will assume that A and X refer to the free-ligand protons (state 1 in Fig. 1), while A′ and X′ correspond to the ligand protons in the enzyme-bound form. To approximate the thermolysin–inhibitor interaction to a two-state situation, it was assumed that the open state of the enzyme is nonexistent and that the inhibitor goes to the closed state instantaneously upon complexation. In Fig. 15, we show the direct NOESY cross-peak intensities I(AX) and I(A′X′) as well as the two exchange-mediated cross-peak intensities I(AX′) and I(A′X). They were computed as a function of the off-rate and the mixing time. In this simulation, the dissociation constant was held fixed and the off-rate (and the on-rate) was varied. These figures show the dramatic effect of the off-rates on cross-peak intensities. Noteworthy is the change of sign of the I(AX) peak and the drop in intensity for the I(A′X′) peak as the off-rate increases. The two exchange-mediated peaks I(AX′) and I(A′X) have identical profiles. Addition of the intensities in the four panels will give the total intensity for the situation when conformational exchange is either fast on the chemical-shift scale or there is a chemical-shift degeneracy between the free and bound protons. In Fig.
16, the sum of the four cross peaks is shown over the entire range of off-rates. The composite NOESY peak [which derives a major contribution from the free-ligand peak I(AX)] shows a plateau for the fast-exchange condition, begins to drop in intensity as the off-rate approaches the cross-relaxation rate, and changes sign for very small off-rates. This figure underscores the importance of unequivocally establishing the exchange off-rate and the dissociation constants by independent methods, to establish the fast-exchange condition, if such an approximation is being used. As a hypothetical example, consider a large enzyme with a rotational correlation time of 50 ns (typical of ~200-kDa systems) and a ligand whose off-rate is estimated from its dissociation constant and a diffusion-controlled on-rate. The estimated off-rate is only about 12 times larger than the cross-relaxation rate for a geminal proton pair (1.8 Å), and about 86 times larger than the cross-relaxation rate of ~12 $\mathrm{s^{-1}}$ for protons separated by 2.5 Å, on the enzyme complex. Thus, the assumption of a fast-exchange approximation is not uniformly applicable for this case, and it is safer to undertake a CORCEMA analysis to obtain meaningful structural information for the bound ligand. This breakdown of the fast-exchange approximation is further exacerbated if the on-rate is less than the diffusion-controlled rate.
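The ratios quoted in this hypothetical example can be checked with a short calculation. The sketch below assumes the slow-tumbling limit, in which the proton–proton cross-relaxation rate is approximately $q\,\tau_c/r^6$ with $q \approx 5.7\times10^{10}$ Å$^6$ s$^{-2}$, and an off-rate of $10^3$ s$^{-1}$. The off-rate is an assumption chosen because it reproduces the quoted ratios of about 12 and 86; it is not a value read from the text.

```python
Q = 5.7e10  # (1/10)*(mu0/4pi)^2 * hbar^2 * gamma_H^4, in A^6 s^-2

def sigma_slow(r_angstrom: float, tau_c: float) -> float:
    """Magnitude of the 1H-1H cross-relaxation rate (s^-1) in the
    slow-tumbling limit, keeping only the zero-frequency spectral density."""
    return Q * tau_c / r_angstrom**6

TAU_C = 50e-9   # rotational correlation time quoted in the text (s)
K_OFF = 1.0e3   # assumed off-rate (s^-1), see the lead-in

for r in (1.8, 2.5):
    s = sigma_slow(r, TAU_C)
    print(f"r = {r} A: sigma = {s:.1f} s^-1, k_off/sigma = {K_OFF / s:.0f}")
```

An off-rate only one order of magnitude above the geminal cross-relaxation rate is a weak basis for assuming fast exchange, which is the point made above.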
4.2. Effect of Ligand–Receptor Ratio on the Ligand Transferred NOESY
In setting up tr-NOESY experiments, it is often useful to get an idea of the range of ligand–receptor ratios that can be employed for optimal sensitivity. Figure 17 shows the effect of varying the ratio on the geminal proton total tr-NOESY (i.e., sum of two direct plus two exchange-mediated peak intensities). It is clear that significant tr-NOE effects are obtained when the ratio is 1 to 75 for the thermolysin–ligand system chosen in the current study. For larger ratios, the tr-NOESY technique as a method is not as dramatic since one has to contend with small changes in the magnitudes of the negative intensities of the ligand alone as opposed to a reversal in signs of the NOESY intensities for smaller ratios. For larger enzymes, the useful range of ratios for tr-NOESY increases, as recognized earlier
(Clore and Gronenborn, 1983). Many early experimental investigations have tended to employ high ratios as a way of taking advantage of the sensitivity of the technique to reduce the amount of purified enzyme required for the measurements. By employing high ratios, one might inadvertently (i) significantly enhance
the relative efficiency of protein-mediated spin diffusion (vide infra) and (ii) increase contributions from nonspecific binding, both of which in turn can result in erroneous structures for the bound ligand if not properly accounted for. In practice, it is prudent to perform tr-NOESY measurements over a wide range of ratios (e.g., 50:1–2:1) to develop a model for the conformation and dynamics that is self-consistent over this range. In those circumstances where one is forced to use high ratios because of severe line-broadening problems at lower values, it is essential to do proper control experiments (see Sec. 3.3) to examine if
any significant ligand–protein intermolecular cross relaxation exists, and undertake additional measurements where protein-indirect effects are suppressed. Curto et al. (1996) simulated the dependence of the ligand–receptor intermolecular tr-NOESY intensity on the ligand–receptor ratio. As expected, the intensity decreases for large values of the ratio. A comparison of Fig. 17 for intraligand tr-NOESY and Fig. 6 in Curto et al. (1996) for intermolecular tr-NOESY shows that their intensity surfaces as a function of the ratio and mixing time are dramatically different. Thus, contrary to assertions by some investigators in the field, these two effects do not exhibit similar behavior.
4.3. Role of Ligand–Protein Intermolecular Dipolar Relaxation
Intermolecular ligand–receptor dipolar interactions modulated by exchange have rather complex effects on the ligand tr-NOESY (Jackson et al., 1995; Moseley et al., 1995; Ni and Zhu, 1994; London et al., 1992; Nirmala et al., 1992). Broadly speaking, these intermolecular interactions can result in two distinct classes of
effects on the ligand tr-NOESY spectrum, depending upon the geometrical arrangement of the ligand and receptor protons. These are (1) protein-mediated spin-diffusion effects, which can enhance the tr-NOEs even at short mixing times, and (2) protein-leakage effects, which lead to a decrease in tr-NOE intensity at
longer mixing times. The protein-mediated spin diffusion can sometimes result in dramatic effects (Jackson et al., 1995). In practice, a combination of these two effects might be manifested in the tr-NOESY spectra due to ligand–protein cross relaxation.
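How strongly ligand–protein cross relaxation is felt depends on the bound populations of both partners, which in turn follow from the ligand–receptor ratio discussed in Sec. 4.2. The sketch below computes these populations for simple one-site binding. The receptor concentration and dissociation constant are hypothetical values chosen for illustration, and the function name is an assumption.

```python
import math

def bound_fractions(ligand_total: float, receptor_total: float, kd: float):
    """Return (fraction of ligand bound, fraction of receptor bound) for
    one-site binding E + L <-> EL; all concentrations in the same units."""
    s = ligand_total + receptor_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * ligand_total * receptor_total)) / 2.0
    return complex_conc / ligand_total, complex_conc / receptor_total

RECEPTOR_TOTAL = 1.0e-4   # assumed total receptor concentration (M)
KD = 1.0e-5               # assumed dissociation constant (M)

for ratio in (2, 10, 50):
    p_lig, p_rec = bound_fractions(ratio * RECEPTOR_TOTAL, RECEPTOR_TOTAL, KD)
    print(f"ratio {ratio:>2}:1  ligand bound = {p_lig:.3f}  receptor bound = {p_rec:.3f}")
```

At high ratios only a small fraction of the ligand is bound while the receptor is nearly saturated, so pathways that pass through receptor protons are weighted much more favorably than pathways confined to the bound ligand.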
4.3.1. Protein-Mediated Spin-Diffusion Effects
The protein-mediated spin-diffusion effects have been demonstrated in simulations by Jackson et al. (1995) using a hypothetical two-state model for the thermolysin–leucine inhibitor complex (Fig. 5) under a fast-exchange condition. In this calculation, the tr-NOESY intensity between two ligand methyl proton groups (average distance 6.8 Å) has been computed as a function of mixing time. Because of the large distance, the NOESY is extremely susceptible to indirect effects from the protein and the ligand protons.
Jackson et al. (1995) demonstrated the relative effects of each from two sets of CORCEMA calculations: the first one in which the protein protons were gradually removed (equivalent to selective deuteration) while retaining the ligand protons (Fig. 18A), and the second one in which the ligand protons were gradually removed while retaining the protein protons (Fig. 18B). From an examination of Figs. 18A and B, it is easily seen that the tr-NOESY experiences the protein-indirect effects (or protein-mediated spin-diffusion effects) during the growth portions and mid-regions of the mixing times, while the ligand-indirect effects are more pronounced at longer mixing times, including during the decay of the intensity. Notice that, in this example, the initial lag period that is often used to identify protein-mediated spin diffusion is somewhat poorly defined for the case where all the alanine protons were included. Further, the true initial slope period is confined to the first 25 ms of the mixing time only. For small mixing times, the NOESY spectrum often suffers from poor signal/noise ratio, making it difficult to get good estimates of the initial slopes. Under these conditions, one might be tempted to fit the data in the 0–200 ms mixing times range by an initial slope approximation or a relaxation matrix analysis (limited to ligand protons only) to get estimates of intraligand distances. Such calculations will result in misleading compact structures for the bound ligand. Our calculations underscore the importance of properly incorporating the ligand–protein intermolecular cross relaxation.
London et al. (1992) illustrated these protein-indirect effects for a hypothetical ligand–enzyme system shown in Fig. 19. The ligand spin system, arranged at the corners of an equilateral triangle, forms a complex with the enzyme represented by
five protons equally spaced in a straight line. It was also assumed that there are no conformational changes in the ligand or the enzyme. In this model the enzyme proton 4 is closer to the ligand protons 1 and 2 than they are to each other. Figure 20A shows the variations in the tr-NOESY peak as a function of the exchange
rate. The protein-indirect effects are dramatic, especially at higher exchange rates.
The cross peak is dramatically more intense even at very short mixing times. Further, the intensities are substantially stronger than the reference intensity. This is a result of protein-mediated spin diffusion. In this instance, one might be misled into thinking that the distance between protons 1 and 2 is shorter in the bound state.
It is noteworthy that, although the actual 1–3 and 1–2 distances are identical (= 3.5 Å), they show distinctly different behaviors in their tr-NOESY intensities due to interactions with the enzyme protons—the first pair shows the protein-leakage effects, while the second pair shows the protein-mediated spin-diffusion effects. This is a consequence of the relative arrangement of ligand and enzyme protons within the active site. The theoretical basis for the dramatic effects due to protein-mediated spin diffusion has been described in the literature (Jackson et al., 1995). For fast exchange, neglecting the effect of free-ligand relaxation, the first few terms of the ligand tr-NOESY can be written as [from Eq. (31)]
In the equation, is the relaxation rate matrix for the bound ligand (note that the diagonal terms also include terms of the type due to ligand–protein intermolecular dipolar relaxation). The matrix (and its transpose) represents the ligand–protein cross-relaxation terms of the type The term that is linear in the mixing time corresponds to the traditional initial slope. The second- and higher-order terms in the mixing time contribute to direct and indirect effects. When the ligand is in high excess of the enzyme, as is the practice in traditional experiments, the It is seen from Eq. (54) that for the first indirect-pathway term (i.e., two-step transfer), the ligand-mediated spin-diffusion term is proportional to while a similar term due to enzyme-mediated spin diffusion is only proportional to Since is always for a high ratio, it is easily seen that the enzyme-mediated spin diffusion can be much more pronounced than the ligand-mediated spin diffusion for identical geometrical arrangement of the intervening protons from the ligand
and the enzyme. The physical basis for the differences in the efficiencies of ligand- and protein-mediated spin-diffusion pathways has been discussed (Jackson et al., 1995). Given nearly identical geometrical arrangements, the enzyme-mediated spin-diffusion pathways are dominant over the ligand-mediated spin-diffusion pathways for small mixing times under high conditions. Because of the dependence on in the presence of strong protein-mediated effects the sensitivity of the tr-NOE experiment to ligand-mediated spin-diffusion effects can be recovered at least in part by employing somewhat lower ratios of ligand–enzyme rather than the high ratios traditionally employed [see Fig. 3 in Jackson et al. (1995)]. The protein-mediated spin-diffusion effects have been experimentally demonstrated by Arepalli et al. (1995) using ROESY (Fig. 8). These studies allowed them to revise their earlier conclusion in which, without consideration of protein-mediated effects, a significant conformational change was proposed for a disaccharide on binding to an antibody (Glaudemans et al., 1990). An examination of some published 1D
CORCEMA Analysis of NOESY Spectra of Ligand–Receptor Complexes
tr-NOE spectra in the literature clearly demonstrates large intensity changes in the background enzyme resonances, presumably due to an intermolecular NOE in the complexed state, when a “free” ligand resonance is saturated. These intensity changes and the associated influence of ligand–enzyme cross relaxation have been routinely neglected in many studies. It will be interesting to reexamine these earlier tr-NOE measurements.
4.3.2. Protein-Leakage Effects
Using the hypothetical ligand–enzyme system shown in Fig. 19, London et al. (1992) demonstrated the reduction in tr-NOE intensity due to protein-leakage effects. Figure 20B shows the effect of these interactions on the tr-NOESY of
protons 1 and 3 on the ligand as a function of exchange rate. At longer mixing times the intensity is lower than that predicted by the reference curve, due to protein-leakage effects. Notice that the initial growth portions in the first 25–50 ms are identical. However, this short-mixing-time region usually suffers from poor signal-to-noise, and one is forced to focus attention on data at mixing times longer than 50 ms. If the protein-induced leakage effects are not identified as such, an analysis using correct correlation times for the complex might mislead one to the erroneous conclusion that the distance between protons 1 and 3 has become larger in the bound state. Thus, protein-leakage effects
might result in somewhat expanded structures for the bound ligand if not properly treated. Alternatively, one might be misled into deducing shorter correlation times for the complex. Depending upon the specific geometrical arrangement of ligand protons in relation to receptor protons at the active site, the tr-NOESY time-course curves for different proton pairs on the ligand might experience either protein-mediated spin-diffusion effects (which affect the growth portions), or protein-leakage effects (which primarily affect the decay), or a combination, or none of these.
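The two opposite effects described above can be illustrated with a minimal numerical sketch. It is a single-state, rigid toy model without chemical exchange, not the CORCEMA formalism: the cross-relaxation rate is assumed to be -q/r^6 (slow-tumbling limit), the intensities are taken as exp(-R tau), and the scaling constant `q`, the coordinates, and the function names are all illustrative assumptions. An enzyme proton placed between two ligand protons relays magnetization and raises their cross peak at short mixing time, whereas an enzyme proton close to only one of them drains magnetization and lowers the cross peak at long mixing time.

```python
import numpy as np
from scipy.linalg import expm


def relaxation_matrix(coords, q=570.0, leak=0.0):
    """Toy relaxation rate matrix (s^-1) for rigid, slowly tumbling spins.

    Assumes sigma_ij = -q / r_ij**6 with r in angstroms. The values of q
    and leak are illustrative placeholders, not values from the chapter.
    """
    coords = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)           # no self-interaction
    rate = -q / dist**6                      # off-diagonal cross relaxation
    # Diagonal: sum of the pairwise rates plus a uniform external leakage.
    np.fill_diagonal(rate, -rate.sum(axis=1) + leak)
    return rate


def cross_peak(coords, tau_m, i=0, j=1):
    """Normalized NOESY intensity between spins i and j at mixing time tau_m (s)."""
    return expm(-relaxation_matrix(coords) * tau_m)[i, j]


pair = [(0.0, 0.0, 0.0), (3.5, 0.0, 0.0)]    # two ligand protons, 3.5 A apart
bridge = pair + [(1.75, 1.785, 0.0)]         # enzyme proton about 2.5 A from both
drain = pair + [(-2.5, 0.0, 0.0)]            # enzyme proton near the first proton only

reference_short = cross_peak(pair, 0.05)
relayed_short = cross_peak(bridge, 0.05)     # protein-mediated spin diffusion
reference_long = cross_peak(pair, 0.5)
leaky_long = cross_peak(drain, 0.5)          # protein leakage

print(f"50 ms:  reference {reference_short:.4f}, with bridging proton {relayed_short:.4f}")
print(f"500 ms: reference {reference_long:.4f}, with draining proton {leaky_long:.4f}")
```

In both cases the ligand interproton distance is unchanged at 3.5 Å, so any difference from the reference curve comes only from the placement of the third proton.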
4.4. Ligand–Protein Intermolecular NOESY Intensity as a Function of Off-Rate
In Fig. 21 we have computed the intermolecular ligand–enzyme NOESY cross-peak intensity between the m1 methyl group of the ligand and the Ala 113 methyl group of the enzyme as a function of the off-rate. This intensity is a sum of four separate intensities (free ligand–free enzyme; bound ligand–bound enzyme; free ligand–bound enzyme; and bound ligand–free enzyme). The four contributions that would be observable if the exchange is slow on the chemical-shift scale are
shown in Fig. 22 as a function of off-rate and mixing time. Note that for fast-exchange rates, the free-ligand–free-enzyme cross peak develops intensity due to indirect pathways involving conformational exchange and cross relaxation in the bound
state (Curto et al., 1996; Moseley et al., 1995). The substantial nature of intermolecular ligand–enzyme NOESY contact is obvious from these figures. These results suggest that it may be feasible to model intermolecular contacts between two interacting species even when they are not tightly bound M).
4.5. Effect of Motions in the Protein–Ligand Complex on the Transferred NOESY
As a typical example of motions that can occur at the active site, we consider a fairly common situation associated with enzymes that undergo hinge-bending
motions; i.e., upon ligand binding, the enzyme undergoes a conformational transition from an “open state” to a “closed state” in which the ligand securely occupies the active site. We have also made the simplifying assumption in this model that the only way the ligand could be released from the closed state is by passing through the open state. The relevant scheme is shown in Fig. 4. The conformations of the ligand and the enzyme could be different in all three states.
4.5.1. A Simplified Example
First, to demonstrate the relative influence of both the off-rate and the hinge-bending rate on the tr-NOESY, we chose the example of a ligand with two
protons (AX) separated by a distance of 5.5 Å and a rotational correlation time of
In the open state of the complex, it was assumed that there is no additional dipolar interaction, but now the ligand tumbles slowly with the correlation time of
the enzyme In the closed state, a proton (M) from the enzyme approaches the A and X protons on the ligand with equal distances of 2.75 Å. Thus, in this model, the ligand–enzyme cross relaxation takes place only in the closed state. A uniform leakage factor of was included in all three states. Figure 23 shows the results for a mixing time of 4 s. The intensity surface in this figure is a reflection of the model used here, and displays three well-defined
plateaus associated with conformational averaging. For large off-rates one plateau is located in the very-slow-hinge-bending-rate region where the effective relaxation rate is given by [Eq. (47)], while the second one is located in the fast-hinge-bending region where the effective generalized relaxation rate is given by [Eq. (45)]. A third plateau, described by [Eq. (48)], is seen for very slow off-rates and large rates. A fourth plateau, for very small corresponds to the trivial case of no conformational exchange. The enormous variations observed in the intensity surface underscore the importance of identifying the proper range of off-rates and hinge-bending rates. Moseley et al. (1995) have simulated the effect of hinge bending on an intraligand tr-NOE for a hypothetical thermolysin–leucine inhibitor model to demonstrate the importance of incorporating such motions in a rigorous tr-NOESY analysis.
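The three-state scheme used in this example (free ligand, open complex, closed complex, with release from the closed state only through the open state) can be written as a small kinetic matrix. The following is a minimal sketch for the ligand populations only; it treats binding as a pseudo-first-order step, and the function names, the parameter `k_bind`, and the numerical rates are hypothetical choices made for illustration, not values from the chapter.

```python
import numpy as np


def kinetic_matrix(k_bind, k_off, k_close, k_open):
    """Rate matrix K (s^-1) for ligand states ordered free, open, closed.

    Populations evolve as dP/dt = K @ P. k_bind is an assumed
    pseudo-first-order binding rate (on-rate times free-enzyme
    concentration); there is no direct free <-> closed pathway.
    """
    return np.array([
        [-k_bind,  k_off,              0.0],
        [ k_bind, -(k_off + k_close),  k_open],
        [ 0.0,     k_close,           -k_open],
    ])


def equilibrium_populations(K):
    """Solve K @ p = 0 together with sum(p) = 1 by least squares."""
    n = K.shape[0]
    a = np.vstack([K, np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    p, *_ = np.linalg.lstsq(a, b, rcond=None)
    return p


# Hypothetical rates (s^-1): off-rate 1000, hinge closing 200, hinge opening 20.
K = kinetic_matrix(k_bind=50.0, k_off=1000.0, k_close=200.0, k_open=20.0)
p_free, p_open, p_closed = equilibrium_populations(K)
print(f"free {p_free:.3f}, open {p_open:.3f}, closed {p_closed:.3f}")
```

Each column of K sums to zero, so total ligand is conserved, and the equilibrium ratios reduce to detailed balance: open/free = k_bind/k_off and closed/open = k_close/k_open.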
5. EXPERIMENTAL EXAMPLES
Even though several studies have been published that employed tr-NOESY to
determine the bound-ligand conformations, the number of examples actually employing a CORCEMA type of analysis in which the receptor protons and/or finite
off-rates are explicitly considered is surprisingly small (Moseley et al., 1998, 1997; Casset et al., 1996, 1997; Rinnbauer et al., 1998; Ni et al., 1995, 1992; Ning et al., 1994). Some investigators have focused on the analysis of intraligand tr-NOE buildup curves under the fast-exchange assumption in terms of a relaxation rate matrix limited to the ligand protons only (Murali et al., 1997; Jarori et al., 1994; Bevilacqua et al., 1992). Distance-constrained refinement protocols such as distance geometry, restrained molecular dynamics, or restrained energy minimization methods have been used by several investigators to deduce the bound-ligand conformations (Hrabal et al., 1996; Schneider and Post, 1995; Fischer et al., 1995; Okada et al., 1994; Campbell and Sykes, 1991; Ni et al., 1990). Typically, the intraligand tr-NOESY intensities are classified as strong, medium, or weak, and corresponding distance constraints are set for use in a protocol that employs distance geometry or restrained molecular dynamics. The advantages and disadvantages of these methods over full relaxation rate matrix methods have already been considered in Sec. 3.5. The bound ligands were sometimes docked into the receptor binding pocket from a prior knowledge of intermolecular hydrogen-bond and/or salt bridge constraints (e.g., Schneider and Post, 1995; Ni et al., 1995) or intermolecular tr-NOESY contacts (Ramesh et al., 1996; Asensio et al., 1995a; Scherf et al., 1992). The use of direct methods such as MARDIGRAS has also been described in tr-NOESY analyses (Adams et al., 1997). In some instances, through an iterative distance geometry–full relaxation matrix back-calculation of tr-NOESY spectra (Ni et al., 1995) or a manual refinement with CORCEMA back-calculation (Moseley et al.,
1998), the docked ligand structures were further refined, thereby correcting for protein-indirect effects. For carbohydrate ligands, a combination of energy maps for glycosidic linkages, molecular mechanics, and simulated annealing was typically used (Weimar et al., 1995; Asensio et al., 1995a, 1995b). In the following sections, we will limit ourselves to some recent examples where ligand structure refinements explicitly incorporated protons from the receptor active site residues in a full relaxation rate matrix treatment.
5.1. Thrombin-Bound Structures of Human Fibrinopeptide Analogs
Ni and co-workers employed tr-NOESY to study the conformations of human fibrinopeptide A (FpA) and its analogs when bound to thrombin (Ni et al., 1995). FpA is a 16-residue peptide that is released from the chains of fibrinogen after proteolytic cleavage of the R16–G17 peptide bond by thrombin (Scheraga, 1986, 1983). The sequence specificity of thrombin–fibrinogen interactions has been studied in great detail, including the effect of naturally occurring mutations in fibrinopeptide A on this interaction. The mutation of G12 to V12 decreases the efficiency of thrombin-catalyzed cleavage of R16–G17 peptide bonds in peptides derived from the
chains of fibrinogen Rouen (Lord et al., 1990; Ni et al., 1989).
An analog (P15–FpA) was synthesized in which Val 15 was replaced by Pro 15 to restrict conformational freedom and potentially enhance binding affinity to thrombin. A second analog (P15–FpA Rouen), in which Gly 12 was replaced by Val 12 to mimic the mutation associated with fibrinogen Rouen, was also synthesized. In solutions of these peptides with bovine thrombin (ligand:protein ratio 25:1), transferred NOE measurements at 25°C and pH 5.5 exhibited chemical exchange cross peaks between free and bound forms for the ligands, with the P15–FpA peptide showing a significantly slower off-rate than P15–FpA Rouen. The bound structures for the peptides were calculated using distance geometry methods with approximate distances derived from transferred NOESY spectra at various mixing times, in combination with an iterative distance and structure refinement by comparing spectra calculated using full relaxation matrix treatment with the corresponding experimental spectra at longer mixing times (100, 150, 200, and 400 ms). These structures were further refined by a distance-restrained and electrostatically driven Monte Carlo method. The refined structures were docked into the thrombin active site using the distance geometry program DGEOM with the aid of hydrogen bonds and ion-pair interactions observed between inhibitors and trypsin-like serine proteases (including thrombin) in several protease–inhibitor
complexes. During docking, the thrombin structure was fixed as in the crystal structure, and the conformations of the peptide residues were allowed to vary. For calculation of transferred NOE spectra using complete relaxation matrix analysis, a minimum set of nine residues (H43, D189, G193, D194, S195, S214, W215, G216, and G219) at the catalytic site of bovine thrombin was explicitly included
for the intermolecular docking constraints used in the procedure. Experimentally
measured selective longitudinal relaxation rates for resolved resonances of the peptide ligands were used to replace the calculated diagonal elements of the relaxation rate matrix (Ni, 1994; Perlman et al., 1994). The dissociation constants
(0.04 mM for P15–FpA and 0.4 mM for P15–FpA Rouen) were estimated from values. Leakage rates of for the free peptide and for thrombin and the complex were used in the simulations. During the course of the docking refinement, some of the intraligand proton distances, which otherwise would have been very short to account for observed NOEs without incorporating the enzyme protons, had to be increased to allow for intervening enzyme protons (and the
concomitant protein-mediated effects on the tr-NOEs). The comparison with experimental spectra was made by converting the calculated NOE intensities to two-dimensional FIDs and utilizing the assigned chemical shifts and estimated linewidths for all resolved peptide protons. Figure 24 shows typical experimental and calculated spectra, while the final docked structures of P15–FpA and P15–FpA Rouen are shown in Fig. 25. The final transferred NOE-based structures for the thrombin-bound FpA were found to be closely similar to the crystal structure of FpA in the noncovalent
complexes with bovine thrombin. These authors suggested that the binding of FpA
Rouen to thrombin may require a conformational rearrangement of thrombin residues Ile 174 and Glu 217 to accommodate the bulky side chain of Val 12. Such
a structural requirement could possibly be the reason for the reduced conformational stability for the thrombin complex with FpA Rouen. Other studies on thrombin-bound structures have been described earlier (Hrabal et al., 1996; Ning et al., 1994, 1992).
5.2. Studies on Blood Group A Trisaccharide Bound to Dolichos biflorus Lectin
Transferred NOESY has become a popular method to study the conformations of oligosaccharide antigens bound to antibodies (Arepalli et al., 1995; Bundle et al., 1994; Glaudemans et al., 1990) and lectins (Scheffler et al., 1997, 1995; Poppe et al., 1997; Casset et al., 1997, 1996; Asensio et al., 1995b; Cooke et al., 1994; Bevilacqua et al., 1992). The blood group antigens are important in blood transfusion and can be used as indicators of tissue differentiation and malignancy. The lectin Dolichos biflorus recognizes blood group A oligosaccharides through its unique specificity for GalNAc residues. Casset et al. (1996) have performed tr-NOESY and tr-ROESY measurements on the blood group A trisaccharide to determine the conformation of the minimal antigenic determinant when complexed to the Dolichos biflorus lectin. Figure 26 shows the tr-NOESY and tr-ROESY spectra of the blood group A trisaccharide complexed with D. biflorus. The tr-NOESY spectrum identifies a
number of new interglycosidic NOEs that are absent in the uncomplexed oligosaccharide. However, most of these new peaks in the tr-NOESY are due to either ligand- or protein-mediated indirect effects, and have been readily identified as such by the tr-ROESY experiment. To quantitatively test for the bound conformations of the blood group A oligosaccharide, Casset et al. generated two conformations
that corresponded to the energy minima of two families of conformations, FamI and FamII, that described the conformations of the uncomplexed trisaccharide in solution. Specifically, these conformations were further energy minimized within the binding site as described by Imberty et al. (1994). The resulting energy-minimized structures, labeled CxI and CxII, were used in generating theoretical
tr-NOESY curves using the program PDB2NOE (Ni and Zhu, 1994). Thirteen
residues of the D. biflorus lectin-binding site, based on modeling studies (Imberty et al., 1994), were explicitly included to account for interactions with protein protons in the tr-NOESY simulations: Asp 85, Gly 102, Gly 103, Tyr 104, Leu 127, Ser 128, Asn 129, Ser 130, Trp 132, Gly 213, Leu 214, Ser 215, and Tyr 218. The rotational correlation time for the complex was given a value of 55 ns, as expected
for a protein of 110 kDa. The optimal values of and were deduced by fitting some theoretical intraresidue tr-NOEs to the corresponding experimental data. The methyl groups were assigned an internal correlation time of 25 ps. The experimental tr-NOESY data for interglycosidic contacts and the theoretical curves predicted for the CxI and CxII conformations are shown in Fig.
27. Casset et al. concluded that the CxI conformation gave a good agreement with the experimental data, though CxII could not be discarded entirely. Considering the many approximations involved in calculating the predicted tr-NOEs, in particular the orientation of the ligand within the binding pocket, the agreement between experiment and theory is encouraging, especially for interglycosidic NOEs such as H1GN–H3G and H1F–H3GN. A model proposed by Casset et al. (1996) for the blood group A trisaccharide bound to the D. biflorus lectin is shown in Fig. 28.
5.3. Transferred NOESY Studies on the Forssman Pentasaccharide Complexed to Dolichos biflorus
The Forssman antigen is a commonly occurring heterophile antigen and, together with some related glycolipids, represents the antigenic determinant of the P blood group system (Casset et al., 1997; Marcus et al., 1976). It is present in several forms of human cancer, which include gastric, colon, and lung cancers (Ono et al., 1994; Uemura et al., 1989). To understand the nature of carbohydrate–receptor interactions, Thomas Peters and co-workers have undertaken extensive transferred NOESY characterization of the interaction of the Forssman pentasaccharide with the seed lectin from Dolichos biflorus, a 110-kDa tetramer with hemagglutinating properties, with two carbohydrate-binding sites per tetramer. The structure of the Forssman pentasaccharide (FPS) is shown in Fig. 29. A series of tr-NOESY and tr-ROESY measurements were performed for D. biflorus:FPS ratios of 1:5, 1:10, and 1:15. The evolution of the magnitude of the tr-NOEs and tr-ROEs as a function of the lectin/FPS ratio was found to be different for the nonreducing disaccharide moiety compared to the reducing end trisaccharide of the FPS, thus reflecting distinct relaxation and exchange properties for the disaccharide moiety. This was further confirmed by and filtered tr-NOESY, which identified several intermolecular contacts between the disaccharide and side-chain protons of the aromatic and aliphatic residues within the binding pocket of the lectin. Figure 13 shows typical and filtered tr-NOESY spectra identifying several intermolecular tr-NOESY peaks in the D. biflorus–FPS complex. Previously, a model for the disaccharide complexed within the binding pocket of D. biflorus was proposed by Imberty et al. (1994). The experimentally observed intermolecular tr-NOEs were found to be qualitatively in good agreement with the predictions based on this model. Based on this, Casset et al.
proposed a model for the bound conformation of the FPS in which the nonreducing end disaccharide is buried in the lectin-binding pocket with the
remaining trisaccharide moiety pointing away. Figure 30 shows their proposed model for the D. biflorus–FPS complex.
Rinnbauer et al. (1998) undertook a more quantitative analysis of some of the intraligand tr-NOEs measured at 500 MHz and 315 K, using the CORCEMA
program (Moseley et al., 1995). These experimental conditions corresponded to a null NOE for the free ligand, yielding a correlation time of 0.21 ns. For the complex, a correlation time of 50 ns was chosen. The calculations employed the coordinates for 32 residues that constitute part of the binding pocket of D. biflorus and the coordinates for the pentasaccharide in the proposed model. The off-rate and the dissociation constant were determined first by iteratively optimizing calculated intraglycosidic tr-NOEs with respect to the corresponding experimental values until a minimum R-factor was obtained. These values were used for the remaining calculations. The results of the CORCEMA calculations are shown in Fig. 31 for some intraglycosidic tr-NOEs, and in Fig. 32 for some of the interglycosidic NOEs. In general, the agreement seems to be quite satisfactory. On the other hand, since the nonreducing end disaccharide is also in close contact with some aromatic and aliphatic side chains in the binding pocket of the protein, some transferred NOEs can be expected to be very sensitive to the relative location of these protein protons in relation to the ligand, and the concomitant protein-mediated spin-diffusion and protein leakage effects (Jackson et al., 1995). Not surprisingly, some intraglycosidic and interglycosidic tr-NOEs involving predominantly
the terminal disaccharide also show poor fits (Fig. 33), reflecting the need for a further optimization of the orientation and/or conformation of the pentasaccharide
in the binding pocket. Because of the lack of specific resonance assignments for the protein residues that contributed to the observed intermolecular tr-NOEs, such an optimization is not trivial, although possible at least in principle (Curto et al., 1996).
5.4. Interaction of Sialyl Lewis x Tetrasaccharide with E-selectin
The conformation of sialyl Lewis x tetrasaccharide when bound to E-selectin has been the subject of at least five transferred NOE analyses, attesting to the
importance of this system (Scheffler et al., 1997, 1995; Poppe et al., 1997; Cooke et al., 1994; Hansley et al., 1994). E-selectin is a membrane glycoprotein that belongs to the selectin family (E-, P-, and L-selectins). It is expressed on endothelial cells and plays an important role in inflammation (Lasky, 1992; Bevilacqua et al., 1989). E-selectin specifically binds the sialyl Lewis x antigen which is present
on neutrophilic granulocytes (Scheffler et al., 1995; Lasky, 1992; Bevilacqua et al., 1989). The structure of the tetrasaccharide is shown in Fig. 34. In solution, the free tetrasaccharide exists in equilibrium among several conformers (Rutherford et al., 1994). An understanding of the bound conformation of the antigen is of considerable interest in unraveling the molecular basis of recognition between the antigen and its receptor, E-selectin. Whereas the analyses in the earlier papers were qualitative in nature, Poppe et al. (1997) undertook a more quantitative analysis based on a full relaxation matrix treatment of the ligand. Based on published hydrodynamic data, the E-selectin protein was represented as a prolate ellipsoid, and the relaxation matrix elements were computed using Woessner’s expressions. [Incidentally, Eq. (5), given by Poppe et al. in their paper for computing the relaxation rate matrix elements of the bound ligand in a prolate ellipsoid, refers to the extreme narrowing limit and, hence, is not applicable for the bound state. The correct expressions (with frequency dependence) for relaxation rate matrix calculations applicable for large symmetric-top molecules were given earlier by our laboratory (Krishna et al., 1978)]. Poppe et al. also did not explicitly include protein protons in their calculations since a saturation of the protein envelope in the aromatic and aliphatic regions by a DANTE sequence during the mixing time period
did not appreciably affect the transferred NOE intensities. The bound conformation of the branched trisaccharide portion was found to be close to that of the free ligand. In a more recent unpublished study, Peters and his co-workers at the University of Lübeck undertook careful and detailed tr-NOESY measurements at 600 MHz
(310 K) on sialyl Lewis x with the E-selectin IgG-chimera (220 kDa) in using a ratio of 15:1 for the ligand:binding sites. In a collaboration, our laboratories analyzed the data quantitatively using the CORCEMA program and incorporating explicitly the protein protons within the active site (and excluding all exchangeable hydrogens). These calculations were aided by the availability of crystallographic data on the unliganded E-selectin (Graves et al., 1994) as well as the crystallographic data on sialyl Lewis x complexed to the mannose binding protein (MBP) mutant (Ng and Weis, 1997). Using the crystal structure of MBP in its complex with sialyl Lewis x, we have manually aligned the E-selectin backbone based on homology in the binding pocket. Based on a comparison of the loop conformation for residues 84–88 in E-selectin which are in register with residues 189–193 in MBP, we suggest that this loop, which is extended in the unliganded E-selectin, moves toward the ligand in the complex. For CORCEMA calculations, three receptor models were used: the first (R1) was identical to the unliganded E-selectin, the second (R2) had the 84–88 loop bent to approximate that in the MBP, but with the Arg 84 side-chain manually positioned to be close to the fucose, and the third (R3) had no protein protons. These receptor structures were used without any energy minimization, and only residues within a 7-Å radius from the ligand protons were included in the CORCEMA calculations. For the bound ligand, four structures were used. The first one (L1) was the crystallographic structure of Ng and Weis (1997). In this structure, the electron density for G to N linkage was very weak and indicative of considerable conformational flexibility for this interglycosidic linkage in the complex. A second structure (L2) was obtained by a further manual optimization of the torsion angles of L1, and two more structures (L3, L4) for the tetrasaccharide were generated based
on preliminary analysis of tr-NOESY data without total relaxation matrix treatment and modeling calculations. These structures were generated by docking the tetrasaccharide into the unliganded E-selectin structure using GRID and SYBYL, with the TRIPOS force field for the sugars, and Pullman charge calculations (Thomas Peters, private communication). For CORCEMA refinements, the correlation times (bound and free), the leakage factor for the free ligand, and the order parameter were optimized to get the best fit (lowest NOE R-factor) between experimental and calculated values for the six ligand–receptor models: L1:R1, L1:R3, L2:R2, L2:R3, L3:R1, and L4:R1. This optimization also used a leakage-shell model (Moseley et al., 1997) to account for leakage dipolar relaxation of active site protons with the rest of the protein protons. Each NOE (both calculated and experimental) was normalized with respect to a sum of some reference NOEs (e.g., intrasugar NOEs such as H4F–H6F). This kind of normalization results in the unfortunate artifact that any effect (e.g., protein-mediated spin diffusion) that significantly affects a few of these reference NOEs also indirectly affects the fits for the rest of the tr-NOEs, including those involving remote protons (see Sec. 3.5.3). This fact has to be kept in mind while drawing conclusions based on a comparison of calculated and experimental tr-NOESY
curves. The off-rate for this system is slow enough that it is safer to carry out the full CORCEMA treatment, including finite exchange off-rates, instead of assuming the fast-exchange approximation. In fact, the computed off-rate for this example is nearly four orders of magnitude smaller than the value of used by some investigators while trying to justify the fast-exchange assumption. Despite these moderately slow off-rate conditions, there will still be a significant amount of transferred NOE to the signal associated with the free ligand. In our computations, we combined intensities from all contributions (Choe et al., 1991; Moseley et al., 1995). For CORCEMA optimizations, a data set consisting of 16 tr-NOEs was used. Figure 35 shows some of the interglycosidic tr-NOESY fits using the L2:R2 complex shown in Fig. 36. For the optimizations, and were held fixed, based on independent estimates (Poppe et al., 1997; Thomas Peters, unpublished results, 1996). The L2:R2 structure resulted in the best NOE R-factor. From the 5F-2G tr-NOE fit, it is apparent that additional minor refinements may be necessary. Interestingly, deletion of the protein protons (i.e., L2:R3 complex) resulted in only a slight increase of the R-factor, suggesting that the protein protons play a marginally important role in the ligand tr-NOESY spectra, with only some NOEs showing the effect. The remaining complexes (L1:R1, L1:R3, L3:R1, and L4:R1) gave R-factors suggesting that they (and in particular the ligand conformations) are less compatible with the experimental data. More recent calculations, in which the off-rate, bound correlation time, order parameters, and leakage factors were optimized, will be reported elsewhere (Moseley et al., 1999).
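The workflow described above (normalize each NOE by a sum of reference NOEs, score the agreement with an NOE R-factor, and adjust parameters to minimize it) can be sketched as follows. This is a minimal sketch under stated assumptions: the R-factor is taken here in a root-mean-square form, sqrt(sum (I_exp - I_calc)^2 / sum I_exp^2), which is one common definition and is assumed rather than quoted from this passage; the two-spin build-up function is a stand-in for a full CORCEMA calculation; and the data are synthetic. The minimizer is SciPy's Powell method.

```python
import numpy as np
from scipy.optimize import minimize


def normalize(intensities, reference_idx):
    """Divide every NOE by the sum of the chosen reference NOEs."""
    intensities = np.asarray(intensities, dtype=float)
    return intensities / intensities[list(reference_idx)].sum()


def noe_r_factor(i_exp, i_calc):
    """Assumed RMS-type NOE R-factor between experimental and calculated data."""
    i_exp = np.asarray(i_exp, dtype=float)
    i_calc = np.asarray(i_calc, dtype=float)
    return np.sqrt(np.sum((i_exp - i_calc) ** 2) / np.sum(i_exp ** 2))


def buildup(params, tau):
    """Stand-in model: two-spin cross-peak build-up with uniform leakage."""
    sigma, leak = params
    return 0.5 * np.exp(-leak * tau) * (1.0 - np.exp(-2.0 * sigma * tau))


tau = np.array([0.05, 0.1, 0.2, 0.4, 0.8])   # mixing times (s)
i_exp = buildup((1.5, 0.8), tau)             # synthetic "experimental" curve
start = np.array([0.5, 0.5])                 # initial guess for (sigma, leak)

fit = minimize(lambda p: noe_r_factor(i_exp, buildup(p, tau)),
               start, method="Powell")
print("best-fit parameters:", fit.x, "R-factor:", fit.fun)

# Normalizing by reference NOEs: here the first two entries are the references.
print(normalize([2.0, 4.0, 6.0], reference_idx=(0, 1)))
```

Because every normalized NOE shares the same denominator, a perturbation of one reference NOE rescales all of the others, which is the artifact the text cautions about.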
5.5. Reversible Binding of Corepressor Tryptophan with Repressor–Operator Complex
As a last example, we present the CORCEMA analysis of intermolecular transferred NOESY in a ligand–protein/DNA complex. It is also the first quantitative analysis of intermolecular transferred NOESY. The E. coli Trp-repressor is a
DNA-binding protein important in gene regulation. The apo-repressor is a homodimer of two 107-residue monomers. The holo-repressor has two L-tryptophan corepressor molecules which bind in a noncooperative manner to two specific binding pockets in the dimer interface. Two Trp-repressors can bind in a tandem fashion to operator sequences (2:1 stoichiometry) of 33 base pairs or longer (Kumamoto et al., 1987). The minimal operator is an 18-base-pair consensus sequence which binds the Trp-repressor with a 1:1 stoichiometry (Haran et al., 1992; Bennett and Yanofsky, 1978). Crystal structures of the apo- and holo-repressors and holo-repressor–operator complexes are available, as are the NMR structures of the apo-repressor–operator complex (Zhao et al., 1993; Lawson and Carey, 1993; Arrowsmith et al., 1991; Otwinowski et al., 1988). Lee et al. (1995) have assigned the intermolecular tr-NOESY contacts between the Trp-repressor–operator (Trp-op/rep) complex and the corepressor tryptophan.
Figure 14 shows the typical NOESY spectrum with the peaks identifying the intermolecular contacts between the free and bound forms of tryptophan and the corepressor-bound repressor–operator complex. The concentration of unbound
operator–repressor complex is negligible under the conditions of the experiment. The chemical shifts of the residual unbound op/rep complex also coincide with those of the repressor-bound form. The peaks between the free corepressor and the bound complex arise due to an exchange-mediated NOESY (Lee and Krishna, 1992; Choe et al., 1991).
5.5.1. The Leakage-Shell Model
Moseley et al. (1997) analyzed the inter-tr-NOESY data by CORCEMA, using a two-state model consisting of free and bound conformations for the interacting
species (i.e., the corepressor and the op/rep complex). The crystallographic structure for the complex (Otwinowski et al., 1988) was used to generate the bound state with the corepressor and the residues in and around the binding pocket. The free state consisted of the corepressor and the binding pocket in their uncomplexed state, but with the same conformations as in the complexed state. Three intense intermolecular cross peaks were analyzed: (A) between the free to methyl (free and bound), (B) between bound to methyl (free and bound), and (C) between bound to methyl (free and bound) protons. To account for direct and indirect effects, the binding pocket explicitly included all hydrogens within a radius r from each participating hydrogen (or the attached carbon for methyl protons), as shown by four overlapping spheres in Fig. 37. All hydrogens in the sphere were given a uniform leakage factor of Hydrogens in the outer 1-Å shell of the overlapping spheres were given a uniform leakage of to simulate leakage pathways for the shell hydrogens due to dipolar interactions with the rest of the protons outside the shell. This way, the dimension of the CORCEMA matrix was kept relatively small and manageable while accurately accounting for the specific effects of each individual proton within the binding pocket (i.e., protein-mediated spin-diffusion and leakage effects), as well as the general somewhat nonspecific leakage effects due to the remaining protons outside the shell. The was set at To account for internal motions due to methyl group rotation, the Lipari–Szabo model-free approach was used, with representing the order parameter for the internuclear vector i–j. The internal correlation time was fixed at 0.005 ns. The order parameter for nonmethyl protons was fixed at 0.85. 
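The selection step of the leakage-shell model can be sketched as a simple distance test. The following is a minimal sketch, assuming that a hydrogen belongs to the active-site model when it lies within the radius of at least one sphere center, and that it belongs to the outer shell when it lies in the union of spheres but outside the union of spheres shrunk by the shell width. The function name and the two leakage values are hypothetical placeholders; they are not the values used by Moseley et al. (1997).

```python
import numpy as np


def leakage_shell(h_coords, centers, radius, shell_width=1.0,
                  leak_core=0.2, leak_shell=2.0):
    """Classify hydrogens for a leakage-shell model (illustrative only).

    h_coords : (n, 3) hydrogen coordinates in angstroms
    centers  : (m, 3) sphere centers (participating hydrogens or methyl carbons)
    Returns boolean masks (included, in_shell) and a leakage rate per hydrogen;
    the leakage entries are meaningful only where included is True.
    """
    h_coords = np.asarray(h_coords, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Distance from each hydrogen to its nearest sphere center.
    d_min = np.linalg.norm(
        h_coords[:, None, :] - centers[None, :, :], axis=-1
    ).min(axis=1)
    included = d_min <= radius
    in_shell = included & (d_min > radius - shell_width)
    leak = np.where(in_shell, leak_shell, leak_core)
    return included, in_shell, leak


centers = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]   # two overlapping 6-A spheres
hydrogens = [
    (1.0, 0.0, 0.0),    # deep inside both spheres
    (5.5, 0.0, 0.0),    # shell of sphere 1 but core of sphere 2
    (9.5, 0.0, 0.0),    # outer 1-A shell of the union
    (11.0, 0.0, 0.0),   # outside the model
]
included, in_shell, leak = leakage_shell(hydrogens, centers, radius=6.0)
print(included, in_shell, leak)
```

Only the included hydrogens enter the relaxation matrix, so its dimension grows with the sphere radius while the larger shell leakage stands in for all protons left out.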
The parameters (correlation time for free ligand), (correlation time for the bound ligand and the complex), external methyl S, and leakage for the free ligand were optimized by Powell minimization to get best fits between CORCEMA-calculated NOESY intensities and experimental intensities. An NOE R-factor (Krishna et al., 1978; Xu et al., 1995a, 1995b) was used as the energy term to be minimized by the Powell minimization, where
CORCEMA Analysis of NOESY Spectra of Ligand–Receptor Complexes
The calculations were performed with active site sphere radii set to 5, 6, 7, and 8 Å, with and without a leakage shell. Figure 4 and Table 1 in Moseley et al. (1997) show the results of optimization that give the best fit. Also shown are the effects of
varying the exchange off-rate and the sensitivity of the inter-tr-NOESY to its variations. An off-rate of and a of 13.5 ns were determined from the best-fit optimizations. The off-rate is in excellent agreement with the value of determined by direct measurements (Lee et al., 1995). The bound correlation time of 13.5 ns at 45°C, determined by Moseley et al. from CORCEMA analysis, is in excellent agreement with the value of 14.5 ns at 37°C reported by Shan et al. (1996). This demonstrates the power of the CORCEMA method. Figure 38 demonstrates the sensitivity of the inter-tr-NOESY to changes in the orientation of the corepressor within the binding pocket. A change in of (i.e., from 92°
N. Rama Krishna and Hunter N. B. Moseley
to 112°) gives dramatically poor fits between experiment and theory, and can be rejected outright. In contrast, a change by –20° (i.e., from 92° to 72°) results in an acceptable fit and an acceptable R-factor (changing from 0.142 for the crystallographic orientation to 0.154). Nevertheless, this orientation resulted in somewhat unacceptable values for optimized parameters such as the bound correlation time (= 9.79 ns) and the external methyl order parameter. These CORCEMA calculations confirm that the intermolecular tr-NOESY data are compatible with the crystal structure orientation for the corepressor within the binding pocket. The intermolecular NOESY data were analyzed by Ramesh et al. (1996), using another method. They calculated approximate intermolecular distance constraints from a NOESY spectrum at 50 ms and used these constraints together with distance geometry–simulated annealing refinements to model the orientation of the corepressor within the binding pocket. These calculations also confirmed that the solution structure of the corepressor is generally similar to that in the crystal structure. Because of the difficulty in stereospecifically assigning the two a slightly altered orientation for the aromatic ring was also found to be compatible by distance geometry methods. Because an alteration in the corepressor orientation also alters the spin-diffusion pathways, which may be sensed by the full-mixing-time curves for the different NOEs, the CORCEMA method has the potential to provide a more sensitive probe of the ligand orientation within the binding pocket. Moseley et al. also pointed out that it is relatively easy to get good fits between experimental tr-NOEs and calculated tr-NOEs by optimizing parameters using some models; whether such good fits are meaningful can only be judged by comparing the values of the optimized parameters (e.g., bound correlation time, off-rate, etc.) with their estimates from independent measurements.
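The fitting strategy of Sec. 5.5.1 (adjust model parameters until an NOE R-factor between calculated and experimental intensities is minimized) can be illustrated with a minimal sketch. Several assumptions are made here: the R-factor is taken in a commonly used normalized root-sum-square form, which may differ from the exact expression used by Moseley et al.; the buildup function is a hypothetical two-parameter toy curve, not the CORCEMA matrix calculation; and a coarse grid scan stands in for Powell minimization. All function and variable names are illustrative.

```python
import math

def noe_r_factor(experimental, calculated):
    """Assumed R-factor form: sqrt(sum((I_exp - I_calc)^2) / sum(I_exp^2))."""
    num = sum((e - c) ** 2 for e, c in zip(experimental, calculated))
    den = sum(e ** 2 for e in experimental)
    return math.sqrt(num / den)

def toy_buildup(mixing_times, rate, decay):
    """Hypothetical cross-peak buildup curve (NOT the CORCEMA calculation)."""
    return [math.exp(-decay * t) * (1.0 - math.exp(-rate * t))
            for t in mixing_times]

mixing_times = [0.05, 0.1, 0.2, 0.4, 0.8]  # seconds, illustrative values
# Synthetic "experimental" intensities generated from known toy parameters.
observed = toy_buildup(mixing_times, rate=4.0, decay=1.0)

# Coarse grid scan over both toy parameters (a stand-in for Powell
# minimization), keeping the pair that gives the lowest R-factor.
candidates = [(r / 2.0, d / 2.0) for r in range(1, 21) for d in range(0, 9)]
best_rate, best_decay = min(
    candidates,
    key=lambda p: noe_r_factor(observed, toy_buildup(mixing_times, p[0], p[1])),
)
best_r = noe_r_factor(observed, toy_buildup(mixing_times, best_rate, best_decay))
```

As the text cautions, a low R-factor by itself does not validate a model; the optimized parameters should also be compared with independent estimates.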
6. FINAL COMMENTS
In this chapter, we have summarized the CORCEMA methodology along with some experimental examples where this and other similar methods have been successfully employed. The CORCEMA algorithm should prove to be useful in the NOESY analysis of interacting molecules under a wide range of conditions, from very weak to very tight binding. Quantitative analysis of transferred NOESY is one major application. In addition to a discussion on the quantitative determination of bound-ligand structures, we have also considered the possibility of exploiting tr-NOESY in structure-based design by an explicit incorporation of active site residues and the associated effects (i.e., protein-mediated spin diffusion, leakage effects, and intermolecular tr-NOEs) in CORCEMA analysis. Recent reports such as tr-NOESY-based screening of compound libraries for biological activity (Meyer et al., 1997) and the SAR-by-NMR method for designing higher-affinity ligands (Shuker et al., 1996), underscore the increasingly important role of high-field NMR
spectroscopy in serious structure-based drug design efforts. In this context, CORCEMA and other similar algorithms for analyzing tr-NOESY data can play a major role in the arsenal of tools available to investigators involved in such efforts.
Together with computational advances in structure refinement protocols and with experimental advances, CORCEMA and similar algorithms make the transferred NOESY technique a powerful tool for structure-based drug design directly in the solution phase.
ACKNOWLEDGMENTS. This work was supported in part by NSF grant MCB9630775, NCI Grant CA-13148, and the Arthritis Foundation. The authors wish to thank Drs. Jacob Anglister, Cheryl Arrowsmith, Ad Bax, Anne Imberty, Robert London, Thomas Peters, and Tali Scherf for supplying the originals of some figures used in this chapter, and Drs. Robert London, Feng Ni, and Thomas Peters for sending preprints of manuscripts prior to publication. Figures 35 and 36 showing CORCEMA calculations on system are from a joint collaboration with the laboratory of Dr. Thomas Peters at the University of Lübeck, Germany. The authors also thank Dr. Peters for his comments on this article. Stimulating discussions with Drs. Ernie Curto and Patricia Jackson during the early stages of this work are also acknowledged.
REFERENCES
Adams, E. R., Dratz, E. A., Gizachew, D., Deleo, F. R., Yu, L., Volpp, B. D., Vlases, M., Jesaitis, A. J., and Quinn, M. T., 1997, Biochem. J. 325:249. Albrand, J. P., Birdsall, B., Feeney, J., Roberts, G. C. K., and Burgen, A. S. V., 1979, Int. J. Biol. Macromol. 1:37. Alexandrescu, A. T., Hinck, A. P., and Markley, J. L., 1990, Biochemistry 29:4516. Andersen, N. H., Eaton, H. L., and Nguyen, K. T., 1987, Magn. Reson. Chem. 25:1025. Anfinsen, C., 1973, Science 181:223. Anglister, J., Scherf, T., Zilber, B., and Levy, R., 1995, Biopolymers 37:383. Anglister, J., Scherf, T., Zilber, B., Levy, R., Zvi, A., Hiller, R., and Feigelson, D., 1993, FASEB J. 7:1154. Anglister, J., and Zilber, B., 1990, Biochemistry 29:921. Arepalli, S. R., Glaudemans, C. P. J., Daves, Jr., G. D., Kovac, P., and Bax, A., 1995, J. Magn. Reson. B106:195. Arrowsmith, C. H., Pachter, R., Altman, R., and Jardetzky, O., 1991, Eur. J. Biochem. 202:53–66. Asensio, J. L., Cañada, F. J., Bruix, M., Rodriguez-Romero, A., and Jimenez-Barbero, J., 1995a, Eur. J. Biochem. 230:621.
Asensio, J. L., Cañada, F. J., and Jimenez-Barbero, J., 1995b, Eur. J. Biochem. 233:618. Baleja, J. D., Mau, T., and Wagner, G., 1994, Biochemistry 33:3071. Baleja, J. D., Pon, R. T., and Sykes, B. D., 1990, Biochemistry 29:4828–4839. Bax, A., and Davis, D. G., 1985, J. Magn. Reson. 63:207. Behling, R. W., Yamane, T., Navon, G., and Jelinski, L. W., 1988, Proc. Natl. Acad. Sci. U.S.A. 85:6721. Bennett, W. S., Jr., and Steitz, T. A., 1978, Proc. Natl. Acad. Sci. U.S.A. 75:4848. Bennett, G. N., and Yanofsky, C., 1978, J. Mol. Biol. 121:179–192.
Bernasconi, C. F., Ed., 1986, Investigation of Rates and Mechanisms of Reactions, Techniques of Chemistry, Vol. VI, Part I, Wiley-Interscience, Chaps. IV and VI. Bersch, B., Koehl, P., Nakatani, Y., Ourisson, G., and Milon, A., 1993, J. Biomol. NMR 3:443. Bevilacqua, V. L., Kim, Y., and Prestegard, J. H., 1992, Biochemistry 31:9339. Bevilacqua, M. P., Stengelin, S., Gimbrone, M. A., Jr., and Seed, B., 1989, Science 243:1160. Blommers, M. J. J., Fendrich, G., García-Echeverría, C., and Chêne, P., 1997, J. Am. Chem. Soc. 119:3425. Boelens, R., Konig, T. M. G., and Kaptein, R., 1988, J. Mol. Struct. 173:299. Bonvin, A. M. J., Boelens, R., and Kaptein, R., 1994, Biopolymers 34:39. Borgias, B. A., and James, T. L., 1988, J. Magn. Reson. 79:493. Borgias, B. A., and James, T. L., 1989, Meth. Enzymol. 176:169. Borgias, B. A., and James, T. L., 1990, J. Magn. Reson. 87:475. Bothner-By, A. A., Stephens, R. L., Lee, J., Warren, C. D., and Jeanloz, R. W., 1984, J. Am. Chem. Soc. 106:811. Boyd, J., Moore, G. R., and Williams, G., 1984, J. Magn. Reson. 58:511. Braun, W., 1987, Q. Rev. Biophys. 19:115–157. Brunger, A. T., 1992, X-PLOR, Version 3.1, Yale University Press, New Haven. Bundle, D. R., Baumann, H., Brisson, J. R., Gagné, S. M., Zdanov, A., and Cygler, M., 1994, Biochemistry 33:5183. Campbell, I. D., Dobson, C. M., Moore, G. R., Perkins, S. J., and Williams, R. J. P., 1976, FEBS Lett. 70:96–100. Campbell, A. P., and Sykes, B. D., 1991, J. Mol. Biol. 222:405. Campbell, A. P., Van Eyk, J. E., Hodges, R. S., and Sykes, B. D., 1992, Biochim. Biophys. Acta 1160:35. Casset, F., Imberty, A., Perez, S., Etzler, M. E., Paulsen, H., and Peters, T., 1997, Eur. J. Biochem. 244:242. Casset, F., Peters, T., Etzler, M., Korchangina, E., Nifant’ev, N., Pérez, S., and Imberty, A., 1996, Eur. J. Biochem. 239:710. Chen, Y., Reizer, J., Saier, M., Jr., Fairbrother, W. J., and Wright, P. E., 1993, Biochemistry 32:32. Choe, B. Y., Cook, G. W., and Krishna, N. R., 1991, J. Magn. Reson. 94:387.
Clore, G. M., Bax, A., Wingfield, P., and Gronenborn, A. M., 1990a, Biochemistry 29:5671.
Clore, G. M., and Gronenborn, A. M., 1982, J. Magn. Reson. 48:402. Clore, G. M., and Gronenborn, A. M., 1983, J. Magn. Reson. 53:423. Clore, G. M., and Gronenborn, A. M., 1991, Prog. NMR Spectrosc. 23:43. Clore, G. M., Nilges, M., Sukumaran, D. K., Brunger, A. T., Karplus, M., and Gronenborn, A. M., 1986, EMBO J. 5:2729. Clore, G. M., Szabo, A., Bax, A., Kay, L. E., Driscoll, P. C., and Gronenborn, A. M., 1990b, J. Am. Chem. Soc. 112:4989. Cooke, R. M., Hale, R. S., Lister, S. G., Shah, G., and Malcolm, P., 1994, Biochemistry 33:10591. Crenshaw, J. M., Graves, D. E., and Denny, W. A., 1995, Biochemistry 34:13682–13687. Curto, E. V., Moseley, H. N. B., and Krishna, N. R., 1996, J. Comp-Aided Mol. Design 10:361–371. Czaplicki, J., Arrowsmith, C., and Jardetzky, O., 1991, J. Biomol. NMR 1:349–361. Dekker, N., Cox, M., Boelens, R., Verrijzer, C. P., van der Vliet, P. C., and Kaptein, R., 1993, Nature 362:852. Dellwo, M. J., Schneider, D. M., and Wand, A. J., 1994, J. Magn. Reson. B103:1.
Dobson, C. M., and Evans, P. A., 1984, Biochemistry 23:4267. Dratz, E. A., Gizachew, D., Busse, S. C., Rens-Domiano, S., and Hamm, H. E., 1996, Biophys. J. 70:A16. Driscoll, P. C., Gronenborn, A. M., Wingfield, P. T., and Clore, G. M., 1990, Biochemistry 29:4668. Ealick, S. E., Babu, Y., Bugg, C. E., Erion, M., Guida, W., Montgomery, J. A., and Secrist, J. A. III, 1991, Proc. Natl. Acad. Sci. U.S.A. 88:11540.
Ernst, R. R., Bodenhausen, G., and Wokaun, A., 1987, Principles of Nuclear Magnetic Resonance in One and Two Dimensions, Clarendon Press, Oxford. Feigon, J., Wang, A. H. J., van der Marel, G. A., van Boom, J. H., and Rich, A., 1984, Nucleic Acid Res. 12:1243. Fejzo, J., Westler, W., Macura, S., and Markley, J. L., 1991, J. Magn. Reson. 92:195. Fejzo, J., Westler, W., Markley, J. L., and Macura, S., 1992, J. Am. Chem. Soc. 114:1523. Fesik, S. W., 1993, J. Biomol. NMR 3:261–269. Fischer, E., 1894, Ber. Deutsch Chem. Ges. 27:2985. Fischer, A., Laub, P. B., and Cooperman, B. S., 1995, Nature Struct. Biol. 2:951. Folkers, P. J. M., Folmer, R. H. A., Konings, R. N. H., and Hilbers, C. W., 1993, J. Am. Chem. Soc. 115:3798. Gemmecker, G., Olejniczak, E. T., and Fesik, S. W., 1992, J. Magn. Reson. 96:199. Glasel, J. A., 1989, J. Mol. Biol. 209:747. Glaudemans, C. P. J., Lerner, L. E., Daves, D. G., Jr., Kovac, P., Venable, R., and Bax, A., 1990, Biochemistry 29:906. Gordon, S. L., and Wüthrich, K., 1978, J. Am. Chem. Soc. 100:7094. Gorenstein, D. G., Meadows, R. P., Metz, J. T., Nikonowicz, E. P., and Post, C. P., 1990, in Advances in
Biophysical Chemistry (C. A. Bish, ed.), JAI Press, London, pp. 47–124. Gounarides, J. S., Broido, M. S., Becker, J. M., and Naider, F. R., 1993, Biochemistry 32:908. Graves, B. J., Crowther, R. L., Chandran, Ch., Rumberger, J. M., Li, S., Huang, K. S., Presky, D. H., Familletti, P. C., Wolitzky, B. A., and Burns, D. K., 1994, Nature 367:532. Guntert, P., Braun, W., and Wüthrich, K., 1991, J. Mol. Biol. 217:517. Gupta, R. K., Koenig, S. H., and Redfield, A. G., 1972, J. Magn. Reson. 7:66. Hansley, P., McDevitt, P. J., Brooks, I., Trill, J. J., Feild, J. A., McNulty, D. E., Connor, J. R., Griswold, D. E., Kumar, N. V., Kopple, K. D., Carr, S. A., Dalton, B. J., and Johanson, K., 1994, J. Biol. Chem. 269:23949.
Haran, T. E., Joachimiak, A., and Sigler, P. B., 1992, EMBO J. 11:3021–3030. Hinck, A. P., Walkenhorst, W. F., Truckses, D. M., and Markley, J. L., 1997, in Biological NMR
Spectroscopy (J. L. Markley and S. J. Opella, eds.), Oxford University Press, New York, pp. 113–138. Holland, D. R., Tronrud, D. E., Pley, H. W., Flaherty, K. M., Stark, W., Jansonius, J. N., McKay, D. B., and Matthews, B. W., 1992, Biochemistry 31:11310. Hoogstraten, C. G., Westler, W. M., Macura, S., and Markley, J. L., 1995, J. Am. Chem. Soc. 117:5610. Hrabal, R., Komives, E. A., and Ni, F., 1996, Protein Sci. 5:195. Ikura, M., and Bax, A., 1992, J. Am. Chem. Soc. 114:2433. Ikura, M., Clore, G. M., Gronenborn, A. M., Zhu, G., Klee, C. B., and Bax, A., 1992, Science 256:632. Imberty, A., Casset, F., Gegg, C. V., Etzler, M. E., and Pérez, S., 1994, Glycoconj. J. 11:400. Jackson, P. L., Moseley, H. N. B., and Krishna, N. R., 1995, J. Magn. Reson. 107B:289. James, T. L., 1976, Biochemistry 15:4724. James, T. L., and Oppenheimer, N. J., Eds., 1994, Nuclear Magnetic Resonance, Methods in Enzymology, Vol. 239, Sec. IV. Janakiraman, M. N., White, C. L., Laver, W. G., Air, G. M., and Luo, M., 1994, Biochemistry 33:8172. Jarori, G. K., Murali, N., and Rao, B. D. N., 1994, Biochemistry 33:6784. Keepers, J. W., and James, T. L., 1984, J. Magn. Reson. 57:404. Koshland, D. E., Jr., 1958, Proc. Natl. Acad. Sci. U.S.A. 44:98. Krishna, N. R., Agresti, D. G., Glickson, J. D., and Walter, R., 1978, Biophys. J. 24:791. Krishna, N. R., Goldstein, G., and Glickson, J. D., 1980, Biopolymers 19:2003. Krishna, N. R., and Lee, W., 1992, Biophys. J. 61:A33. Kumamoto, A. A., Miller, W. G., and Gunsalus, R. P., 1987, Genes Dev. 1:556–564.
Kuntz, I. D., Thomason, J. F., and Oshiro, C. M., 1989, Meth. Enzymol. 177:159.
Lasky, L. A., 1992, Science 258:964. Lawson, C. L., and Carey, J., 1993, Nature 366:178–182. Lee, W., and Krishna, N. R., 1992, J. Magn. Reson. 98:36.
Lee, W., Revington, M. J., Arrowsmith, C. H., and Kay, L. E., 1994, FEBS Lett. 350:87–90. Lee, W., Revington, M., Farrow, N. A., Nakamura, A., Utsunomiya-Tate, N., Miyake, Y., Kainosho, M., and Arrowsmith, C. H., 1995, J. Biomol. NMR 5:367–475. LeMaster, D. M., 1989, Meth. Enzymol. 177:23. Lian, L. Y., Barsukov, I. L., Sutcliffe, M. J., Sze, K. H., and Roberts, G. C. K., 1994, Meth. Enzymol. 239:657–700. Lipari, G., and Szabo, A., 1982a, J. Am. Chem. Soc. 104:4559–4570.
Lipari, G., and Szabo, A., 1982b, J. Am. Chem. Soc. 104:4546–4559. Lippens, G. M., Cerf, C., and Hallenga, K., 1992, J. Magn. Reson. 99:268. Liu, H., Kumar, A., Weisz, K., Schmitz, U., Bishop, K. D., and James, T. L., 1993, J. Am. Chem. Soc. 115:1590. London, R. E., Perlman, M. E., and Davis, D. G., 1992, J. Magn. Reson. 97:79. Lord, S. T., Byrd, P. A., Hede, K. L., Wei, C., and Colby, T. J., 1990, J. Biol. Chem. 265:838. Lumb, K. J., Cheetham, J. C., and Dobson, C. M., 1994, J. Mol. Biol. 235:1072. Macura, S., Fejzo, J., Hoogstraten, C. G., Westler, W. M., and Markley, J. L., 1992, Isr. J. Chem. 32:245. Macura, S., Westler, W., and Markley, J. L., 1994, Meth. Enzymol. 239:106. Marcus, D. M., Naiki, M. A., and Kundu, S. K., 1976, Proc. Natl. Acad. Sci. U.S.A. 73:3263. Massefski, W., Jr., and Redfield, A. G., 1988, J. Magn. Reson. 78:150.
Meadows, R. P., Nikonowicz, E. P., Jones, C. R., Bastian, J. W., and Gorenstein, D. G., 1991, Biochemistry 30:1241. Mertz, J. E., Guntert, P., Wüthrich, K., and Braun, W., 1991, J. Biomol. NMR 1:257. Meyer, B., Weimar, T., and Peters, T., 1997, Eur. J. Biochem. 246:705. Moseley, H. N. B., Curto, E. V., and Krishna, N. R., 1994, 35th Experimental NMR Conference, WP115, Asilomar, CA. Moseley, H. N. B., Curto, E. V., and Krishna, N. R., 1995, J. Magn. Reson. 108B:243–261. Moseley, H. N. B., Lee, W., Arrowsmith, C. H., and Krishna, N. R., 1997, Biochemistry 36:5293. Moseley, H. N. B., Scheffler, K., Perez, S., Imberty, A., Krishna, N. R., and Peters, T., 1998, to be submitted, (1999).
Murali, N., Lin, Y., Mechulam, Y., Plateau, P., and Rao, B. D., 1997, Biophys. J. 70:2275. Ng, K. K.-S., and Weis, W. I., 1997, Biochemistry 36:979. Ni, F., 1992, J. Magn. Reson. 96:651. Ni, F., 1994, Prog. NMR Spectrosc. 26:517. Ni, F., Konishi, Y., Bullock, L. D., Rivetna, M. N., and Scheraga, H. A., 1989, Biochemistry 28:3106. Ni, F., Konishi, Y., and Scheraga, H. A., 1990, Biochemistry 29:4479. Ni, F., Ripoll, D. R., Martin, P. D., and Edwards, B. F. P., 1992, Biochemistry 31:11551. Ni, F., and Zhu, Y., 1994, J. Magn. Reson. 103B:180–184. Ni, F., Zhu, Y., and Scheraga, H. A., 1995, J. Mol. Biol. 252:656. Nicholson, L. K., Yamazaki, T., Torchia, D. A., Grzesiek, S., Bax, A., Stahl, S. J., Kaufman, J. D., Wingfield, P. T., Lam, P. Y. S., Jadhav, P. K., Hodge, C. N., Domaille, P. J., and Chang, 1995, Nature Struct. Biol. 2:274.
Nilges, M., Clore, G. M., and Gronenborn, A. M., 1988, FEBS Lett. 239:129. Ning, Q., Ripoll, R., Szewczuk, Z., Konishi, Y, and Ni, F., 1994, Biopolymers 34:1125. Nirmala, N. R., Lippens, G. M., and Hallenga, K., 1992, J. Magn. Reson. 100:25. Okada, A., Wakamatsu, K., Miyazawa, T., and Higashijima, T., 1994, Biochemistry 33:9438. Olejniczak, E. T., Gampe, T., Jr., and Fesik, S. W., 1986, J. Magn. Reson. 67:28. Ono, K., Hattori, H., Uemura, K., Nakayama, J., Ota, H., and Katsuyama, T., 1994, J. Histochem. Cytochem. 42:659.
Otting, G., Liepinsh, E., and Wüthrich, K., 1993, Biochemistry 32:584–595. Otting, G., and Wüthrich, K., 1989, J. Am. Chem. Soc. 111:1871. Otwinowski, Z., Schevitz, R. W., Zhang, R. G., Lawson, C. L., Joachimiak, A., Marmorstein, R. Q., Luisi, B. F., and Sigler, P. B., 1988, Nature 335:321–329. Pavlopoulos, S., Rose, M., Wickham, G., and Craik, D. J., 1995, Anticancer Drug Design 10:623. Perlman, M., Davis, D. G., Koszalka, G. W., Tuttle, J. V., and London, R. E., 1994, Biochemistry 33:7547. Pervushin, K., Riek, R., Wider, G., and Wüthrich, K., 1997, Proc. Natl. Acad. Sci. U.S.A. 94:12366–12371. Poppe, L., Brown, G. S., Philo, J. S., Nikrad, P. V., and Shah, B. H., 1997, J. Am. Chem. Soc. 119:1727. Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P., 1992, Numerical Recipes in C, 2nd ed., Cambridge University Press. Quiocho, F. A., 1991, Curr. Opinion Struct. Biol. 1:922. Radmacher, M., Fritz, M., Hansma, H. G., and Hansma, P. K., 1994, Science 265:1577. Ramesh, V., Syed, S. E. H., Frederick, R. O., Sutcliffe, M. J., Barnes, M., and Roberts, G. C. K., 1996, Eur. J. Biochem. 235:804–813. Rinnbauer, M., Mikros, E., and Peters, T., 1998, J. Carbohydrate Chem. 17:217–230.
Rutherford, T. J., Spackmann, D. G., Simpson, P. J., Homans, S. W., 1994, Glycobiology 4:59. Scheek, R. M., van Gunsteren, W. F, and Kaptein, R., 1989, Meth. Enzymol. 177:204. Scheffler, K., Ernst, B., Katopodis, A., Magnani, J. L., Wong, W. T., Weisemann, R., and Peters, T., 1995, Angew. Chem. Int. Ed. Engl. 34:1841. Scheffler, K., Brisson, J.-R., Weisemann, R., Magnani, J. L., Wong, W. T., Ernst, B. V, and Peters, T., 1997, J. Biomol. NMR 9:423. Scheraga, H. A., 1983, Ann. NY Acad. Sci. 408:330. Scheraga, H. A., 1986, Ann . NY Acad. Sci. 485:124. Scherf, T., and Anglister, J., 1993, Biophys. J. 64:754. Scherf, T., Hiller, R., Naider, F., Levitt, M., and Anglister, J., 1992, Biochemistry 31:6884. Schneider, M. L., and Post, C. B., 1995, Biochemistry 34:16574. Schulz, G. E., Muller, C. W., and Diederichs, K., 1990, J. Mol. Biol. 213:627. Shan, X., Gardner, K. H., Muhandiram, D. R., Rao, N. S., Arrowsmith, C. H., and Kay, L. E., 1996, J. Am. Chem. Soc. 118:6570–6579. Sharff, A. J., Rodseth, L. E., Spurlino, J. C., and Quiocho, F. A., 1992, Biochemistry 31:10657. Shibata, C. G., Gregory, J. D., Gerhardt, B. S., and Serpersu, E. H., 1995, Archiv. Biochem. Biophys. 319:204.
Shuker, S. B., Hajduk, P. J., Meadows, R. P., and Fesik, S. W., 1996, Science 274:1531. Sugar, I. P., and Xu, Y., 1992, Prog. Biophys. Mol. Biol. 58:61. Uemura, K., Hattori, H., Ono, K., Ogata, H., and Taketomi, T., 1989, Jpn. J. Exp. Med. 59:239. Vincent, S. J. F., Zwahlen, C., and Bodenhausen, G., 1996a, in NMR as a Structural Tool for Macromolecules: Current Status and Future Directions (B. D. N. Rao and M. D. Kemple, eds.),
Plenum Press, New York, pp. 145–166. Vincent, S. J. F., Zwahlen, C., Bolton, P. H., Logan, T. M., and Bodenhausen, G., 1996b, J. Am. Chem. Soc. 118:3531.
Wadkine, R. M., and Graves, D. E., 1991, Biochemistry 30:4278–4283. Wagner, G., and Wüthrich, K., 1979, J. Magn. Reson. 33:675. Weimar, T., Harris, S. L., Pitner, J. B., Bock, K., and Pinto, M., 1995, Biochemistry 34:13672. Wider, G., Weber, C., Traber, R., Widmer, H., and Wüthrich, K., 1990, J. Am. Chem. Soc. 112:9015. Wüthrich, K., and Wagner, G., 1975, FEBS Lett. 50:265–268. Xu, Y., and Krishna, N. R., 1995, J. Magn. Reson. B108: 192. Xu, Y., Krishna, N. R., and Sugar, I. P., 1995a, J. Magn. Reson. B107:201.
Xu, Y., Sugar, I. P., and Krishna, N. R., 1995b, J. Biomol. NMR 5:37. Yi, Q., Erman, J. E., and Satterlee, J. D., 1994, J. Am. Chem. Soc. 116:1981.
Yip, P. F., and Case, D. A., 1989, J. Magn. Reson. 83:643. Zhao, D., Arrowsmith, C. H., Jia, X., and Jardetzky, O., 1993, J. Mol. Biol. 229:735–746. Zheng, J., and Post, C. B., 1993, J. Magn. Reson. 101B:262–270.
Zhu, L., and Reid, B. R., 1995, J. Magn. Reson. B106:227. Zwahlen, C., Vincent, S. J. F., Di Bari, L., Levitt, M. H., and Bodenhausen, G., 1994, J. Am. Chem. Soc. 116:362.
II
Structure and Dynamics
8
Protein Structure and Dynamics from Field-Induced Residual Dipolar Couplings
James H. Prestegard, Joel R. Tolman, Hashim M. Al-Hashimi, and Michael Andrec
1. INTRODUCTION
The quest for information about the structure and dynamics of proteins in solution has traditionally been approached using spin-relaxation-based phenomena: NOEs
for distance-constraint-based structure, and heteronuclear relaxation for backbone and side-chain dynamics. While new techniques that capitalize on these phenomena continue to be introduced, both phenomena are limited in fundamental ways. NOEs are limited by their inherent short-range distance sensitivity. This becomes a problem when successive short-range constraints must dictate spatial relationships of remote parts of macromolecules, as occurs in extended DNA and RNA helices. It also puts long-range constraints, such as those between side-chain protons of different parts of a protein, at a premium and presses the limits of methods which
James H. Prestegard, Joel R. Tolman, Hashim M. Al-Hashimi, and Michael Andrec • Complex Carbohydrate Research Center, University of Georgia, Athens, Georgia 30602. Biological Magnetic Resonance, Volume 17: Structure Computation and Dynamics in Protein NMR, edited by Krishna and Berliner. Kluwer Academic / Plenum Publishers, New York, 1999.
can extend backbone-directed assignment strategies to the very tip of residue side chains. Most heteronuclear spin relaxation phenomena are limited by truncation of their sensitivity to motional time scales on the order of or faster than the overall tumbling rate of a molecule (time scale of several nanoseconds) (Wagner, 1993). This becomes a problem when motions of fundamental importance to enzyme mechanisms occur on the microsecond to millisecond time scale.
Here, we discuss some new methods that can complement traditional methods in cases where they are limited in these respects. These new methods are based on the phenomenon of field-induced residual dipolar couplings. The existence of these couplings was realized and demonstrated many years ago, but their potential as an abundant source of information on macromolecular structure and dynamics has been fully realized only with the advent of very high field magnets and heteronuclear methods that allow very precise measurement of dipolar couplings in isotopically labeled biomolecules.
While dipolar interactions between pairs of spin-1/2 nuclei certainly exist, and are in fact the basis of the relaxation-based phenomena referred to above, they are seldom directly observed in solution. This may seem surprising when one realizes that the spin operator description of the interaction shares features with that for through-bond spin–spin coupling. The interaction should contribute to multiplet structure or change the splitting in multiplet structure just as scalar (J) coupling does. As discussed more fully in Sec. 2, the reason for the failure to observe these contributions in solution is that the geometric factor which scales the dipolar interaction, (3cos²θ − 1)/2, averages to zero when the angle between the magnetic field and the interaction vector, θ, is sampled isotropically during molecular tumbling.
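The vanishing of the dipolar geometric factor under isotropic tumbling can be checked numerically. The sketch below assumes the factor has the standard second-Legendre form, (3cos²θ − 1)/2, and averages it over a uniform distribution of directions (equivalent to cos θ uniform on [−1, 1]). For contrast, it also averages over orientations confined to a cone about the field. The cone is only a crude, hypothetical stand-in for anisotropic sampling and is far more restrictive than the weak alignment discussed in this chapter; the function names are illustrative.

```python
import math

def p2(cos_theta):
    """Geometric factor (3 cos^2(theta) - 1) / 2."""
    return 0.5 * (3.0 * cos_theta ** 2 - 1.0)

def isotropic_average(n=20000):
    """Average of p2 over all directions (midpoint rule in cos(theta))."""
    step = 2.0 / n
    total = 0.0
    for k in range(n):
        x = -1.0 + (k + 0.5) * step  # cos(theta), uniform on [-1, 1]
        total += p2(x)
    return total / n

def restricted_average(max_angle_deg, n=20000):
    """Average of p2 with theta confined to a cone of half-angle max_angle_deg."""
    x_min = math.cos(math.radians(max_angle_deg))
    step = (1.0 - x_min) / n
    total = sum(p2(x_min + (k + 0.5) * step) for k in range(n))
    return total / n

iso = isotropic_average()          # numerically indistinguishable from zero
cone = restricted_average(30.0)    # clearly nonzero when sampling is restricted
```

Any departure from uniform sampling leaves a nonzero residue, which is the origin of the residual dipolar coupling.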
The internuclear distance, r, can be recovered through spin relaxation measurements, but, for truly isotropic sampling, direct measurement of neither the distance nor the angular part of the dipolar interaction is possible. Recovery of the angular information in the dipolar interaction could be very important. If we could learn how the average orientations of two vectors, say two bond vectors in remote parts of a biomolecule, differed, this could be a powerful constraint on structure, even in cases where the two vectors are far apart. If we could learn how the averaging process differed for the two vectors, this could provide new information on internal dynamics, even when the dynamics involved time scales of motions only slightly shorter than the reciprocal of the residual dipolar interaction itself (100 ms). The value of angular information has long been recognized in the fields of solid-state NMR and liquid crystal NMR. But, with a few exceptions (Bastiaan and MacLean, 1990; Bastiaan et al., 1987), the value of measuring residual dipolar interactions in solution has not been appreciated. The reason is simple: the residual interactions in solution have been small and difficult to measure.
Why is our ability to measure dipolar contributions to multiplet splittings changing? One reason is that field strengths available to high-resolution spectroscopists are increasing. For weakly aligned systems the residual dipolar contribution to multiplet splittings increases as the square of the field. This means that, in going from spectrometers operating a few years ago at 600 MHz for protons to ones operating today at 800 MHz for protons, the magnitude of the dipolar contribution
has increased by a factor of 1.78. At 1 GHz the factor will be 2.78. With today’s fields, the interactions for isolated molecules in solution are still moderately small. So, in simple solution systems amenable to measurement today, molecules must have an inherently large anisotropic magnetic susceptibility. In practical terms, this means the main subjects of this chapter will be certain paramagnetic proteins and certain highly anisotropic diamagnetic systems, such as proteins bound to a DNA helix. Very recently, an ability to amplify the orientational tendencies of individual molecules by dissolving them in a dilute liquid crystal which orients more strongly has been demonstrated (Tjandra and Bax, 1997b). This is a very promising technology that may allow extension of the measurements to be discussed here to an even broader range of systems.
In any emerging methodology, particularly one that pushes the precision of measurements, development of new experiments becomes of primary importance. We are fortunate that residual dipolar couplings appear as contributions to multiplet splittings, because many of the new methods being introduced to measure scalar couplings for the purpose of obtaining torsion-angle constraints can be used as models for the experiments needed here. The entire range of experiments for scalar coupling measurement is reviewed elsewhere in this volume. We will focus here on methods designed for measurement of one-bond couplings, because the bond distance in the one-bond dipolar interaction can be assumed known, and we can focus on the angular information in the residual dipolar contribution. We will also discuss the limits of precision in measurement and some of the systematic errors that can be introduced. This is important because, at current field strengths, in noncooperative systems, we will push the limits of measurement to their extremes.
Interpretation is also an issue. The data are new and protocols for incorporating the data in structure determinations are just emerging. Preliminary applications to structure refinement and global structure determination will be discussed. We have mentioned the issue of motion. Motional and structural effects on NMR parameters are frequently intertwined, and this is no less the case for residual dipolar couplings. An approach that can at least screen for the presence of large-scale slow motions in proteins of known structure will be described. The predictions from the limited data available are as yet controversial. However, the promise that measurement of residual dipolar couplings will eventually yield unique contributions to the definition of both structure and motion in macromolecules is great. We hope this chapter will lay a useful basis for the fulfillment of this promise.
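The field-squared scaling quoted earlier in this section (factors of 1.78 and 2.78 relative to 600 MHz) can be checked with a few lines of arithmetic. This is only an illustrative sketch; the function name is hypothetical, and the scaling assumes the weakly aligned, field-induced regime described in the text.

```python
def dipolar_scaling(new_mhz, ref_mhz=600.0):
    """Relative size of the field-induced residual dipolar contribution,
    assuming it grows as the square of the field (proton frequency)."""
    return (new_mhz / ref_mhz) ** 2

for mhz in (800.0, 1000.0):
    print(f"{mhz:.0f} MHz vs 600 MHz: x{dipolar_scaling(mhz):.2f}")
# prints:
# 800 MHz vs 600 MHz: x1.78
# 1000 MHz vs 600 MHz: x2.78
```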
2. THEORY
2.1. Anisotropic Spin Interactions in Solution-State NMR
The various interactions that contribute to the Hamiltonian describing nuclear spin energies in the presence of a magnetic field differ in their dependencies on molecular orientation with respect to the field. The Zeeman interaction does not depend on the orientation of the molecule, since the axis of quantization of spin angular momentum is solely determined by the direction of the effective magnetic field in the laboratory frame. On the other hand, all other interactions, including nuclear shielding, indirect spin–spin coupling, direct dipole–dipole interactions, and electric quadrupole coupling, depend on the orientation. These orientation dependencies can be described in terms of second-rank
tensors written in the laboratory frame. In liquids and gases, thermal motion formally renders these lab frame tensors time dependent. However, for observation of the interactions of interest, characterized by frequencies from and greater, an average Hamiltonian will suffice, and time-dependent tensors can be replaced by effective averages. If the motion to be included in the averaging is perfectly isotropic in its sampling of orientational space, the averaging leaves only the trace of the interaction Hamiltonian. The dipolar and quadrupolar anisotropic interactions have traceless tensors and will therefore vanish from direct detection and only play a role in spin relaxation. Chemical shielding and indirect coupling will persist
as apparent scalar interactions. This is what we expect in normal high-resolution NMR. While this averaging is convenient when high resolution and spectral simplicity are the goal, removal of all anisotropic interactions sacrifices a great deal of useful data. Small departures from isotropic tumbling may reintroduce these anisotropic interactions and maintain spectral simplicity. A convenient way to accomplish this is by recognizing that molecules with large anisotropic magnetic susceptibilities will favor certain orientations with respect to an applied magnetic field and will therefore sample orientational space anisotropically. Any anisotropic terms in the spin Hamiltonian will now have effective averages which differ from their isotropic values, allowing direct observation of anisotropic spin interactions. Here, we are concerned primarily with the effect of partial averaging of the dipolar interaction between a pair of spins, which results in a contribution to the observed line splittings. In the subsequent discussion, we will derive an expression for the residual dipolar coupling and demonstrate the structural information that can be extracted from its measurement. The theory underlying magnetic-field-induced order and the consequences in terms of the averaging of anisotropic interactions, such as the dipolar interaction, have been described in several places (Bastiaan and MacLean, 1990; Bastiaan et al., 1987; Lohman and MacLean, 1978). We repeat some of that description here in
Protein Structure and Dynamics from Dipolar Couplings
terms of irreducible spherical tensors. This approach has a certain simplicity and makes many of the underlying assumptions more apparent. Irreducible spherical tensors have been used previously to describe orientation effects in the case of the EPR spectra of radicals (Falle and Luckhurst, 1970) and for the NMR spectrum of monofluorobenzene in a nematic liquid crystal solvent (Snyder, 1965). The reader is referred to those works for additional clarification.
2.2. The Dipolar Hamiltonian
The lab frame Hamiltonian (in Hz) for the dipolar interaction between two nuclei can be expressed as a scalar contraction of spin and spatial tensor components:
where γI and γS are the gyromagnetic ratios for the nuclei, h is Planck’s constant, and r is a vector joining the two nuclei; T and D are second-rank, zero-order tensor operators describing the spin and spatial parts of the interaction, respectively. Note that the nonsecular terms have been dropped, since their time dependence will render them ineffective in determining the energies of the system.
The components of the dipolar interaction tensor written in their own principal axis frame are actually quite simple. In this frame, with the z axis taken along the internuclear vector, there is only one nonzero component. The corresponding lab-frame component is, however, a complex function of molecular orientation. While the principal axis system (PAS) of a given dipolar interaction is often assumed fixed in a molecular frame, a wide range of molecular orientations is sampled by motion. We will therefore have to average over these orientations to obtain closed-form expressions for the dipolar interaction. Averaging is most conveniently done by transforming to an intermediate molecular frame. For the case of interest here, it is most convenient to choose the molecular frame which corresponds to the principal axis system of the magnetic susceptibility tensor for the entire molecule (referred to as the magnetic frame). For isolated molecules in solution, sampling of molecular orientations will be influenced by the interaction of the magnetic field with the anisotropic magnetic susceptibility of the molecule. Thus, assuming a rigid molecule, all necessary averaging will involve the relationship of the magnetic frame to the lab frame and will be the same for any dipolar interaction in the molecule. On the other hand, the relationship of a given principal dipolar frame to the magnetic frame will remain a fixed function of molecular geometry. The transformations between frames are most easily accomplished using Wigner rotation matrices. The first transformation, carrying
James H. Prestegard et al.
the PAS of the dipolar interaction into the magnetic coordinate system, is described by a set of fixed Euler angles. If it is assumed that the molecule is rigid, then this transformation is independent of time:
The second transformation relates the PAS of the susceptibility tensor (magnetic axes) to the lab frame by a second Euler rotation. The time dependence of these angles arises because of molecular reorientation:
Thus, the dipolar tensor, D, in the lab frame can be written
Substitution into Eq. (1) leads to
where the angle brackets indicate that a time average has been taken in order to account for molecular reorientation. This time average will cause the dipolar term to vanish if all orientations in space are sampled equally, and for this reason dipolar couplings are not normally observed in liquid-state high-resolution NMR. However, this assumption of an isotropic distribution of orientations will begin to fail at high magnetic fields and for molecules which have a considerably anisotropic susceptibility. This occurs because of the interaction between the induced molecular dipole moment and the magnetic field, originally discussed by Van Vleck (1932). The next section will detail how consideration of this small interaction can break the isotropy of orientational space and allow the direct observation of anisotropic parameters, such as dipolar couplings, in solution-state NMR spectroscopy.
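The vanishing of the isotropic average can be checked numerically. The following is a minimal sketch, assuming that the orientation dependence of a second-rank interaction enters through the Legendre polynomial P2(cos β), and using a hypothetical Boltzmann-like weight exp(ε P2) with ε much smaller than 1 to mimic weak field-induced order. The function names and the value of ε are ours and serve only as an illustration.

```python
import math

def p2(x):
    """Second Legendre polynomial P2(x) = (3x^2 - 1) / 2."""
    return 0.5 * (3.0 * x * x - 1.0)

def average_p2(weight, n=20000):
    """Average P2(cos beta) over orientations with density weight(cos beta).

    Midpoint rule in x = cos(beta) on [-1, 1]; a constant weight
    corresponds to isotropic sampling of orientations.
    """
    dx = 2.0 / n
    num = 0.0
    den = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * dx
        w = weight(x)
        num += w * p2(x)
        den += w
    return num / den

# Isotropic tumbling: every orientation is equally likely.
isotropic = average_p2(lambda x: 1.0)

# Hypothetical weak alignment: Boltzmann-like weight exp(eps * P2), eps << 1.
eps = 1.0e-4
weak = average_p2(lambda x: math.exp(eps * p2(x)))

print(f"isotropic <P2>      = {isotropic:.2e}")
print(f"weakly ordered <P2> = {weak:.2e} (first-order estimate eps/5 = {eps / 5:.2e})")
```

The isotropic average is zero to within the quadrature error, whereas even a very weak orientational preference leaves a small residual of order ε/5, which is the origin of an observable residual coupling.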
2.3. Residual Dipolar Couplings under Magnetic Field Alignment
Placement of a molecule in a magnetic field gives rise to an induced magnetic dipole moment which is proportional to the susceptibility. This moment will oppose the field for diamagnetic molecules and be along the field for
paramagnetic molecules (Van Vleck, 1932). For molecules which are not of spherical symmetry, the susceptibility must be described by a tensor, and thus the size of the induced magnetic dipole moment will depend upon the orientation of the molecule with respect to the magnetic field. The induced dipole in turn interacts with the magnetic field itself, leading to an energy of interaction, W, which can be written as (Bastiaan et al., 1987)
As most molecules are not of spherical symmetry, they will have some orientations with respect to the magnetic field which are energetically more
favorable than others. If W is large enough compared to the thermal energy, a sufficient, although small, degree of order will be induced such that the effects of the resulting incomplete averaging of anisotropic interactions become observable (Bastiaan et al., 1987; Bastiaan and MacLean, 1990; Lohman and MacLean, 1978). It is important to recognize that the orientation of the principal axes of the susceptibility tensor (PAS) within the molecular frame will determine how the
molecule will tend to order in the field. Shown in Fig. 1 is the most probable orientation for a benzene molecule with its symmetry-determined principal axis system. We can express the interaction energy in Eq. (6) as a scalar contraction of irreducible spherical tensors (in the PAS of the susceptibility tensor):
Note that the rank 0 (isotropic) component of the susceptibility tensor has not been included in the above expression. This part does not contribute to the orientational dependence of the interaction energy W and, hence, can be dropped for simplicity. In the PAS of the susceptibility tensor, the rank 2 irreducible spherical components can easily be written in terms of familiar susceptibility anisotropies:
where the +1 and –1 components vanish because of the assumed symmetric nature of the susceptibility tensor. The operator T is simple in the lab frame, where only one element is nonzero. The magnetic frame equivalent is generated by use of Wigner rotation matrices, and substitution into Eq. (7) leads to the desired expression for W in the PAS of the susceptibility tensor:
Our objective in obtaining Eq. (9) is the calculation of the average which appears in Eq. (5). Having an expression for the energy, we can assume a Boltzmann distribution and carry out this averaging:
In the high-temperature approximation, the above integral can be solved by inspection utilizing the orthogonality theorem of the Wigner rotation matrix elements:
This allows Eq. (5), in the case of a heteronuclear spin pair, to be simplified to
where the fixed Euler angles have been replaced explicitly by the two spherical angles that relate the internuclear vector between spins I and S to the molecule-fixed magnetic coordinate system. Note the form of the spin operator: it is exactly the same as that appearing in the Hamiltonian for through-bond scalar coupling. For a simple pair of unlike spin-1/2 nuclei, two doublets will appear, each split by the sum of scalar and residual dipolar contributions. It is convenient to write the expression for the residual dipolar coupling contribution in the case of two spin-1/2 nuclei (units of Hz):
Inspection of Eq. (13) leads to the conclusion that, for a rigidly tumbling molecule, the magnitude of the residual dipolar coupling will depend on the square of the
magnetic field strength, the susceptibility anisotropy, the internuclear distance r, and the orientation of the internuclear vector within the magnetic coordinate system (Fig. 2). The field dependence can be used to separate the residual dipolar contribution from scalar coupling contributions which are not field dependent. For a directly bonded pair of spins, it will be assumed that the internuclear distance is known, and measurement of these contributions, given knowledge of the susceptibility anisotropy, can provide information related to the orientation of bond vectors within a molecule fixed magnetic coordinate system. While these effects represent a potentially abundant source of structural information, some significant challenges remain. Even at the highest fields available today, the predicted splittings are small. For a benzene ring at 17.6 T and 298 K, the predicted residual dipolar contribution to a one-bond coupling with is 0.12 Hz. Even for systems with large
susceptibility anisotropies, these effects are only on the order of a few Hertz, and thus are experimentally demanding to measure.
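The dependences listed above can be made concrete with a short calculation. The sketch below assumes a commonly quoted form of the field-induced residual dipolar coupling, written in SI units: a prefactor B0²/(15 μ0 kT), the static dipolar coupling constant, and an angular term Δχa(3cos²θ − 1) + (3/2)Δχr sin²θ cos 2φ. Sign conventions and the definitions of the axial and rhombic anisotropies vary between authors, so this should be read as an illustration rather than as the exact expression of Eq. (13). The benzene anisotropy (−59.7 × 10⁻⁶ cm³/mol), the C–H distance (1.09 Å), and θ = 90° are illustrative assumptions, not values taken from the text.

```python
import math

MU0 = 4.0e-7 * math.pi      # vacuum permeability, T m/A
KB = 1.380649e-23           # Boltzmann constant, J/K
HBAR = 1.054571817e-34      # reduced Planck constant, J s
NA = 6.02214076e23          # Avogadro constant, 1/mol
GAMMA_1H = 2.6752218744e8   # 1H gyromagnetic ratio, rad s^-1 T^-1
GAMMA_13C = 6.728284e7      # 13C gyromagnetic ratio, rad s^-1 T^-1

def dipolar_constant_hz(gamma_i, gamma_s, r):
    """Static dipolar coupling constant (Hz) for two spins r metres apart."""
    return (MU0 / (4.0 * math.pi)) * gamma_i * gamma_s * HBAR / (2.0 * math.pi * r ** 3)

def molar_cgs_to_si(dchi_cgs_molar):
    """Convert a molar susceptibility (cm^3/mol, cgs) to m^3 per molecule (SI)."""
    return 4.0 * math.pi * 1.0e-6 * dchi_cgs_molar / NA

def field_induced_rdc_hz(b0, temp, gamma_i, gamma_s, r, dchi_ax, dchi_rh, theta, phi):
    """Residual dipolar coupling (Hz) under the assumed high-temperature form.

    dchi_ax and dchi_rh are axial and rhombic anisotropies in m^3 per molecule;
    theta and phi (radians) locate the internuclear vector in the magnetic frame.
    """
    alignment = b0 ** 2 / (15.0 * MU0 * KB * temp)   # units of 1/m^3
    angular = (dchi_ax * (3.0 * math.cos(theta) ** 2 - 1.0)
               + 1.5 * dchi_rh * math.sin(theta) ** 2 * math.cos(2.0 * phi))
    return -alignment * dipolar_constant_hz(gamma_i, gamma_s, r) * angular

# Illustrative benzene C-H pair lying in the ring plane (theta = 90 degrees).
dchi_benzene = molar_cgs_to_si(-59.7e-6)   # assumed literature-type value
args = (298.0, GAMMA_1H, GAMMA_13C, 1.09e-10, dchi_benzene, 0.0, math.pi / 2.0, 0.0)
rdc_17_6 = field_induced_rdc_hz(17.6, *args)
rdc_35_2 = field_induced_rdc_hz(35.2, *args)

print(f"|D| at 17.6 T: {abs(rdc_17_6):.3f} Hz")
print(f"ratio on doubling the field: {rdc_35_2 / rdc_17_6:.1f}")
```

With these assumed inputs the magnitude should come out close to the 0.12 Hz quoted above, and doubling the field quadruples the coupling, which is the basis for separating it from the field-independent scalar coupling.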
3. EARLY HISTORY OF OBSERVATION
Despite anticipated difficulties in observation of residual dipolar couplings, several accounts of observation can be found in the early NMR literature. Although
in this chapter we will be mainly concerned with applications of dipolar couplings and magnetic field alignment to the structure and dynamics of biomolecules in simple solution, many of the key concepts underlying the methodology stem from a more diverse set of studies. Many of the earlier studies are on organic molecules (Bastiaan and MacLean, 1990; Bastiaan et al., 1987; van Zijl et al., 1984), many use alignment methods other than magnetic-field-induced alignment (Buckingham and McLauchlan, 1967; Plantenga et al., 1980), many include the measurement of quadrupole as well as residual dipolar splittings (Bothner-By et al., 1981), and many rely on cooperative systems such as liquid crystals to achieve partial orientation (Sanders et al., 1994). However, all of these studies share underlying principles and are therefore worthy of some discussion here. In the interests of brevity this discussion cannot be complete, but we hope to make connections to other particularly useful segments of the literature, such as the liquid crystal NMR literature (Emsley and Lindon, 1975). Observable effects of partial molecular alignment were first predicted for polar molecules aligned under the influence of an electric field (Buckingham and Lovering, 1962). This prediction gained experimental support soon after, when signatures of residual dipolar coupling were observed in p-nitrotoluene using
electric field NMR. The predicted effects, combined with the observed perturbations of multiplet splittings, allowed some of the first determinations of the absolute signs of scalar coupling constants in aromatic rings (Buckingham and McLauchlan, 1963). Applications of electric field alignment in NMR have not, however, become widespread because of the need for specially designed cells and limitation to relatively nonconducting solutions. Magnetic field alignment does not suffer the particular limitations of electric field alignment. However, it does require very high fields and large anisotropic magnetic susceptibilities. Aromatic and paramagnetic systems have the requisite
large anisotropic magnetic susceptibilities, and therefore became candidates for the early development of magnetic field alignment methodology as soon as magnets of
sufficient field strength were available. The first experimental demonstration of magnetic field alignment dates back to 1978 (Lohman and MacLean, 1978). The primary observation in this case was quadrupolar splitting of resonances from nuclei with spins greater than 1/2, rather than dipolar splitting from coupled pairs of spins. Quadrupolar splittings were observed in the NMR spectra of
triphenylene-d6 and phenanthrene-d10, suggesting incomplete averaging of the quadrupolar interaction due to partial magnetic field alignment (Lohman and MacLean, 1978). Quadrupole interactions, present for nuclei with spins greater than 1/2, show a dependence on molecular alignment and molecular geometry similar to that of dipolar interactions, with the magnitude of the observed splittings depending on the quadrupole coupling constant, the square of the magnetic field strength, and the magnitude of the molecule’s magnetic susceptibility anisotropy. Since values for quadrupole coupling constants were known (180 kHz for C–D bonds in aromatic molecules), the authors were able to compute the axial magnetic susceptibility anisotropy. The diamagnetic anisotropy in these molecules is characteristic of aromatic behavior, with cyclic electron delocalization effecting a substantial increase in the principal negative susceptibility component perpendicular to the aromatic ring plane. Aromatic molecules thus tend to align with the normal of the ring plane perpendicular to the magnetic field, as shown in Fig. 1. Studies on aromatic molecules have continued to be reported (Laatikainen et al., 1995, 1993). As discussed later, aromatic groups of tyrosine, tryptophan, phenylalanine, and histidine are also important in diamagnetic protein applications, where their anisotropies play a dominant role in the resultant magnetic susceptibility (Tjandra et al., 1996).
Quadrupole splittings were chosen for initial observations on small molecules because of the large quadrupole coupling constants and resultant large residual splittings. However, chemical-shift resolution for quadrupole nuclei is generally poor, making retrieval of information from multiple sites in large molecules difficult. When quadrupole interactions dominate effective transverse relaxation rates of solution samples, linewidths increase as the square of the interaction strength, decreasing resolution as interaction strength increases. Hence, particularly for larger molecules, measurement of the weaker dipolar interactions offers an advantage. Observations of residual dipolar splittings, nevertheless, required a combination of higher fields and molecules with larger anisotropic susceptibilities. The enhanced anisotropy in paramagnetic systems eventually allowed the smaller effects of residual dipolar coupling between pairs of protons to be observed. The first observations using paramagnetic systems were actually made on quadrupolar couplings (Eyring et al., 1980). The observation of dipolar couplings was later made on the para methyl protons in paramagnetic bis[tolyltris(pyrazolyl)borato]cobalt(II) (Bothner-By et al., 1981). This also marks one of the first discussions of the utility of these measurements in structural analysis. The small number and small magnitude of the residual dipolar splittings made a completely independent structural analysis impossible. However, quadrupolar splittings in the corresponding deuterated molecule allowed determination of a degree of order, and the observed dipolar couplings could then be shown to be consistent with accepted methyl group geometry. Thus, the potential structural utility of the methodology was demonstrated.
Observation of residual dipolar couplings in diamagnetic molecules presented a greater challenge owing to the smaller magnitude of anisotropy. Residual dipolar couplings could not at first be observed in small organic molecules but proved feasible in larger aromatic systems in which the anisotropies of the individual aromatic systems add up to a larger effective anisotropy. This was first demonstrated for methylpyropheophorbide and coronene, where residual dipolar couplings became visible as an apparent magnetic field dependence (as discussed below) of the scalar coupling interaction. Extraction of the couplings allowed determination of the diamagnetic anisotropy for the molecules (Gayathri and Bothner-By, 1982). Diamagnetic anisotropies have since been determined for a number of molecules, including simple aromatics, porphyrins, and halomethanes (Bothner-By et al., 1987; van Zijl and Bothner-By, 1988). Acceptance of this approach to determining susceptibility anisotropies reflects the simplicity of the methodology in comparison to other techniques, such as use of the Cotton–Mouton effect.
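The extraction of a residual dipolar contribution from an apparent field dependence of a splitting can be sketched as a straight-line fit. The example below assumes that the observed splitting is the sum of a field-independent scalar coupling and a dipolar term proportional to the square of the field; the field values, the scalar coupling, and the slope are hypothetical numbers chosen only to illustrate the procedure.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical splittings (Hz) measured at several fields (T).
fields = [7.05, 11.74, 14.09, 17.6]
j_scalar = 160.0          # assumed field-independent scalar coupling, Hz
k_dipolar = -4.0e-4       # assumed dipolar slope, Hz per T^2
splittings = [j_scalar + k_dipolar * b * b for b in fields]

# Regress the splitting against B0^2: the intercept is the scalar coupling,
# and slope * B0^2 is the residual dipolar contribution at a given field.
slope, intercept = fit_line([b * b for b in fields], splittings)
dipolar_at_top_field = slope * fields[-1] ** 2

print(f"scalar coupling (intercept): {intercept:.3f} Hz")
print(f"dipolar contribution at {fields[-1]} T: {dipolar_at_top_field:.3f} Hz")
```

With real data the scatter in the measured splittings sets the uncertainty of the slope, so measurements over the widest possible range of fields are most informative.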
Magnetic field alignment and residual dipolar couplings have also been used successfully to determine the three-dimensional structure of a complex cage molecule containing a porphyrin and a quinone (Lisicki et al., 1988). In later studies on a DNA decamer, dipolar effects were observed between protons (C5H and C6H) in two cytosines located at the center and terminal end of the double helix. From the data, an angle of 15° between the cytosine base planes was inferred, suggesting loosening of the double helix at the ends of the strands (Skoglund, 1987). More recent work on DNA using heteronuclear coupling has been reported by Kung et al. (1995). Thus the basic utility of residual dipolar couplings in both chemical and structural analysis of small to moderate-sized molecules has been established. Applications of residual dipolar couplings to the structural determination of proteins in solution would seem, on the basis of the above discussion, to have great potential. One can obtain data that are quite complementary to NOE-based data because residual dipolar splittings for directly bonded pairs can yield the orientation of different interaction vectors relative to an order-determining susceptibility tensor without the requirement for close approach. One can also capitalize on backbone-localized assignment strategies for large molecules, since constraints on the orientation of directly bonded backbone pairs are useful even when long-range NOEs involving these pairs are seldom observed.
4. APPLICATION TO PROTEIN SYSTEMS
The first applications of residual dipolar constraints to problems in protein structure and dynamics have now appeared. Among those that rely on the inherent tendencies of the molecular system of interest to orient in a magnetic field, one involves work on a paramagnetic protein, myoglobin (Tolman et al., 1995, 1997).
A second involves work on a protein with a rather substantial diamagnetic anisotropy, ubiquitin (Tjandra et al., 1996). A third involves work on the DNA-binding domain of a protein, GATA-1, bound to DNA, which, although diamagnetic, is highly anisotropic (Tjandra et al., 1997). We review these first systems in what follows in the belief that they can illustrate current problems and future potential. We will then focus on problems inherent in the measurement of dipolar splittings. The small magnitude of the residual dipolar interaction at currently available field strengths [...]
[...] (>1 ns) buried in internal cavities or in deep and narrow surface pockets. At high frequencies, above the dispersion, the excess relaxation rate (above the bulk water rate) is due to the conventional hydration layer, essentially the water molecules in contact with the protein surface. These water molecules generally have rotational correlation times as well as residence times in the subnanosecond range and therefore do not contribute to the MHz dispersion. In addition, rapidly exchanging labile protein hydrogens can make substantial contributions to the ¹H and ²H relaxation rates over the entire frequency range (Denisov and Halle, 1995b; Venu et al., 1997).
The theoretical framework needed to analyze NMRD data from protein solutions is based on the “standard model” of water relaxation, first proposed by Lars Onsager (Wang, 1955) and applied in the first detailed water relaxation study of protein solutions by the Krakow group in 1963 (Daszkiewicz et al., 1963). The standard model required two extensions (Halle and Wennerström, 1981b). First, a
frequency-independent term must be added to account for the contribution from mobile surface waters and fast local motions of long-lived waters and labile
Bertil Halle et al.
hydrogens. Second, the effect of fast local motions of long-lived water molecules on the dispersion amplitude is accounted for by an order parameter formalism that does not rely on specific assumptions about the nature of this motion.
4.1. Spatial Resolution
The relaxation of the water magnetization in a protein solution is governed by molecular motions at two distinct levels. At the level of spin dynamics, translational motion of water molecules transfers magnetization between microenvironments with different intrinsic relaxation rates. If sufficiently fast, such material exchange leads to spatial averaging of the local relaxation rates. At the level of orientational time correlation functions, water rotation averages out the anisotropic spin-lattice coupling and thus determines the intrinsic spin relaxation rate (see Sect. 4.2).
4.1.1. Exchange Averaging
The theoretical framework for analyzing relaxation data from nuclei exchanging between discrete states is well established and was, in fact, first developed in connection with studies of water in microheterogeneous systems (Zimmerman and Brittin, 1957). It is not obvious, however, that a discrete-state exchange model provides a valid description of continuous water diffusion in a spatially heterogeneous system such as a protein solution (Halle and Westlund, 1988). There are two aspects to this issue. First, in the fast-exchange regime the actual exchange mechanism is irrelevant, and it is only necessary that the perturbation of water rotation induced by the protein be relatively short-ranged. This is known to be the case: relaxation studies on a variety of microheterogeneous aqueous systems show that only water molecules in direct contact with an interface are significantly perturbed (Woessner, 1980; Carlström and Halle, 1988; Volke et al., 1994). Second, the observed spin relaxation rate depends on the exchange mechanism only in the intermediate exchange regime, where the residence times are in the μs–ms range. Such long residence times are only relevant for buried water molecules and labile protein protons, for which a discrete-state exchange (or jump) model is indeed appropriate. This would not necessarily be the case for water molecules at the protein surface, but they are invariably in the fast-exchange regime. The simplest description of the effect of exchange averaging on the water longitudinal relaxation rate in a protein solution is of the form
Provided that chemical-shift differences can be ignored, an analogous result holds for the transverse relaxation rate (see Sect. 3.3.1). Since the bulk and surface contributions are generally in the extreme-narrowing limit, the subscript 1 is suppressed for them. The first term in Eq. (32)
Multinuclear Relaxation Dispersion Studies of Protein Hydration
refers to the fraction of water molecules that are unperturbed by the protein and thus have the same relaxation rate as bulk water. The second term refers to the fraction of water molecules that are dynamically perturbed by the protein, but remain sufficiently mobile that their effective correlation times are much shorter than the tumbling time of the protein. These water molecules are responsible for (most of) the excess relaxation at frequencies above the dispersion.
The third term in Eq. (32) refers to the long-lived water molecules responsible for the relaxation dispersion. Each of these water molecules has a distinct residence time and intrinsic longitudinal relaxation rate and, taken together, they account for a fraction of all water molecules in the sample. The ¹H and ²H relaxation rates generally contain contributions from water hydrogens as well as labile protein hydrogens. Since labile hydrogens generally exchange slowly compared to the tumbling of the protein, they contribute only to the third term in Eq. (32). The simple form of this term is valid provided that two dimensionless quantities are small compared to 1 (Luz and Meiboom, 1964). In typical protein solutions, this condition is satisfied with a wide margin.
Equation (32) is an essentially phenomenological description of exchange averaging and, as such, is of considerable generality. Since the NMRD method lacks intrinsic spatial resolution, the microscopic significance of the individual terms in Eq. (32) can be deduced only with the aid of extrinsic structural data, such as high-resolution crystal structures. This has been done for a variety of proteins and the general picture is now clear (Denisov and Halle, 1995a, 1996). The short-lived water molecules responsible for the second term in Eq. (32) essentially comprise the traditional hydration layer, i.e., water molecules in contact with the protein surface (hence the subscript S). The corresponding rate is the intrinsic relaxation rate averaged over all surface sites occupied by short-lived water molecules. The long-lived water molecules responsible for the third term in Eq. (32) are usually buried in cavities inside the protein or trapped in deep surface pockets with low accessibility to external water. In the following, we refer to this class of crystallographically identifiable water molecules as internal water molecules (hence the subscript I). (Crystallographers often use this term in a slightly more restrictive sense, including only water molecules that are not within hydrogen-bonding distance of external water molecules.)
4.1.2. Difference NMRD
The identification of internal water molecules as the source of the relaxation dispersion has transformed the NMRD method into a quantitative tool for investigating specific water molecules of structural and functional significance and for
exploiting internal water molecules as noninvasive probes of protein structure and dynamics. The most powerful way of conducting such studies is in the form of a
difference NMRD experiment where the NMRD profiles from two structurally related proteins are compared. In fact, the first demonstration of the crucial role of buried water molecules was a difference NMRD experiment where the NMRD profiles of BPTI and ubiquitin were compared (Denisov and Halle, 1994, 1995a).
These proteins are of similar size and surface structure, but differ qualitatively in one respect: BPTI contains four buried water molecules, ubiquitin none. The result (Fig. 6) is clear-cut: the virtual absence of a relaxation dispersion for ubiquitin must be due to the absence of buried water molecules in this protein. (The tiny ubiquitin dispersion can be attributed to a single weakly ordered water molecule in a surface pocket.) Subsequent work (Denisov et al., 1995, 1996) revealed that the dispersion from BPTI is actually due to only three of the four buried water molecules, the fourth one (W122) exchanging too slowly to contribute significantly at 300 K (see Sect. 5.5). The BPTI–ubiquitin difference experiment relied for its interpretation on high-resolution crystal structures of the two proteins. With the correlation between internal water molecules and NMRD firmly established (Denisov and Halle, 1996),
useful information can be obtained from difference NMRD experiments even when the structure of one of the two proteins is unknown. Partially folded proteins are a case in point. Figure 7 shows the dispersions from a single protein in the native state, in the partially folded A state (“molten globule”) at pH 2, and in the unfolded state (in the presence of 4 M GuHCl and with the four disulfide bonds reduced by dithiothreitol) (Denisov et al., 1999). This experiment provides three pieces of information. First, the dispersion from the A state (corresponding to at least three long-lived water molecules) implies the existence of persistent (>10 ns) structural elements that are not present in the unfolded form. Second, the dispersion frequency is a measure of the hydrodynamic volume of the protein (see Sect. 5.2). The frequency shift (more accurately measured from the dispersions) between the native and A forms suggests a 30% expansion of the latter. Third, the excess relaxation rate on the high-frequency plateau provides a global measure of solvent exposure. This is seen to differ little between the native and A forms but, as expected, is substantially higher for the unfolded form.
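The link between dispersion frequency and hydrodynamic volume can be illustrated with a short calculation. This is a minimal sketch under two assumptions that are ours: the dispersion frequency is inversely proportional to the rotational correlation time of the protein (residence times much longer than the tumbling time), and the rotational correlation time follows the Stokes–Einstein–Debye relation for a sphere, ηV/(kT). The frequencies, volume, and viscosity below are hypothetical.

```python
KB = 1.380649e-23  # Boltzmann constant, J/K

def sed_rotational_time(volume_m3, viscosity_pa_s, temp_k):
    """Stokes-Einstein-Debye rank-2 rotational correlation time of a sphere (s)."""
    return viscosity_pa_s * volume_m3 / (KB * temp_k)

def volume_ratio_from_dispersion(nu_ref_hz, nu_other_hz):
    """Relative hydrodynamic volume V_other / V_ref from two dispersion frequencies.

    Assumes the dispersion frequency scales as 1/tau_R and tau_R scales as V,
    at the same temperature and solvent viscosity.
    """
    return nu_ref_hz / nu_other_hz

# Hypothetical dispersion frequencies for a native and a partially folded form.
nu_native = 5.0e6            # Hz
nu_partially_folded = nu_native / 1.3
expansion = volume_ratio_from_dispersion(nu_native, nu_partially_folded)

# Hypothetical 20 nm^3 sphere in water at 298 K (viscosity 0.89 mPa s).
tau_r = sed_rotational_time(20.0e-27, 0.89e-3, 298.0)

print(f"volume ratio: {expansion:.2f}")
print(f"tau_R for the hypothetical sphere: {tau_r * 1e9:.2f} ns")
```

A downward shift of the dispersion frequency by a factor of 1.3 thus corresponds, under these assumptions, to a 30% larger hydrodynamic volume.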
These two examples can be thought of as global difference NMRD experiments. More detailed information can be obtained from a local difference NMRD experiment where the relaxation dispersion is recorded before and after a site-directed structural perturbation that eliminates one or more of the internal water molecules that contribute to the relaxation dispersion from the unperturbed protein. In a more subtle version of this experiment, the perturbation does not eliminate any internal water molecules but only affects their residence times (e.g., by altering the rate of large-scale conformational fluctuations). If it can be established that the perturbation is local, e.g., from crystal structures of both forms, then should be unaffected and should be the same for all internal water molecules present in both forms. Equation (32) then yields for the difference dispersion
where the sum includes only the displaced water molecules. A local structural perturbation can be induced in several ways. Site-directed mutagenesis is the method of choice for replacing buried water molecules. For example, in the single-point BPTI mutant G36S, the buried water molecule W122 is replaced by the hydroxyl group in the side chain of serine-36. The wild-type–G36S difference dispersions shown in Fig. 8 are thus due to a single buried water molecule. Local covalent modifications can of course also be introduced by conventional chemical methods, e.g., selective reduction of disulfide bonds (this might be a residence time perturbation).
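The exchange-averaging description and the difference experiment can be sketched numerically. The code below assumes a common form of the three-term expression, (1 − fS − fI) R_bulk + fS R_S + Σk fk/(τk + 1/R1k), in which each long-lived site k contributes through its fraction fk, residence time τk, and intrinsic rate R1k. This form and all numerical values are assumptions made for illustration; they are not taken from Eq. (32) itself.

```python
def internal_term(f_k, tau_k, r1_k):
    """Contribution of one long-lived site: f_k / (tau_k + 1 / R1_k), in s^-1."""
    return f_k / (tau_k + 1.0 / r1_k)

def r1_observed(r_bulk, f_s, r_s, internal_sites):
    """Exchange-averaged longitudinal rate under the assumed three-term form.

    internal_sites is a list of (f_k, tau_k, r1_k) tuples, one per internal water.
    """
    f_i = sum(f_k for f_k, _, _ in internal_sites)
    total = (1.0 - f_s - f_i) * r_bulk + f_s * r_s
    total += sum(internal_term(*site) for site in internal_sites)
    return total

def difference_rate(r_bulk, f_s, r_s, sites_reference, sites_perturbed):
    """Rate difference between two forms that share bulk and surface terms."""
    return (r1_observed(r_bulk, f_s, r_s, sites_reference)
            - r1_observed(r_bulk, f_s, r_s, sites_perturbed))

# Hypothetical parameters: three internal waters in the reference protein,
# one of which (the last) is displaced in the perturbed form.
r_bulk, f_s, r_s = 140.0, 0.01, 300.0
reference = [(3.6e-5, 1.0e-6, 1.0e5), (3.6e-5, 1.0e-7, 1.0e5), (3.6e-5, 2.0e-4, 1.0e5)]
perturbed = reference[:2]
displaced = reference[2]

delta = difference_rate(r_bulk, f_s, r_s, reference, perturbed)
print(f"difference rate: {delta:.4f} s^-1")
print(f"displaced-site term alone: {internal_term(*displaced):.4f} s^-1")
```

Only the displaced site survives in the difference, apart from a negligible change in the bulk fraction. The same function also shows the two limits of the internal term: fk R1k for fast exchange and fk/τk when the residence time is much longer than 1/R1k.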
More accessible (but long-lived) water molecules in the native structure can be eliminated (or replaced by short-lived ones) by removing an intrinsic metal ion or cofactor or by adding a high-affinity substrate or inhibitor. If complete removal of an intrinsic ligand cannot be achieved, NMRD profiles can be recorded at a series of ligand–protein ratios and the results extrapolated to zero ligand concentration. This approach can be used, for example, for intrinsic multivalent metal ions that coordinate long-lived water molecules, as illustrated in Fig. 9 for calbindin where each of the two ions coordinates one water molecule (Denisov and Halle, 1995c). The strategy of water elimination by ligand binding is illustrated in Fig. 10 for a B-DNA dodecamer where five water molecules in the minor groove are displaced by the polyaromatic drug netropsin (Denisov et al., 1997a). Due to the relatively short residence time (1 ns), only the low-frequency part of the dispersion could be accessed (see Sect. 5.5). More complete dispersions from DNA solutions were subsequently recorded at 253 K, using an emulsion technique to avoid freezing (Jóhannesson and Halle, 1998). In general, hydrogen exchange is less of a problem in difference NMRD experiments since any labile hydrogen contribution tends to cancel out in the
difference. This is an important advantage as NMRD data from all three water nuclei can then be quantitatively compared, providing detailed information about residence times (Denisov et al., 1996) and orientational disorder (Denisov et al., 1997b) of buried water molecules. Concern about hydrogen exchange is warranted even in difference NMRD experiments, however, because the structural perturbation might affect the values or exchange rates of labile hydrogens (Denisov and Halle, 1995c). Ligands carrying rapidly exchanging hydrogens may, of course, also present problems (Denisov et al., 1997a).
4.2. Temporal Resolution
By definition, the water molecules responsible for the second term in Eq. (32) do not produce a dispersion in the experimentally accessible frequency range. Even
for a residence time as long as 100 ps, the dispersion would be centered around 1 GHz, an order of magnitude above the highest achievable Larmor frequency. The observed relaxation dispersion is due to the frequency dependence of the intrinsic relaxation rates of internal water molecules (and labile hydrogens). Within the BWR regime, as defined in Eq. (2), these rates are related to the spectral density function as shown in Sect. 3. The discussion in Sect. 4.2.1 applies to quadrupolar relaxation as well as to intramolecular dipolar relaxation; hence we omit the Q/D subscript on the spectral density function. Intermolecular dipolar contributions are considered in Sect. 4.2.2.
4.2.1. Intramolecular Spectral Density Function
For an internal water molecule (or labile proton spin pair) tumbling rigidly together with a spherical protein, the spectral density function has the usual
Lorentzian form
where the effective correlation time
is determined by the residence time in site k and the rotational correlation time of the protein according to (Beckert and Pfeifer, 1965; Hertz, 1967; Brüssau and Sillescu, 1972)
This simple relationship results from two innocuous assumptions. First, water (or labile hydrogen) exchange and protein rotation are statistically independent processes. Second, each exchange event randomizes the orientation of the spin–lattice interaction tensor. In other words, once the internal water molecule or labile hydrogen has exchanged with bulk water, the probability of returning to the same
site (in the same protein molecule) before the protein has randomized its orientation
is negligible. Note that the residence time
appears at both levels of motional
averaging: Eq. (32) describes spatial averaging of the intrinsic relaxation rates,
while Eq. (35) describes orientational averaging by exchange from a locally ordered site to an isotropic bulk phase. (When treating water and labile hydrogen exchange on an equal footing, we use a common symbol for the residence time. When either species is referred to specifically, we use separate notations for water and labile hydrogens, respectively.)
Equation (34) is readily generalized to nonspherical proteins with symmetric-top rather than spherical-top rotational diffusion. The spectral density function is then a sum of three Lorentzians, weighted according to the relative orientation of the spin–lattice interaction tensor and rotational diffusion tensor (Woessner, 1962). For most globular proteins, however, the effect of anisotropic rotational diffusion on the shape of the relaxation dispersion is insignificant.
Internal water molecules (and labile hydrogens) do not in general tumble
rigidly with the protein, but undergo restricted rotational motions on time scales short compared to the effective correlation time. If the local rotation is much faster than the global isotropic motion and remains in the extreme-narrowing regime at the highest relevant frequency, then the appropriate generalization of Eq. (34) takes the simple form (Halle and Wennerström, 1981b; Lipari and Szabo, 1982)
where is an effective correlation time for the local restricted rotation. Furthermore, is the generalized second-rank orientational order parameter for site k, defined through
In our previous work, a generalized order parameter was used. The quantity is more convenient since it has a maximum value of unity for all three water nuclei in the limit of a rigidly attached water molecule. If k refers to an internal water site, is a molecular order parameter defined as
where specifies the orientation of the water-molecule-fixed frame M (Fig. 11) relative to an arbitrary protein-fixed frame P (assuming spherical-top rotation). To relate the generalized order parameters of all three water nuclei to the same set of molecular order parameters we have introduced in Eq. (37) a set of geometric coefficients defined as
where specifies the orientation of the principal frame F of the spin–lattice interaction tensor (Fig. 11) relative to the M frame. The explicit forms of the geometric coefficients for the three water nuclei are collected in Table 2. Since the relaxation rates of these nuclei are not affected by a 180° flip of an internal water molecule around its axis (Denisov and Halle, 1995c; Venu et al., 1997). If k refers to a labile hydrogen site, it is more convenient to define
the order parameters directly in terms of the orientation of the interaction tensor with respect to the protein. This corresponds to setting in Eq. (37).
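The relations discussed in this section can be illustrated with a minimal Python sketch. It combines the residence time and the rotational correlation time into an effective correlation time by adding the two rates, and evaluates a two-term spectral density with a fast local term and a slow global Lorentzian weighted by the squared order parameter. The function names, the normalization J(0) = tau, and the numerical values are assumptions made for illustration; the exact prefactors of Eqs. (34)–(36) should be taken from the original equations.

```python
import math


def effective_correlation_time(tau_rot, tau_res):
    """Two statistically independent randomization processes: their rates add."""
    return 1.0 / (1.0 / tau_rot + 1.0 / tau_res)


def lorentzian(omega, tau):
    """Lorentzian spectral density, normalized here so that J(0) = tau (assumed)."""
    return tau / (1.0 + (omega * tau) ** 2)


def two_term_spectral_density(omega, tau_c, tau_loc, s2):
    """Fast local term plus slow global term weighted by the squared order parameter.

    Assumes tau_loc << tau_c and omega * tau_loc << 1 (extreme narrowing),
    so the local term carries no frequency dependence.
    """
    return (1.0 - s2) * tau_loc + s2 * lorentzian(omega, tau_c)


# Hypothetical values: 5 ns protein tumbling, 1 microsecond residence time,
# 1 ps local motion, squared order parameter 0.8.
tau_c = effective_correlation_time(5e-9, 1e-6)
j_low = two_term_spectral_density(0.0, tau_c, 1e-12, 0.8)
j_high = two_term_spectral_density(2.0 * math.pi * 1e8, tau_c, 1e-12, 0.8)
```

With a residence time much longer than the tumbling time, the effective correlation time is only slightly shorter than the tumbling time, which is why the dispersion frequency then reports on protein rotation rather than on water exchange.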
4.2.2. Intermolecular Spectral Density Function
When Eq. (32) is applied to NMRD data, the intrinsic relaxation rate of an internal water molecule contains a contribution, given by Eq. (14a), from the intramolecular dipole coupling between the two water protons as well as a contribution, given by Eqs. (18) and (19), from intermolecular dipole couplings
between either water proton and all protein protons. The spectral density function for the intramolecular contribution, where only the orientation of the H–H vector is modulated, is of the same form as for quadrupolar relaxation, Eq. (36), and the generalized intramolecular order parameter is given by Eq. (37) with For the intermolecular contribution, where local motions can modulate both the orientation and the length r of the H–H vector, the spectral density function takes the form
where the sum runs over all internuclear vectors connecting one of the protons of the internal water molecule k with a protein proton i. The dipole frequency is given by Eq. (12) and the effective dipole frequency, averaged by local motions, can be expressed in terms of the generalized intermolecular order parameter as
with
involving the solid spherical harmonics of rank Here is the H–H vector, of length and orientation and the are (unnormalized) spherical harmonics. For a rigid water–protein complex without internal motions on the time scale of protein tumbling or faster,
The generalized intermolecular order parameter is most conveniently evaluated in a coordinate system with its origin at the center of symmetry of the internal motion rather than at the proton (Otting et al., 1997). If only one of the two coupled protons undergoes internal motion, the solid harmonics can be transformed to the center of symmetry according to (Chiu, 1964)
where and are vectors from the center of symmetry to the mobile and fixed proton, respectively, and is assumed. Furthermore,
If the mobile-proton vector r1 is distributed with spherical symmetry, it follows from the orthogonality of the spherical harmonics that whereby Inserting this into Eq. (42) and using the closure relation for spherical harmonics, one obtains , i.e., the same result as if the spherically disordered proton were fixed at the center of symmetry. For internal
motion of lower symmetry, corrections to this result appear that are proportional to a power of
For cylindrical symmetry, for example, one finds
By employing a two-center expansion for solid harmonics (Chiu, 1964), internal motions of both protons can be handled in a similar way. When both protons are spherically disordered, the result is the same as if they were located at the centers of symmetry.
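The statement that a spherically disordered proton behaves as if it were fixed at the center of symmetry can be checked numerically. The Python sketch below averages the m = 0 dipolar function (3cos²θ − 1)/r³ over a spherical shell of mobile-proton positions and compares the result with the value at the shell center. The geometry (fixed proton at the origin, shell center on the z axis, shell radius smaller than the center distance), the function name, and the quadrature settings are illustrative assumptions rather than part of the original treatment.

```python
import numpy as np


def shell_average_dipolar(center_distance, shell_radius, n_theta=40, n_phi=80):
    """Average (3 cos^2(theta) - 1) / r^3 over a uniform spherical shell.

    The fixed proton sits at the origin; the mobile proton is uniformly
    distributed on a sphere of radius shell_radius centered at
    (0, 0, center_distance). Requires shell_radius < center_distance.
    """
    # Gauss-Legendre nodes in cos(polar angle) and a uniform azimuthal grid.
    nodes, weights = np.polynomial.legendre.leggauss(n_theta)
    phi = 2.0 * np.pi * np.arange(n_phi) / n_phi
    cos_t = nodes[:, None]
    sin_t = np.sqrt(1.0 - cos_t**2)
    # Cartesian position of the mobile proton relative to the fixed one.
    px = shell_radius * sin_t * np.cos(phi)[None, :]
    py = shell_radius * sin_t * np.sin(phi)[None, :]
    pz = center_distance + shell_radius * cos_t + 0.0 * phi[None, :]
    r2 = px**2 + py**2 + pz**2
    values = (3.0 * pz**2 / r2 - 1.0) / r2**1.5
    # The Gauss-Legendre weights sum to 2, hence the normalization 2 * n_phi.
    return float((weights[:, None] * values).sum() / (2.0 * n_phi))


# Value for a proton fixed at the shell center is 2 / center_distance**3.
averaged = shell_average_dipolar(1.0, 0.3)
at_center = 2.0 / 1.0**3
```

The agreement follows from the mean-value property of harmonic functions, which applies because the irregular solid harmonics are harmonic everywhere except at the origin.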
4.3. Water Relaxation in Semisolid Proteins
4.3.1. General Features of Semisolid Systems
A substantial fraction of all published NMR studies of water in biological systems are concerned, not with isotropic protein solutions, but with semisolid materials of relatively low water content. In this category we find a diverse
collection of materials, including protein fibers and powders, protein crystals, protein gels, biological tissues, and partially frozen protein solutions. Protein fibers and powders hydrated from the vapor phase to less than a monolayer of sorbed water may seem ideal for NMR studies of protein hydration since all water molecules interact strongly with the protein, whereas in protein solutions hydration effects are “diluted” by the dominant bulk water response. The structural, energetic, and dynamic properties of sorbed water, however, are qualitatively different from those of water at a protein surface in solution. Furthermore, dehydration may significantly perturb the native protein structure. While studies of sorbed water may therefore not be directly relevant to hydration in solution, they are nevertheless of importance for a variety of applications in food and materials technology. Protein
crystals and gels typically have water contents of 40% or more and are therefore better models for hydration in protein solutions and biological tissues.
From the point of view of NMR relaxation, the motional-narrowing condition provides a natural demarcation line between semisolids and solutions. In most protein solutions, all orientation-dependent terms in the spin Hamiltonian are averaged to zero by protein tumbling at a rate exceeding the anisotropic coupling frequencies. Under these conditions, the conventional BWR theory of spin relaxation applies (Abragam, 1961). In the semisolid biological materials mentioned
above, the macromolecular component is stationary on this time scale. This has several important consequences. In macroscopically anisotropic systems, incompletely averaged anisotropic couplings may give rise to dipolar or quadrupolar line splittings, the temperature dependence of which can provide information about residence times in the range. Moreover, the relaxation behavior becomes more complex, and richer in information, than in solutions. Relaxation due to relatively fast anisotropic motions becomes orientation dependent and is no longer described by a single spectral density function (as in solutions). Further averaging by slower motions often dominates relaxation. Since the protein molecules are not free to tumble, the actual exchange of internal water molecules (and labile hydrogens) with bulk water can modulate the (residual) couplings, thereby providing direct access to residence times; cf. Eq. (35). If the exchange rates are comparable to the residual couplings, however, relaxation cannot be described by BWR theory. Solutions of large or highly concentrated globular proteins may exhibit such borderline behavior (neither solid nor solution), with overall tumbling rates as well as exchange rates of the same order of magnitude as the residual anisotropic couplings. More importantly, the water relaxation
dispersion in heterogeneous semisolids tends to be dominated by water molecules
with exchange rates comparable to the residual (dipolar or quadrupolar) couplings and then cannot be described by BWR theory. Since water–protein (but not
intraprotein) dipole couplings are modulated by water exchange, cross relaxation (with water as the relaxation sink) can assume much greater importance than in solutions (see Sect. 3.2.3). Since (residual) static dipolar couplings are present, even spin diffusion (in the original sense) can be important for relaxation.
4.3.2. Generalized Relaxation Theory
If protein rotation is sufficiently slow or even inhibited, the correlation time in the spectral density function in Eq. (36) no longer equals the rotational
correlation time If the fractions are small, the mean time during which a water molecule diffuses between two successive visits to a long-lived site
is sufficiently long that, after leaving a given internal site, a water molecule can reach any other site (on the same protein molecule or on a different one) with essentially equal probability. If the semisolid protein sample is macroscopically isotropic or nearly so, as for chemically cross-linked or highly concentrated protein solutions, it then follows that each exchange event brings about complete orientational randomization of the anisotropic quadrupole coupling. Equation (35) is then valid and the residence time becomes the correlation time. Since the residence times of internal water molecules span a wide range, from nanoseconds
to milliseconds at least, the motional-narrowing condition, Eq. (2), can be violated. This happens when is of the order of the inverse rigid–lattice coupling frequency or longer, i.e., about 1 for (see Table 1). Under such conditions, the second-order perturbation treatment inherent in the BWR theory must be replaced by a more general theory, such as the stochastic Liouville equation, where spin dynamics and molecular motion appear at the same level of description. A nonperturbative stochastic theory of spin relaxation by exchange among an isotropic distribution of locally anisotropic sites has recently been developed for quadrupolar nuclei (Halle, 1996) and is directly applicable to NMRD data from chemically cross-linked (Koenig and Brown, 1993; Koenig et al., 1993) or highly concentrated (Kimmich et al., 1990) protein solutions. Since the stochastic Liouville equation can be solved analytically for the isotropic exchange model (Halle, 1996), the entire spin dynamical behavior can be calculated within the low-dimensional spin space rather than in the computationally demanding infinite-dimensional direct-product space usually employed in stochastic Liouville calculations. For the experimentally relevant dilute regime the stochastic theory predicts that the longitudinal relaxation is exponential (as observed) with the relaxation rate obtained from Eq. (3a), but with the spectral density function in Eq. (36) replaced by the generalized spectral density function (Halle, 1996):
where is given by Eq. (37) with (since the locally averaged quadrupole tensor is taken to be uniaxial), and where, for The direct contribution from local motions has been neglected here, but can be added a posteriori if necessary (Halle, 1996). A similar (but not identical) result can be obtained less rigorously with the aid of Eqs. (3) and (32). This is not unexpected, because when the motional-narrowing condition in Eq. (2) coincides with the condition for fast-exchange averaging of local relaxation rates and when (so that the effective quadrupole coupling is sparse) BWR theory is approximately valid even when Eq. (2) is violated. As expected, Eq. (45) reduces to (the first term of) Eq. (36) when the motional-narrowing condition, Eq. (2), is satisfied. It should be noted that Eq. (45) is not subject to any restrictions on the relative magnitudes of and It is instructive to cast Eq. (45) on the form of the motional-narrowing spectral density, Eq. (36), as
with the apparent fraction and the apparent residence time given by (Halle, 1996)
If Eq. (36) is used outside its domain of validity, the internal water fraction and residence time deduced from the dispersion profile are the apparent quantities in Eqs. (47). Equation (47b) shows that if then the apparent residence time deduced from the dispersion profile using motional-narrowing theory, is nothing
but the inverse of the residual quadrupole frequency For deuterons in buried water molecules, the residual quadrupole frequency should be close to (Table 1), while the residence times are expected to span a wide range. Figure 12 shows how deuterons with different residence times
contribute to the magnitude of the dispersion step. The maximum contribution comes from and the relative contribution is reduced by a factor 5 (or 50) when is shifted one (or two) decades away from If there is a distribution of residence times, the relaxation dispersion will thus be dominated by deuterons with residence times near The dispersion profile is therefore expected to show little temperature dependence. It has been demonstrated that these theoretical considerations can account for NMRD data from rotationally immobilized protein samples (Halle and Denisov, 1995). The previous interpretation of these data in terms of a universal residence time of 1
for protein-associated water
molecules (Koenig and Brown, 1993; Koenig et al., 1993; Koenig, 1995) thus appears to be an artifact of using the conventional (fast-exchange) perturbation theory of spin relaxation. In contrast, the nonperturbative, stochastic theory identifies the apparent correlation time of with the inverse of the residual quadrupole frequency, thus explaining its universality (for different proteins) and virtual independence of temperature (Halle and Denisov, 1995). The observed dispersion profiles (Fig. 13) are consistent with a broad distribution of residence times,
spanning the range. These considerations are also relevant for dipolar relaxation in immobilized protein samples and for understanding the origin of
relaxation-based contrast in MRI images of soft tissues. The BWR theory can break down even for protein solutions if the protein tumbles sufficiently slowly. This should be the case for hemocyanin (9 MDa), with an apparent correlation time of 0.9 deduced from the dispersion (Koenig et al., 1975) and with virtually independent of temperature (Piculell and Halle, 1986). Both these observations can be rationalized by the generalized spectral
density function in Eq. (45). Originally, however, the inference that in hemocyanin solutions was taken as an indication that the standard two-state fast-exchange model is inapplicable (Koenig et al., 1975) when, in fact, it implies that the motional-narrowing condition in Eq. (2) is violated.
5. QUANTITATIVE ANALYSIS OF NMRD DATA
Throughout most of this section, we assume that the relaxation rate is due entirely to water nuclei, as is always the case for With obvious modifications, however, most of the discussion applies also to labile hydrogens. Some considerations specific to labile hydrogens are presented in Sect. 5.7.
5.1. Parametrization of the NMRD Profile
For the purpose of analyzing experimental NMRD data, it is convenient to express Eq. (32) in the form
Here, is a normalized dispersion function decreasing monotonically from 1 at the dispersion frequency. Furthermore, is the excess relaxation rate on the high-frequency plateau above the dispersion:
while
measures the magnitude of the dispersion step:
As long as relaxation is exponential, all relaxation rates are linear combinations of spectral densities. The decomposition of the spectral density function in Eq. (36), due to motional time scale separation, then carries over to the intrinsic relaxation rates which may be expressed as
with
in the extreme narrowing regime at all accessible frequencies. We consider first the simplest case where all water nuclei contributing to the dispersion exchange with bulk water rapidly compared to the intrinsic spin relaxation but slowly compared to the (isotropic) protein rotational diffusion. In the quadrupolar case, the (single-exponential) longitudinal relaxation rate is then given by Eq. (48) with the following identifications:
in Eq. (51) is the average of over all sites contributing to the dispersion (in analogy to the definition of and in Eq. (52) is the average of over these sites. Under the stipulated conditions, the measured relaxation dispersion is fully
characterized by the three parameters and as illustrated for a typical dispersion in Fig. 14. The dispersion function in Eq. (53) is commonly known as a Lorentzian dispersion, although it is, in fact, a sum of two Lorentzians. Nevertheless, it can be accurately approximated by the single-Lorentzian dispersion function (Hallenga and Koenig, 1976)
The difference between the normalized dispersion functions in Eqs. (53) and (55)
varies between + 0.013 and –0.016, with the zero crossing at In the case of relaxation, Eq. (48) should be replaced by
where, under fast-exchange conditions, the dispersion function for the intramolecular contribution is given by Eq. (53) and that for the intermolecular contribution by [cf. Eq. (10)]
This dispersion function differs by less than 0.009 (over the full frequency range) from
If the small shift of the dispersion frequency is neglected, Eq. (56) can therefore be cast on the form of Eq. (48) with and
Here, is given by Eq. (51) and by Eq. (52) with replaced by the intramolecular dipole frequency (Table 1)
with the H–H separation in the water molecule. The value in Table 1 was derived from the (libration-corrected) intramolecular second moment of ice Ih, obtained by subtracting the calculated intermolecular moment from the measured second moment (Whalley, 1974). When this value is inserted in Eq. (61), one obtains which is the best available estimate of the intramolecular H–H separation in ice Ih (Kuhs and Lehmann, 1986). Finally, is given by
where the sum runs over all protein protons (i), the outer brackets signify averaging over all internal water protons (k), and the inner brackets signify averaging over any local motions, also taken into account via the intermolecular order parameters as defined in Eq. (42). Cross relaxation would alter the frequency dependence of but, as discussed in Sect. 3.2.3, such contributions are generally negligible.
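The comparison between the two-Lorentzian dispersion and its single-Lorentzian approximation made earlier in this section can be reproduced numerically. The Python sketch below assumes the standard quadrupolar weighting 0.2 J(ω) + 0.8 J(2ω) for the two-Lorentzian function and a single Lorentzian with the frequency rescaled by √3 for the approximation. These explicit forms are assumptions, since Eqs. (53) and (55) are not reproduced here, but they yield the quoted deviations of about +0.013 and −0.016.

```python
import numpy as np


def two_lorentzian_dispersion(x):
    """Normalized dispersion with weights 0.2 and 0.8; x = omega * tau."""
    return 0.2 / (1.0 + x**2) + 0.8 / (1.0 + 4.0 * x**2)


def single_lorentzian_dispersion(x):
    """Single-Lorentzian approximation with the frequency rescaled by sqrt(3)."""
    return 1.0 / (1.0 + 3.0 * x**2)


# Reduced frequency grid spanning six decades around the dispersion.
x = np.logspace(-3, 3, 20001)
diff = two_lorentzian_dispersion(x) - single_lorentzian_dispersion(x)
d_max = float(diff.max())
d_min = float(diff.min())

# Locate the sign change of the difference on the grid.
crossing_index = np.where(np.diff(np.sign(diff)) != 0)[0][0]
x_zero = float(x[crossing_index])
```

For these assumed forms the difference is negative below the crossing and positive above it, and the crossing falls at a reduced frequency of 1/√2.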
5.2. Correlation Time
If all water molecules contributing to the dispersion have residence times such that then the effective correlation time deduced from a fit of Eq. (48) to the NMRD data is simply the rotational correlation time of the protein, as assumed for Eqs. (51)–(54). The assumption that can be checked in several ways. For sufficiently dilute protein solutions, can be estimated from the Debye–Stokes–Einstein relation
with V the hydrodynamic volume of the protein and
the viscosity of the solvent.
This relation is strictly valid only for a protein that behaves as a smooth rigid sphere.
It is common to include a hydration layer in the volume V, but this practice has never been theoretically justified. Bead models have been developed to compute the hydrodynamic properties of real proteins from crystallographic data (Garcia de la Torre et al., 1994; Byron, 1997), thus taking into account surface roughness and nonsphericity. Independent experimental estimates of may also be available, e.g., from relaxation. The assumption that may also be checked by recording NMRD profiles at different temperatures, since should have the same temperature dependence as (provided the protein structure is invariant), whereas the residence time is expected to vary more strongly (Denisov et al., 1996). If, by any of these means, it can be established that then Eq. (35) yields a lower bound for the residence time of any water molecule that contributes
significantly to the dispersion, i.e., the residence time must be much longer than the rotational correlation time of the protein. On the other hand, if the residence
time does not satisfy the inequalities then it can in principle be accurately determined from NMRD data (see Sect. 5.5.2).
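The two estimates discussed in this section can be sketched in Python as follows. The first function evaluates the Debye–Stokes–Einstein relation for a smooth rigid sphere, and the second inverts the rate-additivity relation of Eq. (35) to obtain a residence time from a fitted correlation time. The function names, the sphere radius, and the viscosity value are illustrative assumptions and are not taken from the source.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def debye_stokes_einstein_tau(volume_m3, viscosity_pa_s, temperature_k):
    """Second-rank rotational correlation time of a smooth rigid sphere."""
    return viscosity_pa_s * volume_m3 / (K_B * temperature_k)


def residence_time_from_fit(tau_c, tau_rot):
    """Invert 1/tau_c = 1/tau_rot + 1/tau_res for the residence time.

    Returns infinity when the fitted correlation time equals the rotational
    one, in which case only a lower bound on the residence time is available.
    """
    if tau_c > tau_rot:
        raise ValueError("fitted correlation time cannot exceed tau_rot")
    if tau_c == tau_rot:
        return math.inf
    return 1.0 / (1.0 / tau_c - 1.0 / tau_rot)


# Hypothetical protein: sphere of radius 1.6 nm in water at 298.15 K,
# with an assumed solvent viscosity of 0.89 mPa s.
radius = 1.6e-9
volume = 4.0 / 3.0 * math.pi * radius**3
tau_rot = debye_stokes_einstein_tau(volume, 0.89e-3, 298.15)

# A fitted correlation time 20% shorter than tau_rot implies a residence
# time four times longer than tau_rot.
tau_res = residence_time_from_fit(0.8 * tau_rot, tau_rot)
```

When the fitted correlation time is indistinguishable from the rotational one within experimental error, the inversion becomes ill-conditioned, which is why only a lower bound can be quoted in that case.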
5.3. Dispersion Amplitude
According to Eq. (52), the dispersion amplitude parameter contains information about the number of rapidly exchanging internal water molecules with residence times obeying and about their orientational order. It is convenient to express the internal water fraction as the ratio of the number of internal water molecules contributing to the dispersion to the total number of water molecules in the solution, both on a per-protein basis. The latter can be obtained from the protein concentration and the molecular weights of protein and (isotope-labeled) water. For a quantitatively reliable analysis of the dispersion amplitude, an accurate determination of the protein concentration in the NMR sample is essential. This is particularly important in difference NMRD experiments (see Sect. 4.1.2). Whereas uncertainty in extinction coefficients usually limits the accuracy of spectrophotometrically determined protein concentrations to ca. 5% (Gill and von Hippel, 1989), chromatographic analysis of the entire amino acid content of a hydrolyzed aliquot of the protein solution can give protein concentrations to ca. 2% relative accuracy.
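As a rough illustration of how the total number of water molecules per protein follows from the sample composition, and of how an error in the protein concentration propagates, consider the Python sketch below. It assumes a two-component sample specified by the protein mass fraction. The function name, the 14.3 kDa molar mass, and the 5 wt% composition are hypothetical values chosen for the example.

```python
def waters_per_protein(protein_mass_fraction, protein_molar_mass,
                       water_molar_mass=18.015):
    """Moles of water per mole of protein in a two-component sample.

    Molar masses are in g/mol; pass the molar mass of the isotope-labeled
    water actually used if it differs from that of ordinary water.
    """
    moles_protein = protein_mass_fraction / protein_molar_mass
    moles_water = (1.0 - protein_mass_fraction) / water_molar_mass
    return moles_water / moles_protein


# Hypothetical sample: 5 wt% of a 14.3 kDa protein in ordinary water.
n_total = waters_per_protein(0.05, 14300.0)

# A 5% overestimate of the protein content lowers the result by about 5%,
# and any internal water count derived from it shifts by the same factor.
n_total_biased = waters_per_protein(0.05 * 1.05, 14300.0)
relative_shift = n_total_biased / n_total - 1.0
```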
5.3.1. Quadrupole Coupling Constants
The water and quadrupole frequencies given in Table 1 refer to the rigid-lattice limit of ice Ih. The use of ice values seems a priori justified at least for extensively hydrogen-bonded internal water molecules and is supported by detailed and NMRD studies of the singly buried water molecule W122 in BPTI (Denisov et al., 1995, 1996, 1997b). A wealth of solid-state NMR and NQR data on and QCCs in crystal hydrates and different ice polymorphs (Berglund et al., 1978; Poplett, 1982) as well as large-basis-set quantum-chemical calculations on molecular clusters (Halle and Wennerström, 1981b; Cummins et al., 1985, 1987; Eggenberger et al., 1992, 1993; Ludwig et al., 1995) have established correlations between the QCCs and the geometry of hydrogen bonding (or ion coordination). Judging from such data, the QCC variation among different internal water molecules should be small. In particular, the QCC ratio is nearly invariant at 30.5 ± 1.5 in a variety of hydrogen-bonded solids (Poplett, 1982). Even for water molecules coordinated to ions, such as (Halle and Wennerström, 1981b), (Denisov and Halle, 1995c), and (Thomann et al., 1995), and for water adsorbed on NaX zeolite (Resing, 1976), the QCCs seem to differ little from the ice Ih value. In bulk water, however, the QCCs are 20%–25% larger than in ice Ih but virtually independent of temperature (van der Maarel et al., 1985, 1986;
Struis et al., 1987; Ludwig et al., 1995). These larger values are probably more appropriate for water molecules at the protein surface than for internal water molecules.
With knowledge about the protein concentration and the rigid-lattice coupling the value derived from the NMRD profile can be used to calculate the quantity
where is the mean-square generalized order parameter for the internal water molecules responsible for the relaxation dispersion. Furthermore, denotes either the quadrupole frequency in Eq. (4) or the intramolecular dipole frequency in Eq. (61). In the 1H case, the value obtained from Eqs. (60) and (62) should be used with Eq. (64). The available NMRD data from protein solutions suggest that is in the range 0.5–1.0 for buried water molecules. Since, by definition, cannot exceed 1, the quantity provides a lower bound for the number of long-lived internal water molecules in the protein. The actual number of long-lived
water molecules will be larger if
and/or if not all water molecules exchange rapidly with bulk water (see Sect. 5.5). On the other hand, if is known, as might be the case in a difference NMRD experiment, then SI can be obtained directly. If labile-hydrogen contributions (see Sect. 5.7) and intermediate-exchange effects (see Sect. 5.5) can be excluded (or corrected for), then the number should be the same for all three water nuclei. The ratio of the values derived from, say, the and dispersions then yields directly the ratio of the corresponding generalized order parameters, providing information about orientational disorder of internal water molecules.
5.3.2. Libration Amplitudes
The generalized order parameters and describe the effect on the relaxation dispersion of any reorientational motion of buried water molecules that is fast compared to the isotropic tumbling of the protein. Since the nuclear interaction tensors have different orientations with respect to the water molecule (Fig. 11), the three generalized order parameters provide independent information about the internal motion. This information is contained in the second-rank orientational order parameters in Eq. (38). To obtain a quantitative measure of the degree and anisotropy of orientational disorder, these order parameters can be translated into motional amplitudes with the aid of a model. In the anisotropic harmonic libration (AHL) model (Denisov et al., 1997b), the fast local motions are modeled in terms of three independent symmetric libration modes: (i) the rocking of the water molecule around an axis (x) perpendicular to the molecular plane, (ii)
the wagging of the water molecule around an axis (y) parallel to the H–H vector,
and (iii) the twisting of the water molecule around its axis (see Fig. 11). In addition, the possibility of a fast 180° flip around the axis is included. In the AHL model, the angular variables are the libration angles , and for the rock, wag, and twist modes, respectively. The order parameters , can be expressed in terms of these variables as
where the angular brackets denote averages over the appropriate equilibrium distribution Due to the noncommutability of finite rotations, the order parameters in the AHL model depend on the order in which the rotations are applied. [The result in Eq. (65) corresponds to the order
first and
last.] For the libration
amplitudes of interest, however, this dependence is very weak and can be neglected.
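The weak dependence on the order of rotations can be made concrete with a small Python sketch that composes rotations about two perpendicular axes in both orders. It uses ordinary 3 × 3 rotation matrices rather than the second-rank quantities of the AHL model, and the function names and angles are illustrative assumptions. The pointwise difference between the two orderings is of second order in the angles, and for these matrices it cancels on averaging over symmetric angle distributions.

```python
import numpy as np


def rot_x(angle):
    """Rotation matrix for a rotation by angle (radians) about the x axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def rot_y(angle):
    """Rotation matrix for a rotation by angle (radians) about the y axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def order_difference(a, b):
    """Difference between applying the two rotations in opposite orders."""
    return rot_x(a) @ rot_y(b) - rot_y(b) @ rot_x(a)


# For small angles the Frobenius norm is close to sqrt(2) * a * b.
a = b = 0.1
pointwise = np.linalg.norm(order_difference(a, b))
leading_term = np.sqrt(2.0) * a * b

# Averaging over a symmetric set of angles removes the difference entirely.
symmetric_sum = sum(order_difference(sa * 0.2, sb * 0.2)
                    for sa in (-1, 1) for sb in (-1, 1))
```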
On account of the symmetry of the libration modes, there are only 5 (rather than 25) independent order parameters, namely
In the presence of a
flip, the order parameters
must also reflect the
symmetry of the water molecule, which requires p to be even. The only effect of the flip is thus to make In the AHL model, the five order parameters in Eq. (66) are not independent
since they are all determined by the rms amplitudes
of the three libration
modes. The orientational distribution function for each mode is of the form
This distribution is normalized on the unrestricted interval rather than on The error introduced by this approximation is negligible for the libration amplitudes of interest (say, For the Gaussian distribution in Eq. (67), the five order parameters in Eq. (66) can be expressed in terms of the orientational averages:
with
and 2. Figure 15 shows the effect of each libration mode on the generalized order parameters. Some general observations can be made: (i) is most affected by the twist mode; (ii) is unaffected by the wag mode and is equally sensitive to rock and twist librations; and (iii) only is affected by the flip. Since a fast flip can reduce by as much as a factor 2.7 (in the absence of librational averaging), a comparison of the and dispersion amplitudes may help to diagnose this type of motion. In general, all three libration modes will be more or less excited. The preceding relations are valid for this general case and can be numerically inverted to obtain the three libration amplitudes from the
experimentally determined generalized order parameters. This strategy has recently
been implemented for several buried water molecules in BPTI and two single-point mutants (Denisov et al., 1997b). While one of these (W122) is as ordered as a water molecule in ice Ih, the others are more disordered. Converting the libration amplitudes to rotational entropies, one finds that the three extensively hydrogen-bonded buried water molecules in the Y35G mutant have a configurational entropy comparable to that of bulk water (Denisov et al., 1997b). This result clearly
challenges the conventional wisdom that bound water is highly ordered and suggests that the hydration of nonpolar cavities (Otting et al., 1997) may actually be entropically driven.
5.4. High-Frequency Plateau
According to Eq. (51), the high-frequency excess relaxation rate contains contributions from reorientation and/or exchange of mobile surface waters and from local motions of internal water molecules. The latter contribution is usually negligible since and since is generally smaller than While this is certainly the case for subpicosecond librational motions, 180° flips of internal water molecules around the (dipole) axis can make a small but significant contribution to (Denisov and Halle, 1995c). For symmetry reasons, the flip does not contribute to If the flip is slow compared to protein tumbling, not even the relaxation can be affected, since the anisotropic quadrupole coupling then has been averaged to zero before any flips have occurred. The largest flip contribution can be expected when is close to For large proteins (long , water flips in the 1–10 ns range may actually produce an
observable secondary
dispersion step at higher frequencies. Usually, however,
the principal effect of water flips is not the small contribution to but the strong attenuation of (see Sect. 5.3.2). By definition, the contribution refers to water molecules in the extremenarrowing limit at all accessible frequencies. The relaxation rate is therefore proportional to an effective correlation time reflecting more or less restricted local rotation and/or exchange with bulk water. If the QCC is taken to be the same as for bulk water (see Sect. 5.3.1), we thus have If the second term in Eq. (51) can be neglected, we can use the relation andthe known values of and (directly measured on a reference water sample at the same temperature and isotopic composition as the protein solution) to calculate the quantity
where is the average correlation time for the water molecules at the protein surface. From relaxation studies of water in contact with various interfaces, it is
known that the dynamic perturbation is essentially confined to water molecules in direct contact with the surface (Woessner, 1980; Carlström and Halle, 1988; Volke et al., 1994). That this is also the case for proteins is suggested by molecular dynamics simulations (Brunne et al., 1993; Garcia and Stiller, 1993; Lounnas and Pettitt, 1994; Abseher et al., 1996; Rocchi et al., 1997; Kovacs et al., 1997). It is therefore reasonable to estimate for a monolayer, e.g., using the solvent-accessible surface area of the protein (as computed from crystallographic data) and a molecular area of 15 Å² per water molecule. This leads to a dynamic retardation factor of 5–7 for most investigated native globular proteins (Denisov and Halle, 1996). Somewhat larger values for a few proteins, such as trypsin and BSA, may be attributed to local motions within clusters of buried water molecules, as represented by the second term in Eq. (51). Although is known, it is useful to quote the ratio (rather than since the ratio depends neither on the tensorial rank of the interaction that induces relaxation (at least for a rotational diffusion model) nor on the isotopic composition of the water (fractionation factors are close to 1). To obtain an estimate for the time taken for a surface water molecule to rotate through one radian, may be multiplied by the (first-rank) dielectric relaxation time of bulk water, ca. 8 ps at 298 K. For typical globular proteins, one thus obtains values of order 50 ps (at 300 K). Being an arithmetic average over all surface waters, this value is biased toward the longer times in the (probably wide) distribution and may be markedly affected by a few “outliers.” Since both rotation and translation of exposed water molecules at the protein surface should be rate-limited by hydrogen-bond disruption, the 50-ps estimate also gives an indication of the average residence time of surface waters.
5.5. NMRD Time Scales
5.5.1. NMRD Windows
For an internal water molecule to contribute fully to the entire relaxation dispersion, its residence time must be long compared to the rotational correlation time of the protein but short compared to the zero-frequency intrinsic relaxation time, If the local motion contribution to is ignored, these conditions can be expressed as
which may be said to define the “NMRD window” on residence times. Of course, water molecules that do not satisfy Eq. (70) may still contribute to the dispersion,
but do so with less than the maximum contribution Using Eqs. (3a), (32), (35), and (36), we can express the relative dispersion step as which becomes 1 when Eq. (70) is obeyed. This quantity is plotted as a function of in Fig. 16 for all three water nuclei, with
and 100 ns, and from Table 1. (For has been increased by 30% to take intermolecular dipole couplings into account.) Due to the different rigid-lattice coupling frequencies, the intrinsic relaxation time is three orders of magnitude shorter for than for with falling in between (Table 1). The consequent variation of the width of the NMRD windows implies, for example, that some internal water molecules may give a large relative contribution to the dispersion but only a small one to the dispersion. For small to medium-sized proteins, with typically 5–10 ns, such differential window effects are important for residence times longer than a few 100 ns and must be taken into account when comparing values for different nuclei. It is also clear from Fig.
16a that although a large protein (100 kDa, say) may contain numerous buried water molecules, these will only contribute partially to the dispersion. The edge of the NMRD windows is due to the competition of protein rotation and water exchange in orientationally averaging the anisotropic coupling, as expressed by Eq. (35). This is a pure correlation time effect and does not affect the value.
5.5.2. Water Residence Time
For water molecules on the central plateau of the NMRD window, only lower and upper bounds on the residence time can be established, as expressed in Eq. (70). On the wide flanks of the NMRD window, however, can be accurately determined. On the flank, this requires independent information about (see
Sect. 5.2). Using this strategy, the residence time of water molecules in the narrow minor groove of a B-DNA dodecamer was recently determined to be ns at 277 K (Denisov et al., 1997a) and ns at 253 K (Jóhannesson and Halle, 1998). Relatively short residence times, 5–10 ns at 300 K, have also been obtained for water molecules residing in deep surface pockets in ribonuclease A (Denisov and Halle, 1998) and ribonuclease T1 (Langhorst et al., 1999). Longer residence times can be determined by traversing the flank of the NMRD window as the temperature is varied. This is possible even within the restricted temperature range available with protein solutions, since long residence times are usually associated with high (apparent) activation enthalpies. Furthermore, with decreasing temperature we not only move to the right on the in Fig. 16, but the edge of the NMRD window is also shifted to the left since increases (this actually shrinks the NMRD window from both sides). Due to the frequency dependence of the intrinsic relaxation rate the fast-exchange condition may be more strongly violated at low frequencies than at high frequencies. Since the dispersion is then more strongly attenuated at lower frequencies, the shape of the dispersion profile is affected. Provided that all water molecules contributing to the dispersion have the same residence time the Lorentzian form of Eq. (53) remains valid to an excellent approximation, but the dispersion is shifted to higher frequency (shorter and the dispersion amplitude
parameter is reduced. To show this, we return to Eq. (32), make use of the decomposition in Eq. (50), and carry out some rearrangements using the (excellent) approximation in Eq. (55). The result is again of the form of Eq. (48), but with in Eqs. (48) and (53) replaced by the effective correlation time
and
in Eq. (48) replaced by the effective amplitude parameter
If the local motion contribution is in the fast-exchange limit as is usually the case, Eqs. (71) and (72) reduce to
where, in general, is given by Eq. (35). If NMRD profiles are recorded at a series of temperatures where the flank of the NMRD window is traversed, the residence time and its activation parameters can be determined from the variation of with temperature, as described by Eq. (73) and a suitable parametrization of (T). (The temperature dependence of is usually known; cf. Sect. 5.2.) The activation parameters are particularly valuable as they provide insight into the mechanism (usually large-scale fluctuations of the protein structure) whereby a buried water molecule escapes from within a protein. The residence time can actually be obtained (at one temperature) without assuming a functional form for For example, at the temperature Eq. (73) yields (Often, when is outside the fast-exchange limit.) As an illustration of this approach, Fig. 17 shows the temperature dependence of deduced from and difference dispersions (see Sect. 4.1.2) isolating the contribution from the single buried water molecule W122 in BPTI. A joint fit to the two curves in Fig. 17 yielded a residence time at 300 K and an apparent activation enthalpy (Denisov et al., 1996). The temperature shift between the and curves is quantitatively accounted for by the different quadrupole frequencies of the two nuclei (Table 1).
5.6. Stretched Dispersions
The Lorentzian dispersion function is the fastest decaying function that can result from diffusive (overdamped) molecular motions. On the other hand, experimental dispersion profiles are sometimes found to be more extended than predicted
by Eq. (53). At least three factors can contribute to such dispersion stretching: (i) anisotropic protein rotation, (ii) protein–protein interactions, and (iii) a distribution of residence times extending into either or both flanks of the NMRD window. Depending on the circumstances, these effects can shift the dispersion to higher or lower frequency and/or stretch it over a wider frequency range. While it is straightforward to incorporate the effect of anisotropic rotational diffusion of the protein on the spectral density function, especially in the limit of rigid binding (Woessner, 1962), this generalization not only introduces one or two additional rotational diffusion coefficients as parameters but also requires information (available from high-resolution neutron diffraction data for a few proteins) about the orientation of all contributing internal water molecules (and
labile hydrogens) relative to the principal frame of the rotational diffusion tensor. In practice, this mechanism of dispersion stretching is probably unimportant for most globular proteins (aspect ratio (Denisov and Halle, 1995a). In concentrated solutions, protein–protein interactions may affect the relaxation dispersion. The hydrodynamic interference between nearby protein molecules retards their rotation to some extent; to first order the rotational diffusion coefficient is reduced by a factor at a protein volume fraction (Landau and Lifshitz, 1959; Montgomery and Berne, 1977), but the Lorentzian form of the spectral
density function is not significantly affected (Montgomery and Berne, 1977; Wolynes and Deutch, 1977). Direct interactions (electrostatic, van der Waals, and short-ranged), however, can induce a microscopically heterogeneous solution structure. Little is known about such heterogeneities apart from a few cases of specific association at the dimer or oligomer level. If internal water molecules (or labile hydrogens) experience different local environments on a time scale short compared
to their spin relaxation times, then the observed relaxation dispersion will be a superposition of Lorentzian dispersions characterized by different rotational correlation times In the case of tight association, also the parameters and could vary. Large-scale heterogeneities that are not sampled on the relaxation time scale would give rise to multiexponential relaxation, but this has not been observed in protein solutions. Most proteins contain several internal water molecules, presumably with different residence times. Unless all residence times happen to fall on the central plateau of the NMRD window (Fig. 16), the Lorentzian dispersion term in Eq. (48) should be replaced by a sum over all contributing internal water molecules, i.e., the relaxation dispersion should be a weighted sum of Lorentzian dispersion functions with different (apparent) correlation times. If some residence times are not much longer than the rotational correlation time of the protein, Eq. (35) must be used. Provided that all contributing water molecules are in the fast-exchange limit, and are still given by Eqs. (51) and (52), but in Eq. (48) we must make the replacement
with as in Eq. (35) and the normalized amplitude factors In the event that all contributing internal water molecules have the same residence time the dispersion is Lorentzian but shifted to higher frequency, with an effective correlation time If or if and are comparable and is known, the residence time can thus be obtained directly from the dispersion. For the quadrupolar water nuclei, where Larmor frequencies above 100 MHz cannot be accessed, the shortest residence time that can be determined in this way is about 1 ns. If the fast-exchange limit is not applicable for all contributing water molecules, the dispersion can again become stretched (even if all In Eq. (48), we must then make the replacement
where
are given by Eqs. (71) and (72) with and as in Eq. (35). This mechanism for stretching and shifting the dispersion (to higher frequency) is particularly important for relaxation, where a large number of labile protons in intermediate exchange can contribute significantly to the dispersion (Denisov et al., 1997a). Stretched dispersions should also be more common for very large proteins: when is about 100 ns or longer, even the NMRD window does not exhibit a plateau region (Fig.
16a), in which case internal water molecules with different residence times will also have different effective correlation times. For relaxation, an additional complication may arise in the intermediate exchange regime in that the intrinsic relaxation
behavior may be slightly nonexponential (see Sect. 3.1.2). Traditionally, stretched dielectric and magnetic relaxation dispersions (and broad minima) have been accounted for in terms of empirical correlation time distributions (Yager, 1936; Connor, 1964). In connection with water NMRD studies of protein solutions and other aqueous biological systems, a lognormal distribution was favored initially (Blicharska et al., 1970; Kimmich and Noack, 1970a), but in
the past two decades most authors have used a so-called Cole–Cole dispersion for fitting stretched dispersions (Hallenga and Koenig, 1976). The original Cole–Cole dispersion function was used to describe dielectric dispersion data (Cole and Cole, 1941) and can be inverted to yield a particular correlation time distribution (Fuoss and Kirkwood, 1941). When this dispersion function was modified (Hallenga and Koenig, 1976) so as to be dimensionally commensurate with the real part of the spectral density function (which governs nuclear spin relaxation), its physical meaning was lost. In fact, it can be shown that the modified Cole–Cole dispersion does not correspond to any correlation time distribution (Halle et al., 1998). The significance of the effective correlation time extracted from a fit of the modified Cole–Cole dispersion to stretched NMRD data is therefore somewhat obscure. By inverting the Fourier transform in Eq. (5) and setting it follows that
The frequency integral of the modified Cole–Cole dispersion, however, exhibits an
unphysical divergence. A rigorous procedure has recently been developed for analyzing stretched NMRD profiles without the bias of an arbitrarily imposed
correlation time distribution (Halle et al., 1998). This model-free approach allows a separation of the static and dynamic information content of the dispersion data.
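The stretching produced by a weighted sum of Lorentzian dispersion functions with different correlation times can be illustrated numerically. The following is a minimal sketch, not the analysis procedure of the cited work: it uses only the normalized Lorentzian 1/(1 + (ωτ)²) rather than the full combination of spectral densities entering Eq. (48), and the function names, weights, and correlation times are hypothetical values chosen for illustration. It measures how many frequency decades the profile needs to fall from 90% to 10% of its amplitude.

```python
import math

def lorentzian_sum(omega, weights, taus):
    """Weighted sum of normalized Lorentzians; equals 1 at omega = 0."""
    total = sum(weights)
    return sum(w / (1.0 + (omega * t) ** 2) for w, t in zip(weights, taus)) / total

def crossing(level, weights, taus, lo=1e3, hi=1e13):
    """Angular frequency (rad/s) where the profile falls to `level`.

    Bisection on a logarithmic frequency axis; valid because the
    profile decreases monotonically with frequency.
    """
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if lorentzian_sum(mid, weights, taus) > level:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def width_decades(weights, taus):
    """Number of frequency decades between the 90% and 10% points."""
    return math.log10(crossing(0.1, weights, taus) / crossing(0.9, weights, taus))

# Hypothetical inputs: one correlation time versus two equally weighted ones.
single = width_decades([1.0], [5e-9])
mixed = width_decades([0.5, 0.5], [1e-9, 20e-9])
print(f"single Lorentzian: {single:.2f} decades")
print(f"two-component sum: {mixed:.2f} decades")
```

For a single Lorentzian the 90% and 10% points lie at ωτ = 1/3 and ωτ = 3, so the width is log10(9) decades regardless of τ; any wider profile signals a spread of effective correlation times.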
5.7. Labile Hydrogens
Exchange averaging of macromolecular and water hydrogens is a potential pitfall in all water and relaxation work. Failure to appreciate this point has led to even qualitatively incorrect conclusions about hydration behavior. Well-documented cases include a study of poly(methacrylic acid) (Glasel, 1970) and a recent study of an oligonucleotide (Zhou and Bryant, 1996). In both cases, subsequent studies revealed that the relaxation effects that had been attributed to hydration water were entirely due to labile hydrogens (Halle and
Piculell, 1982; Denisov et al., 1997a).
The labile hydrogen contribution to NMRD data from protein solutions has been characterized in greatest detail for BPTI. By recording and NMRD
profiles over a wide pH range (Fig. 18), the labile hydrogen contribution could be isolated and quantitatively accounted for in terms of known values and hydrogen exchange rate constants and intrinsic relaxation times of the expected magnitude (Denisov and Halle, 1995b). Since the intrinsic relaxation times of labile hydrogens are at least an order of magnitude longer for than for a larger fraction of the labile hydrogens contribute to the dispersion (Venu et al., 1997).
For BPTI, the labile proton contribution appears to dominate over the buried water contribution even at pH 7, where labile protons were previously thought to exchange too slowly to contribute to the dispersion (Koenig and Schillinger, 1969). While hydrogen exchange is a serious complication in and NMRD studies of protein hydration, it can also be used constructively to study side-chain
dynamics (via the intrinsic relaxation rates) and fast hydrogen exchange rates (not readily accessible with high-resolution techniques). More direct access to fast
proton exchange kinetics is provided by the CSM contribution to the transverse
relaxation rate (see Sect. 3.3.1). The CSM contribution usually dominates over the dipolar contribution to at frequencies of and increases strongly at higher frequencies since the chemical shifts are proportional to the magnetic field (Fig. 19). Most labile protons have chemical shifts of 1–5 ppm from the bulk water resonance. Even at moderate fields, therefore, is much larger than typical intrinsic relaxation rates of According to Eq. (29), which then applies, a given type of proton gives a maximum CSM contribution at a pH value where the (acid and base catalyzed) exchange rate matches the shift difference This gives rise to characteristic maxima in the pH dependence of (Fig. 19), which help to separate the contributions from different types of labile protons. If the chemical shifts are known, e.g., from high-resolution studies under conditions of slow exchange, a complete separation can possibly be achieved from CPMG dispersions over a wide pH range (analogous to the NMRD data in Fig. 18), perhaps including data at several fields.
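The matching condition between exchange rate and shift difference can be made concrete with a small numerical sketch. It assumes the standard dilute-site expression for the exchange contribution to the transverse relaxation rate with intrinsic relaxation neglected (the limit discussed by Swift and Connick, 1962), which may differ in notation or detail from Eq. (29). The function names and the labile-proton fraction are hypothetical and serve only as illustration.

```python
import math

def shift_difference(delta_ppm, larmor_mhz):
    """Shift difference in rad/s; ppm times MHz gives Hz."""
    return 2.0 * math.pi * delta_ppm * larmor_mhz

def csm_rate(fraction, k_ex, delta_omega):
    """Exchange contribution to R2 in s^-1.

    Assumed form: dilute exchanging site, intrinsic relaxation of the
    labile proton neglected relative to the shift difference.
    """
    return fraction * k_ex * delta_omega ** 2 / (k_ex ** 2 + delta_omega ** 2)

FRACTION = 1.0e-4  # hypothetical fraction of protons in the labile site

for larmor_mhz in (100.0, 500.0):
    dw = shift_difference(3.0, larmor_mhz)  # 3 ppm, within the 1-5 ppm range
    for scale in (0.1, 1.0, 10.0):
        rate = csm_rate(FRACTION, scale * dw, dw)
        print(f"{larmor_mhz:5.0f} MHz  k_ex = {scale:4.1f} x shift difference  "
              f"rate = {rate:.4f} s^-1")
```

In this assumed form the contribution peaks when the exchange rate equals the shift difference, where it equals half the product of the fraction and the shift difference, so the peak height grows in proportion to the field.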
6. OUTLOOK
Although water NMRD has been applied to protein solutions for nearly three decades, it is only in the last few years that this technique has matured to the stage where it can make significant contributions to protein science. At present, multinuclear NMRD and high-resolution NOE spectroscopy are the two most powerful NMR methods available for probing protein–water interactions in solution. The information provided by these two techniques is largely complementary. While NMRD has unsurpassed temporal resolution by its ability to map out the spectral
density function in the kHz–GHz range, NOE spectroscopy provides spatial resolution by spectral assignments that can establish the proximity of water molecules to specific protein protons. Although the water relaxation rate measured in an NMRD experiment reflects all rapidly exchanging water molecules in the sample, the frequency dependence separates the contributions from the few long-lived
(biologically interesting) water molecules and the many short-lived ones. Moreover, the location of long-lived water molecules can be established by difference NMRD experiments and with recourse to high-resolution crystal structures. (The
water NOE method also relies on extrinsic structural information to convert chemical shifts into spatial coordinates and to distinguish water NOEs from chemically relayed NOEs.) While the water NOE method has so far been applied only to
solutions of small and medium-sized proteins (up to 22 kDa), the NMRD method is also applicable to very large proteins, subzero temperatures, and semisolid
samples. Labile proton exchange is a serious problem in NMRD as well as in water NOE spectroscopy (cross peaks from direct water NOEs cannot be distinguished from proton-exchange relayed NOEs and may be obscured by intense exchange cross peaks). Oxygen-17 relaxation, however, invariably reports on water molecules. The NMRD and NOE methods will undoubtedly continue to develop in ways that will allow a more detailed structural and dynamic characterization of water
molecules interacting with proteins and will remove some of the present methodological limitations. The ultimate goal is of course to combine the temporal resolution of NMRD with the spatial resolution of multidimensional high-field spectroscopy. The development of FFC instruments with high-field cryomagnets represents a step in this direction. For semisolid protein samples, such as biological tissues, the NMRD approach might be extended in several respects by employing more sophisticated pulse schemes, polarization transfer, and relaxation anisotropy. Building on recent advances in the study of protein hydration in solution, a
quantitative understanding of the molecular basis of relaxation-based contrast in soft-tissue imaging should also be within reach.
REFERENCES
Abragam, A., 1961, The Principles of Nuclear Magnetism, Clarendon Press, Oxford.
Abseher, R., Schreiber, H., and Steinhauser, O., 1996, Proteins 25:366. Akasaka, K., 1979, J. Magn. Reson. 36:135. Alder, F., and Yu, F. C., 1951, Phys. Rev. 81:1067. Allerhand, A., and Thiele, E., 1966, J. Chem. Phys. 45:902.
Anderson, A. G., and Redfield, A. G., 1959, Phys. Rev. 116:583. Baguet, E., Chapman, B. E., Torres, A. M., and Kuchel, P. W., 1996, J. Magn. Reson. B 111:1. Balazs, E. A., Bothner-By, A. A., and Gergely, J., 1959, J. Mol. Biol. 1:147–154. Beckert, D., and Pfeifer, H., 1965, Ann. Phys. 16:262. Berglund, B., Lindgren, J., and Tegenfeldt, J., 1978, J. Mol. Struct. 43:179. Blicharska, B., Florkowski, Z., Hennel, J. W., Held, G., and Noack, F., 1970, Biochim. Biophys. Acta 207:381. Brunne, R. M., Liepinsh, E., Otting, G., Wüthrich, K., and van Gunsteren, W. F., 1993, J. Mol. Biol. 231:1040.
Brüssau, R. G., and Sillescu, H., 1972, Ber. Bunsenges. Phys. Chem. 76:31. Bull, T. E., Forsén, S., and Turner, D. L., 1979, J. Chem. Phys. 70:3106. Byron, O., 1997, Biophys. J. 72:408. Carlström, G., and Halle, B., 1988, Langmuir 4:1346. Chiu, Y., 1964, J. Math. Phys. 5:283. Chung, C.-W., and Wimperis, S., 1992, Mol. Phys. 76:47.
Civan, M. M., and Shporer, M., 1972, Biophys. J. 12:404. Cole, K. S., and Cole, R. H., 1941, J. Chem. Phys. 9:341. Connor, T. M., 1964, Trans. Faraday Soc. 60:1574. Conti, S., 1986, Mol. Phys. 59:449. Cummins, P. L., Bacskay, G. B., and Hush, N. S., 1987, Mol. Phys. 61:795. Cummins, P. L., Bacskay, G. B., Hush, N. S., Halle, B., and Engström, S., 1985, J. Chem. Phys. 82:2002. Daszkiewicz, O. K., Hennel, J. W., Lubas, B., and Szczepkowski, T. W., 1963, Nature 200:1006. Denisov, V. P., Carlström, G., Venu, K., and Halle, B., 1997a, J. Mol. Biol. 268:118. Denisov, V. P., and Halle, B., 1994, J. Am. Chem. Soc. 116:10324. Denisov, V. P., and Halle, B., 1995a, J. Mol. Biol. 245:682. Denisov, V. P., and Halle, B., 1995b, J. Mol. Biol. 245:698. Denisov, V. P., and Halle, B., 1995c, J. Am. Chem. Soc. 117:8456. Denisov, V. P., and Halle, B., 1996, Faraday Discuss. 103:227. Denisov, V. P., and Halle, B., 1998, Biochemistry 37:9595.
Denisov, V. P., Halle, B., Peters, J., and Hörlein, H. D., 1995, Biochemistry 34:9046. Denisov, V. P., Jonsson, B.-H., and Halle, B., 1999, Nature Struct. Biol. 6:253. Denisov, V. P., Peters, J., Hörlein, H. D., and Halle, B., 1996, Nature Struct. Biol. 3:505.
Denisov, V. P., Venu, K., Peters, J., Hörlein, H. D., and Halle, B., 1997b, J. Phys. Chem. B 101:9380. Edmonds, D. T., and Mackay, A. L., 1975, J. Magn. Reson. 20:515. Edmonds, D. T., and Zussman, A., 1972, Phys. Lett. 41A:167. Edzes, H. T., and Samulski, E. T., 1977, Nature 265:521. Edzes, H. T., and Samulski, E. T., 1978, J. Magn. Reson. 31:207. Eggenberger, R., Gerber, S., Huber, H., Searles, D., and Welker, M., 1992, J. Chem. Phys. 97:5898.
Eggenberger, R., Gerber, S., Huber, H., Searles, D., and Welker, M., 1993, Mol. Phys. 80:1177. Eigen, M., 1964, Angew. Chemie (Int. Ed.) 3:1. Eliav, U., Shinar, H., and Navon, G., 1991, J. Magn. Reson. 94:439. Flesche, C. W., Gruwel, M. L. H., Deussen, A., and Schrader, J., 1995, Biochim. Biophys. Acta 1244:253. Florin, A. E., and Alei, M., 1967, J. Chem. Phys. 47:4268. Florkowski, Z., Hennel, J. W., and Blicharska, B., 1969, Nukleonika (Engl. transl.) 14:9. Fuoss, R. M., and Kirkwood, J. G., 1941, J. Am. Chem. Soc. 63:385. Furó, I., and Halle, B., 1995, Phys. Rev. E 51:466. Garcia, A. E., and Stiller, L., 1993, J. Comput. Chem. 14:1396. Garcia de la Torre, J., Navarro, S., Lopez Martinez, M. C., Diaz, F. G., and Lopez Cascales, J. J., 1994, Biophys. J. 67:530. Gill, S. C., and von Hippel, P. H., 1989, Anal. Biochem. 182:319. Glasel, J. A., 1968, Nature 218:953. Glasel, J. A., 1970, J. Am. Chem. Soc. 92:375. Grösch, L., and Noack, F., 1976, Biochim. Biophys. Acta 453:218. Halle, B., 1996, Prog. NMR Spectrosc. 28:137.
Halle, B., Andersson, T., Forsén, S., and Lindman, B., 1981, J. Am. Chem. Soc. 103:500.
Halle, B., and Denisov, V. P., 1995, Biophys. J. 69:242. Halle, B., Jóhannesson, H., and Venu, K., 1998, J. Magn. Reson. 135:1. Halle, B., and Karlström, G., 1983, J. Chem. Soc., Faraday Trans. 2 79:1031. Halle, B., and Piculell, L., 1982, J. Chem. Soc., Faraday Trans. 1 78:255. Halle, B., and Wennerström, H., 1981a, J. Magn. Reson. 44:89.
Halle, B., and Wennerström, H., 1981b, J. Chem. Phys. 75:1928. Halle, B., and Westlund, P.-O., 1988, Mol. Phys. 63:97. Hallenga, K., and Koenig, S. H., 1976, Biochemistry 15:4255. Hausser, R., and Noack, F., 1964, Z. Phys. 182:93. Hausser, R., and Noack, F., 1965, Z. Naturforsch. 20a:1668.
Hertz, H. G., 1967, Ber. Bunsenges. Phys. Chem. 71:979. Hills, B. P., 1992, Mol. Phys. 76:489. Hills, B. P., Takacs, S. F., and Belton, P. S., 1989, Mol. Phys. 67:903. Hindman, J. C., 1966, J. Chem. Phys. 44:4582. Huang Kenéz, P., Carlström, G., Furó, I., and Halle, B., 1992, J. Phys. Chem. 96:9524. Jaccard, G., Wimperis, S., and Bodenhausen, G., 1986, J. Chem. Phys. 85:6282. Jacobson, B., Anderson, W. A., and Arnold, J. T., 1954, Nature 173:772. Jardetzky, C. D., and Jardetzky, O., 1957, Biochim. Biophys. Acta 26:668–669. Job, C., Zajicek, J., and Brown, M. F., 1996, Rev. Sci. Instrum. 67:2113. Jóhannesson, H., and Halle, B., 1998, J. Am. Chem. Soc. 120:6859. Kimmich, R., 1971, Z. Naturforsch. 26b:1168. Kimmich, R., 1980, Bull. Magn. Reson. 1:195. Kimmich, R., Gneiting, T., Kotitschke, K., and Schnur, G., 1990, Biophys. J. 58:1183.
Kimmich, R., and Noack, F., 1970a, Z. Naturforsch. 25a:299. Kimmich, R., and Noack, F., 1970b, Z. Angew. Phys. 29:248. Koenig, S. H., 1995, Biophys. J. 69:593. Koenig, S. H., and Brown, R. D., 1987, in NMR Spectroscopy of Cells and Organisms, Vol. II (R. K.
Gupta, ed.), CRC Press, Boca Raton, FL, pp. 75–114. Koenig, S. H., and Brown, R. D., 1991, Prog. NMR Spectrosc. 22:487.
Koenig, S. H., and Brown, R. D., 1993, Magn. Reson. Med. 30:685. Koenig, S. H., Brown, R. D., and Ugolini, R., 1993, Magn. Reson. Med. 29:77. Koenig, S. H., Bryant, R. G., Hallenga, K., and Jacob, G. S., 1978, Biochemistry 17:4348. Koenig, S. H., Hallenga, K., and Shporer, M., 1975, Proc. Nat. Acad. Sci. U.S.A. 72:2667.
Koenig, S. H., and Schillinger, W. E., 1969, J. Biol. Chem. 244:3283. Kovacs, H., Mark, A. E., and van Gunsteren, W. F., 1997, Proteins 27:395.
Kubinec, M. G., Culf, A. S., Cho, H., Lee, D. C., Burkham, J., Morimoto, H., Williams, P. G., and Wemmer, D. E., 1996, J. Biomol. NMR 7:236. Kuhs, W. F., and Lehmann, M. S., 1986, in Water Science Reviews, Vol. 2 (F. Franks, ed.), Cambridge University Press, Cambridge, pp. 1–65. Kwong, K. K., Hopkins, A. L., Belliveau, J. W., Chesler, D. A., Porkka, L. M., McKinstry, R. C., Finelli, D. A., Hunter, G. J., Moore, J. B., Barr, R. G., and Rosen, B. R., 1991, Magn. Reson. Med. 22:154. Landau, L. D., and Lifshitz, E. M., 1959, Fluid Mechanics, Pergamon Press, Oxford. Langhorst, U., Loris, R., Denisov, V. P., Doumen, J., Roose, P., Maes, D., Halle, B., and Steyaert, J., 1999, Protein Sci. 8:722. Lankhorst, D., and Leyte, J. C., 1984, Macromolecules 17:93. Lankhorst, D., Schriever, J., and Leyte, J. C., 1982, Ber. Bunsenges. Phys. Chem. 86:215.
Lankhorst, D., Schriever, J., and Leyte, J. C., 1983, Chem. Phys. 77:319. Liepinsh, E., and Otting, G., 1996, Magn. Reson. Med. 35:30. Lipari, G., and Szabo, A., 1982, J. Am. Chem. Soc. 104:4546. Lounnas, V., and Pettitt, B. M., 1994, Proteins 18:148. Ludwig, R., Weinhold, F., and Farrar, T. C., 1995, J. Chem. Phys. 103:6941. Lutz, O., and Oehler, H., 1977, Z. Naturforsch. 32a:131. Luz, Z., and Meiboom, S., 1963a, J. Am. Chem. Soc. 85:3923. Luz, Z., and Meiboom, S., 1963b, J. Chem. Phys. 39:366.
Luz, Z., and Meiboom, S., 1964, J. Chem. Phys. 40:2686. Mao, X.-A., Guo, J.-X., and Ye, C.-H., 1994, Chem. Phys. Lett. 222:417. Martin, M. L., Delpuech, J.-J., and Martin, G. J., 1980, Practical NMR Spectroscopy, Heyden, London. Mateescu, G. D., Yvars, G. M., and Dular, T., 1988, in Water and Ions in Biological Systems (P. Läuger, L. Packer, and V. Vasilescu, eds.), Birkhäuser-Verlag, Basel, pp. 239–250. McLachlan, A. D., 1964, Proc. Roy. Soc. (London), Ser. A 280:271. Meiboom, S., 1961, J. Chem. Phys. 34:375. Montgomery, J. A., and Berne, B. J., 1977, J. Chem. Phys. 67:4589. Noack, F., 1971, in NMR, Basic Principles and Progress, Vol. 3 (P. Diehl, E. Fluck, and R. Kosfeld, eds.), Springer-Verlag, Berlin, pp. 83–144.
Noack, F., 1986, Prog. NMR Spectrosc. 18:171. Noack, F., 1995, in Encyclopedia of Nuclear Magnetic Resonance (D. M. Grant, and R. K. Harris, eds.),
Wiley, New York, pp. 1980–1990. Noack, F., Becker, S., and Struppe, J., 1997, Annu. Rep. NMR Spectrosc. 33:1. Noggle, J. H., and Schirmer, R. E., 1971, The Nuclear Overhauser Effect, Academic Press, New York. Odeblad, E., and Lindström, G., 1955, Acta Radiol. 43:469. Otting, G., 1997, Prog. NMR Spectrosc. 31:259. Otting, G., Liepinsh, E., Halle, B., and Frey, U., 1997, Nature Struct. Biol. 4:396. Otting, G., and Wüthrich, K., 1989, J. Am. Chem. Soc. 111:1871. Pekar, J., and Leigh, J. S., 1986, J. Magn. Reson. 69:582. Piculell, L., and Halle, B., 1986, J. Chem. Soc., Faraday Trans. 1 82:401. Pitner, T. P., Glickson, J. D., Dadok, J., and Marshall, G. R., 1974, Nature 250:582. Poplett, I. J. F., 1982, J. Magn. Reson. 50:397. Redfield, A. G., Fite, W., and Bleich, H. E., 1968, Rev. Sci. Instrum. 39:710. Resing, H. A., 1976, J. Phys. Chem. 80:186.
Rhim, W. K., Burum, D. P., and Elleman, D. D., 1979, J. Chem. Phys. 71:3139. Rocchi, C., Bizzarri, A. R., and Cannistraro, S., 1997, Chem. Phys. 214:261. Ronen, I., and Navon, G., 1994, Magn. Reson. Med. 32:789. Rose, K. D., and Bryant, R. G., 1980, J. Am. Chem. Soc. 102:21.
Rubinstein, M., Baram, A., and Luz, Z., 1971, Mol. Phys. 20:67. Sceats, M. G., and Rice, S. A., 1980, J. Chem. Phys. 72:3236. Schriever, J., and Leyte, J. C., 1977, Chem. Phys. 21:265. Schweikert, K. H., Krieg, R., and Noack, F., 1988, J. Magn. Reson. 78:77.
Shaw, T. M., and Elsken, R. H., 1953, J. Chem. Phys. 21:565. Sitnikov, R., Furó, I., and Henriksson, U., 1996, J. Magn. Reson. A 122:76. Solomon, I., 1955, Phys. Rev. 99:559. Spiess, H. W., Garrett, B. B., Sheline, R. K., and Rabideau, S. W., 1969, J. Chem. Phys. 51:1201. Stoesz, J. D., Redfield, A. G., and Malinowski, D., 1978, FEBS Lett. 91:320. Stolpen, A. H., Reddy, R., and Leigh, J. S., 1997, J. Magn. Reson. 125:1. Struis, R. P. W. J., de Bleijser, J., and Leyte, J. C., 1987, J. Phys. Chem. 91:1639. Swift, T. J., and Connick, R. E., 1962, J. Chem. Phys. 37:307. Sykora, S., and Ferrante, G. M., 1995, NMR Relaxometry with FFC Spinmaster, Technical Note 954.1, Stelar s.n.c., Mede (PV), Italy. Thomann, H., Bernardo, M., Goldfarb, D., Kroneck, P. M. H., and Ullrich, V., 1995, J. Am. Chem. Soc. 117:8243. Tromp, R. H., de Bleijser, J., and Leyte, J. C., 1990, Chem. Phys. Lett. 175:568. van de Ven, F. J. M., Janssen, H. G. J. M., Gräslund, A., and Hilbers, C. W., 1988, J. Magn. Reson. 79:221. van der Maarel, J. R. C., Lankhorst, D., de Bleijser, J., and Leyte, J. C., 1985, Chem. Phys. Lett. 122:541. van der Maarel, J. R. C., Lankhorst, D., de Bleijser, J., and Leyte, J. C., 1986, J. Phys. Chem. 90:1470. van der Klink, J. J., Schriever, J., and Leyte, J., 1974, Ber. Bunsenges. Phys. Chem. 78:369.
Venu, K., Denisov, V. P., and Halle, B., 1997, J. Am. Chem. Soc. 119:3122. Volke, F., Eisenblätter, S., Galle, J., and Klose, G., 1994, Chem. Phys. Lipids 70:121. Wang, J. H., 1955, J. Am. Chem. Soc. 77:258. Werbelow, L., and Pouzard, G., 1981, J. Phys. Chem. 85:3887.
Westlund, P.-O., and Wennerström, H., 1982, J. Magn. Reson. 50:451. Whalley, E., 1974, Mol. Phys. 28:1105. Woessner, D. E., 1962, J. Chem. Phys. 37:647. Woessner, D. E., 1980, J. Magn. Reson. 39:297.
Woessner, D. E., and Snowden, B. S., 1970, J. Colloid Interface Sci. 34:290. Wolynes, P. G., and Deutch, J. M., 1977, J. Chem. Phys. 67:733. Wu, D., and Johnson, C. S., 1994, J. Magn. Reson. A 110:113. Yager, W. A., 1936, Physics 7:434. Zhou, D., and Bryant, R. G., 1996, J. Biomol. NMR 8:77. Zimmerman, J. R., and Brittin, W. E., 1957, J. Phys. Chem. 61:1328.
11
Hydration Studies of Biological Macromolecules by Intermolecular Water-Solute NOEs
Gottfried Otting

Gottfried Otting • Department of Medical Biochemistry and Biophysics, Karolinska Institute, S-171 77 Stockholm, Sweden.
Biological Magnetic Resonance, Volume 17: Structure Computation and Dynamics in Protein NMR, edited by Krishna and Berliner. Kluwer Academic / Plenum Publishers, New York, 1999.
From “Progress in Nuclear Magnetic Resonance Spectroscopy,” Vol. 31, Gottfried Otting, NMR Studies of Water Bound to Biological Molecules, pp. 259–285, 1997, with kind permission from Elsevier Science-NL, Sara Burgerhartstraat 25, 1055 KV Amsterdam, The Netherlands.
The exploitation of intermolecular water–solute NOEs in biological molecules was originally proposed in 1973 by N. Rama Krishna and Sidney L. Gordon in their study of the effects on solutes with coupled spin systems [J. Chem. Phys. 58 (1973), 5687–5696]. The first demonstration of an intermolecular solvent–solute NOE dates back to 1965 when Reinhold Kaiser reported the observation of an enhancement in a chloroform proton signal when the solvent cyclohexane was saturated [J. Chem. Phys. 42 (1965), 1838–1839].

1. INTRODUCTION
The use of intermolecular water–peptide NOEs (nuclear Overhauser effects) for the detection of solvent exposure was proposed as early as 1974 (Pitner et al., 1974). With improved equipment, it is possible today to obtain a much more complete picture of the hydration of biomolecules in aqueous solution. This chapter describes the use of the nuclear Overhauser effect in high-resolution NMR spectroscopy to detect and localize the water molecules hydrating proteins, DNA and RNA molecules. Other reviews are Kubinec and Wemmer (1992a), Wüthrich et al. (1992), Kochoyan and Leroy (1995), Billeter (1995), and Otting and Liepinsh (1995a).
Intermolecular NOEs observed between the water and the solute allow the identification of individual hydration water molecules in the presence of a very large excess of bulk water which appears at the same chemical shift as the signals from the hydration water. This is possible because NOEs are effectively observed only for internuclear distances shorter than 4–5 Å. NOEs observed between the single, averaged water resonance and the solute thus report on direct interactions between the solute and the first shell of hydration. The degeneracy of the chemical shifts of hydration water and bulk water is a consequence of the chemical exchange between the two environments which is fast
on the chemical-shift time scale (milliseconds). Chemical exchange in this context refers not only to the exchange of entire water molecules but also to proton exchange between different water molecules. The proton exchange between water molecules is catalyzed by acids and bases and is slowest at neutral pH (Meiboom, 1961). Given the proton exchange rates in pure water (ca. $10^{3}\ \mathrm{s}^{-1}$ at pH 7 and 25°C, corresponding to an average proton residence time on a water oxygen of 1 ms), it is not surprising that, in general, only a single, averaged NMR resonance is observed for hydration water and bulk water, although it is in principle possible that some proteins contain single water molecules in internal cavities that exchange sufficiently slowly with the bulk water to give rise to resolved NMR signals. To date, such a case has not been reported, illustrating the presence of conformational fluctuations in proteins which trigger the exchange of internal hydration water molecules with bulk water within milliseconds even at temperatures near the freezing point of water. The signal from a hydration water proton exchanging with the bulk water at a rate $k$ would be broadened to a linewidth of about $k/\pi$, which would hardly be detectable in the crowded spectrum of a biological macromolecule.
In principle, the exchange between different water molecules could be slowed down by the use of organic solvent molecules. For example, the hydration of the peptide antamanide has been studied in chloroform solution (Peng et al., 1996). Water-soluble proteins, DNA and RNA fragments, however, lose their native three-dimensional structure in pure or nearly pure organic solvents.
Water can be studied with four NMR-active isotopes: $^{1}$H, $^{2}$H, $^{3}$H, and $^{17}$O. Of those, deuterium and $^{17}$O relax too rapidly to be suitable for high-resolution NMR spectroscopy, and tritium is difficult to handle at high concentration because of its radioactivity. In addition, the magnetization transfer rate due to the NOE between two spins A and B depends on the gyromagnetic ratios of the spins as $\gamma_A^{2}\gamma_B^{2}$. The high natural abundance of protons in water and biomolecules provides optimum sensitivity for the observation of NOEs at no extra cost.
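The sensitivity argument can be illustrated with a small calculation. The following is a minimal sketch, assuming the usual $\gamma_A^{2}\gamma_B^{2}$ scaling of dipolar cross relaxation and ignoring spin-quantum-number factors and relaxation losses; the gyromagnetic ratios are rounded tabulated values and the function name is hypothetical.

```python
# Gyromagnetic ratios in units of 10^7 rad s^-1 T^-1 (rounded tabulated values).
GAMMA = {"1H": 26.752, "2H": 4.107, "3H": 28.535, "17O": -3.628}


def relative_noe_rate(spin_a: str, spin_b: str) -> float:
    """Scaling gamma_A^2 * gamma_B^2 of the transfer rate, relative to 1H-1H.

    Assumption: only the gyromagnetic-ratio dependence is considered;
    spin quantum numbers and relaxation are deliberately left out.
    """
    pair = (GAMMA[spin_a] * GAMMA[spin_b]) ** 2
    reference = (GAMMA["1H"] * GAMMA["1H"]) ** 2
    return pair / reference


if __name__ == "__main__":
    for water_spin in ("1H", "2H", "3H", "17O"):
        rate = relative_noe_rate(water_spin, "1H")
        print(f"{water_spin} (water) -> 1H (solute): {rate:.3f}")
```

Within this simplification, only tritium would match protons, while deuterium and $^{17}$O would give transfer rates of a few percent of the proton–proton value.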
In the following, principal differences between intermolecular and intramolecular NOEs are discussed, NMR experiments suitable for the measurement of intermolecular water–solute NOEs are evaluated, and protein hydration studies using intermolecular NOEs are reviewed. A further section briefly summarizes and compares the biophysical information obtained from NOE studies with that obtained from NMRD measurements, X-ray crystallography, and molecular dynamics simulations.
2. THEORETICAL BACKGROUND FOR INTERMOLECULAR NOEs
The NOEs can be observed by the transfer of either longitudinal or transverse magnetization between spins. The latter is also referred to as ROE (rotating-frame NOE). Throughout this article, the term NOE is used to describe cross relaxation in both the laboratory frame and the rotating frame; the distinction between NOE and ROE is made only in the terms $\sigma_{\mathrm{NOE}}$ and $\sigma_{\mathrm{ROE}}$, which describe the rates of magnetization transfer between two protons by cross relaxation in the respective frames of reference. The cross-relaxation rates $\sigma_{\mathrm{NOE}}$ and $\sigma_{\mathrm{ROE}}$ are defined by (e.g., Bothner-By et al., 1984; Griesinger and Ernst, 1987)

$$\sigma_{\mathrm{NOE}} = \frac{1}{10}\left(\frac{\mu_0}{4\pi}\right)^{2}\gamma^{4}\hbar^{2}\,\bigl[6J(2\omega) - J(0)\bigr] \tag{1}$$

$$\sigma_{\mathrm{ROE}} = \frac{1}{10}\left(\frac{\mu_0}{4\pi}\right)^{2}\gamma^{4}\hbar^{2}\,\bigl[3J(\omega) + 2J(0)\bigr] \tag{2}$$

where $J(\omega)$ is the spectral density at frequency $\omega$, $\omega$ is the Larmor frequency, $\gamma$ is the gyromagnetic ratio of the protons, $\hbar$ is Planck’s constant divided by $2\pi$, and $\mu_0$ is the induction constant. The spectral density functions depend on the model describing the change in length and orientation of the vector connecting the two nuclear spins involved in the dipole–dipole interaction.
Since spectral densities do not assume negative values, $\sigma_{\mathrm{ROE}}$ is always positive, while $\sigma_{\mathrm{NOE}}$ can be positive or negative. The $\sigma_{\mathrm{NOE}}$ values are negative when the high-frequency components of the spectral density function are unimportant compared to its component at zero frequency, i.e., for slow reorientation of the internuclear vector. Negative $\sigma_{\mathrm{NOE}}$ values are typically observed for intramolecular NOEs between the protons in slowly tumbling macromolecules. Positive $\sigma_{\mathrm{NOE}}$ values are observed for the intramolecular NOEs in small, rapidly reorientating molecules and for intermolecular NOEs, if at least one of the compounds is very mobile. Note that positive cross-relaxation rates yield negative cross peaks in NOESY and ROESY spectra (i.e., of opposite sign to the diagonal peaks), whereas negative cross-relaxation rates yield positive cross peaks.
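The sign behavior described above can be checked numerically. The sketch below assumes the standard two-spin expressions, in which the laboratory-frame rate is proportional to $6J(2\omega) - J(0)$ and the rotating-frame rate to $3J(\omega) + 2J(0)$, together with a Lorentzian spectral density for a rigid, isotropically tumbling rotor. The common positive prefactor and the distance factor are omitted, so only signs and zero crossings are meaningful; all function names are illustrative.

```python
import math


def j_rigid(omega: float, tau_c: float) -> float:
    """Lorentzian spectral density of a rigid isotropic rotor (distance factor omitted)."""
    return tau_c / (1.0 + (omega * tau_c) ** 2)


def sigma_noe(omega: float, tau_c: float) -> float:
    """Laboratory-frame cross relaxation up to a positive constant: 6 J(2w) - J(0)."""
    return 6.0 * j_rigid(2.0 * omega, tau_c) - j_rigid(0.0, tau_c)


def sigma_roe(omega: float, tau_c: float) -> float:
    """Rotating-frame cross relaxation up to the same constant: 3 J(w) + 2 J(0)."""
    return 3.0 * j_rigid(omega, tau_c) + 2.0 * j_rigid(0.0, tau_c)


def zero_crossing_tau(freq_hz: float) -> float:
    """Correlation time at which sigma_NOE vanishes.

    Setting 6 / (1 + 4 w^2 tau^2) = 1 gives w * tau = sqrt(5) / 2.
    """
    return math.sqrt(5.0) / 2.0 / (2.0 * math.pi * freq_hz)


if __name__ == "__main__":
    omega = 2.0 * math.pi * 600e6  # 600 MHz proton frequency
    for tau in (50e-12, 5e-9):
        print(f"tau_c = {tau:.1e} s: sigma_NOE > 0 is {sigma_noe(omega, tau) > 0}, "
              f"sigma_ROE > 0 is {sigma_roe(omega, tau) > 0}")
    print(f"sign change at tau_c = {zero_crossing_tau(600e6) * 1e12:.0f} ps")
```

Under these assumptions the sign change at 600 MHz falls at about 297 ps, and it moves to longer correlation times in proportion to the inverse of the Larmor frequency.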
2.1. NOE between Two Rigidly Bound Protons
Intermolecular water–solute NOEs can be treated like intramolecular NOEs within the solute if the hydration water molecules are rigidly bound for longer than the rotational correlation time of the solute (typically nanoseconds). This is the case, for example, for hydration water molecules bound in the interior of a protein with hydrogen bonds providing an icelike environment. The spectral density for the simple case of the interaction between two protons attached to an isotropically tumbling sphere (“rigid sphere model,” Fig. 1) is

$$J(\omega) = \frac{1}{r^{6}}\,\frac{\tau_c}{1 + \omega^{2}\tau_c^{2}} \tag{3}$$

where $r$ is the interproton distance and $\tau_c$ is the rotational correlation time describing the reorientation rate of the sphere.
Plots of $\sigma_{\mathrm{NOE}}$ and $\sigma_{\mathrm{ROE}}$ calculated using Eqs. (1)–(3) for different Larmor frequencies are given in Fig. 1 as a function of the rotational correlation time $\tau_c$. The curves show that (i) the signs of $\sigma_{\mathrm{NOE}}$ and $\sigma_{\mathrm{ROE}}$ are the same for small, rapidly tumbling molecules (i.e., small $\tau_c$), and (ii) the sign change of $\sigma_{\mathrm{NOE}}$ occurs at values of $\tau_c$ corresponding to 300 ps at a spectrometer frequency of 600 MHz.
For lower Larmor frequencies, the sign change shifts to longer correlation times; i.e., lower magnetic fields favor positive $\sigma_{\mathrm{NOE}}$ values. The point where $\sigma_{\mathrm{NOE}}$ changes sign separates the “fast-motional regime” from the “slow-motional regime.”

2.2. NOE between Solute Proton and Bound but Locally Reorientating Water
The description of the intermolecular $\sigma_{\mathrm{NOE}}$ becomes more complicated if the bound water molecule performs motions with a local correlation time shorter than the rotational correlation time of the solute. This situation is commonly encountered for slowly tumbling proteins and other biological macromolecules. In the extreme case, $\sigma_{\mathrm{NOE}}$ can be positive for the water–solute interaction, while the $\sigma_{\mathrm{NOE}}$ rates between protons of the macromolecule are negative. Using explicit models for which analytical expressions of the spectral density function are available, one can show that water–solute NOEs with positive $\sigma_{\mathrm{NOE}}$ rates are observed with macromolecular systems only if the water molecule is displaced by more than its own diameter within less than a nanosecond, i.e., for rapid exchange of water molecules (Otting et al., 1991a).
One of the models which can be calculated analytically is the “wobbling-in-a-cone” model (Fig. 2A) (Richarz et al., 1980; Fujiwara and Nagayama, 1985). Here, a water molecule may be thought to be hydrogen bonded via its oxygen to a proton donor on the solute, with free rotation around the H-bond axis and an additional “wobbling” motion of the axis. The model predicts reduced $\sigma_{\mathrm{NOE}}$ rates for increased water mobility, especially for motions in the time regime where the rigid-sphere model would predict a sign change. Analytical expressions are also available for a model where a water proton moves along a line connecting the center of the solute with a solute proton and the water proton (Fig. 2B) (Luginbühl, 1996). This model predicts reduced $\sigma_{\mathrm{NOE}}$ rates if the water moves rapidly with an amplitude corresponding to complete dissociation and reassociation, but positive $\sigma_{\mathrm{NOE}}$ values are hardly obtained if the solute is a macromolecule in the slow-motional regime. It appears quite generally that positive $\sigma_{\mathrm{NOE}}$ rates result more easily from rapid reorientation of the internuclear vector with respect to the main magnetic field than if the vector rapidly changes its length.
The difficulty of obtaining positive $\sigma_{\mathrm{NOE}}$ rates by local motions only is also supported by experimental results: if methane or hydrogen molecules are inserted under pressure into hydrophobic cavities of hen egg-white lysozyme, the $\sigma_{\mathrm{NOE}}$ values of the intermolecular NOEs observed are negative, although the local reorientation rate of the gas molecules is certainly in the fast-motional regime (Otting et al., 1997). The cross-relaxation rate between a “probe” proton of the solute and a proton of a small molecule trapped inside a cavity of the solute, which reorientates rapidly in the cavity with spherical symmetry, has also theoretically
been shown to be the same as the NOE between the probe proton and a hypothetical proton located at the center of the cavity (Otting et al., 1997).
The most general representation of local motion is obtained by the use of a generalized order parameter $S^{2}$. In this case the spectral density is given by (Halle and Wennerström, 1981; Lipari and Szabo, 1982; Denisov et al., 1997)

$$J(\omega) = \left\langle r^{-6} \right\rangle \left[ \frac{S^{2}\tau_c}{1 + \omega^{2}\tau_c^{2}} + \frac{(1 - S^{2})\,\tau}{1 + \omega^{2}\tau^{2}} \right] \tag{4}$$

where $1/\tau = 1/\tau_c + 1/\tau_{\mathrm{loc}}$, $\tau_{\mathrm{loc}}$ denotes the correlation time of the rapid local motion, $\tau_c$ is the correlation time of the overall rotational tumbling of the solute, $r$ is the internuclear distance with $\langle\ \rangle$ indicating time averaging, and $S^{2}$ ranges between 0 and 1 for complete disorder and complete order, respectively.
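To see how strongly local motion must reduce the order parameter before the laboratory-frame rate turns positive, the generalized-order-parameter description can be evaluated numerically. This is a minimal sketch assuming the familiar Lipari–Szabo-type form with $1/\tau = 1/\tau_c + 1/\tau_{\mathrm{loc}}$, a rate proportional to $6J(2\omega) - J(0)$, and an omitted distance factor; the parameter values (5 ns overall tumbling, 50 ps local motion, 600 MHz) are illustrative choices, not values from the text.

```python
import math


def j_model_free(omega: float, tau_c: float, tau_loc: float, s2: float) -> float:
    """Spectral density with order parameter s2 (assumed Lipari-Szabo-type form).

    The distance-averaging prefactor is omitted, so only relative values matter.
    """
    tau = 1.0 / (1.0 / tau_c + 1.0 / tau_loc)
    slow = s2 * tau_c / (1.0 + (omega * tau_c) ** 2)
    fast = (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2)
    return slow + fast


def sigma_noe(omega: float, tau_c: float, tau_loc: float, s2: float) -> float:
    """Laboratory-frame cross relaxation up to a positive constant."""
    return (6.0 * j_model_free(2.0 * omega, tau_c, tau_loc, s2)
            - j_model_free(0.0, tau_c, tau_loc, s2))


if __name__ == "__main__":
    omega = 2.0 * math.pi * 600e6   # 600 MHz proton frequency
    tau_c, tau_loc = 5e-9, 50e-12   # hypothetical protein and local-motion times
    for s2 in (1.0, 0.5, 0.1, 0.0):
        sign = "positive" if sigma_noe(omega, tau_c, tau_loc, s2) > 0 else "negative"
        print(f"S^2 = {s2:.1f}: sigma_NOE is {sign}")
```

With these illustrative parameters the rate stays negative down to very small order parameters, in line with the observation that local motion alone rarely produces positive rates for a slowly tumbling solute.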
2.3. NOE with Rapidly Diffusing Water Molecules
In the extreme limit, water may not be bound at all, but simply diffuse past the solute with no further restriction than that imposed by the space excluded by the solute. Analytical formulas have been calculated by Ayant et al. (1977) for the case
where solute and solvent molecules are represented by large and small spheres, respectively, with proton spins located at a certain distance underneath the surface of the hard spheres (Fig. 2C). The spectral density describing the intermolecular interaction is

[Equation (5)]

where $D = D_I + D_S$ [$D_I$ and $D_S$ are the translational diffusion coefficients of the spheres with spins I (protein) and S (solvent)], $N_S$ is the density of the solvent spin S, and the geometric parameters are defined in Fig. 2C. In addition,

[Equations]

where K denotes the modified spherical Bessel function of the third kind, and the rotational diffusion coefficients of the two spheres are given by Stokes’ law:

[Equation]

where $k$ is the Boltzmann constant, $T$ is the absolute temperature, and $\eta$ is the viscosity coefficient. In evaluating the first term of the double sum in Eq. (5), it is helpful to use the relation (Ayant et al., 1977)

[Equation (9)]
Plotting $\sigma_{\mathrm{NOE}}$ and $\sigma_{\mathrm{ROE}}$ with Eqs. (5)–(9) as a function of the inverse of the translational diffusion coefficient $D$ yields curves similar to those of Fig. 1. Using the Einstein–Smoluchowski relationship

[Equation (10)]

the diffusion coefficient $D$ can be translated into an average residence time $\tau_{\mathrm{res}}$ of a water molecule at its hydration site on the solute, assuming that the water molecule is exchanged after a displacement $x$ by its own diameter. Calculating $\sigma_{\mathrm{NOE}}$ and $\sigma_{\mathrm{ROE}}$ with the parameters defined in Fig. 2C, and for a $^{1}$H frequency of 600 MHz, the sign inversion of $\sigma_{\mathrm{NOE}}$ is predicted for a diffusion coefficient that is about six times smaller than the self-diffusion coefficient of pure water at 36°C (Hausser et al., 1966). Using Eq. (10) with $x$ set to the diameter of a water molecule, it corresponds to a residence time of about 70 ps. This time span is four times shorter than the rotational correlation time at which $\sigma_{\mathrm{NOE}}$ changes sign in the simple model of Fig. 1.
It should be noted that the conversion of the diffusion coefficient into a residence time of the hydration water by Eq. (10) assumes three-dimensional diffusion. The corresponding equations for two-dimensional or one-dimensional diffusion, respectively (Villars and Benedek, 1974), would predict twofold or fourfold increased residence times from the same diffusion coefficient. Furthermore, the rotational correlation times and dimensions of the spheres used to represent the solute and the water molecules change the precise value of the diffusion coefficient for which $\sigma_{\mathrm{NOE}} = 0$. In particular, a smaller radius and a shorter rotational correlation time of the solute predict positive $\sigma_{\mathrm{NOE}}$ rates for longer residence times. Biological macromolecules present a surface with more curvature to the solvent than a sphere with a smooth surface. Furthermore, the solvent-exposed chemical groups are often more mobile than corresponding groups in the interior of, for example, a protein. Thus, a positive $\sigma_{\mathrm{NOE}}$ value of a water–solute NOE indicates a water residence time shorter than about 1 ns, but the residence time is difficult to pinpoint more accurately.
Intermolecular cross-relaxation rates have also been calculated for a model where the solute is represented by a planar surface, treating bulk water as a self-diffusing continuum (Brüschweiler and Wright, 1994). It was pointed out that
for this and the model of Ayant et al. (1977), $\sigma_{\mathrm{NOE}}$ and $\sigma_{\mathrm{ROE}}$ with water molecules in the fast-motional regime are approximately proportional to the inverse of the internuclear distance $r$, in marked contrast to the $r^{-6}$ dependence usually observed for NOEs (Brüschweiler and Wright, 1994; Wang et al., 1996a).
Residence times that are less dependent on the precise parameters of an explicit model can be derived by replacing $\tau_c$ in Eqs. (3) and (4) by an effective correlation time $\tau_{\mathrm{eff}}$ which depends on the mean residence time $\tau_{\mathrm{res}}$ and the rotational correlation time $\tau_c$ of the solute as (Clore et al., 1990; Denisov et al., 1997)

$$\frac{1}{\tau_{\mathrm{eff}}} = \frac{1}{\tau_c} + \frac{1}{\tau_{\mathrm{res}}} \tag{11}$$
If Eq. (11) is used with Eq. (3), the resulting model assumes isotropic diffusional rotation of the solute, that the water at the hydration site is rigidly bound to the solute without local mobility, and that the water exchanges between two discrete states: the hydration site and the bulk water. If Eq. (11) is used with Eq. (4), the local mobility of the bound water is taken into account, too. Using this approach it has been demonstrated that the presence of local motions, characterized by an order parameter and a local correlation time, can shift the sign inversion of $\sigma_{\mathrm{NOE}}$ to residence times of 1 ns and longer (Denisov et al., 1997).
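The effective-correlation-time approach lends itself to a short worked example. The sketch below assumes that the rates of overall tumbling and of water exchange simply add, $1/\tau_{\mathrm{eff}} = 1/\tau_c + 1/\tau_{\mathrm{res}}$, and combines this with the rigid-rotor condition $\omega\tau = \sqrt{5}/2$ for the sign change of the laboratory-frame rate. Local motion of the bound water is not included, and the 5 ns tumbling time is a hypothetical value.

```python
import math


def tau_eff(tau_c: float, tau_res: float) -> float:
    """Effective correlation time from overall tumbling and water exchange."""
    return 1.0 / (1.0 / tau_c + 1.0 / tau_res)


def residence_time_at_sign_change(freq_hz: float, tau_c: float) -> float:
    """Residence time at which sigma_NOE changes sign in the rigid-binding model.

    Returns math.inf if the solute itself tumbles in the fast-motional regime,
    in which case sigma_NOE is positive for any residence time.
    """
    tau_zero = math.sqrt(5.0) / 2.0 / (2.0 * math.pi * freq_hz)
    if tau_c <= tau_zero:
        return math.inf
    return 1.0 / (1.0 / tau_zero - 1.0 / tau_c)


if __name__ == "__main__":
    tau_c = 5e-9  # hypothetical rotational correlation time of a small protein
    t_res = residence_time_at_sign_change(600e6, tau_c)
    print(f"sign change for a residence time of {t_res * 1e12:.0f} ps")
    print(f"corresponding effective correlation time: {tau_eff(tau_c, t_res) * 1e12:.0f} ps")
```

For a slowly tumbling solute the residence time at the sign change lies only slightly above the rigid-rotor value of roughly 0.3 ns, because the exchange term dominates the sum of rates.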
3. ASSIGNMENTS OF WATER-SOLUTE CROSS PEAKS
Water–solute cross peaks in NOESY and ROESY can come about by three principal mechanisms: (i) direct water–solute NOEs, (ii) exchange-relayed NOEs, where the magnetization is transferred from the water to the solute by chemical exchange and further to another solute proton by an intrasolute NOE, and (iii) chemical exchange between a labile solute proton and the water (Fig. 3). Direct NOEs and exchange-relayed NOEs are readily distinguished from chemical exchange peaks by their different signs in ROESY: chemical exchange peaks have the same sign as the diagonal peaks, whereas NOEs and exchange-relayed NOEs give rise to negative peaks when the diagonal peaks are plotted as positive peaks. In principle, positive ROESY cross peaks are also observed for magnetization transferred by two subsequent NOE steps during the mixing time (“spin diffusion”) (Farmer et al., 1987) and by TOCSY-type transfers. Since spin-diffusion peaks tend to be very weak in ROESY and TOCSY-type cross peaks are prominent only near the diagonal and antidiagonal of a two-dimensional ROESY spectrum (Glaser and Drobny, 1990), they are disregarded in the following discussion.
Since the rotational correlation times of biological macromolecules usually are much longer than $1/\omega$, where $\omega$ is the Larmor frequency, intramolecular NOEs invariably lead to positive NOESY cross peaks. Therefore, negative NOESY cross peaks with the water are always direct NOEs. For positive water–solute NOESY cross peaks, which have been shown not to arise from direct chemical exchange by a corresponding ROESY spectrum, it is necessary to consider the possibility of
exchange-relayed NOEs before the assignment of a direct water–solute NOE can be made. The distinction between exchange-relayed NOEs and water–solute NOEs is usually not possible experimentally. Therefore, the possibility of exchange-relayed NOEs can be excluded only if the solute proton involved in the water–solute cross peak is at least 4–5 Å from any labile solute proton which exchanges rapidly with the water. It is thus important to know the three-dimensional structure of the solute before assigning water–solute cross peaks.
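The assignment rules of this section can be summarized as a small decision procedure. The following is only a sketch of that logic: the function name, the return strings, and the default cutoff of 5 Å (the upper end of the 4–5 Å range quoted above) are illustrative choices, and spin-diffusion and TOCSY-type peaks are disregarded as in the text.

```python
from typing import Optional


def assign_water_cross_peak(noesy_sign: int, roesy_sign: int,
                            nearest_labile_distance: Optional[float] = None,
                            cutoff: float = 5.0) -> str:
    """Classify a water-solute cross peak from its signs and the solute structure.

    Signs are relative to the diagonal peaks: +1 for the same sign, -1 for
    the opposite sign. nearest_labile_distance is the distance in angstroms
    to the closest rapidly exchanging labile solute proton, or None if the
    three-dimensional structure is unknown.
    """
    if roesy_sign > 0:
        # Same sign as the ROESY diagonal: chemical exchange with the water.
        return "chemical exchange"
    if noesy_sign < 0:
        # A relay through the slowly tumbling solute would give a positive
        # NOESY peak, so a negative peak must be a direct NOE.
        return "direct NOE"
    if nearest_labile_distance is not None and nearest_labile_distance > cutoff:
        # No labile proton close enough to relay the magnetization.
        return "direct NOE"
    return "ambiguous: direct or exchange-relayed NOE"


if __name__ == "__main__":
    print(assign_water_cross_peak(+1, +1))
    print(assign_water_cross_peak(-1, -1))
    print(assign_water_cross_peak(+1, -1, nearest_labile_distance=3.2))
    print(assign_water_cross_peak(+1, -1, nearest_labile_distance=7.5))
```

The last branch reflects the point made above: without a structure, or with a labile proton nearby, a positive NOESY cross peak cannot be assigned to a direct NOE.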
4. NMR EXPERIMENTS FOR THE DETECTION OF INTERMOLECULAR NOEs WITH WATER
4.1. Water Suppression
Water suppression is required because the analog-digital converters (ADC) in commercial NMR spectrometers cannot adequately digitize small signals at the low signal amplification needed to digitize the entire unsuppressed water signal. Although a two-dimensional NOESY spectrum is symmetric with respect to the diagonal, intermolecular NOE cross peaks between water and solute protons can be observed with acceptable sensitivity only in the cross section along the $\omega_2$ frequency axis (row) taken at the $\omega_1$ chemical shift of the water resonance, because the corresponding cross section along the $\omega_1$ frequency axis (column) taken at the $\omega_2$ chemical shift of the water resonance is obscured by $t_1$ noise from the residual, incompletely suppressed water signal. The intermolecular water–solute NOEs detected in the row along the $\omega_2$ frequency axis arise from the magnetization transfer where the water protons are frequency labeled during the evolution time $t_1$, part of the water magnetization is transferred to the solute during the mixing time $\tau_m$, and the magnetization is subsequently detected at the frequencies of the solute protons during the detection period $t_2$. Thus, the measurement of water–solute NOEs requires that the water suppression takes place after the NOE mixing time $\tau_m$ and before the detection period $t_2$.
Many different experimental schemes are available to excite a spectrum without exciting the water resonance (e.g., Plateau and Guéron, 1982; Hore, 1983; Sklenář et al., 1987; Sklenář and Bax, 1987a; Smallcombe, 1993). Most of these assume, however, that the water magnetization is aligned along the positive z-axis at the start. (By definition, the z-axis is the axis parallel to the main magnetic field; equilibrium magnetization is aligned along the positive z-axis.) At the end of a NOESY or ROESY mixing period, however, the water magnetization is usually not simply aligned along the z-axis. Spin-lock pulses, Watergate, and diffusion filters can be used, which suppress the water resonance irrespective of the starting conditions.
4.1.1. Pair of Spin-Lock Pulses
Spin-lock pulses defocus magnetization not aligned along the spin-lock axis by the spatial inhomogeneity of the radio-frequency field. Consequently, spin-lock pulses are most effective when applied at high power. High-power spin-lock pulses of 1 to 2 ms duration are sufficient for nearly complete averaging of the magnetization in the plane orthogonal to the spin-lock axis. A pair of orthogonal spin-lock pulses without interpulse delay suppresses all magnetization. With a free precession delay $\tau$ between the two spin-lock pulses, only the magnetization at the carrier frequency and at offsets that are multiples of $1/(2\tau)$ is suppressed. Therefore, the sequence can be used to suppress the water resonance if the carrier is at the water resonance. Solute magnetization which starts as y-magnetization and precesses by 90° during the delay $\tau$ is not suppressed. The resulting excitation profile follows the function $\sin(2\pi\nu\tau)$, where $\nu$ is the frequency relative to the carrier frequency. To avoid echo effects, the spin-lock pulses should be of different length, e.g., 0.5 ms for one and 2 ms for the other. If the delay $\tau$ is set to 1/(spectral width), the excitation profile covers the spectral halves to the left and to the right of the carrier frequency (water frequency), each with a single lobe of the sine function (Otting et al., 1991b).
If the water suppression sequence follows a NOESY mixing time, the first spin-lock pulse can be replaced by a pulsed-field gradient (PFG) or homospoil pulse during the mixing time. The gradient selects the longitudinal magnetization, which is aligned along the y-axis after the 90° pulse at the end of the NOESY mixing time. In an analogous way, the first spin-lock pulse can be replaced by the spin lock of a ROESY mixing period. Spin-lock pulses are the quickest way to achieve adequate water suppression.
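The excitation profile of the spin-lock pair can be tabulated in a few lines. This is a sketch under the assumption that the profile is a pure sine of the phase acquired during the free precession delay, $\sin(2\pi\nu\tau)$ with the offset $\nu$ in hertz; relaxation and imperfect defocusing are ignored, and the spectral width of 8000 Hz is an arbitrary example value.

```python
import math


def excitation(offset_hz: float, delay_s: float) -> float:
    """Relative excitation at a given offset from the carrier (water) frequency.

    Assumption: idealized profile sin(2*pi*offset*delay); only the
    magnetization component orthogonal to the second spin-lock axis after
    the delay is retained.
    """
    return math.sin(2.0 * math.pi * offset_hz * delay_s)


if __name__ == "__main__":
    spectral_width = 8000.0       # Hz, illustrative value
    delay = 1.0 / spectral_width  # one sine lobe per spectral half
    for offset in (-4000.0, -2000.0, 0.0, 2000.0, 4000.0):
        print(f"offset {offset:+7.0f} Hz: excitation {excitation(offset, delay):+.3f}")
```

With the delay set to the inverse of the spectral width, the profile vanishes at the carrier and at both edges of the spectrum, reaches its maximum in the middle of each half, and has opposite sign in the two halves.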
4.1.2. Watergate
The Watergate sequence (Piotto et al., 1992) uses the sequence 90°(selective)-180°-90°(selective) with PFGs before and after the 90° pulses. The 90° pulses are selective for the water resonance. Therefore, the water resonance experiences a 360° or 0° rotation, while the solute resonances, which are not affected by the selective pulses, experience only the 180° pulse. With two PFGs of equal amplitude and sign, any transverse water magnetization is dephased by both PFGs, while the solute magnetization is defocused by the first PFG and refocused by the second. The sequence combines excellent water suppression with a uniform excitation profile which is decreased only near the water resonance, depending on the bandwidth of the 90° pulses. Furthermore, Watergate can be combined with selective water-flipback pulses, which selectively take the residual water magnetization to the positive z-axis before the Watergate sequence, which then only purges residual transverse water magnetization. A drawback of the Watergate scheme compared to a pair of spin-lock pulses is the fact that the solute magnetization stays
transverse for a longer time, causing some signal loss by transverse relaxation and small distortions of the solute signals by scalar coupling evolution. A typical PFG takes about 0.5–1 ms, and 90° pulses shorter than about 2 ms are no longer very selective, leading to phase and amplitude distortions in the spectrum near the water frequency. Scalar coupling evolution can be refocused if the 180° pulse in the Watergate sequence excites only some of the solute resonances, without exciting their coupling partners (Mori et al., 1994).
Another variant of the Watergate sequence uses a pulse train of six hard pulses separated by short free precession delays to replace the 90°(selective)-180°-90°(selective) sequence (Sklenář et al., 1993). The relative amplitudes of the pulses are 3:9:19:19:9:3. This 3-9-19 sequence has the advantage of robustness: no amplitudes and phases of selective pulses have to be optimized to achieve good water suppression. On the other hand, the excitation profile is no longer uniform, with a broad zero-excitation region around the water and at the ends of the spectrum.

4.1.3. Diffusion Filter
One of the simplest diffusion filters uses the $\tau$-180°-$\tau$ spin-echo sequence, with a strong PFG during each of the delays $\tau$ (van Zijl and Moonen, 1990; Wider et al., 1994; Wu et al., 1995). Water and solute magnetization defocus during the first gradient and refocus during the second. For sufficiently long and strong gradients, the diffusion of the water molecules during the spin-echo sequence prevents the complete refocusing of the water magnetization, whereas the magnetization of the larger, more slowly diffusing solute is refocused more completely. This results in a preferential suppression of the water signal. Diffusion filters are typically longer than 10 ms, causing significant loss of magnetization by transverse relaxation and impure phases by evolution under scalar couplings. Their main advantages are a uniform excitation profile, which allows the detection of solute signals even under the water resonance, and the possibility of suppressing multiple solvent signals simultaneously (Ponstingl and Otting, 1997a).

4.2. Selective Water Excitation
The NMR signals of all hydration water molecules in proteins and DNAs are at the same frequency, because bound water and bulk water exchange rapidly on the NMR time scale (milliseconds). Consequently, all water–solute cross peaks are observed in two-dimensional NOESY and ROESY experiments in a single cross section. Although the intrasolute cross peaks in the two-dimensional spectra may be helpful for assigning the water–solute cross peaks, the intermolecular water– solute cross peaks could be recorded with better sensitivity and in a shorter
experimental time by selectively recording the cross section of interest in a one-dimensional experiment using selective water excitation. Since two-dimensional spectra can be recorded in a few hours, one-dimensional versions, which may be more complicated to set up, do not provide important time savings. Selective water excitation is thus most important for recording two-dimensional analogs of three-dimensional NMR experiments, which would take days to record with adequate resolution. Studying biomolecular hydration by three-dimensional experiments allows the assignment of the water–solute cross peaks in crowded spectral regions. In these experiments, the magnetization transfer from water to the solute is followed by a second mixing time during which the magnetization is transferred further to other solute spins through scalar couplings or NOEs (Otting et al., 1991b; Holak et al., 1992). The two-dimensional analogs with selective water excitation can be considered as two-dimensional experiments where the starting magnetization is obtained by the prior water–solute magnetization transfer.
The selective excitation of the water is complicated by the phenomenon of radiation damping (Abragam, 1961). Radiation damping is caused by the interaction of the precessing magnetization with the detection coil of the probehead. The current induced in the coil acts back on the precessing magnetization like a conventional radio-frequency pulse, causing a rotation of the precessing magnetization toward the positive z-axis. Consequently, transverse magnetization decays more rapidly than one would expect from relaxation. For inverted, longitudinal magnetization, any minor residual transverse component of the magnetization triggers radiation damping, increasing the amount of transverse magnetization until the magnetization passes through the transverse plane. Thus, the FID of the water signal after a 180° pulse grows and decays, with an envelope reminiscent of a Gauss function. This envelope is a direct measure of the current induced in the coil; i.e., it represents the pulse shape acting back on the water magnetization (Otting and Liepinsh, 1995b). On a 600-MHz NMR spectrometer, radiation damping can turn the water magnetization from the negative z-axis to the positive z-axis within 50 ms. Thus, no selective pulse can effectively excite the water in the presence of radiation damping if it is longer than 50 ms. Radiation damping is proportional to the intensity of the NMR signal and to the quality factor Q of the probehead. In practice, radiation damping is important only for probeheads with a high quality factor, as is common at frequencies above 400 MHz. In a dilute solution of a biomolecule in H$_2$O, the water resonance is prone to radiation damping, but the resonances of the biomolecule are not. The selectivity of radiation damping can be assessed quantitatively from the envelope of the FID observed after a 180° pulse, which describes the shape of the selective pulse arising from the current induced in the coil.
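The bell-shaped envelope of the water FID after a 180° pulse can be reproduced with a textbook model of radiation damping. The sketch below assumes that relaxation is negligible and that the tilt angle $\theta$ of the magnetization from the positive z-axis obeys $d\theta/dt = -\sin\theta / T_{\mathrm{rd}}$, which gives $\tan(\theta/2) = \tan(\theta_0/2)\,e^{-t/T_{\mathrm{rd}}}$. The radiation-damping time constant of 5 ms and the initial angle of 179° are hypothetical values chosen for illustration, not values from the text.

```python
import math


def tilt_angle(t: float, t_rd: float, theta0: float) -> float:
    """Angle between the magnetization and +z at time t (radians)."""
    return 2.0 * math.atan(math.tan(theta0 / 2.0) * math.exp(-t / t_rd))


def transverse(t: float, t_rd: float, theta0: float) -> float:
    """Transverse magnetization (FID envelope), in units of the total magnetization."""
    return math.sin(tilt_angle(t, t_rd, theta0))


def longitudinal(t: float, t_rd: float, theta0: float) -> float:
    """Longitudinal magnetization, in units of the total magnetization."""
    return math.cos(tilt_angle(t, t_rd, theta0))


def time_of_maximum(t_rd: float, theta0: float) -> float:
    """Time at which the magnetization passes through the transverse plane."""
    return t_rd * math.log(math.tan(theta0 / 2.0))


if __name__ == "__main__":
    t_rd = 5e-3                   # hypothetical radiation-damping time constant (s)
    theta0 = math.radians(179.0)  # slightly imperfect inversion
    t_max = time_of_maximum(t_rd, theta0)
    print(f"envelope maximum after {t_max * 1e3:.1f} ms")
    for t in (0.0, 0.5 * t_max, t_max, 1.5 * t_max, 2.0 * t_max):
        print(f"t = {t * 1e3:5.1f} ms: Mxy = {transverse(t, t_rd, theta0):.3f}, "
              f"Mz = {longitudinal(t, t_rd, theta0):+.3f}")
```

In this model the envelope is a hyperbolic secant centered at the time of maximum, and the closer the initial inversion is to a perfect 180°, the longer the magnetization lingers near the negative z-axis before radiation damping sets in.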
In the following, different selective water excitation schemes are discussed using one-dimensional NOESY experiments as examples. The experiments represent also examples of different solvent suppression schemes.
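Among the solvent suppression elements that recur in these experiments, the 3-9-19 pulse train of Section 4.1.2 is simple enough for a numerical check. The sketch below rests on several assumptions that are not spelled out in the text: the pulses are treated as infinitely short rotations about the x-axis, the last three pulses are applied with inverted phase, and the unit flip angle is 180°/26 so that the net rotation is 180° when the magnetization precesses by half a turn during each delay. It tracks only the z-component of magnetization that starts along the positive z-axis.

```python
import math


def rot_x(angle: float) -> list:
    """Rotation matrix about the x-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_z(angle: float) -> list:
    """Rotation matrix about the z-axis (free precession)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply(matrix: list, vector: list) -> list:
    """Multiply a 3x3 matrix with a 3-vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(3)) for i in range(3)]


def mz_after_3_9_19(precession_angle: float) -> float:
    """z-magnetization after the pulse train, starting from +z.

    precession_angle is the phase (radians) acquired during each interpulse
    delay. Assumptions: hard pulses, phase-inverted second half, unit flip
    angle of pi/26.
    """
    unit = math.pi / 26.0
    flips = [3, 9, 19, -19, -9, -3]
    m = [0.0, 0.0, 1.0]
    for index, flip in enumerate(flips):
        m = apply(rot_x(flip * unit), m)
        if index < len(flips) - 1:
            m = apply(rot_z(precession_angle), m)
    return m[2]


if __name__ == "__main__":
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
        phi = 2.0 * math.pi * fraction
        print(f"precession per delay {fraction:.2f} turns: Mz = {mz_after_3_9_19(phi):+.3f}")
```

On resonance, and again after a full turn of precession per delay, the pulse train produces no net rotation, which corresponds to the zero-excitation regions at the water frequency and at the ends of the spectrum; halfway in between it acts as a 180° pulse.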
4.2.1. Selective Water Excitation by a 90° Pulse The simplest one-dimensional experiment would be a NOESY experiment, where the excitation pulse and the evolution time are replaced by a selective 90° pulse at the water frequency (Fig. 4A). As discussed, a long, selective 90° pulse does not provide good sensitivity in the presence of radiation damping. Nonetheless, straight selective or semiselective 90° pulses have been used for water excitation in a couple of examples (Fig. 4B–E). With samples, selective water excitation can be achieved by the use of a heteronuclear filter which suppresses the signals from the labeled sample. Such experiments are discussed in this section, too. It has been noted (Mori et al., 1996a) that E-BURP pulses (Geen and Freeman, 1991) can be used with higher selectivity than, e.g., Gaussian pulses. The reason is that the amplitude of an E-BURP pulse grows toward its end, provoking less radiation damping during the initial half of the pulse. Yet, the recommended pulse duration was not longer than 16 ms, corresponding to the excitation of a fairly wide band (Mori et al., 1996a). In the experiment of Fig. 4B, radiation damping during the rest of the pulse sequence was avoided by the use of a PFG after the selective excitation pulse to defocus the water magnetization (Mori et al., 1994). The subsequent 90° pulse generates longitudinal water magnetization, a second PFG is used to destroy transverse magnetization, and a selective 90° pulse is used to generate transverse coherence for the NH protons. The magnetization is refocused by a gradient which is combined with a second gradient which is part of the following Watergate sequence. This Watergate variant contains only a single semiselective 180° refocusing pulse on the NH protons which does not excite the water. The experiment was proposed for the detection of chemical exchange between water and amide protons in proteins. 
Because of the use of a defocusing gradient after the selective water excitation, which prevents the generation of purely longitudinal magnetization by the following 90° pulse, the sensitivity of the experiment is at most half of the sensitivity of the hypothetical experiment of Fig. 4A. The experiment of Fig. 4C (Mori et al., 1996a) eliminates the sensitivity disadvantage of the experiment of Fig. 4B. All the excited magnetization is longitudinal during the mixing time. The selection of longitudinal magnetization is supported by a short strong PFG at the beginning of the mixing time. A weak gradient during the rest of the mixing time prevents the formation of transverse water magnetization which could trigger radiation damping. The mixing time is followed by a hard 90° pulse and a water-flipback pulse, which is a selective pulse
on the water applied with a phase so that the water magnetization ends up along the positive z-axis. Residual transverse water magnetization is suppressed by the following Watergate sequence. The water-flipback greatly enhances the sensitivity, since the recovery of equilibrium magnetization of water by relaxation is slow. The water-flipback pulse cannot be phase-cycled independently of the first excitation pulse. This is not expected to lead to artifacts, however, since the magnetization of interest has been transferred from the water to the solute during the mixing time where it is no longer affected by the selective flipback pulse.
Since radiation damping prevents the use of a truly selective, long 90° excitation pulse, a spin-echo sequence was proposed to reduce those signals from the macromolecules that are excited by the selective 90° pulse (Fig. 4D) (Mori et al., 1996b). The spin-echo filter relies on the shorter transverse relaxation times of macromolecules compared to water. A PFG is applied both at the start and at the end of the spin-echo delay to prevent loss of water magnetization by radiation damping during the spin-echo sequence. Those PFGs must not be too intense, to avoid loss of water magnetization by diffusion. A spin-echo delay of 40 ms was proposed for use with proteins, where protein resonances overlap with the water signal. Although this delay is too short for complete relaxation of the protein magnetization, an additional suppression factor is provided by the scalar coupling evolution of these protein protons under their couplings to amide and other neighboring protons, which channels much of the magnetization into antiphase coherences which no longer lead to longitudinal magnetization during the NOESY mixing time. It was recommended to use two filter delays, 40 and 60 ms, to check the suppression of the solute resonances (Mori et al., 1996b).
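The relaxation part of the spin-echo filter described above can be illustrated with a short calculation. The following is a minimal sketch, assuming simple monoexponential transverse relaxation and ignoring scalar coupling evolution, diffusion, and exchange; the two T2 values are hypothetical numbers chosen only for illustration and are not taken from the cited work.

```python
import math

def echo_survival(delay_s: float, t2_s: float) -> float:
    """Fraction of transverse magnetization left after a spin-echo delay,
    assuming monoexponential T2 relaxation only."""
    return math.exp(-delay_s / t2_s)

# Hypothetical relaxation times (assumptions, for illustration only)
T2_WATER = 0.200    # s
T2_PROTEIN = 0.020  # s

# The two filter delays recommended in the text: 40 and 60 ms
for delay in (0.040, 0.060):
    water = echo_survival(delay, T2_WATER)
    protein = echo_survival(delay, T2_PROTEIN)
    print(f"delay {delay * 1e3:.0f} ms: water {water:.3f}, "
          f"protein {protein:.3f}, ratio {water / protein:.1f}")
```

Comparing the two delays in this way mirrors the recommended check: a longer delay improves the water-to-protein ratio, at the price of some loss of water magnetization.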
Much longer filter delays may result in substantial loss of water magnetization, since the effective relaxation time of water protons in solutions of solutes with exchangeable protons can easily be shorter than 200 ms due to exchange broadening.
If isotope-labeled proteins are available, the selective excitation problem can be overcome by purging the signals of the protein after semiselective water excitation. Purging of the magnetization of 13C-bound protons is particularly efficient, since the one-bond C–H coupling constant is large and of very similar size for different CH groups. The experiment of Fig. 4E (Grzesiek and Bax, 1993a, 1993b) uses a short spin-echo sequence of selective pulses on the water resonance. The delay is chosen so that any magnetization excited by the first selective pulse precesses during an effective delay of 1/(2J), where J is the one-bond coupling constant, at which time a 90° pulse converts the antiphase magnetization into unobservable two-spin coherence. Residual in-phase magnetization defocuses again during the second delay into antiphase magnetization which is also converted into unobservable two-spin coherence by the second 90° pulse. If water–protein NOEs with 13C-bound protons are to be observed, the constant-time HSQC sequence with water-flipback following the NOESY
mixing time is an efficient way of measuring the NOEs in a two-dimensional spectrum. The selective 90° water-flipback pulse (the seventh proton pulse in the pulse sequence) is phase-cycled together with the phase of the first selective 90° excitation pulse to align the water magnetization along the same axis in each scan. The following pulses effectively rotate the water magnetization back to the positive z-axis before the detection period. The magnetization of solute protons not bound to 13C is not removed by the pulses at the beginning of the pulse sequence. In this case, it was proposed to distinguish between intramolecular NOEs and intermolecular water–solute interactions by a control experiment identical to that of Fig. 4E, except that it is preceded by selective water irradiation during the interscan relaxation delay until 200 ms before the first selective 90° pulse (Grzesiek et al., 1994). Water magnetization is removed by the selective water irradiation and does not recover very much during the following 200-ms delay. In contrast, the magnetization of the solute is either not affected by the very selective water preirradiation or it is largely replenished by intraprotein NOEs during the 200-ms delay. Therefore, water–solute NOEs are strongly suppressed in the control experiment, whereas intrasolute NOEs are much less affected.
When the solute is 100% labeled with both 13C and 15N, the scalar coupling evolution of the protons under the large one-bond couplings to 13C and 15N can be used to purge the magnetization from the protons bound to 13C and 15N, thus selectively retaining the water magnetization. This principle was implemented in the HMQC experiment of Fig. 4F, which was designed for the observation of water–amide proton cross peaks (Gemmecker et al., 1993). After a nonselective 90° excitation pulse, the magnetization evolves under scalar couplings to 13C and 15N. The 90° pulses on the two heteronuclei, applied after delays matched to the respective coupling constants, turn antiphase magnetization into unobservable two-spin coherence. The filters are applied twice with slightly different delays to improve the purging quality for different coupling constants. The original experiment used neither water-flipback nor any special precautions to suppress radiation damping throughout the entire mixing time. Furthermore, the magnetization of hydroxyl and sulfhydryl protons is not filtered out, even if their chemical shifts are resolved from the water resonance (Knauf et al., 1996).
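The benefit of applying a heteronuclear purge filter twice with slightly different delays, as described above, can be sketched numerically. This is a minimal illustration under stated assumptions: each filter stage is taken to leave a residual in-phase fraction cos(πJτ) with τ = 1/(2J_tuned), relaxation and pulse imperfections are ignored, and the coupling range (125 to 160 Hz) and tuning values are hypothetical examples, not parameters from the cited experiments.

```python
import math

def residual_in_phase(j_hz: float, j_tuned_hz: float) -> float:
    """In-phase fraction surviving one purge stage whose delay
    tau = 1/(2*J_tuned) is tuned to the coupling constant j_tuned_hz."""
    tau = 1.0 / (2.0 * j_tuned_hz)
    return math.cos(math.pi * j_hz * tau)

def double_filter(j_hz: float, j1_hz: float, j2_hz: float) -> float:
    """Residual after two consecutive stages tuned to j1_hz and j2_hz."""
    return residual_in_phase(j_hz, j1_hz) * residual_in_phase(j_hz, j2_hz)

# Hypothetical range of one-bond coupling constants (assumption)
couplings = range(125, 161)

worst_single = max(abs(residual_in_phase(j, 140.0)) for j in couplings)
worst_double = max(abs(double_filter(j, 130.0, 150.0)) for j in couplings)

print(f"worst residual, single stage tuned to 140 Hz: {worst_single:.3f}")
print(f"worst residual, two stages tuned to 130 and 150 Hz: {worst_double:.3f}")
```

In this idealized picture the two-stage filter has two nulls across the coupling range instead of one, which is why its worst-case leakage is smaller.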
The experiment of Fig. 4G makes use of the radiation damping effect itself to achieve a long, selective 90° pulse of the water (Otting and Liepinsh, 1995b). A 180° inversion pulse is followed by a gradient to remove any residual transverse magnetization. A long selective pulse of a very small nominal flip angle is used to generate a small amount of transverse magnetization, triggering radiation damping. Once the water magnetization passes through the transverse plane, it is picked up by the following 90° pulse and converted into longitudinal magnetization. The magnetization of the solute is not affected by the radiation damping unless the
signals are very close to the water resonance. In the original sequence, a train of homospoil pulses was used to suppress radiation damping during the NOESY mixing time and the water resonance was suppressed by a spin-lock purge pulse. The radiation damping field generating the selective 90° water pulse is similar to
that of a half-Gaussian pulse, which is about as selective as a Gaussian pulse (Friedrich et al., 1987). By varying the intensity of the selective pulse, the duration of the selective water excitation can be adjusted even on probeheads with a modest quality factor, where radiation damping alone would produce unacceptably long pulse durations. On a 600-MHz NMR spectrometer, radiation damping produces a 90° flip angle within about 25 ms.
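The way radiation damping turns inverted water magnetization through the transverse plane can be sketched with the classical on-resonance solution, in which the tilt angle θ from the positive z-axis follows tan(θ/2) = tan(θ0/2)·exp(−t/T_rd). This is a simplified sketch that ignores relaxation, field inhomogeneity, and the weak driving pulse; the radiation damping time constant T_rd and the initial tilt angles are hypothetical values chosen for illustration.

```python
import math

def angle_after(t_s: float, theta0_deg: float, t_rd_s: float) -> float:
    """Tilt angle (degrees from +z) after time t_s under radiation damping alone."""
    half = math.tan(math.radians(theta0_deg) / 2.0) * math.exp(-t_s / t_rd_s)
    return math.degrees(2.0 * math.atan(half))

def time_to_reach(theta_deg: float, theta0_deg: float, t_rd_s: float) -> float:
    """Time needed to rotate from theta0_deg to theta_deg (theta_deg < theta0_deg)."""
    start = math.tan(math.radians(theta0_deg) / 2.0)
    end = math.tan(math.radians(theta_deg) / 2.0)
    return t_rd_s * math.log(start / end)

T_RD = 0.005  # s, hypothetical radiation damping time constant (assumption)

# Water inverted and then tipped slightly away from -z by a small flip-angle pulse
for theta0 in (179.0, 179.9):
    t90 = time_to_reach(90.0, theta0, T_RD)
    print(f"start at {theta0} deg: transverse plane reached after {t90 * 1e3:.1f} ms")
```

The sketch shows the qualitative point made in the text: a larger initial transverse component (a more intense small flip-angle pulse) shortens the time needed to reach the transverse plane.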
The experiment of Fig. 4H (Wider et al., 1996) uses the same principle in a difference experiment. In the first experiment, the nonselective 90° excitation pulse is followed by a PFG to destroy all transverse magnetization. In the second experiment, the PFG is applied only after some delay which allows for nearly complete return of the water magnetization to the positive z-axis. The solute magnetization, which is not affected by radiation damping, remains transverse until the PFG in either experiment and is therefore subtracted when the difference between both experiments is calculated. Only every second experiment contributes to the desired signal in the difference experiment, leading to a twofold reduction in sensitivity. The radiation damping field generating the selective 90° water pulse is similar to that of a time-reversed half-Gaussian pulse.
The experiment of Fig. 4I is most similar to that of Fig. 4A. Instead of a continuous selective 90° excitation pulse, a time-shared 90° pulse is used with short free precession delays between the individual pulse segments as in DANTE-type (Morris and Freeman, 1978) excitation (Otting and Liepinsh, 1995e). The quality
factor of the rf coil is switched high during the pulses and low during the delays. In this way, radiation damping is suppressed during the delays. Each pulse segment of the excitation pulse is more intense than the corresponding segment of a
continuous pulse of the same duration, because the overall integral of the pulse must be the same for a 90° flip angle. Therefore, the radiation damping field is more easily overcome during short pulse elements. In practice, selective 90° Gaussian pulses of 50 ms duration can be achieved in this way without significant loss of
water magnetization. The excitation sidebands produced by the DANTE-type excitation are placed outside the spectral width by setting the free precession delays shorter than the dwell time. Switching of the quality factor of the probehead requires special hardware, by which the coil can be connected to electrical ground via a rapid switch. Switch designs are available that hardly affect the sensitivity of the probehead (Anklin et al., 1995).
4.2.2. Selective Water Excitation by a 180° Pulse
The simplest conceivable NOE experiment using a selective 180° pulse for water excitation is the difference experiment sketched in Fig. 5A, where an experiment with the 180° pulse on the water resonance is subtracted from an experiment where this 180° pulse is either absent or applied outside the spectral range of interest. Although it is a difference experiment, all scans contribute to the water–solute cross peaks, retaining the full sensitivity. As discussed before, the simple scheme does not allow for a very selective pulse in the presence of radiation damping. Nonetheless, the scheme has been used for selective water excitation with
pulse durations of up to 50 ms (Kriwacki et al., 1993). As in the experiment of Fig. 4D, the use of a diffusion filter has been proposed to help distinguish direct water–solute NOEs from intrasolute NOEs, which are less strongly affected by diffusion during the delay (Fig. 5B) (Kriwacki et al., 1993).
A different scheme for a long, water-selective 180° pulse is presented by the experiment of Fig. 5C. The experiment is a difference experiment, where the selective 180° pulse is composed of a DANTE-type series of small flip-angle pulses interleaved by short free precession delays (Böckmann and Guittet, 1996). Short bipolar gradients ( , 1995; Zhang et al., 1996) are applied during the delays to suppress radiation damping. In the second part of the difference experiment, the phase of the small flip-angle pulses is reversed in the second half of the selective excitation pulse, leading to an effective 0° flip angle for the water magnetization. The following NOE mixing time starts with a PFG to support the selection of longitudinal magnetization, followed by a weak gradient throughout the mixing time to prevent radiation damping. Each bipolar gradient first defocuses and then refocuses the water magnetization. It has been shown that weak bipolar gradients of as little as 0.2 G/cm are sufficient to suppress radiation damping during the evolution time of a two-dimensional experiment ( , 1995). In the scheme of Fig. 5C, the free precession delays and thus the PFGs must be of the order of the dwell time or shorter to exclude the appearance of excitation sidebands in the spectrum. To achieve significant defocusing during
the short delay, each individual PFG must be relatively intense, yet sufficiently weak to avoid troubles from eddy currents. Figure 5D presents a scheme, where radiation damping is used to achieve a near-180° rotation of the water magnetization (Otting and Liepinsh, 1995b). Like
the scheme of Fig. 5A, the experiment is a difference experiment. Following a nonselective 160° pulse, a series of homospoil pulses or PFGs is applied in one experiment but not in the other. With the homospoil pulses, any transverse magnetization is defocused and radiation damping is suppressed. Without the homospoil pulses, the transverse component of the water magnetization triggers radiation damping, which turns the water magnetization back to the positive z-axis, while the magnetization of the solute remains unaffected as long as it precesses with different frequencies than the water magnetization. The effective field produced by the radiation damping resembles a Gaussian pulse. The experimental scheme of Fig. 5D yields optimum sensitivity, since almost no water magnetization is lost during the radiation damping process. In contrast, the water signal intensity observed after a selective radio-frequency pulse is always somewhat less than that observed after a nonselective pulse, mostly due to relaxation. A drawback of the excitation scheme of Fig. 5D is the poor definition of the mixing time, since the water magnetization is not longitudinal during the entire mixing time in half of the scans. As in all difference experiments based on selective 180° pulses, the water–solute NOE building up during the selective excitation scheme is not completely subtracted in the difference experiment, which may become noticeable when the excitation scheme is followed by a short ROE mixing time. Finally, it has been noted that difference experiments based on selective 180° inversion pulses tend to suffer from subtraction artifacts (Otting and Liepinsh, 1995b; Mori et al., 1996a), perhaps because of dipolar field effects (see below).
The experimental schemes of Fig. 5E–G use a selective 180° refocusing pulse in the middle of a spin-echo period, during which the water magnetization is transverse.
Radiation damping is suppressed by defocusing the magnetization by a PFG applied before the selective refocusing pulse. Thus, long, selective pulses can be used without interference from radiation damping. In these schemes, magnetization transfer between water and solute during the selective excitation scheme does not result in a net magnetization transfer; i.e., the NOE or ROE mixing times in these experiments are well defined and given by the nominal mixing delay. It must be remembered, though, that the exchange of protons between water and solute can lead to rather short effective relaxation times of the water magnetization. Furthermore, care must be taken to adjust the phase of the selective 180° refocusing pulse. If it is phase-shifted by 45° relative to the phase of the hard pulses, no longitudinal magnetization is generated at the start of the mixing time. The experiment of Fig. 5E uses PFGs of opposite polarity on either side of the selective 180° pulse; i.e., the second PFG defocuses the water magnetization even
further (Dalvit, 1995; Dalvit and Hommel, 1995a). Thus, only half of the water magnetization is longitudinal during the subsequent NOE mixing time, resulting in twofold reduced sensitivity. The magnetization transferred to the solute is refocused during the Watergate scheme, which also contains PFGs of opposite polarity. The
PFGs in the Watergate sequence must be of different strength to avoid undesired echo effects.
The experiment of Fig. 5F is another variant of the experiment of Fig. 5E, where the water magnetization is refocused by the PFG after the selective 180° pulse, so that all water magnetization is longitudinal during the mixing time and full sensitivity is retained (Dalvit and Hommel, 1995b). Radiation damping is suppressed during the mixing time by a weak, continuous gradient. The mixing time ends with the combination of a selective 90° pulse and a nonselective 90° pulse, which together return the water magnetization to the positive z-axis. The following conventional Watergate sequence effectively does not excite the water resonance. Thus, high sensitivity is retained in this experiment even if the repetition rate is fast compared to the relaxation time of the water. The excitation schemes of Figs. 5E and 5F have also been implemented in off-resonance ROESY experiments for the detection of exchange cross peaks with water (Birlirakis et al., 1996).
The experiment of Fig. 5G (Wider et al., 1996) relies on a diffusion filter to separate the magnetization of the water and the solute. The selective 180° refocusing pulse is relatively short (4.1 ms) and therefore of little selectivity. The selectivity of this pulse is, however, not very important, since the water signal is selected based on the different diffusion rates of water and solute rather than frequency. The experiment is a difference experiment. In the first experiment, all magnetization
excited by the initial nonselective 90° excitation pulse is defocused by the following PFG. Only the magnetization refocused by the following selective 180° pulse is refocused by the subsequent PFG. Only little water magnetization is refocused, however, because the PFGs are applied with very high amplitude (i.e., 115 G/cm), leading to efficient suppression of the water magnetization by diffusion. In the second experiment, the first pair of PFGs is applied with weak amplitude (i.e., 10 G/cm) so that radiation damping is suppressed but magnetization losses by diffusion are unimportant. The difference between both experiments yields the cross peaks
with the water resonance and suppresses the intrasolute cross peaks between nonlabile or slowly exchanging protons. Since the diffusion of the solute during the excitation scheme also affects the solute magnetization, the total gradient power in each of the experiments is kept constant; that is, weak PFGs are used during the
Watergate sequence, if strong PFGs were used during the excitation scheme, and vice versa. In this way, the Watergate sequence acts as a diffusion filter like the excitation scheme. For comparable diffusion filtering effects, the duration of the
excitation scheme is the same as the duration of the Watergate sequence. The advantage of the experiment is the suppression of intrasolute NOEs even if the solute’s resonances are at exactly the same chemical shift as the water. As in all
other experiments of Figs. 4 and 5, however, exchange-relayed NOEs (Fig. 2B) are not suppressed. A disadvantage is the twofold loss in sensitivity, since water magnetization is retained only in every second experiment. Furthermore, the
experiment is prone to eddy current artifacts from the strong gradients. Finally, the
duration of the Watergate sequence is relatively long to match the duration of the excitation sequence.
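The diffusion filter of Fig. 5G can be illustrated with the standard Stejskal–Tanner attenuation factor for a pair of rectangular gradient pulses, exp[−γ²G²δ²(Δ − δ/3)D]. The following is a minimal sketch: the gradient amplitudes (115 and 10 G/cm) are those quoted in the text, whereas the gradient duration δ, the gradient separation Δ, and both diffusion coefficients are hypothetical values assumed here purely for illustration.

```python
import math

GAMMA_H = 2.675e8  # proton gyromagnetic ratio, rad s^-1 T^-1

def diffusion_attenuation(g_gauss_per_cm: float, delta_s: float,
                          big_delta_s: float, d_m2_per_s: float) -> float:
    """Stejskal-Tanner attenuation for a pair of rectangular gradient pulses."""
    g_t_per_m = g_gauss_per_cm * 1.0e-2  # 1 G/cm = 0.01 T/m
    b = (GAMMA_H * g_t_per_m * delta_s) ** 2 * (big_delta_s - delta_s / 3.0)
    return math.exp(-b * d_m2_per_s)

# Hypothetical timing and diffusion coefficients (assumptions)
DELTA = 2.0e-3       # s, gradient pulse duration
BIG_DELTA = 6.0e-3   # s, separation of the two gradient pulses
D_WATER = 2.3e-9     # m^2/s, approximate value for water near room temperature
D_PROTEIN = 1.0e-10  # m^2/s, order of magnitude assumed for a small protein

for g in (115.0, 10.0):  # gradient amplitudes quoted in the text, G/cm
    water = diffusion_attenuation(g, DELTA, BIG_DELTA, D_WATER)
    protein = diffusion_attenuation(g, DELTA, BIG_DELTA, D_PROTEIN)
    print(f"{g:5.0f} G/cm: water retained {water:.3f}, protein retained {protein:.3f}")
```

Under these assumptions the strong gradients remove nearly all water magnetization while leaving most of the slowly diffusing solute magnetization, and the weak gradients leave both almost untouched, which is the contrast the difference experiment relies on.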
4.3. Nonselective Experiments
Nonselective experiments have the advantage that spectral artifacts are readily identified, whereas they would appear as subtraction artifacts in experiments using selective water excitation. Furthermore, selective pulse shapes tend to produce negative excitation sidelobes (Hajduk et al., 1993), requiring special care in later spectral analysis. Otherwise, NOEs from solute protons excited with negative sign could easily be interpreted as negative NOESY cross peaks with the water.
Nonselective experiments of higher dimensionality, however, tend to be less sensitive than selective experiments. Since quadrature detection in the indirect dimension requires that the phase of the first pulse be incremented in steps of 90°, the water magnetization cannot be channeled into longitudinal magnetization during the NOE mixing time for all FIDs as in the analogous experiments of lower dimensionality which employ selective water-excitation schemes. Thus, water-flipback schemes cannot readily be implemented. Much of the sensitivity lost in experiments scrambling the water magnetization can, however, be recovered by the use of relaxation reagents which shorten the relaxation time of the water and therefore allow faster repetition rates (Otting and Liepinsh, 1995c). One of these is Gd-diethylenetriamine pentaacetic acid-bismethylamide [Gd(DTPA-BMA)], a nonionic relaxation reagent which is routinely used in MR imaging to shorten the relaxation time of water protons. Gd(DTPA-BMA) has been shown not to bind to plasma proteins and is effective at submillimolar concentrations.
The pulse sequence of Fig. 6A shows how the water magnetization can be steered into reproducible positions during the mixing time of a NOESY experiment. The 90° pulse preceding the mixing time is phase-shifted by 45° with respect to the first 90° excitation pulse of the pulse sequence (Driscoll et al., 1989). With the carrier at the water frequency, half the water magnetization becomes longitudinal at the start of the mixing time, while the other half becomes transverse, independent of whether the first 90° excitation pulse is applied along the x- or y-axis. In this way, the amount of water magnetization that needs to be suppressed is the same for every scan. The transverse magnetization is destroyed by the strong PFG at the start of the mixing time. Radiation damping during the rest of the mixing time is suppressed by a long, weaker gradient, and the remaining water magnetization is suppressed by some water suppression scheme, e.g., a spin-lock purge pulse or a Watergate sequence. Radiation damping during the evolution time would lead to broadening of the water signal in the indirect dimension, but can be suppressed by the use of a bipolar gradient, by which the water magnetization is first defocused and then refocused ( , 1995). Alternatively, if PFGs are not available, a Q-switch (Anklin et al., 1995) or spin-lock pulses before the first 90° pulse and after the second 90° pulse (Otting, 1994) can be used for the same purpose.
Three-dimensional experiments for the observation of water–solute NOEs are straightforward extensions of the corresponding two-dimensional experiments. Only a few illustrative examples are discussed here. In three-dimensional experiments, water magnetization can be suppressed either after the first or second mixing time.
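The effect of the 45° phase shift used in the sequence of Fig. 6A can be checked with a small vector-model calculation. This is a minimal sketch assuming ideal 90° pulses, on-resonance water, and no evolution, relaxation, or radiation damping between the two pulses; the function name and sign conventions are choices made here for illustration. In this idealized picture the longitudinal and transverse components after the second pulse are equal in magnitude (each 1/√2 of the equilibrium magnetization), regardless of the phase of the first pulse.

```python
import math

def rotate(m, axis_phase_deg, flip_deg):
    """Rotate vector m about an axis lying in the xy-plane at the given phase
    (0 deg = x, 90 deg = y), using the Rodrigues rotation formula."""
    phi = math.radians(axis_phase_deg)
    beta = math.radians(flip_deg)
    n = (math.cos(phi), math.sin(phi), 0.0)
    dot = sum(a * b for a, b in zip(n, m))
    cross = (n[1] * m[2] - n[2] * m[1],
             n[2] * m[0] - n[0] * m[2],
             n[0] * m[1] - n[1] * m[0])
    c, s = math.cos(beta), math.sin(beta)
    return tuple(m[i] * c + cross[i] * s + n[i] * dot * (1.0 - c) for i in range(3))

EQUILIBRIUM = (0.0, 0.0, 1.0)

for second_phase in (45.0, 0.0):      # with and without the 45 deg phase shift
    for first_phase in (0.0, 90.0):   # first pulse along x or along y
        m = rotate(rotate(EQUILIBRIUM, first_phase, 90.0), second_phase, 90.0)
        print(f"first pulse phase {first_phase:4.0f}, second pulse phase "
              f"{second_phase:4.0f}: |Mz| = {abs(m[2]):.3f}")
```

Without the phase shift, the longitudinal water magnetization alternates between full and zero as the first pulse phase is stepped, which is the scan-to-scan variation the 45° shift removes.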
Figure 6B shows a pulse sequence for a 3D NOESY–TOCSY experiment, where transverse water magnetization during the first mixing time is suppressed by a PFG, and longitudinal water magnetization is suppressed by a sequence in which the free precession interval before the spin-lock purge pulse SL introduces a sine-shaped excitation profile in one frequency dimension (Otting et al., 1991b). Alternatively, the water suppression scheme can be implemented right before the detection period, placing the nonuniform excitation profile in the directly detected dimension (Holak et al., 1992).
The hydration of isotope-labeled solutes is conveniently studied by 3D NOESY–HSQC experiments. HSQC experiments are not only very sensitive, but also offer simple ways of combining various water suppression schemes with the delays already present in the pulse sequence. For example, water suppression by spin-lock pulses can be incorporated into the first INEPT step of the HSQC sequence, as illustrated by the experiment of Fig. 6C (Messerle et al., 1989). With the carrier at the water frequency, the magnetization of the protons bound to the heteronucleus precesses during the INEPT delay by 90°, while the water magnetization stays aligned along the y-axis and is defocused by the spin-lock purge pulse. Since one-bond coupling constants are very similar for different groups, the heteronuclear coherence is hardly affected by the spin-lock purge pulse, resulting in a uniform excitation profile in all dimensions. The
Watergate sequence is implemented with similar ease in the reverse INEPT step of a 3D NOESY–HSQC experiment (Fig. 6D) ( et al., 1993).
Isotope-labeled samples offer the additional option to use gradients for coherence selection with the possibility to totally remove the residual water magnetization (Hurd, 1991). For example, a pair of PFGs of opposite polarity around a 180° pulse can be used to defocus the magnetization of the heteronuclear spins without dephasing the proton magnetization. The coherences of interest are refocused by a corresponding gradient applied to the proton magnetization immediately before detection (Fig. 6E). In the implementation of Fig. 6E, some sensitivity is lost by the use of gradients in an echo–antiecho mode (Keeler et al., 1994). Using an HSQC sequence with sensitivity enhancement, up to twofold better sensitivity can be obtained (Kay et al., 1992).
4.4. Dipolar Field Effects
The effective magnetic field experienced by solute and water spins depends also on the orientation of the water magnetization with respect to the main magnetic field. Thus, solute signals appear shifted by about 1 Hz, depending on whether the
bulk magnetization of the water is parallel or antiparallel to the main magnetic field (Edzes, 1990). The effect is present locally, too, i.e., if the water magnetization is parallel to the main magnetic field in some areas of the sample and antiparallel in others. Such inhomogeneous magnetization patterns arise when the magnetization is defocused by a PFG and partially converted into longitudinal magnetization by a following 90° pulse (Bowtell, 1992). The field shift can lead to subtraction artifacts in difference experiments and impure line shapes (Kubinec et al., 1996). It can be shown to cancel when PFGs are applied along the magic angle (54.7°) with respect to the main magnetic field (Warren et al., 1993). Both classical and quantum-mechanical treatments are available for the quantitative description of this so-called dipolar field or demagnetization field effect (Broekert et al., 1996; Levitt, 1996; Richter et al., 1995).
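The magic-angle value quoted above can be reproduced numerically. The sketch below assumes that the dipolar field of a magnetization pattern modulated along a direction at angle θ to the main field scales with the angular factor (3cos²θ − 1)/2; this factor is not stated in the text and is introduced here as an assumption, and the function name is illustrative.

```python
import math

def dipolar_scaling(theta_deg: float) -> float:
    """Angular factor (3*cos^2(theta) - 1)/2 for a gradient direction at
    angle theta (degrees) to the main magnetic field."""
    c = math.cos(math.radians(theta_deg))
    return 0.5 * (3.0 * c * c - 1.0)

# The factor vanishes where cos^2(theta) = 1/3
magic_angle_deg = math.degrees(math.acos(1.0 / math.sqrt(3.0)))

print(f"magic angle: {magic_angle_deg:.1f} deg")
for theta in (0.0, magic_angle_deg, 90.0):
    print(f"theta = {theta:5.1f} deg: scaling factor {dipolar_scaling(theta):+.3f}")
```

A gradient along the main field (θ = 0°) gives the full effect, a transverse gradient (θ = 90°) gives half the effect with opposite sign, and the magic-angle gradient gives none.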
5. APPLICATIONS
5.1. Studies of Protein Hydration
After initial reports on intermolecular water–peptide NOEs observed in 1D NOE difference experiments with angiotensin II (Pitner et al., 1974) and oxytocin (Glickson et al., 1976), hydration studies by intermolecular NOEs do not seem to have been pursued any further, perhaps because of the limited sensitivity of the NMR instrumentation or the difficulty in suppressing subtraction artifacts in the 1D NOE difference experiments.
The use of intermolecular NOEs for the identification of individual hydration water molecules in proteins was first demonstrated with bovine pancreatic trypsin
inhibitor (BPTI) (Otting and Wüthrich, 1989). This study used NOESY and ROESY spectra to distinguish between chemical exchange and NOE or exchange-relayed NOEs. The cross peaks that could be assigned to intermolecular water–BPTI NOEs could all be explained by NOEs with the four internal hydration water molecules buried in the interior of BPTI, which had previously been identified by X-ray crystallography in all single-crystal structures of BPTI. The cross peaks were positive in NOESY and their intensities comparable to intraprotein cross peaks. It was noted that all water protons and most hydroxyl protons appeared at the chemical shift of the bulk water. Later, the exchange between hydration water and bulk water was formally verified by adding a paramagnetic shift reagent, which shifts the frequency of the bulk water signal (Otting et al., 1991c). The experiment showed that the NOEs with hydration water molecules were shifted together with the bulk water signal.
BPTI was further used to develop homonuclear 3D NMR experiments for the study of protein hydration by intermolecular water–protein NOEs (Otting et al.,
1991b; Holak et al., 1992). The improved resolution in these experiments allowed the assignment of many more cross peaks. Negative NOESY cross peaks were observed for surface protons of BPTI, indicating that diffusion of the hydration water molecules on the protein surface is hardly hindered. A control experiment performed with a 50 mM solution of oxytocin at 8°C showed that negative water–peptide NOESY cross peaks can be observed for all protons (Otting et al., 1991a). At 8°C, all intrapeptide cross peaks were positive in NOESY. On lowering the temperature to –25°C (with the addition of 40% acetone to prevent freezing), the sign of the water–oxytocin NOESY cross peaks turned positive, indicating longer water residence times at the very low temperature (Otting et al., 1992).
As a side result of the hydration studies, exchange cross peaks were observed between water and the hydroxyl protons of BPTI at low temperatures. Their exchange rates were subsequently measured at 4°C as a function of pH (Liepinsh et al., 1992a). This study was later complemented by measurements of the proton exchange rates of the labile side-chain protons of lysine, arginine, threonine, serine, and tyrosine in the free amino acids in the temperature range 4–36°C and as a function of pH (Liepinsh and Otting, 1996). It was also shown that carboxyl protons of solvent-exposed side chains are not readily detected by water–polypeptide NOEs (Liepinsh et al., 1993). BPTI was also used as an example for a comparative hydration study of a protein with and without the presence of 200 mM
using a modified NOE–TOCSY sequence with selective excitation by radiation damping (Fig. 5D). Unfortunately, the presence of artifacts and impure phases of the cross peaks interfered with a detailed spectral analysis (Böckmann and Guittet, 1995). The same excitation
scheme worked well for the same authors in a proton exchange study (Böckmann et al., 1996).
An early protein hydration study by water–protein NOEs was reported by Clore et al. (1990). The experiment used was a 3D ROESY HMQC using a spin-lock purge pulse for water suppression. Fifteen water–protein NOEs were identified and interpreted in terms of 11 water molecules previously detected in the single-crystal structure. Although no NOESY experiment was performed, residence times were attributed to the detected hydration water molecules based on Eq. (11) and on the fact that their NOEs were sufficiently intense for detection. The same protein was used in a later study (Ernst et al., 1995) to detect water–protein NOEs with methyl groups in a WNOESY experiment (Fig. 4E) (Grzesiek and Bax, 1993a). NOEs were detected with methyl protons lining a hydrophobic cavity in the interior of the protein, although no water molecules had been located in this cavity in any of the crystal structures. It was argued that the lack of hydrogen-bonding partners in the cavity wall could lead to a delocalization of the hydration water molecules, which would make their observation difficult by X-ray crystallography.
In a hydration study of reduced human thioredoxin, four hydration water molecules were detected by six water–protein NOEs with the amide protons in a 3D ROESY HMQC experiment (Forman-Kay et al., 1991; Clore et al., 1990). A structure calculation was performed using these NOE distance constraints supplemented by H-bond restraints with nearby carbonyl oxygens and lower-limit distance constraints for amide protons, for which no intermolecular NOE had been observed. Only those two water molecules which were characterized by two NOEs each were located at unique sites in the protein structure. Their orientation appeared disordered. The 3D ROESY HMQC experiment (Clore et al., 1990) was further used
to study the hydration of the immunoglobulin binding domain of streptococcal protein G (Clore and Gronenborn, 1992). Two solvent-exposed water molecules were identified by three NOEs with amide protons and their binding to the protein modeled with bifurcated hydrogen bonds. A structure computation including internal water molecules was further performed with an FK506-binding protein–ascomycin complex (Meadows et al., 1993; Xu et al., 1993). The protein was and 11 water–protein NOEs were detected in 3D ROESY HMQC (Clore et al., 1990) and 3D NOESY HMQC experiments using a spin-lock purge pulse for water suppression. The NOEs defined three internal water molecules at 30°C. The NOE distance constraints were supplemented by 18 hydrogen-bond constraints based on the crystal structure. The resulting structures were reported to be better defined in the vicinity of the water molecules, when the water molecules were explicitly included in the structure calculation. The same three internal water molecules were detected
Gottfried Otting
in a later study using FK506-binding protein with the PHOGSY pulse sequence (Fig. 5E) (Dalvit and Hommel, 1995a). The possibility of detecting hydration water molecules at the interface between a DNA-binding protein and DNA by intermolecular water–protein NOEs was demonstrated for a complex between an Antennapedia homeodomain mutant and a 14-base-pair DNA duplex (Qian et al., 1993). The 3D NOESY
and spectra with water suppression by spin-lock purge pulses (Fig. 6C) (Messerle et al., 1989) were recorded with samples of the complex containing protein. Three intermolecular water–protein NOEs
were identified. The experiment (WNOESY, Fig. 4E) was first demonstrated with a complex between calmodulin and an unlabeled 13-residue peptide, where intermolecular water–protein cross peaks were observed with numerous methyl groups (Grzesiek and Bax, 1993a). The same technique was used to quantify the magnetization exchange rates between water and protein protons in a sample of calcineurin B (Grzesiek and Bax, 1993b). In the absence of a three-dimensional structure, however, direct water–protein NOEs could not be distinguished from exchange-relayed NOEs.
The WNOESY and WROESY experiments (Fig. 4E) (Grzesiek and Bax, 1993a) were also used for the detection of intermolecular water–protein cross peaks
with GATA-1 in complex with a 16-base-pair DNA duplex, for which 20 direct water–protein NOEs were reported (Clore et al., 1994). Only eight NOEs were detected in the WNOESY experiments (recorded with NOE mixing times of 60 and 100 ms), one of them with the same sign as in the WROESY experiments (which were recorded with 60-ms mixing time). NOEs present in the WROESY and absent from the WNOESY experiment were ascribed to water molecules with residence times of 200–300 ps. Curiously, numerous water–protein
NOEs were observed with solvent-exposed methyl groups with good intensities in the WROESY experiment. Usually, the water–protein NOEs with solvent-exposed methyl groups yield cross peaks of the same sign and similar intensity in NOESY and ROESY (e.g., Otting et al., 1991a; Kubinec and Wemmer, 1992b; Liepinsh et al., 1992b; Radhakrishnan and Patel, 1994a). WNOESY and WROESY experiments (Grzesiek and Bax, 1993a) were further used to detect buried water molecules in the catalytic domain of stromelysin-1 complexed with a small inhibitor (Gooley et al., 1996). Seven water–protein NOEs were reported, giving evidence for three water molecules which had also been detected in the crystal structure by X-ray crystallography. A homonuclear hydration study of horse heart ferrocytochrome c and ferricytochrome c using 2D NOE–TOCSY and ROE–TOCSY experiments with selective water excitation by a simple, sine-shaped 90° pulse and water suppression by spin-lock purge pulses reported five (six) hydration water molecules in the interior of ferri(o)cytochrome c, one of which changed position between the different
oxidation states (Qi et al., 1994). Thirty-four NOEs defined six water molecules. Two of these had not been detected by X-ray crystallography. A water molecule was detected at the interface between HIV-1 protease and a chemically synthesized inhibitor by one NOE with an amide proton (Grzesiek et al., 1994). The assignment was based on the crystal structure. A different inhibitor, designed to replace this water molecule, was shown to abolish the intermolecular water–amide proton NOE. The water–amide proton cross-relaxation rate was quantitatively measured using WNOESY and WROESY experiments (Fig. 4E) (Grzesiek and Bax, 1993a) and found to match the internuclear distance measured in the single crystal. Hence, a residence time longer than the rotational correlation time of the complex (9 ns) was attributed to this water molecule. Corresponding cross-relaxation rate measurements were performed later to characterize the hydration of HIV-1 protease in complex with the inhibitor KNI-272 (Wang et al., 1996b) and of HIV-1 protease in complex with DMP323 (Wang et al., 1996a). Four to six water molecules with residence times ns were reported for the complex with KNI-272, but only one to three such water molecules were found at the inhibitor binding site in the complex formed with DMP323. The quantitative measurement of intermolecular water–peptide NOEs in the
turn-forming peptide SYPYD demonstrated differential solvation of the proline residue under conditions of cis and trans prolyl peptide bonds and 1.8/30°C, respectively) (Yao et al., 1994). Two-dimensional ROESY and NOESY experiments were used with spin-lock purge pulses for water suppression (Otting et al., 1991b). Reduced intermolecular NOEs were observed in the cis proline form, indicating low solvent accessibility of the proline ring in the turn structure. An NOE study of human dihydrofolate reductase in complex with methotrexate and NADPH revealed six bound water molecules, five of which were also observable in the absence of NADPH (Meiering and Wagner, 1995). The observed water molecules were highly conserved between different crystal structures. It was noted that these water molecules were buried with less than 80% solvent accessibility, had low temperature factors in the crystal structures, and formed at least two hydrogen bonds. Three different mutants of the protein were prepared which removed a hydrogen bond to one of the water molecules (Meiering et al., 1995). Weaker water–protein NOEs were subsequently observed for this water molecule, possibly because of a shortened residence time. The experiments used were 3D (Clore et al., 1990) and 3D (Messerle et al., 1989); corresponding two-dimensional spectra using a 10-ms hyperbolic secant 90° pulse for water excitation were also recorded. A homonuclear hydration study of a ribonuclease C-peptide analog showed negative NOESY cross peaks with the water resonance for all protons of all 13 residues, as far as the signals could be resolved in 2D NOESY and 3D NOESY–TOCSY spectra, although CD spectra indicated 60% (Brüschweiler et
al., 1995). Spin-lock purge pulses (Otting et al., 1991b) were used for water suppression. The e-PHOGSY pulse sequence (Fig. 5F) was demonstrated using hen egg-white lysozyme. The detection of at least three not further specified hydration water molecules was reported (Dalvit, 1996). Like hen egg-white lysozyme contains internal hydrophobic cavities, where no hydration water was detected in the crystal structures, whereas water–protein cross peaks with the protons lining the cavity walls were detected in NOESY and ROESY experiments using spin-lock purge pulses for water suppression (Otting et al., 1997). In contrast to the experiments with interleukin only weak intermolecular NOEs were observed, suggesting partial occupancies of the cavities. Partial occupancy is further suggested by the fact that one of the cavities is so small that only a single water molecule can be accommodated at a time. Thus, the difficulty of observing this water molecule by X-ray crystallography cannot be attributed to a delocalization of the hydration water. Using unlabeled, and samples, the hydration of oxidized flavodoxin from Desulfovibrio vulgaris was studied (Knauf et al., 1996) by way of homonuclear 3D NOESY–TOCSY (Otting et al., 1991b), 3D (Clore et al., 1990), and MEXICO (Gemmecker et al., 1993) experiments. The 3D NOESY–TOCSY experiment used spin-lock purge pulses for water suppression, but was modified by an additional 4-ms water-selective 90° Gaussian pulse at the end of the NOESY mixing period. The pulse was applied with orthogonal phase relative to the following hard 90° pulse.
Its purpose was to improve the water suppression by turning any longitudinal water magnetization present at the end of the mixing time into the transverse plane with a phase so that it was not affected by the following 90° hard pulse, resulting in optimum defocusing by the following spin-lock purge pulse which was applied with orthogonal phase relative to the hard 90° pulse (Otting et al., 1991b; Knauf et al., 1996). Four hydration water molecules were defined by about 10 intermolecular water–protein NOEs, one of which lies in a bridging position between the protein and the ribityl side chain of the FMN ligand. Interestingly, some of the buried hydration water molecules reported by the single-crystal structure seemed to be absent in solution. Finally, four to five intermolecular water–protein NOEs detected in 3D (Messerle et al., 1989), 3D and the corresponding ROESY experiments of E. coli flavodoxin were used for the identification of two to three buried hydration water molecules (Ponstingl and Otting, 1997b).
5.2. Studies of DNA and RNA Hydration
An early attempt to detect intermolecular NOEs between water and DNA by two-dimensional NOESY spectra failed because the imino and amino
protons of the DNA fragment exchanged too rapidly with the water under the conditions chosen (van de Ven et al., 1988). A quantitative analysis of exchange-relayed NOEs showed that all cross peaks observed with the water resonance could be interpreted as exchange-relayed NOEs. The first study demonstrating intermolecular water–DNA NOEs appeared four years later (Kubinec and Wemmer, 1992b). Using spin-lock purge pulses to suppress the water resonance in two-dimensional NOESY and ROESY experiments (Otting et al., 1991b), it was shown that the hydration water in the vicinity of the adenine 2 protons and some of the sugar protons in the minor groove of the
self-complementary DNA fragment has sufficiently long residence times to give rise to positive water–adenine 2H cross peaks in the NOESY spectrum. Negative NOESY cross peaks were reported with the thymidine methyl protons and G12
indicating short water residence times near these protons.
Positive NOESY cross peaks observed with the terminal nucleotides of the DNA duplex were probably falsely attributed to bound hydration water molecules, since the presence of hydroxyl groups at the terminal sugar moieties provides the possibility of exchange-relayed NOEs. The same DNA fragment and pulse sequences were used in a study published
shortly after, detecting the same water molecules of the spine of hydration in the minor groove and negative NOESY cross peaks with thymidine methyl groups,
guanine 8H, and some of the protons (Liepinsh et al., 1992b). Furthermore, the fragment was studied by the same techniques, where positive NOESY cross peaks with the adenine 2 protons of the central part of the duplex indicated the presence of a spine of hydration even there.
The DNA fragment sample, where A5 was selectively labeled with
was again studied later with a at positions 2 and 8 of the base
(Kubinec et al., 1996). The level of tritium labeling was sufficient to observe intermolecular water–proton to DNA–tritium NOEs in a heteronuclear NOESY experiment which was derived from the conventional three-pulse NOESY sequence by replacing the last 90° pulse by a 90° pulse with subsequent tritium detection. Since the water did not contain tritium, the spectrum could be recorded without water suppression. It was noted, however, that the water–proton to tritium cross peaks were mostly dispersive at short mixing times, regaining pure phase at mixing times of 100 ms or longer. It is likely that the phase distortions arose from demagnetization field effects (Edzes, 1990; Bowtell, 1992; Kubinec et al., 1996; Warren et al., 1993; Broekaert et al., 1996; Levitt, 1996; Richter et al., 1995). Using ROESY experiments with a spin-echo water suppression sequence, the water–DNA NOEs with four different phenazine-tethered matched and mismatched DNA duplexes were measured in a study that tried to correlate the intensities of the water–DNA NOEs with imino proton exchange rates and the thermodynamic stabilities of the duplexes (Maltseva et al., 1993). The validity of
the conclusions reached in this study was perhaps compromised by the fact that the
water suppression scheme used (Bax et al., 1987; and Bax, 1987b) had not been designed for the observation of intermolecular NOEs with water, producing markedly unequal amounts of water magnetization with even and uneven FIDs recorded with different phase increments for quadrature detection in the indirect frequency dimension. Furthermore, large exchange cross peaks were observed and the possibility of exchange-relayed NOEs was not convincingly ruled out. NOESY and ROESY spectra recorded of the non-self-complementary duplex using water suppression by spin-lock purge pulses (Otting et al., 1991b) confirmed the presence of a spine of hydration in the minor groove with water residence times above about 1 ns, since positive NOESY cross peaks were observed with several adenine 2 protons near the center of the duplex (Fawthrop et al., 1993). Interestingly, no hydration water molecules were observed at the central A–T step in the crystal structure of a closely
related duplex. The thymidine methyl groups showed negative NOESY cross peaks, as in all B-DNA type duplexes studied to date. The residence times of the water molecules of the spine of hydration in the minor groove were reported to be slightly shorter near the AT base pairs in than in since water–adenine 2H cross peaks were absent from the NOESY spectrum of the former, but positive in the latter DNA fragment, while the corresponding ROESY cross peaks were intense in both fragments (Liepinsh et al., 1994). It was speculated that the different residence times could arise from a different minor groove width. The experiments were two-dimensional NOESY and ROESY experiments using spin-lock pulses for water suppression (Otting et al., 1991b).
A subsequent study of three different DNA fragments containing TTAA and AATT segments showed that positive water–adenine 2H NOESY cross peaks can be observed also with TTAA segments (Jacobson et al., 1996). The experiments used the Q-switched water-selective 90° pulse in two-dimensional NOE–NOESY and ROE–NOESY experiments (Otting and Liepinsh, 1995c), where the water– DNA cross peaks lie on the diagonal and off-diagonal peaks assist with the assignment of the diagonal peaks. It was shown that NOEs could be observed on the diagonal free from interference with the strong exchange cross peaks of the terminal hydroxyl protons which otherwise appear in the spectral region of the resonances. Negative NOESY cross peaks were observed for base protons other than adenine 2H, most of the sugar protons, and all thymidine methyl groups. The 2H resonances of adenines next to GC base pairs also yielded mostly negative NOESY cross peaks. The conclusion of the study was that the residence time of the hydration water in the minor groove of TTAA segments depends on the nucleotide sequence context. The hydration of DNA triplexes and a parallel-stranded DNA duplex has been studied by two-dimensional NOESY and ROESY experiments using spin-lock purge pulses for water suppression (Radhakrishnan and Patel, 1994a, 1994b; Wang
and Patel, 1994). Positive water–DNA NOESY cross peaks were observed at –3 to –4°C for some of the protons lining the grooves in these unusual DNA structures. It was argued that these hydration water molecules could contribute to the conformational stability of the structures by shielding against unfavorable electrostatic interactions. A hydration study of RNA was recently performed with the fragment (Conte et al., 1996). Two-dimensional NOESY and ROESY spectra were recorded using Watergate for water suppression. Weak positive water–RNA cross peaks were observed in the NOESY spectrum with two of the adenine 2H and several protons. Since the minor groove in RNA is wider than in DNA, it was argued that groove width is less important for long water residence times than opportunities for hydrogen-bond formation. The NMR signals of the hydroxyl groups were resolved in the spectra, but gave rise to large exchange cross peaks with the water. It was therefore not trivial to exclude the possibility that the cross peaks with the nonexchangeable minor groove protons originated from exchange-relayed NOEs, in particular since the exchange cross peaks with the hydroxyl protons were about 100 times more intense than the water–RNA cross peaks with the nonexchangeable RNA protons, i.e., of similar
intensity as the diagonal peaks. The argument that only weak cross peaks were observed does not mean that these NOEs are weak, since there was also very little diagonal peak intensity for the resonances due to the rapid exchange with the water during the mixing time.
6. SUMMARY OF THE RESULTS
6.1. Residence Times
By fortuitous coincidence, the sign of the NOE cross-relaxation rate changes for water residence times in the range 0.1–1 ns. Hydration water on protein surfaces and in the minor groove of DNA exhibits residence times exactly in this time range. Thus, NOE measurements provide a tool to distinguish “slow” and “fast” water molecules on this time scale. A second fortuitous coincidence is the fact that water molecules with longer residence times are much easier to detect by water-solute NOEs than rapidly diffusing water molecules. The NOE intensities increase with the residence time until the residence time becomes longer than the rotational correlation time of the solute. Therefore, water-solute NOEs cannot discriminate between different residence times in the regime above the rotational correlation time of the solute (typically several nanoseconds). Since only a few water molecules from the hydration shell of a biomolecule are in the slow-motional regime, the water–solute NOEs provide a filter for the preferential observation of these water molecules
which usually are in more intimate contact with the solute than rapidly diffusing
water. The upper limit of the residence times of slowly exchanging hydration water molecules is in the millisecond time range. A residence time of at least about 20 ms would be required to enable the observation of a NOESY cross peak at a chemical shift separate from that of the bulk water (Otting et al., 1991c). A residence time of 1 ms would broaden the signal of the water molecule by about 300 Hz, which would be difficult to observe in the one-dimensional NMR spectrum of a biomolecule. Definitely, upper limits of 100 to 200 cannot be deduced from water–solute NOE studies as claimed (Ernst et al., 1995). Attempts to distinguish rapidly diffusing bulk water from hydration water diffusing at the rate of the macromolecule in an experiment with strong PFGs yielded an upper limit of 1 ms for the residence times of the internal hydration water molecules in BPTI at 4°C (Dötsch and Wider, 1995). Since proton exchange rates between water molecules in the bulk phase occur with rates of and faster (Meiboom, 1961), all these upper limits pertain strictly speaking only to the residence times of the water protons but not the entire water molecules. Residence times in the subnanosecond time range, as documented by negative NOESY cross peaks, must be due to the exchange of entire water molecules, unless proton exchange is very strongly catalyzed. In bulk water, proton exchange lifetimes become shorter than 1 ns at (25°C) (Meiboom, 1961). Recent work by Halle and co-workers showed that accurate residence times in the nanosecond to millisecond time range can be measured for individual hydration water molecules using nuclear magnetic relaxation dispersion (NMRD) of the water nuclei and (Denisov and Halle, 1995a, 1995b, 1995c; Denisov et al., 1996; Venu et al., 1997). The NMRD data predominantly reflect the exchange of the few hydration water molecules with extended residence times on the macromolecular solute. 
The measurements report on the entire hydration of the solute, not only on the vicinity of solute protons as do the water–solute NOE measurements. Although only the relaxation times of the average water NMR signals are measured, information on individual hydration water molecules can be obtained by comparison between samples with and without solvent-accessible hydration sites. Hydration sites can be rendered inaccessible by site-directed mutagenesis [for example, the internal water molecule 122 in BPTI is replaced by the hydroxyl group of Ser 36 in the mutant BPTI(G36S) (Berndt et al., 1993)] or by the addition of a ligand [for example, a drug binding to the minor groove of DNA replaces hydration water molecules from the spine of hydration (Denisov et al., 1997)]. The technique yields not only residence times but also order parameters for the solute-bound water
molecules. Furthermore, the number of water molecules bound with long residence times can be determined with good accuracy.
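As a rough numerical illustration of two figures quoted in this section, the following Python sketch estimates (i) the correlation time at which the NOE cross-relaxation rate changes sign and (ii) the lifetime broadening associated with a given residence time. It is only a sketch under simplifying assumptions: a single Lorentzian spectral density, J(ω) = τ/(1 + ω²τ²), and the familiar rigid-pair expression 6J(2ω) − J(0) are used in place of the full treatment underlying Eq. (11), and the broadening is taken as 1/(πτ) for slow exchange. The function names and the spectrometer frequencies are illustrative choices, not taken from the studies discussed above.

```python
import math


def noe_rate_factor(tau_s, larmor_hz):
    """Return 6*J(2w) - J(0) for the assumed Lorentzian J(w) = tau / (1 + (w*tau)**2).

    Only the sign is of interest here. Negative values correspond to the
    slow-motion regime (long effective correlation times).
    """
    omega = 2.0 * math.pi * larmor_hz
    j_zero = tau_s
    j_double = tau_s / (1.0 + (2.0 * omega * tau_s) ** 2)
    return 6.0 * j_double - j_zero


def noe_zero_crossing(larmor_hz):
    """Correlation time (s) where the factor above vanishes: w*tau = sqrt(5)/2."""
    omega = 2.0 * math.pi * larmor_hz
    return math.sqrt(5.0) / (2.0 * omega)


def lifetime_broadening_hz(residence_time_s):
    """Slow-exchange lifetime contribution to the linewidth, 1/(pi*tau), in Hz."""
    return 1.0 / (math.pi * residence_time_s)


if __name__ == "__main__":
    # Illustrative proton frequencies; any value can be substituted.
    for mhz in (500, 600, 750):
        tau_zero = noe_zero_crossing(mhz * 1e6)
        print(f"{mhz} MHz: sign change near {tau_zero * 1e9:.2f} ns")
    # Residence times discussed in the text: 1 ms and 20 ms.
    for tau in (1e-3, 20e-3):
        print(f"residence time {tau * 1e3:.0f} ms: "
              f"about {lifetime_broadening_hz(tau):.0f} Hz broadening")
```

Under these assumptions the sign change falls near 0.3 ns at 600 MHz, inside the 0.1–1 ns window quoted above, and a 1-ms residence time corresponds to roughly 320 Hz of broadening, in line with the figure of about 300 Hz given in the text.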
6.2. Structural Relevance
Since intermolecular water–solute NOEs single out buried hydration water molecules with residence times longer than 1 ns, it is tempting to believe that these water molecules are of importance for the three-dimensional structure of the biomolecules. The energetic implications of slowly and rapidly exchanging hydration water molecules are, however, not so clear. A slowly exchanging hydration
water molecule may not be “more stably” bound than a water molecule that is more easily exchanged by another water molecule. This is particularly apparent for water molecules in hydrophobic cavities where hydrogen-bonding partners are missing. The problem is also well illustrated by the hydration water molecules mediating specific contacts in the trp repressor/operator-DNA cocrystal structure (Otwinowski et al., 1988). Some of these hydration water molecules appear to be approximately conserved in the single-crystal structure of the free DNA (Shakked et al., 1994). In the free DNA, these water molecules are highly solvent exposed and are probably characterized by residence times in the subnanosecond time range. Thus, rapidly exchanging water molecules may be structurally important, just as slowly exchanging water molecules may be of little structural relevance. The observation
of hydration water molecules with residence times longer than 1 ns in the interior of proteins and in the minor groove of DNA is primarily a consequence of the fact that these water molecules are buried or at least largely protected from access to the bulk solvent. Hydration water molecules buried inside a protein or located in the minor groove of DNA are almost invariably also detectable by X-ray crystallography, where they are often characterized by low B-factors. These water molecules are thought to play a structural role, when the crystal structure shows several well-defined hydrogen bonds with the solute. Usually, many more hydration water molecules are detected by X-ray crystallography than by NOE experiments, but not all hydration water molecules of the first shell of hydration are detected. This is readily explained by the fact that the electron density of the water molecules must be spatially well localized in order to be observable by X-ray crystallography. Thus, continuously diffusing or disordered water escapes X-ray detection, whereas rapidly exchanging water molecules may be observable if they exchange in a
“hopping” motion. It is not surprising that many of the hydration water molecules detected by X-ray crystallography contact one or two solute molecules in the crystal lattice. Negative water–solute NOESY cross peaks observed in solution show that most of these hydration water molecules have residence times of less than 1 ns in solution. More puzzling are reports on water molecules with residence times longer than about 1 ns near solvent-exposed methyl groups (Ernst et al., 1995; Clore et al., 1994). Rapid rotation of water pentagons about the methyl groups has been proposed to explain the fact that these water molecules were not observable in single-crystal X-ray studies, but molecular dynamics simulations do not support this interpretation.
The molecular dynamics (MD) of a protein or DNA molecule in water can be simulated with explicit water molecules for up to a few nanoseconds. The residence times of the hydration water molecules on protein surfaces predicted by the MD simulations range from tens of picoseconds to a few hundred picoseconds (Ahlström et al., 1988; Brunne et al., 1993; Knapp and Mügge, 1993; Billeter et al., 1996).
6.3. Future Perspectives
Currently, water–solute NOEs can be observed for the entire surface of small peptides, but not for protein or DNA molecules. With improved sensitivity of the
NMR equipment, intermolecular water–solute NOEs should become observable for all solvent-exposed protons of biological macromolecules. Equipment with improved sensitivity would further allow the use of heteronuclear NOEs to study the hydration of chemical groups devoid of protons, such as carbonyl groups. The feasibility of such studies has been demonstrated in principle with small organic molecules (e.g., Seba and Ancian, 1990; Canet et al., 1992). The distinction between direct and exchange-relayed NOEs continues to be a problem if the three-dimensional conformation of the solute is not known. Theoretically, diffusion filters could be used to separate the signal of rapidly diffusing
bulk water from that of hydration water diffusing at the rate of the solute, but much stronger PFGs would have to be applied in a much shorter time span than what is technically possible today. The attempt to identify direct NOEs in the presence of exchange-relayed NOEs by a quantitative measurement of the NOE cross-relaxation rates failed (Wang et al., 1996a). Usually, many water–solute cross peaks are observed (Fig. 7), but only a few of them can be attributed to direct water–solute NOEs in an unambiguous way, and the number of water molecules identified by these is even smaller. Automation of the spectral analysis will greatly enhance the attractiveness of
the technique. The detection of hydration water molecules at the interface between a protein and a small organic ligand molecule would suggest the design of a new ligand which could bind with higher affinity by replacing the water molecules by functional groups (Grzesiek et al., 1994). It may be conceived that future MD simulations will cover a sufficiently long time span to allow the calculation of water–solute NOEs with all protons of the solute, which will allow the further refinement of the force fields describing biomolecular hydration and lead to a model in quantitative agreement with the experimental NMR data.
7. CONCLUSION
What have we learned from hydration studies of biological macromolecules using water–solute NOEs? Perhaps the most interesting result is the short residence
times of the hydration water molecules on protein and DNA surfaces. Hydration–dehydration events would not be expected to be rate-limiting steps in protein folding and intermolecular recognition. Most of the water molecules detected by X-ray crystallography were shown not to be kinetically stable in solution. The possibility of obtaining this information for many individual water molecules in aqueous solution is unique to the NOE method. The hydration studies of proteins and other biological macromolecules by intermolecular water–solute NOEs certainly triggered the development of numerous new pulse sequences dedicated to the detection of intermolecular water–solute cross peaks. In the field of selective water excitation, the experiments with the most colorful acronyms are perhaps not the most attractive in practice. Yet the ideas developed in the context of biomolecular hydration studies may prove invaluable in the development of pulse sequences applicable to the study of NOEs between biological macromolecules and organic cosolvents in aqueous solutions. The first NOE studies of protein–organic solvent interactions are currently emerging (Liepinsh and Otting, 1997; Ponstingl and Otting, 1997a). They may significantly enhance our understanding of altered enzyme specificity observed in nonaqueous environments and provide a tool for rational drug design.
NOTE. Abergel et al. recently demonstrated an elegant modification of the selective excitation scheme of Fig. 5D, where an electronic feedback circuit is used to eliminate or enhance radiation damping at any time during the pulse sequence (Abergel, D., Louis-Joseph, A., and Lallemand, J.-Y., 1996, J. Biomol. NMR 8:15).
ACKNOWLEDGMENTS. The author thanks Dr. Edvards Liepinsh for the spectrum of Fig. 7 and helpful discussions, Dr. Bertil Halle for a critical reading of the manuscript, and the Swedish Natural Science Research Council for financial support.
REFERENCES
Abragam, A., 1961, Principles of Nuclear Magnetism, Clarendon Press, Oxford.
Ahlström, P., Teleman, O., and Jönsson, B., 1988, J. Am. Chem. Soc. 110:4198.
Anklin, C., Rindlisbacher, M., Otting, G., and Laukien, F. H., 1995, J. Magn. Reson. B 106:199.
Ayant, Y., Belorizky, E., Fries, P., and Rosset, J., 1977, J. Phys. (Paris) 38:325.
Bax, A., and Davis, D. G., 1986, J. Magn. Reson. 65:355.
Bax, V., Clore, G. M., and Gronenborn, A. M., 1987, J. Am. Chem. Soc. 109:6511.
Berndt, K. D., Beunink, J., Schröder, W., and Wüthrich, K., 1993, Biochemistry 32:4564.
Billeter, M., 1995, Prog. NMR Spectrosc. 27:635.
Billeter, M., Güntert, P., Luginbühl, P., and Wüthrich, K., 1996, Cell 85:1057.
Birlirakis, N., Cerdan, R., and Guittet, E., 1996, J. Biomol. NMR 8:487.
Böckmann, A., and Guittet, E., 1995, J. Chim. Phys. 92:1923.
Böckmann, A., and Guittet, E., 1996, J. Biomol. NMR 8:87.
Böckmann, A., Penin, F., and Guittet, E., 1996, FEBS Lett. 383:191.
Bothner-By, A. A., Stephens, R. L., Lee, J., Warren, C. D., and Jeanloz, R. W., 1984, J. Am. Chem. Soc. 106:811. Bowtell, R., 1992, J. Magn. Reson. 100:1. Broekaert, P., Vlassenbroek, A., Jeener, J., Lippens, G., and Wieruszeski, J.-M., 1996, J. Magn. Reson.
A 120:97. Brunne, R. M., Liepinsh, E., Otting, G., Wüthrich, K., and van Gunsteren, W. F, 1993, J. Mol. Biol.
231:1040. Brüschweiler, R., Morikis, D., and Wright, P. E., 1995, J. Biomol. NMR 5:353. Brüschweiler, R., and Wright, P. E., 1994, Chem. Phys. Lett. 229:75. Canet, D., Mahieu, N., and Tekely, P., 1992, J. Am. Chem. Soc. 114:6190. Clore, G. M., Bax, A., Omichinski, J. G., and Gronenborn, A. M., 1994, Structure 2:89. Clore, G. M., and Gronenborn, A. M., 1992, J. Mol. Biol. 223:853. Clore, G. M., Bax, A., Wingfield, P. T., and Gronenborn, A. M., 1990, Biochemistry 29:5671. Conte, M. R., Conn, G. L., Brown, T., and Lane, A. N., 1996, Nucl. Acids Res. 24:3693. Dalvit, C., 1995, J. Magn. Reson. A 113:120. Dalvit, C., 1996, J. Magn. Reson. B 112:282. Dalvit, C., and Hommel, U., 1995a, J. Biomol. NMR 5:306. Dalvit, C., and Hommel, U., 1995b, J. Magn. Reson. B 109:334. Denisov, V. P., Carlström, G., Venu, K., and Halle, B., 1997, J. Mol. Biol. 268:118.
Denisov, V. P., and Halle, B., 1995a, J. Mol. Biol. 245:682. Denisov, V. P., and Halle, B., 1995b, J. Mol. Biol. 245:698.
Denisov, V. P., and Halle, B., 1995c, J. Am. Chem. Soc. 117:8456. Denisov, V. P., and Halle, B., 1996, Faraday Discuss. 103:227. Denisov, V. P., Peters, J., Hörlein, H. D., and Halle, B., 1996, Nat. Struct. Biol. 3:505. Dötsch, V., and Wider, G., 1995, J. Am. Chem. Soc. 117:6064. Driscoll, P. C., Clore, G. M., Beress, L., and Gronenborn, A. M., 1989, Biochemistry 28:2178. Edzes, H. T., 1990, J. Magn. Reson. 86:293. Ernst, J. A., Clubb, R. T., Zhou, H.-X., Gronenborn, A. M., and Clore, G. M., 1995, Science 267:1813. Farmer, B. T. II, Macura, S., and Brown, L. R., 1987, J. Magn. Reson. 72:347. Fawthrop, S. A., Yang, J.-C., and Fisher, J., 1993, Nucl. Acids Res. 21:4860. Forman-Kay, J. D., Gronenborn, A. M., Wingfield, P. T., and Clore, G. M., 1991, J. Mol. Biol. 220:209. Friedrich, J., Davis, S., and Freeman, R., 1987, J. Magn. Reson. 75:390. Fujiwara, T., and Nagayama, K., 1985, J. Chem. Phys. 83:3110. Geen, H., and Freeman, R., 1991, J. Magn. Reson. 93:93. Gemmecker, G., Jahnke, W., and Kessler, H., 1993, J. Am. Chem. Soc. 115:11620. Glaser, S. J., and Drobny, G. P., 1990, Adv. Magn. Reson. 14:35. Glickson, J. D., Rowan, R., Pitner, T. P., Dadok, J., Bothner-By, A. A., and Walter, R., 1976, Biochemistry 15:1111.
Gooley, P. R., O’Connell, J. F., Marcy, A. I., Cuca, G. C., Axel, M. G., Caldwell, C. G., Hagmann, W. K., and Becker, J. W., 1996, J. Biomol. NMR 7:8. Griesinger, C., and Ernst, R. R., 1987, J. Magn. Reson. 75:261. Griesinger, C., Otting, G., Wüthrich, K., and Ernst, R. R., 1988, J. Am. Chem. Soc. 110:7870. Grzesiek, S., and Bax, A., 1993a, J. Am. Chem. Soc. 115:12593. Grzesiek, S., and Bax, A., 1993b, J. Biomol. NMR 3:627.
Grzesiek, S., Bax, A., Nicholson, L. K., Yamazaki, T., Wingfield, P., Stahl, S. J., Eyermann, C. J., Torchia, D. A., Hodge, C. N., Lam, P. Y. S., Jadhav, P. K., and Chang, C.-H., 1994, J. Am. Chem. Soc. 116:1581. Hajduk, P. J., Horita, D. A., and Lerner, L. E., 1993, J. Magn. Reson. A 103:40. Halle, B., and Wennerström, H., 1981, J. Chem. Phys. 75:1928. Hore, P. J., 1983, J. Magn. Reson. 55:283.
Gottfried Otting
Hausser, R., Meier, G., and Noak, F., 1966, Z. Naturforsch. 21a:1410. Holak, T. A., Wiltscheck, R., and Ross, A., 1992, J. Magn. Reson. 97:632. Hurd, R. E., 1991, J. Magn. Reson. 91:648. Jacobson, A., Leupin, W., Liepinsh, E., and Otting, G., 1996, Nucl. Acids Res. 24:2911. John, B. K., Plant, D., Webb, P., and Hurd, R. E., 1992, J. Magn. Reson. 98:200. Kay, L. E., Keifer, P., and Saarinen, T., 1992, J. Am. Chem. Soc. 114:10663. Keeler, J., Clowes, R. T., Davis, A. L., and Laue, E. D., 1994, Meth. Enzymol. 239:145.
Knapp, E. W., and Mügge, I., 1993, J. Phys. Chem. 97:11339. Knauf, M. A., Löhr, F., Curley, G. P., O’Farrel, P., Mayhew, S. G., Müller, F., and Rüterjans, H., 1996, Eur. J. Biochem. 213:167. Kochoyan, M., and Leroy, J. L., 1995, Curr. Opin. Struct. Biol. 5:329. Kriwacki, R. W., Hill, R. B., Flanagan, J. M., Caradonna, J. P., and Prestegard, J. H., 1993, J. Am. Chem. Soc. 115:8907. Kubinec, M. G., Culf, A. S., Cho, H., Lee, D. C., Burkham, J., Morimoto, H., Williams, P. G., and Wemmer, D. E., 1996, J. Biomol. NMR 7:236. Kubinec, M. G., and Wemmer, D. E., 1992a, Curr. Opin. Struct. Biol. 2:828.
Kubinec, M. G., and Wemmer, D. E., 1992b, J. Am. Chem. Soc. 114:8739. Levitt, M. H., 1996, Concepts Magn. Reson. 8:77. Lipari, G., and Szabo, A., 1982, J. Am. Chem. Soc. 104:4546. Liepinsh, E., Leupin, W., and Otting, G., 1994, Nucl. Acids Res. 22:2249. Liepinsh, E., and Otting, G., 1996, Magn. Reson. Med. 35:30. Liepinsh, E., and Otting, G., 1997, Nat. Biotech. 15:264. Liepinsh, E., Otting, G., and Wüthrich, K., 1992a, J. Biomol. NMR 2:447. Liepinsh, E., Otting, G., and Wüthrich, K., 1992b, Nucl. Acids Res. 20:6549. Liepinsh, E., Rink, H., Otting, G., and Wüthrich, K., 1993, J. Biomol. NMR 3:253. Luginbühl, P., 1996, Diss. ETH Nr. 11994. Maltseva, T. V., Agback, P., and Chattopadhyaya, J., 1993, Nucl. Acids Res. 21:4246. Meadows, R. P., Nettesheim, D. G., Xu, R. X., Olejniczak, E. T., Petros, A. M., Holzman, T. F., Severin, J., Gubbins, E., Smith, H., and Fesik, S. W., 1993, Biochemistry 32:754. Meiboom, S., 1961, J. Chem. Phys. 57:375. Meiering, E. M., Li, H., Delcamp, T. J., Freisheim, J. H., and Wagner, G., 1995, J. Mol. Biol. 247:309. Meiering, E. M., and Wagner, G., 1995, J. Mol. Biol. 247:294. Messerle, B. A., Wider, G., Otting, G., Weber, C., and Wüthrich, K., 1989, J. Magn. Reson. 85:608. Mori, S., Abeygunawardana, C., van Zijl, P. C. M., and Berg, J. M., 1996a, J. Magn. Reson. B 110:96. Mori, S., Berg, J. M., and van Zijl, P. C. M., 1996b, J. Biomol. NMR 7:77. Mori, S., Johnson, M. O., Berg, J. M., and van Zijl, P. C. M., 1994, J. Am. Chem. Soc. 116:11982. Morris, G. A., and Freeman, R., 1978, J. Magn. Reson. 29:433. Otting, G., 1994, J. Magn. Reson. B 103:288. Otting, G., and Liepinsh, E., 1995a, Acc. Chem. Res. 28:171. Otting, G., and Liepinsh, E., 1995b, J. Biomol. NMR 5:420.
Otting, G., and Liepinsh, E., 1995c, J. Magn. Reson. B 107:192. Otting, G., Liepinsh, E., Farmer, B. T. II, and Wüthrich, K., 1991b, J. Biomol. NMR 1:209.
Otting, G., Liepinsh, E., Halle, B., and Frey, U., 1997, Nat. Struct. Biol. 4:396. Otting, G., Liepinsh, E., and Wüthrich, K., 1991a, Science 254:974. Otting, G., and Wüthrich, K., 1989, J. Am. Chem. Soc. 111:1871. Otting, G., Liepinsh, E., and Wüthrich, K., 1991c, J. Am. Chem. Soc. 113:4363. Otting, G., Liepinsh, E., and Wüthrich, K., 1992, J. Am. Chem. Soc. 114:7093. Otwinowski, Z., Schevitz, R. W., Zhang, R.-G., Lawson, C. L., Joachimiak, A., Marmorstein, R. Q., Luisi, B. F., and Sigler, P. B., 1988, Nature 335:321.
Peng, J. W., Schiffer, C. A., Xu, P., van Gunsteren, W. F., and Ernst, R. R., 1996, J. Biomol. NMR 8:453.
Piotto, M., Saudek, V., and Sklenář, V., 1992, J. Biomol. NMR 2:661. Pitner, T. P., Glickson, J. D., Dadok, J., and Marshall, G. R., 1974, Nature 250:582. Plateau, P., and Guéron, M., 1982, J. Am. Chem. Soc. 104:7310. Ponstingl, H., and Otting, G., 1997a, J. Biomol. NMR 9:441. Ponstingl, H., and Otting, G., 1997b, Eur. J. Biochem. 244:384.
Qi, P. X., Urbauer, J. L., Fuentes, E. J., Leopold, M. F., and Wand, A. J., 1994, Nat. Struct. Biol. 1:378. Qian, Y. Q., Otting, G., and Wüthrich, K., 1993, J. Am. Chem. Soc. 115:1189. Radhakrishnan, I., and Patel, D. J., 1994a, Structure 2:395.
Radhakrishnan, I., and Patel, D. J., 1994b, J. Mol. Biol. 241:600. Richarz, R., Nagayama, K., and Wüthrich, K., 1980, Biochemistry 19:5189. Richter, W., Lee, S., Warren, W. S., and He, Q., 1995, Science 267:654.
Seba, H. B., and Ancian, B., 1990, J. Chem. Soc. Chem. Commun.: 997. Shakked, Z., Guzikevich-Guerstein, G., Frolow, F., Rabinovich, D., Joachimiak, A., and Sigler, P. B., 1994, Nature 368:469. Sklenář, V., 1995, J. Magn. Reson. A 114:132.
Sklenář, V., and Bax, A., 1987a, J. Magn. Reson. 75:378. Sklenář, V., and Bax, A., 1987b, J. Magn. Reson. 74:469. Sklenář, V., Piotto, M., Leppik, R., and Saudek, V., 1993, J. Magn. Reson. A 102:241.
Sklenář, V., Tschudin, R., and Bax, A., 1987, J. Magn. Reson. 75:352. Smallcombe, S. H., 1993, J. Am. Chem. Soc. 115:4776. van de Ven, F. J. M., Janssen, H. G. J. M., Graslund, A., and Hilbers, C. W., 1988, J. Magn. Reson.
79:221. van Zijl, P. C. M., and Moonen, C. T. W., 1990, J. Magn. Reson. 87:18.
Venu, K., Denisov, V. P., and Halle, B., 1997, J. Am. Chem. Soc. 119:3122. Villars, F. M. H., and Benedek, G. B., 1974, Physics, Vol. 2, Chap. 2, Addison-Wesley, Reading, MA. Wang, Y.-X., Freedberg, D. I., Grzesiek, S., Torchia, D. A., Wingfield, P. T., Kaufman, J. D., Stahl, S. J., Chang, C.-H., and Hodge, C. N., 1996a, Biochemistry 35:12694. Wang, Y.-X., Freedberg, D. I., Wingfield, P. T., Stahl, S. J., Kaufman, J. D., Kiso, Y., Bhat, T. N., Erickson, J. W., and Torchia, D. A., 1996b, J. Am. Chem. Soc. 118:12287. Wang, Y., and Patel, D. J., 1994, J. Mol. Biol. 242:508. Warren, W. S., Richter, W., Andreotti, A. H., and Farmer, B. T. II, 1993, Science 262:2005. Wider, G., Dötsch, V., and Wüthrich, K., 1994, J. Magn. Reson. A 108:255.
Wider, G., Riek, R., and Wüthrich, K., 1996, J. Am. Chem. Soc. 118:11629. Wu, D., Chen, A., and Johnson, C. S. Jr., 1995, J. Magn. Reson. A 115:260.
Wüthrich, K., Otting, G., and Liepinsh, E., 1992, Faraday Discuss. 93:35. Xu, R. X., Meadows, R. P., and Fesik, S. W., 1993, Biochemistry 32:2473.
Yao, J., Brüschweiler, R., Dyson, H. J., and Wright, P. E., 1994, J. Am. Chem. Soc. 116:12051. Zhang, S., and Gorenstein, D. G., 1996, J. Magn. Reson. A 118:291.
Contents of Previous Volumes
VOLUME 1 Chapter 1
NMR of Sodium-23 and Potassium-39 in Biological Systems Mortimer M. Civan and Mordechai Shporer
Chapter 2
High-Resolution NMR Studies of Histones C. Crane-Robinson Chapter 3
PMR Studies of Secondary and Tertiary Structure of Transfer RNA in Solution Philip H. Bolton and David R. Kearns
Chapter 4 Fluorine Magnetic Resonance in Biochemistry J. T. Gerig
Chapter 5
ESR of Free Radicals in Enzymatic Systems Dale E. Edmondson
Chapter 6 Paramagnetic Intermediates in Photosynthetic Systems Joseph T. Warden Chapter 7
ESR of Copper in Biological Systems John F. Boas, John R. Pilbrow, and Thomas D. Smith
Index VOLUME 2 Chapter 1
Phosphorus NMR of Cells, Tissues, and Organelles Donald P. Hollis Chapter 2
EPR of Molybdenum-Containing Enzymes Robert C. Bray
Chapter 3
ESR of Iron Proteins Thomas D. Smith and John R. Pilbrow
Chapter 4
Stable Imidazoline Nitroxides Leonid B. Volodarsky, Igor A. Grigor’ev, and Renad Z. Sagdeev Chapter 5
The Multinuclear NMR Approach to Peptides: Structures, Conformation, and Dynamics Roxanne Deslauriers and Ian C. P. Smith
Index
VOLUME 3
Chapter 1 Multiple Irradiation Experiments with Hemoproteins Regula M. Keller and Kurt Wüthrich Chapter 2
Vanadyl(IV) EPR Spin Probes: Inorganic and Biochemical Aspects N. Dennis Chasteen
Chapter 3
ESR Studies of Calcium- and Proton-Induced Phase Separations in Phosphatidylserine-Phosphatidylcholine Mixed Membranes Shun-ichi Ohnishi and Satoru Tokutomi Chapter 4
EPR Crystallography of Metalloproteins and Spin-Labeled Enzymes James C. W. Chien and L. Charles Dickinson Chapter 5
Electron Spin Echo Spectroscopy and the Study of Metalloproteins W. B. Mims and J. Peisach Index VOLUME 4 Chapter 1
Spin Labeling in Disease D. Allan Butterfield
Chapter 2
Principles and Applications of 113Cd NMR to Biological Systems Ian M. Armitage and James D. Otvos
Chapter 3
Photo-CIDNP Studies of Proteins Robert Kaptein Chapter 4
Application of Ring Current Calculations to the Proton NMR of Proteins and Transfer RNA Stephen J. Perkins Index VOLUME 5 Chapter 1
CMR as a Probe for Metabolic Pathways in Vivo R. L. Baxter, N. E. Mackenzie, and A. I. Scott
Chapter 2
Nitrogen-15 NMR in Biological Systems Felix Blomberg and Heinz Rüterjans Chapter 3
Phosphorus-31 Nuclear Magnetic Resonance Investigations of Enzyme Systems B. D. Nageswara Rao
Chapter 4
NMR Methods Involving Oxygen Isotopes in Biophosphates Ming-Daw Tsai and Karol Bruzik Chapter 5
ESR and NMR Studies of Lipid-Protein Interactions in Membranes Philippe F. Devaux Index
VOLUME 6 Chapter 1 Two-Dimensional Spectroscopy as a Conformational Probe of Cellular Phosphates Philip H. Bolton Chapter 2 Lanthanide Complexes of Peptides and Proteins Robert E. Lenkinski Chapter 3 EPR of Mn(II) Complexes with Enzymes and Other Proteins George H. Reed and George D. Markham
Chapter 4 Biological Applications of Time Domain ESR Hans Thomann, Larry R. Dalton, and Lauraine A. Dalton Chapter 5 Techniques, Theory, and Biological Applications of Optically Detected Magnetic Resonance (ODMR) August H. Maki
Index VOLUME 7 Chapter 1 NMR Spectroscopy of the Intact Heart Gabriel A. Elgavish Chapter 2 NMR Methods for Studying Enzyme Kinetics in Cells and Tissue K. M. Brindle, I. D. Campbell, and R. J. Simpson
Chapter 3
ENDOR Spectroscopy in Photobiology and Biochemistry Klaus Möbius and Wolfgang Lubitz Chapter 4
NMR Studies of Calcium-Binding Proteins Hans J. Vogel and Sture Forsén
Index VOLUME 8 Chapter 1
Calculating Slow Motional Magnetic Resonance Spectra: A User’s Guide David J. Schneider and Jack H. Freed
Chapter 2
Inhomogeneously Broadened Spin-Label Spectra Barney Bales Chapter 3
Saturation Transfer Spectroscopy of Spin-Labels: Techniques and Interpretation of Spectra M. A. Hemminga and P. A. de Jager Chapter 4
Nitrogen-15 and Deuterium Substituted Spin Labels for Studies of Very Slow Rotational Motion Albert H. Beth and Bruce H. Robinson
Chapter 5
Experimental Methods in Spin-Label Spectral Analysis Derek Marsh Chapter 6
Electron-Electron Double Resonance James S. Hyde and Jim B. Feix
Chapter 7
Resolved Electron-Electron Spin-Spin Splittings in EPR Spectra Gareth R. Eaton and Sandra S. Eaton
Chapter 8
Spin-Label Oximetry James S. Hyde and Witold S. Subczynski
Chapter 9
Chemistry of Spin-Labeled Amino Acids and Peptides: Some New Mono- and Bifunctionalized Nitroxide Free Radicals Kálmán Hideg and Olga H. Hankovsky
Chapter 10 Nitroxide Radical Adducts in Biology: Chemistry, Applications, and Pitfalls Carolyn Mottley and Ronald P. Mason
Chapter 11
Advantages of 15N and Deuterium Spin Probes for Biomedical Electron Paramagnetic Resonance Investigations Jane H. Park and Wolfgang E. Trommer
Chapter 12
Magnetic Resonance Study of the Combining Site Structure of a Monoclonal Anti-Spin-Label Antibody Jacob Anglister Appendix
Approaches to the Chemical Synthesis of 15N and Deuterium Substituted Spin Labels Jane H. Park and Wolfgang E. Trommer Index
VOLUME 9
Chapter 1 Phosphorus NMR of Membranes Philip L. Yeagle
Chapter 2
Investigation of Ribosomal 5S Ribonucleic Acid Solution Structure and Dynamics by Means of High-Resolution Nuclear Magnetic Resonance Spectroscopy Alan G. Marshall and Jiejun Wu Chapter 3
Structure Determination via Complete Relaxation Matrix Analysis (CORMA) of Two-Dimensional Nuclear Overhauser Effect Spectra: DNA Fragments Brandan A. Borgias and Thomas L. James
Chapter 4
Methods of Proton Resonance Assignment for Proteins Andrew D. Robertson and John L. Markley Chapter 5
Solid-State NMR Spectroscopy of Proteins Stanley J. Opella Chapter 6
Methods for Suppression of the Signal in Proton FT/NMR Spectroscopy: A Review Joseph E. Meier and Alan G. Marshall Index VOLUME 10 Chapter 1
High-Resolution Magnetic Resonance Spectroscopy of Oligosaccharide-Alditols Released from Mucin-Type O-Glycoproteins Johannis P. Kamerling and Johannes F. G. Vliegenthart
Chapter 2
NMR Studies of Nucleic Acids and Their Complexes David E. Wemmer Index VOLUME 11 Chapter 1
Localization of Clinical NMR Spectroscopy Lizann Bolinger and Robert E. Lenkinski Chapter 2
Off-Resonance Rotating Frame Spin-Lattice Relaxation: Theory, and in Vivo MRS and MRI Applications
Thomas Schleich, G. Herbert Caines, and Jan M. Rydzewski Chapter 3 NMR Methods in Studies of Brain Ischemia Lee-Hong Chang and Thomas L. James
Chapter 4
Shift-Reagent-Aided NMR Spectroscopy in Cellular, Tissue, and Whole-Organ Systems Sandra K. Miller and Gabriel A. Elgavish
Chapter 5
In Vivo NMR Barry S. Selinski and C. Tyler Burt
Chapter 6
In Vivo NMR Studies of Cellular Metabolism Robert E. London
Chapter 7
Some Applications of ESR to in Vivo Animal Studies and EPR Imaging Lawrence J. Berliner and Hirotada Fujii
Index
VOLUME 12
Chapter 1
NMR Methodology for Paramagnetic Proteins Gerd N. La Mar and Jeffrey S. de Ropp
Chapter 2
Nuclear Relaxation in Paramagnetic Metalloproteins Lucia Banci
Chapter 3 Paramagnetic Relaxation of Water Protons Cathy Coolbaugh Lester and Robert G. Bryant
Chapter 4
Proton NMR Spectroscopy of Model Hemes F. Ann Walker and Ursula Simonis
Chapter 5 Proton NMR Studies of Selected Paramagnetic Heme Proteins J. D. Satterlee, S. Alam, Q. Yi, J. E. Erman, I. Constantinidis, D. J. Russell, and S. J. Moench Chapter 6
Heteronuclear Magnetic Resonance: Applications to Biological and Related Paramagnetic Molecules Joël Mispelter, Michel Momenteau, and Jean-Marc Lhoste Chapter 7
NMR of Polymetallic Systems in Proteins Claudio Luchinat and Stefano Ciurli
Index
VOLUME 13 Chapter 1 Simulation of the EMR Spectra of High-Spin Iron in Proteins Betty J. Gaffney and Harris J. Silverstone Chapter 2
Mössbauer Spectroscopy of Iron Proteins Peter G. Debrunner
Chapter 3 Multifrequency ESR of Copper: Biophysical Applications Riccardo Basosi, William E. Antholine, and James S. Hyde Chapter 4 Metalloenzyme Active-Site Structure and Function through Multifrequency CW and Pulsed ENDOR Brian M. Hoffman, Victoria J. DeRose, Peter E. Doan, Ryszard J. Gurbiel, Andrew L. P. Houseman, and Joshua Telser Chapter 5
ENDOR of Randomly Oriented Mononuclear Metalloproteins: Toward Structural Determinations of the Prosthetic Group Jürgen Hüttermann
Chapter 6 High-Field EPR and ENDOR in Bioorganic Systems Klaus Möbius Chapter 7
Pulsed Electron Nuclear Double and Multiple Resonance Spectroscopy of Metals in Proteins and Enzymes Hans Thomann and Marcelino Bernardo
Chapter 8 Transient EPR of Spin-Labeled Proteins David D. Thomas, E. Michael Ostap, Christopher L. Berger, Scott M. Lewis, Piotr G. Fajer, and James E. Mahaney
Chapter 9
ESR Spin-Trapping Artifacts in Biological Model Systems Aldo Tomasi and Anna Iannone
Index VOLUME 14 Introduction: Reflections on the Beginning of the Spin Labeling Technique Lawrence J. Berliner Chapter 1
Analysis of Spin Label Line Shapes with Novel Inhomogeneous Broadening from Different Component Widths: Application to Spatially Disconnected Domains in Membranes M. B. Sankaram and Derek Marsh
Chapter 2 Progressive Saturation and Saturation Transfer EPR for Measuring Exchange Processes and Proximity Relations in Membranes Derek Marsh, Tibor Páli, and László Horváth Chapter 3
Comparative Spin Label Spectra at X-band and W-band Alex I. Smirnov, R. L. Belford, and R. B. Clarkson
Chapter 4
Use of Imidazoline Nitroxides in Studies of Chemical Reactions: ESR Measurements of the Concentration and Reactivity of Protons, Thiols, and Nitric Oxide Valery V. Khramtsov and Leonid B. Volodarsky
Chapter 5
ENDOR of Spin Labels for Structure Determination: From Small Molecules to Enzyme Reaction Intermediates Marvin W. Makinen, Devkumar Mustafi, and Seppo Kasa
Chapter 6
Site-Directed Spin Labeling of Membrane Proteins and Peptide-Membrane Interactions Jimmy B. Feix and Candice S. Klug Chapter 7
Spin-Labeled Nucleic Acids Robert S. Keyes and Albert M. Bobst
Chapter 8
Spin Label Applications to Food Science Marcus A. Hemminga and Ivon J. van den Dries Chapter 9
EPR Studies of Living Animals and Related Model Systems (In-Vivo EPR) Harold M. Swartz and Howard Halpern
Appendix Derek Marsh and Karl Schorn
Index VOLUME 15 Chapter 1 Tracer Theory and NMR Maren R. Laughlin and Joanne K. Kelleher Chapter 2
Isotopomer Analysis of Glutamate: A NMR Method to Probe Metabolic Pathways Intersecting in the Citric Acid Cycle A. Dean Sherry and Craig R. Malloy Chapter 3
Determination of Metabolic Fluxes by Mathematical Analysis of Labeling Kinetics John C. Chatham and Edwin M. Chance
Chapter 4 Metabolic Flux and Subcellular Transport of Metabolites E. Douglas Lewandowski Chapter 5
Assessing Cardiac Metabolic Rates During Pathologic Conditions with Dynamic NMR Spectra Robert G. Weiss and Gary Gerstenblith
Chapter 6
Applications of Labeling to Studies of Human Brain Metabolism In Vivo Graeme F. Mason Chapter 7
In Vivo NMR Spectroscopy: A Unique Approach in the Dynamic Analysis of Tricarboxylic Acid Cycle Flux and Substrate Selection Pierre-Marie Luc Robitaille
Index VOLUME 16
Chapter 1
Determining Structures of Large Proteins and Protein Complexes by NMR G. Marius Clore and Angela M. Gronenborn
Chapter 2
Multidimensional NMR Methods for Resonance Assignment, Structure Determination, and the Study of Protein Dynamics Kevin H. Gardner and Lewis E. Kay Chapter 3
NMR of Perdeuterated Large Proteins Bennett T. Farmer II and Ronald A. Venters
Chapter 4
Recent Developments in Multidimensional NMR Methods for Structural Studies of Membrane Proteins Francesca M. Marassi, Jennifer J. Gesell, and Stanley J. Opella Chapter 5
Homonuclear Decoupling to Proteins Ēriks Kupče, Hiroshi Matsuo, and Gerhard Wagner Chapter 6
Pulse Sequences for Measuring Coupling Constants Geerten W. Vuister, Marco Tessari, Yasmin Karimi-Nejad, and Brian Whitehead Chapter 7 Methods for the Determination of Torsion Angle Restraints
in Biomacromolecules C. Griesinger, M. Hennig, J. P. Marino, B. Reif, C. Richter, and H. Schwalbe
Index
Index
Acyl carrier protein, 55 distance constraints, 24 Aggregation symmetric, 132, 449 ALFA, 59, 60 Alignment, see Molecular alignment ALPS, 59, 60 Ambiguous restraints, 67 Ambiguous distance restraints (ADRs), 131, 140–142, 155–156, 157 assignment of, 145 symmetric, 140–142 Analytical expressions for the transferred NOESY of a two-spin system, 238 Angle search, 64, 39 Anisotropic interactions, see Interactions, anisotropic magnetic susceptibility, see Magnetic susceptibility, anisotropic reorientation, see Molecular alignment Annealing protocols, 142–144 naming convention, 143 ANRS method, 61 Apo-kedarcidin, 58 Applications of water-solute NOEs, 511 Arc motion, see Motion, arc model ARIA, 145 Assessment of conformational flexibility, 209 Assignment of NOEs, 136, 155 of resonances, 136 Assignment methods, 43 Assignments of water-solute cross peaks, 493 Asymmetric labeling, 137–138 Atom swapping, 63 Atomic B-factors, 12
AURELIA, 38 AUTOASSIGN, 56, 67 Automated methods, 40 Automated peak picking, 68 Automated resonance assignments, 40, 57 ALPS, 85 AUTOASSIGN, 85–97 CONTRAST, 84–85 FELIX, 83 NOESY spectra assignment, 67 program of Abbott Laboratories NMR Group, 84–85 program of Bristol-Myers Squibb NMR Group, 83 stereospecific assignments, 33 Averaging sum, 141 Back calculation of NMR spectra, 206 transferred NOESY spectra, 265, 281 Backbone dynamics derived from relaxation rates, 385 analysis of the multispin relaxation of 386 experiments to determine the relaxation rates, 391 heteronuclear NOE, 389 relaxation time, 387 relaxation time 387 Backbone dynamics derived from relaxation rates, 370 calculation of microdynamical parameters, 377 experimental details, 370 interpretation of microdynamical parameters, 381 processing of spectra and determination of relaxation rates, 376 sensitivity-enhanced HSQC experiment (SE-HSQC), 370 water-flipback HSQC, 371 Basic fibroblast growth factor (FGF-2), 90, 94, 97 Bayesian parameter estimation, 331
Bayesian posterior probabilities, 86 Bicelles, see Molecular alignment, Protein alignment Blood group A trisaccharide, 283, 289 Boltzmann average, 4, 28 constant, 6 ensemble, 5, 6 factor, 5 probability distribution, 15 sampling, 21 Bovine pancreatic ribonuclease, 90–97 Bovine seminal ribonuclease, 46 BPTI, 51, 448, 450, 474–475, 478–479 Branched polymers, 22 BSA, 461 Calbindin D9k, 450, 452 Calculation of concentrations, 247 Calmodulin, 58 CARNIVAL, 206 Cellobiohydrolase I, 65 Channel blocker, 214 Chemical exchange, 493, 501 Chemical shift dispersion degeneracy, 136–141 symmetry degeneracy, 136, 137, 140–141 Chemical shifts, 4, 12, 19, 23 CLAIRE, 45 Coherence-transfer delays, 124 Cold-shock protein (Csp A), 90, 97 Comonomer NOEs, 137, 138, 146 Complete hybrid matrix, 204 Complete relaxation matrix, 165, 204, 282 CORCEMA, 223 CORMA, 203 IRMA, 172 MARDIGRAS, 172, 204 MORASS, 172 PDB2 NOE, 285 Cone motion, see Motion, cone model Conformational annealing search, 142–144 averaging, 210 exchange, 224
heterogeneity, 86 sampling, 214 Conformational exchange matrix, 233
Constraint adiabatic distance, 24 methods, 6 propagation, 56, 87 CONTRAST, 84 CORCEMA, 223, 265 analysis of transferred NOESY data, 289–301 calculations for finite delays, 232 program, 246, 248 theory, 230
Corepressor tryptophan with repressor-operator
complex, 297 CORMA, 203, 224 Correlation time, 204, 453, 465–466 COSY (ECOSY), 328 Coupling constants, 203, 207 Couplings, measurement of effects of cross correlation, see Spin relaxation, cross correlation effects of dynamic frequency shifts, see Spin relaxation, dynamic frequency shifts frequency based experiments, 323–325, 328–333 intensity based experiments, 325, 333–336 precision of measurement, 323, 325, 330–332 systematic errors, 324, 335–336 Couplings, residual dipolar angular constraints, 312, 319, 321–322; see also Structure determination determination of sign, 320 field dependence, 313, 319, 323, 325 field induced, 311 history of observation, 320–322 in the study of motion, see Motion measurement of, see Couplings, measurement of separation from scalar, 319 theory, 314–320 Couplings, scalar measurement of, see Couplings, measurement of CPMG, 443, 444
Crambin, 73 Cramér-Rao lower bound (CRLB), see Couplings, measurement of, precision of measurement Cross validation, 217 Degeneracy dispersion, 136, 141 symmetry, 136, 137, 140–141 Determination of protein dynamics in the microsecond time window, 406 in the millisecond time window, 409 Diamagnetic susceptibility, see Magnetic susceptibility, diamagnetic Diamagnetic systems, see Magnetic susceptibility, diamagnetic DIAMOD, 69 DIANA, 65
Difference NMRD, 447 Difference spectroscopy, 138 Diffraction, 3, 8
Diffusion filter, 496 Dipolar field effects, 511 Dipolar relaxation with chemical exchange, 439 Dipolar couplings, see Couplings, residual dipolar, field-induced
Dipolar (cont.) Hamiltonian, 315, 316; see also Couplings, residual dipolar, theory shifts (pseudocontact shifts), 341 DISGEO, 66 Dispersion amplitude, 466 function, 463, 474–477 stretched, 474–477 Distance constraints, 6, 24 holonomic, 24 Distance geometry, 61–63, 202 DISGEO, 66 DGEOM, 282 self-correcting, 143 Distance restraints (constraints), 62 ambiguous, 140–142 bounds, 141 restraint function symmetry restraints, 139–140, 156–157 DNA and RNA hydration, 516 DNA, 450, 453, 472 DNA duplexes, 207 complexed to GATA-1, see Structure determination, examples, GATA-1 complexed to DNA
magnetic susceptibility, see Magnetic susceptibility,
diamagnetic, in DNA structure refinement, 322; see also Structure determination, examples, GATA-1 complexed to DNA DNA three way junction, 190–194 final structure, 192–193 hybrid-hybrid matrix refinement, 190–194 refinement summary, 192 sequence, 190 Dolichos biflorus lectin, 283, 287 Double-quantum-filtered COSY (2QF-COSY), 207 Dynamic frequency shift, see Spin relaxation, dynamic
frequency shifts Dynamic matrix, 230, 238 Dynamic shift, 436–437 Dynamics of protein structures, 311, 357 from field-induced dipolar couplings, 311 from and relaxation, 357
general features of dynamics, 357 microdynamic motional parameters, 359 ECOSY spectroscopy, 328 Effect of ligand-receptor ratio on tr-NOESY, 270 Effect of motions of transferred NOESY, 278 Electric field alignment, see Molecular alignment, using electric field Electron density, 5, 12 Electron spin, see Magnetic susceptibility, paramagnetic, in myoglobin Encounter complexes, 240
Energy, see potential minimization, 144 Ensemble, 6, 29 average, 14, 19, 29 calculations, 201 generation of, 6, 8
Ensemble of structures, 212 Equations of motion, 5, 6, 23, 24 Er-2, 72 Error analysis, 205
E-selectin, 290 Euler rotations, see Rotations, Euler Ewald techniques, 27 Exchange rate, 202 Experimental NMR restraints, 202
accuracy, 202, 206 internal inconsistencies, 213, 217 redundancy of restraints, 219 Extended system restraining methods, 6 Fast conformational exchange, 236 Fast field cycling, 421, 424–429 FELIX, 38, 47 3D data set to process, 191 Ferrocytochrome c, 60
FFC, see fast field cycling Field-induced residual dipolar couplings, 311 Field variation, 421 Finite receptor off-rates, 227, 268 FKBP, 67 Floating chirality, 63, 64, 67 Force Field, 5, 7, 9, 19 GROMOS, 20, 87 GROMOS 43A1, 9 parameters, 142 Forssman pentasaccharide, 287, 289 Frequency domain experiments, see Couplings, measurement of, frequency domain experiments
Function of off-rate, 277 Fuzzy graphs, 47 GAL 4, 62 GARANT, 50, 51 GATA-1 magnetic susceptibility, see Magnetic susceptibility,
diamagnetic, in DNA structure refinement, see Structure determination, examples, GATA-1 complexed to DNA GCN4 homodimer, 149–151
Generalized intensity (I) matrix, 234 Generalized kinetic (K) matrix, 234 Generalized relaxation rate (R) matrix, 233 Generic spin system object, 86 Genetic algorithms, 50 Global optimization, 50
GLOMSA, 64, 65
Glutamine-binding protein, 60 Goodness of refinement, 186 Graph theory, 41, 49 Grid search, 63 HABAS, 63, 65 Hamiltonian, see Dipolar Helix motion in myoglobin, see Motion, in myoglobin relative orientations, see Structure determination,
example, myoglobin Hemocyanin, 462 High temperature approximation, 318 Hinge-bending motion, 228–229, 240–243 Hirudin, 72 HIV integrase fragment dimer, 134 HnRNP C RNA-binding domain, 58 Holonomic distance constraints, 24 HSQC, 323–324; see also Couplings, measurement of
Human fibrinopeptide analogs, 282 Human RNP C RNA-binding domain, 58 Human transforming growth factor (hTGF), 57 Hybrid duplex, 216
Hybrid-hybrid matrix method, 163–199 advantages, 196–198 effect of added noise, 176 experimental refinement, 190–194 for 3D NOESY-NOESY data analysis, 171–176 iterative refinement calculation, 179–190 procedure, 175 refinement of a duplex DNA, 177–190 refinement of a DNA three way junction, 190–194 theory, 173 Hybrid-matrix-based algorithms, 265 IRMA, 172 MARDIGRAS, 204 MORASS, 172 Hybrid matrix of NOE intensities, 204 Hydration studies by intermolecular water-solute NOEs, 485 Indices of agreement, 209; see also R-factor and NOE R-factor crystallographic R-factor, 210 sixth-root-weighted
factor, 210
Insulin hexamer, 135, 154–155 Integration time step, 23, 24, 25, 26, 27 Intensity based experiments, see Couplings, measurement of, intensity-based experiments Intensity-restrained refinement, 264
Interaction function, 5, 7, 9, 19, 23 Interaction tensor, 455 Interactions anisotropic dipolar, see Couplings, residual dipolar, theory
electric quadrupole, 314; see also Quadrupolar
Interactions (cont.) isotropic scalar couplings, 314
Zeeman, 314 Interface filter, 147 Interleukin-8 dimer, 147, 148, 149 Intermolecular ligand-receptor dipolar relaxation, 227 Intermolecular NOE hydration studies, 485 in transferred NOESY, 229, 244, 255, 277 solvent-solute, 485, 523 theory, 487 water-DNA, 516 water-protein, 511 water-solute, 485
Intermolecular potentials, 141 Intermolecular transferred NOESY, 229, 244, 255, 277 methods for observing, 255–261 Intermonomer NOEs, 137–146
Internal motion, 202 Interproton distance restraints, 203, 217
dynamically averaged distances, 206 Intramonomer NOEs, 137 Irreducible spherical tensor (IRE), see Tensor, irreducible spherical tensors Isolated spin pair approximation (ISPA), 224, 437 Isotope-selected/filtered methods, 261 Isotropic
interactions, see Interactions, isotropic reorientation, see Molecular alignment Iterative structure calculation, 145, 157 4,12,16,19, 23 J-modulation experiments, 328 J-resolved spectroscopy, 328 Jun homodimer, 148, 149–151 Karplus relation, 5, 12, 31 Killer toxin, 72 Kinetic matrix, 230 Labile hydrogens, 477–480 Lac repressor, 47 Ladder, 56, 58 Leakage-shell model, 298, 299
Leucine zipper homodimers, 146, 148, 149–151 Libration amplitudes, 467
Ligand motions in the bound state, 229 Ligand-protein intermolecular dipolar relaxation, 272 Ligand-protein/DNA complex, 297 Ligand-protein intermolecular NOESY intensity, 277 Ligand-receptor complexes, 223 calculation of concentrations, 247–249 reversibly binding, examples of, 225, 226
Ligand-receptor interactions, 233 encounter complex, 240 multistate models, 240 two-state model, 233 LINSHA, 207 Liquid crystals, see Molecular alignment, using liquid crystals LISP, 96 Local-elevation search method, 21 Logical constraint propagation, 56 Magnetic field alignment, see Molecular alignment, using magnetic field Magnetic susceptibility anisotropic, 313, 316–328, 340–342, 348, 350 concentration dependence, 348 determination, 320–322, 340–342, 348 diamagnetic, in ubiquitin, 324–326 in aromatic systems, 317, 319–322 in DNA, 322, 326, 342 in myoglobin, 321, 340–342, 348 interaction with magnetic field, see Molecular alignment, using magnetic field
origin of, 321 paramagnetic determination, 340, 341 in myoglobin, 323–325, 340–341 in small inorganic complexes, 321 origin, see Magnetic susceptibility, paramagnetic, in myoglobin principal axis system, see Principal axis system, magnetic susceptibility Magnetization transfer, 438–442 Main chain directed strategy, 39 MARCOPOLO, 43 MARDIGRAS, 204, 281 Maximum common subgraph, 42 Mean-field approximations, 22 MEDUSA, 213 Met repressor dimer, 147, 148, 149 Metalloprotein, 209 Methods for relaxation rate determination, 361 determination of the heteronuclear NOE, 369 determination of the longitudinal relaxation time, 366 determination of the transversal relaxation time, 368 experiments for the determination of relaxation rates, 365 theory of relaxation in proteins, 361 Methods for suppressing or identifying protein-mediated spin diffusion, 249 Methylphosphonate, 216 Metric tensor, 25, 27, 31 Model free expressions for transition probabilities, 231
Molecular alignment, see also Protein alignment using bicelles, 327 using dilute liquid crystals, 313, 327 using electric field, 320 using magnetic field, 312, 314, 316–322; see also Magnetic susceptibility Molecular complexes, 224, see also Ligand-receptor complexes Molecular conformations, 202 pool of conformers, 202, 213 Molecular dynamics in torsion angle space, 157 simulated annealing with, 142–144 Molecular dynamics (MD) simulation, 9, 13, 14, 21, 23, 29 in four-dimensional Cartesian space, 22 Molecular motion libration, 467–470 models, 444–446 Monte Carlo, 59 Monte Carlo simulation, 6, 21 MORASS, 172, 174–194 3D version of, 174 iterative refinement cycles, 191
Motion, see also Motions amplitudes, 346–347 arc model, 344–347 cone model, 344–348 effects on magnetic susceptibility, 341, 348 effects on NOE measurement, 344 effects on residual dipolar couplings, 344–345, 348–352 in myoglobin function, 345 librations from spin relaxation, see Spin relaxation order parameters and time scales order matrix analysis, see Order matrix, motion characterization slow collective motion in myoglobin, 346–347 Motional model, 204 Motions bond-angle bending, 24, 26 bond-stretching, 24, 26 dominated by Coulomb interactions, 24, 27 dominated by van der Waals contacts, 24, 27 torsional, 26 water librational, 26 Multiple copy refinement, 212 Multiple-time-step algorithms, 23, 25, 27 Multiple-quantum coherence, 434–436 Mutual information method, 50 Myoglobin diamagnetic susceptibility, see Magnetic Susceptibility, diamagnetic, in myoglobin motion, see Motion paramagnetic susceptibility, see Magnetic Susceptibility, paramagnetic, in myoglobin
structure refinement, see Structure determination, examples, myoglobin Network editing sequences, 253 Neural networks, 52 Neutron diffraction intensities, 4, 19 NMR CLUST, 212 NMR data, 202 NMR experiments for intermolecular NOEs with water, see Pulse sequence for water-solute NOEs NMR methods for suppressing protein-mediated spin diffusion, 250
NMRD, see Nuclear Magnetic Relaxation Dispersion NOAH, 71, 44 NOAH/DIAMOD, 40, 47 NOE, 203, 311–312, 342; see also Motion, effects on NOE measurement between solute proton and bound but locally reorientating water, 489 between two rigidly bound protons, 488 connectivities, 45, 74
with rapidly diffusing water molecules, 490 intermolecular, 255–261 NOE assignment between symmetry mates, 156 comonomer, 137, 146 intermonomer, 137, 146 intramonomer, 137 restraint potential, 142
NOE-NOESY, 522 NOE R-factor, 298, 299; see also R-factor NOESY, 420; see also NOE 3D NOESY-NOESY data deconvolution of, 173–176, 191 gradient method for the analysis of, 165 simulation studies, 167–171 Non-bonded potential, 143 Non-crystallographic symmetry, 138–139, 156–157 Nonselective experiments, 508 Nonspecific binding, 229, 244 Non-structural protein (NS-1) from influenza A virus, 90, 97
Non-symmetric aggregation, 132 Normalization of calculated and experimental intensities, 267 Nuclear Overhauser enhancement (NOE)
Nuclei
319, 326, 338 see Quadrupolar , 323–325 , 431, 432 Nucleic acid, see DNA
, 214 Order, magnetic field induced, 314 Order matrix, 348 diagonalization, 349 motion characterization, 350–353
ordering director, 348, 350–351 relation to magnetic susceptibility parameters, 350 structure determination, 349–352 theory, 348–350 Order parameter, 29, 231, 490 intermolecular, 456–545 intramolecular, 454–456, 467–470 Order parameters from residual dipolar couplings, see Order matrix, theory from spin relaxation, see Spin relaxation, order parameters and time scales tetramerization domain, 136, 135, 148, 151–152 Packing restraints, 146 Pair of spin-lock pulses, 495 Paramagnetic susceptibility, see Magnetic susceptibility, paramagnetic Paramagnetic systems, see Magnetic susceptibility, paramagnetic
PARSE, 212 Particle-particle-particle-mesh methods, 27 Partial relaxation, 205 Pathogenesis-related protein from tomato, 72 PDB2 NOE, 282, 285 PDQPRO, 213, 214, 218 Peak ambiguity, 39 Peak picking for resonance assignments, 99 Penalty function, 6, 7, 13, 31 for lower-bound restraining, 7 for upper-bound restraining, 7 Perdeuterated receptors, 249 Phage 434 repressor, 72 Pitfalls in structure determination, 31 Platelet factor 4/IL-8 chimer tetramer, 152
distance bounds, 12, 15, 19, 23
Point groups, 132–133, 134–135
intensities, 4, 12, 15, 19
Potential distance symmetry, 139–140, 156–157 NOE, 142 non-crystallographic symmetry, 139–140, 156–157 repel, 143
relaxation matrix calculation, 12, 23, 165, 204, 223 Nuclear magnetic relaxation dispersion (NMRD), 419–421 data analysis, 462 difference, 447–451 window, 471–473
soft-square, 139, 142
square-well, 139, 142
Principal axis system (PAS) dipolar interaction, 315 magnetic susceptibility, 315–319, 323 order tensor, see Order matrix, ordering director and diagonalization Probabilities of conformers, 213 PROSPECT, 45 Protein alignment, see also Molecular alignment magnetic field induced, 311, 316 using bicelles, 327 using liquid crystals, 313, 327 Protein association, 132 Protein hydration, 511 Protein-hydration, 419 semisolid sample, 457–458 Protein-leakage effects, 277 Protein-mediated spin-diffusion effects, 228, 272 methods for suppressing, 249–255 Protein motions at the active site, 228, 229 Proteins, see GATA-1, myoglobin and ubiquitin Protocols for symmetric oligomers, 143 Pseudoatom, 62 Pseudorotation phase angle, 208, 217 Pseudosymmetry, 133–134 Pseudocontact shifts, see dipolar, shifts Pucker amplitude, 208 Pulse sequences amplitude modulated HSQC, 333 CBCA(CO)NH, 121 CBCANH, 121–125 CPMG for 369 for 393, 397 for 393, 408 for NOE, 397 for heteronuclear NOE, 394 for NOE, 372, 374 for .. 372, 374 for 372, 374 for transverse SIIS cross relaxation, 403 HNCA, 108, 110 HN(CA)CO, 104, 107 HACA(CO)NH, 109–113 HACANH, 114–116 HNCO, 104, 105 HSQC, 101–103 inversion recovery sequences for 367 multiple-quantum triple-resonance spectra, 127 NOESY-NOESY, 164 phase modulated HSQC, 334 phase-type triple-resonance spectra, 115–120, 126–127 selective coupling enhanced HSQC, 329 water flip-back HSQC, 371 Pulse sequences for water-solute NOEs 1D NOE, 499 3D NOESY-TOCSY, 509
3D , 509 3D 509 90° excitation by radiation damping, 499 HYDRA-N, 505 MEXICO, 499 PHOGSY, 505 water excitation by 180° pulse, 505 with 160° water excitation, 505 with diffusion filter, 505 with Q-switched selective 90° pulse, 499 with WANTED sequences, 505 with WEX-I filter, 499 with WEX-II filter, 499 WNOESY, 499 Q-switch, 503 Q(1/6) factor, 180 Quadratic objective function, 214 Quadratic programming algorithm, 213 Quadrupolar relaxation, 433 Quadrupolar coupling constant, 321 interaction, see Interactions, anisotropic, electric quadrupole nuclei, see Nuclei, splittings, 320–321 Quadrupole coupling constant, 466–467 Quantitative J, see Couplings, measurement of, intensity based experiments Quasi-symmetry, 133 example of leucine zippers, 133 Quiet-bird-NOESY, 255 Quiet-EXSY, 255 Quiet-NOESY, 255 R-factor, 29, 156, 157, 180, 210, 298, 299; see also NOE R-factor averaging, 65, 67 summation, 65, 67 Radiation damping, 497 Ramachandran map, 326 RANDMARDI, 205 Real space assignment, 39, 61 Relaxation filters, 257–260 spin-echo, 257–260 spin-lock, 257–260 Relaxation rate, 214 Relaxation rate matrix, 165, 230 Relaxation-reagent, 508 Relaxation time, 24, 25 Relaxation BWR theory, 433, 458 chemical-shift modulation, 442–443, 479–480 cross, 437–442 due to isotropic couplings, 442
deuteron, 431, 433–434 dipolar, 437–442 dispersion, see Nuclear Magnetic Relaxation Dispersion effectively exponential, 434 exchange averaging, 446–447 filters, 257–260 generalized theory, 458 mechanisms, 432–444 oxygen-17, 431–432, 434 proton, 430–431 quadrupolar, 433–437 scalar, 443–444 stochastic theory, 458–462 temporal resolution, 351 time measurement, 422, 428 Reliability distance, 71
Reorientation local, 489 Repel potential, 143 Residence times, 519 Residual dipolar couplings, see Couplings, residual dipolar Restrained molecular dynamics, 202, 208 time-averaged molecular dynamics, 212 trajectories, 212 Restraints, 138–140 comonomer, 146 distance symmetry, 139–140, 156–157 NOE distance, 142 non-bonded, 143 non-crystallographic symmetry, 138–139, 156–157 packing, 146 space group, 132 Reversibly binding molecular complexes, 225, 226 Ribonuclease, 472 RID method, 56, 57 RNA DNA hybrid, 207, 217 RNA hairpin, 206 ROESY, 206 Hartmann-Hahn transfer of magnetization (HOHAHA), 206 Rotations Euler, 315–318, 351 Wigner, 316–318, 351 SAR-by-NMR, 301 Saturation of receptor resonances, 251 Sanson-Flamsteed projection, 350–351 Screening of compound libraries, 301 SECODG, 40, 67, 68 Selective water excitation, 496 Selective water excitation by a 90° pulse, 498 Selective water excitation by a 180° pulse, 504 Self-correcting distance geometry, 40, 67, 68
SERENDIPITY, 39, 46, 49 SH3, 66 SHAKE method, 26 Sialyl tetrasaccharide, 290, 294 complexed to E-selectin, 297 Side-chain derived from relaxation rates, 395 dynamical parameters derived from relaxation times and steady state NOE, 396 SIIS Cross Relaxation, 402 Simulated annealing, 59, 60 see Structure determination, simulated annealing using molecular dynamics, 142–144 Simulated temperature annealing, 22, 25 Simulated transferred NOESY, 267 Simulation of NMR cross-peaks, 207 Single target function, 264 Soft-square potential, 139, 142 Spectral Density Function, 452–457 SPHINX, 207 Spin relaxation contributions to multiplets, 336–338 CSA/dipole-dipole, 336–337 dipole-dipole/dipole-dipole, 337–338 dynamic frequency shifts, 337–339 effects on couplings measurement, 337–339 nuclear dipole/Curie spin-nuclear dipole, 339 order parameters and time scales, 344–346 Square-well potential, 139, 142 dimer (Single-stranded DNA binding protein), 148, 149 Staphylococcal protein A, 57 Stereoconfiguration, 217 STEREOSEARCH, 64, 65 Stereospecific assignment, 209 Stereospecific assignment, 62, 63 Stochastic dynamics (SD) simulation, 21, 23 restraining methods, 6 Structural relevance, 521 Structural restraints, 203 Structure based design, 227, 301, 302 factor, 5, 12, 19, 23 refinement, 24, 202 well defined, 144 Structure calculation iterative, 145, 157 protocols, 142, 144 Structure determination examples, 322–344 GATA-1 complexed to DNA, 326 myoglobin, 322–323, 342–344 order matrix approach, see Order matrix, structure determination protocols, 339 simulated annealing, 339–344
ubiquitin, 324–326 Structure-based drug design, 302 Structure-based filters, 70, 71 Studies of protein hydration, 511 Studies of DNA and RNA hydration, 516 Sugar conformation, 208
Surface hydration water, 470–471 SYMM, 205 Symmetric oligomers, 131
examples of, 148 interface between, 147 possible point group symmetries, 133–135 solved by NMR, 131, 149–151, 152–155 structure calculation method, 138–147 Symmetric aggregation, 132 dimers, 133, 134–135 hexamers, 133, 134–135 oligomers, 131 pentamers, 133, 134–135 trimers, 133, 134–135 tetramers, 133, 134–135 Symmetrization matrix, 235 Symmetry ADR method, 138 problems, 155 pseudo, 133–134 quasi, 133 Symmetry degeneracy, 136, 137, 138 linear group, 132 point group, 132–133, 134–135
spin-echo relaxation filters, 257
spin-lock filter, 257 Temperature used in simulated annealing, 143–144 Temperature control, 423
Toxin III, 66 Tr-NOESY-based screening of compound libraries, 301 Transferred NOESY, 223, 225 analytical expressions for a two-spin system, 238 CORCEMA theory of, 223 effect of finite off-rates, 227, 268–270 effect of ligand-receptor ratio, 270–272 effect of motions in the protein-ligand complex, 278–281 effect of protein-mediated spin diffusion, 227, 228, 272–277
effect of protein-leakage, 228, 272–270 in structure-based design, 227, 301 intermolecular, 229, 277–278 for multi-state models, 240–243
for two state models, 233–240 on systems with encounter complexes, 240 screening of compound libraries, 301 simulation using CORCEMA, 289–301 simulation using PDB2 NOE, 282, 285 Transferred NOESY difference spectroscopy, 256 Transferred NOESY with short mixing times, 250 Transverse relaxation, 336 Treatment for more than two states, 240
Triple-resonance NMR, 82; see also Pulse sequences Troponin C EF hand, TNCIIIdimer, 147, 148, 149 Trp-repressor-complex operator, 227, 297 Twin-range method, 25 Two-dimensional ROESY, 251 Two-state model of ligand-receptor interactions, 233 Ubiquitin, 448 diamagnetic susceptibility, see Magnetic susceptibility, diamagnetic, in ubiquitin structure refinement, see Structure determination, examples, ubiquitin United-atom model, 8 Upper distance constraints, 74
Tendamistat, 19, 51
Variable target function, 21, 68, 69, 264
Tensor irreducible spherical (IRE), 315–319 magnetic susceptibility, see Magnetic susceptibility operators, 315–319 order, see Order matrix Theoretical background for intermolecular NOEs, 487 Thermolysin-inhibitor complex, 241 Three dimensional volume matrix, 166
VTB, B subunit of verotoxin, 135, 154–155
Three-spin effects, 224 Thrombin, 282 Time averaging, 14, 15 restraining, 14, 15, 16, 18, 21, 31 structure refinement, 15 Torsion-angle dynamics, 27, 31 Torsion angle restraints, 207 Toxin OSK1, 66
Water excitation, selective, 496 Water flipback, 502 Watergate, 495 demagnetizing fields, 511 diffusion filter, 496 dipolar field effects, 506, 511
spin-lock pulses, 495 Water-internal, 447 residence time, 453, 472–475 Water-protein magnetization transfer, 438–439 Water relaxation in semisolid proteins, 457 Water residence time, 491, 519 Water suppression, 494 Weak-coupling restraining methods, 6
Well-defined structure, 144 Wigner rotations, see Rotations, Wigner X-ray diffraction intensities, 4, 15, 19 scattering factors, 12
X-filtered spectroscopy, 138, 156 X-PLOR, 138, 139 XEASY, 38, 51 Z-Domain of staphylococcal protein A, 90, 97 Zeeman interaction, see Interactions, isotropic, Zeeman