TECHNIQUES IN PROTEIN CHEMISTRY VI
Edited by
John W. Crabb
W. Alton Jones Cell Science Center, Inc.
Lake Placid, New York

ACADEMIC PRESS
San Diego New York Boston London Sydney Tokyo Toronto
Academic Press Rapid Manuscript Reproduction
This book is printed on acid-free paper. Copyright © 1995 by ACADEMIC PRESS, INC. All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Academic Press, Inc. A Division of Harcourt Brace & Company 525 B Street, Suite 1900, San Diego, California 92101-4495 United Kingdom Edition published by Academic Press Limited 24-28 Oval Road, London NW1 7DX
Library of Congress Card Catalog Number: 94-230592 International Standard Book Number: 0-12-194712-2 (case) International Standard Book Number: 0-12-194713-0 (comb)
PRINTED IN THE UNITED STATES OF AMERICA
95 96 97 98 99 00 EB 9 8 7 6 5 4 3 2 1
Contents

Foreword
Preface
Acknowledgments
Section I Mass Spectrometry of Peptides and Proteins

The Use of a Volatile N-Terminal Degradation Reagent for Rapid, High-Sensitivity Sequence Analysis of Peptides by Generation of Sequence Ladders
M. Bartlet-Jones, W. A. Jeffery, H. F. Hansen, and D. J. C. Pappin

Investigation of Polyethylene Membranes as Potential Sample Supports for Linking SDS-PAGE with MALDI-TOF MS for the Mass Measurement of Proteins
James A. Blackledge and Anthony J. Alexander

Comparison of ESI-MS, LSIMS, and MALDI-TOF-MS for the Primary Structure Analysis of a Monoclonal Antibody
Leticia Cano, Kristine M. Swiderek, and John E. Shively

MS Based Scanning Methodologies Applied to Conus Venom
A. Grey Craig, Wolfgang H. Fischer, Jean E. Rivier, J. Michael McIntosh, and William R. Gray

Direct Coupling of an Automated 2-Dimensional Microcolumn Affinity Chromatography-Capillary HPLC System with Mass Spectrometry for Biomolecule Analysis
D. B. Kassel, T. G. Consler, M. Shalaby, P. Sekhri, N. Gordon, and T. Nadler

Edman Degradation and MALDI Sequencing Enables N- and C-Terminal Sequence Analysis of Peptides
Roland Kellner, Gert Talbo, Tony Houthaeve, and Matthias Mann

Identification of the Amino Terminal Peptide of N-Terminally Blocked Proteins by Differential Deutero-Acetylation Using LC/MS Techniques
Craig D. Thulin and Kenneth A. Walsh
Section II Analysis of Posttranslational Processing Events

HPAEC-PAD Analysis of Monoclonal Antibody Glycosylation
Jeffrey Rohrer, Jim Thayer, Nebojsa Avdalovic, and Michael Weitzhandler

Carbohydrate Structure Characterization of Two Soluble Forms of a Ligand for the ECK Receptor Tyrosine Kinase
Christi L. Clogston, Patricia L. Derby, Robert Toso, James D. Skrine, Ming Zhang, Vann Parker, G. Michael Fox, Timothy D. Bartley, and Hsieng S. Lu

Characterization of Individual N- and O-Linked Glycosylation Sites Using Edman Degradation
A. A. Gooley, N. H. Packer, A. Pisano, J. W. Redmond, K. L. Williams, A. Jones, M. Loughnan, and P. F. Alewood

The Unexpected Presence of Hydroxylysine in Noncollagenous Proteins
Michael S. Molony, Shiaw-Lin Wu, Lene K. Keyt, and Reed J. Harris

Isolation of Escherichia coli Synthesized Recombinant Proteins that Contain ε-N-Acetyllysine
Bernard N. Violand, Michael R. Schlittler, Cory Q. Lawson, James F. Kane, Ned R. Siegel, Christine E. Smith, and Kevin L. Duffin

LC-MS Methods for Selective Detection of Posttranslational Modifications in Proteins: Glycosylation, Phosphorylation, Sulfation, and Acylation
Mark F. Bean, Roland S. Annan, Mark E. Hemling, Mary Mentzer, Michael J. Huddleston, and Steven A. Carr

Identification of Phosphorylation Sites by Edman Degradation
John D. Shannon and Jay W. Fox

Determination of the Disulfide Bonds of Human Macrophage Chemoattractant Protein-1 Using a Gas Phase Sequencer
Ramnath Seetharam, Jeanne L Corman, and Shubhada M. Kamerkar
Section III Protein Sequencing and Amino Acid Analysis

Enzymatic Digestion of PVDF-Bound Proteins in the Presence of Glucopyranoside Detergents: Applicability to Mass Spectrometry
Joseph Fernandez, Farzin Gharahdaghi, and Sheenah M. Mische

In-Gel Digestion of SDS PAGE-Separated Proteins: Observations from Internal Sequencing of 25 Proteins
Kenneth R. Williams and Kathryn L. Stone

Peptide Mapping at the 1 µg Level: In-Gel vs. PVDF Digestion Techniques
Lee Anne Merewether, Christi L. Clogston, Scott D. Patterson, and Hsieng S. Lu

Enzymatic Digestion of Proteins in Zinc Chloride and Ponceau S Stained Gels
Sharleen Zhou and Arie Admon

Direct Collection Onto Zitex and PVDF for Edman Sequencing: Elimination of Polybrene
William A. Burkhart, Mary B. Moyer, Wanda M. Bodnar, Anita M. Everson, Violeta G. Valladares, and Jerome M. Bailey

Minimizing N-to-O Shift in Edman Sequencing
William H. Vensel and George E. Tarr

The Hydrolysis Process and the Quality of Amino Acid Analysis: ABRF-94AAA Collaborative Trial
K. Ümit Yüksel, Thomas T. Andersen, Izydor Apostol, Jay W. Fox, Raymond J. Paxton, and Daniel J. Strydom

A New Reagent for Cleaving at Cystine Residues
C. Mitchell, L. Hinman, L. Miller, and P. C. Andrews

Protein Sequence Analysis Using Microbore PTH Separations
Michael F. Rohde, Christi Clogston, Lee Anne Merewether, Patricia Derby, and Kerry D. Nugent

Assignment of Cysteine and Tryptophan Residues during Protein Sequencing: Results of ABRF-94SEQ
Jay Gambee, Philip C. Andrews, Karen DeJongh, Greg Grant, Barbara Merrill, Sheenah Mische, and John Rush

Automated C-Terminal Protein Sequence Analysis Using the Hewlett-Packard G1009A C-Terminal Protein Sequencing System
Chad G. Miller, David H. Hawke, Jacqueline Tso, and Sherrell Early

Applications Using an Alkylation Method for Carboxy-Terminal Protein Sequencing
MeriLisa Bozzini, Jindong Zhao, Pau-Miau Yuan, Doreen Ciolek, Yu-Ching Pan, John Norton, Daniel R. Marshak, and Victoria L. Boyd

C-Terminal Sequence Analysis of Polypeptides Containing C-Terminal Proline
Jerome M. Bailey, Oanh Tu, Gilbert Issai, and John E. Shively
Section IV Peptide and Protein Separations and Other Methods

High Sensitivity Detection of Tryptic Digests Using Derivatization and Fluorescence Detection
Steven A. Cohen, Igor Mechnikov, and Patricia Young

Reagents for Rapid Reduction of Disulfide Bonds in Proteins
Rajeeva Singh and George M. Whitesides

Strategies for the Removal of Ionic and Nonionic Detergents from Protein and Peptide Mixtures for On- and Off-Line Liquid Chromatography Mass Spectrometry (LCMS)
Kristine M. Swiderek, Michael L. Klein, Stanley A. Hefta, and John E. Shively

Online Preparation of Complex Biological Samples prior to Analysis by HPLC, LC/MS, and/or Protein Sequencing
Ken Stoney and Kerry Nugent

Methods for the Purification and Characterization of Calcium-Binding Proteins from Retina
Arthur S. Polans, Krzysztof Palczewski, Wojciech A. Gorczyca, and John W. Crabb

Evidence for the Presence of α-Bungarotoxin in Venom-Derived κ-Bungarotoxin
James J. Fiordalisi and Gregory A. Grant

Progress in the Development of Solvent and Chromatography Systems Appropriate for Bitopic Membrane Proteins
Song-Jae Kil, Lisa M. Oleksa, Geoffrey C. Landis, and Charles R. Sanders II

Rapid Separation of Proteins and Peptides Using Conventional Silica-Based Supports: Identification of 2-D Gel Proteins following In-Gel Proteolysis
Robert L. Moritz, James Eddes, Hong Ji, Gavin E. Reid, and Richard J. Simpson
Section V Mutagenesis and Protein Design

Studying α-Helix and β-Sheet Formation in Small Proteins
Catherine K. Smith, Mary Munson, and Lynne Regan

Circular Permutation of RNase T1 through PCR Based Site-Directed Mutagenesis
Jane M. Kuo, Leisha S. Mullins, James B. Garrett, and Frank M. Raushel

E. c30 kDa) Proteins and Macromolecular Complexes
Cheryl H. Arrowsmith, Weontae Lee, Matthew Revington, Toshio Yamazaki, and Lewis E. Kay

Solution Structures of Horse Ferro- and Ferricytochrome c Using 2D and 3D ¹H NMR and Restrained Simulated Annealing
Phoebe X. Qi, Ernesto J. Fuentes, Robert A. Beckman, Deena L. Di Stefano, and A. Joshua Wand

NMR Relaxation Methods to Study Ligand-Receptor Interactions
David W. Hoyt, Jian-Jun Wang, and Brian D. Sykes
Section IX Peptide Synthesis

Application of 2-Chlorotrityl Resin: Simultaneous Synthesis of Peptides Which Differ in the C-Termini
Anita L. Hong, Tin T. Le, and Trung Phan

Correlation of Cleavage Techniques with Side-Reactions following Solid-Phase Peptide Synthesis
Gregg B. Fields, Ruth H. Angeletti, Lynda F. Bonewald, William T. Moore, Alan J. Smith, John T. Stults, and Lynn C. Williams

Protein Synthesis on a Solid Support Using Fragment Condensation
Siegfried Brandtner and Christian Griesinger

Characterization of a Side Reaction Using Stepwise Detection in Peptide Synthesis with Fmoc Chemistry
Yan Yang, William V. Sweeney, Susanna Thornqvist, Klaus Schneider, Brian T. Chait, and James P. Tam

Erratum

High Sensitivity Peptide Sequence Analysis Using in Situ Proteolysis on High Retention PVDF Membranes and a Biphasic Reaction Column Sequencer
Sandra Best, David F. Reim, Jacek Mozdzanowski, and David W. Speicher

Index
Foreword
I express my sincere thanks to John W. Crabb for editing this outstanding volume of Techniques in Protein Chemistry. Last year John edited Volume V, which was lauded as a useful "bench-top" manual. Volume VI promises timely information for many scientists. Organizing the topics and contacting the potential authors of papers for each volume are difficult tasks, but crucial for the success of the book. John and his associates were diligent; they collected an impressive array of papers on diverse subjects of interest to the broad membership of the Society. Most of the papers in Volume VI were obtained from the poster sessions as has been our custom. The Eighth Symposium of the Protein Society, held in San Diego July 9-13, 1994, was very well attended, with almost 1200 scientists and 630 exhibitors. Six hundred and fifty posters were presented and 66 companies displayed their products. We expect next year's symposium to be as successful. The ABRF annual meeting will be held in Boston, Massachusetts, July 7-8, 1995, immediately followed by the Ninth Symposium of the Protein Society (July 8-12). The next volume of Techniques, to be edited by Daniel R. Marshak, will be based on the scientific content of these meetings.
Joseph J. Villafranca President The Protein Society
Preface
As in previous volumes of this series, Techniques in Protein Chemistry VI highlights current methods in peptide and protein chemistry. Contributions were selected from over 650 abstracts submitted for presentations at the eighth annual symposium of the Protein Society held in San Diego in July 1994. The authorship is international, with contributions from Australia, Canada, England, Germany, Israel, Japan, and the United States. Chapters focus on mass spectrometry, sequence and amino acid analysis, separations, protein folding and conformation, peptide and protein NMR, and peptide synthesis. In addition, the mutagenesis and protein design section has been expanded and a new section addresses analysis of protein interactions. Very special thanks are due the associate editors who have helped make both Techniques V and VI high-quality resource books. This year's associate editors, all of whom assisted with last year's volume, include Richard M. Caprioli, Gerald M. Carlson, Steven A. Carr, Gregory A. Grant, Michael F. Rohde, David W. Speicher, Leonard D. Spicer, and Kenneth R. Williams. I am also grateful for contributions to the editorial process from William Seifert and Dan Marshak and for the timely assistance of Shirley Light of Academic Press and my secretary, Valerie Oliver. Essential support was provided by the president of the Protein Society (Joe Villafranca) and the secretary/treasurer (George Rose). Finally, I thank all the authors for their cooperation in meeting deadlines and providing their up-to-date methodology.
John W. Crabb
Acknowledgments

The Protein Society acknowledges with thanks the following organizations who through their support of the Society's program goals contributed in a meaningful way to the eighth annual symposium and thus to this volume.

Amgen Inc.
Applied Biosystems Inc., Division of Perkin-Elmer
Autodesk Inc.
Beckman Instruments, Inc.
Biosym Technologies, Inc.
Bristol-Myers Squibb
Brookhaven National Laboratory
Dionex Corporation
Du Pont Merck Pharmaceutical Co.
Finnigan MAT
Fisons Instruments
Hewlett-Packard Co.
Merck Sharp & Dohme Research Laboratories
Michrom BioResources, Inc.
Millipore Corporation
Oxford Molecular, Inc.
PerSeptive Biosystems, Inc.
Pharmacia Biotechnology, Inc.
Pickering Laboratories, Inc.
Promega Corporation
Rainin Instrument Co.
Supelco, Inc.
Vestec Corporation
VYDAC
SECTION I Mass Spectrometry of Peptides and Proteins
The Use of a Volatile N-Terminal Degradation Reagent for Rapid, High-Sensitivity Sequence Analysis of Peptides by Generation of Sequence Ladders

M. Bartlet-Jones, W.A. Jeffery, H.F. Hansen, and D.J.C. Pappin
Protein Sequencing Laboratory, Imperial Cancer Research Fund, PO Box 123, Lincoln's Inn Fields, London WC2A 3PX, UK

I. Introduction

For more than 25 years, automated Edman chemistry [1,2] has remained the favoured method for routine protein sequence analysis. Several limitations, however, have never been overcome. The procedure is inherently slow and does not allow direct identification of many posttranslational modifications. In addition, current detection limits are only at the level of hundreds of femtomoles [3]. Large-format 2D-electrophoresis systems now make it possible to resolve several thousand proteins from whole-cell lysates in the low- to upper-femtomole concentration range [4,5], and more versatile and sensitive methods of protein sequencing are needed to meet analytical problems of this scale.

The recent introduction of matrix-assisted laser-desorption (MALDI) time-of-flight mass spectrometers [6] has led to the rapid analysis (at high sensitivity) of peptide mixtures. Using this technology, Chait et al. [7] developed a novel sequencing strategy involving the production of a nested set of peptides, each peptide differing from its precursor by the loss of one amino acid. The peptide mixtures were analysed by MALDI time-of-flight mass spectrometry, with the generated 'ladder' sequence read directly from the mass spectrum in a single step. The nested set of peptides was generated using phenylisothiocyanate (PITC) in the presence of a small molar ratio of phenylisocyanate (PIC). Repeated reaction cycles generated a ragged-end set of peptides terminated with non-cleavable phenylureas. The potential advantages of this approach were the speed of analysis (minutes), femtomole detection limits and the ability to process samples in parallel.
The described PITC:PIC procedure was essentially derived from early, manual degradation protocols [8,9]. The need to remove excess PITC, buffer and thiourea by-products by repeated solvent extraction remains a major cause of peptide loss and limits the manual processing of large numbers of samples. In order to achieve the routine sequencing of peptides at the low pico- and femtomolar levels it is desirable to keep the number of manipulations to a minimum. Any transfer or extraction of sample may lead to loss of peptide. It is also necessary to eliminate the accumulation of contaminating by-products which cause suppression of sample signal. This led us to the working hypothesis that all reagents and by-products of the chemistry should be volatile in order to minimise work-up procedures which result in peptide loss. For this purpose we synthesised a novel, volatile isothiocyanate (trifluoroethyl isothiocyanate; TFEITC) that demonstrated the necessary characteristics.

II. Materials and Methods

Reactions were performed in Hewlett Packard 100 µl glass minivials with 8 mm cap seals. Vacuum systems comprised 2-stage Edwards E2M8 pumps coupled to dual-trap assemblies composed of -70°C refrigerated solvent traps (V.A. Howe Ltd.) and ACE no. 15 solvent traps cooled with dry ice. Separate, independent vacuum systems and desiccators were used for removal of acid or base. Synthetic peptide CD28-3PY (phosphorylated) was a gift from Miss N. O'Reilly and Miss E. Li (ICRF). Human synthetic [Glu1]-fibrinopeptide B was purchased from Sigma Chemical Co. and 12.5% w/v aqueous trimethylamine (protein sequencing grade) from Applied Biosystems. Toluenesulphonic acid monohydrate, alpha-cyano-4-hydroxycinnamic acid and trifluoroethanol (99%+) were obtained from Aldrich; trifluoroacetic acid (HPLC/Spectro grade) and heptafluorobutyric acid (anhydrous) from Pierce. The TFA was redistilled from 2-aminoethanethiol (1 g/L); the HFBA was used as received.
Stock solutions of trimethylammonium bicarbonate buffer were prepared by bubbling dry carbon dioxide into 12.5% w/v aq. trimethylamine until the required pH (8.5) was reached. These stock solutions were stable (weeks) if stored at 0-4°C in a sealed container under nitrogen. Sequencing coupling buffer was prepared from trifluoroethanol : water : 12.5% trimethylammonium bicarbonate pH 8.5 (5:4:1 v/v). Trifluoroethylisothiocyanate (TFEITC; I) and N-hydroxysuccinimidyl 6-trimethylammoniumhexanoate (a C-5 linker quaternary-NHS ester; II) were synthesised in these laboratories [10]. The TFEITC reagent was generally used and stored as a 10% v/v solution in acetonitrile (stable for weeks at room temperature).
[Structures: (I) trifluoroethyl isothiocyanate, F3C-CH2-NCS; (II) C-5 linker quaternary NHS ester.]
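The 5:4:1 (v/v) coupling-buffer recipe given above scales linearly with the total volume required. The following is a minimal sketch of that arithmetic only; the function and dictionary names are illustrative and do not come from the original protocol.

```python
from fractions import Fraction

# Volume ratio from the text:
# trifluoroethanol : water : 12.5% trimethylammonium bicarbonate pH 8.5 = 5 : 4 : 1
COUPLING_BUFFER_PARTS = {
    "trifluoroethanol": 5,
    "water": 4,
    "trimethylammonium bicarbonate (12.5%, pH 8.5)": 1,
}


def component_volumes(total_ul, parts=COUPLING_BUFFER_PARTS):
    """Split a total volume (µl) among components according to a v/v ratio."""
    total_parts = sum(parts.values())
    return {
        name: float(Fraction(p, total_parts) * total_ul)
        for name, p in parts.items()
    }


if __name__ == "__main__":
    # Component volumes for 1 ml of coupling buffer.
    for name, volume in component_volumes(1000).items():
        print(f"{name}: {volume:.0f} µl")
```

The same helper could be reused for any other fixed-ratio mixture by passing a different `parts` dictionary.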
A. Ladder Generation

Peptides were dissolved in coupling buffer in sufficient volume to give n+1 aliquots of 2.5 µl, where n is the number of cycles (i.e. a peptide requiring sequencing through 5 cycles was dissolved in 15 µl). Aliquots of 2.5 µl were pipetted into a minivial and a further 2.5 µl of 10% v/v TFEITC in acetonitrile immediately added. The vial was sealed, heated at 80°C for 5 min, then diluted with 5 µl of water. Solvent, excess reagent and coupling base were evaporated under vacuum for 10 min (4×10⁻² mbar) in the presence of toluenesulphonic acid and phosphorus pentoxide. Cleavage was initiated by the addition of 2.5 µl of either trifluoroacetic acid (TFA) or heptafluorobutyric acid (HFBA), sealing the vial and again heating to 80°C for 5 min. Acid was removed in vacuo (4×10⁻² mbar for 10 min) in a separate vacuum system using sodium hydroxide. This procedure was repeated n times, where n was equal to the number of required cycles. Following the final cleavage step, the last aliquot of peptide was added, followed by the addition of 5 µl of water, and the solution dried under high vacuum (base vacuum system) for a minimum of 1 hour.

B. Derivatisation with quaternary amines

Peptide (20-50 fmol in 0.5 µl) was pipetted onto a target slide and allowed to air-dry for 5 minutes. The slide was then cooled on ice and 0.5 µl of a pre-cooled, freshly prepared solution of 0.25% w/v C-5 quaternary N-hydroxysuccinimide ester dissolved in 12% w/v trimethylammonium bicarbonate (pH 8.5) was added and left over ice for a further 10 min. The slide was then allowed to warm to room temperature, dried under high vacuum (15 min) and matrix added for MS analysis as described below.

C. MS analysis of samples

Peptide samples or ladder mixtures were dissolved in 50% aq. acetonitrile/0.1% TFA (3-5 µl) and sonicated for 5 min to aid solvation. Small aliquots (0.3-0.5 µl) were pipetted onto a target slide and allowed to air-dry (approximately 5 min). One 0.3 µl aliquot of matrix solution (1% w/v alpha-cyano-4-hydroxycinnamic acid in 50% aq. acetonitrile/0.1% v/v TFA) was finally applied to the dried sample and again allowed to air-dry. Peptide spectra were obtained using a Finnigan MAT LaserMat mass spectrometer essentially as described by Mock et al. [11].

III. Results and Discussion

In order to maintain adequate repetitive sequencing yields, reagent coupling and cleavage kinetics should be similar to those of PITC. Short-chain alkyl isothiocyanates (e.g. methyl or ethyl isothiocyanate) are characterised by coupling rates some 5-10 times slower than PITC [9]. In the case of the TFEITC reagent, the strong electron-withdrawing effect of the trifluoromethyl group enhances nucleophilic attack, giving reaction kinetics only 50% slower than PITC. The single methylene spacer between the trifluoromethyl and the isothiocyanate group also enhances the cleavage kinetics, making cleavage faster than with PITC. Finally, the low boiling point of this reagent (approx. 94°C) allows rapid removal under moderate vacuum.

In the procedure described in this manuscript, the nested set of peptides was generated simply by adding fresh peptide to each cycle and driving both the coupling and cleavage chemistry to completion. No additional reagents were required to act as chain terminators.

[Flow diagram: add peptide in coupling buffer; add volatile isothiocyanate (TFEITC); couple (5 min, 80°C); dry under vacuum (10 min); cleave (5 min, 80°C); dry under vacuum (10 min).]
Figure 1: Flow-diagram of TFEITC degradation protocol.

The process is summarised in Figure 1 and as follows:
[Spectrum not reproduced; x-axis: Mass (m/z), 900-2600.]
Figure 2: A ladder sequence through 6 cycles of a synthetic peptide (CD28-3PY) carried out on a total of 17.5 picomoles of starting peptide. The peptide is especially interesting in that it contains both a proline and phosphorylated tyrosine residue. Both residues undergo the sequencing chemistry satisfactorily.
[Spectrum not reproduced; x-axis: Mass (m/z), 950-1700.]
Figure 3: A ladder sequence through 6 cycles of [Glu1] fibrinopeptide B carried out on a total of 17.5 picomoles of starting peptide.
In cycle 1, all added peptide is shortened by one residue to give peptide-1. In cycle 2, this peptide quantitatively loses a second residue to become peptide-2 and the freshly added peptide becomes peptide-1. This process is repeated for the required number of cycles. The main practical requirement is that the starting peptide is dissolved in a defined volume of coupling buffer, dependent only on the number of cycles required and the volume applied per cycle. As all excess reagent, buffer and volatile reaction by-products are removed under vacuum, there are no extractive or transfer losses. It is important that all coupling base is removed after each coupling step so that salt formation with the cleavage acid is minimised. Residual trimethylamine or the conjugate TFA salt can cause significant suppression of signal in the mass spectrometer. For this reason two independent vacuum systems containing suitable trapping agents were used for the removal of acid or base. Results obtained on a number of peptide samples are shown in Figures 2-4. Ladder sequences of 5-6 residues were successfully generated on test peptides starting with as little as 600 fmol of material. One example peptide (Fig. 2) is of particular interest in that an internal phosphotyrosine residue was identified using only 17.5 picomoles of starting peptide. There was no evidence of loss of phosphate from the modified residue. This and other examples (Figs. 3 and 4) demonstrate practical ladder sequencing at levels between 10 and 200 times more sensitive than reported in the initial work of Chait et al. [7]. Proline and hydroxy-amino acids (often problematic for the standard Edman chemistry) have presented few problems with the TFEITC reagent (an example of sequence through proline is shown in Figure 2). In some trial experiments (not shown) up to two dozen samples were processed simultaneously. 
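The bookkeeping described above (n+1 aliquots of peptide, with every peptide in the vial losing one residue per cycle) can be sketched in a few lines. This is an illustrative model only: it assumes complete coupling and cleavage at every cycle, as the text describes, and the function names and the test sequence are hypothetical rather than taken from the chapter.

```python
ALIQUOT_UL = 2.5  # volume of peptide solution added per cycle (Section II.A)


def dissolution_volume(cycles, aliquot_ul=ALIQUOT_UL):
    """Coupling-buffer volume (µl) for `cycles` cycles: n + 1 aliquots."""
    return (cycles + 1) * aliquot_ul


def simulate_ladder(sequence, cycles):
    """Return the peptides present after `cycles` rounds of degradation.

    Each round adds one fresh aliquot of full-length peptide and then removes
    one N-terminal residue from every peptide in the vial (complete coupling
    and cleavage assumed). A final, unreacted aliquot is added at the end.
    """
    if cycles >= len(sequence):
        raise ValueError("more cycles than residues")
    vial = []
    for _ in range(cycles):
        vial.append(sequence)            # fresh aliquot for this cycle
        vial = [p[1:] for p in vial]     # every peptide loses one residue
    vial.append(sequence)                # last aliquot, left intact
    return sorted(vial, key=len, reverse=True)


if __name__ == "__main__":
    print(dissolution_volume(5), "µl for 5 cycles")
    # Hypothetical peptide, one-letter code, three cycles.
    for peptide in simulate_ladder("GASPVTF", 3):
        print(peptide)
```

If the starting material is divided equally among the n+1 aliquots, the 17.5 pmol used for the six-cycle ladders in Figures 2 and 3 would correspond to 2.5 pmol per aliquot; the chapter does not state this figure explicitly.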
One advantage of generating a sequence without the use of chain-terminating reagents is that the terminal amino group is retained. In contrast to the Chait procedure, where all positive charges may be replaced by neutral phenylureas or thioureas, retention of the N-terminal primary amino group improves the ionisation efficiency of the resultant peptides. Other problems associated with the PITC:PIC procedure include the labelling of internal lysine residues with both isothiocyanate and isocyanate groups (yielding products of different mass) and side-reactions associated with N-terminal phenylureas. Retention of the terminal amine also allows the potential for further modification of the peptide with sensitivity-enhancing molecules such as the C-5 alkyl quaternary ammonium activated ester developed in this laboratory (Figure 5). N-terminal derivatisation of peptides with 'fixed-charge' compounds including quaternary amines, phosphonium and pyridinium ions has already been shown to simplify the daughter-ion spectra of peptides produced by high-energy collision-induced fragmentation [12,13]. It is possible that the activated quaternary NHS esters reported here may be useful for facile modification of peptides to aid interpretation of CAD spectra for sequence analysis by such MS/MS techniques.

Figure 4: A ladder sequence through 5 cycles of [Glu1] fibrinopeptide B carried out on a total of 600 femtomoles of starting peptide.

[Spectrum not reproduced; x-axis: mass, 1000-4800.]
Figure 5: Quantitative derivatisation of 50 fmol [Glu1] fibrinopeptide B with a C-5 quaternary linked tag. The derivatisation was performed on the peptide sample already spotted onto the target slide.

Table I: Monoisotopic residue masses

Amino acid   Residue mass   Amino acid   Residue mass
Gly           57.02         Asp          115.03
Ala           71.04         Lys          128.09
Ser           87.03         Gln          128.06
Pro           97.05         Glu          129.04
Val           99.07         Met          131.04
Thr          101.05         His          137.07
Cys          103.01         Phe          147.07
Ile          113.08         Arg          156.10
Leu          113.08         Tyr          163.06
Asn          114.04         Trp          186.08
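Reading a ladder amounts to matching the mass difference between adjacent peaks against Table I. The sketch below shows one way this lookup might be done; the 0.3 Da tolerance, the function names and the example peak list are assumptions made for illustration, and modified residues (for example phosphotyrosine, or lysine carrying the TFEITC adduct) are not modelled.

```python
# Monoisotopic residue masses (Da) as listed in Table I.
RESIDUE_MASS = {
    "Gly": 57.02, "Ala": 71.04, "Ser": 87.03, "Pro": 97.05, "Val": 99.07,
    "Thr": 101.05, "Cys": 103.01, "Ile": 113.08, "Leu": 113.08, "Asn": 114.04,
    "Asp": 115.03, "Lys": 128.09, "Gln": 128.06, "Glu": 129.04, "Met": 131.04,
    "His": 137.07, "Phe": 147.07, "Arg": 156.10, "Tyr": 163.06, "Trp": 186.08,
}


def candidates(delta, tolerance=0.3):
    """Residues whose Table I mass lies within `tolerance` Da of `delta`."""
    return sorted(
        name for name, mass in RESIDUE_MASS.items()
        if abs(mass - delta) <= tolerance
    )


def read_ladder(peak_masses, tolerance=0.3):
    """Assign candidate residues to successive mass differences.

    Peaks are sorted from heaviest to lightest, so the first entry of the
    result corresponds to the first residue removed from the N-terminus.
    """
    peaks = sorted(peak_masses, reverse=True)
    return [candidates(hi - lo, tolerance) for hi, lo in zip(peaks, peaks[1:])]


if __name__ == "__main__":
    # Hypothetical peak list, not data from the figures.
    peaks = [1500.00, 1442.98, 1329.90, 1201.84]
    for step, names in enumerate(read_ladder(peaks), start=1):
        print(step, "/".join(names) or "unassigned")
```

Returning every candidate within the tolerance, rather than a single best match, keeps isobaric or near-isobaric residues visible to the analyst instead of silently choosing one.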
The principal practical problems that remain are due to limitations in current instrumentation. Residues L and I are isomers and therefore indistinguishable using this procedure. Residues K, Q and E share very similar masses (see Table I), although lysine side chains are modified by the TFEITC reagent and increase in mass by 141 Daltons. With TOF instruments yielding mass resolution below approximately 300, acidic residues and their corresponding amides (E and Q, D and N) are only resolved by repeated mass analysis following chemical modification (e.g. esterification). Such limitations are entirely instrument-related, and not relevant to the demonstration of the degradation chemistry reported here. Future developments in instrumentation (particularly with respect to resolution) are required to overcome these limitations.

IV. Summary

The primary aim of this work was to explore sequencing strategies capable of rapid analysis of proteins, possibly recovered from 2D-electrophoresis gels. For this purpose, the chemistry needed to be adaptable to multiple samples and sensitive enough to work in the femtomole range. The described TFEITC chemistry is showing early signs of meeting these criteria. The demonstration, on a low-picomolar scale, that a phosphorylated tyrosine residue could be directly identified makes this a potentially powerful tool for the identification of this and other sites of post-translational modification. The inherent simplicity of the process should also allow for easy automation to permit rapid processing of samples in parallel.
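The instrument limitation noted in Section III (acid/amide pairs unresolved at a mass resolution below approximately 300) can be put in rough numbers. The sketch below assumes the common definition of resolving power as m/Δm and treats two peaks as separable when their spacing is at least Δm; this criterion and the function name are assumptions, not statements from the chapter.

```python
def required_resolution(mz, delta_m):
    """Resolving power m/Δm needed to separate peaks `delta_m` Da apart at `mz`."""
    if delta_m <= 0:
        raise ValueError("delta_m must be positive")
    return mz / delta_m


# Mass differences (Da) taken from Table I.
PAIRS = {
    "Glu vs Gln": 129.04 - 128.06,
    "Asp vs Asn": 115.03 - 114.04,
    "Lys vs Gln": 128.09 - 128.06,
}

if __name__ == "__main__":
    for mz in (1000, 2000):
        for pair, delta in PAIRS.items():
            needed = required_resolution(mz, delta)
            print(f"m/z {mz}: {pair} needs m/Δm of about {needed:.0f}")
```

On this estimate, a spacing of roughly 1 Da at m/z 1000 already calls for a resolving power near 1000, which is consistent with the statement that instruments below approximately 300 cannot separate the acid/amide pairs directly.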
Acknowledgments

This work was supported by the ICRF. Some aspects of the work were presented at the 42nd ASMS Conference on Mass Spectrometry and Allied Topics, May 29-June 3, 1994, Chicago, IL.

References

1) Edman, P. and Begg, G. (1967) Eur. J. Biochem. 1, 80-91
2) Hewick, R.M. et al. (1981) J. Biol. Chem. 256, 7990-7997
3) Totty, N.F. et al. (1992) Protein Sci. 1, 1215-1224
4) O'Farrell, P. (1975) J. Biol. Chem. 250, 4007-4021
5) Patton, W.F. et al. (1990) Biotechniques 8, 518-527
6) Karas, M. and Hillenkamp, F. (1988) Anal. Chem. 60, 2299-2301
7) Chait, B.T. et al. (1993) Science 262, 89-92
8) Tarr, G.E. (1986) in Methods of Protein Microcharacterization (J.E. Shively, Ed.) p. 155-194, Humana Press
9) Tarr, G.E. (1977) Methods Enzymol. 47, 335-357
10) Bartlet-Jones, M. et al. (1994) Rapid Commun. Mass Spectrom., in press
11) Mock, K.K. et al. (1992) Rapid Commun. Mass Spectrom. 6, 233-238
12) Vath, J.E. and Biemann, K. (1990) Int. J. Mass Spectrom. and Ion Processes 100, 287-299
13) Stults, J.T. et al. (1993) Anal. Chem. 65, 1703-1708
Investigation of Polyethylene Membranes as Potential Sample Support for Linking SDS-PAGE with MALDI-TOF MS for the Mass Measurement of Proteins

James A. Blackledge and Anthony J. Alexander
Bristol-Myers Squibb, Analytical Research & Development, Pharmaceutical Research Institute, P.O. Box 4755, Syracuse, NY 13221-4755

I. Introduction

Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) is a classic technique used for the separation and molecular weight (MW) determination of biomolecules. Unfortunately, the technique gives only a rough estimation of MW, with errors of ±5-10% being typical. Additionally, it can be subject to systematic errors if the species under investigation has different electrophoretic migration behavior than the MW markers. Matrix assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) routinely gives MW values with an accuracy of ±0.1% or better, and has become increasingly popular for the mass measurement of biopolymers [1]. The technique is simple, rugged, has a mass range in excess of 200,000 Da, and is extremely sensitive, requiring low nanomole to picomole amounts of material. Additionally, the technique is relatively insensitive to the presence of various salts and buffers that are often associated with the isolation of biomolecules.

As the separated protein bands in an SDS-PAGE gel are typically transferred (electroblotted) onto a supporting membrane prior to further analysis, most of the effort to combine the resolving power of SDS-PAGE with the mass accuracy of MALDI-MS has focused on desorbing proteins directly from the transfer membrane. Typical membranes currently employed are nitrocellulose [2], nylon [3], and polyvinylidene difluoride (PVDF) [4,5]. While these have been well developed for the purpose of electroblotting protein bands, their use as sample supports for MALDI-MS is still somewhat problematic. With a UV laser, useful MALDI mass spectra have so far only been obtained for low MW peptides and proteins ( c.a. 30,000 Da.) this becomes progressively less effective, as ionization of one species often results in near total suppression of the other, or in broad, poorly-resolved ion signals. Sample desorption/ionization from PE does not appear to be as susceptible to such effects.
This is demonstrated in Figure 4 by the spectrum of a chimeric monoclonal antibody (BR96) which has been internally mass calibrated by co-addition of BAD. Excellent ion intensity and peak shape is maintained for both the singly- and doubly-charged species of both proteins.
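Internal calibration of this kind can be sketched as a two-point fit. The sketch below assumes the commonly used linear time-of-flight relation t = t0 + k·sqrt(m/z); it is not the procedure of the instrument's data system. The flight times are simulated from invented constants, and the calibrant m/z values (133049 and 66525 for the singly and doubly charged BAD ions) and the analyte value (149771) are peak labels read from the figures.

```python
import math

def fit_two_point(cal_a, cal_b):
    """Solve t = t0 + k*sqrt(m/z) from two (m/z, flight_time) calibrant pairs."""
    (mz_a, t_a), (mz_b, t_b) = cal_a, cal_b
    k = (t_a - t_b) / (math.sqrt(mz_a) - math.sqrt(mz_b))
    t0 = t_a - k * math.sqrt(mz_a)
    return k, t0

def time_to_mz(t, k, t0):
    """Convert a flight time to m/z with the fitted constants."""
    return ((t - t0) / k) ** 2

# Hypothetical instrument constants, used only to simulate flight times (microseconds).
K_TRUE, T0_TRUE = 0.9, 0.35

def simulated_time(mz):
    return T0_TRUE + K_TRUE * math.sqrt(mz)

# Calibrate on the two BAD ions, then convert the analyte flight time to m/z.
k, t0 = fit_two_point(
    (133049.0, simulated_time(133049.0)),
    (66525.0, simulated_time(66525.0)),
)
analyte_mz = time_to_mz(simulated_time(149771.0), k, t0)
print(f"calibrated analyte m/z: {analyte_mz:.1f}")
```

Because both calibrant ions are desorbed from the same spot as the analyte, any shift in t0 or k between shots affects all three peaks alike, which is the practical advantage of an internal over an external standard.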
Figure 2. Utility of PE membrane over a large mass range. Spectra of bovine insulin (0.4 pmol/mm²), horse heart myoglobin (0.4 pmol/mm²), and bovine serum albumin dimer (1 pmol/mm²) are displayed from top to bottom. All spectra are the summation of 20 shots, and are unsmoothed.
Figure 3. Washing of protein bound to PE membrane. Top spectrum is of control BSA applied to PE membrane. The middle spectrum is BSA in 0.73% SDS immobilized on PE membrane. The bottom spectrum is BSA in 0.73% SDS immobilized on PE membrane, then vortexed in 50% methanol for 30 seconds prior to the addition of matrix. All spectra were acquired at a protein load of 1 pmol/mm².
Figure 4. Internal standardization of high-mass samples. The spectrum of the chimeric monoclonal antibody, BR96 (0.64 pmol/mm²), was mass calibrated using the singly and doubly charged species of BAD, which was present as an internal standard (0.5 pmol/mm²). Asterisks indicate BAD calibration ions. Labeled peaks (m/z): 149771, 74902, and 49942; BAD calibration ions: 133049 and 66525.
IV. Conclusion
Spectra acquired from PE membranes are of equal or better quality than those acquired from metal sample stages under standard sample preparation conditions. The PE membrane provides access to higher molecular weights than the more common transfer membrane materials (PVDF, nylon, and nitrocellulose). This permits the mass analysis of the large proteins for which MALDI-TOF MS is ideally suited. Mass accuracy and reproducibility approach those obtained with standard sample preparations. Furthermore, the use of PE reduces the severe ion suppression effects typically observed in the MALDI analysis of high mass mixtures. This also permits more accurate mass measurements to be made via the use of internal calibration. While it remains to be shown that proteins can be desorbed from PE membranes following the electrotransfer of bands from SDS-PAGE gels, results to date are very encouraging.
References
1. Hillenkamp, F., Karas, M., Beavis, R.C., and Chait, B.T.; Anal. Chem. 63, 1193A-1203A (1991).
2. Klarskov, K. and Roepstorff, P.; Biol. Mass Spectrom. 22, 433-440 (1993).
3. Zaluzec, E.J., Gage, D.A., Allison, J., and Watson, J.T.; J. Amer. Soc. Mass Spectrom. 5 (1994).
4. Vestling, M.M. and Fenselau, C.; Anal. Chem. 66, 471-477 (1994).
5. Strupat, K., Karas, M., Hillenkamp, F., Eckerskorn, C., and Lottspeich, F.; Anal. Chem. 66, 464-470 (1994).
6. Le Maire, M., Deschamps, S., Moller, J.V., Le Caer, J.P., and Rossier, J.; Anal. Biochem. 214, 50-57 (1993).
7. Mock, K.K., Sutton, C.W., and Cottrell, J.S.; Rapid Commun. Mass Spectrom. 6, 233-238 (1992).
Comparison of ESI-MS, LSIMS and MALDI-TOF-MS for the Primary Structure Analysis of a Monoclonal Antibody
Leticia Cano, Kristine M. Swiderek, and John E. Shively
Division of Immunology, Beckman Research Institute, City of Hope, Duarte, CA 91010
Abbreviations: HPLC, high performance liquid chromatography; LSIMS, liquid secondary ion mass spectrometry; MALDI-TOF, matrix assisted laser desorption ionization/time of flight; ESI, electrospray ionization; PVP, polyvinylpyrrolidone; TFA, trifluoroacetic acid.
TECHNIQUES IN PROTEIN CHEMISTRY VI Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
I. Introduction
Confirmation or verification of a known protein sequence is a common task in protein structural analysis. Examples include recombinant proteins whose sequence must be verified to demonstrate that the correct product was made, and proteins that have been isolated from natural sources and for which the cDNA sequence is known. Monoclonal antibodies belong to the latter class. Antibodies often are cloned and converted to a variety of engineered constructs for use in in vivo imaging and therapy. In these applications it is absolutely essential that the protein chemist confirm that the protein sequence and cDNA predicted sequence agree before investing in costly and time consuming antibody engineering projects. In our lab we have been interested in anti-carcinoembryonic antigen (CEA) antibodies which have excellent tumor targeting properties for imaging and therapy of solid tumors of the colon, lung, and breast.
Antibodies of the IgG class are composed of heavy and light chains of mass 50 kDa and 25 kDa, respectively. Each contains disulfide bonds which must be reduced and alkylated in order to obtain complete peptide maps and structural information. While it is relatively easy to isolate and map light chains, the heavy chains are often hydrophobic and difficult to analyze. For this reason, we have chosen to separate the chains on an SDS gel (reduction and alkylation is performed prior to loading the samples to the gel), electrotransfer to membrane (nitrocellulose in this case), and perform an on-blot digest with trypsin. Since this approach is commonly used to sequence a large variety of proteins, the application to a monoclonal antibody should be of general interest.
Mass analysis of peptide fragments from a protein of "known sequence" is the method of choice for speed and accuracy. However, at least three major options are available for mass analysis, each varying in its merits. The three most widely used techniques are LSIMS, MALDI-TOF-MS, and ESI-MS. In LSIMS and MALDI, individual peptide fractions are analyzed by mixing with a matrix followed by ionization and mass analysis, while in ESI the samples can be separated on-line by LC, eliminating the need for individual peak collection. In this report we compare each of the techniques for the analysis of the heavy and light chains of the anti-CEA antibody CEA.11 H5 (1).
II. Materials and Methods
The cDNAs for the heavy and light chain variable regions were cloned using consensus PCR primers. The variable sequences were appended to the consensus sequences for murine constant regions of the light chain (kappa isotype) and heavy chain (gamma-1 isotype). The predicted masses for the heavy and light chain tryptic fragments were obtained from the program MacProMass (2). The anti-CEA antibody CEA.11 H5 (1 nmole) was reduced (1 µl β-mercaptoethanol), S-alkylated (1 ml 4-vinylpyridine), electrophoresed on SDS-PAGE, blotted onto nitrocellulose, and stained with Ponceau S. The bands corresponding to the light and heavy chains (25 kDa and 50 kDa respectively) were excised, blocked with PVP-360 (0.25% in 10% acetic acid for 20 minutes), and digested with trypsin (2 µg, 37 °C overnight). The protocol is similar to that described by Henzel et al. (3).
A 10% aliquot of the digestion mixture was analyzed on a Vydac C18 250 µm ID fused silica capillary column connected on-line to ESI-MS (4). A linear gradient of 2% B to 92% B in 45 minutes using solvent A (0.1% TFA) and solvent B (90% acetonitrile, 0.07% TFA) was used with a flow rate of 2 µl/min. Sample elution was monitored by UV detection at 200 nm. Mass spectra were recorded in the positive ion mode using a TSQ-700 triple quadrupole instrument (Finnigan-MAT, San Jose, CA) with an electrospray ion source operating at atmospheric pressure. Scans were continuously acquired every three seconds between m/z 500 and 2000 in the centroid mode.
The remainder of the mixture was separated by RP-HPLC on a Vydac C18 530 µm ID fused silica capillary column. Capillary LC was carried out using a model 140B Applied Biosystems HPLC and a Rheodyne injection valve with a 100 µl injection loop. A linear gradient of 2% B to 70% B in 60 minutes using solvent A (0.1% TFA) and solvent B (90% acetonitrile, 0.07% TFA) was used with a flow rate of 20 µl/min. Fractions corresponding to all UV-absorbing peaks were hand collected.
Figure 1. Mass spectra of a tryptic peptide from the heavy chain of monoclonal antibody CEA11.H5. A. ESI spectrum showing the doubly charged (655.6) and singly charged (1310.6) ions. B. LSIMS spectrum for HPLC fraction #50 (m/z 1310.1). C. MALDI spectrum for the same fraction (m/z 1310, 2258, 2906, and 4118). An external standard was used for calibration (bovine insulin).
About 1 µl of each fraction was analyzed by LSIMS using a thioglycerol (3-mercapto-1,2-propanediol) matrix. Mass spectra were recorded in the positive ion mode using a TSQ-700 triple quadrupole instrument (Finnigan-MAT, San Jose, CA) equipped with an 8 keV cesium ion gun (Phrasor Scientific, Inc., Duarte, CA). Scans were continuously acquired every seven seconds between m/z 400 and 4000 in the centroid mode. Approximately 0.5 µl of each fraction was analyzed on a Kratos Kompact III TOF instrument. Samples were prepared using α-cyano-4-hydroxycinnamic acid as matrix. The sample wells were prespotted with matrix dissolved in acetone, dried, and respotted with a 1:1 solution of peptide and matrix dissolved in 30% acetonitrile/0.01% TFA. Microsequence analysis was performed on samples spotted onto PVDF membranes in a continuous flow reactor and sequenced on a City of Hope-built sequencer (5).
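The prediction of tryptic fragment masses described in this section can be sketched in a few lines. This is a minimal illustration, not the MacProMass program: it assumes the usual trypsin rule (cleavage after Lys or Arg, except before Pro) and standard average residue masses, and it lists Cys unmodified, so pyridylethylated Cys would need its own table entry. The test sequence is an illustrative stretch that ends in the heavy chain peptide H78-88 (NFLSLQMTSLR).

```python
# Average residue masses in Da (standard values; Cys is unmodified here).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153
HYDROGEN = 1.00794

def tryptic_digest(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide without K/R
    return peptides

def average_mh(peptide):
    """Average mass of the singly protonated peptide, [M+H]+."""
    return sum(AVERAGE_RESIDUE_MASS[r] for r in peptide) + WATER + HYDROGEN

for peptide in tryptic_digest("GRFTISRDNPKNFLSLQMTSLR"):
    print(f"{peptide:<12} {average_mh(peptide):8.2f}")
```

A real map would also need entries for missed cleavages and for modified residues, which is where much of the manual sorting of unmatched peaks comes from.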
III. Results and Discussion
The sample (1 nmole) was reduced with DTT, S-alkylated with 2-vinylpyridine and run directly on an SDS gel. After electrotransfer to nitrocellulose, the bands were stained, excised, and digested with trypsin according to Henzel et al. (3) with the modification of PVP360 used in place of PVP40 (Henzel, personal communication). An aliquot was subjected to ESI on LC/MS. The remainder was separated by capillary LC, the peaks collected, and analyzed by LSIMS and MALDI. For the heavy chain, over 40 peaks were identified by all three techniques. Figure 1 illustrates typical spectra obtained by each of these methods for one of the peptides from the heavy chain (H78-88, NFLSLQMTSLR). The ESI spectrum (Figure 1A) identifies the peptide at scans 415-417 as a doubly charged and a singly charged species. The mass corresponds to the expected average mass (1310.56). The LSIMS and MALDI spectra were taken from RP-HPLC fraction #50. The LSIMS spectrum (Figure 1B) shows a predominant peak at mass 1310.1 with traces of other peptides at masses 1855, 2260, and 2909. The MALDI spectrum shows the same peaks, in addition to a peak at 4118, but at different intensities. The intensity differences almost certainly reflect the sample suppression and enhancement effects inherent to the matrix and ionization differences in the two techniques. Since the peak intensities do not necessarily reflect their actual abundance in the sample, no comment can be made on this issue. Although the ESI spectrum is simple, it cannot be directly compared to the LSIMS and MALDI peaks, which were collected from a separate HPLC run.
Each of the tryptic fragments for the heavy and light chains was compared to the predicted masses (see Methods). A correct match was found for 199 of 214 residues (93%) for the light chain, and 349 of 446 residues (78%) for the heavy chain. Unidentified peaks were subjected to microsequence analysis. Several of the unidentified peaks corresponded to single amino acid substitutions compared to the consensus sequences for the murine constant regions. One of the unidentified peaks corresponded to a glycosylated peptide. The heavy chain (gamma-1 isotype) has a single glycosylation site at Asn-293. The expected mass for this peptide (1158.2, without carbohydrate) was not observed in any of the three analyses; however, an unidentified peak of mass 2603.1 was identified as the glycopeptide by microsequence analysis (EEQF-STFR). The blank at cycle 5 corresponds to Asn, predicted to be glycosylated by the recognition sequence Asn-Xxx-Ser/Thr. The ESI spectrum for this peptide (Figure 2A) gives a mass of 2603 (calculated from the doubly charged species at m/z 1302.9). The LSIMS spectrum, although weak, confirms the mass at 2603 (Figure 2B). The glycopeptide was also identified by MALDI (Figure 2C). The mass difference between the unglycosylated and glycosylated peptide, 1445, probably corresponds to GlcNAc(Fuc)GlcNAc(Man)3(GlcNAc)2, the identical glycopeptide we identified in the monoclonal antibody T84.65 (6).

Figure 2. Mass spectra of a glycopeptide from the heavy chain of monoclonal antibody CEA11.H5. A. ESI spectrum showing the glycopeptide doubly charged ion at 1302.9. The peak at 539.7 is an unrelated peak. B. LSIMS spectrum for the glycopeptide (m/z at 2603.1). C. MALDI spectrum for the glycopeptide (m/z at 2606). The external calibrant was bovine insulin (5734).

Several peaks were observed which did not correspond to peptides. These peaks were due to the mandatory blocking agent used to prevent trypsin from adsorbing to the membrane. In our initial trials we used reduced Triton X-100 as the blocking agent as described by Fernandez et al. (7). This agent has been recommended because it has little or no UV absorbance and does not interfere with microsequence analysis. However, this detergent gives an impressive series of ions interfering with ESI analysis as shown in Figure 3A. The peaks observed from scan 160-230 are all part of the reduced Triton series as evidenced by a closely spaced series with a mass difference of 44. These peaks efficiently suppressed peptides eluting in the same region of the gradient, and rendered this analysis practically useless. This problem was overcome by using PVP360 (Figure 3B).
This high molecular weight polymer elutes late in the gradient away from all but the most hydrophobic peptides. It should be noted that even PVP40 (3) can cause interference in ESI analysis. In spite of the problem encountered with reduced Triton X-100 in ESI, some peptides could be analyzed. In the scan region 260-263 on ESI, a peptide peak (1138, MH2²⁺) was identified amidst the detergent peaks (Figure 4A). The peptide peak (mass 2274) corresponded to H303-322 (SVSELPIMHQDWLNGKEFK) where the Met (shown in bold type) is oxidized to the sulfone. The detergent peaks 994 and 1038, and 1362 and 1406 can be identified by their inclusion in the expected series, including the mass difference of 44 for each pair. The H303-322 peptide was also observed by LSIMS with a much reduced intensity compared to the detergent peaks (Figure 4B). However, the peak on MALDI had an excellent intensity with little or no interference from the detergent (Figure 4C). This example also points out the possibility of observing oxidized Met residues. In this example, the unoxidized version of the peptide was not observed.
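Two of the mass-difference arguments in this section can be checked with simple bookkeeping. The sketch below is illustrative only. It assumes standard average residue masses for the monosaccharides and reads the proposed structure as four HexNAc, three Hex, and one deoxyhexose (fucose). The 1 Da tolerance for the 44 Da spacing is an arbitrary choice, and the function names are hypothetical.

```python
# Average monosaccharide residue masses in Da (standard values).
SUGAR_RESIDUE_MASS = {"HexNAc": 203.1925, "Hex": 162.1406, "dHex": 146.1412}

def glycan_mass(composition):
    """Mass added to a peptide by an N-linked glycan of the given composition."""
    return sum(SUGAR_RESIDUE_MASS[sugar] * n for sugar, n in composition.items())

def spaced_pairs(peaks, spacing=44.0, tol=1.0):
    """Return peak pairs whose separation matches one ethylene oxide unit."""
    peaks = sorted(peaks)
    return [
        (a, b)
        for i, a in enumerate(peaks)
        for b in peaks[i + 1:]
        if abs((b - a) - spacing) <= tol
    ]

# Glycopeptide: observed mass minus the mass expected without carbohydrate.
observed_shift = 2603.1 - 1158.2
core_fucosylated = {"HexNAc": 4, "Hex": 3, "dHex": 1}
print(f"observed shift {observed_shift:.1f}, glycan {glycan_mass(core_fucosylated):.1f}")

# Detergent: the peaks named in the text, plus the peptide ion at 1138.
print(spaced_pairs([994, 1038, 1138, 1362, 1406]))
```

The peptide ion at 1138 is left unpaired, which is the property used in the text to pick it out from the detergent series.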
Figure 3. Effect of detergent on tryptic map of the heavy chain of monoclonal antibody CEA11.H5. A. ESI base peak chromatogram for sample containing 1% reduced Triton X-100. B. ESI base peak chromatogram for equivalent sample containing 0.25% PVP360.

Figure 4. Mass spectra of a tryptic peptide from the detergent-containing digest. A. ESI spectrum showing the doubly charged peak at 1138.0. B. LSIMS spectrum for the same peptide (m/z 2273.8). C. MALDI spectrum for the same peptide (m/z 2274). The internal calibrant was bovine insulin (5734).

Figure 5. Summary of mass analysis for tryptic peptides from the heavy and light chains of monoclonal antibody CEA11.H5. A. Heavy chain. B. Light chain.

IV. Summary
After the analysis of 87 peptides by three methods, it was possible to account for 78% of the sequence of the heavy chain, and 93% of the light chain (Figure 5). The regions missed correspond to either very large or very small tryptic fragments. This problem could be overcome by performing a second map using an enzyme with a different specificity, but this would have increased the amount of work. Of the three methods, ESI and MALDI have no problems with the analysis of very large peptides (mass > 3000). This is an obvious limitation of LSIMS. Spectra containing multiply charged ions in ESI become problematic when the singly charged states are missing and multiple species are present. This problem was illustrated in Figure 4A. MALDI was the simplest of the techniques, but required careful attention to calibration, usually requiring internal calibrants to obtain accurate masses. While this may seem a small annoyance, it often caused problems when the internal calibrants suppressed the peptide peaks.
On-line LC/MS (ESI) would have emerged as the clear favorite if it could have identified all of the peaks in a single run. Clearly this was not the case. Many peaks that should have been observed were not, and many unidentified peaks arose in their place. Some corresponded to multiply charged species of known peaks, and some corresponded to peptide variants either at the level of an amino acid substitution, glycosylation (Figure 2), or methionine oxidation (Figure 4). In general, unidentified peaks had to be sequenced in their entirety to verify the nature of the mass variance. In all cases, this led to a correct assignment. Another problem was unexpected proteolytic cleavages. This occurred for 22 peptides and was due to chymotryptic cleavages. In spite of the use of Promega sequencing grade modified trypsin, this remains a rare, but real issue in mass assignments by this method.
The effect of detergent on LSIMS and ESI is severe. MALDI is clearly the method of choice for samples that include detergents, but otherwise detergents should be avoided. A final comment can be made on mass analysis of peptides blotted and digested from SDS gels: it is a powerful, general approach, but requires much time and effort in sorting out the data no matter which mass spectrometric approach is used.
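The ambiguity of an isolated multiply charged ion, noted above, can be made concrete. The sketch below is a simplified illustration. It tries each charge state up to an assumed maximum of three, converts the observed m/z to the corresponding [M+H]+ value, and compares it with a list of predicted masses within an assumed tolerance of 1.5 Da. The entries for H78-88 and H303-322 use masses quoted in the text; the third entry is hypothetical and is included only to show how a second assignment can arise.

```python
PROTON = 1.00728  # mass of a proton in Da

def neutral_mass(mz, z):
    """Neutral mass M of an ion observed at m/z with z attached protons."""
    return z * (mz - PROTON)

def candidate_assignments(mz, predicted_mh, max_charge=3, tol=1.5):
    """List (charge, [M+H]+, name) for every predicted peptide that fits mz."""
    hits = []
    for z in range(1, max_charge + 1):
        mh = neutral_mass(mz, z) + PROTON
        for name, expected in predicted_mh.items():
            if abs(mh - expected) <= tol:
                hits.append((z, round(mh, 1), name))
    return hits

predicted = {
    "H78-88": 1310.56,
    "H303-322 (oxidized Met)": 2274.0,
    "hypothetical peptide": 1138.4,  # invented entry, for illustration only
}

print(candidate_assignments(655.6, predicted))   # one assignment, as a 2+ ion
print(candidate_assignments(1138.0, predicted))  # fits both a 1+ and a 2+ reading
```

When both the singly and doubly charged ions of a peptide are present, as in Figure 1A, the pair fixes the charge state; a lone ion does not.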
Acknowledgments This work was supported by the City of Hope Cancer Center Grant from NCI, CA 33572, and by NCI grant CA 43904.
References
1. Wagener, C., Yang, Y.H.J., Crawford, F.G., and Shively, J.E. (1983) J. Immunol. 130, 2308-2315.
2. Lee, T.D., and Vemuri, S. (1989) Proc. 37th ASMS Conf. Mass Spec. 352-353.
3. Henzel, W.J., Billeci, T.M., Stults, J.T., Wong, S.C., Grimley, C., and Watanabe, C. (1993) Proc. Natl. Acad. Sci. USA 90, 5011-5015.
4. Davis, M.T., and Lee, T.D. (1992) Protein Science 1, 935-944.
5. Calaycay, J., Rusnak, M., and Shively, J.E. (1991) Anal. Biochem. 192, 23-31.
6. Shively, J.E., Paxton, R.J., and Lee, T.D. (1989) TIBS 14, 246-252.
7. Fernandez, J., DeMott, M., Atherton, D., and Mische, S. (1992) Anal. Biochem. 201, 255-264.
MS Based Scanning Methodologies Applied to Conus Venom
A. Grey Craig, Wolfgang H. Fischer, Jean E. Rivier, J. Michael McIntosh,† and William R. Gray†
The Clayton Foundation Laboratories for Peptide Biology, The Salk Institute, San Diego, CA 92138-9216 and †Departments of Psychiatry and Biology, University of Utah, Salt Lake City, UT 84112
I. Introduction
Known biologically active agents in the venom produced by marine cone snails (Conus) are small, highly constrained and specialized peptides. These venoms are a rich source of unique neuroactive molecules (1). Although the venoms from different species of cone snails may contain homologous peptides (e.g. both C. geographus and C. striatus make peptides targeted to acetylcholine receptors and voltage sensitive calcium channels), they may also contain a number of distinct specialized peptides (e.g. the activity of conantokin-G of C. geographus, which targets NMDA receptors, has not been observed in C. striatus venom) (1). We describe a number of strategies (including derivatizations) that allow the identification of as yet uncharacterized toxins in these venoms. The extent of the challenge lies in the fact that most Conus venoms are complex mixtures containing over 100 peptides. With the advent of very sensitive ionization techniques such as matrix assisted laser desorption (MALDI) coupled with time-of-flight (TOF) mass analysis, measurement of the intact mass of peptides at sub-pmol levels has become a reality (2), of which we have taken advantage for the systematic screening of HPLC fractions. Partial sequence information can be obtained by carrying out enzymatic hydrolysis with exoproteinases (e.g. carboxypeptidases and aminopeptidases) (3, 4). More recently, MALDI has been used to measure metastable decomposition occurring in the first field free region of a reflectron TOF instrument (referred to as post source decay (PSD)) with only marginally more sample (5-7). While significantly less material is required for MALDI than for either UV detection, chemical sequencing or amino acid analysis, the nature of the information derived from MALDI spectra is also different. Clearly, obtaining unambiguous composition or sequence information is not a simple task.
This is due to the fact that (i) the mass accuracy of MALDI-TOF measurements is generally lower than that of liquid secondary ionization (LSI) with a magnetic sector mass spectrometer, (ii) enzymatic sequencing is affected by the varying rates of cleavages at different amino acid residues and the reduced activity of most enzymes towards particular residues (e.g., tyrosine and proline) and (iii) fragmentation information from PSD suffers from the ambiguities of assigning fragment ions as being derived from the N-terminus, C-terminus or "internal sequence".
We have applied scanning methodologies using MALDI-TOF mass spectrometry to partially purified venom from C. striatus and C. ermineus. We have carried out specific derivatizations in order to deduce composition and sequence information. Together with an intact mass these measurements are used to determine whether an ionized species observed in the MALDI mass spectrum corresponds with the intact protonated molecule of a previously characterized conotoxin. The information obtained from derivatizations is also important when the ionized species does not correspond with the intact mass of peptides of known sequence. In that case, post source decay of the native and derivatized species may help assign the fragment ions.
II. Materials and Methods
MALDI mass spectra were measured with a Bruker Reflex time-of-flight mass spectrometer fitted with a gridless reflectron energy analyzer and a nitrogen laser. Accelerating and reflectron voltages of +31 kV and +30 kV were employed unless otherwise specified. Typically, the amount of sample necessary for MALDI analysis was 100 fold less than that used for LSI analysis. All MALDI samples were prepared in six or more different sample preparation formats including three different UV absorbing matrices (α-cyano-4-hydroxycinnamic acid, sinapinic acid and 2,5-dihydroxybenzoic acid) and two methods of preparation. In the first method peptide solution was pre-mixed with a solution of each matrix prior to application onto the probe tip (see refs. (8-10) for preparation of matrix solutions). In the second method a solution of the matrix in acetone was dried on the probe tip and then a separate sample of peptide was applied and left to dry onto the matrix (11). As noted previously, the second preparation was found to give more reliable analyses of samples which required a rinse with H2O (12, 13). No ions were observed with any other sample preparation which were not observed using a combination of the second procedure, α-cyano-4-hydroxycinnamic acid as the matrix, and rinsing of the samples. All tabulated data are for samples prepared in this manner.
LSIMS spectra were measured with a JEOL JMS-HX110 mass spectrometer fitted with a Cs+ ion gun. An accelerating voltage of +10 kV and Cs+ ion gun voltage of +30 kV were employed. An electric field scan over a narrow mass range was used to measure segments of the mass spectrum corresponding with appropriate regions of the MALDI mass spectrum. The samples, prior to the dilution used for MALDI analysis (1 µl; 100 pmol; 0.1% TFA solution), were added directly to a 1:1 mixture of m-nitrobenzyl alcohol and glycerol.
The mass accuracy of LSIMS for measurement of the unresolved isotopic cluster is typically within 100 p.p.m. of the calculated average [M+H]+ mass. The accuracy of the observed masses listed in Tables I and II for the MALDI mass spectra benefited from the use of a reflectron instrument which generally reduced the deviation observed between spectra measured under different experimental conditions (e.g. different matrix or laser power) from ±1000 p.p.m. to ±300 p.p.m. Reflecting this level of mass accuracy we present the MALDI measurements with only 4 significant figures, compared with 5 significant figures for the LSIMS measurements. For calculation of possible amino acid substitutions, the 20 most common amino acids were used together with γ-carboxyglutamate (Gla) and hydroxyproline (Hyp) which are commonly found in conotoxins (14).
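The choice of significant figures follows directly from converting the quoted relative accuracies into absolute mass windows. The short sketch below does this for an illustrative mass of 2500 Da, a value chosen here only because it is typical of the peptides studied.

```python
def ppm_window(mass, ppm):
    """Half-width in Da of a +/- ppm tolerance at the given mass."""
    return mass * ppm / 1e6

EXAMPLE_MASS = 2500.0  # illustrative [M+H]+ value, not a measured one

for label, ppm in (
    ("LSIMS, 100 p.p.m.", 100),
    ("MALDI with reflectron, 300 p.p.m.", 300),
    ("MALDI before improvement, 1000 p.p.m.", 1000),
):
    print(f"{label:<40} +/- {ppm_window(EXAMPLE_MASS, ppm):.2f} Da")
```

At 2500 Da a window of ±0.75 Da leaves the first decimal place of a MALDI mass uncertain, whereas the ±0.25 Da LSIMS window supports reporting it.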
Peptide Modification: Iodination was carried out on a stainless steel probe target by adding 0.1% aq. I2 (1 µl) to the dried peptide (ca. 1 pmol). The reaction was stopped after 1 minute by addition of ascorbic acid and the MALDI matrix, α-cyano cinnamic acid, in excess. Esterification with ethanol was carried out using the method of Hunt et al. (15), where an acetyl chloride and ethanol solution (1:6, v:v) was added (5 µl) to the peptide dried in a microcentrifuge tube (ca. 1 pmol). After incubation for 15 minutes at room temperature a 2 mM β-mercaptoethanol (in ethanol) solution was added (1 µl) and the mixture was dried. The matrix, α-cyano-4-hydroxycinnamic acid (2 µl), was added to the micro tube and after 5 minutes 1 µl of this matrix was removed and applied to a target.
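The composition information gained from these derivatizations comes from counting mass increments. The sketch below assumes the usual average mass shifts: replacement of one hydrogen by iodine (about +125.9 Da per iodine, typically on tyrosine or histidine) and conversion of one carboxyl group to its ethyl ester (+28.05 Da). The peptide masses and the 0.5 Da tolerance are hypothetical, and the function name is illustrative.

```python
IODINE_FOR_H = 126.9045 - 1.00794          # +125.90 Da per iodine introduced
ETHYL_ESTER = 2 * 12.0107 + 4 * 1.00794    # +28.05 Da per esterified carboxyl (C2H4)

def count_groups(mass_before, mass_after, shift_per_group, tol=0.5):
    """Number of groups modified, or None if the shift is not a clean multiple."""
    delta = mass_after - mass_before
    n = round(delta / shift_per_group)
    residual = delta - n * shift_per_group
    return n if n >= 0 and abs(residual) <= tol else None

# Hypothetical [M+H]+ values before and after derivatization.
native = 1354.6
print(count_groups(native, 1410.7, ETHYL_ESTER))   # two carboxyl groups esterified
print(count_groups(native, 1480.5, IODINE_FOR_H))  # one iodine incorporated
print(count_groups(native, 1370.6, ETHYL_ESTER))   # +16 Da is not an ester shift
```

With MALDI accuracies of a few hundred p.p.m. the two shifts are easily distinguished from each other and from common adducts, which is why a single before-and-after pair of spectra is usually sufficient.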
III. Results and Discussion
Figure 1 shows the HPLC profile of semi-purified venom from C. striatus (fractions labeled 3, 5-18, 20 and 22). A summary of the observed masses in the MALDI and LSI mass spectra for each of these fractions is given in Table I. A data base of known conotoxins was searched for correspondence (±3 Da) with the observed masses: "matches" are scored irrespective of whether the peptide in question was originally isolated from the particular venom. Figure 2 shows the HPLC profile of semi-purified venom from C. ermineus (fractions labeled 4a, 4b, 5-11 and 14). A summary of the masses of the major species observed in the LSIMS and MALDI mass spectra for each of these fractions is given in Table II. Our finding that the preparation of α-cyano-4-hydroxycinnamic acid in acetone and subsequent rinsing (see Materials & Methods) produced all ion species which were observed with a variety of other MALDI procedures is important for further scanning of the Conus venoms for novel conotoxins. Generally, we observe at least one major species in the MALDI mass spectra corresponding to each HPLC component. The increased mass accuracy available when the instrument was operated in the reflectron mode was important for the analysis carried out. For example, fractions 4b and 7 or fractions 5 and 8 from C. ermineus appeared to be the same species when measured with the instrument operated in the linear mode. Only in the reflectron mode were we able to reliably distinguish the masses of each species. The high sensitivity of MALDI-TOF is particularly important for the analysis of native peptides such as conotoxins where often the venom of many milkings must be collected to obtain sufficient material for sequence analysis. The increased sensitivity of MALDI over LSIMS is illustrated in the analysis of fraction 5 from C. striatus venom (see Table I).
Despite the two orders of magnitude difference in the amount of material consumed in the LSI experiment we did not discern any intact species in fraction 5, whereas the MALDI measurement yielded useful information. However, the comparisons in Tables I and II reveal that some components may be detected by LSIMS but not observed in the MALDI mass spectrum (measured with any of the matrices or sample preparation methods). The contrary is most likely more prevalent, i.e. that a large number of the species detected by MALDI with one or more of the matrices are difficult species to ionize with LSIMS.
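The data base search used to score "matches" amounts to a window comparison against calculated [M+H]+ values. The sketch below is a minimal version under stated assumptions: the three entries are calculated masses as read from Tables I and II, the dictionary stands in for the full data base of known conotoxins, and the function name is hypothetical.

```python
# Calculated average [M+H]+ masses for a few known conotoxins (illustrative subset).
KNOWN_CONOTOXINS = {"SI": 1354.6, "SVIA": 2494.9, "GVIB": 3095.4}

def match_known(observed_mz, database, window=3.0):
    """Names of data base entries within +/- window Da of the observed mass."""
    return sorted(
        name for name, calc in database.items()
        if abs(observed_mz - calc) <= window
    )

for observed in (1353, 3098, 2521):
    print(observed, match_known(observed, KNOWN_CONOTOXINS))
```

A hit of this kind only flags a candidate. With a ±3 Da window unrelated peptides can share a mass, so the identity still has to be confirmed by sequencing or derivatization.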
Table I. Observed masses in the LSIMS and MALDI mass spectra of fractions of C. striatus
Columns: Fr.; match; Calc. mass (a); MALDI obs. mass (m/z); LSIMS obs. mass (m/z)
2740.2 SVIB 2739.5 2739† NO 5 NO 2579 2544 2494.9 6 SVIA 2494.9 2494† 2521.4 7 2521 1241.5 1791.2 NO 8 SII 1792.0 1240 1794† 1813 1814 2786 1354.4 1791.3 NO 10 SI NO 1354.6 1354† 2782.7 NO 11 sm 2498 1456.7 1457† 2497.5 12 2500 4099 9218 1396.4 NO 4098.6 NA 1397 13 2498 3886 3940 NA NA NA NA 1367 14 4898 4952 4968 NO NO 4947.0 4965.9 4882 15 4084 4100 4792 NA 4082.0 4098.5 NA 3175 16 2498 3924 4758 2176.6 2497.4 NO NA 17 4743 5025 3938 NO 18 3782 3713 3778.1 NO 20 3418 NO NO 3348 3400.0 3416.0 3432.4 22 NA
(a) calculated average [M+H]+ mass. NO indicates corresponding ion in MALDI or LSIMS spectrum not observed. NA indicates not analyzed. † indicates the obs. species which matched.
It is apparent from Table I that masses of several previously known peptides from C. striatus correspond to those found for major UV absorbance peaks. Similarly, Table II shows that the C. ermineus venom contained a peptide in fraction 7 that matched the mass of conotoxin GVIB from C. geographus — in this case, the peptide has since been analyzed and found to be completely unrelated to GVIB, whereas the putative match to SI in fraction 9 has been confirmed with chemical sequencing and mass spectrometry.
Figure 1. UV trace of HPLC of C. striatus venom (horizontal axis: time, min).
MS Scanning Methodologies
Table II. Observed masses in the LSIMS and MALDI mass spectra of C. ermineus fractions
Column headings: Fr.; match; Calc. mass (a); MALDI obs. mass (m/z); LSIMS obs. mass (m/z)
3451.1 NO NO 2513 3105 NO 1 12496.1 [2497'" ^ 3101 3100.7 4b 3085 5 3099.6 NA 2111 NO 2094 2495.4 NA 3085 NA 6 7 GVIB 3095.4 3098† 3096.2 3082 3082.3 8 1792 1944 3068 1354.0 1790.8 1943.6 3065.6 SI 9 1354.6 1353† 2094.2 NA 2094 3047 3558 NO 1803 NA 10 1397.2 2765.6 2781.4 2766 2781 1398 11 12369.9 NA 3512 3528 [2370 NA 1 14
(a) Calculated average [M+H]+ mass. NO indicates corresponding ion in MALDI or LSIMS spectrum not observed. NA indicates not analyzed. † indicates the observed species which matched.
From these results it is clear that useful information can be obtained from MALDI, but that it cannot be used directly to establish the identity of peptides; our ultimate aim is to obtain sequence information from these fractions. Towards that goal we are currently developing protocols that allow reduction of cysteine residues, alkylation and MALDI measurement without the need for further purification (19). In the simplest version, measurement of the peptide before and after modification reveals the number of disulfide bridges present in the peptide. With the linear alkylated peptide, we can more easily interpret the metastable decomposition occurring in the first field-free region of a time-of-flight instrument to measure the fragment ions. This protocol is shown in Figure 3 for reduced and S-carboxamidomethylated (Cam) conotoxin GIA (H-Glu-Cam-Cam-Asn-Pro-Ala-Cam-Gly-Arg-His-Tyr-Ser-Cam-Gly-Lys-NH2) (16, 17). The spectrum shown in Figure 3 was obtained from a derivatization of 10 pmol of peptide, of which 1 pmol was applied to the target. The PSD spectrum is a composite of scans measured at reflectron voltages between 1.25 and 29.9 kV; the total ion current, and therefore the baseline noise, varies between individual scans. The 'b' and 'y+2' type fragment ions (18) are the most abundant series observed for this peptide and are therefore identified in Figure 3. However, significantly more sequence information is present in this spectrum (19).
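To make the 'b' and 'y+2' series concrete, the following is a minimal sketch that computes average-mass fragment ladders for the alkylated GIA sequence quoted above. It is not the authors' software. The residue masses are standard average values, the names RESIDUE_MASS, GIA_CAM, b_ions and y_ions are illustrative, and reading 'y+2' as the y ions of current nomenclature is an assumption.

```python
# Average residue masses (Da). "Cam" is S-carboxamidomethyl-cysteine,
# i.e. Cys (103.1388) plus C2H3NO (57.0513).
RESIDUE_MASS = {
    "Glu": 129.1155, "Cam": 160.1901, "Asn": 114.1038, "Pro": 97.1167,
    "Ala": 71.0788, "Gly": 57.0519, "Arg": 156.1875, "His": 137.1411,
    "Tyr": 163.1760, "Ser": 87.0782, "Lys": 128.1741,
}
PROTON = 1.0073
NH3 = 17.0305  # C-terminal amide: the C-terminal fragments carry NH3, not H2O

# Sequence as written in the text (reduced, carboxamidomethylated GIA).
GIA_CAM = ["Glu", "Cam", "Cam", "Asn", "Pro", "Ala", "Cam", "Gly",
           "Arg", "His", "Tyr", "Ser", "Cam", "Gly", "Lys"]


def b_ions(residues):
    """Singly charged N-terminal fragments: running residue sum plus a proton."""
    total, ladder = 0.0, []
    for name in residues:
        total += RESIDUE_MASS[name]
        ladder.append(total + PROTON)
    return ladder


def y_ions(residues):
    """Singly charged C-terminal fragments of an amidated peptide."""
    total, ladder = 0.0, []
    for name in reversed(residues):
        total += RESIDUE_MASS[name]
        ladder.append(total + NH3 + PROTON)
    return ladder


if __name__ == "__main__":
    for i, (b, y) in enumerate(zip(b_ions(GIA_CAM), y_ions(GIA_CAM)), start=1):
        print(f"b{i:<2d} {b:9.2f}    y{i:<2d} {y:9.2f}")
```

Differences between neighbouring entries of either ladder equal one residue mass, which is how a sequence is read from a PSD spectrum; the last entry of the y ladder is the [M+H]+ of the intact amidated peptide.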
Figure 2. UV trace of HPLC of C. ermineus venom (horizontal axis: time, min).
At this point in time it is impractical to sequence every peptide, given the complexity of the venoms. A strategy that directs the sequencing effort to selected peptides can be based on the rapidly growing number of conotoxin sequences which have previously been determined. In 1991, there were over 70 conotoxin peptides characterized from over 10 species (1); this number is now above 200, and growing rapidly with the acquisition of sequences from DNA cloning (20). As described above, we used a database of conotoxin sequences to assign fractions 3, 6, 8 and 11 of C. striatus venom as possibly containing peptides corresponding to SVIB, SVIA, SI and SIB, respectively. More accurate mass measurement with LSIMS confirmed that the intact mass was consistent with this assignment in 3 of these cases (no signal was observed with LSIMS in one case). This type of database scanning is also being employed in reverse, to search among venom fractions for candidate peptides to match predicted translation products corresponding to cloned cDNAs.
The Conus venoms often contain several minor sequence variants of toxins, arising from genetic polymorphisms, multi-gene families, and variation in post-translational processing. When the mass difference between two closely related peptides (in terms of HPLC retention times and mass) corresponds to a single amino acid substitution, a simple experiment may suffice to choose among alternatives. Consider for example fractions 5 and 7 from C. striatus venom, satellites of the major fraction tentatively identified as conotoxin SVIA (Fig. 1 and Table I). The mass difference of 50±1 Da between peptides in fractions 5 and 6 could be explained by any of the following changes: (i) Tyr to either Hyp, Leu or Ile (50 Da); (ii) His to Ser (50 Da); (iii) Phe to Pro (50 Da); or (iv) Trp to His (49 Da). Although options (ii) to (iv) are formally possible, SVIA does not contain His, Phe, or Trp, so option (i) would be favored.
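The reasoning above, from a mass difference to candidate single-residue substitutions, can be sketched as a small search over a residue mass table. This is an illustration under stated assumptions, not the procedure the authors used: the masses are standard average residue masses, and the function name substitution_candidates and its absent parameter are hypothetical. With a tolerance of 1 Da the search can also return candidates that the text does not list.

```python
from itertools import permutations

# Standard average residue masses (Da); Hyp is hydroxyproline (Pro + O).
RESIDUE_MASS = {
    "Gly": 57.0519, "Ala": 71.0788, "Ser": 87.0782, "Pro": 97.1167,
    "Val": 99.1326, "Thr": 101.1051, "Cys": 103.1388, "Leu": 113.1594,
    "Ile": 113.1594, "Hyp": 113.1161, "Asn": 114.1038, "Asp": 115.0886,
    "Gln": 128.1307, "Lys": 128.1741, "Glu": 129.1155, "Met": 131.1926,
    "His": 137.1411, "Phe": 147.1766, "Arg": 156.1875, "Tyr": 163.1760,
    "Trp": 186.2132,
}


def substitution_candidates(delta, tol=1.0, absent=()):
    """List (old, new, change) for single substitutions old -> new whose mass
    change (new minus old, in Da) lies within tol of delta.

    Residues named in `absent` are known not to occur in the parent peptide
    and are therefore skipped as the residue being replaced.
    """
    hits = []
    for old, new in permutations(RESIDUE_MASS, 2):
        if old in absent:
            continue
        change = RESIDUE_MASS[new] - RESIDUE_MASS[old]
        if abs(change - delta) <= tol:
            hits.append((old, new, round(change, 2)))
    return hits


if __name__ == "__main__":
    # A peptide lighter by 50 +/- 1 Da than a parent lacking His, Phe and Trp.
    for old, new, change in substitution_candidates(
            -50.0, absent=("His", "Phe", "Trp")):
        print(f"{old} -> {new}: {change:+.2f} Da")
```

Restricting the replaced residue to those actually present in the parent peptide is what narrows the options to a tyrosine substitution in the example discussed in the text.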
In order to test this directly, we iodinated small samples of approx. 2 pmol each of fractions 5 and 6. After treatment with I2, fraction 6 was shifted towards higher mass by 126 Da. This shift was consistent with the presence of a tyrosine residue in this peptide (we have determined that iodination under these conditions modifies tyrosine but not histidine residues (21)).
Figure 3. The PSD spectrum (200-1600 Da) of alkylated conotoxin GIA (horizontal axis: m/z).
In contrast, fraction 5 was not modified by this protocol, indicating that the mass difference of 50 Da could be attributed to the tyrosine residue of the peptide in fraction 6 being replaced by either hydroxyproline, leucine or isoleucine in the peptide in fraction 5. Similarly, assuming that SVIA is the major component in fraction 6, the additional 27±1 Da of the peptide in fraction 7 could be attributed to a change of Ser to Leu, Ile or Hyp, or of Lys to Arg. Esterification of fractions 6 and 7 verified that neither component contains acidic groups, which was consistent with C-terminal amidation of SVIA. These derivatizations are highly selective, and may thus allow PSD measurements to be carried out on peptides after modification. Such a protocol would significantly enhance our ability to derive sequence information from PSD spectra, because the mass shifts observed in fragments help locate the particular residue within the peptide, and also confirm assignments of fragments as arising from N- or C-terminal regions. In addition to derivatizations that may modify the C- and N-termini and the derivatization of tyrosine residues, we have carried out oxidation of methionine residues with sufficient specificity to enable measurement of PSD spectra.
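The derivatization experiments described in this section all rest on the same arithmetic: the observed mass shift divided by the shift per modified group gives the number of groups. The sketch below illustrates that step. The 126 Da iodination shift is the value reported in the text; the remaining per-group values are standard average mass changes, and treating the esterification as a methyl ester is an assumption, since the alcohol is not stated. The names SHIFT_PER_GROUP and groups_modified, and the example masses, are illustrative.

```python
# Average mass change per modified group (Da).
SHIFT_PER_GROUP = {
    "iodination": 125.90,            # I (126.904) replaces H (1.008) on Tyr
    "methyl_ester": 14.03,           # assumed methyl ester: +CH2 per carboxyl
    "met_oxidation": 16.00,          # +O per methionine
    "reduce_cam_disulfide": 116.12,  # 2 x (H + C2H3NO) per disulfide bridge
}


def groups_modified(mass_before, mass_after, reaction, tol=1.0):
    """Return the number of modified groups implied by the mass shift, or None
    when the shift is not a whole multiple of the per-group value within tol."""
    unit = SHIFT_PER_GROUP[reaction]
    shift = mass_after - mass_before
    count = round(shift / unit)
    if count < 0 or abs(shift - count * unit) > tol:
        return None
    return count


if __name__ == "__main__":
    # Hypothetical masses: a 2494.9 Da peptide observed at 2620.8 after iodination.
    print(groups_modified(2494.9, 2620.8, "iodination"))
    # No shift on esterification means no free carboxyl groups were found.
    print(groups_modified(2494.9, 2494.9, "methyl_ester"))
```

A result of None flags a shift that the chosen reaction cannot explain, which is a cue to reconsider either the assumed reaction or the peak assignment.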
IV. Conclusion
We have gained an appreciation for the ionization bias observed between MALDI and LSIMS. The utilization of PSD to identify known peptides and provide sequence information has been investigated for conotoxins. This approach to obtaining sequence information on novel peptides is attractive because of the low amount of material required. A number of mass spectrometry-based derivatizations have been used to scan fractions of venoms in order to characterize peptides of interest. For closely related components (based on HPLC retention time and mass), the small-scale derivatization schemes can be used to test hypotheses about peptides with otherwise novel masses (i.e. which may be homologs). The mass accuracy of the TOF technique, with a gridless reflector, was important for identifying and calling these substitutions.
Acknowledgments
We would like to thank Drs. B. Olivera and L. Cruz, University of Utah, for stimulating discussions. This work was supported by the National Institutes of Health (K20MH00929, 1S10RR-8425, HD-13527, DK-26741, CA-54418, HL41910, GM-48677) and supported in part by the Foundation for Medical Research, Inc. (AGC and WHF).
References
1. B. M. Olivera, J. Rivier, J. K. Scott, D. R. Hillyard and L. J. Cruz (1991) Journal of Biological Chemistry 266, 22067.
2. M. Karas, A. Ingendoh, U. Bahr and F. Hillenkamp (1989) Biomed. Mass Spectrom. 18, 841.
3. M. Schär, K. O. Börnsen and E. Gassmann (1991) Rapid Commun. Mass Spectrom. 5, 319.
4. A. S. Woods, W. Gibson and R. J. Cotter (1994). In "Time of Flight Mass Spectrometry" (R. J. Cotter, ed.) ACS, Washington D.C.
5. B. Spengler, D. Kirsch, R. Kaufmann and E. Jaeger (1992) Rapid Commun. Mass Spectrom. 6, 105.
6. M. C. Huberty, J. E. Vath, W. Yu and S. A. Martin (1993) Anal. Chem. 65, 2791.
7. W. Yu, J. E. Vath, M. C. Huberty and S. A. Martin (1993) Anal. Chem. 65, 3015.
8. R. C. Beavis and B. T. Chait (1992) Org. Mass Spectrom. 27, 156.
9. R. C. Beavis and B. T. Chait (1989) Rapid Commun. Mass Spectrom. 3, 432.
10. K. Strupat, M. Karas and F. Hillenkamp (1991) Int. J. Mass Spectrom. Ion Proc. 111, 89.
11. O. Vorm, P. Roepstorff and M. Mann (1994). 42nd ASMS Conference on Mass Spectrometry and Allied Topics, Chicago, IL, May 29-June 3, 1994.
12. O. Vorm and M. Mann (1994) J. Am. Soc. Mass Spectrom., in press.
13. R. C. Beavis and F. Xiang (1994). 42nd ASMS Conference on Mass Spectrometry and Allied Topics, Chicago, IL, May 29-June 3, 1994.
14. B. M. Olivera, W. R. Gray, R. Zeikus, J. M. McIntosh, J. Varga, J. Rivier, V. de Santos and L. J. Cruz (1985) Science 230, 1338.
15. D. Hunt, J. R. Yates III, J. Shabanowitz, S. Winston and C. R. Hauer (1986) Proc. Natl. Acad. Sci. USA 83, 6233.
16. W. R. Gray, A. Luque, B. M. Olivera, J. Barrett and L. J. Cruz (1981) J. Biol. Chem. 256, 4734.
17. L. J. Cruz, W. R. Gray and B. M. Olivera (1978) Arch. Biochem. Biophys. 190, 539.
18. P. Roepstorff and J. Fohlman (1984) Biomed. Mass Spectrom. 11, 601.
19. A. G. Craig, W. H. Fischer, W. R. Gray, J. Dykert, J. E. Rivier (unpublished results).
20. D. R. Hillyard, B. M. Olivera, S. Woodward, G. P. Corpuz, W. R. Gray, C. R. Ramilo and L. J. Cruz (1989) Biochemistry 28, 358.
21. A. G. Craig, J. E. Rivier, W. R. Gray and W. H. Fischer (1994). 42nd ASMS Conference on Mass Spectrometry and Allied Topics, Chicago, IL, May 29-June 3, 1994.
DIRECT COUPLING OF AN AUTOMATED 2-DIMENSIONAL MICROCOLUMN AFFINITY CHROMATOGRAPHY-CAPILLARY HPLC SYSTEM WITH MASS SPECTROMETRY FOR BIOMOLECULE ANALYSIS
D. B. Kassel, T. G. Consler, M. Shalaby, P. Sekhri, N. Gordon and T. Nadler
Glaxo Res. Inst., 5 Moore Drive, RTP, NC 27709 and PerSeptive Biosystems, 11 Sidney St., Cambridge, MA 01960
I. INTRODUCTION
Two-dimensional (2-D) separations provide the possibility for exquisite resolution of complex mixtures. A benefit of the direct coupling of 2-D chromatographic methods is that sample handling and transfer steps can be virtually eliminated. This is critical when attempting to isolate and identify analytes at very low detection levels. One use of a 2-D separation scheme is to selectively identify "active" components in a compound library. The library may be naturally occurring, such as a cellular lysate or fermentation broth. Alternatively, it may be a pool of synthetic compounds, or a set of enzymatically or chemically modified compounds. Identification of "active" ligands relies on the ability of individual components from the molecular mixture to bind with high affinity to a target molecule. Like many groups, we have been interested in identifying protein-protein interactions involved in signal transduction (1-3). Because ultra-high sensitivity is required to isolate and characterize these interactions, we have initiated the use of a microanalytical immunological system which uses an affinity-based binding site selection as the first dimension followed by a separation in the second dimension by reverse-phase HPLC. The identification of specifically selected molecules is facilitated by the use of an electrospray ionization mass spectrometer as an on-line detection device. Songyang et al. demonstrated that SH2-specific phosphopeptides could be selected from randomly synthesized peptide libraries. Using glutathione affinity resin coupled with GST-SH2 domain fusion proteins, high affinity phosphopeptides were selectively bound and eluted in a rapid on/off assay. Bound ligands were isolated by elution from the affinity resin with phenylphosphate. Consensus sequences of the high affinity peptides were identified by gas phase Edman degradation (4,5). Cantley et al.
demonstrated that various tyrosine kinase peptide substrates can be trapped and identified using a similar approach, with the exception that Fe3+ chelation chromatography is used as the affinity selection for phosphopeptides (6). Both of the above studies were limited in that they identified a consensus sequence of the highest affinity peptides from the library, not individual components. In addition, these approaches require that the peptide ligands have free amino termini and that their component residues are amenable to identification by Edman sequencing. We have developed an analogous, but more robust system which is not necessarily constrained by the aforementioned limitations.
The obvious extension has been to couple an affinity-based separation with mass spectrometry. Hutchens et al. have shown that affinity probe surfaces can be used to capture specific protein ligands, allowing detection by laser desorption mass spectrometry (7). The limitations of their technique have been that the surface area for ligand capture is quite small and salt (or detergent) contaminants are still problematic. Perfusive affinity resins, on the other hand, provide a tremendous surface area for binding. The nature and composition of the solvents required for affinity chromatography, however, are not directly compatible with mass spectrometric analysis. The coupling of micro-column affinity chromatography with capillary RP-HPLC/ESI-MS should permit a highly sensitive and highly selective approach to decoding complex mixtures. Using an automated 2-D system which allows for rapid column and solvent switching, we have assessed the feasibility of coupling a variety of affinity chromatography methods in-line with HPLC/ESI-MS. Applications include flow-through enzyme digestions of proteins on immobilized trypsin cartridges, binding and elution of phosphotyrosine-containing synthetic peptides on micro-column anti-phosphotyrosine antibody resins, and binding and elution of peptide ligands for the SH2 domain of pp60c-src on an avidin-biotinyl-SH2 affinity surface capillary column.
TECHNIQUES IN PROTEIN CHEMISTRY VI Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
II. EXPERIMENTAL
Automated 2-dimensional chromatographic system
An Integral™ micro-analytical workstation was used for all affinity chromatography/HPLC/ESI-MS analyses. The workstation consists of three 10-port injector valves which have been configured for 2-dimensional separations.
In order to capture either unbound or bound species from the affinity resins on the Poros™ R2/H reverse-phase resins, it was necessary to plumb in a mixing tee just prior to column 2 (i.e., the reverse-phase column) and add an equal volume of aqueous 0.1%-0.2% TFA to the volume of liquid displaced from column 1 (i.e., the affinity column) to adjust the ionic strength and ion pairing capabilities and permit binding of the more hydrophilic peptides. In the absence of this post-column mixing tee, binding capacity of the RP-HPLC column was compromised.
Mass spectrometer conditions
A PE-Sciex API-III IonSpray™ mass spectrometer (PE-Sciex, Thornhill, Ontario, Canada) was used to acquire all mass spectra. The Sciex API-III mass spectrometer and Integral micro-analytical workstation were coupled through a 1 m piece of fused silica tubing (75 µm i.d.) at the exit of the capillary flow cell detector using a 250 µm i.d. teflon sleeve as described previously (8). The source
needle assembly was aligned slightly off-axis of the entrance aperture to permit high flow rates (i.e., 100 µl/min). The mass spectrometer was scanned from 500-1500 Da in 2 sec using a 1.0 msec dwell time and a 0.5 Da step size. The ion multiplier was -4200 V, the orifice potential was 70 V and the resolution was 1000.
Rapid flow-through digestion of proteins using trypsin micro-columns
The SH2 domain of pp60c-src was purified from E. coli cells using a T7 expression plasmid and isolated following the procedures of Willard et al. (9). A 20 µl aliquot of purified SH2 corresponding to 0.1 nmole of protein (in 350 mM NaCl, 1 mM DTT, 50 mM HEPES, pH 8.0) was loaded at a flow rate of 20 µl/min (corresponding to roughly 0.5 column volumes/min) onto a 750 µm x 10 cm immobilized trypsin column (Poroszyme™) equilibrated in 50 mM NH4HCO3, pH 8.5. The total residence time of protein on this column was less than 2 minutes. Effluent from the digest column was trapped at the head of a Poros R2/H capillary column using the post-column mixing tee as described above. Solvent lines were purged with reverse-phase buffers and the digestion products were eluted from the column using a linear gradient of 1% to 31% buffer B in 10 min.
Preparation of micro-affinity anti-phosphotyrosine antibody column
An IgG-2a monoclonal antiphosphotyrosine (P-Tyr) antibody was grown as an ascitic mouse tumor and purified to homogeneity by Protein A fast flow sepharose chromatography to a concentration of 11.3 mg/ml. Cross-linking of the antibody to a Protein G ID sensor cartridge™ was accomplished by passing 1 ml of P-Tyr antibody over the ID cartridge at a flow rate of 0.5 ml/min. Cross-linking reagent (dimethylpimelimidate) was added in 7 x 2 ml aliquots at a 0.5 ml/min flow rate. Upon completion of cross-linking, excess reagent was removed by addition of 2 x 2 ml aliquots of quenching reagent (ethanolamine).
The cross-linked P-Tyr antibody column was then washed with 10 column volumes of loading buffer (20 mM Tris in 150 mM NaCl, pH 7.4) and elution buffer (12 mM HCl in 150 mM NaCl, pH 2.5), and the wash cycle was repeated until a stable baseline 280 nm absorbance was achieved.
Binding and elution of phosphopeptides from P-Tyr antibody micro-column
Peptide Mixture I contained AcY*EEIE (1), LIEDNEY*TAR (2), TSTEPQY*EEIENL (3), TSTEPQYEEIENL (4), and PTFEYLQAFLEDYFTSTEPQY*QPGENL (5) and was synthesized either in-house (M. Rodriguez, Glaxo Research Institute) or by Zeneca. Peptides were dissolved in loading buffer and their concentration adjusted to 30 µM each. A 10 µl aliquot was loaded onto the P-Tyr antibody column at a flow rate of 50 µl/min and washed for a total of 5 minutes to minimize non-specific binding. The column effluent was trapped at the head of a Poros R2/H capillary column. Specifically bound phosphopeptides were eluted from the P-Tyr antibody using 0.2% aqueous TFA and trapped at the head of the Poros R2/H column. Buffer A was aqueous 0.1% TFA and Buffer B was 90/10 MeCN/H2O containing 0.1% TFA. A gradient of 1% to 31% B in 10 min and 31% to 61% B in 5 min was used to separate the peptides.
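Several of the operating figures quoted in this section can be cross-checked with simple arithmetic: the scan time implied by the mass range, step size and dwell time; the geometric volume of the 750 µm x 10 cm trypsin column against the stated flow rate; and the amount of each peptide in a 10 µl aliquot at 30 µM. The sketch below is only such a check. The function names are illustrative, and the column volume is the empty-tube volume, which ignores the packing (an assumption that overestimates the liquid volume).

```python
import math


def scan_time_s(start_da, stop_da, step_da, dwell_ms):
    """Time for one scan: number of steps times the dwell time per step."""
    steps = (stop_da - start_da) / step_da
    return steps * dwell_ms / 1000.0


def column_volume_ul(inner_diameter_um, length_cm):
    """Geometric (empty-tube) volume of a cylindrical column in microlitres."""
    radius_cm = inner_diameter_um * 1e-4 / 2.0
    return math.pi * radius_cm ** 2 * length_cm * 1000.0  # cm^3 -> microlitres


def amount_pmol(conc_um, volume_ul):
    """Micromolar times microlitres gives picomoles."""
    return conc_um * volume_ul


if __name__ == "__main__":
    print("scan time (s):", scan_time_s(500, 1500, 0.5, 1.0))
    volume = column_volume_ul(750, 10)
    print("column volume (µl): %.1f" % volume)
    print("column volumes per min at 20 µl/min: %.2f" % (20 / volume))
    print("pmol of each peptide loaded:", amount_pmol(30, 10))
```

Under these assumptions the scan settings account for the stated 2 sec scan, and 20 µl/min corresponds to a little under half a geometric column volume per minute, in line with the "roughly 0.5 column volumes/min" given in the text.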
Preparation of micro-affinity Avidin-Biotinyl-SH2 column
A disposable micro-affinity avidin cartridge was prepared by slurry packing bulk B/A Poros resin into a 500 µm x 5 cm piece of PEEK tubing fitted with 0.062" x 0.028" 2 µm stainless steel frits at both ends of the column. Biotinyl-SH2 was expressed in E. coli cells using a T7 expression plasmid and purified as described by Consler et al. (10). An aliquot corresponding to approximately 2.5 nmoles of Biotinyl-SH2 was loaded onto the avidin micro-affinity column at a flow rate of 2 column volumes/min.
Binding and elution of SH2 ligands from SH2 affinity column
Phosphopeptide Mixture II, containing a previously determined high affinity ligand for the SH2 domain of pp60c-src (TSTEPQY*EEIENL, MW=1633, IC50=1.5 µM) and its non-phosphorylated isoform (TSTEPQYEEIENL, MW=1553, IC50=1 mM), was prepared and diluted to a final concentration of 30 µM each. A total of 600 pmole of the mixture was loaded onto the SH2 micro-affinity column at a flow rate of 20 µl/min and washed through the column for a total of 5 min. Unbound ligand was trapped on the RP-HPLC capillary column coupled in-line between the affinity column and the IonSpray™ mass spectrometer. Unbound peptides were eluted from the column using the same gradient as above. Specifically bound ligand was competed off the SH2 affinity column using a 4 nmole "plug" of the high affinity ligand, AcY*EEIE (MW=804, IC50=1.5 µM).
III. RESULTS AND DISCUSSION
Molecular weight mapping of proteins by mass spectrometry is a powerful tool that allows for the identification of post-translational modifications, including glycosylation, phosphorylation, amino-terminal acetylation, truncation and myristoylation, to name only a few. One of the rate-limiting steps in the mapping of proteins has been the digestion itself.
Typically, we have either purified the protein by HPLC prior to reduction, alkylation and enzymatic digestion to remove potential enzyme interferents or, alternatively, bound the protein to a reverse-phase HP sequencing cartridge support and incubated it with an enzyme cocktail (11). Purification of proteins by HPLC often gives rise to significant sample losses. Digestions on sequencing cartridges are, in general,
Figure 1. 2 min enzymatic digestion of SH2 on 750 µm i.d. immobilized trypsin column coupled to HPLC/ESI/MS. UV chromatogram; horizontal axis: time (minutes).
quite long (> 24 hours) and incomplete. Recently, we have evaluated immobilized trypsin columns. Because a large amount of enzyme can be coupled to the resin (due to the particles' perfusive nature), enzyme digests can be performed extremely rapidly, as has been shown by Maleknia and co-workers (12). Using the Integral™, it has been possible to couple the digestion of proteins on these enzyme digest cartridges with a "trapping" reverse-phase HPLC column coupled to an electrospray mass spectrometer. The results of such an experiment are depicted in Figure 1. Digestion of SH2 was carried out in a flow-stream of pH 8.5 NH4HCO3 buffer for 2 min. The TIC chromatogram was also recorded (data not shown). Analysis of the electrospray mass spectra showed that the digestion was complete (no intact protein was observed). Underlined sequences (below) represent those tryptic fragments observed in the LC/ESI/MS analysis. Peptides that were not accounted for had molecular weights below the m/z range scanned.
SH2 SEQUENCE
MDSIQAEEWYFGKITRRESERLLLNAENPRGTFLVRESET
TKGAYCLSVSDFDNAKGLNVKHYKIRKLDSGGFYITSRT
QFNSLQQLVAYYSKHADGLCHRLTTVCP
We have been interested equally in developing other immunoaffinity-based chromatographic methods that can be coupled directly with HPLC and mass spectrometry. Phosphorylation events govern many of the interactions between proteins involved in signal transduction (13) and cell cycle events (14). Western blot analyses that use anti-P-Tyr antibodies are commonly employed to identify the proteins involved in signal transduction. Our aim has been to use micro-affinity (250-750 µm i.d.) P-Tyr antibody columns with flow rates compatible with reverse-phase capillary HPLC/ESI/MS for the purpose of selectively binding phosphorylated peptides and proteins from complex mixtures and detecting them with high sensitivity by mass spectrometry.
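Returning to the tryptic map of SH2 discussed above, the statement that the unobserved peptides fall below the scanned range can be illustrated with an in-silico digest. The following sketch assumes complete cleavage after Lys and Arg except before Pro, uses standard average residue masses, and takes the sequence as printed above with the letter O of the scan read as Q (an assumption). The function names and the 500 Da cut-off applied to [M+H]+ are illustrative choices, not the authors' analysis.

```python
# Standard average residue masses (Da) for the 20 common amino acids.
MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER, PROTON = 18.0153, 1.0073

# Sequence as printed in the text; "O" in the scan is read as "Q" (assumption).
SH2 = ("MDSIQAEEWYFGKITRRESERLLLNAENPRGTFLVRESET"
       "TKGAYCLSVSDFDNAKGLNVKHYKIRKLDSGGFYITSRT"
       "QFNSLQQLVAYYSKHADGLCHRLTTVCP")


def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide):
    """Average [M+H]+ of a free-acid peptide."""
    return sum(MASS[r] for r in peptide) + WATER + PROTON


if __name__ == "__main__":
    for peptide in tryptic_peptides(SH2):
        flag = "below 500" if mh_plus(peptide) < 500 else ""
        print(f"{peptide:<18s} {mh_plus(peptide):8.1f} {flag}")
```

Only singly charged [M+H]+ values are compared with the lower scan limit here; larger peptides would normally be observed as multiply charged ions within the 500-1500 range.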
Figure 2 illustrates the results of a binding assay using model Peptide Mixture I with the P-Tyr Ab cross-linked to a micro-affinity Protein G column. The TIC chromatogram for Peptide Mixture I analyzed solely by capillary HPLC/ESI/MS is shown in Figure 2a. Binding of the mixture to the affinity column was achieved in less than 2 minutes. Analysis of the "flow-through" material is shown in Figure 2b. Only the non-phosphorylated peptide was observed. Bound peptides were acid eluted from the affinity column, trapped on the capillary HPLC column and mass analyzed by ESI-MS. Figure 2c shows excellent recovery of the 4 phosphopeptides. The retention times were affected following elution onto the reverse-phase column. This could be attributed to altered ion pairing resulting from the highly acidic solution used to displace the peptides from the affinity resin. The electrospray mass spectra of peptides 1 and 2 are shown in Figure 2d and are representative of the quality of data observed from this 2-dimensional analysis. The results suggest that this approach should be particularly useful for identifying, for example, preferred substrates for serine, threonine or tyrosine kinases. Attempts were made to identify preferred peptide substrates for c-src from complex peptide libraries, but initial attempts to completely eliminate non-specific binding were unsuccessful. Currently, we are evaluating parameters such as incubation time as well as other means for immobilizing the capture ligand, such as through the use of biotinylated tags.
Figure 2. HPLC UV chromatogram for Peptide Mixture I (A) prior to and (B) following binding to P-Tyr Ab column. (C) TIC chromatogram following elution of "bound" ligands from Ab column; (D) Representative ESI mass spectra.
In an analogous manner we have evaluated micro-affinity chromatography for identifying ligands for the SH2 domain of c-src. The c-src SH2 domain has been well characterized and is known to bind phosphotyrosine-containing peptides with high specificity and affinity. Construction of a capillary SH2 affinity column was achieved as described previously. Peptide Mixture II was incubated in a flow-based format through the Src SH2 affinity column. The results are summarized in Figure 3a-3b. Figure 3a shows the mixture analyzed by capillary HPLC/MS operating the Integral in the column-2 only mode. Figure 3b shows the TIC chromatogram and mass spectrum (insert) obtained as a result of incubation and trapping of unbound material on the reverse-phase column (operating the Integral™ in the column 1/column 2 mode). Only the nonphosphorylated, weak affinity peptide is observed in the flow-through, consistent with prediction. The bound, high affinity ligand for the Src SH2 affinity column was displaced using the competing high affinity ligand, AcY*EEIE. Figure 3c shows that the phosphopeptide, TSTEPQY*EEIENL, was completely liberated
from the resin. Some of the competing ligand, AcY*EEIE was also observed, as shown in Figure 3c. This could be explained by the fact that a 4nmole injection of the competing ligand was made and the micro-affinity column contained maximally 2.5nmoles of binding sites. Figure 3d shows that prolonged washing of the column with cold TBS (> 10 column volumes) was sufficient to remove all remaining bound ligand. This is consistent with the fact that many of these peptide ligands have reasonably fast off-rates, a necessary consideration in identifying the optimal flow rate for binding in a flow-based immunoaffinity assay such as the one described here. Importantly, this suggests that the Src SH2 affinity column could be re-cycled for multiple screening analyses.
Figure 3. LC/MS TIC chromatogram for Peptide Mixture II (A) prior to and (B) following incubation with SH2 affinity column; (C) following displacement of "bound" TSTEPQY*EEIENL using a 4 nmole "plug" of AcY*EEIE and (D) following displacement of bound AcY*EEIE by washing with cold TBS. Horizontal axes: time (min).
IV. CONCLUSIONS
We have demonstrated that 2-D separations based principally upon micro-affinity chromatography and RP-HPLC are readily coupled with electrospray ionization mass spectrometry. Using a number of model systems the following principles were demonstrated: 1) extremely fast enzyme digests could be performed in situ and could be coupled directly with RP-HPLC/MS to provide protein molecular weight maps; 2) anti-phosphotyrosine antibodies immobilized to micro-affinity Protein G resins could be used to bind phosphopeptides from complex mixtures, which could then be detected by electrospray ionization mass spectrometry; and 3) SH2 domains immobilized to streptavidin resins through biotinylated tags were capable of selectively binding high affinity ligands, which could be identified readily by mass spectrometry.
ACKNOWLEDGMENTS
The authors are grateful to D. Weigl (Glaxo Research Institute) for purification of the phosphotyrosine antibody. The authors also wish to acknowledge J. Mark for providing immobilized trypsin columns. Finally, the authors wish to thank M. Rodriguez, D. Kinder, M. Green and J. Berman (all of Glaxo Research Institute) for providing some of the synthetic peptides used in this study.
VI. REFERENCES
1. Koch, C.A., Anderson, D., Moran, M.F., Ellis, C., Pawson, T. (1992) Science 252, 668-674.
2. Koch, C.A., Moran, M.F., Anderson, D., Liu, X., Mbamalu, G. and Pawson, T. (1992) Molec. Cell Biol. 12 (3), 1366-1374.
3. Pawson, T. and Gish, G.D. (1992) Cell 71, 359-362.
4. Songyang, Z. et al. (1993) Cell 72, 767-778.
5. Marengere, L.E.M., Songyang, Z., Gish, G.D., Schaller, M.D., Parsons, J.T., Stern, M.J., Cantley, L.C. and Pawson, T. (1994) Nature 369, 502-505.
6. Cantley, L.C., Songyang, Z. Proceedings of the American Society of Biochemistry and Molecular Biology, Wash., D.C., 1994. Book of Abstracts.
7. Hutchens, T.W. and Yip, T.-T. (1993) Rapid Commun. Mass Spectrom. 7, 576-580.
8. Kassel, D.B., Musselman, B.D. and Smith, J.A. (1991) Anal. Chem. 63, 1091-1096.
9. Knight, W.B. et al., Glaxo Research Institute, manuscript in preparation.
10. Consler, T.G., Glaxo Research Institute, unpublished results.
11. Burkhart, W. (1993) in "Techniques in Protein Chemistry IV," Academic Press, Inc., R. Hogue Angeletti, Ed., pp. 399-406.
12. Maleknia, S., Dixon, J.D., Mark, J. and Afeyan, N.B. (1994) 42nd ASMS Conference on Mass Spectrometry and Allied Topics, Chicago, IL, Abstract.
13. Zhu, G., Decker, S.J., Maclean, D., McNamara, D.J., Singh, J., Sawyer, T.K. and Saltiel, A.R. (1994) Oncogene 9, 1379-1385.
14. Taylor, S.J. and Shalloway, D. (1994) Nature 368, 867-870.
Edman Degradation and MALDI Sequencing Enables N- and C-Terminal Sequence Analysis of Peptides
Roland Kellner (1), Gert Talbo, Tony Houthaeve, and Matthias Mann
European Molecular Biology Laboratory, D-69012 Heidelberg, Germany
(1) Present address: Institute for Physiological Chemistry and Pathobiochemistry, Johannes-Gutenberg-University, D-55099 Mainz, Germany
I. Introduction
In recent applications of protein characterization we focused on the in-matrix digestion of samples and automated Edman degradation of the resulting peptide fragments [1]. This strategy proved to be advantageous with demanding biological samples like membrane proteins and cell signalling components, and we could successfully characterize proteins with starting amounts down to ca. 25 pmol sample [2-5]. However, inherent limitations of the Edman chemistry often cause ambiguous results in the low picomole range. The signals of the first amino acid residue(s) are often overlapped by background signals; residues like tryptophan and cysteine are hardly detected; and the C-terminal end of a peptide may not be identified due to sample washout. The molecular weight information from mass analysis can be used either to confirm results from Edman sequencing or to decide minor ambiguities. However, frequently more than one peak appears in the mass spectrum, or more than one amino acid is unassigned. This makes it impossible to correlate the Edman and the MS data by the measured mass alone. Recently, fragmentation by post source decay (PSD) was introduced as a technique to obtain structural information in MALDI MS [6,7]. We describe here the combined use of automated Edman sequencing and MALDI sequencing for the determination of proteolytic peptide fragments in the low picomole range.
II. Materials and Methods
A. Sample Preparation
Proteins were digested in the gel matrix as described [5]. Briefly, after separation by 2D-gel electrophoresis and staining by Coomassie Blue R250, the protein spots were excised and thoroughly washed. 1 µg protease (e.g. trypsin or chymotrypsin, sequencing grade, Boehringer Mannheim, Germany) dissolved in 100 µl 100 mM NH4HCO3, pH 8.0 and 0.5 mM CaCl2 was added and digestion was performed at 37°C overnight. Peptide fragments were then extracted from the gel slice using 3 x 100 µl 70% trifluoroacetic acid in water and 3 x 100 µl trifluoroacetic acid / acetonitrile 1:1. The supernatants were combined and concentrated. The peptides were separated by RP-HPLC (Vydac C18 218TP 1.6 x 250 mm, 120 µl/min) and peak fractions were collected manually.
B. Amino Acid Sequence Analysis
A major aliquot of a peak fraction (90%) was subjected to automated Edman degradation (model 477, Applied Biosystems). Typically, fractions of ca. 60 µl were applied to a polybrene-coated glass fibre filter and sequenced.
C. Mass Analysis
About 10% of the sample was concentrated in a vacuum centrifuge to give an estimated concentration of 1 pmol/µl if possible. Samples were not dried down completely to avoid sample loss. The MALDI matrix was a saturated solution of α-cyano-4-hydroxycinnamic acid dissolved in water/acetonitrile (7:3) [8]. A 0.5 µl aliquot of the matrix solution was placed on the stainless steel probe and mixed with an equal volume of sample solution. The mixture was left to dry at room temperature prior to introduction into the mass spectrometer. A new sample preparation method which decouples matrix surface preparation and sample handling was also used [9]. Mass spectra were acquired using a time-of-flight MALDI mass spectrometer (Bruker REFLEX, Bruker-Franzen, Bremen, Germany) equipped with a reflector.
The ion signals were monitored by a LeCroy 9450 digital oscilloscope (400 MHz sampling rate) and the spectra were transferred to a Macintosh Quadra 950, where sets of data were averaged. The acceleration voltage was set to 23 kV and for the stable ion measurements the reflector voltage was set to 26 kV. Depending on the signal-to-noise ratio, each spectrum was the average of 50 to 200 single-shot spectra acquired in sets of 5 shots. The mass spectrometer was equipped with a set of short deflection plates and fast pulse electronics. Together they allow selection of a small mass range of interest in a mixture. After the stable spectrum had been obtained in the linear mode, a pulse window was set around the precursor ion of interest, deflecting all other ions.
Edman Degradation and MALDI Sequencing
Metastable reflector spectra were obtained by the method of Spengler and Kaufmann [7] as modified for a gridless reflector [10]. Reflector voltage steps were chosen to result in overlapping mass ranges for metastable ions (see below). Calibration of the metastable spectra was performed as described [10]. Metastable spectra were interpreted manually using the observation that low energy fragments, especially A, B and Y ions, are predominant. Y ions can often be identified by a simultaneous loss of 17 Da.
III. Results and Discussion
Previously, we gave a short description of the principle of MALDI sequencing [11]. In the following we briefly expand on the basics of the process as applied to peptide sequencing. A time-of-flight instrument offers several different modes for mass analysis: linear, reflector and reflector metastable mode (MALDI sequencing). In the linear mode ions are measured at the end of the flight tube. Alternatively, the flight direction of ions can be reversed by a reflector field and the ions are measured at a second detector. The reflector mode compensates for some of the energy spread and, therefore, increases resolution and mass accuracy. In all modes ions are created directly at or above the target surface by the interaction of the laser beam with the matrix material. Normally, ions will be formed whose mass corresponds to the complete molecular weight of the analyte molecule plus the charge agent. The molecular ions created may be stable or they may fragment before detection. Particularly weak bonds may result in fragmentation on the surface of the target, the so-called 'prompt fragmentation'. In the linear mode these fragments show up at distinct flight times in the MS spectrum. However, if the fragmentation occurs after the acceleration region, where the ions have acquired their full kinetic energy, those fragments will reach the detector at about the same time as their stable counterparts. Then there will be only one peak per component even though there may be a significant degree of fragmentation [12]. The reflector mode is somewhat more complex. The reflector constitutes an energy analyzer, which time-focuses ions that have the full acceleration energy. However, it will separate fragment ions that have the same velocity, because they originate from the same parent ion, but different mass and hence different energy. Light fragment ions will not penetrate as far into the reflector and will appear earlier in the time-of-flight spectrum.
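The energy argument can be made concrete with a short sketch. It assumes singly charged ions and a fragment that keeps the velocity of its precursor, so its kinetic energy scales with the mass ratio. The proportional scaling of the reflector voltage is a simplifying assumption for illustration, not the stepping scheme actually used here, and the function names are hypothetical. The 23 kV and 26 kV settings come from the Methods; the precursor (1434.74) and fragment (687.34) values are those labelled in Figures 3 and 4.

```python
def fragment_energy_kev(precursor_mass, fragment_mass, accel_kv):
    """Kinetic energy (keV) of a singly charged post-source-decay fragment.

    The fragment keeps the precursor velocity, so its energy is the
    precursor energy multiplied by the mass ratio m_fragment / m_precursor.
    """
    return accel_kv * fragment_mass / precursor_mass


def scaled_reflector_kv(precursor_mass, fragment_mass, reflector_kv):
    """Reflector voltage giving the fragment the same penetration depth as
    the precursor (assumption: simple proportional scaling)."""
    return reflector_kv * fragment_mass / precursor_mass


if __name__ == "__main__":
    energy = fragment_energy_kev(1434.74, 687.34, 23.0)
    voltage = scaled_reflector_kv(1434.74, 687.34, 26.0)
    print(f"fragment energy: {energy:.2f} keV")
    print(f"scaled reflector voltage: {voltage:.2f} kV")
```

A fragment of roughly half the precursor mass therefore carries roughly half the energy, which is why a single reflector setting cannot focus the whole product ion range.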
After proper calibration, correct masses can be assigned to a segment of the metastable ion spectrum [13]. Procedures for calibration of metastable fragmentation spectra are now available for most commercial MALDI mass spectrometers. By stepping the reflector voltage the whole product ion mass range can be covered. In the same measurement, a signal for the corresponding neutral fragments can be obtained on the linear detector. The presence of the neutral fragment peak ensures that peptide precursor ions have been formed. This is helpful in mass ranges where no fragment ions are produced. The intensity of the neutral peak is related to the degree of fragmentation of the sample and may help in determining the level of laser irradiance to apply.
A. Examples for the identification of proteolytic peptide fragments by Edman degradation and MALDI sequencing
1. A 14 kDa vesicular membrane protein which had been the subject of study for two years was separated on a 2D-gel. Poor staining was achieved with Coomassie Blue. Nevertheless, we excised that spot because of the low abundance of the sample. The position of the protein in the gel was determined by comparison to an earlier, silver-stained gel which had given more intense staining with less sample. Chymotrypsin was chosen as protease because trypsin failed to cleave this membrane protein in a first attempt. The peak intensity after HPLC separation of the digestion mixture reflected the low sample amount (Figure 1). Ninety percent of a peak fraction was applied to the sequencer. The following sequence information for the 12-residue peptide KQYHENISAVVF could be determined by Edman degradation in the range of 1.5 pmol (Tyr-3: 1.4 pmol). It contained several ambiguities, namely His-4, Ser-8 and Val-10 (Figure 2). Simultaneously, an aliquot of the fractionated peptide (5 µl = ca. 10%) was used for MALDI MS. The reflector spectrum showed a monoisotopic molecular weight of 1434.7 Da, determined to an accuracy of 40 ppm (Figure 3). Furthermore, MALDI sequencing resulted in the peak pattern shown in Figure 4. The fragmentation pattern clearly identified a Ser residue at position 8 and Val at position 10 in the molecule. Together with the overall mass a His residue at position 4 could be derived. Ile and Leu were already distinguished by Edman sequencing.
The combined interpretation of these results gave the unequivocal determination of this chymotryptic peptide fragment.
KQYHENISAVVF
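The mass check for this peptide can be reproduced with a minimal sketch. The residue masses are standard monoisotopic values; the function names are illustrative and not part of any instrument software. The sketch computes the protonated monoisotopic mass of KQYHENISAVVF and the absolute mass window that corresponds to 40 ppm.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I": 113.08406,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added for the free N- and C-termini
PROTON = 1.007276   # charge carrier for [M+H]+


def mono_mh(sequence):
    """Monoisotopic [M+H]+ of an unmodified linear peptide."""
    return sum(MONO[aa] for aa in sequence) + WATER + PROTON


def ppm_error(measured, calculated):
    """Relative deviation of a measured mass in parts per million."""
    return (measured - calculated) / calculated * 1e6


def ppm_window(mass, ppm):
    """Absolute mass tolerance (Da) corresponding to a ppm accuracy."""
    return mass * ppm * 1e-6


if __name__ == "__main__":
    mh = mono_mh("KQYHENISAVVF")
    print(f"calculated [M+H]+: {mh:.2f}")
    print(f"40 ppm window: +/- {ppm_window(mh, 40):.3f} Da")
```

At this mass, 40 ppm corresponds to a window of about 0.06 Da, which is narrow enough to discriminate between many candidate residue assignments at an ambiguous position.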
Figure 1. In matrix chymotryptic digestion of a 14 kDa vesicular membrane protein separated on RP-HPLC and UV-detection at 214 nm (solid line). The peaks were fractionated manually. From the same 2D-gel a blank gel piece was treated simultaneously and used as a comparison to eliminate background peaks (dashed line).
Figure 2. Edman degradation of a chymotryptic peak fraction (traces per cycle number plotted against time in min). Only the finally assigned PTH-amino acids of the 12-residue peptide are labelled (see text).
Figure 3. Reflector MALDI MS was performed on an aliquot of the chymotryptic peak fraction (Figure 2). Peak label: KQYHENISAVVF, MH+ = 1434.74 amu. The monoisotopic peak could be determined via an internal standard to an accuracy of 40 ppm.
Figure 4. MALDI sequencing of a chymotryptic peptide fragment (K-Q-Y-H-E-N-I-S-A-V-V-F). Labelled fragment ion signals: 687.34, 801.38, 914.46, 1001.50, 1072.53, 1171.60 and 1271.67; vertical axis: Signal (mV). The overlapping mass ranges for metastable ions are due to steps in the reflector voltage.
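Reading a sequence from such a fragment ladder amounts to matching the spacing between consecutive peaks to residue masses. The following is a minimal sketch of that idea, not the manual interpretation procedure used by the authors; the 0.05 Da tolerance and the function name are assumptions. The peak values are the labels of Figure 4 as printed.

```python
# Standard monoisotopic residue masses (Da); I and L are isobaric.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I": 113.08406,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(peaks, tol=0.05):
    """Assign a residue to each gap between consecutive fragment peaks.

    Returns one entry per gap: matching residues joined by '/', or '?'
    when no residue mass lies within the tolerance.
    """
    calls = []
    for low, high in zip(peaks, peaks[1:]):
        delta = high - low
        hits = [aa for aa, mass in MONO.items() if abs(delta - mass) <= tol]
        calls.append("/".join(hits) if hits else "?")
    return calls


if __name__ == "__main__":
    figure4_peaks = [687.34, 801.38, 914.46, 1001.50, 1072.53, 1171.60, 1271.67]
    print(read_ladder(figure4_peaks))
```

With the printed labels the first five gaps read N, I/L, S, A, V, in agreement with positions 6 to 10 of the peptide, including the Ser-8 and Val-10 assignments. The last gap is about 1 Da larger than a Val residue with the labels as printed, so the sketch leaves it unassigned.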
Figure 5. Reflector MALDI MS. The tryptic peptide ALLNNSHFYHLAHGKDFASR has a calculated molecular weight of [M+H]+ = 2299.56 Da.
2. In another example, a tryptic fragment of a 52 kDa protein was subjected to Edman degradation and a 15-residue peptide was identified. It contained two ambiguities: ALLNNSH(F/Y)YHLA(H)GK; Lys was assumed to be the C-terminus of the tryptic fragment. However, the molecular weight was determined by MALDI to be 2299.4 Da (Figure 5), which implied that the peptide was larger than 15 residues. Therefore, the component was further analysed by MALDI sequencing. Interpretation of several fragmentation spectra identified position 8 as Phe, and position 13 could be confirmed to be His; the fragment ions were in agreement with the Edman result (Figure 6A and 6B). A database search showed strong homology to a known protein and two amino acid exchanges were identified: Phe-8 instead of Tyr, and a Leu residue instead of Met. In the protein sequence the 15-residue fragment is followed by the tryptic peptide DFASR. The calculated mass of the extended 20-residue peptide agrees with the measured molecular weight of 2299.4 Da. MALDI sequencing was particularly important to determine and verify the amino acid exchanges between the studied protein and the sequence obtained from the database.
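The size argument can be checked with a short sketch that compares the measured mass with the calculated masses of the 15-residue and the extended 20-residue peptide. It uses standard average (chemical) residue masses, since the calculated value of 2299.56 Da quoted in Figure 5 corresponds to the average mass; the function names are illustrative.

```python
# Standard average residue masses (Da).
AVERAGE = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_AVG = 18.0153
PROTON = 1.0073


def average_mh(sequence):
    """Average-mass [M+H]+ of an unmodified linear peptide."""
    return sum(AVERAGE[aa] for aa in sequence) + WATER_AVG + PROTON


def best_match(measured, candidates):
    """Return the candidate sequence whose calculated mass is closest."""
    return min(candidates, key=lambda seq: abs(average_mh(seq) - measured))


if __name__ == "__main__":
    short = "ALLNNSHFYHLAHGK"
    extended = "ALLNNSHFYHLAHGKDFASR"
    for seq in (short, extended):
        print(f"{seq}: {average_mh(seq):.2f}")
    print("closest to 2299.4 Da:", best_match(2299.4, [short, extended]))
```

The 15-residue peptide alone would be expected near 1722.9 Da, far below the measured 2299.4 Da, whereas the 20-residue peptide is within about 0.2 Da of it.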
Figure 6. The molecular ion peak (Figure 5) was subjected to MALDI sequencing. The mass range from 650 - 1000 Da in A displays the Y series LAHGK and the B series HFY; in B the Y and the B series YHLA from 1000 - 1350 Da are shown.
IV. Conclusions
Automated Edman degradation or MALDI sequencing can in principle yield the complete sequence of peptides. However, the amounts of peptide available for sequencing in demanding biological problems are very low. Both methods need to be operated at their highest possible performance, and their inherent limitations must be considered. A combined application of both methods is feasible because of the very low sample consumption of MALDI MS. As shown in our examples, ambiguous sequence calls could be clarified by the complementary information. Applied to proteolytic peptide fragments, MALDI sequencing often yielded metastable ions from the C-terminal part. In this way the N-terminal sequence is obtained by Edman degradation, the molecular weight determines the overall size of the fragment, and ambiguous amino acid residues are identified by MALDI sequencing.
References
1. Kellner, R., Houthaeve, T., Kurzchalia, T.V., Dupree, P., Simons, K. (1992) J. Prot. Chem. 11, 356.
2. Kurzchalia, T.V., Dupree, P., Parton, R., Kellner, R., Lehnert, M., Simons, K. (1992) J. Cell Biol. 118, 1003-1014.
3. Emans, N., Gorvel, J.P., Walter, C., Gerke, V., Kellner, R., Griffith, G., Gruenberg, J. (1992) J. Cell Biol. 120, 1351-1369.
4. Kurzchalia, T.V., Gorvel, J.P., Dupree, P., Parton, R., Kellner, R., Houthaeve, T., Gruenberg, J., Simons, K. (1992) J. Biol. Chem. 267, 18419-18423.
5. Fiedler, K., Parton, R.G., Kellner, R., Etzold, T., Simons, K. (1994) EMBO J. 13, 1729-1740.
6. Spengler, B., Kirsch, D., Kaufmann, R., Jaeger, E. (1992) Rapid Commun. Mass Spectrom. 6, 105-108.
7. Kaufmann, R., Spengler, B., Lutzenkirchen, F. (1993) Rapid Commun. Mass Spectrom. 7, 902-910.
8. Beavis, R., Chait, B. (1989) Rapid Commun. Mass Spectrom. 3, 432-435.
9. Vorm, O., Roepstorff, P., Mann, M. (1994) Anal. Chem., in press.
10. Vorm, O., Talbo, G., Mortensen, P., Mann, M. (1994) personal communication.
11. Talbo, G., Mann, M. (1994) In: Techniques in Protein Chemistry V (J.W. Crabb, ed.), p. 105-114, Academic Press, San Diego.
12. Spengler, B., Kirsch, D., Kaufmann, R. (1992) J. Phys. Chem. 96, 9678-9684.
13. Vorm, O., Talbo, G., Mann, M. (1994) personal communication.
Identification of the Amino Terminal Peptide of N-terminally Blocked Proteins by Differential Deutero-Acetylation Using LC/MS Techniques
Craig D. Thulin and Kenneth A. Walsh
Dept. of Biochemistry, University of Washington, Seattle, WA 98195
I. Introduction More than 80% of eukaryotic proteins are blocked at their amino terminus, often by N-acetylation (1, 2). This prevents direct analysis of the primary structure of many proteins by standard techniques. Instead, blocked proteins are usually digested with a high-specificity protease and the resultant peptides separated chromatographically and sequenced in turn. The one peptide which is refractory to sequencing is then assumed to be the amino terminal peptide, and amino acid analysis yields its composition (see 3, 4 for examples). Chemical analysis of the hydrolytic products, e.g. by gas chromatography, can reveal the nature of the blocking group (4). Alternately, proteins or peptides can be deblocked using specific enzymes or chemical treatment (5), although success is variable. More recently, mass spectrometry has been applied to the identification of blocked N-terminal peptides and tandem MS/MS techniques to the characterization of the blocking moiety (see 6, 7 and 8). Until now no facile method has been available to identify the amino terminal peptide in a mixture. Although LC/MS does allow the analysis of a complex mixture of peptides, one must examine every component in order to identify the amino terminal peptide, even when one knows the amino acid sequence and suspects the nature of the blocking group. We report a method that facilitates the identification of the blocked amino terminal peptide in a digest of a blocked protein. This method can be applied even if the amino acid sequence and blocking group are unknown. In this method, the amino groups on lysine side chains are first deuteroacetylated, then the protein is digested with a high-specificity protease. The digestion mixture is divided and half of it retreated with deuterated acetic anhydride. Only one peptide (from the blocked N-terminus) should not be altered by the second deutero-acetylation; it is identified by comparing LC/MS patterns of the treated and the untreated digest. 
The use of tandem mass spectrometry (MS/MS) then provides the sequence of that peptide and the identity of the blocking group.
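The mass bookkeeping behind the differential labelling can be sketched as follows. The sketch uses standard monoisotopic atomic masses to derive the increments for a natural acetyl group and a trideutero-acetyl group, and the shift expected in m/z when a free amino terminus is labelled in the second treatment. The function name is illustrative, and the assumption that the charge state stays the same after labelling is a simplification.

```python
# Standard monoisotopic atomic masses (Da).
C, H, D, O = 12.000000, 1.007825, 2.014102, 15.994915

# Acetylation replaces one amine hydrogen by COCH3: net gain C2H2O.
ACETYL = 2 * C + 2 * H + O
# Deutero-acetylation replaces one amine hydrogen by COCD3: net gain C2D3O - H.
D3_ACETYL = 2 * C + 3 * D + O - H


def mz_shift(n_labels, charge):
    """Shift in m/z after adding n trideutero-acetyl groups.

    Assumes the charge state of the observed ion is unchanged.
    """
    return n_labels * D3_ACETYL / charge


if __name__ == "__main__":
    print(f"acetyl increment:    {ACETYL:.4f} Da")
    print(f"d3-acetyl increment: {D3_ACETYL:.4f} Da")
    print(f"difference:          {D3_ACETYL - ACETYL:.4f} Da")
    print(f"shift for one label at 2+: {mz_shift(1, 2):.4f}")
```

Every peptide with a free amino terminus gains about 45 Da per molecule in the second treatment, while the peptide carrying the natural blocking group stays put; the 3 Da difference between the two acetyl forms also keeps a natural acetyl block distinguishable from an introduced label.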
II. Materials and Methods
A. Modification of Lysine Side-chains
Acetylation was performed using a modification of the method of Fraenkel-Conrat (9). Horse cytochrome-C (Sigma) was dissolved in 100 mM Tris pH 8.0 to give a concentration of 0.25 mg/ml. To approximately 2 nmol of protein (100 µl) was added an equal volume of 400 mM sodium acetate. The protein solution was cooled in an ice bath and 0.5 µl of deuterated acetic anhydride (99+ atom % D; a generous gift of Dr. Hiroshi Ohguru) was added every 20 minutes for one hour. The modified protein was desalted with a POROS R1/M 2.1 mm x 30 mm reverse-phase column (PerSeptive Biosystems, Inc.) using a step gradient from 0 to 80% acetonitrile in 0.05% trifluoroacetic acid (TFA). Fractions were collected and lyophilized.
B. Proteolytic Digestion of Modified Protein
The modified protein was redissolved in 100 µl 100 mM Tris pH 8.0 and 4 µl of a 0.1 mg/ml solution of chymotrypsin (to give approx. 1:100 enzyme:protein ratio) was added. Digestion was carried out at 37°C for 18 hours.
C. Re-acetylation (of New Amino Termini)
To one half of the digest was added 50 µl of 100 mM Tris pH 8.0 and 100 µl of 400 mM sodium acetate. This mixture was then placed on ice and three additions of 0.5 µl of deuterated acetic anhydride were made over one hour as before.
D. Identification of Amino Terminal Peptide
Both the modified and unmodified digests were analyzed by liquid chromatography/mass spectrometry using an Applied Biosystems Model 140A HPLC with an Upchurch 2 mm C18 reverse-phase column and a PE Sciex API III triple-quadrupole ionspray mass spectrometer. At a flow rate of 200 µl/min the chromatography was developed with 0.05% TFA and a gradient of 0 to 60% acetonitrile, 0.03% TFA over 30 min. Ten percent of the HPLC effluent was directed to the mass spectrometer; the remainder was directed to an ABI Model 785A UV detector and fractions were collected by hand. Comparison of the LC/MS data before and after re-acetylation sought a single peptide that did not change mass or mobility. Data analysis was performed using the MacSpec software from PE Sciex, as well as an in-house program, Sherpa, written by J. Alex Taylor, which identifies and relates peptide m/z values in an LC/MS experiment to masses predicted from a given protein sequence. MacBioSpec (PE Sciex) was also used to predict masses for some modified peptides. Fractions containing ions of interest were infused into the mass spectrometer at 1.7 µl/min, and ions selected in the first quadrupole were analyzed by interpreting collisionally induced dissociation (CID) spectra. MacBioSpec was used to generate lists of expected mass spectral fragments for comparison to the observed data.
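The comparison step can be pictured as a search for peaks that occupy the same position in both LC/MS runs. The sketch below is not the Sherpa or MacSpec procedure; it is a hypothetical illustration in which each peak is an (m/z, elution time) pair and the tolerances are assumed values. Apart from the peak at m/z 652 and 13 min, which is taken from the caption of Figure 5, all peak values are made up for the example.

```python
def unchanged_peaks(before, after, mz_tol=0.5, rt_tol=0.5):
    """Return peaks of `before` that reappear in `after` at the same
    m/z and elution time, within the given (assumed) tolerances."""
    shared = []
    for mz, rt in before:
        if any(abs(mz - mz2) <= mz_tol and abs(rt - rt2) <= rt_tol
               for mz2, rt2 in after):
            shared.append((mz, rt))
    return shared


if __name__ == "__main__":
    # Hypothetical peak lists: (m/z, elution time in min).
    digest = [(652.0, 13.0), (510.3, 11.2), (733.9, 18.4)]
    reacetylated = [(652.0, 13.1), (532.8, 12.9), (756.4, 20.0)]
    print(unchanged_peaks(digest, reacetylated))
```

Peptides with a free amino terminus move in both dimensions after re-acetylation, so only the blocked amino terminal peptide (and any internal standard) should survive this filter.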
E. In situ derivatization on a Cationic PVDF Membrane
A large protein, rabbit glycogen phosphorylase b (97 kD), was not soluble at high enough protein concentration for the procedures used for cytochrome-C. To overcome this limitation, 20 µl of 9 M urea, 50 mM Tris pH 8 containing 10 mg/ml rabbit glycogen phosphorylase b (a gift from the laboratory of Dr. Edmond Fisher) was subjected to electrophoresis on a 4-16% gradient SDS polyacrylamide gel, blotted to Immobilon™-CD (Millipore Corp.), a cationic PVDF membrane, and stained with Immobilon-CD Stain according to the manufacturer's protocol. Stained bands were excised and cut into 1 mm² pieces. One hundred µl of 100 mM Tris pH 8.0 and 2 M urea were added to the membrane pieces, followed by 100 µl of 400 mM sodium acetate. This was then cooled in an ice bath and 0.5 µl of deuterated acetic anhydride was added every 20 minutes for one hour. The supernatant and three 1 ml washes with distilled water were decanted and discarded. Subsequent digestion conditions were based on those of Hess et al. (10) as follows: 50 µl of 100 mM Tris pH 8, 1 M NaCl, 10% (v/v) acetonitrile, with 0.1 mg/ml chymotrypsin was added to the membrane pieces and incubated overnight at 37°C. Five µl of 9 M urea, 50 mM Tris pH 8 plus 1 µl of 1 mg/ml chymotrypsin was added, and the digest was incubated another 5 hours. Alternative treatment with trypsin used the same Tris, NaCl, acetonitrile mixture but with 2 mM CaCl2 and 0.01 mg/ml trypsin. After digesting overnight, 5 µl of 9 M urea, 50 mM Tris pH 8 and 0.25 µl of 2 mg/ml trypsin was added and the digestion incubated another 5 hours. In either case, the supernatant was then decanted and combined with 50 µl 100 mM Tris pH 8. The resulting 100 µl was divided in half, and one half was re-deutero-acetylated as in step C above. Analysis was conducted as previously indicated.
III. Results and Discussion
Cytochrome-C and glycogen phosphorylase b were chosen as model proteins for this study. Both have been well characterized and in both cases the amino termini
[Contour plot residue: panels a) and b); axis ticks from 400 to 1000; elution time (min) from 8.9 to 24.5]
Figure 5. Contour plot of an LC/MS of a tryptic digest of modified phosphorylase b, printed in black and white mode. Circles indicate the positions of peaks from a contour plot of the reacetylated digest for comparison. The two significant shared peaks (indicated by arrows) are the N-terminal blocked peptide (at m/z=652 and 13 min elution time) and the leucine enkephalin internal standard.
citraconylated before proteolysis, and then the new amino termini were dinitrophenylated. After subsequent de-citraconylation, the dinitrophenylpeptides were adsorbed on a polystyrene column, and the amino terminal peptide flowed through the column. Additional steps were required when histidine or tyrosine were in the N-terminal peptide. In the present method we also exploit the unique resistance of the blocked amino terminal peptide to modification but use LC/MS techniques. It is not necessary to purify the amino terminal peptide; the separation in two dimensions (elution time and m/z) by LC/MS is sufficient to identify this peptide in the mixture. The high resolution of mass analysis assures distinction between naturally occurring acetylation (with naturally abundant isotopes) and deuterated acetyl groups introduced in the procedure. Modification with different chemical moieties and deblocking of lysine side chains are unnecessary.
Figure 6. CID spectrum of the N-terminal tryptic peptide from phosphorylase b. The spectrum is consistent with the known sequence, namely: Ac-Ser-Arg-Pro-Leu-Ser-Asp-Gln-Glu-Lys*-Arg, where the Lys* has a deuterated acetylation. Ions labelled with an arrow pointing right are b fragment ions; those labelled with an arrow pointing left are y fragment ions. The ion at m/z=652 is the doubly charged parent ion. (Note that trypsin does not cleave at Arg-Pro.)
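The parent ion assignment can be verified with a short calculation. This sketch sums standard monoisotopic residue masses for the sequence in the caption, adds one natural acetyl group at the amino terminus and one trideutero-acetyl group on the lysine side chain, and converts the result to the m/z of the doubly protonated ion. The constant and function names are illustrative.

```python
# Standard monoisotopic residue masses (Da) for the residues in this peptide.
MONO = {
    "S": 87.03203, "R": 156.10111, "P": 97.05276, "L": 113.08406,
    "D": 115.02694, "Q": 128.05858, "E": 129.04259, "K": 128.09496,
}
WATER = 18.010565
PROTON = 1.007276
ACETYL = 42.010565      # natural N-terminal acetyl group (C2H2O)
D3_ACETYL = 45.029396   # introduced label on the lysine side chain (C2D3O - H)


def parent_mz(sequence, charge, modifications=()):
    """m/z of a protonated peptide carrying the given mass modifications."""
    neutral = sum(MONO[aa] for aa in sequence) + WATER + sum(modifications)
    return (neutral + charge * PROTON) / charge


if __name__ == "__main__":
    mz = parent_mz("SRPLSDQEKR", charge=2, modifications=(ACETYL, D3_ACETYL))
    print(f"[M+2H]2+ = {mz:.2f}")
```

The calculated value of about 651.85 agrees with the doubly charged parent ion reported at m/z 652.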
Visentin and Kaplan (13) also employed a strategy similar to the present method but using [1-14C]- and [3H]acetic anhydride. The present method allows the analysis to be accomplished without resorting to radioisotopes. The method described in this report has recently been employed in the study of two amino terminally blocked proteins wherein the amino terminus had not previously been identified (Thulin and Walsh, in preparation; and Presland, Kimball, Thulin and Dale, in preparation).
IV. Conclusion
Comparison by LC/MS of enzymatic digests before and after modification with deuterated acetic anhydride allows facile identification of the amino terminal peptide of amino terminally blocked proteins. Subsequent MS/MS analysis establishes the specific nature of the blocking group and the sequence of the peptide. The chemical modifications involved (both before digestion, to block free side chain amino groups; and after digestion, to modify new amino termini) are simple and effective. These procedures can be done in solution or after the protein has been separated by SDS-PAGE and blotted to a membrane. Modern mass spectrometric techniques greatly simplify this kind of analysis when compared with classical approaches to the same problem.
Acknowledgment
C.D. Thulin was supported by Public Health Service National Research Service Award T32 GM07270 from the National Institute of General Medical Sciences.
References
1. Brown, J.L. and W.K. Roberts (1975) J. Biol. Chem. 251(4); 1009-1014.
2. Brown, J.L. (1979) J. Biol. Chem. 254(5); 1447-1449.
3. Resing, K.A., K.A. Walsh, J. Haugen-Scofield, and B.A. Dale (1989) J. Biol. Chem. 264(3); 1837-1845.
4. Margoliash, E., E.L. Smith, G. Kreil, and H. Tuppy (1961) Nature 192; 1121-1127.
5. Tsunasawa, S. and H. Hirano (1993) in Methods in Protein Sequence Analysis, ed. by K. Imahori and F. Sakiyama, Plenum Press, NY pp 45-53.
6. Labdon, J.E., E. Nieves, and U.K. Schubart (1992) J. Biol. Chem. 267(5); 3506-3513.
7. Gibson, B.W., A.M. Falick, J.J. Lipka, and L.A. Waskell (1990) J. Protein Chem. 9(6); 695-703.
8. Anderegg, R.J., S.A. Carr, I.Y. Huang, R.A. Hiipakka, C.S. Chang, and S.T. Liao (1988) Biochem. 27(12); 4214-4221.
9. Fraenkel-Conrat, H. (1957) Methods in Enzymology 4, 247-269.
10. Hess, D., T.C. Covey, R. Winz, R.W. Brownsey, and R. Aebersold (1993) Protein Science 2; 1342-1351.
11. Koide, A., K. Titani, L.H. Ericsson, S. Kumar, H. Neurath, and K.A. Walsh (1978) Biochem. 17; 5657-5572.
12. Kaplan, H. and G. Oda (1983) Anal. Biochem. 132; 384-388.
13. Visentin, L.P. and H. Kaplan (1975) Biochem. 14(3); 463-468.
SECTION II Analysis of Posttranslational Processing Events
HPAEC-PAD Analysis of Monoclonal Antibody Glycosylation
Jeffrey Rohrer, Jim Thayer, Nebojsa Avdalovic, and Michael Weitzhandler
Dionex Corporation, Sunnyvale, CA 94088
I. Introduction
Virtually all antibodies are glycoproteins that contain 2-3% carbohydrate by mass. The carbohydrate of IgG MAbs consists mainly of complex biantennary N-linked oligosaccharide chains. O-glycosylation has also been documented in the constant region hinge domain of mouse IgG2b (1). Glycosylation of immunoglobulins has been shown to have significant effects on their effector functions, stability, and serum half-life (2,3). Human and mouse IgG are N-glycosylated on each heavy chain in the constant region CH2 domain at Asn-297 (4). Glycosylation at this site affects Fc receptor binding and complement activation (2). Variable region glycosylation has also been reported (5-9). Differing effects on binding affinity were seen due to this glycosylation (10-11). Sialylation on MAb oligosaccharides has been shown to be associated with decreased solubility for monoclonal IgMs and IgGs (12-15). Finally, the presence of α-galactosylation in murine IgGs has recently been documented and was postulated to affect clearance because it is known that 1% of circulating antibodies in humans are directed against this epitope (16). Because of these potential effects of glycosylation on MAb pharmaceutical efficacy, it is important to evaluate MAb glycosylation prior to entering clinical trials. Minimally, this analysis should determine whether the glycosylation of the product destined for clinical administration is comparable to that of a reference product, and whether the production process achieves a reproducible glycosylation pattern of the MAb. Monosaccharide and oligosaccharide analysis using HPAEC-PAD are two of the methods biopharmaceutical manufacturers use to characterize glycan structures and to monitor the lot-to-lot consistency of therapeutic glycoproteins. A strategy for the mapping of N-glycans by HPAEC-PAD was recently reported (17).
In this chapter, these methods were used in conjunction with oligosaccharide standards and endo- and exoglycosidases to identify the oligosaccharide structures present in MAb MY9-6.
II. Methods
A. Chromatography
Reagents, a Dionex BioLC system, and conditions used for monosaccharide and oligosaccharide analysis were as described (18). MAb MY9-6 is an ascites-derived murine monoclonal IgG provided by Dr. Mark Hardy, ImmunoGen (Norwood, MA). The MAb was purified by Protein A chromatography. The UV detector configured after the electrochemical detector was a Dionex VDM2 (variable wavelength detector). Peaks were monitored at 215 nm. The sample pretreatment resin used in this study was derived from OnGuard-A cartridges available from Dionex (Sunnyvale, CA). Amino acids used in this study were obtained from Pierce (Rockford, IL).
III. Results and Discussion
A. Monosaccharide Analysis
An accurate molar ratio of composite sugars relative to protein 1) provides a basis for further structural elucidation of glycoproteins, 2) provides direct evidence that the polypeptide is glycosylated, 3) suggests classes of oligosaccharide chains, and 4) may serve as a measure of production consistency for therapeutic recombinant glycoproteins (19). Glycoproteins with low percentages of glycosylation represent a challenge for monosaccharide analysis. When there are large molar ratios of peptides and amino acids relative to monosaccharides (e.g., a glycoprotein with < 5% glycosylation), monosaccharide separation and detection can be compromised by coelution of amino acids or peptides from hydrolyzed proteins. In this work, we focused on analysis of glycoproteins with low percentages of glycosylation (MAbs) and the use of sample pretreatment and internal standards to improve monosaccharide quantification. To analyze potential interference of amino acids in monosaccharide analysis, each of the 20 amino acids (10 µg each, each injected separately) was subjected to the chromatography conditions used for separating, detecting, and quantifying monosaccharides. In addition to PAD detection, we monitored UV detection at 215 nm after the electrochemical detector to verify amino acid electrochemical detection. Ten amino acids (R, K, Q, V, N, A, I, L, T and C) eluted between 2 and 25 min and were both PAD and UV active. Of these ten, two amino acids could potentially interfere with monosaccharide analysis. Glutamine was found to elute as a shoulder on mannose. However, acid hydrolysis conditions used to release monosaccharides from glycoproteins likely would oxidize glutamine. Lysine was found to elute very close to rhamnose, a monosaccharide used here as an internal standard (18). The other eight amino acids eluted either before fucose (< 5 min) or after mannose (> 20 min).
The remaining ten amino acids eluted while washing the column (25 to 35 min) or remained bound to the column.
HPAEC-PAD Analysis of Protein Glycosylation
We evaluated peptide interference by using UV detection after PAD detection to identify interfering peptides/amino acids in the MAb hydrolysates. Results of these studies are shown in Fig. 1 and show that at the levels of hydrolysate used to give quantifiable monosaccharide responses (16.7 µg injected), there is little UV response in the region of the chromatogram where monosaccharides elute. UV detectable peptides/amino acids were found to elute near the column void and after 20 min. OnGuard-A resin (50 mg), a microporous strong anion exchanger in the bicarbonate form, was used to remove peptides/amino acids (20). Under the conditions used, we determined that monosaccharides do not bind to OnGuard-A resin (data not shown). Comparison of Fig. 1A with 1B, and of 1C with 1D, reveals the results of sample pretreatment with the anion exchanger. Clearly, the sample pretreatment removed UV and PAD active components that eluted after 20 minutes. Additionally, the monosaccharide peaks appeared to be "cleaned up" after this treatment, as evidenced by a more Gaussian shape. OnGuard-A treatment will remove glutamine but not lysine (data not shown). Monosaccharide composition analysis of the intact MAb revealed monosaccharide ratios consistent with the presence of lactosamine type, fucosylated biantennary oligosaccharides with less than complete galactosylation (gal:man is < 2:3) (Table 1). The absence of galactosamine indicates the absence of O-linked glycosylation. HPAEC-PAD monosaccharide analysis of hydrolyzed heavy and light chain bands verified predominantly heavy chain glycosylation (18,21). Comparison of monosaccharide compositions of PNGase F treated heavy chain bands with corresponding untreated heavy chain bands revealed essentially complete deglycosylation (>90%) by PNGase F (18). We determined MY9-6 monosaccharides with two amounts of injected protein (4.16 and 16.7 µg) with similar results.
This analysis shows that with higher amounts of injected glycoprotein hydrolysate, there is a greater need for internal standard correction due to electrode poisoning (20). This poisoning is presumed to be primarily due to the amino acids and peptides not removed by the OnGuard-A resin.
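The normalization to three mannose residues used in Table 1 can be reproduced with a minimal sketch. The input values are the internal-standard-corrected residues per mole read from the 16.7 µg row of Table 1; assigning them to GlcN, Fuc, Gal and Man is an assumption where the printed layout is ambiguous, and the function name is illustrative.

```python
def normalize_to_mannose(residues, man_target=3.0):
    """Scale all monosaccharide values so that mannose equals man_target."""
    factor = man_target / residues["Man"]
    return {sugar: value * factor for sugar, value in residues.items()}


if __name__ == "__main__":
    # Corrected residues per mole MY9-6, 16.7 µg hydrolyzed (Table 1).
    corrected = {"GlcN": 3.85, "Fuc": 0.93, "Gal": 0.86, "Man": 2.64}
    for sugar, value in normalize_to_mannose(corrected).items():
        print(f"{sugar}: {value:.2f}")
```

Scaling to Man = 3 expresses the composition relative to the trimannosyl core, so roughly four GlcN, one Fuc and one Gal per three Man points to a core-fucosylated biantennary chain with incomplete galactosylation.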
B. Oligosaccharide Mapping
Glycosylation of the MY9-6 preparation was investigated by HPAEC-PAD oligosaccharide mapping after release of the N-linked structures by PNGase F. Oligosaccharide peak 1 (Fig. 2A) has a retention time identical to a fucosylated agalactosyl biantennary oligosaccharide standard (Table 2, structure 1). Peak 3 (Fig. 2A) has a retention time identical to a fucosylated, fully galactosylated biantennary oligosaccharide standard (Table 2, structure 3). Oligosaccharide peak 2 (Fig. 2A) has a retention time intermediate between peaks 1 and 3, perhaps suggestive of a fucosylated, monogalactosylated biantennary oligosaccharide structure (Table 2, structure 2). Alternatively, the retention time (15.3 min) of peak 2 is nearly identical to an agalactosyl biantennary oligosaccharide standard
Figure 1. Monosaccharide analysis of MAb MY9-6. CarboPac PA1 chromatography of 2M TFA hydrolysates (A & B) and 6M HCl hydrolysates (C & D). Chromatography of hydrolysates without prior OnGuard A sample pretreatment (A & C) or with prior OnGuard A sample pretreatment (B & D). Peaks are as follows: 1. fucose; 2. rhamnose; 3. mannosamine; 4. glucosamine; 5. galactose; 6. glucose; 7. mannose. Insets show UV absorbance monitored at 215 nm. Horizontal axis: Time (min).

Table 1. Monosaccharide Analysis of Monoclonal Antibody MY9-6
Residues Monosaccharide/Mole MY9-6

Amount Hydrolyzed   GlcN**               GalN**   Fuc*                 Gal*                 Man*
4.16 µg             4.22 (3.76) [3.54]   0 (0)    1.97 (0.99) [0.93]   1.14 (1.05) [0.97]   3.56 (3.30) [3.0]
16.7 µg             3.85 (3.23) [4.37]   0 (0)    0.93 (0.55) [1.06]   0.86 (0.50) [0.97]   2.64 (1.55) [3.0]

* = determined by 2M TFA, 4h, 100°C.
** = determined by 6M HCl, 4h, 100°C.
( ) = values prior to internal standard correction (rhamnose and mannosamine internal standards for the 2M TFA and 6M HCl hydrolysates, respectively).
[ ] = values normalized to man = 3.
(15.2 min; structure not shown). Thus peak 2 may contain either or both of the aforementioned structures. MAb MY9-6 also possessed charged oligosaccharides (peaks under 4, Fig. 2A). To further elucidate the identities of these oligosaccharide peaks, endoglycosidase treatment (Endo F2, Endo H) was used to classify the oligosaccharides. Exoglycosidase treatment (neuraminidase, β-galactosidase, β-N-acetylhexosaminidase) of the PNGase F released structures was then used to substantiate the preliminary identifications.
Endo F2 Treatment
Endo F2, a recently described endoglycosidase which cleaves predominantly biantennary oligosaccharides (23), was used to evaluate whether the released N-linked oligosaccharides from MAb MY9-6 were biantennary-type chains as has been reported for mouse and human IgGs (24,25). More highly branched structures have also been reported (26). Endo F2 differs from the amidase PNGase F not only in its more restricted specificity but also in that it releases oligosaccharides with only half of their chitobiose core (PNGase F releases all types of N-linked oligosaccharides with their chitobiose core intact). A released N-linked oligosaccharide with a complete chitobiose elutes earlier in HPAEC-PAD than the identical oligosaccharide with half its chitobiose; thus biantennary oligosaccharides released by Endo F2 are expected to elute later than the identical oligosaccharide if released by PNGase F (27). Also, if fucose is attached at the reducing end GlcNAc, Endo F2 treatment would leave the fucose bound to the reducing end GlcNAc still attached to the polypeptide. Because the presence of core fucosylation reduces retention times of oligosaccharides on HPAEC-PAD, Endo F2 release of oligosaccharides without the reducing end core fucosylated GlcNAc would result in a further increase in retention time for the product (when compared to the identical oligosaccharide released by PNGase F). Endo F2 digestion of agalactosylated biantennary structures either with or without core fucosylation would give an identical product. Thus if peak 2 is an agalactosylated biantennary structure (no core fucose), Endo F2 digestion of the three neutral oligosaccharides would result in two products, an agalactosylated and a digalactosylated biantennary oligosaccharide, both with half of their chitobiose.
Alternatively, if peak 2 is a monogalactosylated, core fucosylated biantennary oligosaccharide, Endo F2 treatment of the three neutral oligosaccharides would result in three products: an agalactosylated, a monogalactosylated, and a digalactosylated biantennary oligosaccharide, each with half of its chitobiose. Comparing PNGase F vs. Endo F2 digestions of MAb MY9-6 revealed similar maps with three Endo F2 released oligosaccharide products, all with somewhat longer retention times when compared to the PNGase F released oligosaccharides (compare Fig. 2A and 2B). Ratios of peak areas of PNGase F neutral peaks 1, 2, and 3 were 49% : 35% : 10%. Similarly, ratios of peak areas (expressed as percentage of total oligosaccharide peak areas) of the Endo F2 peaks 1, 2, and 3 were 53% : 35% : 7%, respectively. These results confirm that the major N-linked oligosaccharides present in MAb MY9-6 were predominantly biantennary oligosaccharides. These results also support identification of peak 2 as a monogalactosylated biantennary oligosaccharide with core fucosylation.
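The two-versus-three-product argument can be made explicit with a small sketch. It models each neutral PNGase F oligosaccharide only by its galactose count and core fucose status, which is a deliberate simplification; the tuple encoding and the function name are illustrative assumptions, not part of the original method.

```python
def endo_f2_products(glycans):
    """Return the distinct Endo F2 products for a list of biantennary glycans.

    Each glycan is (number_of_galactoses, has_core_fucose). Endo F2 cleaves
    within the chitobiose core, so the reducing-end GlcNAc and any fucose on
    it stay on the protein; the released product is described here by its
    galactose count alone.
    """
    return {galactoses for galactoses, _core_fucose in glycans}


# Hypothesis A: peak 2 is an agalactosyl biantennary chain without core fucose.
hypothesis_a = [(0, True), (0, False), (2, True)]
# Hypothesis B: peak 2 is a monogalactosylated, core-fucosylated chain.
hypothesis_b = [(0, True), (1, True), (2, True)]

n_products_a = len(endo_f2_products(hypothesis_a))
n_products_b = len(endo_f2_products(hypothesis_b))
observed_products = 3  # three Endo F2 released peaks were reported

print(n_products_a, n_products_b, observed_products == n_products_b)
```

Under hypothesis A the fucosylated and non-fucosylated agalactosyl chains collapse into one product, so only hypothesis B is consistent with three Endo F2 peaks.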
Figure 2. Oligosaccharide mapping of MAb MY9-6. CarboPac PA100 chromatography of enzyme digests of MAb MY9-6. In Panels A, B, and C the substrate was 100 µg of MAb MY9-6. In Panels D, E, and F the substrate was a PNGase F digest derived from 100 µg of MAb MY9-6. [Chromatograms not reproduced; panel titles: PNGase F, Endo F2, Endo H, Neuraminidase, β-Galactosidase, β-N-Acetylhexosaminidase; x-axis: Time (min).]
Table 2. Some Potential Oligosaccharide Structures for MAb MY9-6
1. GlcNAc(β1,2)Man(α1,6)[GlcNAc(β1,2)Man(α1,3)]Man(β1,4)GlcNAc(β1,4)[Fuc(α1,6)]GlcNAc
2. Structure 1 with Gal(β1,4) on the GlcNAc of one of the two antennae: {Gal(β1,4)}GlcNAc(β1,2)Man(α1,6)[{Gal(β1,4)}GlcNAc(β1,2)Man(α1,3)]Man(β1,4)GlcNAc(β1,4)[Fuc(α1,6)]GlcNAc
3. Gal(β1,4)GlcNAc(β1,2)Man(α1,6)[Gal(β1,4)GlcNAc(β1,2)Man(α1,3)]Man(β1,4)GlcNAc(β1,4)[Fuc(α1,6)]GlcNAc
Endo H Treatment
Endo H treatment of MAb MY9-6 was used to evaluate if oligomannosidic or hybrid structures were present (22). Figure 2C shows that no oligomannosidic or hybrid structures were released from the monoclonal preparations by Endo H treatment. Ribonuclease B was used as a positive control (data not shown). Hence, this antibody did not contain high mannose or hybrid type oligosaccharides.
Neuraminidase (Arthrobacter ureafaciens) Treatment
Neuraminidase treatment was used to evaluate whether the oligosaccharides identified as peaks under #4 in Fig. 2 Panel A in the PNGase F maps of the MY9-6 IgG preparation were modified with sialic acid. HPAEC-PAD was used to distinguish between the different forms of sialic acid (28). Treatment of half a PNGase F digest of MAb MY9-6 with neuraminidase resulted in the disappearance of the oligosaccharide peaks migrating at 40-42 min (compare Fig. 2A and 2D), confirming the presence of sialic acid. Concomitantly there was the appearance of a new single peak at 43 min, which has a retention time identical to N-glycolylneuraminic acid (Neu5Gc). The presence of Neu5Gc in mouse IgG has been reported (24). Additionally, upon neuraminidase treatment there was a 26% increase in area of peak 2 and a 123% increase in the area of peak 3. These results suggest Neu5Gc sialylation of structures in peaks 2 and 3.
β-Galactosidase (Diplococcus pneumoniae) Treatment
β-Galactosidase treatment of the PNGase F treated IgG preparations was used to assess putative galactosylation differences between peak 1 (which we believe to be an agalactosyl fucosylated biantennary oligosaccharide; Table 2, structure 1), peak 2 (which we believe to be a monogalactosylated fucosylated biantennary oligosaccharide; Table 2, structure 2), and peak 3 (which we believe to be a fully galactosylated, fucosylated biantennary oligosaccharide; Table 2, structure 3). If the proposed identifications are correct, β-galactosidase treatment would be expected to convert peaks 2 and 3 to peak 1. Additionally, such a result would confirm a β1,4 linkage for galactose in the galactosylated oligosaccharides because the Diplococcus enzyme is specific for this linkage. Treatment of the PNGase F digest of MAb MY9-6 with the Diplococcus enzyme resulted in complete disappearance of peak 3 (Fig. 2E). The area of peak 2 was reduced by 86%, while peak 1 exhibited an increase in peak area. With β-galactosidase treatment, a new peak appears at 4 min, which is likely the released galactose (Fig. 2E). New peaks between 10 and 13 min were not expected but may represent the product of contaminating hexosaminidase activity in the β-galactosidase preparation (see below for expected products of a hexosaminidase digest).
β-N-Acetylhexosaminidase (Jack bean) Treatment
The expected products of a hexosaminidase digest of structure 1 would be FucMan3GlcNAc2 and the released monosaccharide, N-acetylglucosamine. Treatment of the PNGase F digest of MAb MY9-6 with the Jack bean enzyme (Fig. 2F) resulted in complete disappearance of peak 1. A new peak with a retention time identical to a FucMan3GlcNAc2 standard (between 9 and 10 min) was observed (Fig. 2D). Additionally, a peak seen at 5 min upon hexosaminidase treatment presumably represents the released GlcNAc (Fig. 2F).
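The expected exoglycosidase products can be enumerated with a small model. This is a sketch under simplifying assumptions: each core-fucosylated biantennary glycan is represented only by the residues on its two antennae (listed outward from the mannose), and each enzyme removes a matching terminal residue once per antenna. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Biantennary:
    """Core-fucosylated biantennary glycan; each arm lists residues outward from Man."""
    arm_1_3: tuple
    arm_1_6: tuple


def trim(glycan, terminal_residue):
    """Remove terminal_residue from the end of each arm; return (product, n_released)."""
    released = 0
    arms = []
    for arm in (glycan.arm_1_3, glycan.arm_1_6):
        if arm and arm[-1] == terminal_residue:
            arm = arm[:-1]
            released += 1
        arms.append(arm)
    return Biantennary(*arms), released


def beta_galactosidase(glycan):
    """Release terminal galactose only."""
    return trim(glycan, "Gal")


def hexosaminidase(glycan):
    """Release terminal GlcNAc only; galactose-capped arms are protected."""
    return trim(glycan, "GlcNAc")


structure_1 = Biantennary(("GlcNAc",), ("GlcNAc",))
structure_2 = Biantennary(("GlcNAc", "Gal"), ("GlcNAc",))  # one of the two isomers
structure_3 = Biantennary(("GlcNAc", "Gal"), ("GlcNAc", "Gal"))

for name, glycan in [("1", structure_1), ("2", structure_2), ("3", structure_3)]:
    print(name, beta_galactosidase(glycan), hexosaminidase(glycan))
```

In this model β-galactosidase converts structures 2 and 3 into structure 1, while hexosaminidase strips both arms of structure 1 (leaving the FucMan3GlcNAc2 core), one arm of structure 2, and leaves structure 3 untouched.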
The expected products of a hexosaminidase digest of structure 2 would be two isomers in which either the Man 1-3 or the Man 1-6 antenna is terminated with Gal-GlcNAc-Man (no terminal GlcNAc, and thus not susceptible to hexosaminidase), while the second antenna is newly terminated with Man after release of the exposed terminal GlcNAc by the hexosaminidase. These two isomers would likely have retention times shorter than structure 2 (Table 2). Because there are no commercially available standards corresponding to these structures, identification by chromatographic retention time alone would not be possible. Use of a mannosidase that could distinguish terminal mannose on the Man 1-3 or Man 1-6 antennae (29) could confirm the proposed structures. Hexosaminidase treatment of MAb MY9-6 reduced the peak area of peak 2 by 51% (Fig. 2F). Additionally, several new peaks appeared between 11 and 13 min (Fig. 2F). Susceptibility of peak 2 to digestion separately by β-galactosidase and hexosaminidase further confirms that this oligosaccharide is structure 2, monogalactosylated with the ungalactosylated antenna terminated with GlcNAc. The Jack bean enzyme would not be expected to digest the fully galactosylated structure 3 (Table 2), and peak 3, which we believe is structure 3 (Fig. 2F; between 16 and 17 min), was not reduced in peak area upon hexosaminidase treatment.
In summary, HPAEC-PAD monosaccharide analysis and oligosaccharide mapping, when used in conjunction with oligosaccharide standards and endo- and exoglycosidases of well defined specificities, can be used to identify the oligosaccharide structures present in oligosaccharide maps of IgG preparations, as well as to monitor the lot-to-lot consistency of the production of therapeutic glycoproteins (30).
HPAEC-PAD, which employs automated chromatography with rugged, high resolution pellicular anion exchange columns (stable from pH 0-14 and at pressures up to 3000 psi), and direct detection of carbohydrates by PAD (thus eliminating the need for derivatization) at picomol sensitivities, offers the advantages of convenience, durability, high resolution (separation of branch, linkage, and positional isomers [31]), and faster speed of analysis (flow rates of 1 ml/min vs. 30 µl/min) when compared to traditional gel filtration methods for separating and detecting carbohydrates.
References
1. Kim, H.; Yamaguchi, Y.; Masuda, K.; Matsunaga, C.; Yamamoto, K.; Irimura, T.; Takahashi, N.; Kato, K.; and Arata, Y. J. Biol. Chem. 1994, 269, 12345-12350.
2. Nose, M.; and Wigzell, H. Proc. Natl. Acad. Sci. USA 1983, 80, 6632-6636.
3. Tao, M.H.; and Morrison, S.L. J. Immunol. 1989, 143, 2595-2601.
4. Sutton, B.J.; and Phillips, D.C. Biochem. Soc. Trans. 1983, 11, 130-132.
5. Sox, H.C.; and Hood, L. Proc. Natl. Acad. Sci. 1970, 66, 975-982.
6. Spiegelberg, H.; Abel, C.; Fishkin, B.; Grey, H. Biochemistry 1970, 9, 4217-4223.
7. Savvidou, G.; Klein, M.; Grey, A.A.; Dorrington, K.J.; Carver, J.P. Biochemistry 1984, 23, 3736-3740.
8. Taniguchi, T.; Mizuochi, T.; Beale, M.; Dwek, R.A.; Rademacher, T.W.; Kobata, A. Biochemistry 1985, 24, 5551-5557.
9. Arvieux, J.; Willis, A.C.; Williams, A.F. Mol. Immunol. 1986, 23, 983-990.
10. Wallick, S.C.; Kabat, E.A.; Morrison, S.L. J. Exp. Med. 1988, 168, 1099-1109.
11. Co, M.S.; Scheinberg, D.A.; Avdalovic, N.M.; McGraw, K.; Vasquez, M.; Caron, P.C.; and Queen, C. Mol. Immunol. 1993, 30, 1361-1367.
12. Tsai, C.M.; Zopf, D.A.; Yu, R.K.; Wistar, R.; Ginsburg, V. Proc. Natl. Acad. Sci. 1977, 74, 4591-4594.
13. Weber, R.J.; Clem, L.W. J. Immunol. 1981, 127, 300-305.
14. Lawson, E.Q.; Hedlund, B.E.; Ericson, M.E.; Mood, D.A.; Litman, G.W.; Middaugh, R. Arch. Biochem. Biophys. 1983, 220, 572-575.
15. Middaugh, C.R. and Litman, G.W. J. Biol. Chem. 1990, 262, 3671-3673.
16. Borrebaeck, C.A.K.; Malmborg, A.; and Ohlin, M. Immunol. Today 1993, 14, 477-479.
17. Hermentin, P.; Witzel, R.; Vliegenthart, J.F.G.; Kamerling, J.P.; Nimtz, M.; and Conradt, H.S. Anal. Biochem. 1992, 203, 281-289.
18. Weitzhandler, M.; Hardy, M.; Co, M.S.; and Avdalovic, N. J. Pharm. Sci. 1994, in press.
19. Townsend, R. Quantitative Monosaccharide Analysis of Glycoproteins Using HPLC, 1994, in Chromatography in Biotechnology, editors Horvath, C. and Ettre, L.S. ACS, Washington DC, ACS Symposium Series 529.
20. Rohrer, J.S.; Weitzhandler, M.; and Avdalovic, N.A. Glycobiology 1994, 4, 91.
21. Weitzhandler, M.; Kadlecek, D.; Avdalovic, N.; Forte, J.G.; Townsend, R.R. J. Biol. Chem. 1993, 268, 5121-5130.
22. Tandai, M.; Endo, T.; Sasaki, S.; Masuho, Y.; Kochibe, N.; and Kobata, A. Arch. Biochem. Biophys. 1991, 291, 339-348.
23. Tarentino, A.L.; Quinones, G.; Schrader, W.P.; Changchien, L.; and Plummer, T.H. J. Biol. Chem. 1992, 267, 3868-3872.
24. Kobata, A. Glycobiology 1990, 1, 5-8.
25. Mizuochi, T.; Hamako, J.; and Titani, K. Arch. Biochem. Biophys. 1987, 257, 387-394.
26. Krotkiewski, H.; Gronberg, G.; Krotkiewska, B.; Nilsson, B.; and Svensson, S. J. Biol. Chem. 1990, 265, 20195-20201.
27. Basa, L.J. and Spellman, M.W. J. of Chromatography 1990, 499, 205-222.
28. Anderson, D.; Goochee, C.; Cooper, G.; and Weitzhandler, M. Glycobiology 1994, in press.
29. Amano, J. and Kobata, A. J. Biochem. 1986, 99, 1645-1654.
30. Bhat, U.R. and Helgeson, E.A. Am. Biotech. Lab. January 1994, 16.
31. Hardy, M.R. and Townsend, R.R. Proc. Natl. Acad. Sci. 1988, 85, 3289-3293.
Carbohydrate Structure Characterization of Two Soluble Forms of a Ligand for the ECK Receptor Tyrosine Kinase
Christi L. Clogston, Patricia L. Derby, Robert Toso, James D. Skrine, Ming Zhang, Vann Parker, G. Michael Fox, Timothy D. Hartley, and Hsieng S. Lu
Amgen, Amgen Center, Thousand Oaks, CA 91320-1789
I. INTRODUCTION
Human B61 was originally found as the product of an immediate-early response gene induced by treatment of human umbilical vein epithelial cells with TNF-α (1) and subsequently identified as a ligand for the receptor tyrosine kinase ECK (2). The gene encodes a 187 amino acid polypeptide chain with a hydrophobic C-terminus presumably connected to a membrane-bound GPI anchor. Cells transfected with the gene coding for all 187 amino acids secreted two main soluble forms with differing C-terminal lengths (150 and 159 amino acids) and a membrane-bound form (4,5). After a B61¹⁵⁰ (the identical 150 amino acids in one of the above described soluble forms) gene construct was transfected into CHO cells, two soluble forms were observed by SDS-PAGE, with estimated Mr of 22,000 and 24,000. The proteins terminate at Ala¹⁵⁰ with a theoretical Mr of 17,788. The two forms of r-HuB61¹⁵⁰ are glycosylated at Asn⁸ and Thr¹³⁴, which accounts for the larger estimated molecular weights. After isolation and identification of the peptides containing these glycosylation sites, high pH anion exchange chromatography, matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry, and electrospray ionization mass spectrometry (ES-MS) were utilized to evaluate carbohydrate microheterogeneity.
II. METHODS
Expression of recombinant human B61¹⁵⁰ in CHO cells was achieved using the expression vector pDSVRa2 (3) carrying a B61 gene encoding the leader sequence plus the coding sequence of 150 amino acids. Purification of recombinant B61 derived from CHO cells was performed as described (6). N-terminal sequence determination of the intact protein and isolated peptides was performed by automated Edman degradation using a Hewlett Packard G1005A protein sequencer (7,8). Automated C-terminal sequence analysis was done in collaboration with Perkin Elmer-Applied Biosystems Division (9-12).
Reversed-phase HPLC of the isolated r-HuB61¹⁵⁰ was performed with a SynChrom C4 widepore column (300 Å, 4.6 x 250 mm) using a Hewlett Packard 1090M LC system equipped with a diode array detector and Chemstation. Peptide mapping was performed by digesting aliquots of B61 (154 µg of the 22KD form and 237 µg of the 24KD form in 100-200 µL) in 0.1 M CHAPS/PBS (pH 7.2) with endoproteinase Asp-N as described (4). Peptides were collected manually, Speed Vac dried, and stored at -IQfiC until further analysis. Just prior to N-terminal sequence analysis or MALDI-TOF, the dried peptide fractions were reconstituted in HPLC grade H2O (Burdick & Jackson). Glycosidase digestion of isolated N-linked glycosylated peptides by neuraminidase and N-glycanase was performed as described (13,14). High pH anion exchange chromatography of N-glycanase cleaved oligosaccharides (13) and neuraminidase digested asialo oligosaccharides was performed as described (14). MALDI-TOF was done on a Kratos Kompact Maldi III mass spectrometer fitted with a standard 337 nm nitrogen laser and operated in the linear mode at an accelerating voltage of 20 kV. The matrix used was α-cyano-4-hydroxycinnamic acid (33 mM in acetonitrile/methanol, premade from BRS) at a ratio of 1:1 with purified peptide samples. MALDI sample slides were loaded with 0.5-1.0 µL of matrix/sample mixture (estimated 1-10 pmol peptide). The data were reprocessed using the Kratos software provided with the instrument. Theoretical masses were determined by utilizing a spreadsheet in which individual peptide masses were added to all possible carbohydrate forms; these masses were then compared to the observed masses to identify structures consistent with the mass results obtained. Molecular masses of the purified peptides were determined by a Finnigan SSQ710C or a Sciex API III electrospray mass spectrometer operating in single quadrupole mode. The samples were dissolved in a mixture of H2O/MeOH/formic acid (50:50:3 volume ratio) and introduced by flow injection into the same solvent stream at 30 µL/min. For normal molecular weight measurements an orifice potential of 70 V was used. In the Sciex instrument this orifice potential affects the desolvation and also determines the extent of collisional activation. The analysis of glycopeptides was achieved by stepped orifice scans (15).
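The spreadsheet comparison of theoretical and observed glycopeptide masses described above can be sketched in a few lines. This is an illustration only, not the authors' spreadsheet: the peptide mass, the count limits, and the tolerance are hypothetical values, and the residue masses are standard average masses for the anhydro monosaccharide residues.

```python
from itertools import product

# Average residue masses (Da) of common monosaccharide residues.
RESIDUE_MASS = {"Hex": 162.14, "HexNAc": 203.19, "Fuc": 146.14, "NeuAc": 291.26}


def glycoform_masses(peptide_mass, max_counts):
    """Yield (composition, mass) for every combination of residue counts."""
    names = list(max_counts)
    ranges = [range(max_counts[name] + 1) for name in names]
    for counts in product(*ranges):
        glycan_mass = sum(c * RESIDUE_MASS[n] for n, c in zip(names, counts))
        yield dict(zip(names, counts)), peptide_mass + glycan_mass


def matching_glycoforms(observed_mass, peptide_mass, max_counts, tolerance):
    """Return the compositions whose theoretical mass is within tolerance."""
    return [
        (composition, mass)
        for composition, mass in glycoform_masses(peptide_mass, max_counts)
        if abs(mass - observed_mass) <= tolerance
    ]


# Hypothetical example: a 2000.0 Da peptide and an observed glycopeptide mass.
peptide_mass = 2000.0
observed_mass = 3769.6
limits = {"Hex": 6, "HexNAc": 5, "Fuc": 1, "NeuAc": 2}

hits = matching_glycoforms(observed_mass, peptide_mass, limits, tolerance=1.0)
for composition, mass in hits:
    print(composition, round(mass, 2))
```

Because several compositions can fall within the mass tolerance of a linear-mode instrument, the output is best read as a list of candidates to be narrowed by glycosidase digestion.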
In this mode a higher orifice potential (120 V to 140 V) was used while scanning the low mass region (m/z = 150-440) and a lower orifice potential (60 to 100 V ramp) was used while scanning the high mass region (m/z = 550-2400). SDS-polyacrylamide gel electrophoresis was carried out under reducing conditions according to Laemmli (16).
III. RESULTS AND DISCUSSION
A. N-terminal sequence analysis of r-HuB61¹⁵⁰ forms
N-terminal sequencing of both purified B61 molecular weight forms gave the predicted N-terminal sequence D-R-H-T-V-F-W-(X)-S-S-N-P-K-F-R-N-E-, etc., where (X) is predicted by the gene sequence to be N. To determine the C-terminus of the secreted polypeptide, the C-terminal peptides from both endoproteinase Asp-N digestions were isolated and analyzed. Figure 1 shows reversed-phase HPLC separation of peptides in the endoproteinase Asp-N digest. Several peptides at 58-64 minutes were found to have the disulfide-linked sequences D-R-(*)-L-R-L-K-V-T-V-S-G-K-I-(Z)-H-S-P-Q-A-H-V-N-P-Q-E-K-R-L-A-A and D-A-A-M-E-Q-Y-I-L-Y-L-V-E-H-E-E-Y-Q-L-(*)-Q-P-Q-S-K, where (*, X, or Z) denotes an unassigned residue. These sequences correspond to positions 120-150 and 43-67 of human B61¹⁵⁰. The human B61 gene encodes Thr at the position indicated as (Z), and the lack of an assignment here could reflect the presence of O-linked carbohydrate. The human B61 gene encodes Cys at the positions indicated as (*).
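The stated positions (120-150 and 43-67) can be cross-checked against the lengths of the two sequences, which also locates the unassigned residues. The helper below is a hypothetical illustration; it assumes the sequences as written above, with one token per residue.

```python
def residue_positions(sequence, start, markers=("(*)", "(Z)")):
    """Return (end_position, {position: marker}) for a dash-separated sequence."""
    residues = sequence.split("-")
    end = start + len(residues) - 1
    marked = {start + i: r for i, r in enumerate(residues) if r in markers}
    return end, marked


peptide_a = "D-R-(*)-L-R-L-K-V-T-V-S-G-K-I-(Z)-H-S-P-Q-A-H-V-N-P-Q-E-K-R-L-A-A"
peptide_b = "D-A-A-M-E-Q-Y-I-L-Y-L-V-E-H-E-E-Y-Q-L-(*)-Q-P-Q-S-K"

end_a, marked_a = residue_positions(peptide_a, start=120)
end_b, marked_b = residue_positions(peptide_b, start=43)

print(end_a, marked_a)
print(end_b, marked_b)
```

Counting this way places the unassigned (Z) residue, where the gene encodes Thr, within the C-terminal peptide, and one unassigned Cys position in each of the two disulfide-linked peptides.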
B. C-terminal sequence analysis of r-HuB61¹⁵⁰ forms
C-terminal sequencing of both purified B61 forms produced the predicted C-terminal sequence -L-A-A-COOH, indicating that both forms did terminate at
Figure 1. Endoproteinase Asp-N peptide mapping of r-HuB61 24KD (chromatogram A) and 22KD (chromatogram B) forms.
Figure 2. SDS-PAGE of r-HuB61¹⁵⁰ forms. Lanes 1, 2: E. coli control, N- and O-glycanase treated and untreated. Lane 3: N- and O-glycanase treated. Lane 4: N-glycanase treated. Lane 5: untreated. Lane 6: r-HuB61 ^^'^ treated with N- and O-glycanase. Lane 7: r-HuB61 ^^^ untreated.
0.03 µg/mm²; therefore, we use the latter as the minimum recommended protein/gel density (following staining). To reach this latter density typically requires that at least 1-2 µg of the protein of interest be loaded/lane in a 0.75-1 mm thick gel. The average percent initial sequencing yield of the 46 peptides summarized in Table I was 14% as calculated from the ratio of the initial pmol peptide sequencing yield to the amount of protein that was actually digested (as determined by amino acid analysis of an aliquot of the gel piece). As suggested by the mean (and particularly the median) initial yields reported in Table I, the initial sequencing yield appears to correlate more strongly with the density as opposed to the total amount of protein in the gel. The latter again emphasizes the importance of doing everything possible to maximize the protein density in the polyacrylamide gel matrix. Since neither the initial sequencing yield nor the overall success rate appears to correlate well with the total amount of protein submitted in the gel, it appears that we have not yet reached the lower limits of this approach (in terms of the least amount of protein that is required to succeed). So far, the least amount of an unknown protein we have subjected to in gel digestion was 26 pmol (2.8 µg) of a 106 kD protein that provided 25 residues of positively called sequence from the 2 peptides sequenced. Database searching of these two sequences indicated they matched published sequences for α-actinin. HPLC profiles that resulted from the in gel tryptic digest of 34 pmol (2.1 µg) of a 62 kD protein are shown in Fig. 1. At this low level there are at least 5-6 peaks present in the blank chromatogram that are similar in size to peaks
Kenneth R. Williams and Kathryn L. Stone
Table I. Summary of results from the in gel tryptic digestion of 25 proteins
Parameter | Amount of Protein Digested (pmol) | | | | Total
 | | | | 200 |
Number of proteins digested | 5 | 5 | 8 | 7 | 25
Average mass of protein (kD) | 70 | 36 | 65 | 66 | 60
Average amount of protein digested (pmol) | 38 | 84 | 146 | 322 | 161
Average density of protein band (µg/mm²) | 0.14 | 0.054 | 0.25 | 0.40 | 0.23
Number of peptides sequenced | 7 | 10 | 19 | 10 | 46
% Peptides sequenced that provided >6 positive residues | 100 | 91 | 83 | 91 | 89
Average % initial yield* | 25.1 | 6.7 | 13.1 | 15.2 | 14.0
Median % initial yield* | 11.1 | 2.9 | 9.1 | 13.3 | 10.0
Average number of residues called/peptide sequenced | 15.0 | 12.0 | 11.8 | 14.8 | 13.0
% "Unknown" proteins identified via database searches | 60 | 20 | 38 | 50 | 46
Overall digest success rate** | 100 | 100 | 100 | 86 | 92
*Calculated from the ratio of the initial sequencing yield to the amount of protein that was digested, as judged by hydrolysis/amino acid analysis of a ~10% aliquot of the stained gel piece.
**A digest was scored as a success if at least 12 residues of positively called sequence was obtained from 1-2 peptides.
that are unique to the actual protein digest shown in the lower panel. Although the origin of the peaks in the 30-95 min region of the blank run is not known, the peaks that elute after this point are due to residual Coomassie Blue. By only selecting peaks for further analysis that elute in the 30-95 min "window" and that are unique to the protein digest, it is possible to avoid most artifact peaks that result from trypsin autolysis, Coomassie Blue and other reagents. The impact that routine peptide LDMS "screening" has on internal sequencing is reflected by the fraction of peptides that were subjected to sequencing that provided usable data. That is, with prior LDMS analysis, approximately 90% of the peptides selected for sequencing provided at least 6 residues of useful sequence with the overall average number of residues sequenced/peptide being 13 (Table I). In a previous study, without LDMS "screening" only 67% of peptides selected for sequencing fell into this same category (13). In the latter study ~17% of the peptides proved to contain mixtures and another ~16% failed to yield any sequence whatsoever (13). Based on our experience, LDMS readily differentiates peptides from Coomassie Blue and other reagent peaks and (by comparison to a table of predicted peptide masses) detects most trypsin autolysis products (that are not present in the blank
In-Gel Digestion and Sequencing of 25 Proteins
Figure 1. Reverse phase HPLC separation of an in gel tryptic digest of 34 pmol (2.1 µg) of a 62 kD protein (lower panel). The upper panel shows the profile that resulted from incubating an equal size slice of polyacrylamide gel that did not contain protein. The 4 peaks in the 96-102 min region are due to residual Coomassie Blue. The digest and HPLC separation were carried out as described in Materials and Methods. [Chromatograms not reproduced; x-axis: Time (min).]
chromatogram). In addition, by only sequencing peptide peaks that have a major/minor LDMS peak ratio that is >10, we have found that LDMS can serve as a valuable criterion (in addition to HPLC absorbance peak shape) of peptide purity. Thus, in our laboratory, LDMS screening of peptide fractions has decreased the average number of peptides that must be sequenced/protein to obtain definitive internal sequences. Finally, the routine mass accuracy (±0.25% with external calibration without a reflectron) of our LDMS analyses is sufficiently high that it allows accurate prediction of the end of peptides sequenced by Edman degradation and sometimes allows tentative assignment of a "missing" residue based on mass comparison. Since the sensitivity of LDMS is so high (i.e., most tryptic peptides can be easily detected in the 20-500 fmol range), we routinely use an average of only ~3% of each peptide fraction for this analysis.
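The quantities used in this section (protein mass from pmol and molecular mass, percent initial sequencing yield, and the LDMS major/minor peak criterion) reduce to simple arithmetic. The sketch below is illustrative; the function names are hypothetical, and only the 26 pmol/106 kD and 34 pmol/62 kD figures and the ratio threshold of 10 come from the text.

```python
def protein_mass_ug(pmol, mass_kd):
    """Convert pmol of a protein of mass_kd (kD) to micrograms.

    1 pmol of a 1 kD protein weighs 1 ng, so pmol x kD gives ng.
    """
    return pmol * mass_kd / 1000.0


def initial_yield_percent(initial_pmol_sequenced, pmol_digested):
    """Initial sequencing yield as a percentage of the protein digested."""
    return 100.0 * initial_pmol_sequenced / pmol_digested


def passes_ldms_purity(major_peak, minor_peak, min_ratio=10.0):
    """Purity criterion from the text: major/minor LDMS peak ratio > 10."""
    return minor_peak == 0 or major_peak / minor_peak > min_ratio


print(round(protein_mass_ug(26, 106), 1))  # 106 kD protein, 26 pmol
print(round(protein_mass_ug(34, 62), 1))   # 62 kD protein, 34 pmol
print(passes_ldms_purity(major_peak=50.0, minor_peak=4.0))
```

Expressing the yield relative to the amount actually digested, rather than the amount loaded, is what allows a peptide to be attributed to the major component of a gel band.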
B. Suggestions for Optimizing Internal Sequencing from In Situ Gel Digests
1. Quantify the amount of protein prior to digestion
Hydrolysis and ion exchange amino acid analysis of an aliquot of the gel band provides a more accurate estimate of the amount of protein than can be obtained by comparing relative Coomassie Blue staining intensities. By detecting samples that contain too little protein or that are at too low density before digestion and HPLC, the investigator can be warned that the pending digest is likely to fail and that additional material should be isolated before proceeding. Obviously, this approach is likely to increase the success rate of in gel digestions. In addition, by quantifying the amount of protein that has been digested, it is possible to calculate the overall recovery of any peptide sequenced and hence, often determine whether it was derived from the major component in a gel band. That is, if the recovery of a given peptide (in terms of pmol sequencing yield/pmol of protein digested) is above average, clearly, it must have derived from a major component in the preparation.
2. Carry out a positive and negative digest control
While the negative control (no substrate protein) is helpful in quickly identifying reagent peaks and autolysis products derived from the protease, the positive control (i.e., a gel slice containing a similar amount of a standard protein such as transferrin) is useful for continuous optimization of the procedures that are being used, for verifying the activity of the protease and for providing a "benchmark" against which the yield of unknown peptides (as judged by the average peak heights obtained from the HPLC tracing) can be quickly compared.
3. "Screen" selected peptide peaks by LDMS prior to sequencing
As indicated above, there are many advantages to LDMS analysis of peptide peaks that are candidates for Edman degradation. Since we have previously sent numerous LDMS targets for analysis by an "outside" facility, we know it is quite possible to use this approach even if a laser desorption mass spectrometer is not immediately available.
4. Routinely include an internal sequencing standard with all samples
As detailed in reference 11, this practice allows the continuous optimization/on-line monitoring of sequencer performance that is so critical to being able to routinely sequence the
[Gel image not reproduced; molecular weight markers: 50, 36, 30, 16, 8, and 4 kDa.]
Figure 3. PsaC and Apomyoglobin, electroblotted onto ProBlott
Fe-S protein involved in photosynthetic electron transport, and is expressed in E. coli in inclusion bodies. The protein is conveniently isolated for sequencing by SDS-PAGE. In Figure 3, the protein bands immobilized on a ProBlott membrane are revealed after electrophoresis, transfer, and staining. Table II shows the initial yields for each of the blotted samples, and the total number of residues identified. Initial yields of the PsaC protein are roughly linear over the range of concentrations. Sequencing for myoglobin was successful, although the initial yields were not linear. For these experiments, a smaller cartridge (6 mm) was used for the sequencing reactions. This cartridge was not designed to accommodate the large pieces of membrane (i.e., the myoglobin bands), likely contributing to the inconsistencies in sequencing yields.
Table II. C-terminal Sequencing Initial Yields of SDS-PAGE-blotted Samples
Picomoles Loaded on Gel | Initial Yield (pmol), PsaC | Initial Yield (pmol), Apomyoglobin | Number of Residues Identified
150 | 8.6 | - | 3
200 | - | 9.6 | 3
300 | 23.1 | - | 4
500 | - | 44.1 | 4
600 | 46.0 | - | 4
1000 | - | 33.5 | 4
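The statement that PsaC yields are roughly linear while the apomyoglobin yields are not can be checked by expressing each initial yield as a percentage of the amount loaded. This is a rough sketch: the pairing of yields with loads is read from Table II, and the "spread" measure is an assumption chosen for illustration rather than a statistic used by the authors.

```python
# Initial yields (pmol) keyed by picomoles loaded on the gel, read from Table II.
psac = {150: 8.6, 300: 23.1, 600: 46.0}
apomyoglobin = {200: 9.6, 500: 44.1, 1000: 33.5}


def percent_yields(series):
    """Initial yield as a percentage of the amount loaded, for each load."""
    return {loaded: 100.0 * y / loaded for loaded, y in series.items()}


def spread(series):
    """Ratio of highest to lowest percent yield; 1.0 means strictly proportional."""
    values = list(percent_yields(series).values())
    return max(values) / min(values)


for name, series in [("PsaC", psac), ("Apomyoglobin", apomyoglobin)]:
    formatted = {k: round(v, 1) for k, v in percent_yields(series).items()}
    print(name, formatted, round(spread(series), 2))
```

A spread close to 1 indicates that yield scales with load; the larger value for apomyoglobin is consistent with the cartridge-size explanation given in the text.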
MeriLisa Bozzini et al.
Figure 4. C-terminal sequencing analysis of the electroblotted PsaC protein. [Chromatograms not reproduced; panels: Residue 1, Residue 2, Residue 3, Residue 4.]
Figure 4 shows the C-terminal sequencing data for the electroblotted PsaC protein. Approximately 300 pmol of the protein was loaded onto the tricine gel; therefore, an estimated 150-250 pmol of protein was transferred to the ProBlott membrane for sequencing. The C-terminal sequence was confirmed to be NH-...Gly-Leu-Ala-Tyr-COOH.
IV. Conclusions
The ability to sequence proteins from the C-terminus enhances and expands the analytical methods available for protein characterization. Although we are currently in the development stages of the chemical degradation process for C-terminal sequencing, the Perkin-Elmer method has proven utility in the protein analysis laboratory. We have demonstrated the chemistry's robustness in confirming the expected sequence of recombinant proteins. Equally valuable, this chemistry method can identify post-translational modifications and sample heterogeneity. Our sequencing method has been demonstrated on proteins obtained from several independent laboratories. Furthermore, the information obtained has been complementary to and consistent with data obtained from other analytical techniques, such as peptide mapping, amino acid analysis, and mass spectrometry. In addition, the proteins were sequenced from a conventional sequencing matrix, PVDF, introduced either by centrifugation or electroblotting from a gel. As little as 150 pmol of protein was sequenced for 3 to 5 cycles from the blotted proteins. Although some amino acid residues interfere with our ability to sequence all protein samples, many proteins have been sequenced successfully by this new C-terminal sequencing method, which continues to improve and, thereby, provides a useful analytical tool.
Applications of C-Terminal Sequencing
Acknowledgments
We gratefully acknowledge the following contributors for their expertise in preparing the protein samples referred to in this manuscript: Jason Kass, Mark Vandenberg, and Nicholas Chester, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, for their work on CKII. Work at Cold Spring Harbor is supported by a Cancer Research Fund of the Damon Runyon-Walter Winchell Foundation Fellowship, DRG-1193 (to J.L.H.), by grant #CB-72 from the American Cancer Society (to D.R.M.) and by grant AG10208 from the Public Health Service (to D.R.M.). John Golbeck, The University of Nebraska, Lincoln, Nebraska, for his work on PsaC. Zafeer Ahmad and Fazul Khan, Hoffmann-La Roche, Nutley, NJ, for their work on HMW rIL-2. Ling Chen, Perkin-Elmer, Applied Biosystems Division, for mass analysis. We also wish to thank Anne Marie Bozzini and Karen Felker for their editorial assistance and desktop publishing expertise.
References
1. Rangarajan, M. (1988). In "Protein/Peptide Sequence Analysis: Current Methodologies", A.S. Bhown, ed. 135-144.
2. Inglis, A.S. (1991). Anal. Biochem. 195, 183-196.
3. Boyd, V.L., Bozzini, M., Zon, G., Noble, R.L., and Mattaliano, R.J. (1992). Anal. Biochem. 206, 344-352.
4. Boyd, V.L., Bozzini, M., Zon, G., Noble, R.L., and Mattaliano, R.J. (1992). "A New Chemical Method for Protein C-terminal Sequence Analysis." Presented at the Sixth Symposium of the Protein Society, San Diego, CA.
5. Guga, P.J., Bozzini, M., DeFranco, R.J., Large, G.B., and Boyd, V.L. (1993). "C-terminal Sequence Analysis of the Amino Acid Residues with Reactive Side-Chains: Ser, Thr, Cys, Glu, Asp, His, Lys." Presented at the Seventh Symposium of the Protein Society, San Diego, CA.
6. Bozzini, M., DeFranco, R.J., Guga, P.J., Mattaliano, R.J., and Boyd, V.L. (1993). "C-terminal Sequencing Automation and Performance Assessment for the Alkylated Thiohydantoin Method." Presented at the Seventh Symposium of the Protein Society, San Diego, CA.
7. Ju, G., Collins, L., Kaffka, K.L., Tsien, W.-H., Chizzonite, R., Crowl, R., Bhatt, R., and Kilian, P.L. (1987). J. Biol. Chem. 262, 5723-5731.
8. Ahmad, Z., Ciolek, D., Pan, Y.-C.E., Michel, H., and Khan, F. (1994). J. Protein Chemistry (in press).
9. Zhao, J., Snyder, W.B., Muhlenhoff, U., Rhiel, E., Warren, P.V., Golbeck, J.H., and Bryant, D.A. (1993). Mol. Microbiol. 9, 183-194.
10. Schagger, H. and von Jagow, G. (1987). Anal. Biochem. 166, 368-379.
C-Terminal Sequence Analysis of Polypeptides Containing C-Terminal Proline
Jerome M. Bailey, Oanh Tu, Gilbert Issai, and John E. Shively
Beckman Research Institute of the City of Hope, Division of Immunology, Duarte, CA 91010
I. INTRODUCTION
The last few years have seen a renewed interest in the development of a chemical method for the sequential C-terminal sequence analysis of proteins and peptides. Such a method would be analogous and complementary to the Edman degradation commonly used for N-terminal sequence analysis (1). It would also be invaluable for the sequence analysis of proteins with naturally occurring N-terminal blocking groups, for the detection of post-translational processing at the carboxy-terminus of expressed gene products, and for assistance in the design of oligonucleotide probes for gene cloning. Although a number of methods have been described, the method known as the "thiocyanate method", first described in 1926 (2), has been the most widely studied and appears to offer the most promise due to its similarity to current methods of N-terminal sequence analysis. Work performed in our laboratory over the last several years has systematically addressed many of the problems associated with the thiocyanate chemistry. The use of sodium or potassium trimethylsilanolate for the cleavage reaction provided a method for rapid and specific hydrolysis of the derivatized C-terminal amino acid, which left the shortened peptide with a free C-terminal carboxylate ready for continued rounds of sequencing (3). The use of diphenylphosphoroisothiocyanatidate (DPP-ITC) and pyridine combined the activation and derivatization steps and
permitted the quantitative conversion of 19 of the 20 common amino acids (the exception being proline) to a thiohydantoin derivative. These improvements permitted application of the C-terminal chemistry to a wide variety of protein samples with cycle times similar to those employed for N-terminal sequence analysis (4). The introduction of Zitex (porous Teflon) as a support for protein sequencing permitted the C-terminal sequence analysis of protein samples that were non-covalently applied to the sequencing support (4,5). The inability of C-terminal proline to be derivatized to a thiohydantoin has been a major impediment to the development of a routine method for the C-terminal sequence analysis of proteins and peptides. Since the method was first described in 1926 (2), the derivatization of C-terminal proline has been problematic. While over the years a few investigators have reported the derivatization of proline, either as the free amino acid or on a peptide, to a thiohydantoin (6-8), others have been unable to obtain any experimental evidence for the formation of a thiohydantoin derivative of proline (9-12). Recently, utilizing a procedure similar to that described by Kubo et al. (6), Inglis et al. (13) have described the successful synthesis of thiohydantoin proline from N-acetylproline. This was done by the one-step reaction of acetic anhydride, acetic acid, trifluoroacetic acid, and ammonium thiocyanate with N-acetylproline. We have reproduced this synthesis and further developed it into a large-scale synthesis of TH-proline. We also describe the development of chemistry based on the DPP-ITC/pyridine reaction which permits the efficient derivatization and hydrolysis of peptidyl C-terminal proline to a thiohydantoin, and discuss the integration of this chemistry into an automated method for the C-terminal sequence analysis of polypeptides containing C-terminal proline.
II. MATERIALS AND METHODS
Materials. Diphenyl chlorophosphate, acetic anhydride, trimethylsilylisothiocyanate, anhydrous dimethylformamide (DMF), anhydrous acetonitrile, and anhydrous pyridine were from Aldrich. Water was purified on a Millipore Milli-Q system. Sodium trimethylsilanolate was obtained from Fluka. Diphenyl phosphoroisothiocyanatidate was synthesized as described (14). All of the peptides used in this study were obtained from either Bachem or Sigma. N-Acetylproline was from Sigma.
Diisopropylethylamine (sequenal grade), trifluoroacetic acid (sequenal grade), and 1,3-dicyclohexylcarbodiimide (DCC) were obtained from Pierce. The carboxylic acid modified polyethylene membranes were from the Pall Corporation (Long Island, NY). Zitex G-110 was from Norton Performance Plastics (Wayne, NJ). The amino acid thiohydantoins used in this study were synthesized as described (9). The Reliasil HPLC columns used in this study were obtained from Column Engineering (Ontario, CA).
Synthesis of Thiohydantoin Proline. Acetic anhydride (100 ml), acetic acid (20 ml), and trifluoroacetic acid (10 ml) were added to N-acetylproline (500 mg). The mixture was stirred until dissolved. Trimethylsilylisothiocyanate (3 ml) was added and the mixture was stirred at 60°C for 90 min. The reaction was dried to a powder by rotary evaporation and water (50 ml) was added. This solution was again dried by rotary evaporation and water (20 ml) was added. A white powder formed. The solution was kept on ice for approximately 30 min. The powder (thiohydantoin proline) was collected by vacuum filtration. The yield was approximately 40%. The product was characterized by UV, FAB/MS, and NMR. The UV absorption spectrum had a λmax of 271 nm in methanol. FAB/MS gave the expected MH+ = 157. NMR δ 4.32 (Hα, m), 3.85 (Hδ, m), 3.43 (Hδ, m), 2.20 (Hγ and Hβ, m), 1.70 (Hβ, m).
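The reported FAB/MS value can be checked with a short calculation. The sketch below assumes that thiohydantoin proline has the elemental composition C6H8N2OS (proline plus HSCN minus one water); that formula and the helper name `monoisotopic_mass` are illustrative assumptions and are not given in the text.

```python
# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "N": 14.003074,
                "O": 15.994915, "S": 31.972071}
PROTON = 1.007276  # mass of H+ in u

def monoisotopic_mass(composition):
    """Sum isotope masses for a composition dict such as {"C": 6, "H": 8}."""
    return sum(MONOISOTOPIC[element] * count
               for element, count in composition.items())

# Assumed composition of thiohydantoin proline (not stated in the chapter).
TH_PROLINE = {"C": 6, "H": 8, "N": 2, "O": 1, "S": 1}

neutral_mass = monoisotopic_mass(TH_PROLINE)
mh_plus = neutral_mass + PROTON
print(f"M = {neutral_mass:.3f} u, MH+ = {mh_plus:.3f}")
```

Under that assumed composition the protonated ion rounds to the nominal mass of 157 quoted for the product.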
Covalent Coupling of Peptides to Carboxylic Acid Modified Polyethylene. Peptides were covalently coupled to carboxylic acid modified polyethylene and quantitated as described (15).
Application of Protein Samples to Zitex. The Zitex support (2 x 10 mm) was pre-wet with isopropanol and protein samples (25 µl) dissolved in water were applied. The samples were allowed to dry before sequencing.
HPLC Separation of the Amino Acid Thiohydantoins. Reverse phase HPLC separation of the thiohydantoin amino acid derivatives (400 pmol) was performed on a C-18 (3 µm, 100 Å) Reliasil column (2.0 mm x 250 mm) on a Beckman 126 Pump Module with a Shimadzu (SPD-6A) detector (Figure 1). The column was eluted for 2 min with solvent A (0.1% trifluoroacetic acid in water), followed by a discontinuous gradient to solvent B (10% methanol, 10% water, 80% acetonitrile), at a flow rate of 0.15 ml/min at 35°C. The gradient used was as follows: 0% B for 2 min, 0-4% B over 3 min, 4-35% B over 35 min, 35-45% B over 5 min, and 45-0% B over 3 min. Absorbance was monitored at 265 nm.
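The gradient program above can be written as a table of breakpoints, which makes the total run time and solvent consumption easy to check. This is a minimal sketch: the breakpoint list assumes that the initial 2 min of solvent A is the same interval as the "0% B for 2 min" step, and the names `GRADIENT` and `percent_b` are illustrative.

```python
# (time in min, %B) breakpoints read from the gradient description.
# Assumption: the 2 min elution with solvent A is the first "0% B" step.
GRADIENT = [(0, 0), (2, 0), (5, 4), (40, 35), (45, 45), (48, 0)]
FLOW_ML_MIN = 0.15

def percent_b(t):
    """Linearly interpolate %B at time t (min) within the program."""
    if not GRADIENT[0][0] <= t <= GRADIENT[-1][0]:
        raise ValueError("time outside gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)

run_time_min = GRADIENT[-1][0]
total_volume_ml = run_time_min * FLOW_ML_MIN
print(run_time_min, round(total_volume_ml, 2), percent_b(22.5))
```

With these breakpoints the program lasts 48 min, which agrees with the last tick on the retention-time axis of Figure 1.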
Automation of the C-Terminal Sequencing Chemistry. The instrument used for automation of the chemistry described in this manuscript has been described previously (5).
Figure 1. HPLC Separation of the Amino Acid Thiohydantoins. [Chromatogram not reproduced; x-axis: Retention Time (min.).]
III. RESULTS AND DISCUSSION
Chemistry for the Automated C-Terminal Sequence Analysis of Proline Containing Polypeptides. When the acetic anhydride/TMS-ITC/TFA procedure used for the synthesis of TH-proline was applied in our laboratory to the tripeptide N-acetyl-Ala-Phe-Pro, thiohydantoin proline was formed in low yield (approx. 1-2% of theoretical). Recovery of the peptide products after the reaction revealed that approximately half of the starting peptide was unchanged and the remaining half had been decarboxylated at the C-terminus, thereby blocking it to C-terminal sequence analysis. This was most likely caused by the high concentration of trifluoroacetic acid, the excess of acetic anhydride present, and the high temperature (80°C) at which the reaction was performed. The poor reaction with C-terminal proline most likely stems from the fact that proline cannot form the necessary oxazolinone for efficient reaction with the isothiocyanate. Work in our laboratory has obviated the need for oxazolinone formation by the use of diphenyl phosphoroisothiocyanatidate and pyridine. Reaction of this reagent with C-terminal proline directly forms the acylisothiocyanate. Once the acylisothiocyanate is formed, the addition of either liquid or gas phase acid followed by water was
found to release proline as a thiohydantoin amino acid derivative. Unlike thiohydantoin formation with the other 19 naturally occurring amino acids, C-terminal proline thiohydantoin requires the addition of acid to provide a hydrogen ion for protonation of the thiohydantoin ring nitrogen. This step is necessary for stabilization of the proline thiohydantoin ring. The resulting quaternary amine containing thiohydantoin can then be readily hydrolyzed to a shortened peptide and thiohydantoin proline by introduction of water vapor or by the addition of sodium trimethylsilanolate (the reagent normally used for cleavage of peptidylthiohydantoins). The automation of this chemistry has allowed proline to be analyzed in a sequential fashion without affecting the chemical degradation of the other amino acids.
[Reaction scheme not reproduced. Surviving labels: diisopropylethylamine; Peptide Mixed Anhydride; pyridine; Peptide Isothiocyanate.]
[...] 8000 Å) which transect the particle and confer two distinct advantages over non-porous supports: (i) increased surface area, and hence higher mass loading, and (ii) the minimisation of slow diffusive flow, which permits rapid partitioning, and hence resolution, of proteins and peptides within the supports over a wide range of high flow velocities (1000-9000 cm/h). Although ultrafast (15-90 sec duration) protein and peptide separations have obvious analytical applications [9,10], the high flow velocities involved (>4000 cm/h) severely restrict their preparative application, since the peak bandwidths involved (typically 1-2 sec) present the researcher with the technical challenge of collecting fractions at these flows; a problem akin to trying to collect precision fractions from a "high-pressure garden hose". However, when these macroporous supports are used at flow velocities in the range 500-1000 cm/h, peak bandwidths of 10-20 sec are obtained with only a minimal decrease in resolution, but in a time-frame that permits manual fraction collection. One important consideration that is often overlooked is that macroporous supports do not possess the high efficiencies of conventional supports when operated at < 300 cm/h, the flow velocities deemed optimal for the latter supports. As the flow velocities are increased, the efficiency of conventional supports deteriorates while that of the macroporous supports varies little. However, at flow velocities of ~3500 cm/h, the chromatographic behaviour of both supports is comparable. Here we describe a protocol for fast chromatographic analysis of proteins and peptides using conventional "wide pore" supports and standard liquid chromatographs. Using flow velocities of 500-1000 cm/h (0.3-0.6 ml/min for a 2.1 mm I.D. column) and a constant linear gradient volume of 6 ml, peptide separations can be achieved in ~10 min; almost an order of magnitude faster than standard chromatographic conditions.
Under these conditions, highly efficient, reproducible peptide separations (0.009 RSD) can be achieved on conventional columns. Additionally, we present our modified procedure for performing in-gel proteolytic digestion of 2-D gel protein spots for the purpose of identifying proteins by peptide-mass fingerprinting along with microsequencing.
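The correspondence between volumetric flow rate and linear flow velocity quoted above follows from the column cross-section. The sketch below assumes the velocities are superficial (empty-tube) values, that is, flow divided by the full cross-sectional area of the 2.1 mm I.D. column; the function names are illustrative.

```python
import math

def linear_velocity_cm_per_h(flow_ml_min, column_id_mm):
    """Superficial linear velocity = volumetric flow / empty-tube cross-section."""
    radius_cm = column_id_mm / 20.0            # mm diameter -> cm radius
    area_cm2 = math.pi * radius_cm ** 2
    return flow_ml_min * 60.0 / area_cm2       # 1 ml = 1 cm^3

def gradient_time_min(gradient_volume_ml, flow_ml_min):
    """Duration of a gradient of fixed volume at a given flow rate."""
    return gradient_volume_ml / flow_ml_min

for flow in (0.3, 0.6):
    velocity = linear_velocity_cm_per_h(flow, 2.1)
    print(f"{flow} ml/min -> {velocity:.0f} cm/h, "
          f"6 ml gradient in {gradient_time_min(6.0, flow):.0f} min")
```

On this basis 0.3 and 0.6 ml/min correspond to roughly 520 and 1040 cm/h, and a 6 ml gradient at 0.6 ml/min takes about 10 min, in line with the figures given in the text.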
Rapid Separations Using Conventional Silica-Based Supports
II. Materials and Methods
All protein standards were from Sigma, Coomassie Brilliant Blue from LKB-Pharmacia, and HPLC-grade solvents from Mallinckrodt; trifluoroacetic acid (TFA) was obtained from Pierce. Proteolytic digestion of standard proteins was performed at 37°C for 16 h in 0.2 M ammonium bicarbonate using trypsin (Promega) at a 1:50 enzyme:substrate ratio. Homogeneous 10% Tris-glycine gels were from Novex. Chromatography of proteins and peptides was performed on a Hewlett-Packard liquid chromatograph (model 1090A) as described [11] using (i) Brownlee RP-300, 7 µm dimethyloctyl silica packed into a 100 mm x 2.1 mm I.D. cartridge (Applied Biosystems), and (ii) Poros RII/H, 10 µm divinylbenzene crosslinked polystyrene packed into a 100 mm x 2.1 mm I.D. column (Perseptive Biosystems). Preparation of whole-cell lysates of the human colon carcinoma cell line LIM 1863, 2-D gel and SDS-PAGE analysis were performed as described [12,13]. In-gel proteolysis of gel protein spots was essentially as described [14] with the modifications shown in Table 1.

Table 1. Protocol for in-gel protein digestion
1. Run 1-D or 2-D acrylamide gel. Visualize proteins with Coomassie Blue
   Stain gel: 50% MeOH, 10% HOAc, 0.1% CBR250 (~5-10 min)
   Destain: 12% MeOH, 7% HOAc, for 1-1.5 h (with ~3 changes)
2. In-gel digestion:
   (i) Excise stained gel band
   (ii) Wash twice (~200 µl 0.2 M NH4HCO3 / 50% CH3CN) for 30 min at 30°C
   (iii) Dry gel band completely in Savant (~30 min)
   (iv) Rehydrate gel band with trypsin solution (~0.5-1.0 µg trypsin in 10 µl 0.2 M NH4HCO3, 0.5 mM CaCl2), 15-30 min. Repeat step
   (v) Add 150 µl of digestion buffer (0.2 M NH4HCO3, 0.5 mM CaCl2). Incubate at 37°C, ~16 h
3. Peptide extraction:
   (i) Collect digest buffer
   (ii) Add 200 µl 1% TFA, sonicate ~30 min (35-40°C), collect extract
   (iii) Add 200 µl 0.1% TFA/60% CH3CN, sonicate ~30 min (35-40°C), collect extract
   (iv) Pool extracts, concentrate by centrifugal lyophilization to ~10-20 µl
4. Peptide identification: Separate peptides by either rapid microbore RP-HPLC (for Edman degradation) or rapid capillary RP-HPLC (for LC/MS/MS or peptide-mass fingerprinting)
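The extraction volumes in step 3 of Table 1 imply a sizeable concentration step before chromatography. The following sketch adds up the nominal volumes and estimates the concentration factor; it assumes the collected digest buffer is the full 150 µl added in step 2(v) and ignores liquid retained by the gel, so the numbers are approximate.

```python
# Nominal volumes (µl) collected in step 3 of Table 1.
# Assumption: the digest buffer is recovered in full (150 µl).
EXTRACT_VOLUMES_UL = {
    "digest buffer": 150,
    "1% TFA extract": 200,
    "0.1% TFA / 60% CH3CN extract": 200,
}
FINAL_VOLUME_RANGE_UL = (10, 20)   # target volume after centrifugal lyophilization

pooled_ul = sum(EXTRACT_VOLUMES_UL.values())
fold_low = pooled_ul / FINAL_VOLUME_RANGE_UL[1]    # least concentrated case
fold_high = pooled_ul / FINAL_VOLUME_RANGE_UL[0]   # most concentrated case
print(f"pooled {pooled_ul} µl, concentrated {fold_low:.1f}- to {fold_high:.1f}-fold")
```

Roughly 550 µl of pooled extract is therefore reduced some 25- to 55-fold before injection.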
Robert L. Moritz et al.
III. Results and Discussion
Binding Capacity and Mass Transfer Kinetics
Frontal analysis chromatography, using a 1 mg/ml solution of lysozyme in aqueous 0.1% TFA, was performed at varying flow velocities with a conventional wide-pore (300 Å) column and a macroporous (8000 Å) column in order to evaluate their binding capacities as well as their mass transfer kinetics. It can be seen in Fig. 1 that the total protein binding capacity (saturation level) was significantly greater (~3-fold) for the silica-based 300 Å support (~11.5 mg) than for the macroporous support (~4 mg). As expected, the protein saturation level in both cases was independent of flow velocity over the range examined. However, it should be noted that for the silica-based support, the initial binding (or breakthrough) is very much dependent on flow velocity. For example, at 347 cm/h the breakthrough occurs at protein loads > 11 mg, while at 3465 cm/h it occurs at ~7 mg. The lack of variation in the frontal curve shape for the macroporous support reflects minimal "stagnant mobile phase mass transfer", a feature of this support design. By contrast, for the conventional silica-based support, the slow mass transfer kinetics attributable to the large number of stagnant pools (or inaccessible pockets) that are inherent in these supports become more pronounced as the flow velocity increases. This observation has been previously reported for a wide range of other silica-based supports [15].
Fig. 1. Frontal loading adsorption isotherms of conventional "wide-pore" derivatised silica (Brownlee RP-300) and macroporous divinylbenzene crosslinked polystyrene (Poros RII/H) supports. Protein: 1 mg/ml solution in aqueous 0.1% TFA. Superficial linear flow velocities: 347, 866, 1732 and 3465 cm/h. Temperature: 25°C. (A) RP-300 2.1 mm ID cartridge. (B) Poros RII/H 2.1 mm ID column. [Plots not reproduced; x-axis: Lysozyme (mg).]
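The breakthrough figures quoted for the silica-based support can be restated as the fraction of the saturation capacity that is usable before breakthrough at each flow velocity. This is a rough sketch using the approximate values from the text (the ">11 mg" load is taken as 11 mg); the function names are illustrative.

```python
FEED_MG_PER_ML = 1.0     # lysozyme feed concentration from the text
SATURATION_MG = 11.5     # approximate saturation capacity of the RP-300 support

# Approximate breakthrough loads (mg), keyed by flow velocity (cm/h).
BREAKTHROUGH_MG = {347: 11.0, 3465: 7.0}

def breakthrough_volume_ml(load_mg, feed_mg_per_ml=FEED_MG_PER_ML):
    """Feed volume applied before protein first appears in the effluent."""
    return load_mg / feed_mg_per_ml

def usable_fraction(load_mg, saturation_mg=SATURATION_MG):
    """Breakthrough load expressed as a fraction of saturation capacity."""
    return load_mg / saturation_mg

for velocity, load in BREAKTHROUGH_MG.items():
    print(velocity, breakthrough_volume_ml(load), round(usable_fraction(load), 2))
```

On these numbers a tenfold increase in flow velocity lowers the usable fraction of the silica support from about 0.96 to about 0.61, while the saturation level itself is unchanged.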
Figure 3. Analysis of the mature NT-3 and the higher molecular weight material by mass spectrometry.
To identify the site of incorporation, the protein was subjected to endoproteinase Lys-C digestion and the resulting peptides were separated using a narrow bore C18 column. The peptide map of the mature NT-3 protein is shown in Figure 4A and the expected C-terminal peptide I116GRT119 is highlighted in bold print. Examination of the endoproteinase Lys-C map of the higher molecular weight material (Figure 4B) demonstrates that the peptides corresponding to the C-terminus were absent and a new peptide was generated.
[Peptide map chromatograms not reproduced; surviving labels: panel (B); x-axis: Time (min.).]
Figure 1. Reaction pathways for the carboxylation and oxygenation of RuBP as catalyzed by Rubisco. [Reaction scheme not reproduced.]
Mark R. Harpel et al.
oxidative degradation of RuBP (4-5), which forms one equivalent each of PGA and 2-phosphoglycolate (PGyc) (Fig. 1, lower pathway), can reduce photosynthetic yields in certain plants by 50%. Although the enzyme's specificity for carboxylation is not immutable (see refs. 1-3, 6, and citations therein), the molecular bases for the discrimination between the two pathways are not entirely clear. Hence, understanding the structural and mechanistic features of Rubisco that limit its efficiency and specificity is of practical importance. In this chapter, we present three approaches to address mechanistic issues with Rubisco mutants: characterization of catalysis of partial reactions, analysis of side products, and subtle alteration of the active-site microenvironment by manipulation with exogenous reagents.
II. General Considerations in Characterizing Rubisco Mutants
In the absence of 3D structures, a general challenge in exploiting site-directed mutants of Rubisco is to determine whether catalytic deficiencies reflect improper folding of polypeptide, failure of subunits to associate, inability to undergo requisite activation [carbamylation of active-site Lys191 followed by binding of Mg2+ (7)], failure to bind substrates, or loss of a group that participates in catalysis; only the mutants of this latter class are mechanistically revealing. Even partial retention of these properties by catalytically-impaired mutants can allay fears that detrimental consequences of amino acid substitutions at the active site are indirect effects brought about by major conformational changes¹. Due to ease of genetic manipulation, heterologous expression, and assembly, Rubisco from Rhodospirillum rubrum, a homodimer (50,500 dalton subunit), has been the target of most mutagenesis studies. Despite the difference in quaternary structure from the Rubisco found in most photosynthetic organisms, which consists of eight large (L) (53,000 dalton) and eight small (S) (14,000 dalton) subunits, species invariance of active-site residues and homologous three-dimensional structures justify extrapolation of mechanistic conclusions from the L2 protein to the L8S8 form (10, 11). The original clone of the R. rubrum Rubisco gene was expressed as a fusion protein (12), but all the mutant proteins generated in our laboratory are derived from a reconstruction that encodes authentic, wild-type enzyme (13). Mutant genes are constructed by single primer extension utilizing an appropriate single-stranded M13 vector (14) and expressed in Escherichia coli strain MV1190. Mutant enzymes are purified chromatographically to near homogeneity (15).
Expression of the mutated genes as full-length translation products that have not been proteolyzed is readily assessed by comparing the electrophoretic mobility of the mutant proteins on denaturing gels with that of wild-type. Dimer formation, distinguishable from monomers by gel permeation chromatography or non-denaturing gel electrophoresis, is taken as evidence for proper folding and assembly. The reaction-intermediate analog, 2-carboxyarabinitol 1,5-bisphosphate (CABP), is used to assess whether an inactive mutant can nevertheless undergo activation chemistry and bind phosphorylated ligands. Only the carbamate form of Rubisco binds the analog in an exchange-resistant complex [t1/2 ~20 h for wild-type R. rubrum Rubisco (16)] of equimolar subunit, CO2, Mg2+, and CABP (17). This complex is readily isolated by gel filtration (18). Tight binding of each of the three ligands is dependent on the presence of the other two. Therefore, complex formation proves competence in activation chemistry and binding of phosphorylated ligands.

¹Only the inactive D193N mutant of Rubisco has been successfully crystallized and subjected to crystallographic analysis (8). Although this mutant exhibited large conformational differences relative to the wild-type enzyme, these properties may have been expected due to the mutant's deficiencies in both activation and binding of CABP (9).
III. Partial Reactions
The overall carboxylation or oxygenation of RuBP as catalyzed by Rubisco consists of discrete partial reactions illustrated in Fig. 1 (reviewed extensively in 1-3, 19). Because an active-site residue will not necessarily be involved in all catalytic steps, site-directed mutants devoid of overall activity may retain competence in one or more of the partial reactions. Independent of overall carboxylase activity, enolization of RuBP and turnover of the isolated six-carbon reaction intermediate can be assayed as distinct reactions, providing an avenue for discerning the particular step(s) preferentially facilitated by a given active-site residue. Formation of the enediol(ate) of RuBP is readily assayed on the basis of exchange of solvent protons with the C3 proton of substrate (20-22). The six-carbon intermediate of the carboxylation pathway (II in Fig. 1) can be prepared by rapid quench after mixing equimolar amounts of RuBP and the carboxylase in the presence of ¹⁴CO2 (23). Availability of this labeled intermediate allows determination of an enzyme's commitment to forward processing in the carboxylation step. Decomposition, via decarboxylation, is observed as a decrease in radioactivity that can be stabilized by borohydride, whereas forward catalysis is equated with an increase in acid-stable radioactivity.
IV. Oxygenation and Other Side Reactions
Completion of a catalytic cycle by Rubisco requires stabilization of several inherently unstable intermediates. Imperfect stabilization of these intermediates is reflected in non-productive side reactions, including several involving the enediol(ate) (I). Not only does the formation of side products provide insight into limitations of Rubisco efficiency in vivo, but perturbation of these side reactions by mutants can reveal amino acid groups crucial to intermediate stabilization. The most prominent side reaction of Rubisco is its counterproductive oxygenase activity, reflecting competition of O2 with CO2 for the enediol(ate) intermediate (I) (23). Partitioning between the two pathways ($v_c/v_o$) is defined by $v_c/v_o = \tau \cdot ([\mathrm{CO_2}]/[\mathrm{O_2}])$, where $\tau$ is $V_c K_o / V_o K_c$ (24). Because τ can be interpreted in terms of the free energy differential for carboxylated versus oxygenated transition states (25-26), it provides insight into determinants of Rubisco specificity. Many assays have been described for determining τ; unfortunately, inherent limitations generally render them cumbersome for screening of enzymes and particularly ill-suited for analyzing mutant Rubiscos exhibiting low levels of activity. We have developed a simple high-resolution anion-exchange chromatographic method (27) to derive vc/vo (and hence τ) from the ratio of radioactive peak areas for PGA and PGyc generated from [1-³H]RuBP. This method also provides a complete picture of the ultimate fate of input substrate (from a single chromatographic profile), including degradation products, which impact τ but are overlooked in some widely-used assays (see Fig. 2). Any method for determining the specificity is dependent upon knowledge of gaseous substrate concentrations. Generally, [O2] is maintained constant at either ambient concentration (255 µM) or at 100% O2 saturation (1.2 mM). Free [CO2] is varied with exogenously-added NaHCO3.
The total concentration of all species of "CO2" can be determined spectrophotometrically by the phosphoenolpyruvate
carboxylase/malic dehydrogenase assay of O'Leary et al. (28). Decline of carboxylase activity during the course of assay, which is particularly endemic to higher-plant Rubiscos, has been denoted as "fallover". Characterization of the products of RuBP turnover under fallover conditions has shown that this process is a result of enediol(ate)-derived side reactions (see refs. 29-31 and citations therein). Misprotonation at C3 (net epimerization), which occurs once every 400 turnovers, gives rise to D-xylulose 1,5-bisphosphate (XuBP), a potent inhibitor and alternate substrate of the enzyme (32-33). This inhibitor accumulates because its utilization as substrate is exceedingly slow. Another inhibitor formed in similar amounts to XuBP during RuBP turnover by spinach Rubisco has chemical properties suggestive of 3-ketoarabinitol 1,5-bisphosphate, which would result from protonation (rather than carboxylation or oxygenation) at C2 of the enediol(ate) (net isomerization) on the same face as carboxylation. Enhanced formation of either misprotonation product by a mutant Rubisco is an indicator of compromised processing of the enediol(ate). Another potential side reaction of the enediol(ate) intermediate is formation of the dicarbonyl compound, 1-deoxy-D-glycero-2,3-pentodiulose 5-phosphate, resulting from β-elimination of the C1-phosphate due to improper stabilization and/or premature dissociation of enediol(ate) from the enzyme active site. This compound has been characterized by reduction with borohydride, oxidation with H2O2, complexation with o-phenylenediamine, and ¹³C-NMR (23, 34). The β-elimination product is not detected in reactions with wild-type R. rubrum Rubisco but is formed in substantial amounts with mutants in which the C1-phosphate ligands are substituted, demonstrating the required role of these amino acid side chains in stabilizing the enediol(ate) intermediate (34-35).
β-Elimination of phosphate and concomitant formation of pyruvate from the terminal aci-carbanion (VI) of PGA also occurs (36). Abstraction of a hydroxyl proton from the gem-diol carboxylated intermediate (III) promotes C2-C3 scission with liberation of PGA derived from C3, C4, and C5 of RuBP. The resulting aci-carbanion of PGA (derived from C1 and C2 of RuBP and from CO2) must undergo inversion of configuration at C2 and protonation prior to its release as the D-isomer of PGA. The status of this final step of carboxylation (protonation of the PGA carbanion) is reflected by the ratio of protonation (PGA formation) to β-elimination (pyruvate formation). Detection of side products generated by Rubisco is accomplished by various means. XuBP (30) and pyruvate (36) are both conveniently detected spectrophotometrically by coupling to NADH oxidation with appropriate enzymes. Alternatively, our chromatographic procedure (27) gives a complete profile of all RuBP-derived products. Resolution of these compounds is enhanced by inclusion of 10 mM sodium borate, which complexes vic-diols, in elution buffers. Since our initial report of the separation of borohydride-reduced misprotonation products (27), we have observed that borate also effects complete separation of unreduced RuBP and XuBP (37). Thus, the analysis is simplified by circumventing the necessity to deduce the amounts of misprotonation-derived bisphosphate based on ratios of ribitol-, arabinitol-, and xylitol-1,5-bisphosphates.
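The partitioning relation given earlier in this section can be turned into a small calculation of the specificity factor from chromatographic peak areas. The sketch below assumes that, with [1-³H]RuBP as substrate, the ratio of radioactive PGA to PGyc peak areas equals the ratio of carboxylation to oxygenation rates; the function name and the example peak areas and CO2 concentration are hypothetical and are not data from this chapter.

```python
def specificity_factor(pga_area, pgyc_area, co2_molar, o2_molar):
    """Return tau = (v_c / v_o) * ([O2] / [CO2]).

    v_c / v_o is taken as the ratio of radioactive PGA to PGyc peak
    areas, which is an assumption of this sketch.
    """
    if pgyc_area <= 0 or co2_molar <= 0:
        raise ValueError("PGyc peak area and [CO2] must be positive")
    vc_over_vo = pga_area / pgyc_area
    return vc_over_vo * o2_molar / co2_molar

# Hypothetical peak areas and free [CO2]; [O2] is the ambient value
# (255 µM) quoted in the text.
tau = specificity_factor(pga_area=3000.0, pgyc_area=1000.0,
                         co2_molar=50e-6, o2_molar=255e-6)
print(round(tau, 1))
```

Because only a ratio of peak areas enters the calculation, absolute recovery of radioactivity need not be known, provided both products are recovered with equal efficiency.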
V. Applications of Chemical Rescue
Site-directed mutagenesis is generally restricted to the 20 amino acids normally occurring in proteins. Thus, reliance on homologous series of compounds to establish structure-reactivity correlations, a hallmark of mechanistic studies with non-enzymic catalysts, has been lacking with enzymes. One manner in which this limitation can be partially overcome is provided by the demonstration that an enzyme, crippled because of an active-site substitution, can be rehabilitated
("rescued") by the addition of exogenous organic compounds that mimic the missing side chain (38-39). Thus, systematic variation in the substituted side chain is limited only by the number of homologous compounds available to test. For example, the virtually inactive K258A mutant of aspartate aminotransferase is stimulated by primary amines; the degree of stimulation, after correcting for steric effects, correlates with the pKa of the amine in accordance with the Brønsted relationship (38-39). More recently, this approach has been extended to a variety of systems, including Rubisco (see ref. 15 and citations therein). Chemical rescue of deficient site-directed mutants can also be achieved through covalent chemical modification, thereby expanding the diversity and subtlety of structural changes that can be effected through mutagenesis. Examples include substitution of lysyl with aminoethylcysteinyl residues (net replacement of the γ-methylene group with a sulfur atom) (16, 40), substitution of glutamyl with carboxymethylcysteinyl residues (net insertion of a sulfur atom between the β- and γ-methylene groups with lengthening of the side chain by ~1 Å) (41-42), and substitution of arginyl with homoarginyl residues (net insertion of a methylene group with lengthening of the side chain by ~1 Å) (43-44).
VI. Lys329 - A Case Study
Our studies of active-site Lys329 of R. rubrum Rubisco by site-directed mutagenesis illustrate the value of combining these methodologies. Lys329 is the apical residue of a flexible loop ("loop 6") located in the eight-stranded β/α-barrel of the C-terminal domain of this protein. In the activated enzyme with CABP bound, this loop folds over the top of the barrel and becomes immobilized, in part by electrostatic interactions between Lys329 and Glu48 of the adjacent subunit and between Lys329 and the carboxylate of the bound analog (11, 45-47). Closure of loop 6 and the NH2-terminal segment of the adjacent subunit presumably controls ligand access to the active site and mitigates dissociation of reaction intermediates from the active site. Replacement of Lys329 by site-directed mutagenesis greatly diminished carboxylation activity (~lCK-fold reduction) and formation of a stable quaternary complex with CABP (48). However, these mutants catalyze the CO2- and Mg2+-dependent enolization of [3-³H]RuBP, indicating that their primary deficiency is not in assisting this partial reaction, but at a later step in the reaction pathway (22). Evaluation of the K329G mutant as a catalyst for the turnover of isolated six-carbon, carboxylated intermediate further localized the functional role of Lys329 (25). Despite its lack of carboxylation activity, K329G exhibited a high forward commitment to hydrolysis of this intermediate to PGA. Thus, Lys329 is needed neither for enolization nor for processing of the carboxylated intermediate; by deduction, it must be required for reaction of gaseous substrate with enediol(ate). This conclusion is entirely consistent with the location of the ε-amino group of Lys329 in the enzyme·CABP quaternary complex of wild-type enzyme as seen by crystallography (11, 45-47). Furthermore, these results demonstrate that carboxylation is not spontaneous with wild-type Rubisco, but requires direct intervention by amino-acid side chains.
Precise positioning of the amino group of Lys329 should be crucial if one of its roles is to stabilize the incipient negative charge formed in the intermediate of the gaseous substrate addition step. This supposition was validated by aminoalkylation ("covalent rescue") of the K329C mutant. Treatment of K329C with 2-bromoethylamine or 3-bromopropylamine partially restored activity, as a consequence of selective modification of the introduced thiol group (16, 25). Reduced kcat for aminoethyl- and aminopropyl-K329C (22% and 5% of wild-type, respectively) and corresponding reductions in τ (56% and 30%, respectively) emphasize the stringent requirement for placement of the amine at position 329.
Mark R. Harpel et al.
Position-329 mutants are also amenable to noncovalent chemical rescue by aliphatic amines (15). For example, at 450 mM ethylamine and 2 mM RuBP, the K329A mutant exhibited about 2% of the wild-type carboxylation activity, representing ~80-fold stimulation compared to the marginal activity of K329A measured in the absence of amine. The system was saturable with respect to both amine and RuBP, and rescue was effected by various amines. Both the extent of rescue and the CO2/O2 specificity of the rescued enzyme (also reduced relative to the wild-type level) showed a steric preference for amine, emphasizing the importance of amine orientation. In addition, amine-rescued K329A formed a detectable complex with CABP. Given the mobility of loop 6, the effectiveness of an exogenous amine in stabilizing the catalytically competent conformation of the protein while concomitantly fulfilling the functionality of a lysyl side chain is rather remarkable. Presumably, loop 6 of K329A, even in the absence of amine, can adopt the conformation necessary for catalysis; but without the lysyl side chain, reaction of gaseous substrate with the enediol(ate) of RuBP cannot occur. A volume-adjusted Brønsted coefficient of ~1, derived from the rescue of K329A by various amines, is consistent with the amine being fully protonated in the rescued transition state(s). The role of Lys329 was further defined by product analyses of K329A turnover reactions. As shown in Fig. 2A, at high [enzyme]/[RuBP], K329A can consume RuBP, but with formation of two novel side products (dicarbonyl and X). Predominant formation of PGA and PGyc (the normal Rubisco products) in the presence of amines (Fig. 2B) is consistent with rescue of activity deriving from enhanced stabilization of intermediates, effected by the amine through direct interaction and/or maintenance of the closed conformation of loop 6.
The side product denoted "dicarbonyl" is 1-deoxy-D-glycero-2,3-pentodiulose 5-phosphate, derived from β-elimination of the C1-phosphate from the enediol(ate) intermediate. Formation of this compound supports a role of Lys329 in enediol(ate) interactions and stabilization. XuBP is also formed transiently before eventual consumption and can be quantified in non-reduced samples (not shown).
[Figure 2: panels A and B, product formation plotted against time (min); labeled products include PGA and PGyc.]
Figure 2. Product analysis of K329A turnover reactions in the absence of amine (A) or in the presence of 400 mM ethylamine (B). Other reaction constituents at pH 8 were 20 μM K329A protomer, 1 mM EDTA, 10 mM MgCl2, 415 mM bicine, 19.6 mM NaHCO3, 10% glycerol, and 250 μM [1-3H]RuBP. Reactions were quenched after 4 h by reduction with borohydride.
Rubisco Mutants
[Figure 3: NMR spectrum; chemical-shift axis in ppm, spanning roughly 4.3 to 3.7.]
Figure 3. 1H-NMR (400 MHz) of compound X isolated from a K329A reaction mixture by chromatography on MonoQ. The chemical shifts and proton-proton coupling assignments, based on selective decoupling and 2D experiments (not shown), are consistent with the structure of 2-carboxytetritol 1,4-bisphosphate (inset). Two phosphorus resonances were observed by 31P-NMR; proton-phosphorus couplings, assigned by broad-band decoupling and 2D heteronuclear COSY experiments (not shown), are also consistent with the proposed structure.
The other side product generated by K329A provides insight into Rubisco's oxygenase intermediate; X does not contain 14C derived from 14CO2 and its formation is dependent on O2. Furthermore, periodate degradation (data not shown) and NMR analyses (Fig. 3) are consistent with this side product being 2-carboxytetritol 1,4-bisphosphate (Fig. 3, inset; stereoconfiguration unknown). It could arise by rearrangement of the peroxy adduct of the enediol(ate) (the putative, but as yet unproven, oxygenation intermediate) with elimination of H2O2. If confirmed, the proposed structure will provide the first evidence of multiple fates for the intermediate of Rubisco's oxygenase pathway. Ironically, wild-type enzyme does not normally form X, demonstrating that despite the counterproductive nature of Rubisco's oxygenase reaction, the peroxy ketone intermediate for formation of the C2-C3 cleavage products is stabilized by this enzyme. Collectively, these studies illustrate the power of multiple approaches for the analysis of Rubisco mutants. Characterizations of position-329 mutants have not only localized the step of catalysis facilitated by Lys329 but also uncovered its role in intermediate stabilization and optimization of carboxylation selectivity.
Acknowledgment
This work was supported by USDOE under contract DE-AC05-84OR21400 with Martin Marietta Energy Systems, Inc.
References
1. Andrews, T.J. and Lorimer, G.H. (1987). In The Biochemistry of Plants (M.D. Hatch and N.K. Boardman, Eds.), Vol. 10, pp. 131-218, Academic Press, New York.
2. Hartman, F.C. and Harpel, M.R. (1993). Adv. Enzymol. 67, 1-75.
3. Hartman, F.C. and Harpel, M.R. (1994). Ann. Rev. Biochem. 63, 197-234.
4. Bowes, G., Ogren, W.L., and Hageman, R.H. (1971). Biochem. Biophys. Res. Commun. 45, 716-722.
5. Lorimer, G.H., Andrews, T.J., and Tolbert, N.E. (1973). Biochemistry 12, 18-23.
6. Spreitzer, R.J. (1993). Ann. Rev. Plant Physiol. Mol. Biol. 44, 411-434.
7. Lorimer, G.H., Badger, M.R., and Andrews, T.J. (1976). Biochemistry 15, 529-536.
8. Söderlind, E., Schneider, G., and Gutteridge, S. (1992). Eur. J. Biochem. 206, 729-735.
9. Gutteridge, S., Lorimer, G., and Pierce, J. (1988). Plant Physiol. Biochem. 26, 675-682.
10. Schneider, G., Lindqvist, Y., and Lundqvist, T. (1990). J. Mol. Biol. 211, 989-1008.
11. Knight, S., Andersson, I., and Brändén, C.-I. (1990). J. Mol. Biol. 215, 113-160.
12. Somerville, C.R. and Somerville, S.C. (1984). Mol. Gen. Genetics 193, 214-219.
13. Larimer, F.W., Machanoff, R., and Hartman, F.C. (1986). Gene 41, 113-120.
14. Zoller, M.J. and Smith, M. (1983). Methods Enzymol. 100, 468-500.
15. Harpel, M.R. and Hartman, F.C. (1994). Biochemistry 33, 5553-5561.
16. Smith, H.B. and Hartman, F.C. (1988). J. Biol. Chem. 263, 4921-4925.
17. Pierce, J., Tolbert, N.E., and Barker, R. (1980). Biochemistry 19, 934-942.
18. Miziorko, H.M. and Sealy, R.C. (1980). Biochemistry 19, 1167-1171.
19. Schloss, J.V. (1990). In The Proceedings of NATO ASI on Enzymatic and Model Carboxylation and Reduction Reactions for Carbon Dioxide Utilization (M. Aresta and J.V. Schloss, Eds.), pp. 321-345, Kluwer Academic Press, Netherlands.
20. Saver, B.G. and Knowles, J.R. (1982). Biochemistry 21, 5398-5403.
21. Sue, J.M. and Knowles, J.R. (1982). Biochemistry 21, 5404-5410.
22. Hartman, F.C. and Lee, E.H. (1989). J. Biol. Chem. 264, 11784-11789.
23. Pierce, J., Andrews, T.J., and Lorimer, G.H. (1986). J. Biol. Chem. 261, 10248-10256.
24. Laing, W.A., Ogren, W.L., and Hageman, R.H. (1974). Plant Physiol. 54, 678-685.
25. Lorimer, G.H., Chen, Y.-R., and Hartman, F.C. (1993). Biochemistry 32, 9018-9024.
26. Chen, Z. and Spreitzer, R.J. (1991). Planta 183, 597-603.
27. Harpel, M.R., Lee, E.H., and Hartman, F.C. (1993). Analyt. Biochem. 209, 367-374.
28. O'Leary, M.H., Rife, J.E., and Slater, J.D. (1981). Biochemistry 20, 7308-7314.
29. Edmondson, D.L., Badger, M.R., and Andrews, T.J. (1990). Plant Physiol. 93, 1390-1397.
30. Edmondson, D.L., Kane, H.J., and Andrews, T.J. (1990). FEBS Lett. 260, 62-66.
31. Zhu, G. and Jensen, R.G. (1991). Plant Physiol. 97, 1354-1358.
32. McCurry, S.D. and Tolbert, N.E. (1977). J. Biol. Chem. 252, 8344-8346.
33. Yokota, A. (1991). Plant Cell Physiol. 32, 755-762.
34. Larimer, F.W., Harpel, M.R., and Hartman, F.C. (1994). J. Biol. Chem. 269, 11114-11120.
35. Morell, M.K., Paul, K., O'Shea, N.J., Kane, H.J., and Andrews, T.J. (1994). J. Biol. Chem. 269, 8091-8098.
36. Andrews, T.J. and Kane, H.J. (1991). J. Biol. Chem. 266, 9447-9452.
37. Lee, E.H., Harpel, M.R., Chen, Y.-R., and Hartman, F.C. (1993). J. Biol. Chem. 268, 26583-26591.
38. Toney, M.D. and Kirsch, J.F. (1989). Science 243, 1485-1488.
39. Toney, M.D. and Kirsch, J.F. (1992). Protein Sci. 1, 107-119.
40. Planas, A. and Kirsch, J.F. (1991). Biochemistry 30, 8268-8276.
41. Lukac, M. and Collier, R.J. (1988). J. Biol. Chem. 263, 6146-6149.
42. Smith, H.B., Larimer, F.W., and Hartman, F.C. (1990). J. Biol. Chem. 265, 1243-1245.
43. Beyer, W.F., Jr., Fridovich, I., Mullenbach, G.T., and Hallewell, R. (1987). J. Biol. Chem. 262, 11182-11187.
44. Engler, D.A., Campion, S.R., Hauser, M.R., Cook, J.S., and Niyogi, S.K. (1992). J. Biol. Chem. 267, 2274-2281.
45. Andersson, I., Knight, S., Schneider, G., Lindqvist, Y., Lundqvist, T., Brändén, C.-I., and Lorimer, G.H. (1989). Nature 337, 229-234.
46. Schreuder, H.A., Knight, S., Curmi, P.M.G., Andersson, I., Cascio, D., Brändén, C.-I., and Eisenberg, D. (1993). Proc. Natl. Acad. Sci. USA 90, 9968-9972.
47. Newman, J. and Gutteridge, S. (1993). J. Biol. Chem. 268, 25876-25886.
48. Soper, T.S., Mural, R.J., Larimer, F.W., Lee, E.H., Machanoff, R., and Hartman, F.C. (1988). Protein Eng. 2, 39-44.
Probing The Roles Of Conserved Histidine Residues In β-Galactosidase (E. coli) Using Site Directed Mutagenesis And Transition State Analog Inhibition Nathan J. Roth, Katherine Y.N. Wong, and Reuben E. Huber Div. of Biochemistry, Dept. of Biological Sciences, University of Calgary, Calgary, Alberta, Canada T2N 1N4
I. Introduction
His residues often play important roles in the structure and function of enzymes. The imidazole side chain of His is unique in that it has a pKa near neutral and therefore can gain or lose protons in response to small changes in the local environment. Thus His is often found within the active site as an acid/base catalyst or plays roles in modulating conformational changes (1). His is capable of acting as a hydrogen bond acceptor or donor, and often functions directly in metal and ligand binding. In addition, it can assist catalysis by acting as a nucleophile. β-Galactosidase from E. coli is a retaining glycosidase which catalyses the hydrolysis of β-D-galactosides. Native β-galactosidase is a tetramer, consisting of four identical monomers of 1023 residues each (2). We decided to probe the functional roles of conserved His in β-galactosidase using site directed mutagenesis followed by a quick characterization of the resultant enzymes. E. coli β-galactosidase contains 34 His residues (2). Alignment of the sequences of the related β-galactosidases and β-glucuronidases sequenced to date reveals that of these 34 His, only 3 (corresponding to His-357, His-391, and His-540 of the E. coli enzyme) are absolutely conserved (2-15). In this paper we illustrate the approach we took, in the absence of structural data, to determine the functional roles of His-357 and His-391 of β-galactosidase. The data acquired using this approach indicate that His-357 and His-391 are highly important in transition state stabilization but not in ground state binding of the substrate.
II. Materials and Methods
A. Site Directed Mutagenesis
Site directed mutagenesis was carried out using a modified procedure of Kunkel's dut⁻ ung⁻ method (16). A 1.1 kb fragment of the lacZ gene containing the codons we wished to alter was excised from the plasmid pIP101 using the restriction enzymes Sac I and Cla I, and then ligated into BS SK+ previously cut at the corresponding sites. The single stranded mutagenic template DNA was rescued from E. coli RZ1032 containing the subclone with the help of VCS M13 helper phage. The mutagenic primers were phosphorylated prior to annealing using T4 polynucleotide kinase. The synthesis and ligation of the second strand was carried out at 37°C for 3 hr by T4 polymerase and T4 DNA ligase. The mutagenic reaction mix was transformed into E. coli XL1-Blue. Putative mutants were screened directly by sequencing using Sequenase 2.0. The 1.1 kb fragment containing the mutation was excised and reinserted into pIP101. The pIP101 plasmid carrying the desired mutation was transformed and expressed in a lacZ⁻ strain of E. coli. Finally, the integrity of the mutation in pIP101 was reconfirmed using a thermocycling DNA sequencing method.
B. β-Galactosidase Purification
β-Galactosidase was purified as described previously (17), with several slight modifications. An 800 mL gradient from 0.09 M to 0.18 M NaCl was used to elute the protein from the DEAE column. The active fractions from the DEAE elution were pooled, precipitated with ammonium sulfate, and then applied to an FPLC Superose 6™ size exclusion column and eluted into the appropriate assay buffer. Protein purity was assessed by SDS-PAGE, and the enzyme concentration was determined using an extinction coefficient of 2.09 cm²/mg at 280 nm (18).
C. Determination of Kinetic and Inhibitor Constants
Kinetic assays were performed at 25°C and pH 7.0 in TES assay buffer (30 mM TES, 145 mM NaCl, 1 mM MgSO4) in a UV 2101 Shimadzu spectrophotometer. o-Nitrophenyl-β-D-galactopyranoside (ONPG) or p-nitrophenyl-β-D-galactopyranoside (PNPG) was used as the substrate. The kcat and Km were determined by least squares analysis of Eadie-Hofstee plots. Kinetic inhibitor constants were determined using the method described by Deschavanne et al. (19) and Huber and Gaunt (20). There are two kinetically relevant steps in the β-galactosidase reaction mechanism, denoted by the rate constants k2 and k3 (Fig. 1). The first step, "galactosylation" (k2), results in galactosidic bond breakage and the formation of an enzyme-substrate intermediate. The second step, "degalactosylation" (k3), involves the hydrolysis of the enzyme-galactosyl intermediate and the release of galactose. The rate of "degalactosylation" (k3) is the same for any β-galactoside substrate. Therefore, if similar kcat values are obtained for different substrates, k3 is probably the rate determining step. Nucleophilic competition experiments (in the presence of 1 M methanol) also aid in determining which kinetic step is rate determining. Nucleophiles (e.g. MeOH) can compete with water to attack the galactosyl-enzyme intermediate (E·GAL) (Fig. 1). If k3 is the rate determining step and if k4 > k3, the addition of methanol should increase the rate of reaction. However, if k2 is rate determining, the addition of methanol should have no effect on the observed catalytic rate.
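The Eadie-Hofstee analysis mentioned above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function names, the enzyme concentration, and the noise-free synthetic rates are all assumptions made for the example. The plot is a straight line v = Vmax - Km (v/[S]), so an ordinary least-squares fit of v against v/[S] gives Vmax as the intercept and Km as the negative slope; kcat then follows from Vmax divided by the active-site concentration.

```python
def eadie_hofstee_fit(substrate_mM, rates):
    """Fit v = Vmax - Km * (v/[S]) by ordinary least squares.

    Returns (Vmax, Km). Rates and Vmax share the same units;
    Km comes out in the units of the substrate concentrations.
    """
    x = [v / s for v, s in zip(rates, substrate_mM)]  # v/[S]
    y = list(rates)                                   # v
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, -slope


def kcat_from_vmax(vmax, active_site_conc):
    """Turnover number; vmax and active_site_conc must use matching
    concentration units (for example uM/s and uM)."""
    return vmax / active_site_conc


# Synthetic, noise-free Michaelis-Menten rates (illustrative values only).
TRUE_VMAX = 6.2   # uM/s, hypothetical
TRUE_KM = 0.12    # mM, hypothetical
substrate = [0.02, 0.05, 0.1, 0.2, 0.5, 1.0]  # mM
rates = [TRUE_VMAX * s / (TRUE_KM + s) for s in substrate]

vmax_fit, km_fit = eadie_hofstee_fit(substrate, rates)
kcat_fit = kcat_from_vmax(vmax_fit, active_site_conc=0.01)  # 0.01 uM, hypothetical
```

With real data the Eadie-Hofstee transform places the measured rate on both axes, so experimental error is correlated between them; a direct nonlinear fit to the Michaelis-Menten equation is a common cross-check.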
Probing Conserved Residues in β-Galactosidase
[Figure 1: reaction scheme; species shown include E·GAL-OR, E·GAL, GAL, GAL-OMe, and H2O.]
Figure 1. Probable mechanism of β-galactosidase in the presence of a nucleophile (methanol). E, β-galactosidase; GAL-OR, galactoside substrate; E·GAL, galactosyl enzyme; MeOH, added nucleophile (methanol); GAL-OMe, galactosyl nucleophile product; Ks, dissociation constant for E·GAL-OR.
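The logic of the methanol competition experiment can be made concrete with the usual steady-state expression for a two-step scheme of this kind, kcat = k2 kd / (k2 + kd), where kd is the total rate of breakdown of the galactosyl enzyme. The sketch below is an illustration under stated assumptions: `k4_app` is a hypothetical name for the pseudo-first-order rate of methanolysis at the methanol concentration used, and all rate constants are invented to show the two limiting cases rather than taken from the measurements.

```python
def kcat_with_nucleophile(k2, k3, k4_app=0.0):
    """Steady-state kcat for E.S -> E.GAL (k2), followed by breakdown of
    E.GAL by water (k3) or by an added nucleophile (k4_app, assumed
    pseudo-first-order at the nucleophile concentration used)."""
    k_breakdown = k3 + k4_app
    return k2 * k_breakdown / (k2 + k_breakdown)


# Case 1: degalactosylation (k3) rate determining; methanol speeds turnover.
slow_k3_water = kcat_with_nucleophile(k2=1000.0, k3=10.0)
slow_k3_methanol = kcat_with_nucleophile(k2=1000.0, k3=10.0, k4_app=40.0)

# Case 2: galactosylation (k2) rate determining; methanol has almost no effect.
slow_k2_water = kcat_with_nucleophile(k2=1.0, k3=1000.0)
slow_k2_methanol = kcat_with_nucleophile(k2=1.0, k3=1000.0, k4_app=4000.0)
```

In the first case the apparent kcat rises several-fold when the nucleophile is added; in the second it changes by less than one percent, which is the pattern used in the text to assign the rate determining step.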
III. Results and Discussion
A. Rationale for Mutagenesis
It has been demonstrated that conserved amino acid residues are functionally more important than non-conserved residues (21). Mutation data matrices show that His is most often replaced by acid and amide residues, and thus replacement of His by a residue from this group would probably be the most obvious choice (22, 23). However, replacement of His with a member of the acid or amide group to screen for a catalytic role for a His might not discriminate, since acids and amides could perform a function similar to that of His. Therefore, in our initial studies, we decided instead to replace His with Phe. Phe was chosen because its side chain has a size similar to His but it lacks the metal complexing properties and hydrogen bond forming capabilities present in the imidazole side chain and also in the members of the acid and amide group of amino acids. Purification and initial screening of the Phe substituted enzymes using transition state analog inhibitors showed that the enzymes with Phe substituted for His-357 and His-391 bound transition state analogs very poorly compared to substrate analogs but Mg2+ binding was not affected. His-357 and His-391 were therefore selected for further study using site directed mutagenesis to replace the His with acid and amide residues.
B. Purification and Stability All of the enzymes with substitutions at His-357 and His-391 precipitated at the same ammonium sulfate concentrations, and eluted from the ion exchange and gel filtration columns in similar volumes as wild type. This indicated that the physical properties associated with purification (aggregation, quaternary structure, charge, etc.) were not affected by the mutations. The enzymes were greater than 95% pure as analyzed by SDS PAGE. H357F was stable for at
Table I. Kinetic constants of the wild type and substituted β-galactosidases with ONPG and PNPG as substrates

                     Wild    H357D   H357F   H357N   H391E   H391F
ONPG
  kcat (s⁻¹)         620     7.85    15.9    63.5    1.71    0.24
  Km (mM)            0.12    0.12    0.76    0.22    6.9     0.44
  app kcat (s⁻¹)ᵃ    1150    6.63    19.8    57.9    1.71    0.18
PNPG
  kcat (s⁻¹)         90      1.22    2.82    0.77    0.017   0.009
  Km (mM)            0.041   0.47    0.069   0.006   3.94    0.066
  app kcat (s⁻¹)ᵃ    90      0.90    2.33    0.65    0.013   0.007

ᵃValues of app kcat
refer to the turnover number 100 N.D. 36
14.8 17.8 N.D.
since, except for the substitution of Glu for His-391, the substitutions had little effect on the binding of substrate. The substitution of Glu for His-391 may cause a general active site disruption as it is in close proximity to the catalytic nucleophile, Glu-537 (24). All of the enzymes with substituted residues at His-391 and His-357 bound the transition state analogs very poorly (Table II), suggesting that His-391 and His-357 are required for proper transition state stabilization. Although H391E β-galactosidase bound the substrate analogs about 40 times more poorly than wild type, it bound the transition state analogs even more poorly, by a full order of magnitude. The observed decreases in catalysis of the substituted enzymes may be a consequence of increased energy barriers due to the losses of transition state solvation. The effect seems to be mainly on "galactosylation" (k2). This is supported by the results of the nucleophilic competition studies, which showed that the addition of methanol to the assay did not result in an increase in the kcat. Furthermore, the kcat values for each enzyme were quite different depending upon which substrate was used. This indicates that "galactosylation" (k2) was rate determining, and shows that this step was affected much more than "degalactosylation" (k3) by the changes in solvation of the planar transition state.
IV. Conclusions
The results suggest that His-357 and His-391 are required for proper transition state stabilization and may form direct interactions with a planar galactosyl transition state intermediate. The presence of active site His residues which interact with the transition state in glycosidases has previously been shown in α-amylase (28). Studies of β-galactosidase utilizing deoxy and deoxyfluoro galactosyl analogs indicate that interactions at the 3-, 4-, and 6-positions contribute approximately 16.7 kJ (4 kcal) · mol⁻¹ each to the stabilization of the transition state, while interactions at the 2-position contribute at least 33.5 kJ (8 kcal) · mol⁻¹ (27). Our findings show that His-357 and His-391 might be the residues within the active site of β-galactosidase which mediate some of these interactions. As His-357 and His-391 are conserved in both the β-galactosidase and β-glucuronidase families, it is unlikely that either of these His is involved in interactions with the 4-hydroxyl position, since this hydroxyl is axial in galactose but equatorial in glucuronic acid. We cannot absolutely discount the possibility that the observed loss in transition state binding is indirectly due to minor structural aberrations in the enzyme as a result of the substitutions. However, the crystal structure of β-galactosidase, which became available near the completion of this study, shows that His-357 and His-391 are near the known active site residues and probably line the active site cavity (24). Therefore, they have the potential to form direct interactions with the substrate in the transition state form.
Acknowledgments
We would like to thank Rob Penner for his invaluable technical assistance in the latter stages of this work. We would also like to thank R.H. Jacobson and B.W. Matthews for providing us the opportunity to examine preprints of the structure of β-galactosidase. Funding for this work was provided by the Alberta Heritage Foundation for Medical Research (AHFMR) in the form of studentships and by the National Science and Engineering Research Council of Canada (NSERC).
References
1. Richardson, J.S., and Richardson, D.C. (1989) In "The Prediction of Protein Structure and the Principles of Protein Conformation" (G.D. Fasman, ed.), 1-98.
2. Kalnins, A., Otto, K., Rüther, U., and Müller-Hill, B. (1983) EMBO J. 2, 593-597.
3. Burchhardt, G., and Bahl, H. (1991) Gene 106, 13-19.
4. Buvinger, W.E., and Riley, M. (1985) J. Bacteriol. 163, 850-857.
5. David, S., Stevens, H., van Riel, M., Simons, G., and de Vos, W.M. (1992) J. Bacteriol. 174, 4475-4481.
6. Fanning, S., Leahy, M., and Sheehan, D. (1994) Gene 141, 91-96.
7. Hancock, K.R., Rockman, E., Young, C.A., Pearce, L., Maddox, I.S., and Scott, D.B. (1991) J. Bacteriol. 173, 3084-3095.
8. Poch, O., L'Hote, H.L., Dallery, V., Debeaux, F., Fleer, R., and Sodoyer, R. (1992) Gene 118, 55-63.
9. Schmidt, B.F., Adams, R.M., Requadt, C., Power, S., and Mainzer, S.E. (1989) J. Bacteriol. 171, 625-635.
10. Schroeder, C.J., Robert, C., Lenzen, G., McKay, L.L., and Mercenier, A. (1991) J. Gen. Microbiol. 137, 369-380.
11. Stokes, H.W., Betts, P.W., and Hall, B.G. (1985) Mol. Biol. Evol. 2, 469-477.
12. Gallagher, P.M., D'Amore, M.A., Lund, S.D., and Ganschow, R.E. (1988) Genomics 1, 215-219.
13. Jefferson, R.A., Burgess, S.M., and Hirsh, D. (1986) Proc. Natl. Acad. Sci. USA 81, 414-418.
14. Oshima, A., Kyle, J.W., Miller, R.D., Hoffman, J.W., Powell, P.P., Grubb, J.H., Sly, W.S., Tropak, M., Guise, K.S., and Gravel, R.A. (1987) Proc. Natl. Acad. Sci. USA 84, 685-689.
15. Nishimura, Y., Rosenfeld, M.G., Kreibich, G., Gubler, U., Sabatini, D.D., Adesnik, M., and Andy, R. (1986) Proc. Natl. Acad. Sci. USA 83, 7292-7296.
16. Kunkel, T.A., Roberts, J.D., and Zakour, R.A. (1987) Meth. Enzymol. 154, 367-382.
17. Cupples, C.G., Miller, J.H., and Huber, R.E. (1990) J. Biol. Chem. 265, 5512-5518.
18. Wallenfels, K., and Weil, R. (1972) In "The Enzymes" (Boyer, P.D., ed.) Vol. 7, pp. 617-663.
19. Deschavanne, P.J., Viratelle, O.M., and Yon, J.M. (1978) J. Biol. Chem. 253, 833-837.
20. Huber, R.E., and Gaunt, M.T. (1982) Can. J. Biochem. 60, 608-612.
21. Poteete, A.R., Rennell, D., and Bouvier, S.E. (1992) Proteins 13, 38-40.
22. Overington, J., Donnelly, D., Johnson, M.S., Sali, A., and Blundell, T.L. (1992) Protein Science 1, 216-226.
23. Dayhoff, M.O., Schwartz, R.M., and Orcutt, B.C. (1978) In "Atlas
-16.8^
-19.4^
> 1 X 10-14^
(2.5 ± 0.2) X 10-16 (5.0 ± 0.9) X 10-16^
-20.0 -21.2
AAG^ -0.4 >2.3^ 2.3^ -0.1 1.1 1.0^ 0.8 1.6 >2.5^ >2.5^
Determined as described in Materials and Methods. Unless noted otherwise, all values refer to data obtained at 25 °C. Each value represents the mean of at least 3 determinations (except for gcc at 4 °C) ± SEM.
ΔG°obs in units of kcal·mol⁻¹ is equal to -RT ln(1/Kapp), where T is 298 (or 277) K and R is 0.001987 kcal·mol⁻¹·K⁻¹.
ΔΔG°obs is equal to ΔG°obs(AP-1) - ΔG°obs(CRE).
Determined at 4 °C.
Estimated based on ΔΔG°obs determined at 4 °C and the failure to observe a ccc·AP-1 or ccg·AP-1 complex at a protein monomer concentration of 300 nM. We believe this estimate is reasonable because the ΔΔG°obs values determined for the gcc peptide at 25 °C and 4 °C are comparable.
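The free-energy definitions in the table footnotes translate directly into a short calculation. The sketch below is only an illustration of those two formulas: the function names are hypothetical, and the Kapp values in the example are round numbers chosen to show the arithmetic, not entries assigned to particular peptides.

```python
import math

R_KCAL = 0.001987  # gas constant in kcal/(mol*K), as given in the footnote


def delta_g_obs(k_app, temp_k=298.0):
    """Observed free energy, -RT ln(1/Kapp), in kcal/mol."""
    return -R_KCAL * temp_k * math.log(1.0 / k_app)


def delta_delta_g_obs(k_app_ap1, k_app_cre, temp_k=298.0):
    """Specificity, dG_obs(AP-1) - dG_obs(CRE), in kcal/mol."""
    return delta_g_obs(k_app_ap1, temp_k) - delta_g_obs(k_app_cre, temp_k)


# Illustrative inputs only.
dg_example = delta_g_obs(2.5e-16, temp_k=298.0)    # about -21.3 kcal/mol
ddg_example = delta_delta_g_obs(1e-14, 1e-16)      # about +2.7 kcal/mol
```

Because Kapp enters as 1/Kapp, a smaller Kapp gives a more negative free energy, and a positive ΔΔG°obs corresponds to tighter binding of the CRE site than of the AP-1 site.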
C. Conclusions
Using block substitutions, we determined the effects on DNA specificity and affinity of the zipper, spacer, and basic segments of GCN4 and CRE-BP1. CRE/AP-1 specificity is encoded by residues within the spacer and basic segments of the bZIP element. Of these two regions, the basic segment plays the dominant role. Our finding that the determinants of half-site spacing specificity, like the determinants of base-pair specificity, are encoded primarily within the basic segment represents a further concentration of recognition information within the short span of a bZIP recognition helix.
Models of bZIP Proteins
Applying affinity coelectrophoresis to the study of non-specific, DNA binding peptides Michael L. Nedved and Gregory R. Moe Department of Chemistry and Biochemistry, University of Delaware, Newark, DE 19716
I. Introduction
There are many instances where it is of interest to characterize the nucleic acid binding activity of peptides and peptide fragments of proteins. However, it is often difficult or impossible to accurately measure binding constants for nucleic acid binding peptides since the methods available were developed for proteins that bind to nucleic acids with high affinity and specificity. For example, the commonly used filter binding and gel-shift assays depend on the relatively slow kinetics of ligand dissociation in order to separate the bound and unbound species (1). This requirement is rarely met in peptide binding experiments. Alternatively, spectroscopic methods such as NMR, fluorescence, or circular dichroism spectroscopy can be used but are often complicated by a non-linear dependence of the measured spectral parameter on the extent of binding (1). Recently, we have used affinity coelectrophoresis (ACE) (2) to characterize the non-specific, DNA binding activity of several small peptides. In order to demonstrate the uses and advantages of this method, typical results for three peptides, TPPI, Xfin-31, and clupeine Z, are summarized below. TPPI is a twenty-two amino acid peptide having an amino acid sequence similar to that of a proline repeat motif in the replication arrest protein, Tus (3). Xfin-31 (4) is the thirty-first zinc finger of the Xenopus laevis protein, Xfin (5), and is a typical example of a "classical" zinc finger. The DNA binding properties of Xfin-31 have been characterized previously using the gel-shift assay (6). Clupeine Z is a protamine isolated from salmon sperm (7), and its DNA binding activity has been characterized using spectroscopic methods (8, 9). We show here that the ACE data can be analyzed using the theory developed by McGhee and von Hippel (10) for non-specific binding of ligands to a homogeneous lattice to obtain binding constants and cooperativity parameters.
Additionally, the effect of lattice length on the estimation of these parameters is considered. Site sizes are estimated based on the DNA mobility at saturating peptide concentrations. Finally, ACE can also be used to measure the salt dependence of peptide-DNA binding. The number of cations released and the non-electrostatic component of the binding constant can then be obtained by fitting the data to the equation derived by Record et al. (11).
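The salt-dependence analysis mentioned above is commonly carried out as a straight-line fit of log K against log of the monovalent salt concentration. The sketch below assumes that linear form: the magnitude of the slope is read as the number of cations released on binding, and the intercept (log K extrapolated to 1 M salt) as the non-electrostatic component. The function name and the synthetic data are hypothetical and serve only to show the fit.

```python
import math


def fit_salt_dependence(salt_molar, k_obs):
    """Least-squares line through (log10[salt], log10 K).

    Returns (cations_released, log_k_at_1M), where cations_released is
    the negative of the slope and log_k_at_1M is the intercept.
    """
    x = [math.log10(s) for s in salt_molar]
    y = [math.log10(k) for k in k_obs]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept


# Synthetic example: K = 10^2 * [salt]^-3 (illustrative values only).
salts = [0.05, 0.1, 0.2, 0.4]                    # M
constants = [10 ** 2.0 * s ** -3.0 for s in salts]  # 1/M

cations_released, log_k_1M = fit_salt_dependence(salts, constants)
```

With measured binding constants the points scatter about the line, so the uncertainty in the slope is worth reporting alongside the fitted values.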
II. Materials and Methods
Xfin-31 (4) and TPPI, a twenty-two residue peptide (3), were synthesized using solid-phase peptide synthesis and purified by HPLC. The oxidized form of Xfin-31 (Xfin-31(S-S)) was made using K3Fe(CN)6 (12). Clupeine sulfate was from Sigma, and clupeine Z was purified using the method of Ando and Suzuki (13). The identity of all peptides was determined by electrospray mass spectrometry and where possible by amino terminal sequencing. Concentrations of Xfin-31, TPPI, and clupeine Z were determined spectrophotometrically. Synthetic deoxyoligonucleotides were end-filled after annealing using unlabeled nucleotide triphosphates, [α-32P]-dATP, and Sequenase® (United States Biochemical) to give a 19-mer containing the consensus binding site of Sp1 (14), a 37-mer containing the TerB termination sequence in E. coli (15), and a 51-mer containing the UV-5 sequence of the lac operon (16). The procedure for affinity coelectrophoresis (2) was used with the following modifications. A glass plate (10.1 cm x 8.2 cm x 0.1 cm) was placed in a Plexiglas gel-casting box (inner dimensions 10.1 cm x 8.2 cm) open on two sides which were subsequently sealed with cellophane tape. High-melting agarose was used to cast a 1% (w/v) gel (10.1 cm x 8.2 cm x 0.5 cm) containing TAE buffer (40 mM Tris-acetate pH 8.3, 1 mM Na2EDTA). A continuous DNA well was made at one end of the gel approximately 0.5 cm from the peptide lanes by setting a comb (7 cm x 2.2 cm x 0.1 cm) to a depth of 0.4 cm. Using a template drawn on the bottom of the box, a control lane and eight peptide lanes (5 cm x 0.5 cm x 0.5 cm each) were cut from the gel with a scalpel. Individual 500 μL solutions containing buffer or peptide at twice the desired concentration were incubated for two minutes at 65 °C, diluted with an equal volume of 2% agarose in TAE buffer, and pipetted into the cut lanes. The gel was then cut to 7.5 cm x 8.2 cm x 0.5 cm, placed in an electrophoresis chamber, and equilibrated for fifteen minutes in TAE buffer that had been cooled to 5 °C. A labeled DNA solution (100 μL) containing TAE buffer and 15% glycerol without dyes was loaded into the DNA well and electrophoresed at 3-4 V/cm with buffer circulation until the DNA had migrated over half the length of the peptide lanes. The gel was then wrapped in plastic film and autoradiographed. Gels run with Xfin-31 in the zinc complex form used TAE buffer without Na2EDTA (TA buffer). The salt dependence of TPPI binding was measured using TA buffer and KCl. The data were analyzed using Sigma-Plot® (Jandel Scientific), a non-linear, least-squares, curve fitting program. Errors given are from the best fits of the data.
III. Results
In an ACE gel, the electrophoretic mobility of the DNA depends on its net charge as long as the DNA and the DNA-ligand complex are much smaller than the pore size of the matrix through which they move (2,17). The net charge of the
Affinity Coelectrophoresis and DNA Binding Peptides
Figure 1. Autoradiogram of an ACE gel showing Xfin-31 (oxidized) binding to the 19-mer DNA sequence. Peptide concentrations from left are 0, 2, 3, 4, 5, 8, 10, 14, and 18 µM.
DNA is reduced in proportion to the amount of ligand bound. A typical autoradiogram of an ACE gel for oxidized Xfin-31 binding to a nineteen base-pair oligonucleotide is shown in Figure 1. The change in mobility as a function of ligand concentration is represented by the retardation coefficient (R), which is the ratio of the distance traveled by the DNA in the presence of peptide to the distance it travels in the absence of peptide, as measured from the center of the bands on the autoradiogram. R∞ is the DNA mobility at saturating peptide concentrations. Since all of the peptides used in this study bind non-specifically to DNA, the data were analyzed using the McGhee-von Hippel model (10) where, in this case, the binding density is proportional to R. For ligands that bind to a specific sequence, the method of data analysis described by Lim et al. (2) is appropriate. For non-cooperative binding, the McGhee-von Hippel equation (10) becomes
\[ \frac{R}{L} = K(1 - cR)\left(\frac{1 - cR}{1 - (c-1)R}\right)^{c-1} \qquad (1) \]
where L is the free ligand concentration, which is approximately equal to the total peptide concentration when the peptide is in excess over the DNA, K is the intrinsic association constant, and c is the unitless proportionality constant relating R to the binding density and is equal to 1/R∞. For cooperative binding, the corresponding McGhee-von Hippel equation becomes
\[ \frac{R}{L} = K(1 - cR)\left(\frac{(2\omega - 1)(1 - cR) + R - H}{2(\omega - 1)(1 - cR)}\right)^{c-1}\left(\frac{1 - (c+1)R + H}{2(1 - cR)}\right)^{2} \qquad (2) \]
Michael L. Nedved and Gregory R. Moe
Figure 2. A.) A binding curve for TPPI using the 37 base-pair sequence. The data were fit using the binding equation of Lim et al. (2). Error bars represent the error in R based on a 1 mm error in measuring distances on the autoradiogram. B.) A Scatchard plot of the same data, which were fit using equation 1. Abscissa: [TPPI] (µM).
where $H = \sqrt{[1 - (c+1)R]^2 + 4\omega R(1 - cR)}$, and ω is the cooperativity factor as defined by McGhee and von Hippel (10). As shown in Figure 2 for TPPI binding to a 37 base-pair synthetic oligonucleotide, a plot of R as a function of peptide concentration describes a simple binding isotherm characteristic of ligands binding to non-interacting sites. A Scatchard plot of the data is shown in the inset. By fitting the data to the non-cooperative equation of the McGhee-von Hippel model (equation 1) (10), the binding constant and the value of R∞ were obtained (Table I). A plot of log K
Table I. Summary of ACE Data

Peptide         Zp(a)   M+(b)         n(c)   ω(d)   K (M⁻¹)(e)         Kω (M⁻¹)(f)        R∞(g)
Xfin-31(Zn2+)   5+      Na+ (2 mM)    3.0    61     8.1 ± 0.7 x 10³    4.9 ± 0.5 x 10⁵    0.96
Xfin-31(S-S)    5+      Na+ (2 mM)    3.4    3.9    3.0 ± 0.1 x 10⁴    1.2 ± 0.1 x 10⁵    0.84
TPPI            7+      K+ (2 mM)     4.0    1      1.4 ± 0.1 x 10⁶    1.4 ± 0.1 x 10⁶    1.00
clupeine Z      21+     Na+ (50 mM)   24     85     5.3 ± 1.3 x 10⁴    4.5 ± 1.5 x 10⁶    0.50

(a) net charge on the peptide at pH 8.3
(b) mono-valent cation
(c) site size in base-pairs calculated using equation 3
(d) cooperativity factor
(e) intrinsic association constant
(f) binding constant for singly-contiguous sites
(g) DNA mobility at saturating peptide concentrations
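The two McGhee-von Hippel expressions used for the fits (equations 1 and 2) can be evaluated numerically to draw Scatchard-type curves of R/L against R. The sketch below is only an illustration in Python, not the Sigma-Plot procedure used by the authors; the function names are hypothetical, and the parameter values in the usage lines are arbitrary rather than taken from Table I. It assumes cR is less than 1 and, for the cooperative form, that ω is not exactly 1.

```python
import math

def mvh_noncooperative(R, K, c):
    """Right-hand side of equation 1: R/L for non-cooperative binding."""
    free = 1.0 - c * R  # fraction of the lattice still unoccupied
    return K * free * (free / (1.0 - (c - 1.0) * R)) ** (c - 1.0)

def mvh_cooperative(R, K, c, omega):
    """Right-hand side of equation 2: R/L for cooperative binding (omega != 1)."""
    free = 1.0 - c * R
    H = math.sqrt((1.0 - (c + 1.0) * R) ** 2 + 4.0 * omega * R * free)
    first = ((2.0 * omega - 1.0) * free + R - H) / (2.0 * (omega - 1.0) * free)
    second = (1.0 - (c + 1.0) * R + H) / (2.0 * free)
    return K * free * first ** (c - 1.0) * second ** 2

# Arbitrary illustrative parameters (not fitted values from this chapter).
K_demo, c_demo = 1.0e4, 3.0
noncoop = mvh_noncooperative(0.1, K_demo, c_demo)
nearly_noncoop = mvh_cooperative(0.1, K_demo, c_demo, 1.0 + 1.0e-6)
humped = mvh_cooperative(0.1, K_demo, c_demo, 50.0)
```

With ω close to 1 the cooperative expression approaches the non-cooperative one, while a large ω pushes R/L above K at low R, which is the origin of the "humped" Scatchard plots described for cooperative ligands.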
Figure 3. Binding isotherms for Xfin-31(S-S) (closed circles) and Xfin-31(Zn2+) (open circles) using the 19 base-pair sequence. The data were fit using the binding equation of Lim et al. (2). Abscissa: [Xfin-31] (µM).
versus log [K+] for TPPI is linear in the range of [KCl] from 50 mM to 150 mM. Applying the theory of Record et al. (11), the maximum number of cations released can be estimated from the slope, which is 4.7 ± 0.3 for TPPI. The non-electrostatic binding constant obtained by extrapolation to 1 M KCl is 0.6 M⁻¹, indicating that the binding is almost entirely electrostatic for this ligand. ACE data can be used to estimate the site size by using the DNA mobility at saturating peptide concentrations, R∞, and equation 3.
\[ n = \frac{N Z_p}{Z_D R_\infty} \qquad (3) \]
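As a worked check of equation 3, the short Python sketch below recomputes the site sizes from the charges and R∞ values listed in Table I. It assumes that a blunt-ended duplex of N base-pairs carries 2N phosphates (an assumption made here for illustration; the chapter does not state the phosphate count) and uses the factor 0.88 per phosphate quoted in the text. The function name is hypothetical.

```python
def site_size(n_base_pairs, peptide_charge, r_inf, phosphates=None):
    """Equation 3: n = N * Zp / (ZD * R_inf), with ZD = 0.88 * number of phosphates."""
    if phosphates is None:
        phosphates = 2 * n_base_pairs  # assumption: two phosphates per base-pair
    z_dna = 0.88 * phosphates
    return n_base_pairs * peptide_charge / (z_dna * r_inf)

xfin_zn = site_size(19, 5, 0.96)      # Xfin-31(Zn2+), 19-mer
xfin_ss = site_size(19, 5, 0.84)      # Xfin-31(S-S), 19-mer
tppi = site_size(37, 7, 1.00)         # TPPI, 37-mer
clupeine = site_size(51, 21, 0.50)    # clupeine Z, 51-mer
```

Under that assumption the values come out close to the site sizes of 3.0, 3.4, 4.0 and 24 base-pairs listed in Table I.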
In equation 3, N is the number of base-pairs, Zp is the net charge on the peptide, and ZD is the net charge on the nucleic acid. ZD is estimated by multiplying the number of phosphates by 0.88, the theoretical constant for duplex DNA (11). In contrast to TPPI, the binding curves for Xfin-31 in the zinc complex and oxidized forms binding to the 19-mer were sigmoidal (Figure 3), and the Scatchard plots were "humped" (Figure 4). These characteristics are typical of ligands that bind cooperatively. The data in Figure 4 were fitted to equation 2, the McGhee-von Hippel equation for cooperative binding to an infinite, homogeneous lattice (10). This form of the equation includes a cooperativity factor, ω, in addition to the intrinsic association constant, K, and R∞. DNA binding by the Xfin-31 zinc complex has been characterized previously using the gel-shift assay (6). Although the value for Kω, the binding constant for singly-contiguous sites (10), determined by ACE is similar to the binding constant determined by the gel-shift assay (6), the cooperative nature of Xfin-31 in the
Figure 4. Scatchard plots of the binding data from Figure 3 for Xfin-31(S-S) (closed circles) and Xfin-31(Zn2+) (open circles). The data were fit using equation 2.
zinc complex form binding to DNA was not apparent in the gel-shift assay. The binding site sizes determined for both forms of Xfin-31 using equation 3 are nearly identical (Table I) and agree well with the three base-pair binding site observed for single zinc fingers in zinc finger-DNA co-crystal structures (18, 19). The McGhee-von Hippel theory (10) was developed for an infinite lattice; however, it is necessary to use relatively small, finite lattices (oligonucleotides) in the ACE gel assay to avoid sieving effects. This might be expected to affect the accuracy of the binding parameters derived from fits of the ACE data to equation 2 as the lattice approaches saturation for highly-cooperative ligands with large binding site sizes (20, 21). In order to determine the effects of finite lattice length on estimates of the binding parameters, the ACE assay was used to characterize the DNA binding of clupeine Z to a 51 base-pair synthetic oligonucleotide. Clupeine Z has been shown previously to bind cooperatively to DNA using spectroscopic methods (8, 9). The binding site size determined previously for clupeine Z is ~20 base-pairs (8, 9), so there are approximately two binding sites on the oligonucleotide used here. As shown in Figure 5, clupeine Z exhibits the "humped" Scatchard plot of cooperative ligands, confirming that ligands known to bind cooperatively to DNA exhibit cooperative binding in the ACE assay as well. Of the parameters estimated from a non-linear, least-squares fit of the clupeine Z ACE data, K and the cooperativity parameter, ω, are smaller by a factor of ~100 and ~2, respectively, than those determined spectroscopically using considerably longer DNA fragments (2, 3). These results are consistent with theoretical (20) and experimental estimates (21) of the effects of finite lattice length on binding parameters obtained from the McGhee-von Hippel equation. Therefore, for highly cooperative ligands, longer DNA fragments or alternative methods of data
Figure 5. A Scatchard plot showing the binding of clupeine Z to the 51 base-pair DNA sequence. The data were fit using equation 2.
analysis must be used to avoid underestimating K and ω (20,21,22). In contrast, the value of R∞ and the binding site size estimated therefrom appear to be insensitive to the lattice length, since the binding site size determined from the ACE assay (~24 base-pairs) agrees reasonably well with the site size estimated using a much longer lattice (8, 9). The binding parameters of Xfin-31 in the zinc complex and oxidized forms were unaffected due to the lower cooperativity and larger number of available binding sites on the oligonucleotide used.
IV. Conclusions
Affinity coelectrophoresis (ACE) (2) was applied to study the non-specific binding properties of a peptide (TPPI) from the replication arrest protein, Tus (3), a zinc finger peptide, Xfin-31 (4), in the oxidized and zinc complex forms, and a protamine, clupeine Z (7), using different DNA sequences. ACE data were used to construct simple binding curves and Scatchard plots, and the McGhee-von Hippel theory (10) was used to model the binding of both non-cooperatively (TPPI) and cooperatively binding ligands (Xfin-31, clupeine Z) in order to determine association constants (K) and cooperativity parameters (ω). Additionally, the number of salt contacts and the non-electrostatic component of binding were determined for the TPPI peptide by applying the theory of Record et al. (11). The binding constant determined for Xfin-31 in the zinc complex form is in good agreement with that reported using the gel-shift assay (6). Site sizes can be estimated using ACE data, and the site size determined for Xfin-31 in the zinc complex form was similar to that observed in zinc finger-DNA co-crystal structures (18,19).
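The salt-dependence analysis mentioned above amounts to a straight-line fit of log K against log [K+]: the negative of the slope estimates the number of cations released, and the intercept at log [K+] = 0 (1 M salt) estimates the non-electrostatic binding constant. The following Python sketch illustrates that arithmetic only. The data points are synthetic, generated here from the slope (4.7) and 1 M intercept (0.6 M⁻¹) reported in the Results; they are not the measured values, and the function names are hypothetical.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def salt_dependence(salt_molar, k_obs):
    """Return (cations released, K extrapolated to 1 M salt) from a log-log fit."""
    log_salt = [math.log10(s) for s in salt_molar]
    log_k = [math.log10(k) for k in k_obs]
    slope, intercept = linear_fit(log_salt, log_k)
    return -slope, 10.0 ** intercept

# Synthetic points spanning the 50-150 mM KCl range used in the chapter.
kcl = [0.050, 0.075, 0.100, 0.125, 0.150]
k_synthetic = [0.6 * s ** (-4.7) for s in kcl]
cations_released, k_one_molar = salt_dependence(kcl, k_synthetic)
```

Because the synthetic points lie exactly on a line, the fit simply returns the inputs; with real data the scatter about the line sets the uncertainty on the slope.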
Since ACE requires the use of finite lattices (small oligonucleotides) to avoid sieving effects, the binding of clupeine Z, a highly-cooperative ligand with a large site size (8, 9), to a two-site oligonucleotide was studied. ACE data for clupeine Z demonstrated the cooperative nature of the binding (8,9), and the binding site size correlated well with data obtained using spectroscopic methods (8, 9); however, the magnitudes of both the cooperativity factor and the binding constant were underestimated, an effect which has been predicted theoretically (20) and verified experimentally (21). The binding parameters of Xfin-31 and TPPI were unaffected due to the larger number of available binding sites on the oligonucleotides used. In summary, the data presented here for peptides binding to different sequences and lengths of DNA demonstrate that affinity coelectrophoresis (ACE) (2) can be utilized to study non-specific peptide-DNA binding. It is a simple gel electrophoresis technique which measures equilibrium binding of small ligands whose rapid dissociation kinetics may preclude the use of other DNA binding assays. Thus, it is a valuable supplement to other techniques used to measure DNA binding.
References
1. Revzin, A. (1990). In "The Biology of Nonspecific DNA-Protein Interactions" (Revzin, A., ed.), pp. 1-31. CRC Press, Boca Raton.
2. Lim, W.A., Sauer, R.T., and Lander, A.D. (1991). Methods Enzymol. 208, 196-210.
3. Nedved, M.L., Gottlieb, P.A., and Moe, G.R., submitted.
4. Lee, M.S., Gippert, G.P., Soman, K.V., Case, D.A., and Wright, P.E. (1989). Science 245, 635-637.
5. Altaba, A.R., Perry-O'Keefe, H., and Melton, D.A. (1987). EMBO J. 6, 3065-3070.
6. Lee, M.S., Gottesfeld, J.M., and Wright, P.E. (1991). FEBS Lett. 279, 289-294.
7. Ando, T., Iwai, K., Ishii, S., Azegami, M., and Nakahara, C. (1962). Biochim. Biophys. Acta 56, 628-630.
8. Willmitzer, L., and Wagner, K.G. (1980). Biophys. Struct. Mech. 6, 95-110.
9. Watanabe, F., and Schwarz, G. (1983). J. Mol. Biol. 163, 485-498.
10. McGhee, J.D., and von Hippel, P.H. (1974). J. Mol. Biol. 86, 469-489; (1976). J. Mol. Biol. 103, 679.
11. Record, M.T., Jr., Lohman, T.M., and De Haseth, P. (1976). J. Mol. Biol. 107, 145-158.
12. Mirsky, A.E., and Anson, M.L. (1936). J. Gen. Physiol. 19, 451-459.
13. Suzuki, K., and Ando, T. (1968). J. Biochem. 63, 701-708.
14. Gidoni, D., Dynan, W.S., and Tjian, R. (1984). Nature 312, 409-413.
15. Gottlieb, P.A., Wu, S., Zhang, X., Tecklenburg, M., Kuempel, P., and Hill, T.M. (1992). J. Biol. Chem. 267, 7434-7443.
16. Zhang, X., and Gottlieb, P.A. (1993). Biochemistry 32, 11374-11384.
17. Serwer, P., and Hayes, S.J. (1986). Anal. Biochem. 158, 72-78.
18. Pavletich, N.P., and Pabo, C.O. (1991). Science 252, 809-817.
19. Pavletich, N.P., and Pabo, C.O. (1993). Science 261, 1701-1707.
20. Epstein, I.R. (1978). Biophys. Chem. 8, 327-339.
21. Kowalczykowski, S.C., Paul, L.S., Lonberg, N., Newport, J.W., McSwiggen, J.A., and von Hippel, P.H. (1986). Biochemistry 25, 1226-1240.
22. Draper, D.E., and von Hippel, P.H. (1978). J. Mol. Biol. 122, 339-359.
Investigating Calmodulin-Target Sequence Interactions Using Mutant Proteins and Synthetic Target Peptides
Wendy A. Findlay, Stephen R. Martin, and Peter M. Bayley
Division of Physical Biochemistry, National Institute for Medical Research, Mill Hill, London NW7 1AA, England, U.K.
I. Introduction
Protein-protein interactions are important in the function and regulation of many biological pathways. Associations between proteins are often characterized by strong and extremely specific noncovalent interactions between complementary surfaces. One way of looking at the details of these interactions is to use peptides corresponding to the "interaction region" of one of the proteins to determine the minimum sequence needed for the interaction as well as the effect of changing individual residues. We report here on the use of this approach to study the interaction of calmodulin with a target sequence from skeletal muscle myosin light chain kinase (sk-MLCK). The strategy is to use a number of sequence variants of the peptide and site-directed mutants of calmodulin. Calmodulin is a small, ubiquitous calcium binding protein which regulates a variety of enzymes in several different metabolic pathways. Calmodulin interacts with many of its target proteins with very high affinity (Kd ~ nM), usually in a calcium-specific manner. It also binds target sequence peptides derived from the calmodulin binding regions of many of these proteins with affinities close to those for the intact enzymes. Many target sequences are predicted to form basic amphipathic helices, and this has been proposed as a common structural motif for calmodulin binding (1). In the solution structure of a complex of calmodulin with a 26-residue target peptide derived from the sequence of sk-MLCK, the two domains of
TECHNIQUES IN PROTEIN CHEMISTRY VI Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
Wendy A. Findlay et al.
calmodulin surround the M13 peptide, which has adopted an α-helical conformation (2). The peptide lies in a hydrophobic channel, and the sidechains of Trp4 and Phe17 of the peptide appear to anchor the peptide to the two domains by fitting into the hydrophobic "pockets". Maune et al. (3) have produced a set of single-site mutants of Drosophila melanogaster calmodulin, each of which has the conserved glutamic acid residue in the 12th position of one of the four calcium binding loops mutated to either glutamine or lysine. Mutations to sites 2 or 4 effectively eliminate calcium binding to the mutated site and cause structural changes in the protein (3, 4) as well as decreasing the ability of calmodulin to activate target enzymes (5). The availability of these mutant calmodulins and the choice of synthetic peptides derived from the sk-MLCK target sequence allow the manipulation of both components involved in the interaction. The 18-residue sk-MLCK target sequence has 3 aromatic residues: Trp4, Phe8, and Phe17. In this work, we use peptides with tryptophan in either position 4 (WFF peptide) or position 17 (FFW peptide). Since calmodulin itself has no tryptophan residues, we can use optical spectroscopy to monitor the interaction of a tryptophan in a specific position in the target sequence with an individual domain of calmodulin. We have studied binding of the two peptides to wildtype calmodulin and to the site 2 and site 4 mutants (B2Q, B2K, B4Q, and B4K) to see how mutations which effectively eliminate calcium binding to a particular site affect the interaction of the protein with the two target sequence analogues.
II. Materials and Methods
Proteins and peptides - Drosophila melanogaster calmodulin and the various mutants expressed in E. coli were purified essentially as previously described (3). Peptides were synthesized on an Applied Biosystems 430A peptide synthesizer, purified by reverse phase HPLC on a C18 column (WFF peptide) or a C8 column (FFW peptide), and provided with free carboxy and amino termini. All concentrations were determined spectrophotometrically using a calculated extinction coefficient of 5560 M⁻¹ cm⁻¹ at 280 nm for the peptides (2 Phe and 1 Trp) and published extinction coefficients for wildtype and the four mutant calmodulins (3).
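Determining a concentration spectrophotometrically from an extinction coefficient is an application of the Beer-Lambert law, A = ε·c·l. The small Python sketch below shows the arithmetic with the peptide extinction coefficient quoted above; the absorbance reading and the 1 cm path length are hypothetical values chosen for illustration, and the function name is not from the chapter.

```python
def concentration_molar(absorbance, epsilon, path_cm=1.0):
    """Beer-Lambert law rearranged for concentration: c = A / (epsilon * l)."""
    return absorbance / (epsilon * path_cm)

# Hypothetical reading: A280 = 0.0556 in a 1 cm cuvette, epsilon = 5560 M^-1 cm^-1.
peptide_molar = concentration_molar(0.0556, 5560.0)
peptide_micromolar = peptide_molar * 1.0e6
```

For this hypothetical reading the peptide concentration works out to about 10 µM.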
Calmodulin-Target Sequence Interactions
Fluorescence and affinity measurements - Peptide in 25 mM Tris, 100 mM KCl and 1 mM CaCl2 at pH 7.5 and 30 °C was titrated with a stock solution of calmodulin in UV-transmitting plastic cuvettes, since the peptides appear to bind to glass. Fluorescence titration spectra were recorded using a SPEX FluoroMax fluorescence spectrometer with excitation at 280 nm and emission scanned from 310 to 390 nm. The value of fluorescence intensity at 330 nm was plotted as a function of calmodulin concentration and fitted using standard non-linear least squares methods (6) to obtain optimal values of the dissociation constant (Kd) and the maximum fluorescence enhancement (F/F0). The detection limit under our experimental conditions was 50 nM peptide, and all quoted Kd values are the average of at least 3 independent determinations.
Circular dichroism spectra - Spectra were recorded on a Jasco J-600 spectropolarimeter at room temperature. Far UV CD spectra (190 to 260 nm) of 7.5 µM peptide:calmodulin complex in 25 mM Tris, 100 mM KCl and 1 mM CaCl2 were measured in a 0.1 cm path length cuvette. Near UV CD spectra (250 to 340 nm) of 20 µM peptide:calmodulin complex in the same buffer were measured in a 1 cm path length cuvette.
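The fitting procedure itself is given only by reference (6), so the following Python sketch is one possible way to set up such a fit, not the authors' method. It assumes a simple 1:1 peptide:calmodulin complex, for which the bound fraction follows from the usual quadratic binding expression, and a relative fluorescence F/F0 that rises linearly with the bound fraction. The crude logarithmic grid search over Kd stands in for a proper non-linear least-squares routine, and the titration data are synthetic, generated from Kd = 1.6 nM and a 3-fold enhancement. All function names are hypothetical.

```python
import math

def fraction_bound(p_total, c_total, kd):
    """Fraction of peptide in a 1:1 complex (all concentrations in the same units)."""
    b = p_total + c_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * p_total * c_total)) / 2.0
    return complex_conc / p_total

def relative_fluorescence(p_total, c_total, kd, enhancement):
    """F/F0, assuming the signal is linear in the bound fraction."""
    return 1.0 + (enhancement - 1.0) * fraction_bound(p_total, c_total, kd)

def fit_titration(p_total, c_totals, f_rel):
    """Grid search over Kd; the enhancement is solved by linear least squares."""
    best = None
    for k in range(-200, 301):               # Kd from 0.01 to 1000 (same units)
        kd = 10.0 ** (k / 100.0)
        fb = [fraction_bound(p_total, c, kd) for c in c_totals]
        amp = sum(x * (f - 1.0) for x, f in zip(fb, f_rel)) / sum(x * x for x in fb)
        sse = sum((1.0 + amp * x - f) ** 2 for x, f in zip(fb, f_rel))
        if best is None or sse < best[0]:
            best = (sse, kd, 1.0 + amp)
    return best[1], best[2]

# Synthetic, noise-free titration of 50 nM peptide with 0-150 nM calmodulin.
cam_nM = [10.0 * i for i in range(16)]
signal = [relative_fluorescence(50.0, c, 1.6, 3.0) for c in cam_nM]
kd_fit, enhancement_fit = fit_titration(50.0, cam_nM, signal)
```

With peptide concentrations far above Kd the titration curve is nearly stoichiometric and Kd is poorly determined, which is one reason for working close to the detection limit.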
III. Results
We have used two synthetic 18-residue peptides related to the target sequence of sk-MLCK to study their interaction with calmodulin in the presence of calcium. The WFF peptide (KKRWKKNFIAVSAANRFK) corresponds to residues 577 to 594 of rabbit sk-MLCK. In the FFW peptide (KKRFKKNFIAVSAANRWK) the W4 and F17 residues have been interchanged. Upon binding to the protein, the tryptophan fluorescence emission maximum for each peptide is shifted from 356 nm to 334 nm as shown in Fig. 1A, with an enhancement in fluorescence intensity at 330 nm of about 2.4-fold for the WFF peptide and 3-fold for the FFW peptide (Table I). These results indicate that the Trp residue is in a hydrophobic environment when either peptide binds to the protein. By monitoring fluorescence intensity at 330 nm while titrating either peptide with calmodulin, we determined the affinities of calmodulin for the FFW peptide (Kd = 1.6 nM) and the WFF peptide (Kd [...] 275 nm) derives from the single Tyr located at position 138 in the C-terminal domain. The free peptides show negligible circular dichroism in this wavelength range. The spectra of the complexes of calmodulin with the WFF or FFW peptide show clear evidence of a major contribution from the Trp residue in the peptide. Tryptophan model compounds (7) generally show two sharp bands (from Lb transitions), one at 289-294 nm and the second some seven nanometres to shorter wavelength, which generally has the same sign. Bands corresponding to the La transitions usually occur at shorter wavelengths (265-275 nm) and show little fine structure. The Δε values for these bands are expected to lie in the range ± 3 M⁻¹ cm⁻¹ (7). The changes in the near UV CD spectrum upon binding of peptide conform to the general pattern described for Trp CD (indicating that there is little contribution from the two Phe residues in the peptides), but the two spectra differ significantly in both magnitude and sign.
The large negative intensity of the Trp of the bound FFW peptide clearly indicates that the indole chromophore is strongly immobilized in an asymmetric environment. Based on the solution structure of the CaM:M13
Figure 1 - A) Fluorescence spectra of WFF and FFW peptides, free and bound to wildtype calmodulin, [peptide] = 200 nM, [CaM] = 200 nM in 25 mM Tris (pH 7.5), 100 mM KCl, and 1 mM CaCl2. B) Near UV CD spectra of 20 µM wildtype calmodulin alone and in (1:1) complex with WFF and FFW peptide (Δε is per mole calmodulin). Abscissa in both panels: wavelength (nm).
peptide complex (2), the Trp sidechain of the bound WFF peptide is also expected to be strongly immobilized. The weaker CD signal of the WFF Trp is diagnostic of lower asymmetry but not necessarily greater mobility. The asymmetry derives from two sources: 1) electronic interaction with other chromophores (e.g. Phe) and polarisable groups (e.g. sidechains) in the closely packed interior of the protein, and 2) electronic interaction with neighbouring peptide groups which are arrayed asymmetrically (owing to the L-chiral configuration of natural amino acids). The CD properties of a chromophoric side chain thus reflect both secondary and tertiary structure. Near UV CD is a sensitive indicator of relatively small changes in protein conformation in the vicinity of the aromatic group, as well as an indicator of different chiral environments within a protein. The very different near UV CD spectra indicate distinct chiroptical environments of the Trp residue in the two peptide complexes and are consistent with interaction of Trp4 (in peptide WFF) and Trp17 (in peptide FFW) with different domains of the protein. This would suggest that both target peptides are binding to calmodulin in the same orientation, i.e. with residue 4 interacting with the C-domain of calmodulin and residue 17 interacting with the N-domain, as was found for the homologous 26-residue M13 peptide bound to calmodulin (2). The affinities of the two peptides for four calcium binding site mutants of calmodulin also provide important information. The B2K and B2Q calmodulins have Glu67 (in binding site 2) mutated to Lys and Gln respectively, and B4K and B4Q calmodulins have Glu140 (in binding site 4) mutated to Lys and Gln respectively. Each of these mutations effectively eliminates calcium binding to the altered site (3). As shown in Table I, the affinity of each of the mutant proteins for either peptide is at least 10-fold lower than that of wildtype calmodulin.
The B2K mutant has the highest affinity for both peptides, suggesting that it is the least altered in function. The B4K mutant has the lowest affinity for both peptides, more than 200-fold lower than wildtype calmodulin, suggesting that it is the most altered in function. Although the B2Q and B4Q mutants both have about 100-fold lower affinity for the WFF peptide than wildtype CaM, there is a 10-fold difference in their affinities for the FFW peptide. The fluorescence enhancement upon binding of the FFW peptide to the B2Q mutant is also much lower than that for any of the other proteins. These results suggest that the E67Q (B2Q) but not the E67K (B2K) mutation in site 2 of the N-domain has significantly altered the interaction with the sidechain of the residue in position 17 of the peptide. It is interesting to note that two different replacements for a single residue in the protein result in significantly different affinities.
Table I - Dissociation constants of wildtype and four mutant calmodulins for WFF and FFW peptides and fluorescence enhancement (at λ = 330 nm) upon complex formation
CaM     WFF peptide: Kd (nM), F/F0     FFW peptide: Kd (nM), F/F0
WT
Figure 2: Sedimentation velocity analysis of the interaction between bisANS and bacteriophage P22 coat protein. Coat protein in the presence of 30 µM bisANS; c0 = 0.3 mg/ml. The abscissa is s* (svedbergs); the right-hand ordinate has units of mg·ml⁻¹·svedberg⁻¹. Sedimentation was carried out at 56,000 rpm at 20 °C; t = 7304 sec in a Beckman XL-A ultracentrifuge equipped with a photoelectric scanner and video-based on-line Rayleigh optical system similar to the one installed on the Model-E but using a high resolution Kodak MegaPlus 1.4 digital camera (Stafford, to be described elsewhere).
In order to determine whether both the monomer and dimer of coat protein were capable of binding bisANS, the sedimentation velocity runs were performed in a Beckman XL-A analytical ultracentrifuge equipped with both photoelectric scanner and refractive optics (Fig. 2). Absorbance at 390 nm was followed to monitor the distribution of bisANS, while Rayleigh optics were employed to monitor the distribution of the coat protein. The distribution of the bisANS is superimposable with that of the coat protein in both monomer and dimer form. Therefore, it is apparent that both monomeric and dimeric coat protein molecules are capable of binding bisANS. This result suggests that the binding of bisANS to the protein subunit induces a subtle conformational change leading to dimerization, rather than directly mediating the interaction.
IV. Conclusion
This chapter has discussed the use of the apparent sedimentation coefficient distribution function, g(s*) vs. s*, as a tool for studying both interacting and non-interacting systems, especially at low concentrations. The apparent distribution function can be computed from the time derivative of sedimentation concentration curves. The relatively high precision afforded by combining use of the time derivative with signal averaging allows the analysis of systems at total concentrations of a few micrograms per milliliter with the Rayleigh optical system or on the order of 0.01-0.02 a.u. with the photoelectric scanner system of the Beckman Instruments XL-A analytical ultracentrifuge. These methods can be applied to data obtained with other optical systems, such as fluorescence, to increase the sensitivity further. In the run shown in Figure 2, the ability to obtain concentration measurements in terms of both optical density and refractive index allowed us to determine that
Walter E Stafford III et al.
bisANS bound to both the monomer and the dimer of the coat protein. A dual wavelength determination would not have been feasible in this case because of the high absorbance of bisANS at 280 nm. The weight average sedimentation coefficient can be estimated from the g(s*) patterns by simple integration according to equations 5 and 6, and, in general, if one knows the sedimentation coefficient of each species as well as the stoichiometry, one may obtain an accurate estimate of the equilibrium constants and standard free energies describing the system (Kegeles, 1967; Cann, 1970; Stafford, 1994b). Thus, the new techniques will allow the investigation of interacting systems that were previously inaccessible to analysis by analytical ultracentrifugation.
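Equations 5 and 6 are not reproduced in this part of the chapter, so the Python sketch below assumes the commonly used form of the weight-average sedimentation coefficient, the ratio of the integral of s*·g(s*) to the integral of g(s*), evaluated with the trapezoidal rule. The g(s*) pattern is a hypothetical two-boundary example (equal amounts sedimenting at 3.0 S and 5.5 S), not data from Figure 2, and the function names are illustrative.

```python
import math

def trapezoid(xs, ys):
    """Trapezoidal integral of ys over xs (spacing need not be uniform)."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))

def weight_average_s(s_star, g):
    """Assumed form: s_w = integral(s* g(s*) ds*) / integral(g(s*) ds*)."""
    numerator = trapezoid(s_star, [s * gi for s, gi in zip(s_star, g)])
    return numerator / trapezoid(s_star, g)

def gaussian(s, center, width):
    return math.exp(-0.5 * ((s - center) / width) ** 2)

# Hypothetical pattern: two equal-area boundaries on a 0-8 S grid.
s_grid = [0.05 * i for i in range(161)]
g_demo = [0.5 * gaussian(s, 3.0, 0.3) + 0.5 * gaussian(s, 5.5, 0.3) for s in s_grid]
s_w = weight_average_s(s_grid, g_demo)
```

For two boundaries of equal area the weight average falls midway between them, here near 4.25 S; shifting material from the slower to the faster boundary raises s_w, which is how association is detected.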
Acknowledgement
This work was supported in part by NIH grant GM-47980.
References
Teschke, C.M., King, J. and Prevelige Jr., P.E. (1993). Inhibition of Viral Capsid Assembly by 1,1'-bi(4-anilino)naphthalene-5,5'-disulfonic acid (bisANS). Biochemistry 32: 10658-10665.
Prevelige Jr., P.E., Thomas, D., and King, J. (1993). Nucleation and Growth Phases in the Polymerization of Coat and Scaffolding Subunits into Icosahedral Shells. Biophys. J. 64: 824-835.
Prevelige Jr., P.E., Thomas, D. and King, J. (1988). Scaffolding Protein Regulates the Polymerization of P22 Coat Subunits into Icosahedral Shells in vitro. J. Mol. Biol. 202: 743-757.
Cann, J. R. (1970). Interacting Macromolecules. New York, Academic Press.
Kegeles, G., L. Rhodes and J. L. Bethune (1967). "Sedimentation Behavior of Chemically Reacting Systems." Proc. Natl. Acad. Sci. 58, 45-51.
Liu, S. and W. F. Stafford (1992). "A Real-Time Video-Based Rayleigh Optical System for an Analytical Ultracentrifuge Allowing Imaging of the Entire Centrifuge Cell." Biophys. J. 61, A476, #2745.
Stafford, W. F. (1992a). "Boundary Analysis in Sedimentation Transport Experiments: A Procedure for Obtaining Sedimentation Coefficient Distributions Using the Time Derivative of the Concentration Profile." Anal. Biochem. 203, 295-301.
Stafford, W. F. (1992b). "Sedimentation Boundary Analysis: An Averaging Method for Increasing the Precision of the Rayleigh Optical System by Nearly Two Orders of Magnitude." Biophys. J. 61, A476, #2746.
Stafford, W. F. (1994a). "Methods of Boundary Analysis in Sedimentation Velocity Experiments." In Numerical Computer Methods, Part B, Methods in Enzymology 240, Eds. L. Brand and M. L. Johnson. Orlando, Academic Press.
Stafford, W. F. (1994b). "Sedimentation Boundary Analysis of Interacting Systems: Use of the Apparent Sedimentation Coefficient Distribution Function." In MODERN ANALYTICAL ULTRACENTRIFUGATION: Acquisition and Interpretation of Data for Biological and Synthetic Polymer Systems, Eds. T. M. Schuster and T. M. Laue. Boston, Birkhauser Boston, Inc.
Yphantis, D. A., W. F. Stafford, S. Liu, P. H. Olsen, J. W. Lary, D. B. Hayes, T. P. Moody, T. M. Ridgeway, D. A. Lyons and T. M. Laue (1994). "On Line Data Acquisition for the Rayleigh Interference Optical System of the Analytical Ultracentrifuge." In MODERN ANALYTICAL ULTRACENTRIFUGATION: Acquisition and Interpretation of Data for Biological and Synthetic Polymer Systems, Eds. T. M. Schuster and T. M. Laue. Boston, Birkhauser Boston, Inc.
SECTION VII Protein Conformation and Folding
CYANOGEN AS A CONFORMATIONAL PROBE Richard A. Day, Amy Hignite, and Warren E. Gooden Department of Chemistry University of Cincinnati Cincinnati, OH 45221-0172
I. INTRODUCTION
Cyanogen (ethanedinitrile, N≡C-C≡N, C2N2) is a unique protein reagent. C2N2 drives the condensation of paired groups to form covalent links in a mode similar to that of carbodiimides in peptide and amide bond formation. It differs from carbodiimides in a critical and useful manner: it only drives intra-molecular condensation of paired groups such as salt bridges and does not lead to inter-molecular condensation products (1,2). The intra-molecular changes, i.e., covalent bonds replacing paired groups such as salt bridges, have been shown to involve HIS, ARG, and LYS side-chain functional groups (3,4,5). depsi-Peptide bond formation was also seen (3). Carboxylate is the other component of each pair. Preformed paired groups may be within the same molecule (1-5) or between subunits. The subunits of Hb are rapidly covalently linked with no aggregation beyond α2β2 seen (1,6); the α,β subunits of human chorionic gonadotropin are also linked by the action of C2N2 (7). Hen egg white lysozyme has been shown to associate weakly and heterologously (8,9); C2N2 reacts rapidly with it, producing insoluble aggregates (1). This result suggests that C2N2 can "trap" an associated pair even if formed transiently with an unfavorable equilibrium, as in the case of HEW lysozyme. In principle, a suitable reagent could convert pairs of associated sidechain groups such as salt bridges into stable covalent bonds. If there are distinct and distinguishable sets of associated groups representing two or
Richard A. Day et al.
more conformational states, then employment of the reagent can be expected to (a) interfere with transitions between conformations, (b) provide a means to identify the paired groups formed or disassociated in the transition(s), and (c) through the altered conformational response, affect ligand binding. A candidate reagent should react in a non-perturbing way with naturally formed associated pairs. It appears that cyanogen is perhaps the only reagent that fulfills these requirements. Ribonuclease S, of known crystal structure (10), presents an instructive case. Except in the region of residues 18-23, it is very similar to RNase A (11). C2N2 modifies known salt-bridge pairs (5). The distance between the Cα's of ALA20 and SER21 is 27 Å, calculated from the X-ray diffraction based coordinates (12); however, considerable uncertainty exists relative to structural parameters for residues 18-23 (12). Are residues 20 and 21 transiently associated and susceptible to C2N2-driven condensation? The answer appears to be yes, but not as a reestablished peptide bond. Instead, it appears to be a depsi-peptide ester link. HSA exists as a single polypeptide chain of 585 amino acids. X-ray crystallographic analysis has confirmed that this chain is folded into three domains, each of which is seen to exist as two subdomains (13). HSA undergoes extensive, pH dependent, reversible conformational changes (14). The so-called acid expansion at low pH is accompanied by loss of a large fraction of α-helical structure. We report here the C2N2 treatment of RNase S, which traps a minor conformer, restoring some of the properties of RNase A, and of serum albumin, where C2N2 locks in a low α-helix form and alters ligand binding.
II. Experimental Materials and Methods
A. Protein and Reagents
All chemicals were reagent grade and used without further purification. Cyanogen is a toxic gas and should be handled with care (15); it is not always available commercially.
It can be readily prepared in one step from AgCN (16).

B. Cyanogen Modification of Ribonuclease S

Essentially, the procedures described previously (3,4) were followed.

C. Sequence Analysis of Tryptic Peptides

Sequence analyses of the protein and tryptic peptide samples were
Cyanogen as a Conformational Probe
performed by the Protein Chemistry Core facility of the Department of Pharmacology and Cell Biophysics at the University of Cincinnati, Cincinnati, Ohio.

III. Results and Discussion

A. Ribonuclease S

The covalent attachment of the S-peptide to the S-protein was determined from the relative areas of peaks corresponding to these two components on reverse-phase chromatograms. The enzymatic activity of RNase S, as measured by the method of Crook et al. (17), is modified by C2N2 treatment. There is an initial increase in Vmax and Km, seen within one minute. The heightened activity at one minute coincides with the 86% reduction in peptide 7, SER21-LYS31, from the map (compare Fig. 1b, 1c). Subsequent lowering of activity with longer C2N2 treatment can be attributed to covalent modification of active site HIS12 at a slower rate, as shown where the four HIS ring C2 protons were monitored by NMR (5). A tryptic digest and HPLC analysis of standard RNase A and RNase S using the method of McWherter et al. (18) resulted in the tryptic maps shown in Figures 1a and 1b. The order of elution and identity of the tryptides for RNase S were established to be the same as that reported by McWherter et al. (18) for RNase A (Fig. 1a), with the exception of the two peptides arising from the cleaved ALA20 to SER21 bond characteristic of RNase S. The specific residue(s) involved in the covalent modifications present within the affected tryptides were identified through a combination of amino acid analysis, sequence analysis, molecular modeling, and comparison with the X-ray diffraction based structure (10,11). The areas of peptide 7 (SER21-LYS31) and 5 (ASN62-ARG85 and ASN67-ARG85) are reduced more rapidly than those of other tryptides in the map. For 7 the only candidate groups for reaction are the SER21 α-ammonium, LYS31 ε-ammonium, and SER hydroxyls. Neither X-ray crystallography nor molecular modeling places the ε-amino group in a salt bridge.
However, a brief exposure to pH 10 (5-10 min) restores full enzymatic activity and at the same time makes the C2N2 RNase S susceptible to trypsin once more; pH 10 will have no effect on an ε-carboxamido link. Re-formation of an ALA20-SER21 peptide bond is ruled out by the pH 10 lability of the linkage. The loss of 5 from the map (Fig. 1) is consistent with a cross-link of salt-bridge ASP121 to LYS66, as confirmed by sequence analysis. It has nothing to do with the covalent reattachment of the S-peptide to the S-protein; such
[Figure 1, panels (a)-(c): reverse-phase chromatograms; x-axis, Retention Time (min)]
Figure 1. Reverse-phase chromatographic maps of tryptic digests of reduced carboxymethylated ribonuclease preparations: (a) RNase A control, (b) RNase S control, and (c) RNase S treated for 1 min with C2N2. Modified and unmodified protein samples were analyzed by reverse-phase HPLC (Perkin-Elmer 250 Binary Pumping Model 235, Diode Array Detector, Vydac C-18, #218TP54 Column) using the method of McWherter et al. (18). The sample digests were frozen and stored at -70 °C until analyzed.
side reactions at other pairs cause the trace in Fig. 1c not to be a replica of that in Fig. 1a. Thus, formation of the depsi-ALA20-SER21 link occurs by trapping of a minor conformational state of the RNase S. At one minute C2N2 reaction time the new protein species remains an active enzyme. We conclude that most of the sites targeted by C2N2 are consistent with the X-ray data. Facile deletion of peptide 7 from the map within one minute and restoration of trypsin resistance to RNase S through an alkali-labile link are consistent with formation of an ALA20 to SER21 depsi-peptide link.

B. Human Serum Albumin

Treatment of HSA with C2N2 at pH 4, 7, and 9 gave a protein (HSA-CN) with altered conformational responses to changes in pH and altered ligand binding. Representative data are shown here; all molar ellipticity ([θ], deg M⁻¹ m⁻¹) vs pH profiles have been duplicated one or more times. [θ]λ values are shown at the λ values indicated. Three sets of data are given: pH dependence of [θ] of HSA and HSA-CN (1) in the far UV, (2) in the near UV, and (3) in the presence of Ca²⁺. Monitoring the [θ]227 pH dependence (Fig. 2) shows a variation from -2000 at pH 2-3 to -17,000 at pH 8-9. After C2N2 treatment at pH 7, [θ]227 varies between ~0 and -8000 to -9000 over the same pH range. While not shown, C2N2 modification at pH 4 and pH 9 gives a similar range of [θ]227 values from pH 2-9. The absolute [θ]227 values for the HSA-CN are about one half those of the HSA control. Upon storage of the HSA-CN at pH 9 for several days (with NaN3), the profile slowly reverts to that of natural HSA, with molar ellipticity values returning to those of native HSA. In the near UV the molar ellipticity [θ]292 values for the control HSA are essentially the same as reported in the first CD based study of the N-B transition of HSA (19) at pH 6-9. C2N2 treatment changes [θ]292 from -3500 at pH 9 to -2000 deg M⁻¹ m⁻¹ (not shown).
The ligand Ca²⁺ gives enhanced values of [θ]217 for HSA over the entire pH range when compared with HSA in its absence (Fig. 3). The values range from ~ -15,000 at pH 2-3 to ~ -65,000 at pH 8-9 (Fig. 3). For HSA-CN in the presence of Ca²⁺, the values change to +8000 at pH 2 and to -3000 to -6000 at pH 7-9. This represents a 90-95% change in the molar ellipticity in the α, β band region and presumably reflects a correspondingly large change in the secondary structure. C2N2 treatment reduces the absolute values of molar ellipticities at all wavelengths and at all pH values. This is consistent with a reduction in α-helical content. The changes in the near UV spectra are suggestive of
[Figure 2 panel titles: HSA (0.005 mg/ml); HSA (0.005 mg/ml) and C2N2 (pH 7)]
Figure 2. Molar ellipticity [θ]227 (deg M⁻¹ m⁻¹) as a function of pH of control HSA (left) and of HSA-CN (right). The HSA-CN was formed at pH 7 by passing C2N2 (1 cc) through the HSA solution (1 cc) contained in a 2 cc septum-capped vial. After 1 hour at 25°, the C2N2 was entrained in a stream of N2 passed over the surface of the solution, followed by chromatography over Sephadex G25 and lyophilization of the protein fraction. CD spectra were taken on a Cary 61 spectropolarimeter. Fatty-acid free HSA was acquired from Sigma (St. Louis, MO).
[Figure 3 panel titles: HSA (0.005 mg/ml) with CaCl2 (5 meq/L); HSA (0.005 mg/ml) and C2N2 (pH 7) with CaCl2 (5 meq/L, pH 7)]
Figure 3. Molar ellipticity as a function of pH in the presence of Ca²⁺ (5 meq·L⁻¹ CaCl2) of control HSA (left) and of HSA-CN formed at pH 7 (see caption to Fig. 2).
alteration in tertiary structure. In prior studies C2N2 has not been found to cause significant changes in the CD spectra of conformationally stable proteins such as carbonic anhydrase (16). In no case among our studies cited above has C2N2 treatment led to protein denaturation; in fact, it can lead to protein stabilization (6). Non-specific denaturation of HSA is unlikely for an additional reason, viz., the HSA-CN reverts to HSA at pH 9 and gives once again the CD spectrum of native HSA. All but one of the types of covalent bonds formed by the action of C2N2 are hydrolyzed slowly at pH 9, 25° (1,2,16). We conclude that C2N2 traps conformation(s) similar to that seen at low pH, where α-helix is dramatically reduced.

IV. Conclusions

The cyanogen-treated protein is covalently modified, "locking" the protein in one form. In RNase S the trapping of the enzymatically active, trypsin-resistant form is consistent with a finite amount of an associated ALA20-SER21 pair at any given time. The high-α conformation of HSA is locked into a low-α form by C2N2. Ca²⁺ enhances the difference in secondary structures of HSA and HSA-CN; in fact, Ca²⁺ exerts opposite effects on HSA and HSA-CN.

Note. Cyanogen treatment of human chorionic gonadotropin (hCG), an αβ heterodimer, resulted in cross-linking of a significant fraction of the α- and β-subunits. hCG in which the α-subunit is radioiodinated binds to LH receptors. Treatment of these receptor complexes with cyanogen caused the hCG to become cross-linked to the receptors, as shown by the appearance of a high molecular weight species of approximately 123 kD on SDS-polyacrylamide gels. This complex disappeared upon reduction with β-ME. Thus, the hCG β-subunit appears to be cross-linked to the LH receptors. However, the α- and β-subunits of hCG do not appear to be readily cross-linked to each other under the same conditions.
These observations are consistent with a model in which the conformation of hCG is altered during binding of the hormone to its receptors, a phenomenon that may be studied further through the use of cyanogen cross-linking. (W. R. Moyle, personal communication, also see ref. 7).

Acknowledgment

This work was supported in part by USPHS Grant GM 42697.
References

1. Day, R.A., Kirley, J., Tharp, R., Ficker, D., Strange, C. and Ghenbot, G. (1989). In "Techniques in Protein Chemistry" (T. E. Hugli, ed.), p. 517. Academic Press, San Diego.
2. Day, R.A., Tharp, R.L., Madis, M.E., Wallace, J.A., Silanee, A., Hurt, P. and Mastruserio, N. (1990). Peptide Res. 3, 169.
3. Ghenbot, G., Emge, T. and Day, R.A. (1993). Biochim. Biophys. Acta 1161, 59.
4. Karagozler, A.A., Ghenbot, G. and Day, R.A. (1993). Biopolymers 33, 687.
5. Gooden, W.E., Day, R.A. and Kreishman, G.P. (1993). Proc. 1993 Miami Bio/Technology Winter Symposium 3, 15.
6. Tharp, R.L. (1987). Ph.D. Dissertation. University of Cincinnati.
7. Lin, W., Day, R.A. and Moyle, W.R. (1993). Proc. 1993 Miami Bio/Technology Winter Symposium 3, 19.
8. Sophianopoulos, A.J. (1969). J. Biol. Chem. 244, 3188.
9. Zehavi, U. and Lustig, A. (1969). Biochim. Biophys. Acta 194, 532.
10. Wyckoff, H.W., Tsernoglou, D., Hanson, A.W., Knox, J.R., Lee, B. and Richards, F.M. (1970). J. Biol. Chem. 245, 305.
11. Wlodawer, A. and Sjolin, L. (1983). Biochemistry 22, 2720.
12. Richards, F.M. and Wyckoff, H.W. (1973). In "Atlas of Molecular Structures in Biology" (D.C. Phillips and F.M. Richards, eds.), p. 9, 19. Clarendon Press, Oxford.
13. He, X.M. and Carter, D.C. (1992). Nature 358, 209; Carter, D.C. and Ho, J.X. (1994). Adv. Prot. Chem. 45, 153.
14. Zurawski, V.R. Jr. and Foster, J.F. (1974). Biochemistry 13, 3465.
15. Fassett, D.W. (1983). In "Industrial Hygiene and Toxicology" Second Edition (F.A. Patty, ed.), p. 2003. Interscience Publishers, New York.
16. Kirley, J.W., Day, R.A. and Kreishman, G.P. (1985). FEBS Lett. 193, 145.
17. Crook, E.M., Mathias, A.P. and Rabin, B.R. (1960). Biochem. J. 74, 234.
18. McWherter, C.A., Thannhauser, T.W., Fredrickson, R.A., Zagotta, M.T. and Scheraga, H.A. (1984). Anal. Biochem. 141, 523.
Evaluation of Interactions Between Residues in α-Helices by Exhaustive Conformational Search
Trevor P. Creamer¹, Rajgopal Srinivasan¹, and George D. Rose¹
Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO 63110
I. Introduction

Recently, it was hypothesized that native protein structure is specified by a stereochemical code (1). We have been evaluating this hypothesis by examining high resolution protein structures to extract recurrent patterns and identify formative interactions (2,3). Many patterns are small enough to be analyzed exhaustively by conformational search techniques. Initially, we have focused on the α-helix. Exhaustive conformational searches were performed to evaluate interactions between pairs of hydrophobic side chains in mid-helix positions and to understand the motifs adopted by glycine-terminated helices (2). Both studies are described below.
II. Exhaustive Conformational Search

Exhaustive conformational search is a simple and practical way to explore the entire conformational space available to a peptide (or molecular segment) with fewer than a dozen rotatable bonds. A search is performed by systematically varying each rotatable bond in the peptide. Rotations are made about backbone dihedrals (φ and ψ) and/or side chain torsions (χ). In our work, bond lengths and angles are held rigid. After each rotation, the molecule is checked for steric overlap. If overlap occurs, the conformer is discarded; otherwise it is
1 Current address: Dept. Biophysics and Biophysical Chemistry, Johns Hopkins University School of Medicine, Baltimore, MD 21205
retained for later analysis. Helix nomenclature is: ...-N''-N'-Ncap-N1-N2-N3-...-C3-C2-C1-Ccap-C'-C''-... where residues N1 to C1 define the helix proper and have both helical i → i+4 hydrogen bonds and backbone dihedral angles, φ and ψ, with mean values of -64±7° and -41±7°, respectively (4). Residues Ncap and Ccap depart from helical (φ,ψ) angles, but make one additional helical hydrogen bond. Flanking residues (N', N'',... and C', C'',...) are non-helical.
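The search loop described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' program: the function names, the `build` callback that turns a set of torsion angles into Cartesian coordinates, and the `bonded` exclusion set are all hypothetical, and the 30° step and the 70% van der Waals cutoff are simply the values quoted in Section III.

```python
import itertools
import math


def torsion_grid(n_torsions, step_deg=30):
    """Return an iterator over every combination of torsion angles on a regular grid."""
    angles = range(-180, 180, step_deg)
    return itertools.product(angles, repeat=n_torsions)


def has_steric_overlap(coords, radii, scale=0.70, bonded=frozenset()):
    """True if any atom pair not listed in `bonded` is closer than
    scale * (r_i + r_j). Pairs in `bonded` are (i, j) tuples with i < j."""
    for i, j in itertools.combinations(range(len(coords)), 2):
        if (i, j) in bonded:
            continue
        if math.dist(coords[i], coords[j]) < scale * (radii[i] + radii[j]):
            return True
    return False


def exhaustive_search(n_torsions, build, radii, step_deg=30, bonded=frozenset()):
    """Visit every grid point, build coordinates with the caller-supplied
    `build` function (hypothetical), and keep only overlap-free conformers."""
    kept = []
    for torsions in torsion_grid(n_torsions, step_deg):
        coords = build(torsions)
        if not has_steric_overlap(coords, radii, bonded=bonded):
            kept.append(torsions)
    return kept
```

With a 30° step each torsion takes 12 values, so the grid grows as 12 to the power of the number of rotatable bonds, which is why the approach is practical only for small segments.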
III. Side Chain Interactions Within α-Helices

Hydrophobic interactions between residue side chains in an α-helix are thought to be helix-stabilizing (5). This hypothesis was tested experimentally in isolated helical peptides (6), where stabilizing interactions were found between Leu-Tyr and Tyr-Leu pairs spaced either three or four residues apart. Using exhaustive conformational search, we evaluated the energy, entropy, and free energy of Leu-Phe and Phe-Leu pairs in a model helix, as described below. To model side chain - side chain interactions, a peptide with the sequence CH3CO-(Ala)19-NHCH3 was used, with rigid backbone geometry (φ=-64°, ψ=-41°). The two residues of interest were substituted into middle helix positions and their side chains rotated incrementally. After each rotation, the resultant conformer was checked for steric overlap and discarded whenever any two atoms were closer than 70% of their summed van der Waals radii (values taken from the OPLS non-bonded parameters (7)). When retained, the energy of the conformer was calculated using the AMBER/OPLS forcefield (7,8) (dielectric constant of 78 and temperature of 298 K) and the solvation model of Wesson and Eisenberg (9). At the conclusion of the conformational search, the Boltzmann-averaged energy of the system was calculated using

$$E = \sum_{i=1}^{N} E_i p_i \qquad (1)$$

where E_i is the energy of the ith conformer, p_i is its Boltzmann weighting factor, and the sum is taken over all N conformers generated in the search. The Boltzmann weighting factor for the kth conformer was calculated from the partition function

$$p_k = \frac{e^{-E_k/RT}}{\sum_{i=1}^{N} e^{-E_i/RT}} \qquad (2)$$
where R is the gas constant and T is the temperature (298 K). The conformational entropy of the two side chains was estimated as

$$S = -R \sum_{i=1}^{N} p_i \ln p_i \qquad (3)$$
using weights from Equation 2. Interactions between Leu-Phe and Phe-Leu pairs were analyzed using the protocol described above. Side chains were modeled at positions i (residue 8) and i+2 (residue 10), i+3 (residue 11) or i+4 (residue 12). The (i,i+2) pairs serve as a standard state because, in this position, the side chains are on opposite sides of the helix and cannot interact. Side chain torsions were rotated in increments of 30°. Results are shown in Table I. Although the (i,i+3) and (i,i+4) Leu-Phe pairs have the same overall interaction energy, ΔE-TΔS, they differ in both energy and entropy (Table I). The most stabilizing of the pairs modeled, Phe-Leu at (i,i+4), undergoes a gain in entropy, a consequence of the fact that interactions between the two side chains cause the rotamer populations to be distributed more uniformly across their possible rotamer classes (10). The conformational entropy calculated using Equation 3 is maximal when all probabilities p_i are equal (and nonzero). In agreement with the experimental results of Padmanabhan and Baldwin (6) on Leu-Tyr and Tyr-Leu pairs, we also find stabilizing interactions between these hydrophobic side chains. Favorable van der Waals contacts promote side chain - side chain interaction, and the solvation model (9) also biases the side chains toward hydrophobic burial. These factors are seen in the favorable energy terms, ΔE. Conversely, these same factors can lead to a loss of side chain conformational entropy, -TΔS, since the side chains typically lose conformational freedom (10), although in one case a gain in entropy is observed. We note that the energy in Table I is due to interactions between side chains and does not include the energy of helix formation. In particular, it can be entropically costly to fix residues in a helix (10), but favorable interactions can help "pay for" this cost.
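Equations 1 to 3 translate directly into a few lines of Python. The sketch below is illustrative only: the function names are assumptions, energies are taken to be in kcal/mol, and the energies are shifted by their minimum before exponentiation purely for numerical stability (the shift cancels in the ratio and leaves the weights unchanged).

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def boltzmann_weights(energies, temperature=298.0):
    """Equation 2: p_k = exp(-E_k/RT) / sum_i exp(-E_i/RT)."""
    rt = R_KCAL * temperature
    e_min = min(energies)  # shift cancels in the ratio; avoids overflow
    factors = [math.exp(-(e - e_min) / rt) for e in energies]
    total = sum(factors)
    return [f / total for f in factors]


def average_energy(energies, weights):
    """Equation 1: Boltzmann-averaged energy, sum_i E_i * p_i."""
    return sum(e * p for e, p in zip(energies, weights))


def conformational_entropy(weights):
    """Equation 3: S = -R * sum_i p_i ln p_i (zero-weight terms contribute nothing)."""
    return -R_KCAL * sum(p * math.log(p) for p in weights if p > 0.0)
```

For N conformers of equal energy the weights are all 1/N and the entropy reaches its maximum, R ln N, which is the property of Equation 3 noted in the text. Multiplying an entropy difference by -T gives the -TΔS term in kcal/mol.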
Table I: Energies and entropies (in kcal·mol⁻¹) from the conformational searches for Leu-Phe and Phe-Leu pairs at spacings of (i,i+3) and (i,i+4), normalized against the same pairs at spacings of (i,i+2)

Pair      Spacing   ΔE      -TΔS    ΔE-TΔS
Leu-Phe   i,i+3     -0.32   +0.06   -0.24
Leu-Phe   i,i+4     -0.45   +0.21   -0.24
Phe-Leu   i,i+3     -0.23   +0.05   -0.18
Phe-Leu   i,i+4     -0.12   -0.18   -0.30
IV. Glycine Terminated α-Helices

A second example that illustrates the power of exhaustive conformational searching is found in the recent analysis of glycine terminated helices (Gly at C') (2). Specifically, two recurrent motifs are observed in α-helices terminated by a glycine (Figure 1). In each, the glycine residue adopts a left-handed conformation (i.e., φ > 0). In one case, termed the Schellman motif, a distinctive, doubly hydrogen bonded pattern between backbone partners is observed, consisting of 6 → 1 and 5 → 2 hydrogen bonds between the N-H at C'' and C=O at C3 and between the N-H at C' and C=O at C2. In the other case, termed the αL motif, a 5 → 1 hydrogen bond between the N-H at C' and C=O at C3 is observed. A distinguishing feature of the Schellman motif is the presence of interacting hydrophobic residues at C3 and C'', while in the αL motif C'' is invariably a polar residue. From these observations, stereochemical rules were developed (2). Using these rules, simple visual inspection of the amino acid sequence was sufficient to distinguish the two motifs from each other, and from internal glycines that fail to terminate helices. The key feature of these motifs is that they involve interactions that are local in sequence space, i.e., within a
[Figure 1 panels: Schellman Motif (labels: Hydrophobic Interaction, Hydrogen Bonds); αL Motif (label: Hydrogen Bond); residues labeled C3, C2, C1, C-cap]
Figure 1 : The two helix-terminating Gly motifs.
six residue segment. For this reason, both motifs are ideal candidates for exhaustive modeling. We define a helix stop signal to be a residue sequence for which it is energetically more favorable to terminate the helix than to continue it. Under this definition, the two motifs are helix stop signals, as can be shown by exhaustive conformational analysis of suitable peptides.
A. The Schellman Motif

For the Schellman motif, modeling was used to address three questions. For each question, the model peptide consisted of a helical fragment followed by a Schellman sequence. In detail, all conformations of the peptide CH3CO-Leu-Ala-Ala-Ala-Gly-Ile-NHCH3 were generated, subject to the constraint that backbone torsions of the subsequence ...-Leu-Ala-Ala-... were maintained in helical conformation ( D-Ser)
(D-Phe → D-Asn) (D-Phe → D-His)
A Water-Soluble β-Sheet Peptide
The following 12-residue peptides were synthesized in the same manner, starting with Boc-Val-PAM resin, but with the ε-amino group of lysine protected with Fmoc (instead of Cl-Z):
KLKFPKVKLFPV
Peptide 6: Peptide 7:
ILKSPKVILSPV GLKSPKVILSPV
It should be noted that substitution of lysine for ornithine has no effect on the structure or activity of the peptide (12). The blocked peptides were cleaved from the resin using anhydrous HF in the presence of anisole. All peptides were purified prior to cyclization using reversed phase (C8) HPLC with a linear AB gradient, where A = 0.05% TFA/H2O and B = 0.05% TFA/acetonitrile. Cyclizations were performed at concentrations of ~2 mg/mL in dichloromethane using 1.2 equivalents each of N,N'-dicyclohexylcarbodiimide, N-hydroxybenzotriazole and diisopropylethylamine. The progress of the cyclization reaction was monitored by both reversed phase HPLC and plasma desorption TOF mass spectrometry. For gramicidin S and Peptides 1-3, cyclizations were typically complete in 6 hours, with final overall yields ranging from 45-90%. These high yields were achieved with no indication of racemization, while using only the free peptide as the starting material. Peptides 4-7 took longer to cyclize (~24 hrs) and the overall yields tended to be lower (5-10%). Fmoc groups on Peptides 5, 6 and 7 were removed after the cyclization step by treatment with piperidine (2 hr). Final purification was achieved using reversed phase HPLC.

B. Spectroscopic Characterization (NMR and CD)
All NMR spectra were collected on a Varian Unity 500 MHz spectrometer (¹H frequency 499.8 MHz) equipped with a 5 mm inverse detection probe. Sample concentrations were typically 1-2 mM and sample temperatures were maintained at 25 °C (unless otherwise noted). Sample pH was typically 3.5-4.0. One-dimensional ¹H data were acquired with a ¹H sweep width of 6000 Hz and an acquisition time of 2.3 seconds. The residual water signal was suppressed by presaturation. ¹H DQF-COSY, NOESY and TOCSY (15) spectra were collected and processed using standard methods. All chemical shifts were referenced relative to internal DSS. CD samples (~1 mg/mL) were prepared by dissolving the peptides in a 10 mM sodium acetate buffer (pH 5.5) and sonicating for approximately 1 minute. Insoluble material was removed by centrifugation. CD spectra were recorded at 25 °C (unless otherwise noted) on a Jasco J-500C spectropolarimeter using a 0.02 cm pathlength cell attached to a circulating water bath. CD spectra represent the average of four scans collected over a wavelength interval of 190 to 250 nm. Ellipticity is reported as mean residue ellipticity [θ], with an approximate error of ±500° at 220 nm.
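For readers converting raw instrument readings, the usual definition of mean residue ellipticity can be written as a short Python function. This is a sketch based on the standard textbook conversion, not a description of the authors' processing: the function name is hypothetical, the observed ellipticity is assumed to be in millidegrees, and the result is in deg cm² dmol⁻¹. The 0.02 cm pathlength and ~1 mg/mL concentration in the usage example are the values given above; the mean residue weight of 110 Da is an assumed placeholder.

```python
def mean_residue_ellipticity(theta_mdeg, mean_residue_weight, path_cm, conc_mg_per_ml):
    """Convert observed ellipticity (mdeg) to mean residue ellipticity.

    [theta] = theta_obs(mdeg) * MRW / (10 * l(cm) * c(mg/mL)),
    where MRW is the peptide molecular weight divided by its number of residues.
    """
    return theta_mdeg * mean_residue_weight / (10.0 * path_cm * conc_mg_per_ml)


# Usage: an observed signal of -10 mdeg in a 0.02 cm cell at 1 mg/mL,
# with an assumed mean residue weight of 110 Da.
example = mean_residue_ellipticity(-10.0, 110.0, 0.02, 1.0)
```

Because the short pathlength appears in the denominator, small errors in the measured signal or concentration scale strongly in [θ], which is consistent with quoting an explicit error estimate at 220 nm.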
III. Results

A. Substitution Effects of D-Phe on Solubility and Structure
Gramicidin S is not readily soluble in water and often precipitates in the presence of divalent counter ions (HPO4²⁻). In order to design β-sheet analogs that were more water soluble and less sensitive to salt or pH, we investigated the effect of replacing the most hydrophobic amino acid (D-Phe) in gramicidin S with a series
David S. Wishart et al.
of polar amino acids. Analogs were synthesized with D-Tyr (Peptide 1), D-Ser (Peptide 2), D-Asn (Peptide 3) and D-His (Peptide 4) in the 4 and 4' positions (numbering according to reference 12). Three of the four peptides were found to be significantly more soluble than native gramicidin S, with Peptides 1 and 2 being soluble to >10 mg/mL and Peptide 4 being soluble to 8.5 mg/mL. In addition, all four peptides were analyzed by NMR and far UV CD spectroscopy to characterize their structure. For Peptides 1, 2 and 3, chemical shifts, coupling constants, nOe connectivities and far UV CD spectra are all consistent with a β-sheet structure similar to gramicidin S. Peptide 4, however, appears to retain very little β-sheet structure. These results are summarized in Figure 2, where the chemical shift index (16), derived from α-¹H NMR chemical shifts, is plotted for each analog.
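The criterion stated in the Figure 2 caption (a cluster of three or more positive chemical shift index values indicates β-sheet) can be expressed as a small run-finding routine. The Python sketch below implements only that caption rule; the function name is hypothetical, the input is assumed to be a per-residue list of CSI values (-1, 0 or +1), and the full chemical shift index procedure of reference 16 involves additional steps that are not reproduced here.

```python
def beta_strand_segments(csi, min_run=3):
    """Return (start, end) pairs, 0-based and inclusive, for every run of
    consecutive positive CSI values that is at least `min_run` residues long."""
    segments = []
    start = None
    for idx, value in enumerate(csi):
        if value > 0:
            if start is None:
                start = idx          # a new positive run begins here
        else:
            if start is not None and idx - start >= min_run:
                segments.append((start, idx - 1))
            start = None             # run broken by a zero or negative value
    # close a run that extends to the end of the sequence
    if start is not None and len(csi) - start >= min_run:
        segments.append((start, len(csi) - 1))
    return segments
```

Note that for a cyclic peptide a run could in principle wrap from the last residue back to the first; this linear sketch does not handle that case.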
[Figure 2 panels: Peptide 1, V K L Y P V K L Y P; Peptide 3, V K L N P V K L N P; Peptide 4, V K L H P V K L H P]
Figure 2. Chemical Shift Index plots of Peptides 1, 2, 3 and 4. Arrows indicate the location of β-sheets in these peptides. Clusters of three or more positive chemical shift index (CSI) values are indicative of a β-sheet.

Overall, these results suggest that it is possible to greatly increase the solubility of this β-sheet model without significantly disrupting the structure. They also suggest that some residues (His in particular) can disrupt the type II' β-turn and eliminate most of the β-sheet structure. This result also implies that it may be possible to use host-guest techniques (17) to study type II' β-turn propensities with this system.

B. Effects of Extending the β-sheet
An unexpected result concerning the 10 residue β-sheet analogs was their remarkable stability. None of the peptide models exhibited any significant structural change upon heating to 85 °C or upon addition of significant quantities of chaotropic solvents. To make these peptides more susceptible to denaturation, the β-sheet was extended by two residues. This chain extension was expected to increase the chain entropy, thereby reducing the thermal stability of the peptide. Unexpectedly, the addition of two lysines to the hydrophilic side of gramicidin S (Peptide 5) significantly reduced its β-sheet content under benign conditions. However, the addition of the structure-inducing solvent trifluoroethanol (TFE) actually enhanced the β-sheet content of this molecule (Figure 3a). While TFE is commonly used to induce helical
structure in peptides, we believe this represents one of the few instances where TFE has been used to induce the formation of β-sheet structure (18).
[Figure 3, panels A and B: CD spectra; x-axis, Wavelength (nm), 190 to 250; curve labels: TFE (panel A); 5 °C and 85 °C (panel B)]
Figure 3. A) CD spectrum of Peptide 5 with TFE (51% β-sheet) and without TFE (26% β-sheet). B) CD spectrum of Peptide 5 at 5 °C and 85 °C. Spectra were analyzed using the program RBOCON (R.F. Boyko, unpublished).

Another interesting feature of this extended β-sheet model can be seen in Figure 3b, where we show the effect of temperature on the CD spectrum of Peptide 5. Curiously, when the sample temperature is decreased (to 5 °C), the spectrum takes on more of a "random coil" character (only 20% β-sheet); but, when the temperature is increased to 85 °C, the spectrum exhibits significantly more β-sheet character (38% β-sheet). In other words, high temperatures induce structure and cold temperatures reduce structure. We believe that this represents an excellent example of cold denaturation (19), and it suggests that the thermodynamics of β-sheet formation may be more complex than currently appreciated.

C. Stabilizing and Destabilizing Amino Acid Substitutions
To enhance the β-sheet content of Peptide 5, two of its lysines were exchanged for isoleucines. Isoleucine is known to have a stronger β-sheet propensity than lysine (20). However, because these changes were expected to reduce the solubility of the peptide, the two phenylalanines were exchanged for serines. The resulting construct was called Peptide 6. A second construct (Peptide 7) was synthesized wherein one of the isoleucine residues was substituted with a glycine. This substitution was predicted to reduce the β-sheet content of the peptide. In Figure 4 we compare the CD spectra of Peptides 6 and 7. As expected, the spectrum for Peptide 6 has substantially more β-sheet than Peptide 7 (31% β-sheet vs. 2% β-sheet). Indeed, the CD spectrum for Peptide 7 closely resembles that of a classic random coil (21). Furthermore, as judged by the overall shape of the CD curve, the spectrum for Peptide 6 appears to have slightly more β-sheet (31%) than Peptide 5 (26%), as expected. It is also worth noting that Peptide 6, just like Peptide 5, exhibits features of cold denaturation and TFE induced structure stabilization (data not shown). These results suggest that a model based on the sequence of Peptide 6 has many of the features required of an ideal β-sheet model.
[Figure 4: CD spectra; curves labeled Peptide 6 and Peptide 7; x-axis, Wavelength (nm), 190 to 250]
Figure 4. CD spectra of Peptide 6 and Peptide 7 collected at 25 °C under benign (aqueous) conditions.
IV. Conclusions

This report describes our efforts at designing, synthesizing and characterizing a water-soluble β-sheet analog. We think that we have succeeded in designing a 12 residue cyclic peptide (Peptide 6) which satisfies most of the criteria required of a model β-sheet: it is small (< 20 residues), monomeric, water-soluble, pure (composed of only β-sheets and β-turns), mostly amphipathic, reversibly denaturable, composed of only natural amino acids, relatively easily synthesized and easily characterized by either CD or NMR. We plan to refine this model to enhance its amphipathicity and to improve its cyclization efficiency. Once this refinement stage is complete, we will begin to systematically investigate the influence of amino acid substitutions on both the hydrophilic and hydrophobic sides of this peptide. The resultant data will be used to extract specific β-sheet propensities for all 20 naturally occurring amino acids. In addition to this work on monomeric β-sheets, we are beginning to study dimeric β-sheets (β-sandwiches) by preparing a variety of disulfide-linked β-sheet analogs. This will allow us to investigate the influences of side chain packing and hydrophobic effects on the stabilization of "idealized" β-sandwiches. We are hopeful that these model systems will provide researchers with the detailed information they need to understand the intricacies of β-sheet and β-turn formation in natural proteins.
References

1. Padmanabhan, S., Marqusee, S., Ridgeway, T., Laue, T.M., and Baldwin, R.L. (1990) Nature 344, 268-270.
2. O'Neil, K.T., and DeGrado, W.F. (1990) Science 250, 646-651.
3. Zhou, N.E., Kay, C.M., and Hodges, R.S. (1992) J. Biol. Chem. 267, 2664-2670.
4. Hill, C.P., Anderson, D.H., Wesson, L., DeGrado, W.F., and Eisenberg, D. (1990) Science 249, 543-546.
5. Horovitz, A., Matthews, J.M., and Fersht, A.R. (1992) J. Mol. Biol. 227, 560-568.
6. Hartman, R., Schwaner, R.C., and Hermans, J. (1974) J. Mol. Biol. 175, 195-212.
7. Kemp, D.S. (1990) Trends Biotechnol. 8, 249-255.
8. Diaz, H., Tsang, K.Y., Choo, D., and Kelly, J.W. (1993) Tetrahedron 49, 3533-3545.
9. Minor, D.L., and Kim, P.S. (1994) Nature 367, 660-663.
10. Smith, K., Withka, J.M., and Regan, L. (1994) Biochemistry 33, 5510-5517.
11. Kim, C.A., and Berg, J.M. (1993) Nature 362, 267-270.
12. Izumiya, N., Kato, T., Aoyagi, H., Waki, M., and Kondo, M. (1979) "Synthetic Aspects of Biologically Active Cyclic Peptides - Gramicidin S and Tyrocidines", Kodansha Ltd., Tokyo.
13. Hull, S.E., Karlson, R., Main, P., Woolfson, M.M., and Dodson, E.J. (1978) Nature 275, 206-207.
14. Krauss, E.M., and Chan, S.I. (1982) J. Am. Chem. Soc. 104, 6953-6961.
15. Wuthrich, K. (1986) "NMR of Proteins and Nucleic Acids", J. Wiley & Sons, New York.
16. Wishart, D.S., Sykes, B.D., and Richards, F.M. (1992) Biochemistry 31, 1647-1651.
17. Scheraga, H.A. (1978) J. Pure Appl. Chem. 50, 315-324.
18. Sonnichsen, F.D., Van Eyk, J.E., Hodges, R.S., and Sykes, B.D. (1992) Biochemistry 31, 8790-8798.
19. Privalov, P.L., Griko, Y.V., Venyaminov, S.Y., and Kutyshenko, V.P. (1986) J. Mol. Biol. 190, 487-498.
20. Chou, P.Y., and Fasman, G.D. (1974) Biochemistry 13, 211-222.
21. Johnson, W.C. (1990) Proteins: Struct. Funct. Genet. 7, 205-214.
Automated Analysis of Protein Folding
Richard A. Smith¹, Jack Henkin², and Thomas F. Holzman¹,³
¹Protein Biochemistry and ²Thrombolytics Research, Discovery Research, Abbott Laboratories, Abbott Park, IL 60064. ³To whom correspondence is addressed at D-46Y.
I. Introduction In recent years the need to obtain recombinant proteins has increased dramatically in the pharmaceutical and biotechnology-related industries. The ability to define and understand pathways for folding recombinant proteins can often be a rate-limiting step in the preparation of such proteins for use in diagnostic tests, drug screening, or for structural analysis by NMR and X-ray crystallography. These proteins are routinely obtained through heterologous over-expression in prokaryotic hosts. Unfortunately, instead of producing soluble folded protein, high-level expression often results in formation of inclusion bodies composed of partially folded, or misfolded, protein. In addition to being misfolded, proteins in inclusion bodies often have either mispaired or unpaired disulfide bonds. Since eukaryotic expression does not usually result in inclusion body formation, high-level expression in eukaryotic hosts can be pursued as a solution to this problem. However, obtaining high levels of intracellular expression, and concomitant secretion, routinely require substantially more effort and time to produce levels equivalent to those observed in prokaryotes. It is possible that metabolic engineering of prokaryotic hosts for enhancement of expression of native proteins may lead to "expression-tailored" organisms (1), but these efforts are clearly in their infancy. Methods presently employed for obtaining correctly refolded proteins from inclusion body preparations are often allor-none propositions. They typically consist of denaturant solubilization, in urea or guanidine, followed by dilution or Gradient dialysis (2). Recovery of native activity or Controller structure may be aided by using additives and Data (enzyme inhibitors, co-factors, oxidationreduction couples, etc.), which act to Acquisition stabilize the native-state protein conformation. 
However, because such efforts are time-consuming and tedious, systematic examinations of solution conditions for protein folding/unfolding are rarely performed. Some years ago, in an effort to improve previous approaches (3-9), we developed an HPLC-derived method to automate analysis of protein folding and preparative recovery of refolded recombinant proteins. As we recently described (10), this approach makes analyses easier to perform, permits exploration of an array of refolding conditions, and affords reproducibility at a preparative scale. In this method, continuous-flow injection into a narrow-bore, open-channel tube is used for both equilibrium and time-resolved kinetic folding experiments (Fig. 1). With appropriate in-line detectors it is possible to measure, nearly simultaneously, changes in protein quaternary, tertiary and secondary structure in response to chemical denaturants and solvent additives. Changes in protein quaternary structure or aggregation state are monitored by either simple light scattering in the UV or, if resources permit, use of a detector designed specifically for light-scattering analyses. Tertiary structure changes are monitored by zero-order UV protein absorbance spectra, the resulting second-derivative UV spectra, and changes in protein tryptophan fluorescence. Changes in secondary structure are monitored by CD. Although a number of system configurations are possible, the instrumentation described here consists of a commonly available, temperature-controlled, ternary HPLC system and LC detectors with data capture and analysis software (Fig. 1). It employs a combination of narrow-bore tubing and HPLC microbore static and dynamic mixers to give a total, pre-sample mixing, dead volume of less than 1 mL and sequential volumes of individual detectors ranging from 12 to 40 µL.
Figure 1. Diagram of HPLC-based protein folding system. Thin lines represent bi-directional computer communications and data capture; thick lines represent fluid/buffer flow towards fraction collector. Component labels include the gradient controller and data acquisition unit and the CD detector.
TECHNIQUES IN PROTEIN CHEMISTRY VI Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
In comparison to standard manual mixing techniques the system affords fast, repetitive, unattended analyses of folding/unfolding combined with high data-capture rates. For example, after several hours of preparation, a typical denaturation profile by manual mixing might comprise 20-30 samples between a folded state in buffer and an unfolded state in high denaturant, with each sample individually analyzed by UV, fluorescence and CD. Automation of a single run on the flowing sample yields several hundred to several thousand data points each for the UV, fluorescence, and CD measurements. Because the method employs standard HPLC control software, it is possible to establish with ease both reproducibility and reversibility of equilibrium folding/unfolding. When combined with various oxidant/reductant electrochemical couples, the method can be used in both analytical and preparative modes to perform automated refolding of proteins containing disulfides. Finally, the precise control of sample and denaturant flow rates, combined with variable volume delays to detectors, permits, in principle, selective observation and manipulation of a particular kinetic folding process, independent of time of refolding.
Figure 2. Schematic depiction of potential modes of instrument operation. Signals due to folding can be observed at equilibrium (upper panel) or kinetically before equilibrium is attained (lower panel). Insets depict hypothetical spectral differences that may be observed at different times (volumes) after mixing. Panel labels: Signals from Probes of Denaturation/Renaturation; Constant Total Protein; Denaturant Gradient; Mixing and Delay for Equilibrium; Increasing Time and Volume After Mixing.
II. Materials and Methods
HPLC SYSTEM AND CHARACTERISTICS. This technique was developed with the use of a ternary HPLC system employing three microbore-capable, stepper-motor controlled, reciprocating HPLC pumps (Fig. 1). Specifically, the experiments presented here were accomplished using the following equipment: 1) a Beckman Ternary HPLC comprising one 126 binary pump and one 116 single pump (both pumps were equipped with programmable quaternary solvent selection valves); 2) a Beckman 166 variable wavelength UV detector to monitor at 230 nm the gradient formed at the primary mixer; 3) a Beckman 168 diode array UV-Vis detector to acquire protein spectra from 200 to 400 nm (1 nm resolution and individual absorbance measurements at 250 and 278 nm); 4) a Shimadzu RF 551 HPLC fluorescence detector to record the intrinsic Trp fluorescence; and 5) a Jasco J600 spectropolarimeter to monitor changes in CD signal, typically at 222 nm. In addition, both static and dynamic microbore mixers and a modified pulse-damping system were added to the HPLC. The protein pump is primed off-line, and the protein is briefly recirculated to the reservoir to establish uniform concentrations throughout the pump/dampener system for protein delivery. Although the priming process consumes no protein sample, the system "dead" volume up to the secondary mixer (Fig. 1) is ~1 mL. In typical use the protein flow rate is ~5-10% of the total and varies between ~5 and 40 µL/min for a ~100-120 min run. For example, a 100 minute run using a sample of protein at 0.5 mg/mL flowing at 20 µL/min consumes 1.0 mg. Thus, a single automated folding/unfolding experiment consumes significantly less protein than a manual mixing experiment in which 10-30 0.5 mL samples are prepared, each at 0.5 mg/mL. We routinely set the system up to perform 6-12 runs overnight to establish reproducibility. The use of the system in this fashion relies on flow accuracy and reproducibility at rates as low as 3.0 µL/min. Standard reciprocating HPLC pumps will not suffice; they lack flow-rate accuracy and often exhibit pronounced solvent pulsation. In contrast, microprocessor control, combined with a modified pulse-damping system, provides peak-to-peak protein pulses of less than ~10 ^A.U. at the 280 nm protein absorbance maximum. As an alternative, a syringe-type pump system could, in principle, yield pulseless flow until a refill cycle occurred.
Figure 3. System characterization for shape and position of absorbance-derived signals for urea unfolding gradients alone. Upper panel depicts the average of 8 runs measured at each detector and the associated high/low statistical error limits computed at a confidence interval of 99.9%. Inset to upper panel depicts measured urea phase delay between the averaged runs (average delay = 21.1 min at 0.2 mL/min). The lower panel depicts the averaged data phase-corrected. The lower panel inset depicts the residual differences in urea concentrations between the two absorbance detectors.
Figure 4. System characterization for shape and position of absorbance-derived signals for urea folding gradients alone. Upper panel depicts the average of 5 runs measured at each detector and the associated high/low statistical error limits computed at a confidence interval of 99.9%. Inset to upper panel depicts measured urea phase delay between the averaged runs. The broad "spikes" of urea occurring after ~100 min in the upper panel are due to system re-equilibration. The lower panel depicts the averaged data phase-corrected. The lower panel inset depicts the residual differences in urea concentrations between the two absorbance detectors.
Execution of multiple folding/unfolding runs and coincident data capture is accomplished using the standard HPLC system control software provided by Beckman Instruments. The rate of simultaneous data capture from all detectors is variable from 2 Hz to ~20 Hz; we typically collect data at 2 Hz. With all detectors on-line, a single run produces about 2 MB of data stored in binary form. Data are parsed, analyzed, and plotted using a combination of programs. Beckman System Gold and Array-View software, ASYST Software from Keithley Instruments and Turbo Pascal for Windows are used for initial file conversions and parsing. Final data parsing and plotting are performed with Microsoft Excel 5.0 and Charisma 2.1 from Micrografx. After parsing and analysis, a typical single run collected at 2 Hz produces 4 to 8 MB of uncompressed data files. Because a series of overnight runs can consume >100 MB of disk space, we find it convenient to store intermediate and final data sets on inexpensive magneto-optical media.
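The sample-consumption and data-rate figures quoted in the HPLC system description follow from simple flow arithmetic. The short Python sketch below reproduces them; the function names are illustrative choices, not part of any instrument software, and the manual-mixing range simply multiplies out the 10-30 samples of 0.5 mL at 0.5 mg/mL stated in the text.

```python
def protein_consumed_mg(run_min, flow_ul_per_min, conc_mg_per_ml):
    """Protein mass delivered by the protein pump over one run."""
    volume_ml = run_min * flow_ul_per_min / 1000.0  # convert uL to mL
    return volume_ml * conc_mg_per_ml


def manual_protein_mg(n_samples, sample_ml=0.5, conc_mg_per_ml=0.5):
    """Protein mass needed for a set of manually mixed samples."""
    return n_samples * sample_ml * conc_mg_per_ml


def samples_per_channel(run_min, rate_hz):
    """Number of readings one detector channel records during a run."""
    return int(run_min * 60 * rate_hz)


automated = protein_consumed_mg(run_min=100, flow_ul_per_min=20, conc_mg_per_ml=0.5)
manual_low, manual_high = manual_protein_mg(10), manual_protein_mg(30)
readings = samples_per_channel(run_min=100, rate_hz=2)

print(f"automated run: {automated:.1f} mg")
print(f"manual mixing: {manual_low:.1f} to {manual_high:.1f} mg")
print(f"readings per channel at 2 Hz: {readings}")
```

Under these assumptions one automated run uses 1.0 mg of protein, against 2.5 to 7.5 mg for the manual series.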
Figure 5. Fraction unfolded FKBP measured by automated folding analysis. The experiment was performed by flow-injecting a constant 10% stream of FKBP in 9.31 M buffered urea into the secondary mixer (Fig. 1) and performing automated refolding using intrinsic tryptophan fluorescence (11) as described in Fig. 4.
Figure 6. Equilibrium unfolding of low molecular weight urokinase observed by intrinsic tryptophan fluorescence using manual mixing (upper panel) and continuous flow analysis (lower panel). Samples for manual analysis equilibrated for >1 hr before reading. The vertical arrow in the upper panel indicates fluorescence signal at 365 nm from excitation at 290 nm. Data in lower panel were measured using the same excitation and emission wavelengths. Both sets of measurements indicate two transitions in fluorescence quenching. Manual measurements were made with a Shimadzu RF-5000U spectrofluorometer. Panel titles: Static Manual Measurements (upper; x-axis Wavelength, nm) and Continuous Flow Measurement (lower; x-axis Urea (M)).
POTENTIAL MODES OF OPERATION FOR ANALYSIS OF FOLDING. Schematic diagrams showing the signals expected from the two possible modes of instrument operation are presented in Figure 2. In the upper panel signals are observed after a delay for the attainment of equilibrium. In equilibrium mode a denaturant gradient is formed at the primary mixer (Fig. 1) and observed after passage through delay tubing sufficiently long for the total fluid flow rate selected. If the same flow rate is used with a much shorter delay and a constant concentration of "denaturant", then the signals observed are "pre-equilibrium". This latter mode is termed "isocratic freeze-frame" and is shown in the lower panel of Fig. 2. The signals represent folding or unfolding kinetic events occurring within the time domain after protein is mixed with buffer/denaturant at the secondary mixer (Fig. 1). Although we do not present here any data for pre-equilibrium measurements, the potential utility of such measurements is of note. If thorough mixing of protein with buffer/denaturant is attained at the secondary mixer, then a detector(s) placed immediately after the secondary mixer observes a static "freeze-frame" signal that is earlier in "time" than a detector observing a static signal at equilibrium, after a long delay (Fig. 1). This means that it is possible to make static, time-independent, spectroscopic measurements of folding events within the observable kinetic time frame between initiation of unfolding/folding at mixing and the point at which equilibrium is attained. This is accomplished simply by varying flow rate or by altering delay tubing length.
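The relation between delay volume, flow rate, and the "time after mixing" seen by a detector can be sketched in a few lines. This is a minimal illustration that assumes ideal plug flow with no dispersion in the delay tubing; the 21.1 min delay at 0.2 mL/min is the value reported for the equilibrium configuration, whereas the 0.5 mL/min flow rate and the 0.1 mL loop are hypothetical settings chosen only to show the scaling.

```python
def delay_volume_ml(delay_min, flow_ml_per_min):
    """Volume between mixer and detector implied by a measured delay."""
    return delay_min * flow_ml_per_min


def time_after_mixing_min(loop_volume_ml, flow_ml_per_min):
    """Age of the mixed sample when it reaches the detector (plug flow assumed)."""
    return loop_volume_ml / flow_ml_per_min


# Equilibrium configuration from the text: 21.1 min delay at 0.2 mL/min.
loop_ml = delay_volume_ml(delay_min=21.1, flow_ml_per_min=0.2)

# Hypothetical settings: same loop at a higher flow rate, and a much shorter loop.
faster_flow_min = time_after_mixing_min(loop_ml, flow_ml_per_min=0.5)
short_loop_min = time_after_mixing_min(0.1, flow_ml_per_min=0.2)

print(f"delay loop volume: {loop_ml:.2f} mL")
print(f"same loop at 0.5 mL/min: {faster_flow_min:.2f} min after mixing")
print(f"0.1 mL loop at 0.2 mL/min: {short_loop_min:.2f} min after mixing")
```

Either knob, flow rate or tubing length, moves the observation point along the kinetic time axis while the detector itself sees a steady signal.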
Figure 7. Activity measurements of low molecular weight urokinase refolded by continuous flow from 9.3 M urea to 20 mM Bis-Tris, pH 7.3 buffer and collected during run (Fig. 1). Circles indicate urea concentrations in each fraction as measured by refractive index. Squares indicate enzyme activity measured with urokinase substrate S-2444 (12). Inset shows a plot of activity versus urea concentration. The x-axis is Fraction Number.
Finally, the ability to perform equilibrium and pre-equilibrium measurements can be combined with other experimental protocols. The concentration of protein can be varied automatically to determine the effects of concentration on folding and to observe, through sample scattering, the optimal concentration at which to perform refolding experiments. Folding measurements can be combined with reductant/oxidant couples (reduced/oxidized DTT or glutathione) to superimpose a specific redox potential on an experimental run. For example, for the refolding of reduced and unfolded protein from urea, a gradient ratio of reduced to oxidized DTT can be co-injected with protein to enhance intra/intermolecular disulfide exchange.
SYSTEM CHARACTERIZATION AND SUITABILITY TESTS. In Figures 3 and 4 system characterization tests of unfolding and refolding gradients of urea alone are presented. The stock urea concentration was measured by refractive index (2). In both figures gradients were programmed to extend from 0-90% stock urea (9.82 M), or vice versa. In both experiments water was the remaining 10% fluid. In Figure 3 eight sequential runs of an unfolding urea gradient were collected overnight. The gradient reproducibility was so high that the computed error curves differ only slightly from the averaged data. In these unfolding runs the tubing length and secondary mixer produced a delay of ~21 min between the gradient detector (Fig. 1) and the secondary UV detector used to monitor signals from folding. This test demonstrates little or no degradation in the denaturant gradient due to passage through the secondary mixer and the tubing delay (Fig. 1). The residual differences in the phase-corrected scans (Fig. 3, lower panel inset) indicate urea concentration variations between these two points in the system are on the order of 0.1 to 0.2 M across the entire denaturation gradient.
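The phase correction between the two absorbance detectors can be illustrated with synthetic traces. The sketch below is not the authors' analysis code: it assumes an ideal linear 0-90% gradient of the 9.82 M stock, a hypothetical 100 min ramp sampled once per minute, and a second trace that is simply the first one delayed by 21 min. The delay is then recovered by a brute-force least-squares search over integer shifts, and residuals are computed after alignment.

```python
STOCK_UREA_M = 9.82  # stock urea concentration quoted in the text


def programmed_urea(t_min, ramp_min=100.0, max_frac=0.90):
    """Urea (M) for a linear 0 -> 90% stock gradient; ramp_min is an assumed ramp length."""
    frac = max_frac * min(max(t_min / ramp_min, 0.0), 1.0)
    return frac * STOCK_UREA_M


def estimate_delay(first, second, max_shift):
    """Integer shift (in samples) of `second` that best matches `first` by least squares."""
    n = len(first) - max_shift

    def sse(shift):
        return sum((second[i + shift] - first[i]) ** 2 for i in range(n))

    return min(range(max_shift + 1), key=sse)


times = range(0, 141)  # one synthetic reading per minute
gradient_detector = [programmed_urea(t) for t in times]
second_detector = [programmed_urea(t - 21) for t in times]  # same trace, 21 min later

delay = estimate_delay(gradient_detector, second_detector, max_shift=40)
residuals = [
    second_detector[i + delay] - gradient_detector[i]
    for i in range(len(gradient_detector) - delay)
]

print(f"top of gradient: {programmed_urea(100):.2f} M urea")
print(f"estimated phase delay: {delay} min")
print(f"largest residual after phase correction: {max(abs(r) for r in residuals):.3f} M")
```

With real detector traces the residuals would not vanish; the text reports differences on the order of 0.1 to 0.2 M across the gradient.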
In Figure 4 a similar analysis is performed for the formation of a refolding gradient of urea, with similar results. Taken together these data suggest that the
HPLC-based system is capable of forming denaturant gradients with an acceptable degree of accuracy.
III. Results and Discussion
EXAMPLES OF SYSTEM OPERATION. Using manual mixing experiments we have shown the equilibrium folding behavior of a recombinant peptidyl-prolyl cis-trans isomerase, FK binding protein (FKBP), in urea (11). In that previous study second-derivative UV absorbance and intrinsic Trp fluorescence were used as probes of tertiary structure, and CD as a probe of secondary structure. The reversibility of folding was followed both by these optical probes of structure and by two-dimensional 15N/1H heteronuclear single quantum coherence (HSQC) NMR of [U-15N]FKBP. Fluorescence measurements indicated a transition midpoint at ~3.9 M urea (11). As a comparison, data for FKBP unfolding using the automated analysis are presented in Figure 5; total flow was 200 µL/min (upper three panels are replicates). The dashed lines in each panel are the pre- and post-transition least-squares baselines fitted to the thin data curve. The thicker curve in each of the upper three panels is the baseline-corrected data set. The lowest panel depicts the average of the three data sets overlaid with error curves from the three data sets computed at 99.9% confidence. The dashed lines in the lowest panel indicate the urea concentration at the folding transition midpoint. Data from the automated analysis of FKBP folding demonstrate a transition midpoint at ~3.6 M urea, in good agreement with the previous results from manual mixing (11). In Figure 6 the equilibrium denaturation in urea of low molecular weight urokinase is performed by both manual mixing and automated equilibrium denaturation and measured in both cases by changes in the intrinsic Trp fluorescence. Manual readings (~15 samples) indicate the occurrence of two transitions in fluorescence quenching. The second transition, the major one, also exhibits an accompanying red-shift, consistent with exposure of Trp residues to bulk solvent upon complete unfolding. In the sample subjected to automated folding analysis, the two intensity transitions are clearly evident as well. Although we do not present data for automation of fluorescence wavelength scans, it is possible to perform and record excitation and emission spectral scans using this system. In particular, the RF-551 fluorometer can be externally triggered and programmed for scanning. These scans can be recorded sequentially in the same data channel and then parsed into discrete spectra post-run. As indicated in Fig. 1 the effluent from the spectrometer analyses can be collected and further analyzed, or be used for other purposes. In Figure 7 a manual analysis is presented of low molecular weight urokinase which has been automatically refolded from urea and collected after the detectors (Fig. 1). In this analysis urea concentration was also manually determined in each sample using refractive index measurements, and urokinase activity was measured using a spectrophotometric kinetic titerplate reader from Molecular Devices. It is evident that the recovery of enzyme activity closely corresponds with the first fluorescence transition observed in Figure 6. Although interpretation of these data is not yet conclusive, it is likely that the transitions observed can be attributed to the known (calorimetric) domain structure of urokinase (13). In Figure 8 a demonstration of the use of the system for control of protein disulfide formation is presented. In this experiment low molecular weight urokinase was prepared in 9.3 M urea in a fully reduced form. The reduced UK was then injected into a gradient formed between reduced and oxidized glutathione in a constant concentration of 2.0 M urea. The protein eluting from the detectors was captured and analyzed for enzymatic activity (Fig. 8, lower panel). The data indicate recovery of activity is greatest at high GSH/GSSG ratios.
Figure 8. Renaturation of low molecular weight urokinase observed in samples collected from continuous flow refolding. UK was injected in 9.3 M urea into 2.0 M urea, 20 mM Bis-Tris, pH 7.8 buffer with a gradient from 2.5 mM reduced glutathione (GSH) to 2.5 mM oxidized (GSSG). Upper panel: Data for glutathione concentration are from direct GSSG absorbance measurements at the secondary UV detector; data for GSH were determined by DTNB titration of collected fractions. Lower panel: urokinase activity in collected fractions measured by spectrophotometric assay using S-2444 (12) in a Molecular Devices titerplate reader. The x-axis is Fraction Number.
In summary, we describe modifications to a standard HPLC system that will permit its use in automation of the analysis of protein folding. The system presented has a number of pertinent advantages. From a stock solution of concentrated protein, the concentrations of protein actually utilized are programmable over a broad range from 10 mg/mL. The operational flow of protein sample has a low minimum flow rate of ~3 µL/min. The total fluid flow rate is adjustable, with typical flows for equilibrium runs ranging from 0.2 to 0.5 mL/min. The typical sample volume consumed during a run is ~2 mL. The typical run length, including recycling to initial conditions, is ~140 min. The fluid delay is adjustable to essentially any length desired by the user.
The aging delay for equilibrium in the system described is ~20 min. The delay tubing is commonly placed in an HPLC column heater allowing temperature control from ambient to ~80°C. In the equilibrium mode about 28 mL of total buffer and denaturant are consumed per run. At 12 runs per day a single gallon each of buffer and denaturant will last 10-12 days. With an integrated HPLC-triggered fraction collector and larger bore delay tubing the system can be set up to repetitively inject and collect preparative samples of unfolded protein to be refolded and analyzed. Because the preparatively folded samples of protein will only be folded and active in a certain range of denaturant concentration, this approach permits recapture of misfolded material and recycling through the system.
References
1. Bailey, J.E. (1991) Science 252, 1668-1675.
2. Pace, C.N. (1986) Methods Enzymol. 131, 266-280.
3. Adler, M. and Scheraga, H.A. (1988) Biochemistry 27, 2471-2480.
4. Endo, S., Saito, Y., and Wada, A. (1983) Anal. Biochem. 131, 108-120.
5. Saito, Y. and Wada, A. (1983) Biopolymers 22, 2105-2122.
6. Saito, Y. and Wada, A. (1983) Biopolymers 22, 2123-2132.
7. Thannhauser, T.W., McWherter, C.A., and Scheraga, H.A. (1985) Anal. Biochem. 149, 322-330.
8. Wada, A., Tachibana, H., Hayashi, H., and Saito, Y. (1980) Biochem. Biophys. Methods 2, 257-269.
9. Wada, A., Saito, Y., and Ohogushi, M. (1983) Biopolymers 22, 93-99.
10. Smith, R.A., Henkin, J., Egan, D.A., and Holzman, T.F. (1994) Protein Science 3 (Suppl. 1), 62.
11. Egan, D.A., Logan, T.M., Liang, H., Matayoshi, E., Fesik, S.W., and Holzman, T.F. (1993) Biochemistry 32, 1920-1927.
12. Marcotte, P.A. and Henkin, J. (1993) Biochim. Biophys. Acta 1161, 105-112.
13. Novokahatny, V., Medved, L., Mazar, A., Marcotte, P., Henkin, J., and Ingham, K. (1992) J. Biol. Chem. 267, 3878-3885.
Hsp70-protein complexes: Their characterization by size-exclusion HPLC
Daniel R. Palleros, Li Shi¹, Katherine Reid, and Anthony Fink
Department of Chemistry and Biochemistry, University of California, Santa Cruz, California 95064
¹ Present address: Department of Molecular Biology MB-2, Scripps Institute, La Jolla, CA 92037.
I. Introduction
The term molecular chaperones has been coined to refer to several families of structurally unrelated proteins with a common functional property: they associate with partially unfolded proteins and assist in their in vivo translocation, folding and assembly. Among the most intensively studied molecular chaperones are the heat shock proteins of 70 kDa molecular mass, hsp70 (for reviews see refs. 1,2). These proteins are known to bind small peptides (3) and unfolded proteins (4,5), and the evidence gathered in the last 5 years indicates that they play an important role in the prevention of protein misfolding and aggregation in vivo. To fully understand the nature of the hsp70-protein interaction, the complexes between the chaperone and the substrate proteins must be isolated and characterized by chemical and physical methods. Ideally, these processes should be carried out with minimum change or disruption of the protein-protein interaction. Attempts to crystallize hsp70s, or their complexes, have been unsuccessful, and only a 44-kDa fragment of an hsp70 has been crystallized and studied by X-ray diffraction (6). Among the substrate proteins we have investigated are reduced, carboxymethylated α-lactalbumin, RCMLA (a permanently unfolded protein), and thermally unstable mutants of staphylococcal nuclease. The latter have the advantage that at temperatures of 30°C or higher, where they are unfolded, they can be bound to hsp70, and then released at lower temperatures (e.g. 10°C), where they should be in their native states. We have found that complexes between unfolded proteins and bovine brain hsp73 (a constitutive member of the hsp70 family), or human hsp72 (a protein highly inducible by heat shock and other metabolic stress), or DnaK (E. coli) are stable enough to be analyzed and isolated by size-exclusion HPLC (SEC-HPLC) on silica-based columns. This technique can be used to estimate the Stokes radius of the complexes and their components (5,7), to study hsp70 unfolding behavior (7,8), to follow the kinetics of complex formation and dissociation (4,5,9), to investigate the effects of nucleotides and ions on complex stability (4,5,9), and to isolate the complex for further spectroscopic and chemical studies (5,10); for example, SEC-HPLC in combination with SDS-PAGE is a very useful technique for the determination of complex stoichiometry. In this paper we focus on the application of SEC-HPLC to the determination of Stokes radii and the stoichiometry of hsp70-protein complexes.
II. Materials and Methods
All chemicals were from the same sources as previously reported (5). NCA-SNase is a staphylococcal nuclease A mutant in which the pentapeptide Ser-Gly-Asn-Gly-Ser has been substituted for the tetrapeptide Tyr-Lys-Gly-Gln at positions 27-30 (11). SEC-HPLC was run on a Bio-SEP 3000 silica column (600 x 7.8 mm; Phenomenex, Torrance, CA) using 20 mM sodium phosphate, 200 mM KCl, pH 6.5, as the mobile phase at 22°C; flow rate was 1 mL/min; detection was by absorbance at 215 nm. SDS-PAGE was run on a Pharmacia PhastSystem using 8-25% polyacrylamide-gradient gels and Coomassie Blue R staining, following the protocol described in PhastSystem Development Technique File No. 200 (Pharmacia); densitometry was performed using an ISCO gel scanner (ISCO model 1312) coupled to an absorbance monitor (ISCO model UA5) and integrator (Spectraphysics, model 4270). SDS sample buffer contained 5.5% SDS, 46% glycerol, 0.01% bromophenol blue, 200 mM Tris-HCl, and 0.7% β-mercaptoethanol, pH 6.8. E. coli DnaK stock solution (17.4 µM) was in 18 mM Tris-HCl, 45 mM NaCl, 10% glycerol, 5 mM β-mercaptoethanol, pH 7.5. For the determination of the relative response factor, k (see below), the RCMLA stock solution was 17.4 µM in 20 mM Tris-HCl, 19 mM NaCl, pH 7.3. The RCMLA stock solution for complex formation was 143 µM in 13 mM Tris-HCl, 12 mM NaCl, pH 7. NCA-SNase stock solution was 510 µM in 20 mM Tris-HCl, pH 7.1. The concentrations of DnaK, RCMLA and NCA-SNase were determined using molar extinction coefficients at 280 nm of 27000 (7), 27200 (12) and 15400 M⁻¹cm⁻¹ (13), respectively. The size-exclusion partition coefficient, $K_d$, was calculated as $K_d = (V_i - V_0)/V_t$, where $V_i$ is the elution volume of the protein, $V_0$ is the void volume (elution volume of blue dextran, 11.4 mL) and $V_t$ is the total solvent-accessible volume (elution volume of sodium azide, 24.0 mL).
It should be noted that this definition of the partition coefficient differs from others found in the literature; the partition coefficient $\alpha$ (14), often also called $K_d$ (15), is defined as $\alpha = (V_i - V_0)/(V_t - V_0)$. It follows that the two constants are related by $\alpha = K_d/[1 - (V_0/V_t)]$.
In our experience (over 2000 SEC-HPLC runs with hsp70), the silica-based columns give better resolution than agarose-based columns (for example Superose 12 for FPLC from Pharmacia); however, we found that the silica-based columns have a limited lifetime when used with hsp70. After approximately 100-150 injections (20 µL; [hsp70] ≈ 5 µM) hsp70 will no longer elute from the column; this happens very suddenly, without progressive retardation in the elution volume of hsp70 as the number of runs increases. Attempts to clean the column with five volumes of 6 M guanidine hydrochloride or 20% dimethylsulfoxide or 5% acetonitrile have been unsuccessful. It should be pointed out that when the column reaches the state in which hsp70 is no longer eluted, most other proteins will still elute at their normal elution volumes; however, unfolded proteins such as RCMLA are also considerably retarded, probably due to binding to the hsp70 stuck on the column. We also observed that injection of hsp73 resulted in shorter column lifetimes than when DnaK was used.
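The two partition-coefficient conventions can be compared numerically. The sketch below uses the void and total volumes given in the Methods (11.4 and 24.0 mL); the function names are illustrative, and the 16.0 mL elution volume is taken from the approximate position of the DnaK-RCMLA peak mentioned in the text, so the resulting coefficient is only indicative.

```python
V0_ML = 11.4  # void volume: elution volume of blue dextran
VT_ML = 24.0  # total solvent-accessible volume: elution volume of sodium azide


def partition_kd(vi_ml):
    """Kd as defined in this chapter: (Vi - V0) / Vt."""
    return (vi_ml - V0_ML) / VT_ML


def partition_alpha(vi_ml):
    """Alternative literature definition: (Vi - V0) / (Vt - V0)."""
    return (vi_ml - V0_ML) / (VT_ML - V0_ML)


def alpha_from_kd(kd):
    """Conversion between the two conventions: alpha = Kd / (1 - V0/Vt)."""
    return kd / (1.0 - V0_ML / VT_ML)


def elution_volume_ml(kd):
    """Elution volume implied by a given Kd."""
    return V0_ML + kd * VT_ML


vi = 16.0  # approximate elution volume of the DnaK-RCMLA peak (ca. 16 mL)
kd = partition_kd(vi)
print(f"Kd = {kd:.3f}, alpha = {partition_alpha(vi):.3f}")
print(f"alpha from Kd = {alpha_from_kd(kd):.3f}")
print(f"Kd 0.185 corresponds to {elution_volume_ml(0.185):.2f} mL")
```

The direct and converted values of alpha agree, which is a quick check on the conversion formula.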
III. Results and Discussion
Hsp70-protein complex stoichiometry. Hsp70-protein complexes were formed by incubating hsp70 with an excess of substrate protein at 37°C. The reaction mixtures were then analyzed by SEC-HPLC; the chromatograms corresponding to mixtures of DnaK with RCMLA and NCA-SNase are shown in Fig. 1. The stoichiometry of the hsp70-RCMLA complex was determined by a combination of SEC-HPLC and SDS-PAGE for both bovine brain hsp73 and DnaK. Similar results were obtained in each case; the molar ratio of hsp70 and RCMLA in the complexes was 1:1. Only the results for DnaK are discussed here. The complex between DnaK and RCMLA was formed by incubating equal volumes of both protein stock solutions for about 100 min at 37°C; 500 µL of the reaction mixture was injected and several 1.5-mL fractions were collected and concentrated using a Centricon 3 (cut-off 3000 Da); fraction #4 corresponded to the peak attributed to the DnaK-RCMLA complex (elution volume ca. 16 mL, see Fig. 1). The final volume of the concentrated fractions was about 150 µL, of which 10 µL was treated with SDS sample buffer, heated at 95°C for 2 min and loaded (1 µL) onto an SDS-PAGE gel. After developing with freshly made Coomassie Blue R solution, the gel was dried overnight at 37°C and then scanned with the densitometer. Only two bands, corresponding to the positions of standard RCMLA and DnaK samples, were observed for fraction #4. The intensity of the bands was determined by the integration of the corresponding densitometry peaks. In order to minimize the error, the same fraction was run on 4 different gel lanes, and each lane was scanned at least three times. The average for the ratio of the areas for DnaK ($A_D$) and RCMLA ($A_R$), $A_D/A_R$, was 5.32 ± 0.83.
Fig. 1. Complex formation between DnaK and substrate proteins monitored by SEC-HPLC. DnaK stock solution was mixed with RCMLA or NCA-SNase stock solutions and 20 mM Tris-HCl buffer, pH 7.1, incubated for 30 min at 37°C and then analyzed by HPLC. Final concentrations were: [DnaK] = 5.5 µM, [RCMLA] = 17 µM and [NCA-SNase] = 24 µM. DnaK (5.5 µM) alone was analyzed under the same conditions. The complexes partially dissociate (20-40% depending on the conditions) during the HPLC run (5). Peak labels include DnaK, Complexes and NCA-SNase; the x-axis is Elution Volume (mL), 14 to 22.
The relative response factor of DnaK and RCMLA to Coomassie Blue R staining was determined by SDS-PAGE analysis of samples of known concentrations of DnaK and RCMLA. Aliquots of RCMLA and DnaK stock solutions of identical concentrations and Tris-HCl buffer (20 mM; 19 mM NaCl, pH 7.3) were mixed to afford different [DnaK]:[RCMLA] molar ratios (2:1; 1:1; 1:2; 1:3); the concentration of DnaK was kept constant at 4.4 µM. These solutions were treated with SDS sample buffer and analyzed by densitometry as described above. The relative response factor for DnaK and RCMLA, k, was calculated using eq. 1:
$$\frac{[\mathrm{RCMLA}]}{[\mathrm{DnaK}]}\,\frac{A_D}{A_R} = k \qquad (1)$$
The average from four different lanes (each one with a different [DnaK]:[RCMLA] molar ratio) gave a value for k of 5.43 ± 0.42. It should be noted that the relative response factor k reflects the relative ability of these two proteins to bind Coomassie Blue R; although the nature of the interaction between this dye and proteins is not clearly understood, the binding seems to be favored by the presence of basic residues (Lys, Arg and His). While the number of dye molecules bound to proteins varies largely from protein to protein, the number of dye molecules bound per positive charge on the protein seems to be fairly constant, ranging from 1.4 to 2.7 in a series of proteins (16). Therefore, for proteins with a similar proportion of basic amino acids, the number of Coomassie Blue R molecules bound to the protein is expected to be proportional to its molecular mass. Fortuitously, the proportion of basic amino acids in hsp73, DnaK and RCMLA is about the same (13%); therefore, the relative response factor of DnaK and RCMLA should be comparable to the ratio of their molecular masses (i.e. 69100/14700 = 4.7), which is the case, as the k value of 5.43 ± 0.42 indicates. With the knowledge of k and the ratio of the areas for DnaK and RCMLA bands in fraction #4 as already determined ($A_D/A_R$ = 5.32), the ratio of the molar concentrations of DnaK and RCMLA in the complex can be calculated, eq. 2:
$$\frac{[\mathrm{RCMLA}]}{[\mathrm{DnaK}]} = k\,\frac{A_R}{A_D} = \frac{k}{A_D/A_R} = \frac{5.43}{5.32} = 1.02 \pm 0.24 \qquad (2)$$
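The arithmetic behind eqs. 1 and 2 can be checked with a short sketch. The function names are illustrative only, and the quoted uncertainties are not propagated because the text does not say how they were derived.

```python
# Sketch of eqs. 1 and 2; function names are hypothetical, values are from the text.

def response_factor(conc_rcmla, conc_dnak, area_dnak, area_rcmla):
    """Eq. 1: k = ([RCMLA]/[DnaK]) * (A_D/A_R) for a lane of known composition."""
    return (conc_rcmla / conc_dnak) * (area_dnak / area_rcmla)

def molar_ratio_rcmla_to_dnak(k, area_ratio_dnak_to_rcmla):
    """Eq. 2: [RCMLA]/[DnaK] = k / (A_D/A_R) for a sample of unknown composition."""
    return k / area_ratio_dnak_to_rcmla

K_REPORTED = 5.43             # mean k from four lanes
AREA_RATIO_FRACTION_4 = 5.32  # A_D/A_R measured for fraction #4

ratio = molar_ratio_rcmla_to_dnak(K_REPORTED, AREA_RATIO_FRACTION_4)
mass_ratio = 69100 / 14700    # DnaK / RCMLA molecular masses

print(f"[RCMLA]/[DnaK] in the complex: {ratio:.2f}")
print(f"molecular mass ratio DnaK/RCMLA: {mass_ratio:.1f}")
```

A molar ratio close to 1 is what supports the 1:1 stoichiometry; the mass ratio is printed only to compare with k.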
These results indicate that the stoichiometry for the DnaK-RCMLA complex is 1:1. As Fig. 1 clearly shows, no stable complexes of higher molecular mass were detected. Moreover, performing the incubation with a molar excess of DnaK over substrate protein did not result in higher molecular mass complexes.

Stokes radius determination. The Stokes radii (Rs) of DnaK, RCMLA, NCA-SNase and their complexes were determined by SEC-HPLC using a series of standard globular proteins for which Stokes radii were available (15, and references therein). It is well established that size-exclusion partition coefficients can be correlated to the molecular mass, MM, of proteins by eq. 3:

$$K_d = -A \log \mathrm{MM} + B \qquad (3)$$
where A and B are empirical constants. However, for non-globular or highly asymmetric proteins, a better correlation is obtained if the hydrodynamic radius (Stokes radius, Rs) is used instead of the molecular mass (14). A plot of log Rs vs. Kd for standard proteins gave a good linear correlation (log Rs = 2.1037 - 2.1552 Kd; r = 0.990). Standard proteins (Kd; Rs in Å) were: ribonuclease A (0.376; 19.3); myoglobin (0.368; 20.2); bovine carbonic anhydrase (0.343; 23.6); ovalbumin (0.293; 31.2) and bovine serum albumin (0.258; 33.9). Using the correlation mentioned above and the Kd values listed below (in parentheses), the following Rs values (error: ± 2 Å) were obtained: DnaK (0.228): 41 Å; DnaK-RCMLA complex (0.185): 51 Å; DnaK-NCA-SNase complex (0.187): 50 Å; RCMLA (0.309): 27 Å, and NCA-SNase (0.354): 22 Å. Our results indicate that DnaK does not behave as a globular protein in the SEC-HPLC experiments. A Stokes radius of 41 Å for DnaK is in agreement with previously published data determined by dynamic light scattering (7), and is also comparable with the Stokes radius determined for hsp73, 39 Å (17). These values are greater than predicted, however, for the Stokes radius of a globular protein of molecular mass 70 kDa. A correlation between volume (Rs³) and molecular mass for nine globular proteins is shown in Fig. 2; Rs³ = -1707 + 0.6091 MM (r = 0.989). For a globular protein of 70 kDa a Stokes radius of about 34 Å is expected. For the DnaK-protein complexes, Stokes radii of about 51 Å have been determined, which are too large for spherical-shaped complexes; a radius of 37 Å is expected for a globular protein of molecular mass 84 kDa (the molecular mass of the complexes). This abnormally large Stokes radius is in part a reflection of the non-globular character of DnaK; however, the 14 Å difference between the expected (37 Å) and observed (51 Å) Stokes radius for the complexes is much larger than the difference of 7 Å detected for free DnaK.
This disparity suggests that the substrate proteins must be substantially unfolded when bound to DnaK. This is not surprising in the case of RCMLA, because the protein is permanently unfolded regardless of the experimental conditions. This is also evidenced by its large Stokes radius (27 Å); for a globular protein of molecular mass 14700, the Rs is expected to be around 19 Å. The results with NCA-SNase are unexpected inasmuch as the free substrate protein is folded under the conditions of the SEC-HPLC analysis (18). The unfolded nature of NCA-SNase in the complex with DnaK was further investigated by fluorescence spectroscopy and far-UV circular dichroism (5).
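The two empirical correlations quoted above (log Rs vs. Kd, and Rs³ vs. molecular mass) are easy to evaluate numerically. The following is a minimal sketch using only the coefficients and Kd values given in the text; the function and variable names are illustrative.

```python
# Sketch of the two calibration lines quoted in the text.
# Coefficients come from the text; function names are hypothetical.

def stokes_radius_from_kd(kd):
    """Observed Stokes radius (angstrom): log Rs = 2.1037 - 2.1552 * Kd."""
    return 10 ** (2.1037 - 2.1552 * kd)

def globular_stokes_radius(mass_da):
    """Expected Rs (angstrom) for a globular protein: Rs^3 = -1707 + 0.6091 * MM.
    Only meaningful for masses large enough that Rs^3 is positive."""
    return (-1707 + 0.6091 * mass_da) ** (1.0 / 3.0)

measured_kd = {
    "DnaK": 0.228,
    "DnaK-RCMLA complex": 0.185,
    "DnaK-NCA-SNase complex": 0.187,
    "RCMLA": 0.309,
    "NCA-SNase": 0.354,
}

for name, kd in measured_kd.items():
    print(f"{name}: Rs = {stokes_radius_from_kd(kd):.0f} A")

for label, mass in [("70 kDa", 70000), ("84 kDa", 84000), ("14.7 kDa", 14700)]:
    print(f"globular protein of {label}: expected Rs = {globular_stokes_radius(mass):.0f} A")
```

Comparing the observed and expected radii in this way gives the 7 Å (free DnaK) and 14 Å (complexes) differences discussed in the text.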
[Fig. 2 plot: Rs³ (Å³) versus molecular mass (Da).]
Fig. 2. Correlation between Rs³ and molecular mass for globular proteins. In order of increasing mass the proteins are: cytochrome c, ribonuclease A, myoglobin, bovine carbonic anhydrase, β-lactoglobulin, ovalbumin, hemoglobin, bovine serum albumin and transferrin.

Early attempts to determine the stoichiometry of hsp70-protein complexes by a correlation between Kd and log MM were unsuccessful because the substrate proteins are substantially unfolded in their complexes with hsp70, and hsp70s themselves probably deviate from a spherical shape. To illustrate this point, DnaK and the DnaK-RCMLA complex behave as if they had apparent molecular masses of 93 and 156 kDa, respectively, when a correlation of log MM vs. Kd (using the same five standard proteins mentioned above plus bovine serum albumin dimer) was used to estimate molecular masses.
IV. Conclusions

SEC-HPLC on silica-based columns is a fast and versatile technique for the characterization of complexes between molecular chaperones and substrate proteins. The technique provides an invaluable tool for a rapid determination of their hydrodynamic properties and, in combination with SDS-PAGE, can be used to determine the stoichiometry of such complexes. One limitation is the fact that SEC-HPLC can be applied only to the study of stable complexes, i.e. complexes that will not dissociate significantly during the chromatographic run.
References
1. Ellis, R.J., and van der Vies, S.M. (1991) Annu. Rev. Biochem. 60, 321-347.
2. McKay, D. (1993) Advances Prot. Chem. 44, 67-97.
3. Flynn, G.C., Chappell, T.G., and Rothman, J.E. (1989) Science 245, 385-390.
4. Palleros, D.R., Welch, W.J., and Fink, A.L. (1991) Proc. Natl. Acad. Sci. U.S.A. 88, 5719-5723.
5. Palleros, D.R., Shi, L., Reid, K.L., and Fink, A.L. (1994) J. Biol. Chem. 269, 13107-13114.
6. Flaherty, K.M., DeLuca-Flaherty, C., and McKay, D.B. (1990) Nature 346, 623-628.
7. Palleros, D.R., Shi, L., Reid, K.L., and Fink, A.L. (1993) Biochemistry 32, 4314-4321.
8. Palleros, D.R., Reid, K.L., McCarty, J.S., Walker, G.C., and Fink, A.L. (1992) J. Biol. Chem. 267, 5279-5285.
9. Palleros, D.R., Reid, K.L., Shi, L., Welch, W.J., and Fink, A.L. (1993) Nature 365, 664-666.
10. Palleros, D.R., Reid, K.L., Shi, L., and Fink, A.L. (1993) FEBS Lett. 336, 124-128.
11. Hynes, T.R., Kautz, R.A., Goodman, M.A., Gill, J.F., and Fox, R.O. (1989) Nature 339, 73-76.
12. Ikeguchi, M., and Sugai, S. (1989) Int. J. Peptide Protein Res. 33, 289-297.
13. Fuchs, S., Cuatrecasas, P., and Anfinsen, C.B. (1967) J. Biol. Chem. 242, 4768-4770.
14. Ackers, G.K. (1970) Advances Prot. Chem. 24, 381-383.
15. Corbett, R.J.T., and Roche, R.S. (1984) Biochemistry 23, 1888-1894.
16. Tal, M., Silberstein, A., and Nusser, E. (1985) J. Biol. Chem. 260, 9976-9980.
17. Schlossman, D., Schmid, S.L., Braell, W.A., and Rothman, J.E. (1984) J. Cell Biol. 99, 723-733.
18. Antonino, L.C., Kautz, R.A., Nakano, T., Fox, R.O., and Fink, A.L. (1991) Proc. Natl. Acad. Sci. U.S.A. 88, 7715-7718.
Methods for Collecting and Analyzing Attenuated Total Reflectance FTIR Spectra of Proteins in Solution*
Keith A. Oberg and Anthony L. Fink
Department of Chemistry and Biochemistry, University of California, Santa Cruz 95064
I. Introduction
Attenuated Total Reflectance FTIR (ATR-FTIR†) is a method that has been applied by a number of workers for the study of protein conformation. ATR has been used for monitoring adsorption of proteins or blood components to surfaces (1,2), and for the structural analysis of proteins dried onto an IRE (thin film) (3,4). It has also been used for exploring the effects of solution conditions on the structure of proteins irreversibly adsorbed to an IRE (5,6,7), and has been shown to be useful for studying the secondary structure and ligand binding properties of membrane proteins (8,9). To date, there have been no published studies using ATR-FTIR to measure the spectra of just the protein in (bulk) solution. We have found that such solution spectra can be obtained by subtracting the contribution of denatured material irreversibly bound to the IRE surface from that of bulk and adsorbed protein. The strong interactions between polypeptides and IRE materials that immobilize proteins on IRE surfaces have a deleterious effect on the structure of these molecules. The characteristics of the adsorption process and its effects on protein structure will be discussed elsewhere (10).

* This work was supported by a grant from the National Science Foundation.

† Abbreviations used are as follows. FTIR: Fourier transform infrared spectroscopy, ATR: attenuated total reflectance, IRE: internal reflection element, SATR: solution ATR-FTIR, FSD: Fourier self-deconvolution, PLS: partial least-squares analysis, PRESS: prediction residual sum of squares from PLS, SECV: standard error of calibration values from PLS, PLS1: PLS analysis in which each component is predicted independently, PLS2: PLS analysis in which all components are predicted simultaneously.
II. Materials: Protein Solutions
Proteins were purchased in the purest form available from Sigma, Worthington or Biocell Laboratories, and used without further purification. Proteins used for the generation of PLS basis sets were chosen to represent a wide range of structural motifs. Proteins used were as follows: Sigma: Carbonic Anhydrase (C3934), Concanavalin A (C7275), Cytochrome C (C7552), Insulin (I3505), β-Lactoglobulin (L7880), Myoglobin (M1882), Papain (P4762), Trypsin Inhibitor (bovine pancreas, T0256), Ubiquitin (U6253); Worthington: Chymotrypsinogen A (5630), Lysozyme (2931), Ribonuclease A (3433); Biocell: IgG (goat anti-rabbit). To prepare protein solutions, 15-25 mg of each protein were dissolved shortly before use in sufficient 20 mM sodium phosphate, pH 7.0, to give a 30 mg/ml solution. 100 μl of the resulting solutions were diluted to 1 ml to make 3 mg/ml solutions. Samples that had visible precipitate after dissolution in buffer (carbonic anhydrase, concanavalin A, and insulin) were centrifuged for 2 minutes before use.
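The solution preparation above is simple dilution arithmetic. As a minimal sketch (function names are illustrative, and the volume contributed by the dissolved protein is assumed to be negligible):

```python
# Sketch of the dilution arithmetic; function names are hypothetical.
# Assumption: the final volume is taken to equal the buffer volume added.

def buffer_volume_ml(protein_mg, target_mg_per_ml=30.0):
    """Buffer volume needed to dissolve a weighed protein to the target concentration."""
    return protein_mg / target_mg_per_ml

def diluted_concentration(stock_mg_per_ml, aliquot_ul, final_volume_ul):
    """Concentration after diluting an aliquot of stock to a final volume."""
    return stock_mg_per_ml * aliquot_ul / final_volume_ul

for mass in (15.0, 25.0):
    print(f"{mass:.0f} mg protein -> {buffer_volume_ml(mass):.2f} ml buffer for 30 mg/ml")

print(f"100 ul of stock diluted to 1 ml -> {diluted_concentration(30.0, 100.0, 1000.0):.1f} mg/ml")
```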
III. Methods for Data Collection and Structure Analysis

The contribution of water absorption in solution spectra can be more than 99% of the total signal. The technological sophistication and high sensitivity of modern FTIR instruments make it possible to extract protein spectra from solution data; however, extreme care must be exercised to ensure accuracy. Frequently FTIR data processing is not done with sufficient rigor, and hence flawed analyses have appeared in print. As a part of this work, a careful exploration of the various aspects of data collection, spectral processing, and structural analysis was performed in order to develop a reproducible and reliable protocol. The complete protocol is presented in the following sections.

A. Data collection
A modified out-of-compartment IRE holder (SPECAC) that could be configured as a 125 μl flow cell was used for this study. The IREs used here measure 72x10x6 mm; they have a 45° angle of incidence, and 7 internal reflections. Interferograms were collected on a Nicolet 800 FTIR spectrometer equipped with a liquid nitrogen cooled MCT detector. For each spectrum, 1000 interferograms were co-added at 4 cm⁻¹ resolution (total collection time: 6 minutes). Duplicate data sets were collected on Ge and ZnSe IREs. For each spectrum, 500-1000 μl of a 3 or 30 mg/ml protein solution (20 mM phosphate, pH 7.0) were used. The protocol developed for the collection of solution spectra is summarized in Table I.
Table I. The general protocol for collection of bulk and adsorbed protein spectra

Step      Procedure                                                     Spectrum     Components
1         Assemble & align flow cell                                    -            -
2         Collect 3 spectra of empty flow cell                          Background   Empty cell, spectrum of gasket
3         Fill with buffer                                              Buffer       Water (+ buffer components)
4         Remove buffer                                                 -            -
5         Load protein solution (allow to adsorb, 1-5 min)              Total        Adsorbed & bulk protein, buffer
6         Remove protein solution                                       -            -
7         Flow 2 ml buffer once through cell (≈10 sec)                  Flush        Tightly adsorbed protein, buffer
8 (opt.)  Soak cell in soap 5-10 minutes; rinse with >100 volumes H2O   Wash         Irreversibly adsorbed protein, water
9         For additional spectra, steps 5-8 were repeated               -            -

B. Pre-processing
All data manipulation was performed with Lab Calc or GRAMS/386 (Galactic Industries). Interferograms were Fourier transformed using the Mertz method (11) with medium Norton-Beer apodization. Spectra were converted to absorbance by ratioing against an appropriate background spectrum (Table I). Water vapor spectra ("vapor") were generated by ratioing two background spectra; one background was collected with the instrument open to the atmosphere, which produced strong absorption bands. The high intensity of the vapor spectrum was beneficial in that it served to minimize the increase of noise in protein spectra due to vapor subtraction. The contribution of water vapor in "buffer," "total," and "flush" spectra (Table II) was subtracted automatically by an Array Basic (Galactic Industries) program that optimized the scaling factor, s (Equation 1), using the same type of linear search described by Powell et al. (12).

$$\text{original} - s\,(\text{subtrahend}) = \text{corrected spectrum} \qquad (1)$$
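Equation 1 and the idea of searching for the scaling factor can be illustrated with a short sketch. This is not the Array Basic program used in this work. The goodness parameter (a roughness measure based on second differences), the reverse-and-halve step rule, the synthetic spectra and all function names are assumptions chosen only to make the example self-contained.

```python
import math

# Illustrative sketch only; not the authors' Array Basic program.

def subtract(original, subtrahend, s):
    """Equation 1: corrected = original - s * subtrahend, point by point."""
    return [o - s * v for o, v in zip(original, subtrahend)]

def roughness(spectrum):
    """Hypothetical goodness parameter g: sum of squared second differences.
    Sharp vapor lines make the spectrum rough, so a smaller g is better."""
    return sum(
        (spectrum[j - 1] - 2.0 * spectrum[j] + spectrum[j + 1]) ** 2
        for j in range(1, len(spectrum) - 1)
    )

def optimize_scale(original, subtrahend, goodness=roughness,
                   s0=0.0, increment=0.5, tol=1e-6, max_steps=10000):
    """Step s by the increment while g improves; on failure, reverse the
    direction and halve the increment (an assumed rule) until it is below tol."""
    s = s0
    g = goodness(subtract(original, subtrahend, s))
    for _ in range(max_steps):
        if abs(increment) < tol:
            break
        s_next = s + increment
        g_next = goodness(subtract(original, subtrahend, s_next))
        if g_next < g:
            s, g = s_next, g_next
        else:
            increment = -increment / 2.0
    return s

# Synthetic demonstration: a broad band plus 0.37 x a sharp, oscillating "vapor" pattern.
points = range(200)
protein = [math.exp(-((j - 100) / 30.0) ** 2) for j in points]
vapor = [math.sin(2.5 * j) for j in points]
measured = [p + 0.37 * v for p, v in zip(protein, vapor)]

s_best = optimize_scale(measured, vapor)
print(f"recovered scaling factor s = {s_best:.3f}")
```

Any other goodness measure can be passed through the `goodness` argument, which mirrors the remark in the text that the approach applies to any subtraction for which such a parameter can be defined.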
Briefly, optimization of s was performed iteratively. This is shown graphically in Figure 1. This approach can be used for any subtraction where a "goodness of s" parameter, g, can be defined and evaluated. At each step, g was evaluated for s_n (g_n). The subtraction was then performed using s_{n+1} = s_n + i (where i is an arbitrary increment factor), and g was evaluated again (g_{n+1}). If |g_{n+1}| …

… It is also possible that the structure of desmopressin changes when bound to the carrier protein. The new crosspeaks indicate a variety of new contacts within desmopressin, which are labeled in Figure 2. In other regions of the 2-D transferred NOESY, contacts can also be seen between the peptide and the protein, which can be very useful in identifying the binding site of the ligand and interactions in the desmopressin/NP-II complex. From the full transferred NOE data (which includes more than 230 crosspeaks for the 9-residue peptide), distance restraints can be derived for calculating the bound structure of desmopressin using molecular modeling.
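One common way to turn cross-peak intensities into approximate distances is to scale against a proton pair of known separation. The sketch below assumes the isolated spin-pair approximation (intensity proportional to r⁻⁶) and a uniform correlation time. It is not taken from this chapter, and the reference distance and intensities in the example are hypothetical.

```python
# Sketch of r^-6 distance calibration; all numerical inputs are hypothetical.

def distance_from_noe(intensity, ref_intensity, ref_distance):
    """Estimate a proton-proton distance from a cross-peak intensity,
    assuming intensity is proportional to r**-6 (isolated spin pair)
    and the same correlation time for both proton pairs."""
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)

# A cross peak 64 times weaker than a reference pair at 2.5 angstrom
# corresponds to twice the reference distance.
r = distance_from_noe(intensity=1.0, ref_intensity=64.0, ref_distance=2.5)
print(f"estimated distance: {r:.1f} A")
```

Because of the sixth-root dependence, large errors in intensity translate into small errors in distance, which is why such estimates are usually applied as loose upper-bound restraints.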
[Figure 2, panels A and B: F1 (ppm), 3.6 to 2.0.]
Figure 2. Transferred NOESY and NOESY spectra of desmopressin. The aromatic-sidechain region of the transferred NOESY spectrum of desmopressin in the presence of 0.1 mole equivalent NP-II (panel A) and the same spectral region of the NOESY spectrum of desmopressin in the absence of NP-II (panel B).
IV. Conclusions
In both of the NMR relaxation approaches presented herein, the parameters measured, R2ij and σij, are proportional to τcB⟨1/rij⁶⟩, where ⟨ ⟩ indicates an ensemble average. In the first approach one assumes the distance is constant and known between particular proton pairs and looks for changes in local mobility. The caveat on this approach is that other proton internuclear contacts can contribute to the relaxation, as can other sources of line broadening such as exchange broadening. In the second approach one assumes the bound correlation time to be uniform within the bound ligand and looks for changes in proton-proton distances. The obvious caveat here is that some section of the ligand may still be flexible when bound, and the contribution of the protein protons to the relaxation is hard to evaluate.

Acknowledgments

The TGF-α/EGFR-ED and the desmopressin/NP-II projects were supported by the Government of Canada through the Networks of Centres of Excellence Program. DWH wishes to express his thanks for additional financial support and the provision of TGF-α samples by Berlex Biosciences, Inc., USA. DWH also thanks Dr. Maureen O'Connor-McCourt's lab for providing EGFR-ED. JJW wishes to thank Dr. Esther Breslow for providing NP-II and Paul Semchuk for peptide synthesis of desmopressin.
SECTION IX Peptide Synthesis
Application of 2-Chlorotrityl Resin: Simultaneous Synthesis of Peptides which Differ in the C-Termini
Anita L. Hong, Tin T. Le, and Trung Phan
AnaSpec, Inc., San Jose, CA 95131
I. Introduction
Peptides which differ in their C-termini often exhibit different structure-activity relationships; for example, the C-terminal amide of des-pentapeptide(B26-30) insulin (1) was shown to be 100% active, in contrast to the free acid (20-30%) (2). Semisynthesis employing proteases such as trypsin, chymotrypsin, and carboxypeptidase Y has been used to modify the C-terminus of these biologically active peptide hormones (3). One of the limitations of enzymatic peptide synthesis is the substrate specificity of the proteases. Tjoeng et al. (4) reported on multiple peptide synthesis using a single support; a peptide mixture was obtained and the peptides were separated by HPLC. We have demonstrated that peptides which differ in their C-termini can be simultaneously synthesized in one reaction vessel by employing resins that possess different cleavage properties. The resins that we used were the weak-acid-labile 2-chlorotrityl resins (5-7) and the TFA-cleavable Wang resins. The success of this approach was shown by the co-synthesis of: a) ACTH(4-10) with ACTH(4-11); b) Neuropeptide Y, a C-terminal amide peptide, with its corresponding C-terminal free acid peptide.

II. Materials and Methods
A. Reagents and Materials

Fmoc protected amino acids, 2-chlorotrityl (Clt) resins, Rink amide MBHA resin, Wang resins, HBTU and HOBt are commercially available from AnaSpec, Inc. The side chain protecting groups for the amino acids are t-butyl for Asp, Glu, Ser, Thr, Tyr; trityl for Asn, Cys, Gln, His; Pmc for Arg and Boc for Lys. N-methylpyrrolidinone (NMP), Omnisolv grade, was purchased from VWR. Piperidine and diisopropylethylamine (DIEA) were purchased from Aldrich.
All syntheses were performed on the Applied Biosystems Peptide Synthesizers Model 430A or 431A. Peptides were synthesized using solid phase peptide synthesis employing Fmoc chemistry methodology. Fmoc amino acids were activated using one equivalent of 0.45 M HBTU/HOBt solution and two equivalents of DIEA. NMP was used as the coupling medium. Synthesis was performed starting with the appropriate resin. Fmoc protecting groups were removed using 20% piperidine/NMP. Cleavage of the protected peptide from the Clt-resin was accomplished using 30% acetic acid in DCM for 3 hours at room temperature. Alternatively, cleavage of the protected peptide from the Clt-resin can be achieved by using a 1:2:7 mixture of acetic acid:trifluoroethanol:DCM for 45 minutes at room temperature. Deprotection/cleavage of the peptide from the Wang resin was performed using trifluoroacetic acid in the presence of a scavenger mixture (0.75 g phenol, 0.25 ml EDT, 0.5 ml thioanisole, 0.5 ml water and 10 ml TFA) for 2-3 hours at room temperature. Peptides were purified to >95% purity on reverse phase HPLC using C18 columns and an AB gradient from 0% B, where A is 0.1% TFA in water and B is 0.08% TFA in acetonitrile. Analytical HPLC was performed using an HP 1090 Liquid Chromatograph equipped with a diode-array UV detector. Capillary electrophoresis was performed on a Waters Quanta 4000E. The authenticity of the peptides was verified by molecular weight determination using the Vestec 201 electrospray quadrupole mass spectrometer or the Finnigan MAT 900 magnetic sector mass spectrometer. Amino acid analysis was performed on the Applied Biosystems Model 420H amino acid analyzer.

III. Results
A. Synthesis of ACTH(4-10) and ACTH(4-11)

ACTH(4-10), Met-Glu-His-Phe-Arg-Trp-Gly, and ACTH(4-11), Met-Glu-His-Phe-Arg-Trp-Gly-Lys, were co-synthesized using a mixture of Clt resin and Wang resin. The synthesis was accomplished by starting with Fmoc-Lys(Boc)-Wang resin. After the Fmoc group was removed, followed by several washing steps, Fmoc-Gly was coupled to form the Fmoc-Gly-Lys(Boc)-Wang resin.
To begin the synthesis of the second peptide, Gly-Clt resin was added. The synthesis was continued to complete the chain assembly for both sequences. ACTH(4-10) was isolated by cleaving the resin mixture with 30% acetic acid/DCM followed by removal of the protecting groups with the TFA/scavenger mixture. The crude peptide was purified to yield 25 mg of peptide with purity of >95%. HPLC profiles of the crude and purified ACTH(4-10) are shown in Figures 1 and 2. ACTH(4-11) was obtained by cleaving the remaining resin with the TFA/scavenger mixture. The crude peptide obtained was purified by reverse phase HPLC to yield 83 mg of product with purity of >95%. HPLC profiles of the crude and purified ACTH(4-11) are shown in Figures 3 and 4. The HPLC profile of a mixture of purified ACTH(4-10) and ACTH(4-11) is shown in Figure 5. Mass spec analysis of ACTH(4-10) and ACTH(4-11) showed mass units of 962 (Theoretical: 962.4) and 1090.5 (Theoretical: 1090.5), respectively. Amino acid analysis showed both peptides to have the correct amino acid compositions.
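The theoretical masses quoted above can be reproduced from standard monoisotopic residue masses. The sketch below assumes the quoted values refer to the singly protonated monoisotopic species, [M+H]+; the text does not state this, but the numbers agree. Names in the code are illustrative.

```python
# Sketch: monoisotopic [M+H]+ of a free-acid peptide from residue masses.
# Assumption: the "theoretical" values in the text are monoisotopic [M+H]+.

MONOISOTOPIC_RESIDUE = {
    "Gly": 57.02146, "Lys": 128.09496, "Met": 131.04049, "Glu": 129.04259,
    "His": 137.05891, "Phe": 147.06841, "Arg": 156.10111, "Trp": 186.07931,
}
WATER = 18.01056   # added once for the free N- and C-termini
PROTON = 1.00728   # charge carrier for [M+H]+

def mh_plus(residues):
    """Monoisotopic mass of the singly protonated peptide."""
    return sum(MONOISOTOPIC_RESIDUE[r] for r in residues) + WATER + PROTON

acth_4_10 = ["Met", "Glu", "His", "Phe", "Arg", "Trp", "Gly"]
acth_4_11 = acth_4_10 + ["Lys"]

print(f"ACTH(4-10) [M+H]+ = {mh_plus(acth_4_10):.1f}")
print(f"ACTH(4-11) [M+H]+ = {mh_plus(acth_4_11):.1f}")
```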
[Figures 1-5: HPLC traces, detection at 220 nm, time axis in minutes.]

Figure 1. HPLC Profile of Unpurified ACTH(4-10)

Figure 2. HPLC Profile of Purified ACTH(4-10)

Figure 3. HPLC Profile of Unpurified ACTH(4-11)

Figure 4. HPLC Profile of Purified ACTH(4-11)

Figure 5. HPLC Profile of a Mixture of Purified ACTH(4-10) and ACTH(4-11)
B. Synthesis of Neuropeptide Y (YPSKPDNPGEDAPAEDMARYYSALRHYINLITRQRY-amide) and its corresponding C-terminal free acid peptide

Synthesis of Neuropeptide Y (NPY) was started by first adding Rink amide MBHA resin (0.14 mmol) to the reaction vessel. One synthesis cycle was performed to load Fmoc-Tyr(O-t-butyl) onto the Rink amide MBHA resin. For synthesizing the corresponding peptide with a C-terminal acid, Neuropeptide Y free acid peptide (NPYFA), 0.07 mmol of Tyr-Clt resin was added to the reaction vessel. The synthesis was continued to complete the chain assembly for both sequences. NPYFA was isolated by cleaving the resin mixture with 1:2:7 acetic acid:trifluoroethanol:DCM, followed by removal of the protecting groups with the TFA/scavenger mixture. HPLC profiles of the crude and purified NPYFA are shown in Figures 6 and 7. NPY was obtained by cleaving the remaining resin with the TFA/scavenger mixture. The crude peptide was purified to yield 43 mg of product with purity of >95%. Figures 8 and 9 show the HPLC profiles of the crude and purified NPY. Analytical HPLC as well as capillary electrophoresis of a mixture of NPY and NPYFA showed two distinct peaks corresponding to the two peptides, as shown in Figures 10 and 11. Mass spec analysis of NPY and NPYFA showed mass units of 4269.1 (Theoretical: 4269.08) and 4270.1 (Theoretical: 4270.07), respectively. Amino acid analysis showed the peptides to have the correct amino acid compositions. A summary of the synthesis results for the two sets of peptides is shown in Table 1.
[Figures 6-10: HPLC traces, detection at 220 nm, time axis in minutes. Figure 11: capillary electropherogram.]

Figure 6. HPLC Profile of Unpurified NPYFA

Figure 7. HPLC Profile of Purified NPYFA

Figure 8. HPLC Profile of Unpurified NPY

Figure 9. HPLC Profile of Purified NPY

Figure 10. HPLC Profile of a Mixture of Purified NPY and NPYFA

Figure 11. Capillary Electrophoresis Profile of a Mixture of NPY and NPYFA. Run conditions: detection at 185 nm; electrolyte pH 2.5; hydrostatic injection; run voltage 15 kV. Peak table:

Peak  Ret Time (min)  Area (uV*sec)  % Area  Height (uV)  % Height  Int Type
1     13.925          18467          0.52    1598         0.84      BV
2     14.217          60421          1.70    3012         1.59      VV
3     14.858          77473          2.18    5282         2.79      VV
4     15.250          1544276        43.46   91194        48.22     VV
5     15.825          1716179        48.30   83738        44.27     VV
6     16.908          136349         3.84    4310         2.28      VB
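The percent-area column of the Figure 11 peak table follows directly from the integrated areas. A minimal sketch (variable and function names are illustrative) reproduces it:

```python
# Sketch: percent area of each electropherogram peak from the integrated areas
# listed with Figure 11. Function name is hypothetical.

def percent_areas(areas):
    """Each peak area as a percentage of the summed area of all peaks."""
    total = sum(areas)
    return [100.0 * a / total for a in areas]

ret_times = [13.925, 14.217, 14.858, 15.250, 15.825, 16.908]   # min
areas = [18467, 60421, 77473, 1544276, 1716179, 136349]        # uV*sec

for t, pct in zip(ret_times, percent_areas(areas)):
    print(f"{t:7.3f} min  {pct:6.2f} %")
```

The two major peaks together account for roughly 92% of the integrated area.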
Table 1. Summary of the Synthesis Results of Two Sets of Peptides which Differ in their C-Termini

              Starting Resin (mmol)
Peptide       AA*-Chlorotrityl   Fmoc-Lys(Boc)-Wang   Rink Amide MBHA   Yield (mg)   Mol. Weight Found (Theor.)
ACTH(4-10)    0.125              -                    -                 83           962 (962.5)
ACTH(4-11)    -                  0.125                -                 25           1090.5 (1090.5)
NPY           -                  -                    0.14              43           4269.1 (4269.08)
NPYFA         0.07               -                    -                 56           4270.1 (4270.07)

* Starting resins used for synthesizing ACTH(4-10) and NPYFA were Gly-Clt-resin and Tyr-Clt-resin, respectively.
IV. Conclusion

We have demonstrated that peptides which differ in their C-termini can be simultaneously synthesized in one reaction vessel by employing resins that possess different cleavage properties. This synthesis strategy can also be used for the synthesis of multiple antigenic peptide systems, MAPS (8), and their corresponding des-lysine core sequences. Moreover, this strategy can be expanded to the synthesis of more than two peptides by employing other resins such as the HF-cleavable Pam resins and the photo-labile resins.
References
1. Fischer, W.H., Saunders, D., Brandenburg, D., Wollmer, A., Zahn, H. (1985) Biol. Chem. Hoppe-Seyler 366, 521-525.
2. Gattner, H.G. (1975) Z. Physiol. Chem. Hoppe-Seyler 356, 1397-1404.
3. Widmer, F. and Johansen, J.T., In Alitalo, K., Partanen, P. and Vaheri, A. (Eds.) (1985) Synthetic Peptides in Biology and Medicine, Elsevier Science Publishers, Amsterdam, p. 79-86.
4. Tjoeng, F.S., Towery, D.S., Bulock, J.W., Whipple, D.E., Fok, K.F., Williams, M.H., Zupec, M.E., Adams, S.P. (1990) Int. J. Peptide Protein Res. 35, 141-146.
5. Barlos, K., Chatzi, O., Gatos, D., Stavropoulos, G. (1991) Int. J. Peptide Protein Res. 37, 513-520.
6. Barlos, K., Gatos, D., Kapolos, S., Poulos, C., Schafer, W., Wenqing, Y. (1991) Int. J. Peptide Protein Res. 38, 553-561.
7. Barlos, K., Gatos, D., Kutsogianni, S., Papaphotiou, G., Poulos, C., Tsegenidis, T. (1991) Int. J. Peptide Protein Res. 38, 562-568.
8. Tam, J.P. (1988) Proc. Natl. Acad. Sci. U.S.A. 85, 5409-5413.
Correlation of Cleavage Techniques With Side-Reactions Following Solid-Phase Peptide Synthesis
Gregg B. Fields,¹ Ruth H. Angeletti,² Lynda F. Bonewald,³ William T. Moore,⁴ Alan J. Smith,⁵ John T. Stults,⁶ and Lynn C. Williams⁷
¹Dept. Lab Medicine & Pathology, Univ. Minnesota, Minneapolis, MN 55455
²Dept. Develop. Biol. & Cancer, Albert Einstein College Med., Bronx, NY 10461
³Depts. Med. & Biochem., Univ. Texas Health Sci. Center, San Antonio, TX 78284
⁴Dept. Pathology & Lab Medicine, Univ. Pennsylvania School Med., Philadelphia, PA 19104
⁵Beckman Center, Stanford University Medical Center, Stanford, CA 94305
⁶Genentech, Inc., South San Francisco, CA 94080
⁷Norris Cancer Research Institute, University of Southern California, Los Angeles, CA 90033
I. Introduction

Solid-phase peptide synthesis is routinely used in research ranging from the elucidation of chemical mechanisms to the development of potential therapeutics. The solid-phase method was originally designed for the synthesis of a single peptide at a time, but has more recently been applied to multiple peptide synthesis and the creation of synthetic peptide libraries. The vast number of diverse products that can be created with peptide libraries has made the need for highly efficient synthetic methods especially critical. The Association of Biomolecular Resource Facilities (ABRF) Research Committee on Peptide Synthesis (PS) was formed to evaluate the quality of the synthetic methods utilized in its member laboratories for peptide synthesis. Peptide synthesis, as defined by this committee, includes the chemistries used for peptide assembly and cleavage and the methods used for characterization of the final product. Studies in 1991 and 1992 requested the synthesis of test peptides by ABRF member laboratories. Products from these syntheses were characterized by amino acid analysis (AAA), reversed-phase high-performance liquid chromatography (RP-HPLC), capillary electrophoresis (CE), and mass spectrometry. Results were somewhat unexpected, as 13% of the 94 ABRF laboratory crude samples submitted for the 1991 and 1992 studies did not contain any of the desired peptide products (1,2). The respective cleavage conditions of Fmoc and Boc solid-phase peptide synthesis were believed to be the primary source of synthetic difficulties, since most non-desired products were the result of covalent adducts, not deletions. The 1993 study focused on peptide-resin cleavage conditions, as ABRF member laboratories were supplied with a peptide-resin that was assembled by the ABRF PS Committee. Even with the preassembled peptide-resin, 20% of the 46 crude samples did not contain any of the desired product (3), further emphasizing problems in cleavage conditions.
This year a study was designed whereby problems in peptide assembly versus cleavage were evaluated by AAA, RP-HPLC, Edman degradation sequence analysis, electrospray mass spectrometry (ESMS), and matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS). The ABRF PS Committee specified a limited number of cleavage conditions to determine if certain protocols could be generally recommended. A total of 45 ABRF laboratories participated in the study by supplying 82 crude samples of a peptide whose sequence was designed by the ABRF PS Committee. The sequence was identical to that of the 1991 ABRF PS peptide (1), and thus would allow for direct comparison of the efficiency of methodologies used in laboratories now versus 3 years ago.
II. Materials and Methods
Participating ABRF laboratories were asked to synthesize the following peptide by the methodology most commonly used in their facility:
H-Val-Lys-Lys-Arg-Cys-Ser-Met-Trp-Ile-Ile-Pro-Thr-Asp-Asp-Glu-Ala-OH
This particular sequence, which is identical to that used for the 1991 study (1), was chosen based on the potential for side-reactions during assembly, side-chain deprotection, and cleavage (1). Recommended side-chain protecting group strategies were Arg(Tos), Asp(OBzl), Cys(Meb), Glu(OBzl), Lys(ClZ), Ser(Bzl), Thr(Bzl), and Trp(For) for Boc-based chemistry and Arg(Pmc), Asp(OtBu), Cys(Trt), Glu(OtBu), Lys(Boc), Ser(tBu), Thr(tBu), and Trp(Boc) for Fmoc-based chemistry (4). Participants were asked to choose either HF or trimethylsilyl trifluoromethanesulfonate (TMSOTf) cleavage methods following Boc chemistry. Following Fmoc chemistry, participants were asked to use either reagent K (5) or reagent B (6) for peptide-resin cleavage and to work up the product by either (i) filtering the resin and precipitating the product with ether or (ii) filtering the resin, diluting the filtrate with water and extracting with ether, and lyophilizing the product. Samples containing ~5 mg of crude product were supplied to the ABRF PS Committee in coded form via a third party to maintain participant anonymity but allow the participants to identify data sets resulting from their samples.
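The molecular ions expected for the target peptide can be estimated from its sequence. The following Python sketch sums standard residue masses for the 16-residue sequence (Ile at positions 9 and 10, Pro at position 11) and converts the neutral mass to m/z for a chosen number of added protons. The residue-mass table, the proton mass, and the function names are illustrative assumptions and not part of the study protocol; oxidation, disulfide formation, and protecting-group adducts are ignored.

```python
# Standard residue masses in Da as (monoisotopic, average).
# These table values are assumptions supplied for illustration.
RESIDUE_MASS = {
    "Ala": (71.03711, 71.0788),
    "Arg": (156.10111, 156.1875),
    "Asp": (115.02694, 115.0886),
    "Cys": (103.00919, 103.1388),
    "Glu": (129.04259, 129.1155),
    "Ile": (113.08406, 113.1594),
    "Lys": (128.09496, 128.1741),
    "Met": (131.04049, 131.1926),
    "Pro": (97.05276, 97.1167),
    "Ser": (87.03203, 87.0782),
    "Thr": (101.04768, 101.1051),
    "Trp": (186.07931, 186.2132),
    "Val": (99.06841, 99.1326),
}
WATER = (18.01056, 18.0153)  # H and OH termini of the free peptide
PROTON = 1.00728             # mass added per charge in positive-ion mode

SEQUENCE = "Val-Lys-Lys-Arg-Cys-Ser-Met-Trp-Ile-Ile-Pro-Thr-Asp-Asp-Glu-Ala".split("-")


def neutral_mass(residues, kind="average"):
    """Neutral peptide mass: sum of residue masses plus one water."""
    idx = 0 if kind == "monoisotopic" else 1
    return sum(RESIDUE_MASS[r][idx] for r in residues) + WATER[idx]


def mz(mass, charge):
    """m/z of the [M + nH]n+ ion for a neutral mass and charge n."""
    return (mass + charge * PROTON) / charge


if __name__ == "__main__":
    for kind in ("monoisotopic", "average"):
        m = neutral_mass(SEQUENCE, kind)
        ions = ", ".join(f"z={z}: {mz(m, z):.1f}" for z in (1, 2, 3))
        print(f"{kind:>12}: M = {m:.2f} Da; {ions}")
```

Under these assumptions the average-mass [M + H]+ comes out close to 1893.2 and the monoisotopic [M + 3H]3+ close to 631.3; which mass scale a given measurement reflects depends on instrument resolution and calibration.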
A. Analytical RP-HPLC
Samples were dissolved in 0.1% aqueous TFA and ~50 µg analyzed on a Perkin Elmer Series 4 HPLC using a Vydac C18 column (300 Å pore size, 4.6 × 250 mm). The linear gradient extended from 0.1% aqueous TFA to 70% acetonitrile (containing 0.09% TFA) over 33 min. The flow rate was 2 mL/min and the absorbance was monitored at 214 nm using a Perkin Elmer LC 95 detector. Samples were injected by a Perkin Elmer SS 100 autosampler. Quantitation was by a Nelson Model 1020 Data System.
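For orientation, the programmed mobile-phase composition at any point of the linear gradient follows from simple interpolation. The sketch below assumes the gradient starts at 0% acetonitrile at the moment of injection and ignores system dwell volume and column dead time, so it gives only a nominal value; the function name and its default arguments are illustrative.

```python
def percent_acetonitrile(t_min, start_pct=0.0, end_pct=70.0, gradient_min=33.0):
    """Nominal % acetonitrile at time t_min for a linear gradient.

    Times before the start or after the end of the gradient are clamped
    to the starting and final compositions.
    """
    t = min(max(t_min, 0.0), gradient_min)
    return start_pct + (end_pct - start_pct) * t / gradient_min


if __name__ == "__main__":
    # 16.32 min is the mean retention time reported for the apparent
    # desired peptide in Results and Discussion.
    for t in (0.0, 16.32, 33.0):
        print(f"{t:5.2f} min -> {percent_acetonitrile(t):.1f}% acetonitrile")
```

The actual composition reaching the column at a given time lags the programmed value by an instrument-dependent delay, so the result should be read as an approximation.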
B. Amino Acid Analysis
Samples (~0.5 µg) were hydrolyzed for 24 h at 112 °C in 100 µL 6 N HCl, 0.2% phenol. Analysis was performed on a Beckman 6300 with a sulfated polystyrene cation-exchange column (0.4 × 25 cm). Quantitation was by a Beckman 7300.
C. Sequence Analysis
Edman degradation sequence analysis of selected samples (dissolved in 0.1% TFA-20% acetonitrile) was performed on an Applied Biosystems 477A Protein Sequencer/120A Analyzer using BioBrene Plus as described (7). In order to identify deprotection products and deletions, 800-900 pmol of sample was sequenced.
ABRF 1994 Peptide Synthesis Study
D. Mass Spectrometry
ESMS was performed with a Fisons VG Quattro outfitted with a Fisons Electrospray Source. Samples were dissolved in 1.0 mL of 50% methanol-1% acetic acid, then diluted 1:10 with 50% acetonitrile-1.0 mM ammonium acetate to give 25 pmol/µL. A 10 µL aliquot of each sample was injected into a 10 µL/min stream of 50% acetonitrile-1.0 mM ammonium acetate. Data were processed using Fisons MassLynx Software. MALDI-MS was performed with a Vestec Benchtop III linear time-of-flight mass spectrometer, operated in the linear mode with an N2 laser (337 nm). Samples were dissolved in 1.0 mL of 25% acetonitrile-0.1% TFA, then diluted 3:100 to give 5-10 pmol/µL. A 0.5 µL aliquot of each sample solution was added to 0.5 µL of matrix [α-cyano-4-hydroxycinnamic acid, saturated solution in 50% acetonitrile-2% TFA]. Samples were dried at ambient temperature and pressure. Each spectrum was the sum of ion intensity from 10-50 laser pulses. The mass axis was calibrated externally.
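The dilution figures above imply the stock concentrations and the amounts of peptide actually loaded. A minimal sketch of that arithmetic follows; it assumes that "1:10" and "3:100" mean parts of sample in parts of total volume, and the function names are illustrative.

```python
def stock_concentration(final_conc, parts_sample, parts_total):
    """Concentration before dilution, given the diluted concentration."""
    return final_conc * parts_total / parts_sample


def amount_loaded(conc_pmol_per_ul, volume_ul):
    """Amount of peptide (pmol) contained in the loaded volume."""
    return conc_pmol_per_ul * volume_ul


if __name__ == "__main__":
    # ESMS: 25 pmol/uL after a 1:10 dilution, 10 uL injected.
    print("ESMS stock, pmol/uL:", stock_concentration(25.0, 1, 10))
    print("ESMS injected, pmol:", amount_loaded(25.0, 10.0))

    # MALDI-MS: 5-10 pmol/uL after a 3:100 dilution, 0.5 uL spotted.
    for conc in (5.0, 10.0):
        stock = stock_concentration(conc, 3, 100)
        spotted = amount_loaded(conc, 0.5)
        print(f"MALDI stock {stock:.0f} pmol/uL, spotted {spotted:.1f} pmol")
```

Read this way, both protocols start from stock solutions of a few hundred pmol/µL, while the amount consumed per MALDI-MS spot is roughly two orders of magnitude smaller than the amount injected for ESMS.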
III. Results and Discussion
A total of 82 crude peptide samples were supplied for inclusion in this study. Two of the crude peptides were synthesized by Boc chemistry (2.4%) and 80 by Fmoc chemistry (98%). The fraction of peptides synthesized by Fmoc chemistry has thus increased each year of the ABRF PS study (Table I). The 2 peptide-resins assembled by Boc chemistry were cleaved with HF containing 10% anisole, 10% dimethyl sulfide, and 2% p-thiocresol. One laboratory elected to deprotect Trp(For) by treatment of the peptide-resin with 10% piperidine-DMF prior to HF cleavage; the other did not specify Trp(For) deprotection conditions. Of the 80 peptide-resins assembled by Fmoc chemistry, 49 were cleaved by reagent K (EDT-thioanisole-water-phenol-TFA, 1:2:2:2:33), 22 by reagent B (triisopropylsilane-phenol-water-TFA, 2:5:5:88), and 9 by other cleavage cocktails. Laboratories preferred to recover their products by (i) filtering the resin and precipitating the product with ether (67 samples, 84%) rather than (ii) filtering the resin, diluting the filtrate with water and extracting with ether, and lyophilizing the product (13 samples, 16%).
AAA showed 66 of 82 crude products to be compositionally correct (Trp was not quantitated). Only one product had a complete deletion, corresponding to a des(Val,Lys,Arg) peptide synthesized by Boc chemistry. Of the other 15 samples that were not compositionally correct, 13 showed at least a low Lys value, often accompanied by a low Ser, Thr, and/or Arg value. Based on sequence analysis and mass spectrometric results (see later discussion), neither the low Lys nor the other low amino acid values could be substantiated, indicating that AAA of crude peptides may sometimes suffer interference from scavengers used during peptide-resin cleavage. A similar effect was seen in the 1993 ABRF PS study (3).

Table I. Chemistry Utilized by Core Facilities for ABRF Test Peptides
Year    Fmoc (#)    Fmoc (%)    Boc (#)    Boc (%)
1991    18          50          18         50
1992    42          72          16         28
1993    34          74          12         26
1994    80          98          2          2
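The percentages in Table I follow directly from the sample counts. The short check below recomputes them, with the counts entered as read from Table I; the dictionary and function names are illustrative.

```python
# Crude samples per study year as (Fmoc, Boc), as read from Table I.
SAMPLES = {1991: (18, 18), 1992: (42, 16), 1993: (34, 12), 1994: (80, 2)}


def chemistry_percentages(fmoc, boc):
    """Share of Fmoc and Boc syntheses, rounded to whole percent."""
    total = fmoc + boc
    return round(100 * fmoc / total), round(100 * boc / total)


if __name__ == "__main__":
    for year, (fmoc, boc) in SAMPLES.items():
        pct_fmoc, pct_boc = chemistry_percentages(fmoc, boc)
        print(f"{year}: Fmoc {fmoc} ({pct_fmoc}%), Boc {boc} ({pct_boc}%)")
```

The yearly totals obtained this way (36, 58, 46, and 82 samples) agree with the sample numbers quoted in the text for the 1991 to 1994 studies.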
Table II. Characterization of 1994 ABRF Test Peptide
Desired Product*    RP-HPLC†    ESMS†    MALDI-MS†
75    2.5    0    16    0    16    0
*A Gln-containing peptide was the desired product for 2 Fmoc syntheses. See text for discussion.
†The total number of samples was 80 Fmoc and 2 Boc.
The RP-HPLC retention time of the apparent desired peptide was 16.32 ± 0.18 min. However, retention times outside this range were found with different samples of the same product and with different sample sizes. These differences were probably attributable to the effects of scavengers on RP-HPLC, as mass spectrometric results confirmed the presence of the desired product. RP-HPLC analyses indicated a good percentage of successful Fmoc syntheses, as 72% of the products had ≥25% of the apparent desired peptide (Table II). RP-HPLC analyses of the 2 peptides synthesized by Boc chemistry indicated that neither contained >25% of the desired product. It should be noted that RP-HPLC may overestimate the percentage of non-desired product due to the high UV absorbance of scavengers and side-chain protecting group adducts.
Eight peptides were subjected to Edman degradation sequence analysis. Three showed highly efficient peptide assembly, resulting in desired sequence purities of >95%. One of these samples had a low Ala value by AAA. The AAA result was thus not consistent with that from sequence analysis. Three peptides had partial sequence deletions that included Val¹ in one sample, Cys⁵ and/or Ser⁶ in one sample, and Ile⁹ or Ile¹⁰ in two samples. One sample had a complete deletion of Val¹, Lys², Lys³, and Arg⁴. These deletions were consistent with results from AAA and mass spectrometric analyses (see below). One sample contained ~12.6% of an unidentified component eluting in the Trp⁸ cycle. The hydrophobic nature of this component (elution time = 31.4 min) and the mass spectrometric results for the peptide (see below) are indicative of PTH-Trp containing a tBu adduct. This apparent PTH-Trp(tBu) peak was seen in several other samples, although a noted variation in retention time (31.8 - 35.3 min) suggests Trp modification at several different positions by the tBu group (8).
Assessment of product purity by ESMS and MALDI-MS showed semiquantitative agreement with RP-HPLC analyses (Table II). Figure 1A shows the analyses of a crude peptide containing >75% of the desired product as evaluated by ESMS and MALDI-MS. The molecular ions were [M + 3H]³⁺ = 631.3 Da and [M + 2H]²⁺ = 946.3 Da by ESMS and [M + H]⁺ = 1893.2 Da by MALDI-MS. For ESMS, samples were run at atypically high concentrations so that minor components were detectable. Further dilution of samples yielded identical results for those cases examined. In contrast, the relative abundances of peaks in a mixture detected by MALDI-MS varied with different dilutions of the sample, along with different spots on the sample target and different laser power settings. By comparison of the relative abundance of the desired molecular ions for all samples, the combined mass spectrometric techniques assigned 6 (6.2%) Fmoc-synthesized products and 1 (50%) Boc-based product as "poor" quality (