TECHNIQUES IN PROTEIN CHEMISTRY
VIII
Edited by
Daniel R. Marshak Osiris Therapeutics, Inc. Baltimore, Maryland
ACADEMIC PRESS San Diego New York Boston London Sydney Tokyo Toronto
This book is printed on acid-free paper. Copyright © 1997 by ACADEMIC PRESS All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.
Academic Press
525 B Street, Suite 1900, San Diego, California 92101-4495, USA
http://www.apnet.com
United Kingdom Edition published by Academic Press Limited
24-28 Oval Road, London NW1 7DX, UK
http://www.hbuk.co.uk/ap/
Library of Congress Card Catalog Number: 94-230592
International Standard Book Number: 0-12-473557-6 (case)
International Standard Book Number: 0-12-473558-4 (comb)
PRINTED IN THE UNITED STATES OF AMERICA
97 98 99 00 01 02 EB 9 8 7 6 5 4 3 2 1
Contents
Foreword
Preface
Acknowledgments
Section I Primary Structural Analysis

Protein Sequencing Using Microreactors and Capillary Electrophoresis with Thermo-optical Absorbance Detection
Xing-fang Li, Hongji Ren, Ming Qi, Darren F. Lewis, Ian D. Ireland, Karen C. Waldron, and Norman J. Dovichi

Enhancement of Concentration Limits of Detection in Capillary Electrophoresis: Examples of On-Line Sample Preconcentration, Cleanup, and Microreactor Technology in Protein Characterization
Andy J. Tomlinson, Linda M. Benson, Norberto A. Guzman, and Stephen Naylor

Sequencing MHC Class I Peptides Using Membrane Preconcentration-Capillary Electrophoresis Tandem Mass Spectrometry (mPC-CE-MS/MS)
Andy J. Tomlinson, Stephen Jameson, and Stephen Naylor

Nano-electrospray Mass Spectrometry and Edman Sequencing of Peptides and Proteins Collected from Capillary Electrophoresis
Mark D. Bauer, Yiping Sun, and Feng Wang

Characterization of a Recombinant Hepatitis E Protein Vaccine Candidate by Mass Spectrometry and Sequencing Techniques
C. Patrick McAtee and Yifan Zhang

Comparison of the High Sensitivity and Standard Versions of Applied Biosystems Procise™ 494 N-Terminal Protein Sequencers Using Various Sequencing Supports
Anita E. Lavin, Lee Anne Merewether, Christi L. Clogston, and Michael F. Rohde

Evaluation of ABRF-96SEQ: A Sequence Assignment Exercise
Joseph Fernandez, Arie Admon, Karen De Jongh, Greg Grant, William Henzel, William S. Lane, Kathryn L. Stone, and Barbara Merrill

Internal Protein Sequencing of SDS-PAGE-Separated Proteins: Optimization of an In Gel Digest Protocol
Ken Williams, Mary LoPresti, and Kathy Stone

A Strategy to Obtain Internal Sequence Information from Blotted Proteins after Initial N-terminal Sequencing
Kuo-Liang Hsi, William E. Werner, Lynn R. Zieske, Chris H. Grimley, Steven A. O'Neill, Michael L. Kochersperger, Kent Yamada, and Pau-Miau Yuan

Internal Protein Sequencing of SDS PAGE-Separated Proteins: A Collaborative ABRF Study
Ken Williams, Ulf Hellman, Ryuji Kobayashi, William Lane, Sheenah Mische, and David Speicher
Section II Physical and Chemical Analysis

Chromatographic Determination of Extinction Coefficients of Non-Glycosylated Proteins Using Refractive Index (RI) and UV Absorbance (UV) Detectors: Applications for Studying Protein Interactions by Size Exclusion Chromatography with Light-Scattering, UV, and RI Detectors
Jie Wen, Tsutomu Arakawa, Jette Wypych, Keith E. Langley, Meredith G. Schwartz, and John S. Philo

Single Alkaline Phosphatase Molecule Assay by Capillary Electrophoresis Laser-Induced Fluorescence Detection
Douglas B. Craig, Edgar A. Arriaga, Jerome C. Y. Wong, Hui Lu, and Norman J. Dovichi

A New Centrifugal Device Used in Sample Clean-up and Concentration of Peptides
Donald G. Sheer, Elizabeth Kellard, William Kopaciewicz, Patrick Gearing, Jeff Wong, and Michael Klein

Sample Preparation Using Synthetic Membranes for the Study of Biopolymers by Matrix Assisted Laser Desorption/Ionization Mass Spectrometry
T. A. Worrall, J. A. Porter, R. J. Cotter, and A. S. Woods

Use of LC/MS Peptide Mapping for Characterization of Isoforms in ¹⁵N-Labeled Recombinant Human Leptin
Jennifer L. Liu, Tamer Eris, Scott L. Lauren, George W. Stearns, Keith R. Westcott, and Hsieng Lu

Hyphenated HPLC Methodology for the Resolution and Elucidation of Peptides from Proteolytic Digests
Randall T. Bishop, Vincent E. Turula, James A. de Haseth, and Robert D. Richer

Detecting and Identifying Active Compounds from a Combinatorial Library Using IAsys and Electrospray Mass Spectrometry
Bolong Cao, Jan Urban, Tomas Vaisar, Richard Y. W. Shen, and Michael Kahn

Amino Acid Analysis of Unusual and Complex Samples Based on 6-Aminoquinolyl-N-hydroxysuccinimidyl Carbamate Derivatization
Steven A. Cohen and Charlie van Wandelen

Development of a Method for Analysis of Free Amino Acids from Physiological Samples Using a 420A ABI/PE Amino Acid Analyzer
Klaus D. Linse, Sandie Smith, and Michelle Gadush

Quantitation and Identification of Proteins by Amino Acid Analysis: ABRF-96AAA Collaborative Trial
K. M. Schegg, N. D. Denslow, T. T. Andersen, Y. Bao, S. A. Cohen, A. M. Mahrenholz, and K. Mann
Section III Chemical Modification

Nonaqueous Chemical Modification of Lyophilized Proteins
Harvey Kaplan and Alpay Taralp

Reaction of HIV-1 NC p7 Zinc Fingers with Electrophilic Reagents
E. Chertova, B. P. Kane, L. V. Coren, D. G. Johnson, R. C. Sowder II, P. Nower, J. R. Casas-Finet, L. O. Arthur, and L. E. Henderson

The Identification and Isolation of Reactive Thiols in Ricin A-Chain and Blocked Ricin Using 2-(4'-Maleimidylanilino)naphthalene-6-sulfonic Acid
Mary E. Denton, Rita M. Steeves, and John M. Lambert

Inactivation of the Human Cytomegalovirus Protease by Diisopropylfluorophosphate
Thomas Hesson, Anthony Tsarbopoulos, S. Shane Taremi, Winifred W. Prosise, Nancy Butkiewicz, Bimalendu DasMahapatra, Michael Cable, Hung Van Le, and Patricia C. Weber

Studies on the Status of Arginine Residues in Phospholipase A2 from Naja naja atra (Taiwan cobra) Snake Venom
C. C. Yang, T. S. Yuo, and C. K. Chen

Selective Reduction of the Intermolecular Disulfide Bridge in Human Glial Cell Line-Derived Neurotrophic Factor Using Tris-(2-Carboxyethyl)Phosphine
John O. Hui, John Le, Viswanatham Katta, Michael F. Rohde, and Mitsuru Haniu

Effects of Surface Hydrophobicity on the Structural Properties of Insulin
Mark L. Brader, Rohn L. Millican, David N. Brems, Henry A. Havel, Aidas Kriauciunas, and Victor J. Chen

The Effects of in Vitro Methionine Oxidation on the Bioactivity and Structure of Human Keratinocyte Growth Factor
Christopher S. Spahr, Linda O. Narhi, James Speakman, Hsieng S. Lu, and Yueh-Rong Hsu
Section IV Posttranslational and Other Modifications

Effects of Enzyme Glycosylation on the Chemical Step of Catalysis, as Probed by Hydrogen Tunneling and Enthalpy of Activation
Amnon Kohen, Thorlakur Jonsson, and Judith P. Klinman

Profile Analysis of Oligosaccharides from Glycoproteins by PMP Labeling. Comparison of Chemical and Enzymatic Release Methods Using RP-HPLC and Mass Spectrometry
Hanspeter Michel, Yuemei Ma, Barbara DeBarbieri, and Yu-Ching E. Pan

Positive Identification of Glycosylation Sites in Proteins and Peptides Using a Modified Beckman LF 3600 N-Terminal Protein Sequencer
Xiaomei Lin, L. Wulf Carson, Saber M. A. Khan, Clark F. Ford, and Kristine M. Swiderek

Deamidation and Isoaspartate Formation during in Vitro Aging of a Recombinant Hepatitis E Vaccine Candidate
C. Patrick McAtee and Yifan Zhang

The Isolation and Characterization of Active Site Peptides in Lysyl Oxidase
Sophie X. Wang, Judith P. Klinman, Katalin F. Medzihradszky, and Alma L. Burlingame

Complement Activation in EDTA Blood/Plasma Samples May Be Caused by Coagulation Proteases
Philippe H. Pfeifer, Tony E. Hugli, Earl W. Davie, and Kazuo Fujikawa

Disulfide-Linked Human Stem Cell Factor Dimer: Method of Identification and Molecular Comparison to the Noncovalent Dimer
Hsieng S. Lu, Michael D. Jones, and Keith E. Langley

Autocatalytic Reduction of a Humanized Antibody
A. Ashok Kumar, John Kimura, and Jennifer Running Deer
Section V Interactions of Protein with Ligands

Oxygen and Ascorbate Mediated Modification of a Recombinant Hemoglobin
Bruce A. Kerwin, Edward Hess, Julie Lippincott, Ray Kaiser, and Izydor Apostol

Metal Activation and Regulation of E. coli RNase H
James L. Keck and Susan Marqusee

Crystal Structure of Avian Sarcoma Virus Integrase with Bound Essential Cations
Jerry Alexandratos, Grzegorz Bujacz, Mariusz Jaskolski, Alexander Wlodawer, George Merkel, Richard A. Katz, and Anna Maria Skalka

Multidimensional NMR Studies of an Exchangeable Apolipoprotein and Its Interactions with Lipids
Jianjun Wang, Daisy Sahoo, Dean Schieve, Stephane M. Gagne, Brian D. Sykes, and Robert O. Ryan

NMR Methods for Analysis of CRALBP Retinoid Binding
Linda A. Luck, Ronald A. Venters, James T. Kapron, Karen E. Roth, Seth A. Barrows, Sara G. Paradis, and John W. Crabb

A Novel Method for Measuring the Binding Properties of the Site-Directed Mutants of the Proteins That Bind Hydrophobic Ligands: Application to Cellular Retinoic Acid Binding Proteins
Honggao Yan, Lincong Wang, and Yue Li

A Strategy for Predicting the Ligand Binding Competence of Recombinant Orphan Nuclear Receptors Using Biophysical Characterization
Derril Willard, Bruce Wisely, Derek Parks, Martin Rink, William Holmes, Michael Milburn, and Thomas Consler
Section VI Protein-Protein Interactions

Detection of Intra-Cellular Protein-Protein Interactions: Penicillin Interactive Proteins and Morphogene Proteins
S. Bhardwaj and R. A. Day

Use of Synthetic Peptides in Mapping the Binding Sites for hsp70 in a Mitochondrial Protein
Antonio Artigues, Ana Iriarte, and Marino Martinez-Carrion

Interfacing Biomolecular Interaction Analysis with Mass Spectrometry and the Use of Bioreactive Mass Spectrometer Probe Tips in Protein Characterization
Randall W. Nelson, Jennifer R. Krone, David Dogruel, Kemmons Tubbs, Russ Granzow, and Osten Jansson

Transition-State Theory and Secondary Forces in Antigen-Antibody Complexes
Mark E. Mummert and Edward W. Voss, Jr.

Thermodynamic Investigation of Enzyme and Inhibitor Interactions with High Affinity
Yudu Cheng, Jacek Slon-Usakiewicz, Jing Wang, Enrico O. Purisima, and Yasuo Konishi

Development and Characterization of a Fab Fragment as a Surrogate for the IL-1 Receptor
Y. Cong, A. S. McColl, T. R. Hynes, R. C. Meckel, P. S. Mezes, C. L. Lane, S. E. Lee, D. J. Wasilko, K. F. Geoghegan, I. G. Otterness, and G. O. Daumy
Section VII Macromolecular Assemblies

Topology of Membrane Proteins in Native Membranes Using Matrix-Assisted Laser Desorption Ionization/Mass Spectrometry
Kamala Tyagarajan, John G. Forte, and R. Reid Townsend

Role of D-Ser46 in the P-type Calcium Channel Blocker, ω-Agatoxin-TK
Tomohiro Watanabe, Manabu Kuwada, Kumiko Y. Kumagaye, Kiichiro Nakajima, Yukio Nishizawa, and Naoki Asakawa

Involvement of Basic Amphiphilic α-Helical Domain in the Reversible Membrane Interaction of Amphitropic Proteins: Structural Studies by Mass Spectrometry, Circular Dichroism, and Nuclear Magnetic Resonance
Nobuhiro Hayashi, Mamoru Matsubara, Koiti Titani, and Hisaaki Taniguchi

One-Dimensional Diffusion of a Protein along a Single-Stranded Nucleic Acid
Bradley R. Kelemen and Ronald T. Raines

Metal-Dependent Structure and Self Association of the RAG1 Zinc-Binding Domain
Karla K. Rodgers and Karen G. Fleming

Localizing Flexibility within the Target Site of DNA-Bending Proteins
Anne Grove and E. Peter Geiduschek

Assembly of the Multifunctional EcoKI DNA Restriction Enzyme in Vitro
David T. F. Dryden, Laurie P. Cooper, and Noreen E. Murray
Section VIII Three Dimensional Structure

Strategies for NMR Assignment and Global Fold Determinations Using Perdeuterated Proteins
Ronald A. Venters, Hai M. Vu, Robert M. de Lorimier, and Leonard D. Spicer

¹H-NMR Evidence for Two Buried ASN Side-Chains in the c-MYC-MAX Heterodimeric α-Helical Coiled-Coil
Pierre Lavigne, Matthew P. Crump, Stephane M. Gagne, Brian D. Sykes, Robert S. Hodges, and Cyril M. Kay

NMR Confirms the Presence of the Amino-Terminal Helix of Group II Phospholipase A2 in Solution
Roman Jerala, Paulo F. F. Almeida, Rodney L. Biltonen, and Gordon S. Rule

The Crystallographic Analysis of Glycosylation-Inhibiting Factor
Yoichi Kato, Takanori Muto, Hiroshi Watarai, Takafumi Tomura, Toshifumi Mikayama, and Ryota Kuroki

Structure of the D30N Active Site Mutant of FIV Proteinase Complexed with a Statine-Based Inhibitor
Celine Schalk-Hihi, Jacek Lubkowski, Alexander Zdanov, Alexander Wlodawer, Alla Gustchina, Gary S. Laco, and John H. Elder

A Homology-Based Model of Juvenile Hormone Esterase from the Crop Pest, Heliothis virescens
Beth Ann Thomas, W. Bret Church, and Bruce D. Hammock

Analysis of Linkers of Regular Secondary Structures in Proteins
V. Geetha and Peter J. Munson

Structural and Functional Roles of Tyrosine-50 of Yeast Guanylate Kinase
Yanling Zhang, Yue Li, and Honggao Yan
Section IX Dynamics and Folding

Flexibility of Serine Protease in Nonaqueous Solvent
Samuel Toba, David S. Hartsough, and Kenneth M. Merz, Jr.

Higher-Order Structure and Dynamics of FK506-Binding Protein Probed by Backbone Amide Hydrogen/Deuterium Exchange and Electrospray Ionization Fourier Transform Ion Cyclotron Resonance Mass Spectrometry
Zhongqi Zhang, Weiqun Li, Ming Li, Timothy M. Logan, Shenheng Guan, and Alan G. Marshall

Internal Dynamics of Human Ubiquitin Revealed by ¹³C-Relaxation Studies of Randomly Fractionally Labeled Protein
A. Joshua Wand, Jeffrey L. Urbauer, Robert P. McEvoy, and Ramona J. Bieber

Detection of Protein Unfolding and Fluctuations by Native State Hydrogen Exchange
Aaron K. Chamberlain, Tracy M. Handel, and Susan Marqusee

Laser Temperature Jump for the Study of Early Events in Protein Folding
Peggy A. Thompson

Biophysical and Structural Analysis of Human Acidic Fibroblast Growth Factor
Michael Blaber, Daniel H. Adamek, Aleksandar Popovic, and Sachiko I. Blaber

A Thermodynamic Analysis Discriminating Loop Backbone Conformations
Jean-Luc Pellequer and Shu-wen W. Chen

The Equilibrium Ensemble of Conformational States in Staphylococcal Nuclease
Vincent J. Hilser and Ernesto Freire

An Evaluation of Protein Secondary Structure Prediction Algorithms
Georgios Pappas, Jr., and Shankar Subramaniam
Section X Biological and Chemical Design

Designing Water Soluble β-Sheet Peptides with Compact Structure
Elena Ilyina, Vikram Roongta, and Kevin H. Mayo

Engineering Secondary Structure to Invert Coenzyme Specificity in Isopropylmalate Dehydrogenase
Ridong Chen, Ann F. Greer, Antony M. Dean, and James H. Hurley

A Method for Determining Domain Binding Sites in Proteins with Swapped Domains: Implications for βA3- and βB2-Crystallins
Yuri V. Sergeev and J. Fielding Hejtmancik

Complete Mutagenesis of the Gene Encoding TEM-1 β-Lactamase
Timothy Palzkill, Wanzhi Huang, and Joseph Petrosino

Characterization of Truncated Kirsten-Ras Purified from Baculovirus Infected Insect Cells Indicates Heterogeneity due to N-terminal Processing and Nucleotide Dissociation
Lisa M. Churgay, Nancy B. Rankl, John M. Richardson, Gerald W. Becker, and John E. Hale

Isolation and Characterization of Multiple-Methionine Mutants of T4 Lysozyme with Simplified Cores
Nadine C. Gassner, Walter A. Baase, Joel D. Lindstrom, Brian K. Shoichet, and Brian W. Matthews

Synthesis of Alzheimer's (1-42) Aβ-Amyloid Peptide with Preformed Fmoc-Aminoacyl Fluorides
Saskia C. F. Milton, R. C. de Lisle Milton, Steven A. Kates, and Charles Glabe

Analysis of Racemization during "Standard" Solid Phase Peptide Synthesis: A Multicenter Study
Ruth Hogue Angeletti, Lisa Bibbs, Lynda F. Bonewald, Gregg B. Fields, Jeffery W. Kelly, John S. McMurray, William T. Moore, and Susan T. Weintraub

Index
Foreword
Once again it is a great pleasure to thank Dan Marshak on behalf of the Protein Society for editing Techniques in Protein Chemistry. The volumes in this series provide "bench-top" references that will be of ongoing value to practicing protein scientists. This volume continues this outstanding tradition. Following an organizational strategy that was introduced last year, the articles have been arranged by concepts rather than by methodology. It is hoped that this format will serve to alert the reader to alternative approaches that may be available to address a given biological or biochemical problem. This compilation of articles has been selected from presentations at the Tenth Symposium of the Protein Society held in San Jose, August 3-7, 1996. I would like to join Dan in thanking the Associate Editors, Phil Andrews, Gerry Carlson, Steve Carr, Xiaodong Cheng, Lowell Ericsson, Sheenah Mische, Nick Pace, Len Spicer, and Ken Williams, as well as the former volume editors, Joe Villafranca, Tony Hugli, John Crabb, and Ruth Angeletti, for their help. This is the second volume edited by Dan Marshak. I am pleased to announce that Gerry Carlson has kindly agreed to take over this task for the next two years.
Brian W. Matthews President The Protein Society
Preface
Techniques in Protein Chemistry VIII is the latest volume in this successful series describing the most up-to-date methodologies in proteins. The contributions were selected from presentations at the Tenth Symposium of the Protein Society held in San Jose, California, in August 1996. The structure of this year's edition continues the new format of last year's volume. The ten sections of the book are segregated by subject area to show the reader the techniques that are currently applied to certain problems in protein science. This reflects current trends in the field, in which specific instruments and methodologies are used in several different arenas. For example, mass spectrometry is now used in protein sequencing, analysis of posttranslational modifications, analysis of chemical modifications, protein engineering, and higher order protein structure. Even methods such as crystallography and nuclear magnetic resonance are used in determining protein-ligand interactions, protein-protein interactions, and macromolecular assemblies in addition to traditional three-dimensional protein structural analysis. I hope this format will be useful to a readership that is rapidly expanding its horizons concerning the application of various techniques to questions in protein science.
The credit for reviewing the manuscripts is due the associate editors: Phil Andrews, Gerry Carlson, Steve Carr, Xiaodong Cheng, Lowell Ericsson, Sheenah Mische, Nick Pace, Len Spicer, and Ken Williams. Their expertise in specific areas of protein science was the key to selecting contributions from the many excellent presentations. I have had the benefit of counsel from John Crabb and Ruth Angeletti, and look forward to next year's volume, which will be edited by Gerry Carlson. Finally, I thank my secretary, Debra Rizzieri, for her assistance.
Protein science has become a fountainhead of new discoveries that fuel the engines of biology. The expansion of techniques that can be applied to proteins has allowed the creation of a vast set of tools for the practitioner. This volume is a celebration of the investigators who invent and apply new methods.
Daniel R. Marshak
Osiris Therapeutics, Inc. and The Johns Hopkins University School of Medicine
Acknowledgments
The Protein Society acknowledges with thanks the following organizations which, through their support of the Society's program goals, contributed in a meaningful way to the tenth annual symposium and thus to this volume.
Aviv Instruments, Inc.
Beckman Instruments, Inc.
BioMolecular Technologies, Inc.
BIOSYM/Molecular Simulations
Bristol-Myers Squibb
Finnigan MAT
Fisons Instruments
Hewlett-Packard Company
IntelliGenetics, Inc.
JASCO, Inc.
Kirin Brewery Co., Ltd.
Michrom BioResources, Inc.
Molecular Simulations, Inc.
Perkin-Elmer Corporation, Applied Biosystems Division
PerSeptive Biosystems, Inc.
Pharmacia Biosensor
Pharmacia Biotech, Inc.
Rainin Instrument Co., Inc.
Schering-Plough Research Institute
Shimadzu Scientific Instruments, Inc.
Supelco, Inc.
VYDAC
Waters Corporation
Wyatt Technology Corporation
ZymoGenetics
SECTION I Primary Structural Analysis
Protein Sequencing Using Microreactors and Capillary Electrophoresis with Thermo-optical Absorbance Detection
Xing-fang Li, Hongji Ren, Ming Qi, Darren F. Lewis, Ian D. Ireland, Karen C. Waldron, and Norman J. Dovichi
Department of Chemistry, University of Alberta, Edmonton, Alberta, Canada T6G 2G2
Abstract
A miniaturized protein and peptide microsequencer consisting of either a fused silica capillary reactor or a Teflon microreactor is described. The performance of the miniaturized sequencer was evaluated by sequencing 33 pmol (capillary reactor) and 27 pmol (Teflon microreactor) of myoglobin covalently attached to Sequelon-DITC. The products generated by the sequencer were analyzed using capillary electrophoresis with thermo-optical absorbance detection. This CE system provides reproducible migration times (RSD < 0.4%) and detection limits of less than 4 fmol.
I. Introduction
The primary amino acid sequence of a polypeptide is routinely determined using commercially available gas-liquid-phase sequencers [1, 2] and solid-phase sequencers [3, 4] based on the Edman degradation chemistry [5]. These instruments can routinely obtain the primary amino acid sequence from 10 to 100 pmol of polypeptide. However, the need for higher sequencing sensitivity remains: Kent et al. [6] have pointed out that rare proteins may be present only at the 30-300 fmol level on 2D polyacrylamide gels.
TECHNIQUES IN PROTEIN CHEMISTRY VIII. Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved.
Although tandem MS has demonstrated rapid and sensitive determination of primary sequence without the use of Edman degradation chemistry [7-9], this technique is typically limited to peptides smaller than 15 residues; larger peptides generate very complex mass spectra that are difficult to interpret. Edman chemistry-based sequencing techniques are therefore still essential to biological studies. Researchers have improved the sensitivity of these techniques by miniaturizing and modifying the sequencer components, from separation column to reaction cartridge. Reducing the HPLC column inner diameter from 4 mm to 2 mm improved sensitivity fourfold. The continuous flow reactor (CFR) described by Shively's group [10], consisting of concentric Teflon tubes, gave high-sensitivity sequence analysis of 5 pmol of protein adsorbed on polyvinylidene difluoride (PVDF) with Polybrene. The Hewlett-Packard biphasic reaction column sequencer gave similar sequencing sensitivity [11]. However, these authors [10, 11] pointed out that routine use at this level was difficult. Regardless of configuration, miniaturizing the reaction cartridge volume permits the use of less reagent and thereby reduces the level of non-specific reactions that give rise to background noise in the chromatographic identification of sequencing products.
The current technologies for protein sequencing are limited by the UV detection of PTH-amino acids [2]. As a result, we have developed an alternative technology for the separation and determination of minute amounts of PTH-amino acids: micellar electrokinetic capillary chromatography (MECC) with thermo-optical absorbance detection (TOAD) [12, 13]. This technology has been used routinely in our laboratory for over two years to identify the PTH-amino acids resulting from manual and semi-automated Edman degradation reactions.
In the first part of this report, we present the reproducibility of migration times for PTH-amino acids to demonstrate the reliability of this technique. Unfortunately, the MECC-TOAD system cannot be coupled directly to commercially available protein sequencers because of a volume incompatibility. Less than 10 nL of the sample solution is typically injected into the CE capillary to preserve the high efficiency of separation, whereas up to 100 µL of solution is collected from existing commercial sequencers. Miniaturization of the sequencer is therefore essential to overcome this volume mismatch and to take advantage of the sensitive, fast, and efficient determination of PTHs using MECC-TOAD.
Recently, we reported the design of a miniaturized sequencer consisting of a capillary-sized reaction chamber and a multiport valve delivery system to address the problem of volume compatibility with CE [14]. However, that sequencer could not reproducibly deliver sub-microliter volumes of reagent because of microliter dead volumes in the multiport valves. Gas-liquid-phase Edman degradation of picomole levels of proteins adsorbed on Polybrene-coated silica beads or PVDF membranes was not successful with this sequencer. Adequate sequencing results were obtained only when proteins were covalently bound to solid supports, and even then the background peaks were large. Reducing the amounts of reagent used for Edman degradation reduced the background peaks. Unfortunately, the multiport valve and argon-pressurized delivery system described by Waldron et al. [14] precluded using less than 4 µL of reagent or solvent. Therefore, to further reduce the reagent volumes, we redesigned the miniaturized microsequencer to eliminate the valves and the argon-pressurized delivery system. In this paper, we describe the design of a miniaturized sequencer in which syringe pumps are coupled directly via capillary tubing to the reaction chamber to deliver reagents and solvents for covalent polypeptide sequencing. This system is able to deliver less than 2 µL of each reagent. As a result, the background peaks are significantly reduced. Preliminary results for this microsequencer are presented.
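The volume mismatch described above can be made concrete with a little arithmetic. The following Python sketch is illustrative only: the function name is ours, and the 10 pmol of product per cycle is a made-up figure chosen for easy numbers, not a value from this work. It compares how much of one cycle's product reaches the capillary when a 10-nL injection is drawn from a 100-µL fraction and when it is drawn from a 1-µL sample.

```python
def injected_amount_fmol(product_pmol, sample_volume_ul, injection_volume_nl):
    """Amount (fmol) carried into the capillary by one injection.

    Assumes the product is uniformly dissolved in the sample volume.
    """
    # Fraction of the sample that is injected (both volumes expressed in nL).
    fraction = injection_volume_nl / (sample_volume_ul * 1000.0)
    return product_pmol * 1000.0 * fraction  # pmol -> fmol


# Hypothetical cycle product of 10 pmol (illustrative value only).
from_100_ul = injected_amount_fmol(10.0, sample_volume_ul=100.0, injection_volume_nl=10.0)
from_1_ul = injected_amount_fmol(10.0, sample_volume_ul=1.0, injection_volume_nl=10.0)

print(f"10 nL from a 100 uL fraction: {from_100_ul:.1f} fmol injected")
print(f"10 nL from a 1 uL sample:     {from_1_ul:.1f} fmol injected")
```

Under these assumptions the smaller sample volume puts one hundred times more analyte on the capillary for the same injection, which is the motivation for shrinking the sequencer rather than enlarging the injection.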
II. Experimental
A. Routine Analysis of PTH-Amino Acids Using Micellar Electrokinetic Capillary Chromatography with Thermo-optical Absorbance Detection (MECC-TOAD)
The instrument and conditions for the determination of PTH-amino acids by MECC-TOAD have been reported in detail elsewhere [12, 13, 15]. The RSD values of the migration times were calculated after applying a two-marker correction method that will be reported in a separate paper [16]. The two markers used were DMPTU and DPU. The detection limits for the 19 PTH-amino acids, DMPTU, DPTU, and DPU were calculated as three times the standard deviation of the background signal. PTH-Y was used as the internal standard to calculate the sequencing yields.
B. Miniaturization of the Protein Sequencer
The design and construction of the fused silica capillary-based reaction chamber have been described in detail elsewhere [15]. Another reaction chamber was constructed from a block of Teflon. The configuration of the Teflon reactor is shown in Figure 1. The central channel (0.762 mm i.d.) held the membrane-bound protein samples. The top and bottom of this channel were connected to the Ar source with 0.762-mm-o.d., 0.305-mm-i.d. Teflon tubing (Cole-Parmer). The other five small channels all had the same diameter, 0.367 mm, and were connected to syringe pumps by 367-µm-o.d., 100-µm-i.d. Teflon-coated fused silica capillaries (Polymicro). All connections were made snug. The reagents were delivered through the small channels directly to the central channel.
Figure 1. Schematic of the Teflon microreactor. A, Teflon block; B, Teflon tubing (0.762 mm o.d. and 0.305 mm i.d.); C, Teflon-coated fused silica capillary (367 µm o.d. and 100 µm i.d.); D, protein-bound membrane; E, sample vial for collecting the products. [Port labels shown in the figure: WSH1, PITC, WSH2.]
C. Automated Protein Sequencing
The sequencing conditions used in this study were similar to those reported previously [15], with a few changes. The syringe pumps that delivered the reagents, the order of delivery, the amounts of reagents, the reaction times, and the Ar drying steps were all controlled automatically by a Macintosh computer running a program developed in our laboratory with the LabVIEW development software (National Instruments). Conversion of the ATZ-amino acids (extracted with TFA) to the PTH form was carried out off-line. After cleavage, the extract from each Edman degradation cycle was collected into a 200-µL vial, to which 25 µL of 25% aqueous TFA solution was added and mixed. The solution was heated at 67 °C for 10 min and then dried in a vacuum centrifuge. The residue in the vial was dissolved in 1 µL of internal standard (5.8 × 10^-[illegible] M PTH-tyrosine (PTH-Y) in 10% acetonitrile/90% water) and then analyzed by MECC-TOAD for identification of the PTH-amino acid.
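The use of PTH-Y as an internal standard for calculating yields can be sketched as follows. This is an assumption-laden illustration rather than the authors' procedure: the function names are hypothetical, equal detector response for analyte and standard is assumed unless a response factor is supplied, and the standard concentration is passed in as a parameter because its exponent is not legible in this copy. The 1 × 10^-5 M used in the example is a placeholder, not the paper's value.

```python
def pth_amount_pmol(area_analyte, area_standard, standard_conc_molar,
                    standard_volume_ul=1.0, response_factor=1.0):
    """Estimate the amount (pmol) of a PTH-amino acid from an internal standard.

    response_factor is the analyte-to-standard signal ratio per mole;
    1.0 assumes equal response, which is a simplification.
    """
    # mol/L * uL = 1e-6 mol, and 1 mol = 1e12 pmol, so the product is scaled by 1e6.
    standard_pmol = standard_conc_molar * standard_volume_ul * 1e6
    return (area_analyte / area_standard) * standard_pmol / response_factor


def cycle_yield_percent(amount_pmol, loaded_pmol):
    """Yield of one cycle relative to the amount of protein loaded."""
    return 100.0 * amount_pmol / loaded_pmol


# Placeholder inputs: peak-area ratio of 2.5 and a 1e-5 M standard (both hypothetical).
amount = pth_amount_pmol(area_analyte=2.5, area_standard=1.0,
                         standard_conc_molar=1e-5)
print(f"estimated amount: {amount:.1f} pmol")
print(f"yield vs. 33 pmol loaded: {cycle_yield_percent(amount, 33.0):.0f}%")
```

Because the residue is redissolved in the standard solution itself, the standard amount scales with the redissolution volume, so any error in that 1-µL volume propagates directly into the estimated yield.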
III. Results and Discussion
A. Evaluation of MECC-TOAD as a Routine PTH-Amino Acid Analyzer
Standard solutions containing the 19 PTH-amino acids, DMPTU, DPTU, and DPU, each at a concentration of 2.5 × 10^-[illegible] M, were analyzed routinely under the following conditions: 15-s hydrodynamic injection at a 4-cm height difference; a 40-cm-long, 50-µm-i.d., 185-µm-o.d. fused silica capillary preconditioned by gravity flow of the running buffer for over 24 h; 9-kV running voltage; and a running buffer composed of 10.7 mM sodium phosphate, 1.8 mM sodium borate, and 25 mM SDS. After the electropherograms were obtained, the migration times of the analytes were corrected using DMPTU and DPU as markers. RSD values of the migration times for the 22 analytes were calculated from ten electropherograms. When the ten electropherograms were obtained on the same day, the RSD values of the corrected migration times were below 0.4% for all 22 analytes. Even when the ten electropherograms were obtained over a period of three months, the RSD values were still below 0.6%, except for PTH-H and PTH-R, which were at 1% and 1.2%, respectively. These results demonstrate that PTH-amino acid residues resulting from Edman degradation can be reliably identified by MECC-TOAD.
MECC-TOAD also provides high sensitivity. A typical performance of this instrument under the conditions described above is shown in Figure 2. The detection limits calculated from Figure 2 range from 0.5 to 1.7 µM, equivalent to 1.4 to 4.6 fmol (Table I). In contrast, HPLC-UV analyzers had a mass detection limit of about 1 pmol and a concentration detection limit of 2 µM, provided that the injection volume was 50 µL [24]. Unfortunately, the volume mismatch between MECC-TOAD and available sequencers has limited the use of this reproducible and highly sensitive technology. Therefore, miniaturization of the protein sequencer is essential.
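The RSD figures above depend on a two-marker correction whose details are deferred to a separate paper [16]. As a stand-in, the Python sketch below assumes the simplest possible form of such a correction: a linear rescaling of each run so that the two markers fall on fixed reference times. That form, the function names, and the migration times in the example are all our assumptions and are not taken from the source.

```python
from statistics import mean, stdev


def two_marker_correct(t, t_marker1, t_marker2, ref_marker1, ref_marker2):
    """Linearly rescale a migration time so both markers land on their reference times.

    This linear form is an assumed stand-in for the correction in [16].
    """
    scale = (ref_marker2 - ref_marker1) / (t_marker2 - t_marker1)
    return ref_marker1 + (t - t_marker1) * scale


def rsd_percent(values):
    """Relative standard deviation (sample standard deviation / mean), in percent."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical runs: (first marker, analyte, second marker) migration times in minutes.
runs = [(5.00, 7.00, 10.00), (5.10, 7.15, 10.22), (4.95, 6.92, 9.88)]
ref_m1, ref_m2 = runs[0][0], runs[0][2]  # take the first run as the reference

raw = [analyte for _, analyte, _ in runs]
corrected = [two_marker_correct(a, m1, m2, ref_m1, ref_m2) for m1, a, m2 in runs]

print(f"raw RSD:       {rsd_percent(raw):.2f}%")
print(f"corrected RSD: {rsd_percent(corrected):.2f}%")
```

A linear correction of this kind removes run-to-run drift that shifts or stretches all migration times together; drift that affects individual analytes differently would remain in the corrected RSD.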
Figure 2. Electropherogram of the PTH-amino acids (5 × 10^-[illegible] M) for calculation of the detection limits (conditions described in the text). [Horizontal axis: migration time (min).]
Table I. Detection Limits (DL) of the MECC-TOAD for Determination of PTH-Amino Acids
PTH-amino acids     Mass DL (fmol)   Concentration DL (µM)
W, K                1.4              0.5
N                   1.8              0.7
L                   2.1              0.8
G                   2.5              0.9
H                   2.6              1
Q, A, P, V, M, F    2.7              1
Y                   3                1
E, R                3.2              1.2
D, S                4.0              4.6
I                   4.6              1.7
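The mass and concentration detection limits in Table I are linked by the injected volume. The sketch below back-calculates that volume from a few table entries; it assumes the garbled entries "L4", "L8", and "L7" read as 1.4, 1.8, and 1.7, and the function name is hypothetical.

```python
def implied_injection_volume_nl(mass_dl_fmol, conc_dl_um):
    """Injected volume implied by a mass DL and a concentration DL.

    fmol / (umol/L) = 1e-15 mol / (1e-6 mol/L) = 1e-9 L, i.e. nanoliters.
    """
    return mass_dl_fmol / conc_dl_um


# (mass DL in fmol, concentration DL in uM), read from Table I.
pairs = {"W, K": (1.4, 0.5), "N": (1.8, 0.7), "I": (4.6, 1.7)}
for name, (mass, conc) in pairs.items():
    volume = implied_injection_volume_nl(mass, conc)
    print(f"{name}: about {volume:.1f} nL injected")
```

Under these readings the entries agree on an injection volume of a few nanoliters, which illustrates the volume mismatch with conventional sequencers noted in the text.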
B. Protein Sequencing Using the Miniaturized Sequencer
The syringe pump-based capillary sequencer has been used for protein sequencing for over half a year in our laboratory. The typical performance of the sequencer is demonstrated by the sequencing results obtained from 33 pmol of Sequelon-DITC-myoglobin. Because the free amino group was covalently bound to the DITC membrane, the residue from the first cycle was not expected to be detected reliably and was therefore not analyzed. The pseudo-initial yield from the second cycle was 76%, and the repetitive yield was 87%. The electropherograms of the products from the Edman degradation cycles are shown in Figure 3, which demonstrates that MECC-TOAD provides baseline separation of all components generated from the sequencing reactions. The PTH residues resulting from the degradation cycles were readily identified by comparing their migration times with those of the standards.
Performance of the Teflon microreactor is demonstrated by sequencing 27 pmol of the same protein sample under conditions similar to those used in the above experiments. Twelve cycles were performed: the first seven on the same day and the remaining five on the following day. Original electropherograms of cycles 2 to 12 are shown in Figure 4. All products from the twelve cycles were positively identified. The first seven cycles, which were run on the first day, gave better yields and fewer background peaks than the later cycles. This phenomenon was also observed in our previous studies [15]. Figure 4 also shows that the residue PTH-L from cycle 2 co-eluted with an impurity peak. This impurity peak and the other background peaks observed in cycle 2 were dramatically reduced in the following cycles, which suggests that the background peaks were due to incomplete cleaning of the new Teflon microreactor before use.
The sequencing products and by-products obtained using the Teflon microreactor (Figure 4) are similar to those obtained with the capillary reaction chamber (Figure 3). This suggests that the epoxy glue used to connect the inlet capillaries to the capillary reaction chamber in the initial experiments (Figure 3) does not cause problems in identifying the sequencing products.
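To see what the quoted yields imply for later cycles, the sketch below propagates the 76% pseudo-initial yield (cycle 2) and the 87% repetitive yield geometrically. This is a simplified model under the assumption that the repetitive yield is constant from cycle to cycle; the function name is hypothetical.

```python
def expected_pmol(loaded_pmol, initial_yield, repetitive_yield, cycle, first_cycle=2):
    """Expected PTH-amino acid amount at a given cycle.

    Assumes the initial yield applies at `first_cycle` and that every later
    cycle retains a constant fraction `repetitive_yield` of the previous one.
    """
    steps = cycle - first_cycle
    if steps < 0:
        raise ValueError("cycle must not precede first_cycle")
    return loaded_pmol * initial_yield * repetitive_yield ** steps


# Values quoted in the text: 33 pmol loaded, 76% pseudo-initial, 87% repetitive.
for cycle in (2, 6, 12):
    print(cycle, round(expected_pmol(33, 0.76, 0.87, cycle), 1), "pmol")
```

The geometric decay shows why femtomole-level detection matters: after ten further cycles only about a quarter of the cycle-2 signal would remain under this model.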
[Figure 3: electropherogram panels by sequencing cycle, with standard (STD) traces; labeled peaks include DPTU and U1 to U3; x-axis: Time (min).]
Figure 3. Electropherograms showing the sequencing results for 33 pmol of Sequelon DITC-myoglobin obtained with the capillary reactor.
[Figure 4: electropherogram panels for cycles 2 to 12; DPTU peaks labeled.]
Figure 4. (continued) (cycles 7 to 12).
IV. Conclusion
Edman chemistry has been used for protein and peptide sequencing for over 30 years. However, the outcome of sequencing experiments depends heavily on the performance of the instrument. When the sequencer is miniaturized to capillary size, reproducible sequencing results are more difficult to achieve [14]. The new design, which uses syringe pumps for delivery of reagents and direct connections without valves, provides a new approach to miniaturizing the sequencer. The short flow path and very low dead volume were achieved by directly connecting the narrow-bore capillaries (100 µm i.d.) to the reaction chamber. This configuration minimized side reactions. The elimination of valves, as well as the use of a capillary-size reaction chamber and delivery lines, greatly simplified the construction of the sequencer. We have demonstrated the ability of this sequencer to sequence low pmol levels of proteins, even though the conversion of ATZ to PTH amino acids and the MECC-TOAD detection of PTH-amino acids were carried out off-line. To obtain sequencing sensitivity at the fmol level, on-line conversion and on-line detection of PTH-amino acids are necessary.
Acknowledgments
This project was supported by an operating grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada. Additional support was provided by SCIEX. XFL and KCW acknowledge NSERC Industrial Postdoctoral fellowships sponsored by SCIEX. NJD acknowledges a McCalla Professorship from the University of Alberta.
References
1. Hewick, R.M., Hunkapiller, M.W., Hood, L.E., and Dreyer, W.J. (1981) J. Biol. Chem. 256, 7990.
2. Tempst, P., Geromanos, S., Elicone, C., and Erdjument-Bromage, H. (1994) Methods: A Companion to Methods in Enzymology 6, 248.
3. Laursen, R.A. (1971) Eur. J. Biochem. 20, 89.
4. Pappin, D.J.C., Coull, J., and Koester, H. (1990) Anal. Biochem. 187, 10.
5. Edman, P., and Begg, G. (1967) Eur. J. Biochem. 1, 80.
6. Kent, S., Hood, L., Aebersold, R., Teplow, D., Smith, L., Farnsworth, V., Cartier, P., Hines, W., Hughes, P., and Dodd, C. (1987) BioTechniques 5, 314.
7. Scoble, H.A., Vath, J.E., Yu, W., and Martin, S.A. (1993) In P. Matsudaira (Ed.), A Practical Guide to Protein and Peptide Purification for Microsequencing, Academic Press, Inc., San Diego, pp. 125.
8. Wilm, M., Shevchenko, A., Houthaeve, T., Breit, S., Schweigerer, L., Fotsis, T., and Mann, M. (1996) Nature 379, 466.
9. Figeys, D., Oostveen, I.V., Ducret, A., and Aebersold, R. (1996) Anal. Chem. 68, 1822.
10. Calaycay, J., Rusnak, M., and Shively, J.E. (1991) Anal. Biochem. 192, 23.
11. Granlund-Moyer, K., Miller, C.G., and Sahakian, J.A. (1994) 10th International Conference on Methods in Protein Structure Analysis, Snowbird, Utah, Sept. 8-13, Abstract LA3.
12. Waldron, K.C., and Dovichi, N.J. (1992) Anal. Chem. 64, 1396.
13. Chen, M., Waldron, K.C., Zhao, Y., and Dovichi, N.J. (1994) Electrophoresis 15, 1290.
14. Waldron, K.C., Li, X.-F., Chen, M., Ireland, I., Lewis, D., Carpenter, M., and Dovichi, N.J. (1996) Talanta, in press.
15. Li, X.-F., Waldron, K.C., Black, J., Lewis, D., Ireland, I., and Dovichi, N.J. (1996) Talanta, accepted.
16. Li, X.-F., Ren, H., Le, X.C., Ireland, I., Qi, M., and Dovichi, N.J., unpublished results.
ENHANCEMENT OF CONCENTRATION LIMITS OF DETECTION IN CAPILLARY ELECTROPHORESIS: EXAMPLES OF ON-LINE SAMPLE PRECONCENTRATION, CLEANUP, AND MICROREACTOR TECHNOLOGY IN PROTEIN CHARACTERIZATION
Andy J. Tomlinson, Linda M. Benson, Norberto A. Guzman, and Stephen Naylor
Biomedical Mass Spectrometry Facility and Department of Biochemistry and Molecular Biology, and Department of Pharmacology and Clinical Pharmacology Unit, Mayo Clinic, Rochester, MN 55905
The R.W. Johnson Pharmaceutical Research Institute, Raritan, NJ 08869
I. INTRODUCTION
It is paradoxical that one of the noted advantages of capillary electrophoresis (CE), namely the small volume of a conventional CE capillary, also leads in the majority of cases to a significant drawback of the technique. The total volume of the capillary is typically only ~1-2 µL, and this results in a very limited loading capacity for analyte solutions. Optimal analyte resolution and separation efficiency are usually obtained when the sample injection is 100 µL) to be loaded without compromising analyte resolution or separation efficiency afforded by conventional CE methods (8,9). In addition to analyte preconcentration, mPC-CE technology can also be used to effect sample cleanup. This is particularly important for physiologically derived samples such as blood, bile, and urine, where the presence of high salt concentrations can dramatically affect analyte separations by CE. Furthermore, these matrix components can complicate, and even alter, electrophoretic stacking and focusing procedures, often precluding the use of these methods for preconcentration of biologically
derived samples within the CE capillary. In contrast, mPC-CE technology is relatively unaffected by such contaminants. Indeed, this approach ensures that these compounds are removed from the CE capillary prior to electrophoresis. Furthermore, when using an off-line sample loading strategy, the bidirectional flow through an mPC-cartridge allows samples to be loaded with either reverse or forward flow. We utilize a back flow to load the sample, followed by sample cleanup with a forward flow of a suitable solvent (typically an aqueous medium). This approach flushes sample-derived particulates from the mPC-cartridge prior to its installation onto the CE capillary. The resulting reduction in clogging, which would otherwise adversely affect EOF, improves the reproducibility of mPC-CE performance.
The potential application of mPC-CE coupled to a mass spectrometer (mPC-CE-MS) in the clinical diagnosis of disease states is substantial. In part, this is because the technique can be utilized in the direct analysis of any physiologically derived body fluid. This is demonstrated by the direct mPC-CE-MS analysis of aqueous humor obtained from a patient undergoing eye surgery. The chemical composition of human aqueous humor is still poorly understood, mainly due to the limited sample amounts that can be collected. It has been suggested that the chemical content of this fluid may play a role in drainage of the human eye. In particular, the protein content of aqueous humor may contain important factors in this process. Hence, any method that can readily determine the protein content of aqueous humor would be of great benefit. In this specific case, we took 7 µL of human aqueous humor and pressure injected it directly, without any further sample pretreatment, onto a C-8 silica-based impregnated membrane. The membrane containing the aqueous humor analytes was subsequently washed for 10 minutes with separation buffer (1% acetic acid).
Analytes were eluted from the membrane with 80:20 MeOH:H2O and subjected to CE separation in a polybrene-coated capillary with final detection by ESI-MS. The mPC-CE-MS ion electropherogram is shown in Figure 4. A number of ion responses were observed, including singly charged species at m/z 758, 760, and 782. Further, two components were tentatively identified as human serum albumin (MH50^50+ = 1338) and β-2 microglobulin (MH11^11+ = 1067). Deconvolution of the ion series containing m/z 1067 revealed a molecular weight of 11,729 Da, corresponding to the oxidized form of β-2 microglobulin that contains a single cysteine bridge. Characterization of the other components present in the mixture is currently in progress.
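The relationship between an observed m/z value, its charge state, and the deconvoluted molecular weight can be sketched as follows. This is a minimal illustration that assumes simple protonation ([M + zH]z+) and an 11+ charge state for the m/z 1067 ion; real deconvolution uses the whole ion series, which is why the nominal single-peak estimate differs slightly from the 11,729 Da reported in the text.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def neutral_mass(mz, charge):
    """Neutral mass M from an [M + zH]z+ ion observed at m/z."""
    return charge * (mz - PROTON_MASS)


def mz_for_charge(mass, charge):
    """Expected m/z of the [M + zH]z+ ion for a neutral mass M."""
    return mass / charge + PROTON_MASS


# Nominal m/z 1067 at an assumed charge of 11+.
print(round(neutral_mass(1067, 11), 1))
# Charge states of the reported 11,729 Da species near m/z 1067.
for z in (10, 11, 12):
    print(z, round(mz_for_charge(11729, z), 1))
```

Adjacent charge states of one protein form a ladder of m/z values, and fitting that ladder is what yields the deconvoluted mass.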
[Figure 4: mPC-CE-MS ion electropherograms for MH+ = 758, MH+ = 760, and MH+ = 782; x-axis: Migration time (min).]
[Figure 2: panel B x-axis: Incubation time (min).]
Figure 2. Multiple incubation of a single molecule. (A) A single molecule of alkaline phosphatase was captured within the capillary. After an 8 min incubation, the sample was subjected to a 15 sec pulse of 400 V/cm, moving the enzyme molecule away from the product and into fresh substrate. The process was repeated for 4, 2, and 1 min incubations. Following the last incubation, the contents of the capillary were swept past the detector at 400 V/cm. The solid line represents the data and the dashed line the least-squares fit of 4 Gaussian peaks. (B) Peak area is shown by crosses; the straight line is the least-squares fit to the data.
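Panel B fits a straight line to peak area versus incubation time. A minimal sketch of such an ordinary least-squares fit is shown below; the incubation times come from the caption, while the peak areas are hypothetical placeholders, since the measured values are not given in the text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


incubation_min = [8, 4, 2, 1]        # incubation times from the caption
peak_area = [16.1, 8.0, 4.2, 1.9]    # hypothetical peak areas (arbitrary units)
slope, intercept = fit_line(incubation_min, peak_area)
print(f"product formed per minute (arbitrary units): {slope:.2f}")
```

A slope that stays constant across incubation times indicates that the single molecule turns over substrate at a steady rate during the experiment.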
Douglas B. Craig et al
individuals with differing levels of glycosylation. Variation in glycosylation causes differences in both Km and Vmax. It has also been suggested that differences in activity may result from differences in the conformations of individual molecules.
B. Activation Energy Measurement
In these experiments, molecules were captured within the capillary and incubated three times at varying temperatures, from 13 to 38°C. Eight molecules were studied. Figure 3 shows two such molecules. Peak area increases with temperature. The two molecules do not have identical activities. The dephosphorylation of substrates by alkaline phosphatase has been proposed to occur by the following mechanism:
$$\mathrm{EH} + \mathrm{ROP} \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}} \mathrm{EH{\cdot}ROP}$$
[Figure 2: overlaid chromatograms; labels: Microcon-SCX eluate, Methionine enkephalin, Leucine enkephalin, Val-Tyr-Val.]
Figure 2. Reversed Phase Chromatography of HPLC Peptide Standard: Overlay of Before and After Microcon-SCX. Starting peptide mixture containing 45 µg in 250 µl of standard or eluted Microcon-SCX. Separation was performed on an Amicon C18-300-10sp column (4.6 x 250 mm) using a 4 min hold at 15% ACN, 0.25% TFA in DIW, followed by a linear gradient from 15% ACN to 33% ACN in 20 min at 1 ml/min. Approximately 80% recovery of each peptide was determined by peak area integration ratios.
Adsorption of positively charged free amine groups, as in dimers and trimers, to the cation exchange membrane occurs during a brief 15 sec centrifugation. The rapid kinetics of analyte binding are shown in figure 2 as a chromatogram overlay of control and treated peptide standard. These chromatograms remain nearly identical with sample loads from 1-250 µg. The cytochrome c tryptic peptide maps shown in figure 3 were analyzed by RP-HPLC peak area integration following Microcon-SCX treatment. Designated peaks, expressed as % of control for 6 separate samples, showed that recoveries for all peaks were ^ 90% ± 2%.
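Recovery here is expressed as the treated-sample peak area relative to the control. A minimal sketch of that ratio follows; the peak areas are hypothetical, and the function name is an illustrative choice rather than anything defined in the chapter.

```python
def percent_recovery(area_treated, area_control):
    """Recovery of a peak after treatment, as a percentage of the control area."""
    if area_control <= 0:
        raise ValueError("control peak area must be positive")
    return 100.0 * area_treated / area_control


# Hypothetical integrated peak areas (control, treated) for three peptides.
peaks = {"peak 1": (1520.0, 1390.0), "peak 2": (980.0, 875.0), "peak 3": (2210.0, 2005.0)}
for name, (control, treated) in peaks.items():
    print(name, f"{percent_recovery(treated, control):.0f}%")
```

Averaging such percentages over replicate samples gives the mean and spread quoted for the designated peaks.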
Centrifugal Device for Sample Preparation
[Figure 3: chromatograms, control (top) and Microcon-SCX eluate (below); x-axis in minutes.]
Figure 3. Reversed Phase Chromatography of Trypsinized Cytochrome c Before and After Adsorption to Microcon-SCX. Approximately 250 µg of digest was diluted to a total volume of 500 µl and either injected directly onto the column (top) or bound and eluted from Microcon-SCX (below) as described in Methods. Separation was performed with an Amicon C18-300-10sp column (4.6 x 250 mm) using a linear gradient of 5% ACN to 55% ACN (0.1% TFA in DIW) in 20 minutes at 1 ml/min.
A more complex digest containing peptides and glycopeptides, human immunoglobulin heavy chain (hIgG-HC) digested with endo lys c (8), was treated with Microcon-SCX and is shown as a direct HPLC chromatogram comparison before and after SCX treatment (figure 4). Qualitatively, all peptides from the control appear in the SCX-treated sample, as further confirmed by the amino acid analysis data shown in figure 5. Accurate compositional analysis in the absence and presence of detergents demonstrates efficient analyte binding and detergent removal following Microcon-SCX treatment (11). The elevated levels of Asx and Glx observed in buffer blanks and treated samples returned to normal following a second dry-down step to remove residual ammonium hydroxide from the desorbant, which reacted with PTC during derivatization.
Donald G. Sheer et al.
Figure 4. Recovery of Endo Lys c Digested Immunoglobulin Heavy Chain Following Microcon-SCX. hIgG-HC was reduced, alkylated, and digested with endo lys c (8), with approximately 35 µg used for control (left) and Microcon-SCX treatment (right) analyses. Reverse phase HPLC was performed with an Amicon C18-100-10sp column (4.6 x 250 mm) using a 180 min linear gradient from 5 to 55% ACN, 0.1% TFA in DIW at 0.7 ml/min, following a 10 min gradient from 0 to 5% ACN.
The ability of Microcon-SCX to bind all peptides and glycopeptides demonstrates broad selectivity and efficient binding, which occurs during a 30 sec centrifugation.
[Figure 5: amino acid composition comparison, control versus Microcon-SCX treated.]
Figure 5. Comparison of Compositional Analysis of Endo Lys c Digested IgG Heavy Chain Before and After Adsorption to Microcon-SCX in the Presence of Detergents. Approximately 650 picomoles of endo lys c digested hIgG-HC in 0.2 M sodium phosphate was either prepared for hydrolysis (control) or treated with detergents as described. The bound digest was desorbed with 1.2 N ammonium hydroxide in 50% methanol, vacuum dried, and hydrolyzed. Amino acid analysis was performed by OPA derivatization using a HAIsil 120 C18 5 micron column (Higgins Analytical, Inc.) (11). Greater than 95% of the digest was recovered from Microcon-SCX treated samples (0.2 M phosphate, 0.1% SDS, and 0.1% Tween). The high Asx and Glx values result from incomplete removal of ammonium from Microcon-SCX eluted samples. Normal values were achieved by repeating speed vacuum lyophilization after adding 100 µl of DIW.
[Figure 6: two panels, HPLC Trace (top) and Mass Spectrum, Base Peak (below).]
Figure 6. Identification of fragments by LC-MS of Endo Lys c Digested hIgG-HC. Masses were determined from MS spectrum data following Microcon-SCX desalting. Digests were injected into an LC-MS system consisting of an HP1090 plumbed to a Finnigan TSQ7000 MS with a Finnigan electrospray source. Approximately 45 µg of digest was loaded onto a Nucleosil 300-5 C18 column (0.46 x 25 cm; Macherey-Nagel, Düren, Germany) and eluted with a gradient of ACN in 0.1% TEA at 0.7 ml/min; the effluent was analyzed on-line for both UV absorbance (top) and mass without flow splitting (below) (8).
Comparison of Microcon-SCX-treated and untreated LC-MS samples showed that peptides and glycopeptides were identical. Figure 6 demonstrates the utility of Microcon-SCX for producing high quality LC-MS data. The upper LC scan was generated from endo lys c digested IgG-HC (8) following Microcon-SCX binding, washing, and elution. On-line LC fractions were subjected to negative ion ES (8). Mass spectrum data in the lower scan were used to designate the HPLC-fractionated peptides in the upper trace. LC-MS of the analytes recovered from Microcon-SCX showed that all predicted peptides and glycopeptides were recovered, with masses ranging from 447 to 6,191 daltons. Microcon-SCX bound analytes washed at low pH, eluted in ammonium hydroxide, and evaporated by air have given clean and accurate MALDI-TOF spectra for a variety of samples (data not shown).
[Figure 7: two negative ion electrospray spectra of 10 pmol oligothymidylic acid d(pT)10, before (top) and after (lower) Microcon-SCX.]
Figure 7. Negative Ion Electrospray Scan of Oligothymidylic Acid PCR Primer (10 mer) Before and After Microcon-SCX. (Top) ES scan of 10 picomoles of oligonucleotide in 20 mM sodium acetate, pH 5. (Lower) ES scan after 10 picomoles of primer was passed over Microcon-SCX, desorbed, and vacuum dried. Samples were reconstituted in water, diluted to 10 mM TEA in 50% IPA, and infused at a rate of 2 ml/min (Micro Mass). The majority of [Na+] passed through the membrane and was removed following Microcon-SCX treatment. A 20 mM ammonium acetate or 10 mM HCl/20% MeOH wash removed the remaining [Na+] from the sample.
Microcon-SCX was used to obtain oligo-DNA primers free of salt for DNA sequencing by MS and subsequent PCR. By HPLC quantitation, Microcon-SCX was 75-90% efficient in recovering most oligonucleotides at a pH of 2-3. To demonstrate the efficiency of salt removal from these samples, oligo-dT (19.24) was applied to Microcon-SCX in sodium acetate buffer and analyzed by ES-MS, as shown in figure 7. The results demonstrate that upon analyte binding, the majority of [Na+] passes through the SCX membrane, as shown by a decrease in ionized [Na+]-DNA forms in the treated sample. The remaining [Na+]-DNA forms were removed by washing the membrane with 500 µl of 10 mM HCl in 20% MeOH prior to analyte desorption.
[Figure: chromatograms with detector response in nC; peak labels as extracted: Glam/9.67, Fuc/4.92, Gluni/12.08.]
Nle68) and G (Met54->Nle54) were not separated by semipreparative reverse phase HPLC and were analyzed together for amino acid content.
The typical hydrolysis yield for methionine residues is approximately 80%.
IV. DISCUSSION
The misincorporation of norleucine for methionine is known to occur in bacteria when high-level synthesis of recombinant proteins is induced in minimal medium fermentation (1,2). This misincorporation was detected in 15N-labeled recombinant human leptin produced under minimal medium conditions; it is not, however, present in the clinical samples produced using other fermentation conditions. The mechanism for the misincorporation is believed to involve de novo synthesized norleucine, which bypasses the leucine biosynthetic pathway and enters directly into the
Jennifer L. Liu et al
incorporation pathway by associating with tRNA(Met) in the acylation reaction. The level of incorporation, as well as the distribution of norleucine among the methionine residues, however, has varied for different recombinant proteins (1, 5, 13-15). In the production of 15N-labeled leptin, a small fraction (5%) of norleucine was incorporated in the expressed protein. Of the four methionine residues (positions 1, 54, 68, and 136), the three internal residues were equally substituted by norleucine, at a rate sixteenfold greater than the incorporation detected for the methionine at the amino terminus. The discrete substitution results in the generation of three isoforms, each containing a norleucine in place of one of the internal methionines. This observation distinguishes leptin from other recombinant proteins known to have misincorporation of norleucine for methionine. Methionine is the longest unbranched nonpolar amino acid and has an unusually flexible side chain. Norleucine and methionine differ only in the substitution of a methylene group for a divalent sulfur atom. Although the side chains of norleucine and methionine have nearly identical volumes and surface areas, the methionine sulfur atom is more polar and less hydrophobic than the corresponding methylene group in norleucine. Therefore, a methionine-containing peptide would have a higher desolvation energy than the same peptide containing a norleucine substitution at the methionine position (16). The single-point substitution at the three internal methionine residues in recombinant human leptin converts the homogeneous protein into three closely related heterogeneous proteins. The local environmental changes caused by the misincorporation of norleucine at the three methionine residues are reflected in the elution profile of reverse phase chromatography.
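Because norleucine differs from methionine only by a methylene group replacing the sulfur atom, each substitution produces a fixed mass shift that a mass spectrometer can detect. The sketch below computes that shift from monoisotopic atomic masses; it is an illustration of the arithmetic and not the authors' data-processing method. The 15N labeling does not change the shift, since the substitution leaves the nitrogen count unchanged.

```python
# Monoisotopic atomic masses in Da (12C is 12 by definition).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "S": 31.97207117}


def met_to_nle_shift(substitutions=1):
    """Mass change when CH2 replaces S in `substitutions` methionine residues."""
    per_site = MONOISOTOPIC["C"] + 2 * MONOISOTOPIC["H"] - MONOISOTOPIC["S"]
    return substitutions * per_site


print(f"one Met -> Nle substitution: {met_to_nle_shift():+.3f} Da")
```

A peptide carrying a single substitution would therefore appear roughly 18 Da lighter than its methionine-containing counterpart.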
The less buried norleucine residue has a greater surface area accessible to interact with the solid phase of the chromatography, which results in the longer retention time of the norleucine-incorporated isoforms. The elution order of the three norleucine-containing isoforms therefore reveals information about the relative solvent accessibility of each of the three internal methionine residues. The single-point substitution of a naturally occurring amino acid by an analog provides a convenient tool for studying the effect of molecular alteration on the biological activity of proteins (3). Although sterically superimposable on methionine, norleucine is not a substrate for methionine adenosyltransferase. Therefore, it is not expected to follow the same metabolic function as methionine. On the other hand, norleucine lacks the sulfur atom, which is prone to oxidation upon exposure to oxidizing reagents such as free oxygen. The substitution of methionine by norleucine might diminish the need to engineer an oxidation-resistant protein.
ACKNOWLEDGEMENTS
The authors would like to thank Dr. Viswanathan Katta for helpful discussion, John Le for great assistance in the on-line HPLC/MS, and Dr. Michael Rohde and Tom Boone for support of this work.
REFERENCES
1. Lu, H. S., Tsai, L. B., Kenney, W. C., Lai, P.-H. (1988) Biochem. Biophys. Res. Commun. 156, 2, 807-813.
2. Tsai, L. B., Kenney, W. C., Curless, C. C., Klein, M. L., Lai, P.-H., Fenton, D. M., Altrock, B. W., Mann, M. B. (1988) Biochem. Biophys. Res. Commun. 156, 2, 733-739.
3. Barker, D. G., and Bruton, C. J. (1979) J. Mol. Biol. 133, 217-231.
4. Brown, J. (1973) Biochim. Biophys. Acta 294, 527-529.
5. Bogosian, G., Violand, B. N., Dorward-King, E. J., Workman, W. E., Jung, P. E., Kane, J. F. (1989) J. Biol. Chem. 264, 1, 531-539.
6. Kerwar, S. S., and Weissbach, H. (1970) Arch. Biochem. Biophys. 141, 525-532.
7. Trupin, J., Dickerman, H., Nirenberg, M., and Weissbach, H. (1966) Biochem. Biophys. Res. Commun. 24, 50-55.
8. Zhang, Y., Proenca, R., Maffei, M., Barone, M., Leopold, L., Friedman, J. M. (1994) Nature 372, 425-432.
9. Pelleymounter, M. A., Cullen, M. J., Baker, M. B., Hecht, Winters, D., Boone, T., Collins, F. (1995) Science 269, 540-543.
10. Halaas, J. L., Gajiwala, K. S., Maffei, M., Cohen, S. L., Chait, B. T., Rabinowitz, D., Lallone, R. L., Burley, S. K., Friedman, J. M. (1995) Science 269, 543-549.
11. Lu, H., Clogston, C., Merewether, L., Narhi, L., Boone, T. (1993) In Protein Folding: In Vivo and In Vitro (J. Cleland, Ed.) 526, chap. 15.
12. Lu, H. S., Lai, P. H. (1986) J. Chromatogr. 368, 215-231.
13. Forsberg, J., Palm, G., Ekebacke, A., Josephson, S., and Hartmanis, M. (1990) Biochem. J. 271, 357-363.
14. Gilles, A.-M., Marliere, P., Rose, T., Sarfati, R., Longin, R., Meier, A., Fermandjian, S., Monnot, M., Cohen, G. N., and Barzu, O. (1988) J. Biol. Chem. 263, 8204-8209.
15. Randhawa, Z. I., Witkowska, H. E., Cone, J., Wilkins, J. A., Hughes, P., Yamanishi, K., Yasuda, S., Masui, Y., Arthur, P., Kletke, C., Bitsch, F., and Shackleton, C. H. L. (1994) Biochemistry 33, 4352-4362.
16. Thomson, J., Ratnaparkhi, G. S., Varadarajan, R., Sturtevant, J. M., and Richards, F. M. (1994) Biochemistry 33, 8587-8593.
Hyphenated HPLC Methodology for the Resolution and Elucidation of Peptides from Proteolytic Digests
Randall T. Bishop, Vincent E. Turula, and James A. de Haseth
Department of Chemistry, University of Georgia, Athens, GA 30602-2556 USA
Robert D. Ricker
Rockland Technologies, Inc., 538 First State Boulevard, Newport, DE 19804 USA
Present Address: Amvax Inc., 12103 Indian Creek Court, Beltsville, MD 20705.
Author to whom correspondence is to be addressed.
I. Introduction
The use of proteolytic enzymes in the analysis of protein structure is well established, yet the identification and characterization of the resulting peptide fragments usually requires the generation of a peptide map through a mode of separation. Reversed phase chromatography is known to be a powerful tool in the analysis of complex biological mixtures and has found great success in the resolution of peptide mixtures. (1) Common on-line detection techniques such as UV and fluorescence detectors, however, suffer from low sensitivity or specificity and therefore provide little structural detail about the separated peptides. (2) More structurally informative detection techniques are of great value in increasing the speed and efficiency with which structural information can be extracted. Mass spectrometric techniques that include continuous flow fast atom bombardment (FAB), electrospray ionization (ESI), and matrix assisted laser desorption (MALDI) have been applied successfully to protein structure investigations. (3) The hyphenation of electrospray ionization mass spectrometric techniques to liquid chromatography for use as an on-line detector has proven to be quite successful. Minimal sample requirements (pmol) and low flow rate restrictions (r
[Figure 1: chemical structures labeled Ornithine, Phospho-Serine, α-Aminobutyric acid, cis-4-Hydroxyproline, Taurine, 5-Hydroxy-L-Lysine, Cysteine, and Tryptophan.]
Figure 1: Structures of PTC-Amino Acids used during this study.
Klaus D. Linse et al.
[Figure 2: two chromatograms (A and B) with peaks labeled by amino acid abbreviation; detector response in AU.]
Figure 2. Chromatographic separation of amino acids after derivatization with phenylisothiocyanate (PITC): A. Separation of 200 picomole standard amino acid mix H containing 18 amino acids. B. Separation of an extended amino acid mix containing 28 amino acids. The standard one-letter abbreviations are used for the usual amino acids. Nonstandard amino acids are Sp, phosphoserine; Hp, hydroxyproline; Citr, citrulline; Tau, taurine; aAba, α-amino butyric acid; HK1 & HK2, hydroxylysines; Orn, ornithine; *, artifacts from reagents.
Analysis of Free Amino Acids from Physiological Samples
Figure 3. Elution pattern of the two PTC-amino acid separations showing the retention shifts for the indicated derivatized amino acids. A. Standard conditions. B. Physiological conditions. Retention times are in minutes.
Amino acid   A. Standard   B. Physiological
Sp           -             2.12
D            2.45          2.55
E            2.78          2.95
Hp           -             3.25
N            -             4.25
S            3.98          4.45
Q            -             4.7
G            4.4           4.95
H            4.7           5.53
Citr         -             5.87
Tau          -             6.25
R            5.33          6.53
T            5.67          6.83
A            6.08          7.4
P            6.3           7.7
aAba         -             9.32
Y            8.95          10.2
V            9.93          10.83
M            10.31         11.2
C            11.0          11.7
I            12.2          12.8
L            12.43         13.0
Nl           12.8          -
HK1          -             13.3
HK2          -             13.48
F            13.23         13.95
Orn          -             14.2
W            -             14.43
K            14.4          15.23
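The retention shifts that Figure 3 displays graphically can be tabulated directly from the two sets of retention times. The sketch below does this for a subset of amino acids present under both conditions; the pairing of times with labels is a reading of the flattened figure and should be treated as an assumption.

```python
# Retention times (min) read from Figure 3; label pairing is an assumption.
standard = {"D": 2.45, "E": 2.78, "S": 3.98, "G": 4.4, "H": 4.7,
            "R": 5.33, "T": 5.67, "A": 6.08, "P": 6.3, "K": 14.4}
physiological = {"D": 2.55, "E": 2.95, "S": 4.45, "G": 4.95, "H": 5.53,
                 "R": 6.53, "T": 6.83, "A": 7.4, "P": 7.7, "K": 15.23}

# Shift = physiological time minus standard time, for shared amino acids.
shifts = {aa: round(physiological[aa] - standard[aa], 2)
          for aa in standard if aa in physiological}

for aa, shift in sorted(shifts.items(), key=lambda item: item[1]):
    print(f"{aa}: +{shift:.2f} min")
```

Sorting by shift makes it easy to see which residues are moved most by the modified gradient and pH.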
[Figure 4: three chromatograms (A to C) with peaks labeled by amino acid abbreviation.]
Figure 4. Chromatographic separation of amino acids after derivatization with phenylisothiocyanate (PITC) using standard conditions: A. Separation of 200 picomole standard amino acid mix H (17 amino acids) and norleucine. B. Separation of a protein hydrolysate, bovine serum albumin (BSA). C. Separation of a peptide hydrolysate, bradykinin.
[Figure 5: three chromatograms (A to C) with peaks labeled by amino acid abbreviation; detector response in AU.]
Figure 5. Chromatographic separation of amino acids after derivatization with phenylisothiocyanate (PITC) using physiological conditions: A. Separation of an extended amino acid mix containing 27 amino acids. The standard one-letter abbreviations are used for the usual amino acids. Nonstandard amino acids are Sp, phosphoserine; Hp, hydroxyproline; Citr, citrulline; Tau, taurine; aAba, α-amino butyric acid; HK1 & HK2, hydroxylysines; Orn, ornithine; *, artifacts from reagents. B. Separation of free amino acids found in human serum. C. Separation of free amino acids found in human urine.
[Figure 6: three chromatograms (A to C); labeled peak: Taurine.]
Figure 6. Chromatographic separation of amino acids after derivatization with phenylisothiocyanate (PITC) using physiological conditions: A. Free amino acids found in rat brain tissue. B. Free amino acids in ant hemolymph. C. Bone collagen hydrolysate.
III. Results and Discussion Due to an increased interest in analysis of physiological samples, we wanted to establish analyzer methods which would allow us to choose between our standard protocol for protein and peptide hydrolysates and a separate protocol for an expanded number of amino acids, to include the most important free amino acids found in physiological samples. A study of common analysis requirements in our facility indicated that only a limited number of the possible free physiological amino acids is needed for most unknown samples. These additional amino acids of interest are aamino butyric acid, citrulline, y-amino butyric acid (GABA), hydroxyproline, hydroxylysine, ornithine, taurine, and tryptophan. Other amino acids of interest to us are phosphoserine, phosphothreonine, phosphotyrosine and carboxy-amino acids since they are released from glycoprotein or glycopeptide hydrolysates. Figure 2 shows the separation of two PTC-amino acid standards, the shifts in retention times and conditions used for both methods. Figure 2A shows the separation of 200 picomoles of PTC-amino acid standards using our standard separation protocol. We use these conditions regularly for all routine analysis of protein and peptide hydrolysates. Figure 2B shows the separation of 27 amino acids at the 200 picomole level using our separation protocol for physiological samples. All amino acids separate well under these conditions and are eluted from the column in less than 16 minutes. The observed shifts in retention times are graphically displayed in Figure 3. The use of ultrapure chemicals, thorough cleaning of the analyzer slides and minimization of contaminants in the vicinity of the instrument enables detection at the 50 picomole level. Proper sample handling is critical. The major difference in separation conditions between the standard versus the physiological method is the addition of an extra step at 6 minutes into the gradient. 
This step decreases the steepness of the slope of the gradient development, thus allowing for the separation of citrulline, taurine and arginine. The separation of hydroxyproline from glutamic acid is achieved by increasing the pH of solvent A from 5.25 to 5.55. This increase in pH is also beneficial for the separation of ornithine from phenylalanine. The lower temperature of the physiological method resolves proline from phenylthiourea (PTU). An additional benefit of the physiological conditions is the baseline separation of valine and methionine. γ-Amino butyric acid elutes after PTU under the standard protocol. Phosphoserine can be detected using either method, although the physiological conditions result in sharper peaks. Phosphothreonine will be resolved by lowering the pH of solvent A, but hydroxyproline will then be lost by coelution with glutamic acid. In general, pH has the greatest impact on amino acid separation. Gradient and molarity, while important for individual amino acids, have less of an effect on the whole elution profile. As the column ages, it becomes necessary to increase the molarity of solvent A to continue to separate arginine and threonine. The addition of 3 M sodium acetate at pH 5.5 in 5 ml increments, as needed, will raise the molarity a sufficient amount. The versatility of our methods is shown in Figures 4, 5 and 6. Figure 4 shows standard analysis conditions used for hydrolysates. A standard, a protein and a peptide sample are illustrated. Figures 5 and 6 contain chromatograms using the physiological method. Free amino acids are found in human serum (5B), human urine (5C), rat brain tissue (6A), ant hemolymph (6B), and bone collagen hydrolysates (6C). Note the large amount of glycine and taurine in rat brain tissue and the predominant glycine peak in bone collagen hydrolysate. To minimize cross-contamination, we routinely run cleaning cycles, which contain a hydrolysis cycle followed by a derivatizing cycle, after each analysis.
A 30 µl aliquot of 1 mg/ml K4EDTA in ultrapure water is spotted onto the analyzer frits just prior to running the derivatizer cycle.
Klaus D. Linse et al
IV. Conclusions
The two analyzer protocols described allow us to switch from standard settings to physiological settings within a few hours using the same column. The physiological separation method enables us to reproducibly analyze samples which contain nutritionally important amino acids, including taurine, in serum and organs, e.g., liver, heart, and brain. Taurine is a major free intracellular amino acid in animal tissue. Due to taurine's roles as a conjugator of bile acids and as a protector of cell membranes, it has become the focus of study for many investigators (2). Proper care needs to be taken to minimize cross-contamination. We therefore recommend routinely running cleaning cycles which contain a hydrolysis cycle followed by a derivatizing cycle after each analysis. In conclusion, we achieved excellent resolution of up to 30 amino acids, including most of the major plasma amino acids, within 16 minutes. The method has good reproducibility of both retention times and peak areas, allowing us to routinely analyze physiological samples.
References
1. Cohen, St., Tarvin, Th. and Bidlingmeyer, B. (1984). Analysis of amino acids using precolumn derivatization with phenylisothiocyanate. American Laboratory Aug. 1984.
2. Gaul, G. E. (1989). Pediatrics 83, 433-442.
3. Harihara, M., Naga, S. and VanNoord, T. (1993). J. Chromatography 621, 15-22.
4. Janssen, P., van Nispen, J., Melgers, P., van den Bogaart, H., Hamelinck, R. and Goverde, B. (1986). Chromatographia 22, 351-357.
Quantitation and Identification of Proteins by Amino Acid Analysis: ABRF-96AAA Collaborative Trial
K.M. Schegg¹, N.D. Denslow², T.T. Andersen³, Y. Bao⁴, S.A. Cohen⁵, A.M. Mahrenholz⁶, and K. Mann⁷
1. Dept. Biochemistry, Univ. Nevada, Reno NV 89557
2. Dept. Biochem. and Molec. Biol., Univ. Florida, Gainesville, FL 32610
3. Dept. Biochem. and Molec. Biol., Albany Medical College, Albany, NY 12208
4. Dept. Microbiology, Univ. Virginia Medical School, Charlottesville, VA 22908
5. Waters Corp., Milford, MA 01757
6. Dept. Biochemistry, Purdue Univ., West Lafayette, IN 47907
7. Max-Planck-Inst. Biochemie, 82152 Martinsried, Germany
I. Introduction
Amino acid analysis (AAA) has, for a number of years, been a valuable tool for identifying the amino acid composition of proteins and for accurately determining protein concentration. The Amino Acid Analysis Committee of the Association of Biomolecular Resource Facilities (ABRF) distributes a yearly test sample to member facilities and publishes the results (1-9), allowing participants to compare their performance with that of other laboratories. Each year, the study is designed to address particular challenges associated with AAA. This year's sample addressed two challenges: (1) to test how accurately laboratories are able to quantitate proteins using AAA, and (2) to assess the ability to use composition data to identify unknown proteins. Recently, spectacular success has been achieved using amino acid composition data submitted to computerized search programs linked to protein databases to identify proteins recovered from two-dimensional gel blots (10, 11). Additional information, such as species, molecular mass and pI, can also be submitted to some search programs to improve the probability of correct identification. In addition to promoting this new technique, our goal was to determine the quality of data required for successful identifications. The collaborative nature of this AAA study provided a unique opportunity for such an assessment. The data reported here, which were submitted by 71 facilities, reveal that most sites are capable of utilizing AAA to accurately quantitate protein concentrations and are able to identify a protein solely on the basis of the amino acid composition using Internet search programs such as ExPASy and Propsearch.
Abbreviations used: AAA, amino acid analysis; ABRF, Association of Biomolecular Resource Facilities; AQC, 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate; FMOC, N-(9-fluorenylmethoxycarbonyl); OPA, o-phthaldialdehyde; PITC, phenylisothiocyanate; tpis, triosephosphate isomerase.
II. Materials and Methods
A. Sample Preparation and Analysis
Rabbit triosephosphate isomerase (tpis, Sigma) was dissolved in water to 0.1 mg/ml and 50 µl aliquots (nominally 5 µg) were distributed into 1.5 ml microfuge tubes and lyophilized. The tubes were mailed to member facilities with instructions that included the suggestion that the sample should be dissolved in water or 0.1% trifluoroacetic acid-20% acetonitrile and then analyzed by each laboratory's standard method. If the participant wished to gain additional information about the unknown, molecular mass could be determined by mass spectrometry or SDS electrophoresis stained with silver, and pI could be determined using IEF gels. Facilities were instructed that, in order to aid in identification, a standard protein could be analyzed in parallel with the unknown. The amino acid composition of the unknown and known proteins, along with any additional information, was to be submitted to ExPASy (http://expasy.hcuge.ch/ch2d/aacompi.html) or Propsearch (http://www.embl-heidelberg.de/aaa.html), which were accessible via the Internet, or to any other search program.
B. Calculations
Each core laboratory was asked to send their results to an independent collaborator who entered the data into an Excel spreadsheet and removed any identifiers to keep the data anonymous. The labs were requested to report amino acid composition (total pmoles each amino acid) for the unknown sample and for a known sample of their choice to be used as a calibrant for the search programs. The labs were also asked to report the total µg of unknown protein in the tube and list the top five proteins identified by their search program(s). Data reduction was as described (4,9). Briefly, total pmoles tpis were estimated on the basis of data from individual amino acids by dividing the total pmoles of each amino acid in the sample by the known number of residues of that amino acid per molecule of tpis. These data were averaged. A corrected average was then calculated by excluding individual yield values differing >±15% from the average obtained above. Composition (number of each amino acid per molecule) was obtained by dividing the pmoles amino acid by the corrected pmoles tpis for that analysis. Accuracy (internal error) of each residue was calculated as:
$$\text{Error} = 100 \times \frac{\lvert \text{experimental composition value} - \text{true value} \rvert}{\text{true value}} \tag{1}$$
$$\text{Average Error per analysis} = \frac{\sum (\text{error of 16 amino acids})}{16} \tag{2}$$
The error and yield values from each participant were used to obtain overall averages across participants. A constellation of 16 amino acids, which excluded Cys and Trp, was used for all calculations.
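The data reduction described above can be sketched in a few lines of Python. This is a minimal illustration, not the committee's spreadsheet: the function name `analyze`, the `tol` parameter, and the four-residue example protein are all hypothetical, and the example averages over the residues supplied rather than the 16 amino acids used in the study.

```python
from statistics import mean

def analyze(pmol, residues, tol=0.15):
    """Return (corrected pmol protein, composition, per-residue % error, average % error).

    pmol:     measured pmol of each amino acid in the sample
    residues: known number of residues of each amino acid per protein molecule
    tol:      fractional deviation from the first average beyond which an estimate is excluded
    """
    # One protein estimate per amino acid: pmol amino acid / residues per molecule.
    estimates = {aa: pmol[aa] / residues[aa] for aa in residues}
    first_avg = mean(estimates.values())

    # Corrected average: drop estimates differing by more than tol from the first average.
    kept = [e for e in estimates.values() if abs(e - first_avg) <= tol * first_avg]
    if not kept:
        raise ValueError("every estimate deviates more than tol from the average")
    corrected = mean(kept)

    # Composition: residues of each amino acid per molecule, from the corrected yield.
    composition = {aa: pmol[aa] / corrected for aa in residues}

    # Equation (1) per residue, Equation (2) averaged over the residues supplied.
    errors = {aa: 100 * abs(composition[aa] - residues[aa]) / residues[aa] for aa in residues}
    return corrected, composition, errors, mean(errors.values())

# Hypothetical protein and measurements (illustrative values only).
true_residues = {"Ala": 10, "Gly": 20, "Leu": 5, "Ser": 8}
measured_pmol = {"Ala": 1000, "Gly": 2000, "Leu": 500, "Ser": 1200}
corrected, composition, errors, avg_error = analyze(measured_pmol, true_residues)
```

In this toy case the serine estimate (150 pmol) lies more than 15% from the first average (112.5 pmol), so it is excluded and the corrected yield is 100 pmol; serine then carries the whole composition error.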
III. Results and Discussion
A. Participation
Seventy-one facilities returned data to the Amino Acid Analysis Committee. One facility returned two sets of data, for a total of seventy-two data sets. This year, 40 of the 72 analyses (56%) were performed using pre-column, as opposed to post-column, techniques. This is a slight increase in the percentage of pre-column users over the previous 2 years, when 52% of respondents utilized pre-column methods (8,9). As in previous years, the most popular methods remain the pre-column PITC (40%) and the post-column ninhydrin (39%) methods.
B. Error and Yield
The average error for each data set is shown in Figure 1. The more accurate half of the analyses is shown on the top graph. The overall error for all analyses was 11.9 ± 9.8% with a range from 4.0 to 58.9%. The accuracy in this year's study was far better than that of last year's (1995) AAA study, which asked members to analyze a protein spotted onto a PVDF membrane (overall average error = 21.4%) (9). In the 1994 study, which involved analysis of a soluble protein similar to that in the current study, the overall average error was 10.9 ± 3.7% (8). It should be pointed out, however, that in 1994 the committee did not include results from laboratories with >30% average error. If we similarly exclude the 5 worst sets of data (average error 33 to 59%), the overall average error for 1996 improves to 9.6 ± 4.1%, which appears to be a slight improvement over 1994. The contrast between this year's study and that of the previous year reemphasizes the difficulty of accurately analyzing proteins on PVDF. Total yield of protein determined by each laboratory is shown in Figure 2. We expected the average yield to be about 5 µg, assuming the weight recorded on the Sigma bottle was accurate and assuming small pipetting errors on our part in preparing the samples. Quantitation values, however, ranged from 0.9 to 10.6 µg, with an average of 6.8 ± 1.7 µg protein. The vast majority of results fell between 6 and 9 µg protein, and many values clustered around the median value of 7 µg. Table I presents both average error and yield data in terms of the methodology used. The accuracy of analyses performed using pre-column and post-column methods is very similar, despite the fact that pre-column users tend to analyze significantly less of the total protein than post-column users. In this year's study the top site (smallest error) was a pre-column PITC site. The second best site used post-column OPA methodology (see Table II).
The average total protein yield for analyses using pre-column methods is lower than the average for post-column users, but the difference is not statistically significant. Both of the top two sites had higher than average yields, 7.9 and 8.4 µg, respectively.
Table I. Correlation of Method with Average Error and Yield

Method          N    Average Error (%)              Yield (µg)
                     Average ± SD    Range          Average ± SD    Range
Overall         72   11.9 ± 9.8      4.0 - 58.9     6.8 ± 1.7       0.9 - 10.6
Pre-Column      40   11.8 ± 10.0     4.0 - 58.9     6.6 ± 1.8       0.9 - 9.4
  PITC          29   12.6 ± 11.7     4.0 - 58.9     6.6 ± 1.6       2.9 - 9.0
  AQC            7    8.9 ± 2.0      5.5 - 11.2     7.1 ± 1.5       5.4 - 9.3
  FMOC/OPA       4   10.9 ± 1.9      8.6 - 13.0     5.5 ± 3.5       0.9 - 9.4
Post-Column     32   12.1 ± 9.6      4.0 - 42.6     7.4 ± 1.5       4.7 - 10.6
  Ninhydrin     28   11.5 ± 8.7      4.0 - 42.6     7.4 ± 1.4       4.7 - 10.2
  OPA            3   15.1 ± 18.8     4.5 - 36.8     6.9 ± 1.3       6.0 - 8.4
  Fluram         1   21.1            21.1           10.6            10.6
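As a quick consistency check on Table I, the group averages for error should equal the N-weighted means of their subgroup averages. The sketch below uses only the N and average error values printed in the table; the helper name `pooled_mean` is hypothetical.

```python
# (N, average error %) per method, taken from Table I.
pre_column = [(29, 12.6), (7, 8.9), (4, 10.9)]     # PITC, AQC, FMOC/OPA
post_column = [(28, 11.5), (3, 15.1), (1, 21.1)]   # Ninhydrin, OPA, Fluram

def pooled_mean(groups):
    """N-weighted mean of subgroup means."""
    n_total = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / n_total

pre_mean = pooled_mean(pre_column)                      # compare with 11.8 in Table I
post_mean = pooled_mean(post_column)                    # compare with 12.1 in Table I
overall_mean = pooled_mean(pre_column + post_column)    # compare with 11.9 in Table I
```

The pooled values round to the tabulated pre-column, post-column, and overall averages, which supports the row assignments in the table.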
[Figure 1. Average error (%) for each data set, plotted by site. Figure 2. Total yield of protein (µg) determined by each laboratory, plotted by site.]
Figure 3. Correlation of Scores on Propsearch or ExPASy with Average Error of Analysis. The score given by Propsearch (graph on left) or ExPASy (graph on right) for tpis_rabit was plotted against the average error for that analysis of the protein. Calculated mol % data for the query protein were submitted to search programs without calibration standard (if any). Proteins ranked as number 1 : A, rabbit tpis;iR, tpis from species other than rabbit; and o, other protein.
not identified as tpis of any species (Figure 3). The distance score assigned to rabbit tpis by ExPASy did not correlate as well with the average error (r = 0.475). Inclusion of calibration protein data improved the ranking of tpis_rabit in 14 cases (Figure 4a), caused no change in 16 cases, and decreased the ranking of tpis_rabit in 11 cases (Figure 4b). The ability of a calibration protein to aid in the correct identification of the unknown protein depended on the accurate analysis of the calibration protein. As illustrated in Table IV, those calibration proteins that either helped or made no change in the correct identification of the unknown had average errors for the analysis of the calibration protein itself of 8.3 ± 3.2% and 8.0 ± 3.0%, respectively. By contrast, calibration proteins that worsened the identification had average error values of 19.9 ± 13.9%. In some cases, laboratories misidentified the standard protein to be used as a calibrant; in others, results were poor for all reported data. Interestingly, the former problem, which can only be attributed to human error, seems to occur regularly in ABRF studies (see the 1996 report of the ABRF Peptide Synthesis committee), although at a low frequency.
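To make the idea of a composition-based distance score concrete, the following sketch converts pmol values to mol % and ranks database entries by the sum of squared mol % differences. This is an assumed, simplified scoring rule chosen for illustration; it is not the actual scoring function of ExPASy or Propsearch, and the database entries and function names are hypothetical.

```python
def mol_percent(pmol):
    """Convert pmol of each amino acid to mol % of the total."""
    total = sum(pmol.values())
    return {aa: 100 * value / total for aa, value in pmol.items()}

def distance(query, reference):
    """Sum of squared mol % differences over amino acids present in both compositions."""
    shared = query.keys() & reference.keys()
    return sum((query[aa] - reference[aa]) ** 2 for aa in shared)

def rank(query, database):
    """Database entry names ordered from closest to most distant composition."""
    return sorted(database, key=lambda name: distance(query, database[name]))

# Hypothetical query and database compositions (mol %), illustrative values only.
query = mol_percent({"Ala": 300, "Gly": 500, "Leu": 200})
database = {
    "protein_x": {"Ala": 30.0, "Gly": 48.0, "Leu": 22.0},
    "protein_y": {"Ala": 10.0, "Gly": 60.0, "Leu": 30.0},
}
ranking = rank(query, database)
```

Under a rule of this kind, analytical error in the query composition inflates the distance to the true protein, which is one way to picture why identification quality tracks the average error of the analysis.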
D. Molecular Mass Determination
The Propsearch and ExPASy programs both allow entry of molecular mass to aid in identification. As part of this year's study, participants were asked to determine the molecular mass of the sample using mass spectrometry (MS) or other methods such as SDS electrophoresis or gel filtration. Sixty-five percent of the facilities did use MS to determine the molecular mass of the sample; the average mass obtained was 26,804 ± 631 (one outlying value of 6000 was excluded from this calculation). The molecular mass for rabbit tpis calculated from the sequence is 26,625. Inclusion of molecular mass in the ExPASy program did not seem to aid in correct identification of the protein. However, in the Propsearch program, molecular mass can be weighted, thus making this program more sensitive to inclusion of mass; an error of 5-10% in mass was determined to adversely affect the score given by Propsearch.
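For scale, the reported average MS mass can be compared with the sequence mass and with the 5-10% error range mentioned above. The sketch uses only the three numbers given in the text; the variable and function names are hypothetical.

```python
measured_mean = 26804.0   # average mass from MS, as reported in the text
measured_sd = 631.0       # standard deviation of the MS masses
calculated = 26625.0      # mass of rabbit tpis calculated from the sequence

def percent_error(measured, reference):
    """Signed relative error of a measured mass, in percent of the reference mass."""
    return 100 * (measured - reference) / reference

mean_error = percent_error(measured_mean, calculated)   # about 0.67 %
sd_percent = 100 * measured_sd / calculated             # about 2.4 %
```

On these numbers the average MS result lies well under the 5-10% mass error said to degrade the Propsearch score, and one standard deviation corresponds to roughly 2.4% of the calculated mass.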
in tetraethyl > tetraisopropyl = tetrabutyl. The rate of reaction of these compounds with the peptide or the complete protein decreased as the bulk and hydrophobic character of the side chain increased. The rate of reaction for each thiuram disulfide was greater for the full length p7 nucleocapsid protein than for the zinc finger peptide. To gain some insight into the reaction pathway, as was done for 3-disulfide bonds
[Structures of the reagents: tetramethylthiuram disulfide, tetraethylthiuram disulfide, tetraisopropylthiuram disulfide, tetrabutylthiuram disulfide, and dicyclopentamethylene-thiuram disulfide.]
Figure 2. HPLC-assay of NCp7 with thiuram disulfides. NCp7 was reacted with 6-fold excess of various thiuram disulfides at pH 7.0 for 10 min at 37° C. The products of the reaction were identified as follows: I - unreacted p7; II - 3(S-S) p7; shaded peak is tetramethylthiuram disulfide; other reagents were eluted later on chromatogram (not included).
Reaction of HIV-1 NC p7 Zinc Fingers
NEM, the NCp7 was allowed to react for 10 min with a limiting amount of thiuram disulfide (1:6 ratio of protein to reagent) and the reaction products were separated by HPLC. The chromatogram in Fig. 2 shows the distribution of protein products from each reaction. To compare the reactivities of the thiuram disulfides with the protein, we estimated the relative amounts of unreacted protein remaining after 10 min (Fig. 2; peak I). The data reveal that the order of reactivity for the thiuram disulfides is tetramethyl > dicyclopentamethylene = tetraethyl > tetraisopropyl > tetrabutyl, in good agreement with the data from the Trp fluorescence quenching study. In other studies (data not shown) the reactions were driven to completion by increasing the reaction time. The final reaction product is fully oxidized p7 with three disulfide bonds and has the chromatographic mobility indicated in Fig. 2 (peak II). We confirmed the presence of three disulfide bonds by treating the oxidized p7 (Fig. 2; peak II) with 4-vinylpyridine followed by protein sequencing of the reaction product, and found it to be devoid of modified free thiols. The fully oxidized protein was also reduced with β-mercaptoethanol and shown to have HPLC elution behavior identical to unreacted p7. The presence of a disulfide bond linking the finger domains was confirmed by enzymatic digestion with Arg-C endoproteinase. Enzymatic digestion at Arg residues flanking the finger domains (↑ in Fig. 3) produces two large peptides that are easily separated by HPLC and distinguished from each other by the UV absorption of Trp 37 in the 2nd finger peptide. Before reduction with 2-mercaptoethanol the finger peptides eluted as a single peak but separated into two chromatographic species after reduction (data not shown). The results show that the reaction with thiuram disulfides is essentially an oxidation reaction and are consistent with known properties of this class of compound. Thus the thiuram disulfides induce disulfide bonds among the zinc finger thiolates, displace zinc, and alter the active conformation of the protein.
Figure 3
1       10        20        30        40        50   55
MQRGNFRNQRKIIKCFNCGKEGHIAKNCRAPRKRGCWKCGKEGHQMKDCTERQAN
(↑ in the original figure marks the Arg-C cleavage sites flanking the 1st finger peptide and the 2nd finger peptide.)
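The residue numbering used throughout this chapter can be checked directly against the NCp7 sequence in Figure 3. The short sketch below locates the Cys, Trp, and Arg residues; the helper name `positions` is hypothetical, and the grouping into fingers follows the residue numbers quoted in the text.

```python
# NCp7 sequence as printed in Figure 3 (55 residues).
NCP7 = "MQRGNFRNQRKIIKCFNCGKEGHIAKNCRAPRKRGCWKCGKEGHQMKDCTERQAN"

def positions(sequence, residue):
    """1-based positions of a one-letter residue code in the sequence."""
    return [i for i, aa in enumerate(sequence, start=1) if aa == residue]

cys = positions(NCP7, "C")   # the six zinc finger cysteines
trp = positions(NCP7, "W")   # the single Trp used to identify the 2nd finger peptide
arg = positions(NCP7, "R")   # candidate Arg-C cleavage sites

first_finger_cys = cys[:3]   # Cys 15, 18, 28
second_finger_cys = cys[3:]  # Cys 36, 39, 49
```

The six cysteines fall at positions 15, 18, 28, 36, 39 and 49, and the only tryptophan is at position 37, matching the residue numbers cited in the text.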
E. Chertova et al
C. The mechanism of tetraethylthiuram disulfide reaction with NCp7
To investigate transient intermediates and the reaction path we selected tetraethylthiuram disulfide as the model reagent. In Fig. 2 there are several peaks of modified protein eluting between peaks I and II that are transient intermediates in the reaction pathway leading to the fully oxidized protein. The protein and reagent were reacted for 1 min and the reaction products separated by HPLC as before. In this chromatogram (Fig. 4) peak 3-disulfide is fully oxidized protein and all other peaks are reaction intermediates. The two most prominent peaks of modified protein (peaks B and C) had greater absorbance at 280 nm (greater ratio of 280 nm/206 nm) than the 3-disulfide peak, indicating that peaks B and C contained a mixed disulfide between the protein and the reagent. The major transient intermediates observed in Fig. 4 seemed to contain mixed disulfides between the reagent and the protein (peaks B and C) but the final reaction product (peak 3-disulfide) does not. The protein with mixed disulfide might readily undergo a rearrangement, liberating reduced reagent, to generate internal disulfide bonds. To prevent this from occurring, the free thiols of the modified protein were blocked by alkylation with NEM before separation. After 2 h in the excess NEM, the products were separated by HPLC as shown in Figure 5. Under these conditions, at least 70% of the initial protein reacted with tetraethylthiuram disulfide before addition of the NEM. Free thiols in the intermediates are most reactive with NEM and disulfides (mixed or internal) are least reactive. Reactions between the NCp7 and tetraethylthiuram disulfide that initiate after the addition of NEM are minimized by the amount of unreacted protein, the limiting amount of tetraethylthiuram disulfide in the reaction mixture and the rapid alkylation by the excess of NEM.
Any unreacted NCp7 with both zinc fingers should react with the excess NEM to give a final product with 6 Cys-NEM, but the modification would proceed more rapidly on the second zinc finger as discussed in Fig. 1. In Fig. 5 the peak labeled "NCp7 / 6NEM" (mass-spectrometry data) is fully alkylated protein that results from the action of NEM on NCp7 protein remaining after 1 min of treatment with tetraethylthiuram disulfide. The area of the "NCp7 / 6NEM" peak relative to the other peaks in the chromatogram is consistent with the expectation that 30% of the initial protein remained unreacted after 1 min and that this protein was quickly modified by the excess NEM. Peak A was analyzed by mass spectrometry on MALDI-II-TOF (Shimadzu) and by Edman degradation. These results showed that most of the protein was modified by the addition of 4 NEM moieties on Cys residues 28, 36, 39 and 49. However, the analysis also indicated that the peak contained lesser
amounts of other modified forms of protein.
Figure 4. HPLC separation of NEM-modified NCp7 oxidized with tetraethylthiuram intermediates.
Peak C (Fig. 5) had a ratio of absorbance at 280 nm/206 nm greater than peaks A, B and "NCp7/6NEM", suggesting that it contains protein with at least one mixed disulfide and is derived from the intermediate in peak C in Fig. 4. To further analyze the modified intermediates separated in Fig. 4, the protein isolated in peaks A, B and C was reduced with DTT and digested with Arg-C as previously discussed (Fig. 3). The resulting peptides were separated by HPLC as shown in Fig. 5 panels A, B and C. The separated peptides were characterized by their molecular mass, HPLC mobility and UV absorption. Peak A (Fig. 4) was analyzed in panel A (Fig. 5) and found to contain a mixture of at least three modified intermediates. The most abundant modified intermediate gave a 1st finger peptide with 1 Cys-NEM residue and a 2nd finger peptide with 3 Cys-NEM residues. The data are consistent with an oxidative intermediate with one disulfide bond in the first finger. The other two modified intermediates in peak A appear to be derived from higher oxidation products with two disulfide bonds per intermediate. The data suggest that both higher oxidation products have a disulfide bond linking the first and second finger domains and an additional disulfide bond. One intermediate has the additional disulfide in the 1st finger domain and the other has it in the 2nd finger domain. Peak B (Fig. 4) was analyzed and found (Fig. 5, panel B) to contain a mixture of two modified intermediates. Both modified intermediates had the same molecular mass and contained four Cys-NEM residues. One modified intermediate gave a 1st finger peptide with 3 Cys-NEM residues and a 2nd finger peptide with 1 Cys-NEM. The other modified intermediate gave a 1st finger peptide with 2 Cys-NEM residues and a 2nd finger peptide with 2 Cys-NEM residues.
The results are consistent with two intermediates
in the oxidative path, one with a disulfide in the 2nd finger domain and the other with a disulfide linking the 1st and 2nd finger domains.
Figure 5. HPLC separation of Arg-C NEM-modified NCp7 peptides. Panels: Arg-C peptides from peak A, from peak B, and from peak C.
Peak C (Fig. 4) was analyzed in Fig. 5 panel C and found to contain 1st finger peptide and 2nd finger peptide with three Cys-NEM. These results are consistent with an intermediate in the NCp7 thiuram oxidation pathway with all three Cys residues in the first finger protected from NEM by disulfide bonds and all three Cys residues in the 2nd finger as thiols and unprotected. The data reveal that peak C (Fig. 4) contained an oxidative intermediate with the first finger domain modified by one internal disulfide and one mixed disulfide. The mixed disulfide was also indicated by the ratio of OD206/OD280 for peak C (Fig. 3). The data presented in Figs. 4 and 5 are consistent with a prominent reaction path that begins with an initial attack of tetraethylthiuram disulfide on the 1st zinc finger of NCp7. The earliest oxidation intermediate that accumulates as a detectable transient has one disulfide bond in the first finger linking Cys 15 to Cys 18 (major component of peak A, Fig. 5). This intermediate has a free thiol on Cys 28 which can react in the next step with tetraethylthiuram disulfide to form a mixed disulfide. This intermediate has one internal disulfide and one mixed disulfide (the major component in Fig. 5 peak C). In subsequent steps this intermediate can form either intra- or
intermolecular disulfide bonds and react with additional reagent to give higher oxidation products. The data for peak B (Fig. 5) suggest that a less prominent reaction path begins with an initial attack on the 2nd zinc finger and leads to higher oxidation products. Taken together, the data indicate that tetraethylthiuram disulfide can attack both zinc fingers in the NCp7 but reacts more readily with the 1st finger in the native protein, since peak A > peak B.
D. Inactivation of HIV-1 (MN) with the thiuram disulfides in vitro
In order to analyze the action of the thiuram disulfides on NC protein in whole virus, 1000x concentrated cell-free HIV-1(MN) was incubated with 50 mM of thiuram disulfides for 60 min. The virus was then pelleted by centrifugation to remove reagents and was analyzed by western blot analysis with antibody against p7 (Fig. 6). Under non-reducing conditions the NC protein of untreated virus is a mixture of monomers, dimers, trimers and tetramers (see Fig. 6, HIV-1 lane). Virus treated with thiuram disulfides showed NC antigen migrating above the 200 kDa marker, and in some cases the monomeric form of NCp7 was completely absent (tetraethylthiuram, tetraisopropylthiuram and dicyclopentamethylenethiuram disulfides). The results show that thiuram disulfides are capable of penetrating the viral membrane and attacking the NC protein in the viral core.
Figure 6. Non-reducing SDS-PAGE analysis of HIV-1 treated with drug: lane 1 - tetramethylthiuram disulfide, 2 - tetraethylthiuram disulfide, 3 - dicyclopentamethylenethiuram disulfide, 4 - tetraisopropylthiuram disulfide, and 5 - tetrabutylthiuram disulfide. Bands labeled in the original figure: tetramer, trimer, dimer, and p7NC.
Discussion
Thiuram disulfides react with NC p7 through a sulfhydryl-disulfide exchange involving Cys thiolates and the electrophilic disulfide bond of the thiuram. The reaction proceeds through a nucleophilic attack of the thiuram disulfide by the Cys thiolate. It is known that for low molecular weight thiols such a reaction forms predominantly a symmetrical disulfide, whereas with protein sulfhydryl groups, a mixed disulfide is the major product. The reaction of thiuram disulfides with NC protein yields a mixed disulfide (derived from Cys thiols and a diethyldithiocarbamyl moiety) and a diethyldithiocarbamate ion as the primary products of the exchange process. A complex reaction pathway was observed and attributed to the presence of six Cys residues in close proximity in NC protein, which may form a heterogeneous disulfide bonding pattern. The side group had a pronounced effect on the rate of reaction. Thiuram disulfides carrying the branched isopropyl chain as well as the longer butyl groups reacted more slowly than the more compact methyl or ethyl groups. This behavior may result from the increased segmental flexibility of the longer branched chain; however, the bulkier dicyclopentamethylene group reacted fastest among the compounds investigated. We attribute this effect to a constrained mobility of the cyclic derivative that maximizes productive encounters between the Cys thiolate and the thiuram disulfide moiety. This agrees with computer modeling studies suggesting a wedgelike shape for the compound. The 1st zinc finger of HIV-1 NC protein is primarily the initial target in the reaction with tetraethylthiuram disulfide. In contrast, NEM reacted faster with the 2nd p7 zinc finger.
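The peak assignments behind this pathway rest on simple cysteine accounting: each of the six Cys residues in NCp7 must end up as an NEM adduct, as half of an internal disulfide, or in a mixed disulfide with the reagent. The sketch below checks that the intermediates described in the text balance; the function name and the dictionary labels are hypothetical, and the counts are read from the descriptions of peaks A, B and C above.

```python
TOTAL_CYS = 6  # Cys 15, 18, 28 (1st finger) and Cys 36, 39, 49 (2nd finger)

def cys_accounted(nem, internal_disulfides, mixed_disulfides=0):
    """Cysteines accounted for: one per NEM adduct or mixed disulfide, two per internal disulfide."""
    return nem + 2 * internal_disulfides + mixed_disulfides

# (NEM adducts, internal disulfides, mixed disulfides) for each species described in the text.
species = {
    "NCp7/6NEM, unreacted and fully alkylated": (6, 0, 0),
    "peak A, major component": (4, 1, 0),
    "peak A, minor components": (2, 2, 0),
    "peak B components": (4, 1, 0),
    "peak C": (3, 1, 1),
    "3-disulfide, fully oxidized": (0, 3, 0),
}

balanced = {name: cys_accounted(*state) == TOTAL_CYS for name, state in species.items()}
```

Every species accounts for exactly six cysteines, so the NEM counts found on the Arg-C peptides are consistent with the disulfide patterns proposed for each intermediate.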
While our studies of NC protein alone demonstrated little if any crosslinking, the results with HIV-1 virus showed extensive oligomerization. The mature virion contains a compact ribonucleoprotein complex formed by the genomic RNA and ca. 2,500 copies of the NC protein. Therefore, given the high concentration of NC in the viral particle, the formation of intermolecular disulfide bonds over intramolecular ones is expected to be favored following virus treatment with thiuram disulfides. In agreement with this model, the reaction of tetraethylthiuram disulfide with concentrated samples of HIV-1 NC protein in vitro led to extensive p7 oligomerization. Such crosslinked macromolecular structures appear likely
to result in functional impairment of the NC domain. Indeed, treatment of retroviruses with thiuram and aromatic disulfides rendered them non-infectious (data not shown). These results are fully compatible with the known functions of the NC protein in the viral replication cycle and with published results describing the action of other oxidizing agents on whole HIV-1 (Rice et al., 1994, Rein et al., 1996). In conclusion, thiuram disulfides are examples of a class of compounds that oxidize retroviral NC proteins and hold promise in antiretroviral therapy.

References
Aldovini, A., and R.A. Young. (1990). J. Virology. 64, 1920.
Alexander, P., Z.M. Bacq, S.F. Cousens, M. Fox, A. Herve, and J. Lazar. (1955). Radiat. Res. 2, 392.
Bacq, Z.M., and A. Herve. (1953). Arch. Int. Physiol. 61, 433.
Bacq, Z.M., A. Herve, and P. Fisher. (1953). Bull. Acad. Roy. Med. Belg. 18, 226.
Chance, M.R., I. Sagi, M.D. Wirt, S.M. Frisbie, E. Scheuring, E. Chen, J.W. Bess, Jr., L.E. Henderson, L.O. Arthur, T.L. South, G. Perez-Alvardo, and M.F. Summers. (1992). Proc. Natl. Acad. Sci. USA. 89, 10041.
Child, G.P., and M. Grump. (1952). Acta Pharmacol. Toxicol. 8, 305.
Copeland, T.D., M.A. Morgan, and S. Oroszlan. (1984). Virology. 133, 137.
Dupraz, P., S. Oertl, C. Meric, P. Damay, and P.-F. Spahr. (1990). J. Virology. 64, 4978.
Gorelick, R.J., L.E. Henderson, J.P. Hanser, and A. Rein. (1988). Proc. Natl. Acad. Sci. USA. 85, 8420.
Gorelick, R.J., S.M. Nigida, J.W. Bess, L.O. Arthur, L.E. Henderson, and A. Rein. (1990). J. Virol. 46, 3207.
Gorelick, R.J., Chabot, D.J., Rein, A., Henderson, L.E. and Arthur, L.O. (1993). J. Virol. 67, 4027.
Gorelick, R.J., Chabot, D.J., Ott, D.E., Gagliardi, T.D., and Arthur, L.O. (1996). J. Virol. 70, 2593.
Henderson, L.E., T.D. Copeland, R.C. Sowder, II, G.W. Smythers, and S. Oroszlan. (1981). J. Biol. Chem. 256, 8400.
Henderson, L.E., Rice, W.G., and Arthur, L.O. (1995). US Patent Application USSN 08/312,331.
Lumper, L., and H. Zahn. (1965). Advan. Enzymol. 27, 199.
Meric, C., and S.P. Goff. (1989). J. Virol. 63, 1558.
Meric, C., E. Gouilloud, and P.-F. Spahr. (1988). J. Virol. 62, 3328.
Rein, A., D.E. Ott, J. Mirro, L.O. Arthur, W. Rice, and L.E. Henderson. (1996). J. Virol. 70, 4966.
Rice, W.G., Schaeffer, C.A., Harten, B., Villinger, F., South, T.L., Summers, M.F., Henderson, L.E., Bess, J.W. Jr., Arthur, L.O., McDougal, J.S., Orloff, S.L., Mendeleyev, J. and Kun, E. (1993). Nature 361, 473.
Rice, W.G., J.G. Supko, L. Malspeis, R.W. Buckheit, Jr., D. Clanton, M. Bu, L. Graham, C.A. Schaeffer, J.A. Turpin, J. Domagala, R. Gogliotti, J.P. Bader, S.M. Halliday, L. Coren, R.C. Sowder II, L.O. Arthur, and L.E. Henderson. (1995). Science. 270, 1194.
Summers, M.F., L.E. Henderson, M.R. Chance, J.W. Bess, Jr., T.L. South, P.R. Blake, I. Sagi, G. Perez-Alvardo, R.C. Sowder, II, D.R. Hare, and L.O. Arthur. (1992). Protein Science. 1, 563.
Towbin, H., T. Staehelin, and J. Gordon. (1979). Proc. Natl. Acad. Sci. USA. 76, 4350.
Tummino, P.J., J.D. Scholten, P.J. Harvey, T.P. Holler, L. Maloney, R. Gogliotti, J. Domagala, and D. Hupe. (1996). Proc. Natl. Acad. Sci. USA. 93, 969.

Acknowledgments
Research sponsored by the National Cancer Institute, Department of Health and Human Services (DHHS). The contents of this publication do not necessarily reflect the views or policies of the DHHS, nor does mention of trade names, commercial products, or organizations imply endorsements by the US Government.
The Identification and Isolation of Reactive Thiols in Ricin A-Chain and Blocked Ricin Using 2-(4'-Maleimidylanilino)naphthalene-6-sulfonic Acid
Mary E. Denton, Rita M. Steeves and John M. Lambert
ImmunoGen, Inc., Cambridge, MA 02139
I. Introduction
The identification of reactive or chemically modified residues of proteins is often extremely important for the characterization of proteins and their activity. Peptide mapping in conjunction with Edman sequencing and/or mass spectrometric analysis has been the method of choice to accomplish this characterization. However, this approach alone may not be sufficient or optimal for every situation, as was the case when trying to identify the affinity ligand attachment sites on the B-chain of blocked ricin (Lambert et al., 1991a). Ricin is a heterodimeric protein composed of a toxic A-chain, which is responsible for inhibiting cellular protein synthesis (Olsnes and Pihl, 1973), disulfide-linked to a B-chain, known to possess lectin activity (Baenziger and Fiete, 1979). In order to suppress the non-specific toxicity which arises from the interaction of the carbohydrate binding domains with cell-surface carbohydrates, the two carbohydrate binding pockets on the B-chain of ricin (Montfort et al., 1987) are covalently "blocked" using a modified glycopeptide containing an N-linked triantennary oligosaccharide, thus forming "blocked ricin" (Lambert et al., 1991a). Ricin thus modified has been incorporated as the effector portion of antigen-specific immunoconjugates currently in clinical trials (Lambert et al., 1991b; Grossbard et al., 1993). The glycopeptide is derived from a pronase digestion of the serum protein fetuin and is modified in two ways to form an affinity ligand for ricin B-chain (Lambert et al., 1991a). First, a dichlorotriazine group is linked to one terminal galactose moiety of the glycopeptide to provide a cross-linking group which can react with a nucleophilic residue on the B-chain once the carbohydrate has been bound. Second, a protected thiol is added to the peptide portion of the ligand. Thus, a free thiol can be easily generated for conjugation or labeling purposes.
TECHNIQUES IN PROTEIN CHEMISTRY VIII Copyright © 1997 by Academic Press All rights of reproduction in any form reserved.
Identification of the residues of ricin involved in the covalent linkage to the affinity ligand is important for complete characterization of the cytotoxic effector moiety and to ensure consistency of the immunoconjugate product. However, due to the intrinsic heterogeneity of the ligand, isolation of individual species of ligand-bound B-chain peptides by traditional peptide mapping has not been possible. In order to minimize the effect of the ligand heterogeneity, a different approach was conceived exploiting the incorporated protected thiol at the peptide "end" of the ligand. The thiol-specific probe, 2-(4'-maleimidylanilino)naphthalene-6-sulfonic acid (MIANS¹), was used to label the ligand linked to B-chain in situ. MIANS was an attractive choice because its characteristic absorbance profile (Gupte and Lane, 1979; Andley et al., 1981) facilitates the identification of the peptides of interest. A monoclonal antibody recognizing MIANS was produced so that affinity chromatography could be used to isolate only those peptides of the B-chain that are cross-linked to the MIANS-labeled ligand. In order to test the specificity of the thiol labeling by MIANS and the general efficacy of this method to subsequently isolate and identify reactive thiols, the A-chain of ricin was used as a model protein. Although the A-chain contains two cysteine (Cys) residues, only the C-terminal Cys that forms the intermolecular disulfide linkage to the B-chain is accessible upon reduction of the native protein under the conditions employed (Montfort et al., 1987). If MIANS is indeed specific for reactive thiols and the anti-MIANS affinity column is effective, only MIANS-labeled peptides corresponding to the C-terminal sequence of ricin A-chain should be obtained (with a "blank" cycle at the Cys position upon automated Edman sequencing, due to its derivatization).
Here we report the results obtained using MIANS-labeling in conjunction with affinity chromatography to map free thiols in reduced, native ricin A-chain. We further describe the results obtained when utilizing this method to isolate ligand-bound ricin B-chain peptides.

II. Materials and Methods
A. Absorbance Measurements
Extinction coefficients used for ricin A-chain and blocked ricin B-chain at 280 nm for 0.1% solutions were 0.765 and 1.48, respectively (Olsnes and Pihl, 1973). The extinction coefficient for MIANS is 20,000 M⁻¹ cm⁻¹ at 322 nm (Gupte and Lane, 1979). The empirically determined contribution of MIANS to the absorbance of the labeled protein at 280 nm was 0.9 × A320.
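The absorbance correction described above can be written out as a short calculation. The sketch below is illustrative only: it assumes a 1 cm path length, applies the MIANS extinction coefficient quoted at 322 nm to the A320 reading, and takes the protein molecular weight as a caller-supplied value. The function name and the 30,000 Da used in the example call are placeholders, not values from this chapter.

```python
def mians_per_protein(a280, a320, e_protein_01pct, protein_mw,
                      e_mians=20000.0, mians_280_factor=0.9):
    """Estimate mol MIANS per mol protein from A280 and A320 (1 cm path assumed).

    e_protein_01pct : A280 of a 0.1% (1 mg/mL) protein solution, from the text
    protein_mw      : protein molecular weight in g/mol (assumption, supplied by caller)
    e_mians         : MIANS extinction coefficient in M^-1 cm^-1 (text quotes it at 322 nm)
    """
    mians_molar = a320 / e_mians                        # mol/L of MIANS
    a280_protein = a280 - mians_280_factor * a320       # remove MIANS contribution at 280 nm
    protein_mg_per_ml = a280_protein / e_protein_01pct  # 0.1% solution = 1 mg/mL
    protein_molar = protein_mg_per_ml / protein_mw      # mg/mL equals g/L
    return mians_molar / protein_molar


# Hypothetical readings for a labeled ricin A-chain sample; 30,000 Da is a placeholder MW.
ratio = mians_per_protein(a280=0.945, a320=0.2,
                          e_protein_01pct=0.765, protein_mw=30000.0)
print(f"MIANS per protein: {ratio:.2f}")
```

Keeping the 280 nm correction factor as a parameter makes it easy to substitute a value determined for a different labeled protein.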
¹Abbreviations: MIANS, 2-(4'-maleimidylanilino)naphthalene-6-sulfonic acid; PBS, phosphate buffered saline: 20 mM sodium phosphate containing 150 mM sodium chloride, pH 7.2; EDTA, ethylene diamine tetraacetic acid; DTT, dithiothreitol; GuHCl, guanidine hydrochloride; 2-ME, 2-mercaptoethanol; RB2L-MIANS, ricin B-chain covalently blocked by two affinity ligands and MIANS labeled; TFA, trifluoroacetic acid.
B. MIANS Labeling
Ricin A-chain (Inland Labs) at 3 mg/mL in PBS was reduced for 30 min with 30 mM DTT at 30°C. Excess DTT was removed by Sephadex G-25 gel filtration in 5 mM sodium acetate buffer, pH 4.7, containing 50 mM NaCl and 0.5 mM EDTA. The thiol content was assessed by Ellman's assay (Ellman, 1959). MIANS (Molecular Probes) was added at a ratio of 0.9 mole MIANS: 1.0 mole thiol. The reaction mixture was incubated at ambient temperature for 15 min after the pH was raised to 7.0 using 1 M Tris.HCl buffer, pH 7.4. The MIANS was quenched by addition of 1 mole equivalent of freshly prepared cysteine.HCl/mole MIANS and incubation for an additional 15 min. Any remaining thiol groups were alkylated by adding a 5-fold molar excess of iodoacetamide over total thiol and, after an additional 30 min, the protein (MIANS-ricin A-chain) was dialyzed against 0.1 M Tris.HCl buffer, pH 8.5. Ricin containing two covalently attached ligands (blocked ricin) was produced according to Lambert et al. (1991a). Blocked ricin at 3 mg/mL was reduced with 4.5 mM DTT in PBS at pH 6.8 for 17 hours on ice (conditions that reduce only the ligand disulfide). The MIANS labeling, quenching and alkylation were performed as described above. The protein was dialyzed against 0.1 M Tris.HCl buffer, pH 7.7.
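The reagent ratios in the labeling procedure above can be turned into amounts once the thiol content from Ellman's assay is known. The following is a minimal sketch, not the authors' worksheet. It assumes that "total thiol" for the iodoacetamide step means the protein thiol plus the cysteine added as quencher; that reading is an interpretation, and the function name is hypothetical.

```python
def labeling_amounts(protein_thiol_nmol, mians_per_thiol=0.9,
                     cys_per_mians=1.0, iodoacetamide_excess=5.0):
    """Return nmol of each reagent for a given amount of protein thiol (nmol)."""
    mians = mians_per_thiol * protein_thiol_nmol   # 0.9 mol MIANS per mol thiol
    cysteine = cys_per_mians * mians               # 1 mol cysteine per mol MIANS (quench)
    # Assumption: total thiol = protein thiol + quenching cysteine.
    total_thiol = protein_thiol_nmol + cysteine
    iodoacetamide = iodoacetamide_excess * total_thiol  # 5-fold molar excess
    return {"mians": mians, "cysteine": cysteine, "iodoacetamide": iodoacetamide}


# Example with an arbitrary 100 nmol of measured thiol.
for reagent, nmol in labeling_amounts(100.0).items():
    print(f"{reagent}: {nmol:.0f} nmol")
```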
C. Purification of RB2L-MIANS
The disulfide bond between the A and B chains of MIANS-labeled blocked ricin was reduced and the chains were separated using a modification of the method of Olsnes & Pihl (1973). Protein (approximately 10 mg) at 1 mg/mL in 0.1 M Tris.HCl buffer, pH 7.7, was reduced by incubating with 4.5% 2-ME for 20 hr at ambient temperature. After raising the pH to 8.5 with 0.1 M Tris base, the chains were separated by ion exchange chromatography on a 1.5 x 6.5 cm column of DE52 equilibrated in 0.1 M Tris.HCl buffer, pH 8.5, containing 0.1% 2-ME. After loading the protein, the column was washed with equilibration buffer (approximately 40 mL) followed by a wash with 0.1 M Tris.HCl buffer, pH 8.5 (40 mL). Ricin A-chain does not bind to the resin under these conditions. RB2L-MIANS was eluted with 0.1 M Tris.HCl buffer, pH 8.5, containing 1 M NaCl.
D. Enzymatic Digestion of Proteins MIANS-Ricin A-chain and RB2L-MIANS were reduced and alkylated under denaturing conditions. To each protein solution, GuHCl and DTT were added to 6 M and 20 mM, respectively. Each protein was incubated at 37°C for 1 hr. MIANS-ricin A-chain was reduced for 5 hr, carboxymethylated with 100 mM iodoacetic acid for 30 min, and dialyzed against 0.1 M Tris.HCl buffer, pH 8.5, containing 2.0 M GuHCl. RB2L-MIANS was reduced overnight, carboxymethylated by addition of 100 mM iodoacetic acid for 1 hr, quenched with 2-ME and dialyzed against 0.1 M Tris.HCl buffer, pH 8.5. Enzymatic digestion was performed at a 1/20 (w/w) ratio of each enzyme (Endo-Lys C and chymotrypsin) to substrate. Endoproteinase Lys-C (Endo Lys-C, Wako) digestion was performed according to Riviere et al. (1991). The protein
was heated to 50°C for 30 min in a solution of 0.1 M Tris.HCl buffer, pH 8.5, containing 6 M GuHCl. After heating, the solution was diluted to 2 M GuHCl and enzyme was quickly added. Digestion was allowed to proceed for 6 hr at 37°C. At that time, chymotrypsin (sequencing grade, Boehringer Mannheim) was added and the incubation continued overnight. Digestion was stopped and the proteinases inactivated by adding the mixture rapidly to 3 volumes of boiling methanol (approximately 76 °C) for 3 min. Methanol was removed by rotary evaporation.
E. Affinity Purification
A murine monoclonal antibody, LG-85, specific for MIANS-labeled peptides and proteins was produced by the Hybridoma Development group at ImmunoGen, Inc. The antibody, an IgG1, was produced from the hybridoma grown as an ascites tumor in mice, and was purified from the ascites fluid by affinity chromatography over a Protein A-Sepharose column. The purified antibody was then used to prepare an LG-85-Protein G-Sepharose affinity column as follows: Antibody in PBS was added to resin at a concentration of 2 mg LG-85/mL resin packed in a column. After incubation for 90 min, the resin was washed with 0.2 M borate buffer, pH 9.0, and dimethylpimelimidate was added to a concentration of 20 mM. After cross-linking for 30 min at ambient temperature, the reaction was quenched by incubation for 30 min with 2 M ethanolamine.HCl, pH 7.8. The column was subsequently equilibrated in PBS. The peptide digests in 2 M GuHCl were diluted to 0.5 M GuHCl prior to loading onto the anti-MIANS column (LG-85-Protein G-Sepharose). After washing with 0.1 M Tris.HCl buffer, pH 8.5, followed by 0.1 M sodium citrate/sodium phosphate buffer, pH 2.9, the column was eluted with the citrate/phosphate buffer, pH 2.9, containing 4 M GuHCl. All fractions were evaluated for absorbance at 280 and 320 nm.
F. Chromatography
Gel filtration chromatography using a Superdex Peptide HR 10/30 column (Pharmacia) was performed on a Hitachi System L-6200 Intelligent Pump equipped with an L-4200 UV/Vis Detector. Reverse Phase-HPLC (RP-HPLC) was performed on a Waters 625 LC System with a 991 photodiode array detector, using a (4.6 x 250 mm) Zorbax SB-300 C18 column (MacMOD).

G. Peptide Sequencing
Following affinity chromatography of the MIANS-ricin A-chain digest over the anti-MIANS column, the eluted fraction was run over the C18 RP-HPLC column utilizing gradient elution conditions. Individual peptides were collected for amino acid analysis (data not shown) and for sequencing. Following affinity chromatography of the RB2L-MIANS digest over the anti-MIANS column, the eluted fraction was desalted over the C18 RP-HPLC column.
All peptides of interest were analyzed for amino acid composition and/or for sequence by Dr. John Leszyk at the Worcester Foundation for Experimental Biology. III. Results and Discussion A. MIANS labeling The reduction of ricin A-chain yielded 0.96 mole SH/mole protein. Following the labeling step, the ratio of MIANS/A-chain was 0.70 as determined by measurement of absorbance at 280 nm and 320 nm. The reduction of blocked ricin yielded 1.3 mole SH/mole protein. Following labeling and separation of the blocked B-chain, the ratio of MIANS/blocked B-chain was 0.97.
B. Peptide Mapping of MIANS-Ricin A-chain
Figure 1, Panel A shows the RP-HPLC chromatogram of the combined digest (Endo Lys-C followed by chymotrypsin) of MIANS-ricin A-chain detected at 214 and 320 nm. There are only 6 major peptides that are MIANS-labeled, as evidenced by the 320 nm profile. All of these peptides bind to and are eluted from the anti-MIANS column, as shown in Panel B of Figure 1. Moreover, as the profile at 214 nm shows, there is no evidence of significant amounts of additional, unlabeled peptides sticking to the affinity column. Note also that the ratio of the peaks in the 320 nm chromatogram of Panel B is the same as that in the 320 nm chromatogram of Panel A, indicating that each of the MIANS-labeled peptides binds to the affinity column equivalently. Amino acid analysis was performed (data not shown) on the affinity purified material corresponding to peaks labeled 1 through 6 in the 320 nm chromatogram of Panel B. Peaks 1 through 5 have very similar compositions characterized by a strong proline (Pro) signal and the presence of Arg, Ser, Ala, and Glx. In addition to the strong Pro signal and the amino acid residues contained in Peaks 1 through 5, Peak 6 contained significant amounts of Phe, Tyr and Val. Neither Cys nor carboxymethyl-Cys was found in any of the amino acid analyses (as is consistent with MIANS-derivatization at this residue). No peptides that can be derived theoretically from an Endo-Lys C plus chymotrypsin combined digestion of the ricin A-chain, other than the C-terminal peptides, fit the determined amino acid composition profiles. Although the amino acid analysis data strongly suggested that all of the peptides isolated were derived from the proline-rich C-terminal portion of the ricin A-chain containing MIANS-Cys at position 283, it was necessary to perform automated sequencing on representative peptides to unambiguously identify the location of the MIANS modification and the composition of the peptides.
Two of the peptides from the affinity purified material, peaks 1 and 6, which are representative of the two amino acid analysis results, were sequenced. The sequences are shown in the 320 nm chromatogram of Panel B, Figure 1. Both peptides contain a blank cycle at the residue corresponding to the C-terminal Cys of the ricin A-chain, consistent with the likelihood that this residue is MIANS-labeled. The sequence of Peak 6 differs
from that of Peak 1 in that it contains two additional residues N-terminal to the sequence of Peak 1 and an additional residue, Phe, on its C-terminus. Neither the buried Cys nor any primary amine (the N-terminus or Lys side chains) has been labeled by MIANS. Taken together, these data indicate that the MIANS label is highly specific for accessible thiols. Using these digestion conditions, which had been optimized for the blocked ricin B-chain, has resulted in a heterogeneous peptide map, in part due to incomplete chymotryptic cleavage at Tyr. This is likely caused by interference by the MIANS label and, along with the C-terminal Phe residue in peak 6, accounts for the difference between peaks 1 and 6. The heterogeneity within peaks 1-5 may be derived from either the presence or absence of the C-terminal Phe and/or from instability of the MIANS label. A likely product of the "breakdown" of the MIANS label is a form where the maleimide ring opens at one of the carbonyls. (See Summary and Conclusions below). Surprisingly, whatever instability may be associated with the MIANS label, all of the labeled peptides bind equivalently well to the affinity column.
C. Peptide mapping of the Ricin B-chain attachment sites
The RP-HPLC analysis of the combined digest of the ricin B-chain blocked with MIANS-labeled affinity ligand (RB2L-MIANS) is shown in Figure 2, Panel A. The 320 nm detected trace demonstrates the exceptional heterogeneity of the RB2L-MIANS peptides. Although there are some seemingly well-defined peaks in the chromatogram, attempts to sequence them directly have been unsuccessful. Panel B demonstrates the utility of the anti-MIANS column for the specific isolation of the RB2L-MIANS peptides. Here, the characteristic "double-humped" profile of the RB2L-MIANS peptides is apparent. Note that there is very little 214 nm (top trace, Panel B) absorbing material present. The relative absorbance of these peaks at 320 nm compared to 214 nm is much higher than for the MIANS-ricin A-chain peptides in Figure 1. Although the anti-MIANS column-eluted fraction is heterogeneous by RP-HPLC, the gel filtration chromatography analysis of the affinity purified RB2L-MIANS peptides shown in Figure 3 reveals only one major peak with a minor component eluting on the leading shoulder. A comparison with the elution profile of ligand itself from the same column demonstrates that the RB2L-MIANS peptides are significantly larger than the ligand alone (i.e., >2500 Da). [The minor peak at 20 min in the ligand profile corresponds to ligand dimer.] Given the rigorous digestion conditions (Endo Lys-C followed by chymotrypsin, in the presence of 2 M GuHCl), it is unlikely that this profile represents B-chain peptides without bound ligand. Indeed, sequencing of the anti-MIANS column-eluted fraction without further fractionation yielded two B-chain peptide sequences of
Figure 4: Mass spectrometry analysis of difference peptides from the dehydroascorbate modified β-globin subunit. Dehydroascorbate modified deoxy recombinant hemoglobin was prepared as described in Fig. 2. Trypsin digestion was performed as described in materials and methods and the β-globin N-terminal peptide (see upper panel Fig. 3) and the difference peptide (see lower panel Fig. 3) analyzed by LC-MS. The spectrum in the upper panel represents the β-globin N-terminal peptide and the spectrum in the lower panel represents the difference peptide. A summary of the data is presented in Table I.
This is different from that for dehydroascorbate, which has a mass of 156 amu, and carboxymethylation, which produces a mass increase of 58 amu. Edman protein sequencing analysis of the difference peptide demonstrated a blocked N-terminus, suggesting that the N-terminal methionine was modified. Experiments are currently in progress to determine the exact structure of the modification.
Table I: Summary of mass spectrometry and sequencing data from the unmodified and dehydroascorbate modified β-globin peptides

Sample                          Sequence                       Expected monoisotopic mass (amu)   Observed mass (amu)
N-terminal peptide              MHLTPEEK                       983.5                              983.5
Difference peptide (β-globin)   Blocked to Edman sequencing    -                                  1055.2 (Δ = 71.7)
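The arithmetic behind Table I can be checked directly. The sketch below recomputes the expected monoisotopic mass of MHLTPEEK from standard monoisotopic residue masses and compares the observed shift with the two mass values quoted in the text. The 1 amu matching tolerance and all names are illustrative assumptions, not part of the original analysis.

```python
# Standard monoisotopic residue masses (Da) for the residues in MHLTPEEK.
RESIDUE_MASS = {
    "M": 131.04049, "H": 137.05891, "L": 113.08406, "T": 101.04768,
    "P": 97.05276, "E": 129.04259, "K": 128.09496,
}
WATER = 18.01056  # one water per free peptide (N-terminal H plus C-terminal OH)


def peptide_monoisotopic_mass(sequence):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


expected = peptide_monoisotopic_mass("MHLTPEEK")
shift = 1055.2 - 983.5  # observed difference peptide minus N-terminal peptide (Table I)

# Mass values quoted in the text for the two modifications considered.
candidates = {"dehydroascorbate": 156.0, "carboxymethylation": 58.0}
tolerance = 1.0  # amu; arbitrary illustrative tolerance
matches = [name for name, delta in candidates.items()
           if abs(delta - shift) <= tolerance]

print(f"expected mass: {expected:.1f} amu")
print(f"observed shift: {shift:.1f} amu; matching candidates: {matches}")
```

Under these assumptions neither quoted value matches the 71.7 amu shift, in line with the statement that the structure of the modification remains to be determined.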
Metal activation and regulation of E. coli RNase H
James L. Keck and Susan Marqusee
Dept. of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720
Introduction: The ribonuclease H (RNase H) family of enzymes are ubiquitous nucleases that catalyze the hydrolysis of RNA in RNA•DNA hybrids (for review, see 1). In contrast to the well-studied ribonucleases A and T1, RNase H does not employ the 2'-OH in RNA as a nucleophile but instead activates water as the nucleophile for hydrolysis in a metal-dependent reaction. The number and role(s) of divalent metal in the RNase H reaction mechanism are still unclear. Two RNase H mechanisms have been proposed based on well-characterized metal-dependent DNase activities: a one-metal mechanism (2,3,18), modeled after DNase I (4), and a two-metal mechanism (5,19), modeled after the exonuclease domain from Klenow fragment (6-8). The one-metal mechanism is supported by observation of a single Mg2+ binding to E. coli RNase HI via X-ray crystallography, NMR and isothermal titration calorimetry (9-11,18). Also, mutagenesis of conserved residues in RNase H shows that only three of the ten conserved residues result in a complete loss of activity when mutated to alanine (12). Of these three acidic residues, two are found by X-ray crystallography to ligand a single Mg2+ (Asp10 and Glu48) and the third (Asp70) is proposed to abstract a proton from the attacking nucleophilic water (3,9). This divalent metal is proposed to stabilize the reaction's pentacovalent phosphorane transition state intermediate. In contrast, the two-metal mechanism is supported by observation of two Mn2+ ions bound in the active site in the crystal structure of the HIV-1 RNase H domain (a domain of reverse transcriptase) (5). The Mn2+ ions are ~4 Å apart (as is seen in the Klenow fragment exonuclease domain (6,8)) and are bridged by a uranium heavy atom. It is thought that the uranium acts as an
artificial bridging ligand that would normally be satisfied by substrate. In this two-metal mechanism, one metal activates the hydroxyl nucleophile and the second metal stabilizes the phosphorane intermediate of the reaction. Both mechanistic hypotheses are daunted by the lack of structural information on RNase H bound to its nucleic acid substrate. An understanding of the RNase H mechanism is important for a number of reasons. First, the RNase H activity is essential to the lifecycle of HIV (for review, see 13). Mutations that merely reduce the reverse transcriptase RNase H activity are sufficient to completely inhibit virulence in the mutant in vivo (14). Its absolute requirement makes the RNase H activity a logical drug target for anti-HIV therapies. Development of knowledge-based inhibitors will require an understanding of the activity's mechanism. Second, a number of proteins with structures homologous to RNase H have been solved in the past two years, all of which are metal-requiring nucleic acid manipulating proteins (reviewed in 15-17). This superfamily of proteins, now termed "polynucleotide transferases", includes RNase H (5,18,19), resolvase (20), integrase (21,22) and Mu transposase (23). It is assumed that the enzymes share a common mechanism, so an understanding of RNase H mechanism will assist in clarifying the mechanism of other members of this superfamily. Clearly, a determination of the number and role(s) of metal in the E. coli RNase H active site will help establish the enzyme's mechanism. We have examined the metal dependence (Mn2+ and Mg2+) of E. coli RNase H activity. Mn2+-dependent activity requires much less metal for activation than does Mg2+-dependent activity and is inhibited upon the further addition of Mn2+. Using electron paramagnetic resonance (EPR), we have measured two distinct Mn2+ binding constants, consistent with the concentration requirements for activation and inhibition in vitro. Our data are most consistent with a single divalent metal catalyzed reaction which can be attenuated (inhibited) upon binding a second metal. We discuss a possible mechanism for metal activation and inhibition of RNase H in light of previous mutagenesis and structural studies.
Materials and Methods
Materials: pJK502 is a T7-overexpression vector that encodes wild-type E. coli RNase HI. It was made by site-directed mutagenesis of pSM101 (24), reverting three alanines (residues 13, 63 and 133) to cysteines, and then subcloning the resulting gene into a pET11a overexpression vector (details of plasmid sequence available upon request). Overexpression and purification of E. coli RNase HI were performed essentially as described in (24) for RNase H*. RNase H* is
a cysteine-free version of E. coli RNase HI and was a gift from Dung Vu.
Methods: RNase H activity assay. Production of RNA•DNA hybrid and RNase H activity assays were performed essentially as described in (25). MnCl2 stocks were made by serial dilution of a 1 M MnCl2 stock in 1% nitric acid. Final MnCl2 stocks were in 0.1% nitric acid. The pH of the final assay solutions was unchanged by the addition of the nitric acid, as confirmed by measuring the pH of mock reactions containing buffer (Tris) and the appropriate volume of 0.1% nitric acid. Velocity values are specific activity (units/mg enzyme) measurements based on four time points in the linear range of enzymatic activity. Assays were performed in 50 mM Tris, pH 8.0/50 mM NaCl/1 mM DTT/1.5 µM BSA/1 µM (basepairs) RNA•DNA hybrid with 0.2 to 1.2 nM E. coli RNase HI at 37 °C. One unit is defined as the amount of enzyme needed to generate 1 µmol of acid-soluble product in 15 minutes under our reaction conditions.
Electron Paramagnetic Resonance (EPR) binding experiments: All EPR measurements were carried out on a Bruker ESP300E X-band spectrometer at ambient temperature. Lyophilized RNase H* was resuspended in 50 mM Hepes, pH 7.5, and diluted 1:1 with MnCl2 stocks made up in water; final concentrations were 25 mM Hepes, pH 7.5, 50 µM RNase H* and MnCl2 from 15 to 300 µM. Free Mn2+ is the only component that gives an EPR spectrum, allowing calculation of [Mn2+]bound ([Mn2+]total = [Mn2+]free + [Mn2+]bound). A Mn2+ peak (centered at 3629.6 G with a ±60 G sweep) was measured and then compared (height and peak shape) with Mn2+ standards in the same buffer conditions to determine the scalar difference. Using these scalar factors, [Mn2+]free was calculated at various total Mn2+ concentrations. Dissociation constants were determined via Scatchard analysis of the data (26).
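The binding calculation outlined above can be sketched as follows. This is a simplified illustration, not the authors' analysis: it fits a single straight line to a Scatchard plot (ν/[Mn2+]free against ν), which describes one class of sites, whereas the data in this chapter were fit with two lines. The function names and the synthetic test values are hypothetical.

```python
def scatchard_points(total_mn, free_mn, protein):
    """Return (nu, nu/free) pairs; nu = bound Mn2+ per protein molecule.

    All concentrations must share one unit (for example µM).
    """
    points = []
    for total, free in zip(total_mn, free_mn):
        nu = (total - free) / protein  # [Mn]bound = [Mn]total - [Mn]free
        points.append((nu, nu / free))
    return points


def fit_line(points):
    """Ordinary least-squares slope and intercept for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def kd_from_scatchard(points):
    """Scatchard relation: nu/free = n/Kd - nu/Kd, so slope = -1/Kd."""
    slope, intercept = fit_line(points)
    kd = -1.0 / slope
    n_sites = intercept * kd
    return kd, n_sites


# Synthetic single-site data: Kd = 15 µM, 50 µM protein (illustrative values only).
free = [5.0, 10.0, 20.0, 40.0]
total = [f + 50.0 * f / (15.0 + f) for f in free]
kd, n_sites = kd_from_scatchard(scatchard_points(total, free, 50.0))
print(f"Kd = {kd:.1f} µM, sites = {n_sites:.2f}")
```

For two classes of sites the Scatchard plot is curved, and fitting separate lines to its two limbs, as done here for RNase H*, gives only approximate constants.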
Results
Mn2+-dependence of E. coli RNase HI: The Mn2+-dependence of E. coli RNase HI catalysis was determined using a soluble assay that monitors acid-solubility of radiolabeled RNA in an RNA•DNA hybrid (25). This Mn2+-dependence shows activation at low concentrations of Mn2+, followed by inhibition at higher Mn2+ concentrations (Figure 1). The optimum activity is achieved in 5 µM MnCl2, and is 30% of the maximum activity in MgCl2 (data not shown). At 1 mM MnCl2, activity is inhibited ~20-fold relative to activity at 5 µM MnCl2. Activation and inhibition of RNase H* were indistinguishable from those of RNase HI (data not shown).
Figure 1. Mn2+-dependence of E. coli RNase HI activity. Assays were performed as described in Materials and Methods. Data points are the average of two assays with standard deviations shown as error bars. [Plot; x-axis: [MnCl2], µM, logarithmic scale.]
Mn2+-binding by E. coli RNase H*: EPR spectrometry was used to determine the stoichiometry and affinity of Mn2+ binding to E. coli RNase H*. Scatchard analysis of the binding data shows that E. coli RNase H* (a cysteine-free version of E. coli RNase HI) has multiple Mn2+ binding sites. Two binding sites were determined with dissociation constants (Kd) of ~15 µM and ~60 µM (Figure 2).

Figure 2. Equilibrium Mn2+-binding to E. coli RNase H*. ν represents the fraction of bound Mn2+ per total RNase H* as described in Materials and Methods. Dashed lines indicate the best fit two lines representing the data points. Kd measurements are the inverse of the fits' abscissa. Measured dissociation constants: Kd1 = 15 µM, Kd2 = 60 µM. [Scatchard plot; x-axis: ν.]
Figure 3. Comparison of Mn2+-activation to simulated activation curves. [Plot; simulated curves labeled "both metals required for activation" and "first metal activates, second metal inhibits"; x-axis: [MnCl2], µM, logarithmic scale.]
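The kind of simulation shown in Figure 3 can be outlined from the two measured dissociation constants. The sketch below is a simplified reconstruction under stated assumptions, not the authors' calculation: it assumes ordered binding (the tight site fills before the weak site), treats free Mn2+ as equal to total Mn2+, and takes activity as proportional to the population of the active species, scaled by the ~0.3 factor relative to Mg2+. Function and model names are hypothetical.

```python
def site_populations(mn, kd1=15.0, kd2=60.0):
    """Fractions of enzyme with 0, 1 and 2 Mn2+ bound at free [Mn2+] = mn (µM).

    Assumes ordered binding: site 1 (Kd1) is occupied before site 2 (Kd2).
    """
    w1 = mn / kd1                   # statistical weight of the singly bound form
    w2 = mn * mn / (kd1 * kd2)      # statistical weight of the doubly bound form
    z = 1.0 + w1 + w2
    return 1.0 / z, w1 / z, w2 / z


def relative_activity(mn, model, scale=0.3):
    """Simulated relative activity under one of two hypothetical models."""
    _, one_bound, two_bound = site_populations(mn)
    if model == "both_required":      # only the doubly bound enzyme is active
        return scale * two_bound
    if model == "second_inhibits":    # singly bound enzyme active, doubly bound inactive
        return scale * one_bound
    raise ValueError(f"unknown model: {model}")


for mn in (1.0, 10.0, 100.0, 1000.0):
    both = relative_activity(mn, "both_required")
    inhibit = relative_activity(mn, "second_inhibits")
    print(f"{mn:7.1f} µM  both_required={both:.3f}  second_inhibits={inhibit:.3f}")
```

Under these assumptions the "second_inhibits" curve rises and then falls, with its maximum at the geometric mean of the two dissociation constants, while the "both_required" curve increases monotonically with metal concentration.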
Comparison of metal binding to dependence of the RNase H catalysis: Using the determined Mn2+-binding constants to E. coli RNase H*, the relative populations of enzyme with either 0, 1 or 2 Mn2+ ions bound as a function of [Mn2+]total were determined. Figure 3 shows two simulations of the Mn2+ dependence of E. coli RNase H activity: one for the case where both metals are required for activation and one for the case where one metal is activating and one inhibiting. A comparison of Mn2+ activation of E. coli RNase HI (from Figure 1) shows the similarity between the data and the latter model. The overall shape similarity of the two plots is striking. The highest metal affinity binding correlates well to in vitro activation and binding of the lower affinity metal correlates approximately to inhibition (Figure 3). Relative RNase H activity is scaled to represent the observation that maximum Mn2+-dependent activity is ~0.3 that for maximum Mg2+-dependent activity (27). Differences between the real and theoretical activation data may imply differences in metal binding in the presence of substrate.

Discussion
It has been known for over 20 years that either Mn2+ or Mg2+ can activate E. coli RNase HI. However, differences between metal requirements in Mn2+ and Mg2+ are only now beginning to be understood. Here, we have shown that the Mn2+ requirement for E. coli RNase HI activity is in the low micromolar range. This value can be contrasted to the relatively high (~0.1 to 1 mM) Mg2+ concentrations required for activity (10,27). Further, we have
demonstrated that the Mn2+-dependent RNase H activity can be inhibited with higher Mn2+ concentrations (> 5 |iM). Mn^+ inhibition at higher metal concentrations could be due to a n u m b e r of factors, including: (1) metal-induced conformational changes in the RNA»DNA hybrid substrate or (2) metal binding to the enzyme that reduce it's activity. Similar inhibition has been documented for E.coli RNase H Mg^^-dependent reaction, with the inhibition attributed to substrate-metal association (28). This interpretation was based on the fact that E.coli RNase H binds Mg^+ with a 1:1 stoichiometry (10) and that the binding constant for Mg2+ to nucleic acid is similar to the inhibition constant (29,30). It is possible however, that binding studies reveal a second Mg2+ binding site on the enzyme only in the presence of substrate. With Mn2+, we can correlate metal binding to both activation and inhibition. We therefore, support the idea that Mn^^ inhibits as a result of binding an inhibitory site on the enzyme. We have determined here that E.coli RNase H can bind multiple (presumably two) Mn^+ ions with KdS of --15 |LIM and --60 |LIM. Upon comparison of metal-binding with our activation/inhibition data, the simplest model is that the tightest binding metal activates the enzyme while the second metal inhibits the activity (Figure 3). If both metals were required for activity, presumably there would be no inhibition at higher metal concentrations. Binding of the first metal in the absence of substrate correlates well with activation, but binding of the second metal appears weaker without substrate (i.e. the apparent inhibitory Kd is less than the measured Kd of the second metal binding). This discrepancy may indicate that substrate is involved in complete formation of the second metal binding site. Mechanism of E.coli RNase H: The metal-dependence of the RNase H reaction mechanism is not well understood. 
Currently, two primary mechanisms have been proposed: a one-metal mechanism and a two-metal activation mechanism. In light of the information presented in this paper, and in the context of information that has been gathered on the RNase H family of enzymes, we hypothesize that the RNase H mechanism is a reaction catalyzed by a single divalent metal ion that can be attenuated by a second metal-binding event. This hypothesis encompasses all of the seemingly contradictory information that has been presented to defend both the one-metal and two-metal mechanisms. Figure 4 diagrams the basis of the proposed mechanism. In the absence of metal, E. coli RNase H is completely inactive. Upon addition of metal at activating concentrations (< 5 µM Mn2+), the tight metal-binding site is filled and the enzyme is optimally active. We assume that the tightest-binding metal binds in the single Mg2+ site observed crystallographically (9). Upon increasing the metal
Metal-Dependence of E. coli RNase H
Figure 4. Hypothetical metal binding in the E. coli RNase H active site. [Diagram not reproduced; panel labels: Inactive, Inhibited; residue labels: Asp10, Glu48, Asp70, Asp134.]
concentration (> 5 µM Mn2+), the second metal-binding site (assumed from the co-crystal structure of the HIV RNase H domain with Mn2+ (5)) becomes occupied and the enzyme is inhibited. What is the mechanism of metal inhibition? Our current hypothesis is based on E. coli RNase H active-site mutagenesis results coupled with the observation of two Mn2+-binding sites in the HIV RNase H domain structure. In the one-metal mechanism, Asp 70 abstracts a proton from the attacking nucleophilic water and then needs to deprotonate to reset the enzyme for the next hydrolysis (3). This deprotonation is believed to occur by shuttling the proton to His 124, since solvent is not accessible to Asp 70 and His 124 is nearby (within 4 Å). Mutagenesis of His 124 to Ala results in a 100-fold reduction of kcat (3), presumably because Asp 70 must deprotonate through a less efficient mechanism. If His 124 is a liganding element for the second metal-binding site, its pKa would be expected to shift down upon metal binding, making it more difficult to protonate. The effect of this pKa shift would be to inhibit the proton transfer from Asp 70 to His 124, and thus to slow the overall kcat for the reaction. We are currently testing this mechanism through mutagenesis and structural studies of E. coli RNase HI in Mn2+.

Acknowledgments: We thank Mark Rabenstein and Yeon-kyun Shin for assistance with the EPR measurements. This work was supported by a grant from the N.I.H. (GM53321).
References
1. Hostomsky, Z., Hostomska, Z. and Matthews, D. A. (1993) Ribonucleases H, in Nucleases (ed. Linn, S. M., Lloyd, R. S. and Roberts, R. J.) 2nd Ed., pp. 341-76, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
2. Nakamura, H., Oda, Y., Iwai, S., Inoue, H., Ohtsuka, E., Kanaya, S., Kimura, S., Katsuda, C., Katayanagi, K., Morikawa, K., Miyashiro, H. and Ikehara, M. (1991) Proc. Natl. Acad. Sci. U.S.A. 88, 11535-9
3. Oda, Y., Yoshida, M. and Kanaya, S. (1993) J. Biol. Chem. 268, 88-92
4. Suck, D. and Oefner, C. (1986) Nature 321, 620-5
5. Davies, J. F., Hostomska, Z., Hostomsky, Z., Jordan, S. and Matthews, D. A. (1991) Science 252, 88-95
6. Beese, L. and Steitz, T. A. (1991) EMBO J. 10, 25-33
7. Derbyshire, V., Grindley, N. D. F. and Joyce, C. M. (1991) EMBO J. 10, 17-24
8. Freemont, P. S., Friedman, J. M., Beese, L. S., Sanderson, M. R. and Steitz, T. A. (1988) Proc. Natl. Acad. Sci. U.S.A. 85, 8924-8
9. Katayanagi, K., Okumura, M. and Morikawa, K. (1993) Proteins 17, 337-46
10. Huang, H. W. and Cowan, J. A. (1994) Eur. J. Biochem. 219, 253-60
11. Oda, Y., Nakamura, H., Kanaya, S. and Ikehara, M. (1991) J. Biomol. NMR 1, 247-55
12. Kanaya, S., Kohara, Y., Miura, Y., Sekiguchi, A., Iwai, S., Inoue, H., Ohtsuka, E. and Ikehara, M. (1990) J. Biol. Chem. 265, 4615-21
13. Skalka, A.-M. and Goff, S. P. (eds) (1993) Reverse Transcriptase, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
14. Tisdale, M., Schulze, T., Larder, B. A. and Moelling, K. (1991) J. Gen. Virol. 72, 59-66
15. Yang, W. and Steitz, T. A. (1995) Structure 3, 131-4
16. Venclovas, C. and Siksnys, V. (1995) Nature Struct. Biol. 2, 838-41
17. Rice, P., Craigie, R. and Davies, D. R. (1996) Curr. Opin. Struct. Biol. 6, 76-83
18. Katayanagi, K., Miyagawa, M., Matsushima, M., Ishikawa, M., Kanaya, S., Ikehara, M., Matsuzaki, T. and Morikawa, K. (1990) Nature 347, 306-9
19. Yang, W., Hendrickson, W. A., Crouch, R. J. and Satow, Y. (1990) Science 249, 1398-405
20. Ariyoshi, M., Vassylyev, D. G., Iwasaki, H., Nakamura, H., Shinagawa, H. and Morikawa, K. (1994) Cell 78, 1063-72
21. Dyda, F., Hickman, A. B., Jenkins, T. M., Engelman, A., Craigie, R. and Davies, D. R. (1994) Science 266, 1981-6
22. Bujacz, G., Jaskolski, M., Alexandratos, J. and Wlodawer, A. (1995) J. Mol. Biol. 253, 333-46
23. Rice, P. and Mizuuchi, K. (1995) Cell 82, 209-20
24. Dabora, J. M. and Marqusee, S. (1994) Protein Sci. 3, 1401-8
25. Keck, J. L. and Marqusee, S. (1995) Proc. Natl. Acad. Sci. U.S.A. 92, 2740-4
26. Scatchard, G. (1949) Ann. N. Y. Acad. Sci. 51, 660-72
27. Keck, J. L. and Marqusee, S. (1996) J. Biol. Chem. 271, 19883-7
28. Black, C. B. and Cowan, J. A. (1994) Inorg. Chem. 33, 5805-8
29. Cowan, J. A. (1991) J. Am. Chem. Soc. 113, 6025-32
30. Black, C. B. and Cowan, J. A. (1994) J. Am. Chem. Soc. 116, 1174-8
Crystal structure of avian sarcoma virus integrase with bound essential cations

Jerry Alexandratos, Grzegorz Bujacz, Mariusz Jaskolski and Alexander Wlodawer
Macromolecular Structure Laboratory, NCI-Frederick Cancer Research and Development Center, ABL-Basic Research Program, Frederick, Maryland
Faculty of Food Chemistry and Biotechnology, Technical University of Lodz, Lodz, Poland
Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland

George Merkel, Richard A. Katz and Anna Marie Skalka
Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania
I. Introduction

Retroviral integrase (IN) is a virus-encoded enzyme that catalyzes nonspecific insertion of viral DNA into multiple sites on host DNA (1-3). Since DNA integration is an essential step in the retroviral replication cycle, this enzyme is an attractive target for inhibition of human immunodeficiency virus (HIV), the causative agent of acquired immunodeficiency syndrome (AIDS). Work over the last several years has resulted in a general understanding of the enzymatic mechanism, but more detailed analyses have been hampered by the lack of precise structural information. The situation changed when the crystal structures of the catalytic domains of both HIV-1 IN (4) and avian sarcoma virus (ASV) IN (5,6) became available. Precise data on the interaction of these enzymes with their essential ligands are necessary for understanding the structural basis of the reaction mechanism and for guiding rational drug design.

Members of the structurally related superfamily of enzymes that includes RNase H, RuvC resolvase, MuA transposase, and retroviral integrase contain at least three acidic residues in the active site and require divalent cations, such as Mg2+ or Mn2+, for their enzymatic activity. However, the precise placement of cations is reported in the X-ray crystal structures of only two of these proteins, E. coli RNase H and HIV-1 RNase H. Details of the location of metal ions in the active site of retroviral integrases can enhance our understanding of the catalytic mechanism of these enzymes and their relationship to that of other members of the superfamily. We present the structure of the ASV IN catalytic domain with the essential cations Mg2+ or Mn2+ bound in the active site. In addition, we present the structure of an inactive complex of the catalytic domain of ASV IN with Zn2+.

TECHNIQUES IN PROTEIN CHEMISTRY VIII Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved.
II. Methods

The expression strategy, purification from E. coli, and activities of the purified ASV IN 52-207 fragment have been described previously (7). ASV IN protein crystals were produced from 20% polyethylene glycol (PEG) solution by the hanging drop vapor diffusion method, also described previously (5). All conditions yielded tetragonal crystals with approximate cell dimensions of a = b = 66 Å, c = 81 Å, space group P4₃2₁2, with one molecule in the asymmetric unit. Crystals were soaked in metal chloride solutions in a synthetic mother liquor for at least 3 days each. Soaks in 10 mM MnCl2, 100 and 500 mM MgCl2, and 100 mM ZnCl2 produced complete occupancy, whereas the occupancy was only partial in 20 mM MgCl2. X-ray diffraction data were collected at room temperature on a MAR 300 mm image plate detector, using a Rigaku RU200 rotating anode operated at 50 kV and 100 mA (Table I). Data were processed with DENZO and scaled with SCALEPACK (8). Variations in the unit cell parameters for different crystals were less than 1.25 Å, even between metal complexes and low temperature native structures. Electron density maps, calculated with the program PROTEIN (9), were interpreted using FRODO (10). The model underwent multiple cycles of restrained structure-factor least-squares refinement using PROLSQ (11). In addition to protein atoms and water molecules, the final models also include a well-ordered HEPES molecule that cocrystallized with the native protein. At least 60 water molecules were added to each structure during refinement.
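As a worked illustration of the crystal parameters quoted above, the following Python sketch estimates the unit cell volume, the Matthews coefficient and the approximate solvent content for a tetragonal cell with one molecule per asymmetric unit. It is not part of the original analysis. The molecular weight is an assumption (156 residues for the 52-207 fragment at a nominal 110 Da per residue), the solvent estimate uses the common approximation 1 - 1.23/Vm, and the function names are hypothetical.

```python
"""Sketch: unit cell volume and Matthews coefficient for the ASV IN crystals."""

ASU_PER_CELL_P43212 = 8        # general positions in space group P4(3)2(1)2
MOLECULES_PER_ASU = 1          # one molecule per asymmetric unit (from the text)
RESIDUES = 207 - 52 + 1        # fragment 52-207, i.e. 156 residues
ASSUMED_DA_PER_RESIDUE = 110.0 # assumption: nominal average residue mass


def tetragonal_cell_volume(a_angstrom, c_angstrom):
    """Volume of a tetragonal cell (a = b, all angles 90 degrees) in cubic angstroms."""
    return a_angstrom * a_angstrom * c_angstrom


def matthews_coefficient(cell_volume, asu_per_cell, molecules_per_asu, mw_da):
    """Crystal volume per unit of protein mass, in cubic angstroms per dalton."""
    return cell_volume / (asu_per_cell * molecules_per_asu * mw_da)


def approximate_solvent_fraction(vm):
    """Common approximation for the solvent fraction from the Matthews coefficient."""
    return 1.0 - 1.23 / vm


if __name__ == "__main__":
    volume = tetragonal_cell_volume(66.0, 81.0)
    mw = RESIDUES * ASSUMED_DA_PER_RESIDUE
    vm = matthews_coefficient(volume, ASU_PER_CELL_P43212, MOLECULES_PER_ASU, mw)
    print(f"cell volume      : {volume:.0f} A^3")
    print(f"assumed mass     : {mw:.0f} Da")
    print(f"Matthews coeff.  : {vm:.2f} A^3/Da")
    print(f"solvent fraction : {approximate_solvent_fraction(vm):.2f}")
```

With the assumed molecular weight, the result falls within the range typically seen for protein crystals, which is consistent with one molecule in the asymmetric unit. The exact value depends on the true mass of the fragment.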
III. Results and Discussion

Table I. Summary of data collection and refinement

Protein                            Cell dimensions (Å)       Resolution (Å)  R-factor  R-free  R.m.s. deviation from Mg structure
Low temperature selenomethionine   a,b = 65.40, c = 80.41    6.0-1.70        0.139     0.208   0.27
Mg (500 mM)                        a,b = 66.05, c = 81.65    8.0-1.75        0.150     0.191   -
Mn (10 mM)                         a,b = 66.24, c = 81.60    8.0-2.05        0.130     0.189   0.14
Zn (100 mM)                        a,b = 66.08, c = 80.96    10.0-1.95       0.176     -       0.17

The central, catalytic domain of ASV IN is a five-stranded mixed β-sheet flanked by five α-helices. The active site is characterized by the presence of the D,D(35)E motif of three carboxylate-containing amino acids, the last two of which
are separated by 35 residues. The two aspartate residues are located on strand β1 and at the end of strand β4, with these strands being part of the stable β-sheet core of the protein. There is a long 10-residue loop between strand β5 and helix α4, which has higher B-factors and slightly different conformations in the different structures. This loop extends out of the compact shape of the molecule and appears quite flexible. The third catalytic residue appears near the end of helix α4, at one end of this flexible loop.

The structures of ASV IN complexed with divalent cations (Mn2+, Mg2+, and Zn2+) were solved at resolutions of 1.70-2.05 Å. This enzyme is active in the presence of either Mn2+ or Mg2+, with the activity higher in the former than in the latter, and is inactive in the presence of Zn2+. After refinement, the structures of the Mn2+ and Mg2+ complexes were nearly identical in their overall architecture and in the metal-binding scheme. A single ion of either metal interacts with the aspartate side chains of the D,D(35)E catalytic center and uses four water molecules to complete its octahedral coordination (Fig. 1a). The metal-ligand distances are within a 2.1-2.4 Å range for the Mn2+ ion and 2.05-2.15 Å for the Mg2+ ion. Glu-157 does not take part in binding of these cations.

Only small adjustments take place in the active site of ASV IN upon binding of a metal cofactor. The Asp-64 carboxylate shifts less than 0.4 Å compared with the uncomplexed enzyme (5). A slight rotation of the Asp-121 carboxylate shifts the oxygen atom about 1 Å. Only the non-metal-binding side chain of Glu-157 in the Mn2+ structure moves a greater distance. This result agrees with previous studies, which show that even conservative mutations to these residues abolish protein activity. The soaking experiments were performed with modest metal salt concentrations.
The 10 mM Mn2+ concentration was only three times above that used in activity assays, and the 100 mM Mg2+ concentration was similarly proportional to the concentration used in in vitro activity assays. We are certain that under the conditions of these soaking experiments only one of these catalytic metal cations occupies the active site at any one time. Because the final Mn2+ Fo-Fc map showed a peak height of 6.5 σ above background, we are very confident of its location. Metal binding with even one third of this occupancy would have been easily visible. Even the structure obtained using an extremely high 500 mM Mg2+ concentration did not indicate a second metal-binding site. It is not known at this time whether one or two metals are required for the integration reaction to proceed, although the detailed modeling of reactions catalyzed by nucleotidyl transferases appears to require two metal ions (12).

Since we could find only one divalent cation in the complex, we decided to examine whether other divalent cations could be bound with different stoichiometry. Even though it is known that Zn2+ does not activate integrases, we used this cation because zinc chemistry is very similar to that of Mg (13). Unexpectedly, we observed a structure with two Zn2+ ions coordinated by all three active site residues, and with one water ligand interacting with each metal ion (Fig. 1b). As in the other structures, we observed only minimal conformational changes for both aspartate side chains. One of the Zn2+ ions appears in essentially the same location as the Mn2+ and Mg2+ ions, only 0.36 Å from the position of the former and 0.29 Å from the latter, coordinating with Asp-64 and Asp-121. The coordination of the second active site metal involves the other carboxylate oxygen
Figure 1. Stereo views of active sites of ASV IN complexed with metals. 1a. Electron density map of Mg2+ coordinated with four water molecules. Two active site carboxylate oxygens and four waters create the octahedral coordination for the metal cation. 1b. Electron density map of two Zn2+ ions with two coordinating water molecules. Four active site carboxylate oxygens and two water molecules coordinate the metal cations.

Figure 2. Ribbon diagram of the ASV IN catalytic domain, with explicitly shown active site side chains for both Zn2+ (black) and Mn2+ (grey) complexes. The metal locations and corresponding coordinated water molecules are indicated in the same colors.
of Asp-64, as well as Glu-157. Not surprisingly, the side chain of Glu-157 was observed to rotate, as this residue did not previously point into the active site. The fact that only a side chain rotation, with no backbone displacement, was needed for the second cation to bind explains why even conservative mutations of these active site residues inactivate IN completely. The coordination also seems to extend between the metal ions themselves, since the distance between the Zn2+ ions is only 3.5 Å. Both Zn2+ ions are coplanar with the two carboxylate oxygens from Asp-121 and Glu-157 and the liganded waters. Each Zn2+ is located in the center of a triangle formed by coordination with a carbonyl oxygen, a water molecule, and the other cation, with Asp-64 coordinating both Zn2+ ions from below this plane. The distance between any oxygen ligand and the metal ion is within 2.1-2.4 Å. The number of waters coordinating the metal bound by IN may be crucial for catalytic activity. If one water is replaced by an incoming DNA phosphate ligand, then this active site-Zn2+ arrangement will not have a water molecule available for the hydrolysis reaction (14). Another Zn2+ ion was found bound in a distant part of the structure, with His-103 and three additional water molecules forming a more typical tetrahedral coordination.

Binding of the Mg2+, Mn2+, and Zn2+ ions does not lead to significant structural modifications in the active site or the overall protein architecture when compared with native ASV IN (Table I). This result indicates that metal-binding sites are preformed in this IN structure. The observed configuration of the D,D(35)E residues may represent a catalytically competent active site (Fig. 2). When one divalent cation is bound, the active site side chains remain in the positions seen in the PEG native structures. Binding of two Zn2+ cations also causes no change to the IN backbone.
The side chain of Glu-157 rotates with respect to the conformation seen in the protein complexed with the other metals. However, this is completely consistent with the side chain conformation seen in the native structure of IN crystals grown from ammonium sulfate. These minor differences between the active sites of ASV IN with different cofactors seem to reflect a tendency for structural flexibility in the active site of integrases. The two Zn2+ ions are observed at a concentration similar to that at which one Mg2+ ion is observed, with no overall changes to the protein, indicating a possible mode of binding for Mg2+ under other conditions. This observation supports the hypothesis that a second metal-binding site exists for Mg2+/Mn2+, but forms only in the presence of substrate and/or other domains of the protein.
A. Comparison with related enzymes

Although many enzymes that are active in the processing of nucleic acids, such as nucleases, DNA polymerases, or reverse transcriptases, have acidic residues in the active sites and require divalent cations for activity, such cations have been reported for only a few published structures. One published structure of reverse transcriptase from Moloney murine leukemia virus (MMLV RT) shows a single metal bound in the active site (15), whereas none of the available structures of HIV-1 RT show bound metals. In addition, the structures of MMLV RT and E. coli RNase H with bound metals have been solved at a lower resolution than the
same proteins without metals. As our data show, the quality of the metal-containing ASV IN structures is as good as that of the apoenzyme. Since the electron density maps are of excellent quality, we can describe the active site with high accuracy. We have compared the structure of the ASV IN active site with the active sites of the other members of this superfamily for which metal complexes have been described or inferred, namely both HIV-1 and E. coli RNases H, and E. coli RuvC resolvase. As reported by Yang and Steitz (16), the similarity of the cluster of acidic residues forming the active sites of the RNase H enzymes is striking. With alignment based on conserved secondary structure elements, we have found that the best agreement in the active site of these enzymes is with ASV IN Asp-64. The placement and direction of the analogous carboxylates are very similar in RNases H and in ASV IN (Fig. 3). ASV IN Asp-121 is close to its equivalent in HIV-1 RNase H (17) and E. coli RNase H (18), whereas the side chain of the equivalent residue in E. coli RuvC resolvase (19) is more distant (not shown). The third residue of the cluster, ASV IN Glu-157, is also in quite good agreement among these enzymes. The other acidic residues in this region do not have counterparts in ASV IN. Similar to what we have observed for ASV IN, the residues in the active site of E. coli RNase H move only slightly upon binding of the metal, with Cα atoms shifting less than 0.4 Å when structures without (20) and with (18) divalent cations are compared; the side chain acidic groups shift no more than 1.5 Å. Interestingly, the two residues that coordinate the Mg2+ ion move less than the other carboxylates, indicating that the part of the active site directly coordinating the metal ion has an invariant character. A similar case is noted with a comparison of the metal-bound and unbound active sites of MMLV RT (18,20) and HIV-1 RT (21), with less than a 1.5 Å r.m.s.
deviation among the three active site residues. The sole Mg2+ ion reported for E. coli RNase H (22) is complexed by Asp-10 (equivalent to ASV IN Asp-64) and by Glu-48 (no ASV IN equivalent). Although no precise data on the location of a divalent cation are available for RuvC resolvase, a Mn2+-binding site apparently exists between Asp-7 (equivalent to ASV IN Asp-64) and Asp-141 (equivalent to ASV IN Glu-157) (19).

Figure 3. Comparison of the active sites of the HIV-1 RNase H-Mn2+ complex (16) (black) with the ASV IN-Zn2+ complex (grey) shows the excellent alignment of catalytic residues and metal cations.
Structural alignment of the crystallographically determined structures of ASV IN complexed with Zn2+ and of the HIV-1 RNase H complexed with Mn2+ was carried out using ALIGN (23). The general architecture of the ASV IN and RNase H monomers is significantly different, but it is possible to superimpose structurally conserved regions, three α-helices and one β-strand, which contain the active site and nearby structurally important residues. This alignment reveals a surprisingly good superposition of the catalytic residues, with an r.m.s. deviation of 1.3 Å for the 28 atom pairs. The positions of the two Zn2+ ions in ASV IN are very close to those of the two Mn2+ ions in RNase H (17), which are directly coordinated by the carboxylates of Asp-43 (equivalent to Asp-64 in ASV IN) and Asp-98 (equivalent to Asp-121 in ASV IN), and between Asp-43 and Asp-149 (equivalent to Glu-157 in ASV IN). The distances between the two pairs of cations are 0.38 Å for the Zn2+ ion bound between the two aspartates and 0.48 Å for the other Zn2+ ion, less than the r.m.s. deviations between the protein atoms (Fig. 3).

Although the three most highly conserved acidic residues are present in similar locations in all of these enzymes, the exact relationships between them are not strictly preserved. However, for the two (quite divergent) RNase H enzymes, the maximum differences in the positions of the carboxylates do not exceed 1.5 Å, despite some disorder reported in the vicinity of the active site of the isolated RNase H domain from HIV-1 RT (17). The minimal influence of the presence of the divalent cation on the disposition of residues in the active site of RNase H and ASV IN is mirrored in MMLV RT, where the differences observed in the positions of the three critical aspartates in the active site are not larger than 0.4 Å when metal-bound and unbound structures are compared.
In the case of MMLV RT, however, the quality of the difference Fourier map is not sufficient to determine the details of the coordination of the metal. The similarities between the active sites of MMLV RT and ASV IN include the interaction of only two of the three carboxylates with a single Mn2+ ion present in the active site.

We have presented here the structure of the catalytic domain of ASV IN complexed with three different divalent cations. These results clearly show that the active site of this enzyme is preformed, in that only relatively small movements of side chains and no shifts of the main chain are needed in order to provide an environment suitable for cation binding. This is in contrast with the related core HIV-1 IN (4), in which no binding of the divalent cations could be shown by crystallographic means, and in which the constellation of active site residues differs significantly from their counterparts in ASV IN. However, antibody-binding experiments have shown that HIV-1 IN undergoes a conformational change when incubated with divalent cation cofactors (Asante-Appiah, E. and Skalka, A. M., personal communication). These results are consistent with the notion that the activity can be modulated by transitional order-disorder phenomena involving the active site, and that such conformational changes are different for enzymes obtained from different sources. Although only a single divalent cation was observed upon soaking ASV IN in Mn2+ and Mg2+, the unexpected observation of two Zn2+ ions binding to the active site of ASV IN could provide indirect support for the hypothesis postulating the utilization of two cations for catalytic activity.
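The r.m.s. deviations and ion-to-ion distances quoted in the comparison above are simple functions of paired atomic coordinates once two structures have been superimposed. The following Python sketch shows the arithmetic only. It assumes the coordinates are already in a common frame (the superposition step performed with ALIGN in the text is not reproduced), and the coordinates used are hypothetical values chosen for illustration, not data from either structure.

```python
"""Sketch: r.m.s. deviation and pairwise distances for already-superimposed atoms."""
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation over equivalent atom pairs (same order, same frame)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Hypothetical coordinates in angstroms, for illustration only.
    protein_a = [(10.0, 4.0, 7.5), (11.2, 5.1, 8.0), (9.4, 6.3, 6.9)]
    protein_b = [(10.6, 4.5, 7.9), (11.9, 4.6, 8.8), (8.7, 6.9, 6.1)]
    ion_a = (12.00, 7.00, 9.00)
    ion_b = (12.20, 7.30, 9.10)

    print(f"r.m.s. deviation over {len(protein_a)} atom pairs: "
          f"{rmsd(protein_a, protein_b):.2f} A")
    print(f"ion-ion distance after superposition: {math.dist(ion_a, ion_b):.2f} A")
```

A comparison of this kind makes the statement in the text concrete: the metal ions are considered well aligned when their separation after superposition is smaller than the r.m.s. deviation of the protein atoms used for the alignment.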
Acknowledgements

Research sponsored in part by the National Cancer Institute, DHHS, under contract with ABL. Other support includes National Institutes of Health grants CA47486 and CA06927, a grant for infectious disease research from Bristol-Myers Squibb Foundation, and an appropriation from the Commonwealth of Pennsylvania. The contents of this publication do not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.
References

1. Katz, R. A., and Skalka, A. M. (1994). The retroviral enzymes. Annu. Rev. Biochem. 63, 133-173.
2. Goff, S. P. (1992). Genetics of retroviral integration. Annu. Rev. Genet. 26, 527-544.
3. Vink, C., Groeneger, O. A. M., and Plasterk, R. H. (1993). Identification of the catalytic and DNA-binding region of the human immunodeficiency virus type I integrase protein. Nucleic Acids Res. 21, 1419-1425.
4. Dyda, F., Hickman, A. B., Jenkins, T. M., Engelman, A., Craigie, R., and Davies, D. R. (1994). Crystal structure of the catalytic domain of HIV-1 integrase: Similarity to other polynucleotidyl transferases. Science 266, 1981-1986.
5. Bujacz, G., Jaskolski, M., Alexandratos, J., Wlodawer, A., Merkel, G., Katz, R. A., and Skalka, A. M. (1995). High resolution structure of the catalytic domain of the avian sarcoma virus integrase. J. Mol. Biol. 253, 333-346.
6. Bujacz, G., Jaskolski, M., Alexandratos, J., Wlodawer, A., Merkel, G., Katz, R. A., and Skalka, A. M. (1996). The catalytic domain of avian sarcoma virus integrase: conformation of the active-site residues in the presence of divalent cations. Structure 4, 89-96.
7. Kulkosky, J., Katz, R. A., Merkel, G., and Skalka, A. M. (1995). Activities and substrate specificity of the evolutionarily conserved central domain of retroviral integrase. Virology 206, 448-456.
8. Otwinowski, Z. (1992). An Oscillation Data Processing Suite for Macromolecular Crystallography, Yale University, New Haven.
9. Sheriff, S. (1987). Addition of symmetry-related contact restraints to PROTIN and PROLSQ. J. Appl. Crystallogr. 20, 55-57.
10. Jones, T. A. (1985). Interactive computer graphics: FRODO. Methods Enzymol. 115, 157-171.
11. Hendrickson, W. A. (1985). Stereochemically restrained refinement of macromolecular structures. Methods Enzymol. 115, 252-270.
12. Steitz, T. A. (1993). DNA- and RNA-dependent DNA polymerases. Curr. Opin. Struct. Biol. 3, 31-38.
13. Cotton, F., and Wilkinson, K. (1988). Advanced Inorganic Chemistry (5th edition, Wiley-Interscience).
14. Beese, L. S., and Steitz, T. A. (1991). Structural basis for the 3'-5' exonuclease activity of Escherichia coli DNA polymerase I: a two metal ion mechanism. EMBO J. 10, 25-33.
15. Georgiadis, M. M., Jessen, S. M., Ogata, C. M., Telesnitsky, A., Goff, S. P., and Hendrickson, W. A. (1995). Mechanistic implications from the structure of a catalytic fragment of Moloney murine leukemia virus reverse transcriptase. Structure 3, 879-892.
16. Yang, W., and Steitz, T. A. (1995). Recombining the structures of HIV integrase, RuvC and RNase H. Structure 3, 131-134.
17. Davies, J. F., II, Hostomska, Z., Hostomsky, Z., Jordan, S. R., and Matthews, D. A. (1991). Crystal structure of the ribonuclease H domain of HIV-1 reverse transcriptase. Science 252, 88-95.
18. Yang, W., Hendrickson, W. A., Crouch, R. J., and Satow, Y. (1990). Structure of ribonuclease H phased at 2 Å resolution by MAD analysis of the selenomethionyl protein. Science 249, 1398-1405.
19. Ariyoshi, M., Vassylyev, D. G., Iwasaki, H., Nakamura, H., Shinagawa, H., and Morikawa, K. (1994). Atomic structure of the RuvC resolvase: A Holliday junction-specific endonuclease from E. coli. Cell 78, 1063-1072.
20. Katayanagi, K., Miyagawa, M., Matsushima, M., Ishikawa, M., Kanaya, S., Nakamura, H., Ikehara, M., Matsuzaki, T., and Morikawa, K. (1992). Structural details of ribonuclease H from Escherichia coli as refined to an atomic resolution. J. Mol. Biol. 223, 1029-1052.
21. Unge, T., Knight, S., Bhikhabhai, R., Lovgren, S., Dauter, Z., Wilson, K., and Strandberg, B. (1994). 2.2 Å resolution structure of the amino-terminal half of HIV-1 reverse transcriptase (fingers and palm subdomains). Structure 2, 953-961.
22. Katayanagi, K., Okumura, M., and Morikawa, K. (1993). Crystal structure of Escherichia coli RNase HI in complex with Mg2+ at 2.8 Å resolution: proof for a single Mg2+-binding site. Proteins 17, 337-346.
23. Satow, Y., Cohen, G. H., Padlan, E. A., and Davies, D. R. (1986). Phosphocholine binding immunoglobulin Fab McPC603: An X-ray diffraction study at 2.7 Å. J. Mol. Biol. 190, 593-604.
Multidimensional NMR Studies of an Exchangeable Apolipoprotein and Its Interactions with Lipids
Jianjun Wang, Daisy Sahoo, Dean Schieve, Stephane M. Gagne§, Brian D. Sykes§ and Robert O. Ryan

Lipid and Lipoprotein Research Group and §Protein Engineering Network Centres of Excellence, Department of Biochemistry, University of Alberta, Edmonton, Alberta, Canada T6G 2S2
I. INTRODUCTION

Exchangeable apolipoproteins are a class of functionally important proteins which play a key role in plasma lipoprotein metabolism. In this capacity they have been associated with several human disorders, including hyperlipidemia and cardiovascular disease (1,2). Apolipophorin-III (apoLp-III) is a model exchangeable apolipoprotein derived from the insect Manduca sexta (166 residues, Mr 18,380). ApoLp-III is a major hemolymph protein in the adult life stage and functions in lipid transport during sustained flight (3,4). Biophysical studies demonstrate that apoLp-III is a soluble monomeric protein at concentrations of 15 mg/ml (5). While the tertiary structure of M. sexta apoLp-III has not been solved, X-ray crystallography of apoLp-III from Locusta migratoria reveals a globular structure comprised of a bundle of five elongated amphipathic α-helices which are connected by short loops (6). A similar molecular architecture was also found for the 22 kDa N-terminal fragment of human apolipoprotein E (7). The crystal structure of L. migratoria apoLp-III was obtained for the protein in its lipid-free state. The lipid-bound structure of apoLp-III, however, is more interesting since it represents the active form of the protein. To date, no detailed structures of exchangeable apolipoproteins in complex with lipid have been reported. The crystal structure of lipid-free apoLp-III demonstrated that the five amphipathic helices orient in such a way that their hydrophobic faces are directed toward each other to form a hydrophobic core while the hydrophilic faces of the helices are exposed to solvent. It has been hypothesized that, upon binding to a
lipid surface, the protein undergoes a major conformational change that results in opening of the helix bundle, with exposure of the hydrophobic surfaces of the helices, which contact the lipid (6). This putative conformational change is depicted in Figure 1.

Figure 1. Open conformation model of exchangeable apolipoproteins upon lipid binding.
Since M. sexta apoLp-III is a well-behaved member of the exchangeable apolipoprotein family in terms of its physico-chemical properties, it represents a good candidate for investigation of the molecular details of the exchangeable apolipoproteins associated with lipid binding. A study of this system may reveal a general mechanism for exchangeable apolipoprotein-lipid interaction. In a manner similar to all exchangeable apolipoproteins, apoLp-III resists crystallization when complexed with lipid. Thus, NMR is the only potentially useful high resolution technique to investigate structural changes of apoLp-III which accompany lipid binding. To date, no NMR structures of exchangeable apolipoproteins have been reported. In order to carry out 3D and/or 4D-NMR experiments, however, the protein of interest must be 15N and/or 13C isotope labeled. Isotope labeling strategies require an efficient bacterial expression system, and such a system has recently been developed in this laboratory for apoLp-III from M. sexta (8). The present study describes the results of labeling experiments and presents a useful method to incorporate 15N specifically and exclusively into peptide backbone amide nitrogens. 3D-NMR experiments have been performed on isotopically labeled apoLp-III and nearly complete assignment has been achieved. In addition,
NMR Studies of an Exchangeable Apolipoprotein
429
preliminary experiments provide direct experimental evidence in support of a significant conformational change in apoLp-III upon interaction with lipid. II. METHODS Materials. 1^NH4C1, ^^N-Leucine, ^^N-Glycine, ^^N-Valine and ^^N-Lysine were purchased from Cambridge Isotope Laboratories (Andover, MA). i^C-C^-Glucose and ^H-dodecylphosphocholine were obtained from Isotec. Inc. (Miamisburg, Ohio). Unlabeled amino acids were obtained from Sigma Chemical Co. (St. Louis, MO) Bacterial expression and isotope-labeling of recombinant apoLp-III. The coding sequence of M. sexta apoLp-III was cloned into the pET expression vector (Novagen Corp., Madison, WI) directly downstream from the pelB leader sequence cleavage site. Introduction of the plasmid vector into E. coli BL21(DE3) permits high level expression upon induction with 1 mM isopropyl 6-D thiogalactopyranoside (IPTG). Significant amounts of recombinant apoLp-III were secreted into the culture medium during expression and protein was isolated from the culture supernatant following five hours incubation at 30°C. Typically, a one liter cell culture produces 150 - 200 mg of pure apoLp-III (8). The purification procedure was essentially as described by Ryan et al. (8). Isotope-labeling of apoLp-III used M9 minimal media with l^NILtCl for ^^N-uniform labeling; 15NH4Cl/^^C6-glucose for l^N/l^C uniform labeling; I5]sj4eucine/1^NH4C1 for ^^N specific backbone nitrogen labeling; ^^N-amino acid of interest/19 other ^^N-amino acids for specific amino acid ^^N-labeling. NMR Spectroscopy. NMR experiments were carried out at 30 °C on a Varian Unity 600 spectrometer equipped with three channels, a pulse-field gradient triple resonance probe with an actively shielded z gradient and a gradient amplifier unit. NMR sample concentrations ranged from 0.5 to 1.1 mM, pH. 6.5±0.1, with 250 mM phosphate buffer and 0.5 mM NaN3. 2D ^H-^^N HSQC spectra were recorded using the enhanced sensitivity mode with 8 - 3 2 transients (9). 
Triple resonance HNCACB (10) and CBCA(CO)NNH (10) 3D-NMR spectra, recorded on a uniformly ¹⁵N/¹³C-labeled sample in H2O with 8-16 transients, correlate backbone amide protons of residue i with CA and CB atoms of residue i (HNCACB) and i-1 (HNCACB, CBCA(CO)NNH) for the backbone sequential assignment. ¹⁵N-edited NOESY (9) and ¹⁵N-edited TOCSY (9) were also acquired on a uniformly ¹⁵N-labeled sample in H2O with 8-12 transients, for both backbone and sidechain assignments. A mixing time of 150 ms was used for ¹⁵N-edited NOESY experiments and a mixing time of 59 ms was used for ¹⁵N-edited TOCSY experiments. Pulse field gradient HCCH-TOCSY (11) and simultaneous ¹⁵N- and ¹³C-edited NOESY (12) were acquired for the sidechain assignment. A mixing time of 100 ms was used for simultaneous ¹⁵N- and ¹³C-edited NOESY. Titrations of ²H-dodecylphosphocholine (DPC) into specific amino acid ¹⁵N-labeled apoLp-III samples were monitored by 2D ¹H-¹⁵N HSQC spectra at pH 6.9-7.0 in order to investigate structural changes induced by lipid binding.

Electrospray ionization mass spectrometry. Molecular weight determinations for control and isotope enriched apoLp-IIIs were made using a VG Quattro electrospray mass spectrometer (Fisons Instruments, Manchester, UK). Molecular weights were determined as the mean value calculated for several multiply charged ions within a coherent series. The instrument was calibrated using the series of ion peaks from horse heart myoglobin with a molecular mass of 16,951 daltons. Calculated masses were derived from the amino acid sequence using the program MacPro Mass (Terry Lee, City of Hope, Duarte, CA).

III. RESULTS and DISCUSSION

¹⁵N Isotope-labeling Strategies

In order to pursue heteronuclear multidimensional NMR experiments, a bacterial system for expression of apoLp-III has been developed which allows facile production of 150-200 mg/L ¹⁵N-labeled apoLp-III or 100-125 mg/L ¹⁵N/¹³C-double labeled apoLp-III. Figure 2, panel A shows the ¹H-¹⁵N HSQC spectrum of a 1.0 mM solution of lipid-free, uniformly ¹⁵N-labeled apoLp-III. Panel A also indicates that, although the chemical shift dispersion in the ¹H dimension is rather small (6.5 ppm to 9.5 ppm), it is generally upfield shifted, consistent with the fact that the protein secondary structure is predominantly α-helix (13). The chemical shifts in the ¹⁵N dimension are well dispersed, which results in good separation of the overall crosspeaks. However, certain regions in the spectrum are still crowded, as shown in Figure 2.
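The charge-series averaging described under "Electrospray ionization mass spectrometry" can be illustrated with a minimal sketch. It assumes positive-ion peaks of the form [M + zH]z+ with consecutive charge states and no gaps; the function name and the synthetic peak list are hypothetical and are not taken from the instrument software used in the study.

```python
PROTON = 1.007276  # mass of a proton in Da

def mass_from_series(mz_peaks):
    """Mean neutral mass (Da) from a coherent series of [M+zH]z+ peaks.

    Assumes the peaks are consecutive charge states with none missing.
    """
    peaks = sorted(mz_peaks, reverse=True)  # highest m/z has the lowest charge
    # Charge of the highest-m/z peak, from the spacing of the first two peaks.
    z0 = round((peaks[1] - PROTON) / (peaks[0] - peaks[1]))
    # One mass estimate per peak, then the mean over the series.
    masses = [(z0 + k) * (mz - PROTON) for k, mz in enumerate(peaks)]
    return sum(masses) / len(masses)

# Synthetic series built from the myoglobin calibration mass quoted in the text.
true_mass = 16951.0
synthetic = [(true_mass + z * PROTON) / z for z in range(10, 15)]
estimate = mass_from_series(synthetic)
print(round(estimate, 3))
```

Averaging over several charge states reduces the influence of a small error in any single peak position, which is presumably why the mean over the series is reported.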
The upper right corner of the HSQC spectrum of ¹⁵N-labeled apoLp-III shown in Figure 2, panel A contains numerous doublet crosspeaks which are derived from side-chain amide groups of glutamine and asparagine residues. Since apoLp-III is rich in glutamine and asparagine (25 of the 166 amino acids), this region of the spectrum is
NMR Methods for Analysis of CRALBP Retinoid Binding

Figure 3. GESE-HSQC Experiment. [Pulse sequence diagram: ¹H, ¹⁵N and ¹³C channels, WALTZ-16 decoupling, gradient pulses.] Coherence transfer selection between ¹H nuclei and their directly bonded ¹⁵N nuclei was achieved using the GESE-HSQC experiment shown above. In the depicted pulse sequence, 90 degree pulses are represented by wide lines, simple 180 degree pulses by black rectangles and composite inversion pulses by cross-hatched rectangles. Gz pulses represent pulsed field gradient pulses. The water-selective 90 degree flipback pulse is labeled with phase φ2 and had a duration of 1.7 msec. Phase cycle elements are: φ1=2(x), 2(-x); φ3=x, -x; all other phases are x unless otherwise indicated. Additional acquisition parameters are: t1=2.65 msec; t2=5.56 msec, z=1.2 msec, sw(¹H)=11001 Hz, sw(¹⁵N)=5000 Hz, γB2 (¹⁵N decoupling field strength)=1.27 kHz with GARP1, t2=93 msec. Pulsed field gradient parameters were G3=26 G/cm, tG3=5 μsec; G6=25.76 G/cm, tG6=0.5 msec; G1=1 G/cm, tG1=0.4 msec; gradients labeled G2, G4 and G5 were not used.
Linda A. Luck et al.

the case of rCRALBP, provides a rational foundation for pursuing identification of specific amino acid residues involved in the localized change.

One dimensional ¹⁹F NMR experiments offer advantages of low cost, sensitivity and simplicity. Fluorine provides a sensitive probe for monitoring protein conformational changes and ligand interactions, in part because the chemical shift range is 100 fold larger than that of the proton due to the lone pair electrons (Gerig, 1994; Danielson and Falke, 1996; Sykes and Hull, 1978). Most commonly, fluorine is substituted for hydrogen in the ring structure of aromatic amino acids such as tyrosine and tryptophan, which enhances sensitivity due to the resonance electrons in the ring. Because of the usual low abundance of tryptophan in proteins and the low cost of 5-fluorotryptophan, tryptophan is generally the residue of choice for fluorine incorporation. A potential pitfall is that fluorine incorporation may cause protein denaturation and/or instability, depending on the site of incorporation and the nature of the protein. Partial apparent denaturation of 5-fluorotryptophan labeled rCRALBP was observed by NMR; nevertheless, distinct ¹⁹F resonances for each of the two labeled Trp were also observed (in NMR spectra collected over 8 h from 18 mg/ml protein solutions before and after exposure to bleaching illumination). Apparent chemical shift differences of about 0.25 and 0.75 ppm for the two Trp were seen upon removal of ligand. The smaller shift suggests that this residue is experiencing a small conformational change, whereas the larger shift suggests that the other Trp residue may be in more immediate contact with bound retinoid (data not shown). To determine which Trp residue in the CRALBP sequence is associated with the observed chemical shifts, site directed Trp to Phe mutants are being prepared for additional retinoid binding and ¹⁹F NMR analyses.

Figure 4. NMR Spectra of ¹³C-Met labeled CRALBP with and without ligand. [Spectra labeled BEFORE BLEACH and AFTER BLEACH; axis 20 to 12 ppm.] Solution state NMR spectra of CRALBP labeled with ¹³C-Met (23 mg/ml) containing bound 11-cis-retinaldehyde were recorded in the dark. The sample was then exposed to bleaching illumination and reanalyzed without bound ligand. Major ¹³C chemical shift differences are apparent between the two spectra, suggesting that a Met residue may be in direct contact with bound retinoid and/or associated with the CRALBP retinoid binding pocket. NMR conditions include: pw90 (¹³C) = 15 μsec, sw (¹³C) = 13422 Hz, preacquisition delay = 1 sec, ¹H broadband decoupling during acquisition at a field strength of 1.85 kHz using MLEV-16.

One dimensional ¹³C NMR experiments are also relatively inexpensive and straightforward; however, natural abundance ¹³C (0.0018 relative to ¹H) does not exhibit the sensitivity in NMR of fluorine (0.8331 relative to ¹H). Specific enrichment of sites, such as the methyl carbons of methionine, can increase the sensitivity, and ¹³C-Met has proven to be an effective tool for probing protein conformational changes (Beatty et al., 1996). The advantage of the innocuous substitution of carbon-13 for carbon-12 is counteracted by the smaller chemical shift dispersion of ¹³C compared with ¹⁹F. Residues of high abundance in the protein may cause overlap of resonances in the NMR spectrum. Another consideration is that while ¹³C-methionine is less expensive than many isotopes, it remains more expensive than ¹⁹F-Trp and ¹⁵N ammonium chloride. NMR analysis of the ¹³C-Met labeled rCRALBP (Fig. 4) was used to probe the possible interaction of any of the seven Met residues with retinoid. Preliminary one dimensional NMR results reveal a predominant set of overlapping ¹³C resonances with minor chemical shift differences before and after bleaching (Fig. 4). However, a distinct and major ¹³C chemical shift difference of about 1 ppm is also apparent after removal of ligand, suggesting that at least one Met residue may be in direct contact with bound retinoid. Site directed mutagenesis of the Met residues in CRALBP is underway, and additional two dimensional HSQC NMR analyses will be used to assign specific Met residues to observed chemical shifts and ligand interactions.
V. Conclusions

This study demonstrates the applicability of ¹⁵N, ¹⁹F and ¹³C NMR methodology for studying ligand interactions in a light sensitive protein such as rCRALBP. Gradient enhanced sensitivity enhanced heteronuclear single quantum correlation ¹⁵N NMR has provided evidence that rCRALBP undergoes a specific localized conformational change upon photoisomerization of 11-cis-retinaldehyde and removal of the ligand from the binding pocket. The results from the multidimensional NMR measurements strongly support the likelihood that the ¹⁹F-Trp and ¹³C-Met NMR chemical shift differences observed for the protein with and without bound 11-cis-retinaldehyde are associated with protein-ligand interactions. Site directed mutagenesis in conjunction with further NMR and ligand binding studies promises to identify components of the rCRALBP retinoid binding pocket. The effectiveness of these NMR experiments was greatly facilitated by careful quantification and protein characterization prior to NMR analysis, particularly by liquid chromatography electrospray mass spectrometry.
References

Beatty, E.J., Cox, M.C., Frenkiel, T.A., Tam, B.M., Kubal, G., Mason, A.B., MacGillivray, R.T.A., Sadler, P.J. and Woodworth, R.C. (1996) J. Amer. Chem. Soc. (in press).
Chen, Y., Johnson, C., West, K., Goldflam, S., Bean, M.F., Huddleston, M.J., Carr, S.C., Gabriel, J.L. and Crabb, J.W. (1994) In Techniques in Protein Chemistry V, J.W. Crabb, ed., pp 371-378, Academic Press, San Diego, CA.
Crabb, J.W., Johnson, C.M., Carr, S.A., Armes, L.G. and Saari, J.C. (1988) J. Biol. Chem. 263, 18678-18687.
Crabb, J.W., Chen, Y., Goldflam, S., West, K.A. and Kapron, J.T. (1996) In Techniques in Molecular Biology: Retinoids, Redfern, C., ed., Humana Press, NJ (in press).
Danielson, M.A. and Falke, J.J. (1996) Annu. Rev. Biophys. Biomol. Struct. 25, 163-195.
Gerig, J.T. (1994) Prog. Nucl. Magn. Reson. Spectrosc. 26, 293-370.
Kim, H.W., Perez, J.A., Ferguson, S.J. and Campbell, I.D. (1990) FEBS Letts. 272, 34-36.
Luck, L.A. (1995) In Techniques in Protein Chemistry VI, J.W. Crabb, ed., pp 487-494, Academic Press, San Diego, CA.
Luck, L.A. and Falke, J.J. (1991) Biochemistry 30, 4248-4252.
Sykes, B.D. and Hull, W.E. (1978) Meth. Enzymol. 49, 270-295.
Saari, J.C. and Bredberg, D.L. (1987) J. Biol. Chem. 262, 7618-7622.
Saari, J.C., Bredberg, D.L. and Noy, N. (1994) Biochemistry 33, 3106-3112.
Farmer II, B.T. and Venters, R.A. (1996) J. Biomolecular NMR 7, 59-71.
Venters, R.A., Calderone, T.L., Spicer, L.D. and Fierke, C.A. (1991) Biochemistry 30, 2291-4494.
Venters, R.A. and Spicer, L.D. (1995) In Techniques in Protein Chemistry VI, J.W. Crabb, ed., pp 495-502, Academic Press, San Diego, CA.
A Novel Method for Measuring the Binding Properties of the Site-Directed Mutants of the Proteins That Bind Hydrophobic Ligands: Application to Cellular Retinoic Acid Binding Proteins

Honggao Yan, Lincong Wang and Yue Li

Department of Biochemistry, Michigan State University, East Lansing, Michigan
I. Introduction

Many hydrophobic molecules such as vitamin A, vitamin D and steroid hormones play vital roles in a variety of cellular processes. Because of the low solubility of these molecules in water, it has been difficult to measure the binding properties of the site-directed mutants of the proteins that interact with these hydrophobic ligands, such as cellular retinoic acid binding proteins (CRABPs) (Zhang et al., 1992; Chen et al., 1995). This has greatly hampered the studies of the quantitative structure-function relationships of these important proteins.

Retinoic acid (RA), a hormonally active metabolite of vitamin A, has profound effects on cell growth, differentiation, and morphogenesis. Two types of proteins have been found to bind RA: nuclear retinoic acid receptors (RARs and RXRs) and CRABPs. RARs and RXRs are RA-activated transcriptional factors that regulate expression of target genes (Mangelsdorf et al., 1994). Although the physiological roles of CRABPs are not clear at present, they are thought to be involved in cellular transport and metabolism of RA (Ong et al., 1994). Two isoforms (CRABP-I and CRABP-II) have been characterized. Both CRABP-I and CRABP-II bind specifically to all-trans-retinoic acid, but they differ in affinity for RA, expression pattern and regulation. It appears that the two isoforms may have distinct functions. The idea is supported by the fact that the sequence identity of human and mouse CRABP-I (99.3%) or human and mouse CRABP-II (93.5%) is much higher than the sequence identity (13.1%) between the two isoforms from the same source.

Four conserved residues (Arg-111, Leu-121, Arg-132 and Tyr-134 in CRABP-II) lie at the bottom of the RA binding pockets of CRABPs and interact with the carboxyl group of RA (Kleywegt et al., 1994). Site-directed mutagenesis studies have shown that the two arginine residues are important for binding of RA (Zhang et al., 1992; Chen et al., 1995). However, the affinities of these mutants for RA have not been quantitatively determined because the current RA binding assays are inapplicable to mutants with greatly decreased affinity for RA. We have developed a novel competitive binding assay for measuring the dissociation constants of the site-directed mutants of CRABPs. We have used this novel method to evaluate the contribution of Leu-121 of CRABP-II to binding of RA in conjunction with site-directed mutagenesis and NMR. The results show that Leu-121 is also important for binding of RA and contributes to the binding energy by ~1.4 kcal/mol.
II. Experimental Procedures

A. Site-Directed Mutagenesis

The oligonucleotide for making the L121A mutant was 5'-GGAACTGATCGCGACCATGACG-3'. The mutant was generated by the method of Kunkel (1985) and screened by DNA sequencing. In order to ensure that there were no unintended mutations in the mutant, the entire sequence of the mutated gene was determined. Both the wild-type and mutant proteins were purified by ion exchange chromatography using DEAE-cellulose DE53 followed by gel filtration using Sephadex G-50 (Wang et al., 1996).
B. Competitive Binding Assay

The assays were designed to measure the affinity of a mutant for RA relative to that of the wild-type CRABP-II. The proteins were dissolved in a phosphate buffer (4 mM NaH2PO4, 16 mM Na2HPO4, 150 mM NaCl, pH 7.3). The concentrations of the protein stock solutions were measured by OD280 using the absorption coefficient 19,480 M⁻¹ cm⁻¹ for CRABP-II. RA stock solutions were prepared in absolute ethanol. The concentrations of the RA stock solutions were determined by OD336 using the absorption coefficient of 45,000 M⁻¹ cm⁻¹. The assays were carried out in equilibrium dialysis cells at room temperature. The two compartments of each dialysis cell were separated by a semipermeable membrane with a molecular weight cutoff of 6-8 kDa. One compartment was filled with the wild-type CRABP-II (1 ml), and the other with L121A. An equal amount of [³H]RA (100 nM) was added to each compartment. The proteins in both compartments were in large excess of RA (>20-fold). Samples of 100 μl were taken from the two compartments after various times of incubation at room temperature, mixed with 5 ml of scintillation fluid and counted in a liquid scintillation counter.

The equilibria in the two compartments that contain the wild-type CRABP-II and mutant proteins can be described by Eq. (1) and Eq. (2):

$$K_{d(WT)} = \frac{[WT][RA]}{[WT \cdot RA]} \qquad (1)$$

$$K_{d(MT)} = \frac{[MT][RA]}{[MT \cdot RA]} \qquad (2)$$

where WT, MT, WT·RA and MT·RA represent the wild-type CRABP-II, a CRABP-II mutant and their RA complexes, respectively. Because free RA can cross the membrane, [RA] is the same in both compartments at equilibrium and cancels. Therefore

$$\frac{K_{d(MT)}}{K_{d(WT)}} = \frac{[MT][WT \cdot RA]}{[WT][MT \cdot RA]} \qquad (3)$$

Since the concentrations of the proteins were much greater than their respective dissociation constants and the concentration of RA, $[WT] \approx [WT]_{total}$, $[MT] \approx [MT]_{total}$, $[WT \cdot RA] \gg [RA]$, and $[MT \cdot RA] \gg [RA]$. Then the relative dissociation constant can be calculated by Eq. (4):

$$\frac{K_{d(MT)}}{K_{d(WT)}} = \frac{[MT]_{total}\, C_{WT}}{[WT]_{total}\, C_{MT}} \qquad (4)$$

where $C_{WT}$ and $C_{MT}$ are the measured radioactivities of the two compartments containing the wild-type CRABP-II and the mutant, respectively.

It turned out that the system could not reach equilibrium in 2 days, presumably because there was too little free RA in solution to diffuse across the membrane. Since RA is not stable even in the dark, the assay was redesigned to match the equilibrium conditions by varying the ratio of the protein concentrations of the mutant and the wild-type ($[MT]_{total}/[WT]_{total}$). Thus the concentration of the mutant was varied while keeping the concentration of the wild-type at ~2 μM. Initially the concentration of the mutant was increased in an exponential manner (e.g., 2, 20, 200 μM). Then it was varied in a small range. Since an equal amount of RA was added to the two compartments of the dialysis cell, the two compartments should have the same RA concentration and radioactivity at the beginning of each assay. If $[MT]_{total}/[WT]_{total} \neq K_{d(MT)}/K_{d(WT)}$, there would be a net transfer of RA across the semipermeable membrane separating the two compartments. Thus the radioactivity counts of the two compartments ($C_{WT}$ and $C_{MT}$) would differ after incubation for a certain period. When $[MT]_{total}/[WT]_{total} < K_{d(MT)}/K_{d(WT)}$, then $C_{WT} - C_{MT} > 0$. When $[MT]_{total}/[WT]_{total} > K_{d(MT)}/K_{d(WT)}$, then $C_{WT} - C_{MT} < 0$. When $[MT]_{total}/[WT]_{total} = K_{d(MT)}/K_{d(WT)}$, then $C_{WT} - C_{MT} = 0$.
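The logic of the redesigned assay can be sketched in a few lines of code. This is an illustrative sketch only: the function names and the example readings are hypothetical and are not data from this study. It computes the relative dissociation constant from Eq. (4), i.e. ([MT]total x C_WT) / ([WT]total x C_MT), and brackets that constant from the sign of C_WT - C_MT observed at several protein concentration ratios.

```python
def relative_kd(mt_total, wt_total, c_wt, c_mt):
    """Eq. (4): Kd(MT)/Kd(WT) from total protein concentrations and counts.

    Valid only at equilibrium and with both proteins in large excess of RA.
    """
    return (mt_total * c_wt) / (wt_total * c_mt)

def bracket_relative_kd(observations):
    """Bracket Kd(MT)/Kd(WT) from (ratio, C_WT - C_MT) pairs.

    A positive difference means net transfer toward the wild-type
    compartment, so the ratio [MT]total/[WT]total is below the relative Kd.
    A negative difference means the ratio is above it.
    """
    below = [ratio for ratio, diff in observations if diff > 0]
    above = [ratio for ratio, diff in observations if diff < 0]
    low = max(below) if below else None
    high = min(above) if above else None
    return low, high

# Hypothetical readings: (ratio [MT]total/[WT]total, C_WT - C_MT in cpm).
readings = [(2, 500), (8, 120), (12, -90), (20, -400)]
print(bracket_relative_kd(readings))
```

Bracketing by sign avoids waiting for full equilibrium: only the direction of the net transfer is needed at each concentration ratio.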
C. NMR Spectroscopy

NOESY was performed at 32 °C on a VXR-500 spectrometer operating at a proton frequency of 500 MHz. The protein was dissolved in 20 mM sodium phosphate, pH 7.5 (direct pH meter reading), 100 mM NaCl, 5 mM DTT in D2O. The protein concentration was ~2 mM. The data were acquired in the hypercomplex mode with a mixing time of 150 ms (Jeener et al., 1979; Macura & Ernst, 1980). The spectral width was 7200 Hz in both dimensions. 2048 complex points in the t2 dimension and 256 complex points in the t1 dimension were acquired. 96 transients were collected for each FID. Data processing was performed on a Sun Sparc 10 station using VNMR software from Varian. The time domain data were zero-filled once and multiplied by shifted sinebell or Gaussian functions before Fourier transformation in both dimensions. Chemical shifts were referenced to internal sodium 3-(trimethylsilyl)-propionate-2,2,3,3-d4.
III. Results and Discussions

A. Competitive Binding Assay

Two types of methods have been in general use for measuring binding of RA to CRABPs: fluorometry and radiometry. The radiometric method involves separation of bound from free RA by dextran-coated charcoal, gel filtration and other means. Substantial loss of bound ligand during the separation process makes the method unsuitable for measuring the dissociation constants of site-directed mutants with greatly decreased affinity for RA. The very limited solubility of RA in water (~200 nM, Szuts & Harosi, 1991) also makes the fluorometric method inapplicable for determining the dissociation constants of these mutants. Studies of the quantitative structure-function relationships of CRABPs have been hampered by the lack of methods for measuring the affinities of site-directed mutants for RA (Zhang et al., 1992; Chen et al., 1995).

We have developed a novel competitive binding assay for measuring the affinities of site-directed mutants for RA relative to that of the wild-type CRABP. The essence of the method is to monitor the competition between a mutant and the wild-type protein for binding of limited RA. Equilibrium dialysis cells are used for the assays. The two compartments of each dialysis cell are filled with the wild-type and mutant proteins, respectively. The absolute concentration of RA is not important as long as the concentration of free RA is much smaller than that of bound RA. There is no need to separate bound from free RA. The transfer of RA from one compartment to the other is determined by measuring the radioactivities of the samples taken from the two compartments. The direction of the net transfer is dependent on the relative affinity of the proteins and the ratio of the protein concentrations of the two compartments.

A representative result is shown in Figure 1. When the ratio of the concentrations of the two proteins ([L121A]/[WT]) is < 8, there is a net transfer of RA from the compartment containing L121A to the compartment containing the WT. When the ratio of the concentrations of the two proteins is > 12, there is a net transfer of RA from the compartment containing the WT to the compartment containing L121A. Since the relative Kd lies between the points with opposite net transfers, the Kd of L121A relative to that of the wild type (Kd(L121A)/Kd(WT)) is 8-12.
Determination of the relative dissociation constant of a point mutant is sufficient for estimating the energetic contribution of the amino acid residue to ligand binding ($\Delta\Delta G = RT \ln(K_{d(MT)}/K_{d(WT)})$). The method can also be used for measuring the relative dissociation constants of the mutants of other proteins that bind hydrophobic ligands such as RA receptors, vitamin D receptors and steroid hormone receptors.
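As a rough numerical check of the relation ΔΔG = RT ln(Kd(MT)/Kd(WT)), the sketch below evaluates it for relative dissociation constants of 8, 10 and 12. The temperature of 298.15 K is an assumption standing in for "room temperature" (the text gives no exact value), and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def ddg_kcal_per_mol(relative_kd, temp_k=298.15):
    """Binding energy difference (kcal/mol) from a relative dissociation constant.

    temp_k defaults to 298.15 K, an assumed value for room temperature.
    """
    return R_KCAL * temp_k * math.log(relative_kd)

# Evaluate across the 8-12 bracket reported for L121A.
for ratio in (8, 10, 12):
    print(ratio, round(ddg_kcal_per_mol(ratio), 2))
```

Under this assumption the bracket of 8-12 corresponds to roughly 1.2 to 1.5 kcal/mol, which agrees with the value of ~1.4 kcal/mol quoted for a ~10-fold decrease in affinity.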
Figure 1. Competitive binding assays for measuring the dissociation constant of L121A mutant relative to that of the wild-type CRABP-II. The relative radioactivity is the radioactivity count of the compartment containing the wild-type protein minus that of the compartment containing L121A. [Plot: relative radioactivity versus [L121A]/[WT].]
B. Conformational Characterization By NMR

Since a decrease in the affinity of a mutant for RA may be caused by conformational changes, we compared the conformation of L121A with that of the wild-type protein by NMR. Parts of the NOESY spectrum of L121A are shown in Figure 2. We have recently made total sequential resonance assignment of the wild-type CRABP-II (Wang et al., in preparation). Eighteen interresidue NOEs between the aromatic protons in the wild-type protein have been identified and assigned. Of these 18 NOE cross peaks, 16 can be identified in the NOESY spectrum of L121A. The other two NOEs are rather weak in L121A. We have not assigned the NOEs between aromatic and aliphatic protons. Qualitatively, the aromatic-aliphatic NOE patterns of the wild-type and L121A are very similar. The results suggest that the L121A mutant is properly folded and its conformation is highly similar to that of the wild-type protein. Thus the decrease in the affinity of L121A for RA is unlikely to be caused by conformational perturbations.
C. Leu-121 Is Important for Binding of RA

The results of the competitive binding assay and NMR characterization of the L121A mutant suggest that Leu-121 is important for binding of RA. Leu-121 is located at the bottom of the RA binding pocket of CRABP-II. One of the methyl groups of Leu-121 is in close contact with the carboxyl group of the bound RA (Kleywegt et al., 1994). The distance between the carbon of the methyl group and the oxygen of the carboxyl group is 3.26 Å. The packing of the methyl group and the carboxyl group is very close to the optimal van der Waals interaction (Derewenda et al., 1995). On the basis of the relative dissociation constant, the van der Waals interaction between the methyl group of Leu-121 and the carboxyl group of RA contributes to the binding energy by ~1.4 kcal/mol.
IV. Conclusions

A novel competitive binding assay has been developed for measuring the relative dissociation constants of the site-directed mutants of CRABPs. Leu-121 has been replaced with alanine by site-directed mutagenesis. The affinity of the mutant for RA is decreased by ~10-fold as measured by the competitive binding assay. NMR characterization indicates that the conformation of the L121A mutant is very similar to that of the wild-type protein. The results taken together show that Leu-121 is important for binding retinoic acid and contributes to the binding energy by ~1.4 kcal/mol.
Figure 2. Parts of the 500 MHz NOESY spectrum of L121A at 32 °C. [Two contour plots; axes F1 (ppm) and F2 (ppm).] The mixing time was 150 ms. Only the interresidue NOEs are labeled. The identities of the NOEs are: A, F65-2,6H ··· F71-2,6H; B, F65-2,6H ··· F71-3,5H; C, F65-4H ··· W109-6H; D, F65-3,5H ··· W109-6H; E, F50-2,6H ··· W87-7H; F, F71-4H ··· W109-7H; G, F65-4H ··· W109-7H; J, W87-5H ··· F3-4H; K, F50-3,5H ··· F3-2,6H; L, F50-3,5H ··· W87-7H; M, F50-4H ··· W87-7H; N, F50-4H ··· F3-2,6H; O, F50-2,6H ··· W87-6H; P, F50-2,6H ··· F3-2,6H; Q, W87-5H ··· F3-2,6H; R, F3-4H ··· W87-4H.
Acknowledgments

We are indebted to Dr. Anders Astrom for providing us with the wild-type cDNA clone of human CRABP-II. This work was supported by funds from the REF Center of Protein Structure and Design and the Cancer Center at Michigan State University.
References

Chen, L. X., Zhang, Z.-P., Scafonas, A., Cavalli, R. C., Gabriel, J. L., Soprano, K. J., & Soprano, D. R. (1995) J. Biol. Chem. 270, 4518-4525.
Cogan, U., Kopelman, M., Mokady, S., & Shinitzky, M. (1976) Eur. J. Biochem. 65, 71-78.
Derewenda, Z. S., Lee, L., & Derewenda, U. (1995) J. Mol. Biol. 252, 248-262.
Jeener, J., & Ernst, R. P. (1979) J. Chem. Phys. 71, 4546-4553.
Kleywegt, G. J., Bergfors, T., Senn, H., Le Motte, P., Gsell, B., Shudo, K., & Jones, T. A. (1994) Structure 2, 1241-1258.
Kunkel, T. A. (1985) Proc. Natl. Acad. Sci. U.S.A. 82, 488-492.
Macura, S., & Ernst, R. P. (1980) Mol. Phys. 41, 95-117.
Mangelsdorf, D. J., Umesono, K., & Evans, R. M. (1994) In The Retinoids: Biology, Chemistry, and Medicine (Sporn, M. B., Roberts, A. B., & Goodman, D. S., Eds.) pp 319-349, Raven, New York.
Ong, D. E., Newcomer, M. E., & Chytil, F. (1994) In The Retinoids: Biology, Chemistry, and Medicine (Sporn, M. B., Roberts, A. B., & Goodman, D. S., Eds.) pp 283-317, Raven, New York.
Szuts, E. Z., & Harosi, F. I. (1991) Arch. Biochem. Biophys. 287, 297-304.
Wang, L., Li, Y., & Yan, H. (1996) submitted to J. Biol. Chem.
Zhang, J., Liu, Z.-P., Jones, T. A., Gierasch, L. M., & Sambrook, J. F. (1992) Proteins: Struct., Funct., Genet. 13, 87-89.
A Strategy for Predicting the Ligand Binding Competence of Recombinant Orphan Nuclear Receptors using Biophysical Characterization

Derril Willard, Bruce Wisely, Derek Parks, Martin Rink, William Holmes, Michael Milburn, and Thomas Consler

Departments of Molecular Sciences, Structural Chemistry, and Molecular Biochemistry, Glaxo Wellcome Research and Development, Research Triangle Park, NC 27709

I. Introduction

Nuclear receptors are a loosely related group of ligand dependent transcriptional regulators with varying degrees of sequence homology. These proteins have been historically associated with the steroid hormone receptors, e.g. estrogen and glucocorticoid receptors, by virtue of DNA binding domain sequence homology comprising two zinc finger motifs. Many of these are orphan receptors, having no defined ligand. The nuclear receptors present tempting targets in the pursuit of a systems based research approach since so many have now been cloned. However, when recombinant forms of a receptor are available before its cognate ligand has been identified, a problem arises. How does one determine if an orphan receptor is active for use in in vitro assays?

Researchers now have access to unparalleled amounts of DNA sequence and genetic data. Families of homologous gene products can be studied with the intent of connecting specific proteins to various disease conditions. An obvious advantage to this wide scope of research is that once the mechanism of action has been elucidated for a few family members, this information can then be applied in a general sense to other homologues.

Specifically, to apply this type of strategy to our studies, we approached the problem with two premises. First, we engineered recombinant constructs of orphan nuclear receptors to contain domains with hypothetical functional homology to receptors with known activities and ligands. In particular, a great deal is known concerning retinoid X receptor α (RXRα) (1) and the domains necessary for DNA binding, retinoid binding, and self/hetero-association. For the purposes of this study, PPARα, PPARδ, PPARγ, RXRα, and LXRα constructs were created to contain the putative ligand binding domains (LBD). The amino acid residues within this conserved contiguous region have been shown to be both necessary and sufficient to demonstrate ligand binding competence for RXRα and other nuclear receptors. Structurally, the LBDs are composed primarily of multiple α-helices (2,3).

Second, we began to characterize each nuclear receptor using a variety of biophysical techniques. The object of this scrutiny was to compile a set of physical traits for each protein and to use these characteristics as a basis to compare the orphans with those receptors having defined ligand binding ability. Purified proteins were characterized by low resolution solution structure, thermal stability and propensity to self-associate or to aggregate. Observations of protein expression levels and solubility were also considered as qualitative estimates of native structure. Circular dichroism spectroscopy (CD) was employed to probe secondary structural features. CD-monitored unfolding and differential scanning calorimetry (DSC) provided two separate measures of thermal stability. Static and dynamic light scattering (SLS and DLS, respectively) and analytical ultracentrifugation were used to determine solution values for molecular size, association and aggregation state.
II. Materials and Methods

A. Expression and purification

Nuclear receptor LBDs were engineered to have an amino-terminal polyhistidine fusion tag. Constructs were expressed in E. coli strain BL21 (DE3) using the T7 promoter. PPARα LBD was also expressed as a polyhistidine-tagged recombinant in baculovirus-infected T. ni. Protein purification in most cases was performed using a single nickel affinity step. Tagged protein was either initially purified by anion exchange chromatography and then adsorbed onto a Pharmacia Chelating Sepharose Fast Flow column charged with nickel, or adsorbed by nickel-chelating chromatography directly out of crude cell lysate. Proteins were eluted with a linear 0-1 M gradient of imidazole in the lysis buffer.
B. Structure and stability

Purified nuclear receptor proteins were buffer exchanged into PBS for CD spectral analysis using an Aviv model 62DS CD spectropolarimeter. The proteins were scanned repetitively in 0.1 cm quartz cuvettes from 197 to 300 nm in 1 nm wavelength increments. Ellipticity was converted to molar ellipticity for comparisons. Thermal transitions were measured with the same CD instrument. Proteins were monitored at 222 nm over a temperature range of 5-80°C. Data were collected in 1°C increments with a slope of 10°C/min. Initially, data were fit to a simple sigmoidal mathematical relationship for comparison. The half-point of the thermal transition, T1/2, was determined by iterative fitting using the Boltzmann equation. Data were also fit to the following thermodynamic model:
$$\theta_T = \frac{\theta_N + \theta_D \exp(u)}{1 + \exp(u)}$$

where

$$u = \frac{(1/T - 1/T_D)(T_D\,\Delta c_p - \Delta H_{T_D}) + \Delta c_p \ln(T/T_D)}{R}$$

and where $\theta_T$ is the ellipticity at T (temperature in K), $\theta_N$ is the native protein ellipticity, $\theta_D$ is the unfolded protein ellipticity, R is the gas constant, $T_D$ is the temperature at which the protein unfolding transition is half-complete, $\Delta H_{T_D}$ is the enthalpy change at $T_D$, and $\Delta c_p$ is the heat capacity change.

DSC analyses were performed using a MicroCal MCS DSC unit. Data were analyzed to determine the midpoint of a two-state thermal transition (Tm) using the accompanying MicroCal Origin data analysis package. For each experiment approximately 1.5 mL of protein at concentrations ranging from 3-20 micromolar was analyzed during an increase in chamber temperature from 5-80°C. The data were collected at a scan rate of 90°C/hr and a filter period of 5 seconds. Scans were performed using PBS as the reference buffer.
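The two-state thermodynamic model used for the CD melts can be evaluated numerically. The following is only a minimal sketch, assuming the ellipticity is the population-weighted average of the native and unfolded baselines with exp(u) as the unfolding equilibrium constant; the function name, argument names, and the parameter values in the usage note are hypothetical and are not taken from the study.

```python
import math

R_GAS = 8.314  # gas constant in J/(mol K)


def ellipticity_two_state(temp_k, theta_n, theta_d, t_d, dh_td, dcp):
    """Ellipticity at absolute temperature temp_k (K) for a two-state transition.

    theta_n, theta_d : native and unfolded ellipticities (same units)
    t_d              : midpoint temperature of unfolding (K)
    dh_td            : enthalpy change at t_d (J/mol)
    dcp              : heat capacity change (J/(mol K))
    """
    # u is the logarithm of the unfolding equilibrium constant at temp_k.
    u = ((1.0 / temp_k - 1.0 / t_d) * (t_d * dcp - dh_td)
         + dcp * math.log(temp_k / t_d)) / R_GAS
    k_unfold = math.exp(u)
    # Population-weighted average of the two baseline ellipticities.
    return (theta_n + theta_d * k_unfold) / (1.0 + k_unfold)
```

At temp_k equal to t_d the exponent u is zero, so the function returns the mean of the two baselines, which is the half-complete transition described in the text. In a real fit, a nonlinear least-squares routine would adjust t_d, dh_td, and dcp against the measured melt.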
C. Association/Solution State

Dynamic light scattering (DLS) measurements were performed on a DynaPro-801 instrument (Protein Solutions) and the data analyzed with the Autopro software package. All proteins were filtered through a 0.10 micron syringe filter and analyzed at 22°C. The translational diffusional coefficient ($D_T$) was obtained directly and the hydrodynamic radius ($R_H$) is derived by a rearrangement of the following equation:

$$D_T = \frac{kT}{6\pi\eta R_H}$$
where k is Boltzmann's constant, T is the absolute temperature, and $\eta$ is the solvent viscosity.

Static light scattering (SLS) measurements were performed at 22°C on a Wyatt Technology Dawn DSP laser photometer (argon-ion laser, 488 nm). Data were analyzed with the Astra software package. All proteins were filtered through a 0.10 micron syringe filter and held at 30 psi during analysis. Each sample was run in duplicate and the refractive index increment was estimated to equal 0.180. Molecular weights are derived from a rearrangement of the following equation:

$$\frac{K^{*}c}{R(\theta)} = \frac{1}{M_w P(\theta)} + 2A_2 c$$
where R(θ) is the excess intensity of scattered light, c is the sample concentration, A2 is the second virial coefficient, P(θ) is the scattering function, which depends on the molecular configuration and approaches 1 for proteins, and K* is an optical parameter.

Sedimentation equilibrium analytical ultracentrifugation was performed using a Beckman XL-A (Palo Alto, CA) centrifuge with two-channel or six-channel 12 mm charcoal-filled epon centerpieces. Runs were performed at 20, 25, and 30 krpm at 4°C. Equilibrium was judged to be achieved by the absence of change between plots of several successive scans after approximately 20 hours. Solvent density was determined empirically at 4°C and 20°C using a Mettler DA-110 density/specific gravity meter calibrated against water. The partial specific volume of each protein was calculated using the method of Cohn and Edsall (4). Temperature differentials were incorporated using the appropriate equation (5) modified from values of each amino acid at 25°C (6). Raw data were analyzed by the Beckman/Microcal Origin non-linear regression software package using multiple iterations of the Marquardt-Levenberg algorithm (7) for parameter estimation.
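The two light scattering relations in this section can be rearranged as described. The sketch below is illustrative only: the function names are hypothetical, SI units are assumed for the Stokes-Einstein relation, and the diffusion coefficient and viscosity in the usage note are assumed example values (roughly water near 22°C), not measurements from the study.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def hydrodynamic_radius(d_t, temp_k, viscosity):
    """Stokes-Einstein rearranged for R_H (m).

    d_t       : translational diffusion coefficient (m^2/s)
    temp_k    : absolute temperature (K)
    viscosity : solvent viscosity (Pa s)
    """
    return K_BOLTZMANN * temp_k / (6.0 * math.pi * viscosity * d_t)


def molecular_weight(kc_over_r, conc, a2=0.0, p_theta=1.0):
    """Rearrangement of K*c/R(theta) = 1/(M_w P(theta)) + 2 A2 c for M_w.

    kc_over_r : measured K*c/R(theta) (mol/g)
    conc      : sample concentration (g/mL)
    a2        : second virial coefficient (mol mL/g^2); 0 for an ideal solution
    p_theta   : scattering function, close to 1 for proteins
    """
    return 1.0 / (p_theta * (kc_over_r - 2.0 * a2 * conc))
```

For example, an assumed diffusion coefficient of 8.0e-11 m²/s at 295.15 K with a viscosity of 9.5e-4 Pa s gives a hydrodynamic radius of about 2.8 nm.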
D. Ligand Binding

Binding constants were determined for PPARγ and RXRα using ligands with known affinities. Each protein was incubated with the appropriate tritiated radioligand for two hours. Bound ligand was separated from free by gel filtration. The unbound radioligand was mixed with scintillation fluid and counted. For PPARα, PPARδ, and LXRα, similar assays were performed using compounds which had been implicated as nuclear receptor effectors in a separate cell-based assay.
III. Results

Nuclear receptor LBDs were constructed from the homologous regions of the native sequences (Figure 1). Expression and purification of the nuclear receptors used in this study proceeded as detailed. N-terminal histidine fusion tags provided an easy and similar method of purification for each protein. In general, expression yields were good (over 20 mg protein/fermentation liter) with the exception of LXRα, which produced <5 mg protein/fermentation liter. After nickel-chelating chromatography, the proteins were dialyzed into PBS for use in these studies, although long-term stability was found to vary from protein to protein over a range of buffer and storage conditions. In particular, the histidine tag of PPARα produced in baculovirus-infected T. ni cells was found to be processed away in all purification attempts shortly after elution from the nickel-chelating chromatography step. Protein solubility, as evidenced by the ability to concentrate each nuclear receptor, was found to be sufficient for the studies involved. The ultimate concentration attainable with LXRα was however
PPARα  ..GMSHNAIRFGRMPRSEKAKLKAEILTCEHDIEDSETADLKSLAKRIYEAYLKNFN
PPARδ  ..GNSHNAIRFGRNPEAEKRKLVAGLTANEGSQYNPQVADLKAFSKHIYNAYLKNFN
PPARγ  ..GMSHNAIRFGRNPQAEKEKLLAEISSDIDQLNPESADLRQALAKHLYDSYIKSFP
LXRα   ..QAHATSLPPRASS
RXRα   ..GNKREAVQEERQ_RGKDR_NENEVESTSSANEDMPVERILEAELAVEP

PPARα  MNKVKARVILSGKASNNPPFVIHDMETLCMAEKTLVAKLVANGIQ_NKEAEVRIFHC
PPARδ  MTKKKARSILTGKASHTAPFVIHDIETLWQAEKGLVWKQLVNGLPPYKEISVHVFYR
PPARγ  LTKAKARAILTGKTTDKSPFVIYDMNSLNNGEDKIKFKHITPLQEQSKEVAIRIFQG
LXRα   PPQILPQLSPEQLGMIEKLVAAQQQCNRRSFSDRLRVTPWPMAPDPHSREARQQRFA
RXRα   KTETYVEANNGLNPS_SPNDP_VTN_I

PPARα  CQCTSVETVTELTEFAKAIPGFANLDLNDQVTLLKYGVYEAIFAMLS_SVNNKDGM
PPARδ  CQCTTVETVRELTEFAKSIPSFSSLFLNDQVTLLKYGVHEAIFAMLA__SIVNKDGL
PPARγ  CQFRSVEAVQEITEYAKSIPGFVNLDLNDQVTLLKYGVHEIIYTMLA_SLMNKDGV
LXRα   HFTELAIVSVQIVDFAKQLPGFLQLSREDQIALLKTSAIEVMLLETS_RRYNPGSE
RXRα   CQAADKQLFT_LVEWAKRIPHFSELPLDDQVILLRAGWNELLIASFSHRSIAVKDGI

PPARα  LVAYGNGFITREFLKSLRKPFCDINEPKFDFAMKFNALELDDSDISLFVAAIICCGD
PPARδ  LVANGSGFVTREFLRSLRKPFSDIIEPKFEFAVKFNALELDDSDLALFIAAIILCGD
PPARγ  LISEGQGFMTREFLKSLRKPFGDFMEPKFEFAVKFNALELDDSDLAIFIAVIILSGD
LXRα   SITFLKDFSYNREDFAKAGLQVEFINPIFEFSRANNELQLNDAEFALLIAISIFSAD
RXRα   LLATGLHVHRNSAHSAGVGAIFDRVLT_ELVSKMRDMQMDKTELGCLRAIVLFNPD

PPARα  RPGLLNVGHIEKMQEGIVHVLRLHLQSWHPDDIFLFPKLLQKMADLRQL_VTEHA
PPARδ  RPGLMWVPRVEAIQDTILRALEFHLQANHPDAQYLFPKLLQKMADLRQL_VTEHA
PPARγ  RPGLLNVKPIEDIQDNLLQALELQLKLNHPESSQLFAKLLQKMTDLRQI_VTEHV
LXRα   RPWVQDQLQVERLQHTYVEALHAYVSIHHPHDRLMFPRMLMKLVSLRTL_SSVHS
RXRα   SKGLSNPAEVEALREKVYASLEAYCKHKYPEQPGRFAKLLLRLPALRSIGLKCLEHL

PPARα  QLVQIIKKTESDAALHPLLQEIYRDNY
PPARδ  QMMQRIKKTETETSLHPLLQEIYKDMY
PPARγ  QLLQVIKKTETDNSLHPLLQEIYKDLY
LXRα   EQVFALRLQDK_KLPPLLSEIWDVHE
RXRα   FFFKLIGDTPIDTFLMEMLEAPHQMT

Figure 1. Primary sequence alignment of PPARα, PPARγ, PPARδ, LXRα, and RXRα LBDs.
considerably lower than for the other constructs, resulting in precipitation of LXRα in the range of 0.5 mg/ml.

PPARα, PPARγ, PPARδ, and RXRα all exhibited classic α-helical structure by CD spectroscopy (Figure 2). None of the four proteins had significant ellipticities in the aromatic region and all began downslopes near 240 nm. Minima at 222 nm and 208 nm were present in each scan with a crossover point at or near 200 nm. However, the spectrum for LXRα did not exhibit characteristic α-helical traits. The spectrum did not agree well with any major structural class and showed a considerable amount of scattering in the far UV.

Thermal stability of the nuclear receptor LBDs was determined by both CD (Figure 3) and DSC melts. All of the proteins except LXRα showed secondary
Figure 2. CD spectra of A, PPARα; B, PPARγ; C, PPARδ; D, RXRα; and E, LXRα LBDs. [θ] denotes molar ellipticity in deg cm²/dmol, converted from raw ellipticity units. Data are averaged from multiple scans in each case and blank subtracted. Each panel plots [θ] against wavelength (nm) from 200 to 300 nm.
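The conversion from raw ellipticity to molar ellipticity mentioned in the methods and in the caption of Figure 2 is not written out in the text. A minimal sketch follows, assuming the common convention [θ] = θ(mdeg) / (10 · c · l) with c in mol/L and l in cm, which yields deg cm²/dmol; the function name and the example values are hypothetical.

```python
def molar_ellipticity(theta_mdeg, conc_molar, path_cm):
    """Convert raw ellipticity (millidegrees) to molar ellipticity (deg cm^2/dmol).

    theta_mdeg : blank-subtracted ellipticity in millidegrees
    conc_molar : protein concentration in mol/L
    path_cm    : cuvette path length in cm (0.1 cm cuvettes were used here)
    """
    # 100 * theta(deg) / (c * l) equals theta(mdeg) / (10 * c * l).
    return theta_mdeg / (10.0 * conc_molar * path_cm)
```

With an assumed reading of -10 mdeg for a 10 micromolar sample in a 0.1 cm cuvette, the function returns a molar ellipticity of about -1.0e6 deg cm²/dmol.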
Labeling of bacterial cells

The labeling of Gram-positive and Gram-negative penicillin binding proteins (PBPs) is outlined in Scheme 1. Log phase bacterial cells were harvested by centrifugation at 12,000 rcf for 10 minutes at 4°C. The cell pellet was washed twice with 50 mM Tris HCl buffer (pH 7.2) and divided into three parts. One part of the pellet (control) was suspended in 500 μL of the Tris buffer, 25 μL of 15% SDS was added, and the suspension was left standing for 5 minutes. The other two parts were each resuspended in 200 μL of buffer and treated with 100-150 μg of dansyl-APA (I) or the monocyclic β-lactam of N-ε-dansyl lysine (II). The cell pellets were incubated for 20 minutes at 37°C. Treated cells were washed 3-4 times with buffer by centrifugation for 5-10 minutes at 12,000 rcf and 4°C to remove excess label, as determined by visual inspection of the supernatant under UV irradiation. The cell pellets were resuspended in 200 μL of the Tris buffer. The
SCHEME 1: LABELING OF PBPS AND PBP-MORPHOGENE PRODUCT (MGP) COMPLEXES BY FLUORESCENCE SPECTROSCOPIC (F/S) PROBES

BACTERIAL CELLS
  -> CONTROL CELLS -> Centrifuge -> SOLUBLE UNLABELED PBPs -> Reverse Phase Chromatography
  -> (1. F/S probe; 2. Centrifuge) LABELED CELLS -> Centrifuge -> SOLUBLE F/S LABELED PBP COMPLEXES -> RPLC -> CHROMATOGRAPHICALLY RESOLVED F/S LABELED PBP COMPLEXES
  -> (1. F/S probe; 2. Centrifuge) LABELED CELLS -> C2N2 -> CROSS-LINKED PBP-MGP COMPLEXES -> Centrifuge -> SOLUBLE F/S LABELED PBP-MGP COMPLEXES -> RPLC -> CHROMATOGRAPHICALLY RESOLVED F/S LABELED PBP-MGP COMPLEXES

SCHEME 2: PROTEIN IDENTIFICATION

CHROMATOGRAPHICALLY RESOLVED F/S LABELED PBP/PBP-MGP COMPLEXES
  -> CHROMATOGRAPHIC FRACTION
  -> Enzymatic Digestion
  -> PEPTIDE FRAGMENTS
  -> MALDI-TOF/LSIMS Mass Spectrometry
  -> Search the Protein Sequence database for multipeptides of individual proteins using the "MS-FIT" Program
  -> Generation of Sequence fragments with matching Mol. wt.(s)
S. Bhardwaj and R. A. Day
of the Tris buffer. The label-only pellet is treated with 25 μL of 15% SDS and left standing for 5 minutes.

C. Cyanogen Treatment

Cyanogen may be obtained commercially or prepared readily in one step from AgCN (10). Cyanogen is far less toxic than HCN, H2S, or CO, but should be handled carefully (11). The headspace over the second F/S labeled pellet was swept out with ~5 mL of cyanogen and allowed to stand at room temperature for 30 minutes. The color of the cell pellet changes from cream to brown and the pellet disintegrates. This treatment was repeated three times with the same amount of C2N2. After the final C2N2 treatment, the cell pellet was subjected to 25 μL of 15% SDS for 5 minutes. The total volume was brought to 1 mL by adding Tris HCl buffer to each set and the cells were lysed by sonication. Cellular debris was removed by centrifugation at 12,000 rcf for 10 minutes at 4°C. The supernatant was filtered through a 0.22 μm nylon membrane filter and diluted 5-10 fold before analysis by HPLC. The unused portion of the labeled supernatant was kept frozen at -80°C.

D. Reverse phase chromatographic separation of penicillin binding proteins (PBPs) and cross-linked PBP-morphogene products (See Scheme 2)

The chromatographic separation of hydrophobic membrane proteins, as well as hydrophilic "hybrid" proteins, was carried out using a Microsorb-MV C18 reverse phase column (Rainin Instruments; 4.6 x 250 mm, 300 Å pore size, 5 μm particle size). The mobile phase for chromatography consisted of 99.9% acetonitrile with 0.1% trifluoroacetic acid as one solvent and 99.9% water with 0.1% trifluoroacetic acid as the other solvent. The sample size was 50-80 μL with a 100 μL loop attached to the injector. A gradient of 0-100% acetonitrile in 60 minutes was used with monitoring at 280 nm and 320 nm. A minimum of ten collections were made and the separated fractions were pooled. The column was cleaned periodically by injecting 100 μL of trifluoroethanol (2-3 times) (12).
The rechromatography of collected protein fractions was done using solvent gradients related to the retention time of that fraction in the original chromatographic profile.

E. Enzymatic digestion of the RPLC purified penicillin binding proteins and "hybrid" proteins

The enzymatic digestion of the purified proteins was carried out as described (13). The enzymes TPCK-treated pancreatic trypsin (EC 3.4.21.4; Type XIII) and TLCK-treated pancreatic chymotrypsin (EC 3.4.21; Type VII) were from Sigma Chemical Company. The amount of proteinase used was ~2% of the concentration of the purified protein. The proteins were dissolved in 200 μL of 0.1 N ammonium bicarbonate buffer (pH 7.0), followed by addition of the enzyme dissolved in deionized water. The enzymatic digestion was carried out by incubating the samples for ~18 hours at 37°C and the reaction was stopped by addition of 10% trifluoroacetic acid solution. The proteins were lyophilized.
Intra-Cellular Protein-Protein Interactions
F. Mass spectral analysis of digested proteins

The samples were analyzed on the VG TofSpec-SE MALDI-TOF mass spectrometer in the reflectron mode with positive ion detection. The samples were spotted on the sample plate in an acetonitrile:water (60:40) or chloroform:methanol:TFA (1:1:0.1) mixture plus ammonium sulfate on alpha-C (α-cyano-4-hydroxycinnamic acid) matrix. The ionization of the samples was carried out with a Nd:YAG laser at 355 nm or a nitrogen laser at 337 nm. Some of the fractions were analyzed by SIMS on a Kratos 890 mass spectrometer equipped with a Phrasor Scientific SIMS source. The mass spectral data were analyzed by the MS-FIT program at the University of California, San Francisco.

III. RESULTS

A.
F/S labeled PBPs

The F/S labeled β-lactams were prepared as described. On the basis of their mode of action against bacteria, I is termed "lytic" and II "non-lytic." That is, diastereomers of II produce the microbiological Liesegang effect (MLEs) (14). I and II each produced characteristic chromatographic patterns in the F/S labeled penicillin interactive proteins, the penicillin binding proteins (PBPs), by RPLC (7). The hydrophobic PBPs elute much later than the cytosolic proteins. When monitored at 320 nm, I-labeled PBPs showed a pattern of eight or nine major peaks (Fig. 1a) and >20 minor peaks. II-labeled PBPs appeared essentially as one peak (Fig. 1b). The major peaks were rechromatographed before further analysis of their proteinase digests by MALDI-TOF and/or SIMS. Two sets of peptides were generated from trypsin and chymotrypsin treatment. Characteristically, each digest showed among the peptides only one F/S labeled peptide, the active site labeled peptide. The entire digest was needed for peptide mass mapping (15).

The details of peptide mass-fingerprinting of only one fraction out of the several analyzed (Fig. 1a) are shown here. Digestion of fraction 1 (Fig. 1a) by the two proteinases was carried out. Shown are the results of MALDI-TOF analyses of the resultant tryptides (Fig. 2a) and chymotryptides (Fig. 2b). After eliminating peaks that were ambiguous either because (a) they could arise from more than one PBP or (b) they could come from the proteinase itself, we were left with a set of peptide identities associated only with PBP1B. A typical result is shown in Table 1. In this case, as in other reports, the mass window used to examine the MALDI-TOF data was less than ±3 amu, consistent with the range used in other studies (15-18). Peptide mass-fingerprinting of the I-labeled PBPs (Fig. 1a) revealed the eight known PBPs of E. coli (Table 2).
Unlike SDS-PAGE analysis of labeled PBPs (19), the elution order by RPLC is not related directly to molecular weight but is dictated by hydrophobicity. II-labeled PBPs presented only one peak (Fig. 1b). Peptide-mass fingerprinting of the chymotryptic and tryptic digests revealed that it was a
Figure 1. Chromatographic Traces of F/S Labeled Proteins from E. coli, monitored at 320 nm: (a) from I treated cells, (b) from II treated cells, (c) from cells treated sequentially with I and C2N2, and (d) from cells treated sequentially with II and C2N2.
complex containing the eight well-known PBPs plus a candidate PBP, PHSE (Table 2).

B. Cyanogen linking of PBPs to other morphogene proteins

The I-treated E. coli cells were treated with C2N2 and again carried through the procedures up to and including peptide mass-fingerprinting. A new set of F/S labeled peaks, intermediate in retention times between the early putative cytosolic proteins and the later eluting PBPs, showed combinations of PBPs and known MGPs (20,21). The chromatographic profile of these "hybrid" proteins is shown in Figure 1c. Space only allows us to show the identity of proteins found in fraction 1 (Table 3). The cyanogen-trapped F/S labeled complexes reveal only PBPs and MGPs. Donachie (21) summarizes their functions.

Chromatography of the E. coli proteins from the sequentially II and cyanogen treated cells (Fig. 1d) showed F/S labeled components. The peptide mass-fingerprinting of proteolytic digests of these purified components showed only PBPs and MGPs (20). In Table 4 are shown the constituent proteins of one component. Found here are all the PBPs, save PBPS and PBP7, and seventeen MGPs. Not shown here are the results from a parallel analysis of the Gram (+) B. subtilis, which gave entirely analogous results with I and II with and without C2N2 treatment (20).

IV. DISCUSSION

A. Specificity

The most significant measurements of cellular processes must be done with no perturbation of the process. While zero perturbation is probably not attainable, minimization of the perturbation is nevertheless important. Cyanogen as a reagent readily diffuses into a cell, and is itself apolar and noninteractive until it participates in a specific reaction. Its specificity is very high for hydration by bi-functional catalysis (6). When that catalysis is provided by a salt-bridge, it is accompanied by conversion of the salt-bridge to a covalent link. This study strongly supports a high specificity for pre-formed salt-bridges. The E. coli K-12 genome has almost completely been sequenced. It contains thousands of structural genes (3469 according to the data base Swiss Prot. r33). Of these, almost 70 have been identified as PBPs and MGPs (21). Without cyanogen treatment only PBPs were found in the F/S labeled proteins. With cyanogen only other MGPs were found covalently linked to the PBPs. That none of the other thousands of proteins became linked provides an extremely stringent internal control. It can be anticipated that this is a general property of protein-protein interactions and that any appropriately labeled protein could become a productive target for identification of proteins that may be interacting with it. It should be noted that in this study the C2N2 treatment
Table 1. Peptides Identified from Mass Spectral Data of Chymotryptides Shown in Fig. 2a Showing Identification of PBP1B. There were 7 matches to P02919, penicillin-binding protein 1B (PBP-1B) (94267.0 Da).

Mass Found   Mass Matched   Delta (Da)   Start   End   Peptide Sequence
                                                 540   (W)IADAPIAL(R)
1017.700     1016.556        1.144       415     422   (F)MQLVRQEL(Q)
1017.700     1019.451       -1.751       734     743   (L)YGASGAMSIY(Q)
1099.600     1100.610       -1.010       216     225   (F)VPRSGFPDLL(V)
1594.300     1592.811        1.489       202     215   (L)ITMISSPNGEQRLF(V)
2026.840     2027.997       -1.157       129     145   (L)EATQYRQVSKMTRPGEF(T)
2435.030     2437.040       -2.010       787     808   (W)TSDPQSLCQQSEMQQQPSGNPF(D)

Unmatched masses: 824.2, 1126.7, 1879.1, 2515.1
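The matching step behind Table 1 can be illustrated with a short calculation. This is only a sketch of the general idea of peptide mass matching, not the MS-FIT implementation: it assumes monoisotopic [M+H]+ masses computed from standard residue masses and uses the ±3 amu window quoted in the text. The function names and the small candidate list are hypothetical; the peptide sequences and masses are those listed in Table 1.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free termini
PROTON = 1.00728   # singly protonated ion, [M+H]+


def mh_plus(peptide):
    """Monoisotopic [M+H]+ mass of an unmodified peptide sequence."""
    return sum(MONO[residue] for residue in peptide) + WATER + PROTON


def match(observed, candidates, tol=3.0):
    """Return the candidate peptides whose [M+H]+ lies within tol of observed."""
    return [p for p in candidates if abs(observed - mh_plus(p)) <= tol]


# Three of the chymotryptic peptides listed in Table 1.
candidates = ["MQLVRQEL", "YGASGAMSIY", "VPRSGFPDLL"]
hits = match(1017.700, candidates)
```

Under these assumptions the computed masses agree with the "Mass Matched" column to within a few thousandths of a mass unit, and the observed mass 1017.700 matches both MQLVRQEL and YGASGAMSIY, which is why that value appears twice in the table.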
Table 2. Identification of PBPs of E. coli K12 (ATCC 29079) Labeled with F/S labeled β-lactams. The intact log phase cells were treated with the F/S β-lactams for 10 minutes and then disrupted by sonication. After removal of the debris by centrifugation (10,000 rpm, 10 min.), aliquots of the supernate were separated chromatographically as described (39). The F/S labeled peaks were isolated and rechromatographed on a shallower gradient. The purified peaks were divided and subjected to tryptic and chymotryptic digestion respectively. The digests were analyzed by MALDI-TOF or by SIMS. The resultant mass spectral data were submitted for peptide mass fingerprinting at the UCSF Mass Spectrometry Facility.

F/S β-Lactam   F/S Labeled Chromatographic Fraction   Protein   Mol. Wt.
I                                                     PBP1A     93,636
I                                                     PBP4      51,798
I                                                     PBP1B     94,266
I                                                     PBP7      34,245
I                                                     PBP2      70,856
I                                                     PBP6      43,639
I                                                     PBP3      63,877
I                                                     PBP5      44,444
II                                                    PBP1A, PBP1B, PBP2, PBP3, PBP4, PBP5, PBP7, PHSE
was extreme; C2N2-driven reactions proceed at rates that vary by five orders of magnitude (4,6). For example, the subunits of hemoglobin have salt-bridges among them; it requires ~60 seconds to covalently cross-link them (4). Thus, in this study, if there were any possibility of adventitious salt-bridge formation and conversion to covalently linked functionalities, it should have been seen.
Table 3. Identification of PBPs and MGPs in Fraction 1 Isolated from the Sequential F/S β-Lactam DNS-APA (I) and C2N2 Treated E. coli Log Phase Cells. Parentheses indicate possible but not confirmed by one or more peptides unique to the parenthetic protein.

PBPs
Protein    Mol. Wt.     Protein    Mol. Wt.
PBP1A      93,636       (PBP4)     51,798
PBP1B      94,266       (PBP5)     44,444
PBP2       70,856       PBP6       43,639
PBP3       63,877       PBP7       34,245

Other Morphogene Proteins
Protein    Mol. Wt.     Protein    Mol. Wt.
(FTSH)     70,708       ALR1       39,000
(SLT70)    73,369       (ENVC)     41,317
LON        87,438       MURA       44,817
SECA       101,909      FTSN       45,987
MUKB       176,935      FTSY       54,513

Table 4. Identification of PBPs and MGPs in Fraction 1 Isolated from the Sequential F/S DNS-Monocyclic β-Lactam (II) and C2N2 Treated E. coli Log Phase Cells. Parentheses indicate possible but not confirmed by one or more peptides unique to the parenthetic protein.

PBPs
Protein    Mol. Wt.     Protein    Mol. Wt.
PBP1A      93,636       (PBP4)     51,798
PBP1B      94,266       (PBP5)     44,444
PBP2       70,856       (PBP6)     43,639
PBP3       63,877

Other Morphogene Proteins
Protein    Mol. Wt.     Protein    Mol. Wt.
MUKB       176,935      (FTSY)     54,513
SECA       101,909      ENVC       41,317
(LON)      87,438       MURA       44,817
(FTSH)     70,708       (DDLA)     39,315
SLT70      78,969       ALR1       39,060
(MURE)     53,212       MREB       39,952
B. Membrane proteins

Membrane proteins are difficult to work with. Solubilization in a semifunctional form is possible with the aid of certain non-ionic detergents (3), but the sheath of detergent molecules attached to the hydrophobic regions of these molecules compromises studies of interactions of membrane proteins. Thus, a method applicable to an intact cell or organelle may have some value. The system here involves membrane proteins (PBPs) and cytosolic proteins (most of the other MGPs) and thus demonstrates that such proteins and their interacting ligands can be accessed.

C. Covalently linked complex of penicillin interactive proteins

The non-lytic II labeled a single peak in the chromatographic profile (Fig. 1b). Since the denaturing conditions of isolation and chromatography separate the I-labeled PBPs (Fig. 1a), it is clear that non-lytic II prevents their separation from a covalently bonded complex. As with I-treatment, II-treatment must be followed with C2N2 to show MGP association. More MGPs are associated in the latter case. The inference is that a covalently bonded network is normal and prevents perforation and lysis as had been observed (22); the typical lytic β-lactam causes this dissociation of the complex murasome situated
in a pre-existing opening in the cell wall which is, then, a factor of importance in bacterial cell death. When the complex breaks up the cytosol is released.

D. Sites of Salt Bridging

There are two important consequences of the C2N2 treatment for the analysis. The first is that some peptides that contain cross-linked residues will be diminished and may not appear in the MS-FIT analysis. The second is that such cross-linked residues identify the salt-bridged site(s); in the first phase of this study we identified only the interacting proteins.

E. Limitations of the Method

Thus far no data have been found which indicate that C2N2 produces inter-molecular condensations between proteins that have no normal association to form complexes. No such condensation has been detected either in the published papers cited here or in an extensive unpublished body of experiments from this laboratory in which conditions had been optimized for inter-molecular amide bond formation. The only important limitation arises from the peptide mass fragment analysis. It appears to be completely unambiguous for single proteins (Table 2). As the number of proteins in a mixture or in a complex increases, the number of unambiguous M/Z values from the MALDI-TOF analysis decreases. An important observation is that when ambiguities arose in this study, only MGPs appeared in the C2N2-driven association. None of the other ~3400 proteins showed up. At this stage of development, it is clear that maximum complexity is limiting. For the ~70 MGP system there is a need to follow the time course of condensation not only to deal with complexity, but more importantly to collect information about rates and degree of association.

V. CONCLUSION

There is no way to know what fraction of protein-protein interactions involve salt-bridges; however, recent crystallographic studies of protein-protein complexes of cytokines and growth hormone with their receptors show multiple salt bridges at the interfaces (23).
Those that do not have salt bridges or hydrogen-bonded carboxylate carbinol associations (24) must remain undetectable by this technique. However, these early results show a high specificity and a good fraction of the expected interactions. It seems likely in the case of PBP-MGP interactions that we have seen vegetative state complexes and can expect a different set for cell division. When the septasome complex(es) and any other specialized complex(es) are characterized, we feel that a different set of MGPs will be found. This technique also provides a means of identifying each covalently linked, inter-protein salt-bridged site through identification of the peptide component contributed by each protein of the linked pairs and/or multimers. This is an aspect of this ongoing project. Such data will provide more specific
information on how the almost 70 MGP/PBPs mutually modulate each other and cell wall synthesis (20).

ACKNOWLEDGEMENTS

We thank Professor Jayasimhulu for the SIMS analyses and Mr. J. E. Carlson for the MALDI-TOF work. We appreciate being given access to the MALDI-TOF unit by Professor J. Monaco. We appreciate having access to the MASS-FIT program at UCSF and assistance from K. Clauser. The cyanogen aspect of this work was developed in the past under NIH Grant GM42697.

REFERENCES

1. Phizicky, E.M. and Fields, S. (1995) Microbiol. Rev. 59, 94-123.
2. Allen, J.B., Walberg, M.W., Edwards, M.C., and Elledge, S.J. (1995) Trends Biochem. Sci. 20, 511-516.
3. Kyte, J. (1995) Structure in Protein Chemistry, Garland Publishing (New York), p. 520.
4. Day, R.A., Kirley, J., Tharp, R, Flicker, O., Strange, C. and Ghenbot, G. (1989) In (T.E. Hugli, Ed.) Techniques in Protein Chemistry, Academic Press, San Diego, pp. 517-525.
5. Day, R.A., Hignite, A. and Gooden, W.E. (1995) In (J.W. Crabb, Ed.) Techniques in Protein Chemistry VI, Academic Press, San Diego, pp. 435-442.
6. Day, R.A., Tharp, R.L., Madis, M.E., Wallace, J.A., Silanee, A.A., Hurt, P. and Mastruserio, N. (1990) Peptide Res. 3, 169-175.
7. Day, R.A., Ahluwalia, R. and Du, Y. (1994) LC-GC 12, 384-394.
8. Cartwright, S.J., Tan, A.K. and Fink, A.L. (1989) Biochem. J. 263, 905-912.
9. Ahluwalia, R., Day, R.A., and Nauss, J. (1995) Biochem. Biophys. Res. Commun. 206, 577-583.
10. Kirley, J.W., Day, R.A. and Kreishman, G.P. (1985) FEBS Lett. 193, 145.
11. Fassett, D.W. (1983) In (F.A. Patty, Ed.) Industrial Hygiene and Toxicology, 2nd Ed., Interscience Publishers, NY, p. 2003.
12. Bhardwaj, S. and Day, R.A. (1996) submitted to LC/GC.
13. Lee, T.D. and Shively, J.E. (1990) Methods Enzymol. 193, 361-374.
14. Day, R.A., Bhardwaj, S. and Bai, H. (1995) Miami Bio/Technology Short Reports 6, 88.
15. Henzel, W.J., Billeci, T.M., Stults, J.T., Wong, S.C., Grimley, C. and Watanabe, C. (1993) Proc. Natl. Acad. Sci. USA 90, 5011-5015.
16. Pappin, D.J.C., Hojrup, P. and Bleasby, A.J. (1993) Current Biol. 3, 327-332.
17. James, P., Quadroni, M., Carafoli, E. and Gonnet, G. (1993) Biochem. Biophys. Res. Commun. 195, 58-64.
18. Hynes, G., Sutton, C.W., U, S., and Willison, K.R. (1996) FASEB J. 10, 127-147.
19. Waxman, D.J. and Strominger, J.L. (1983) Ann. Rev. Biochem. 52, 825-869.
20. Bhardwaj, S. (1996) Ph.D. Dissertation, University of Cincinnati.
21. Donachie, W.D. (1993) In (M.A. de Pedro, J.-V. Höltje and W. Löffelhardt, Eds.) Bacterial Growth and Lysis. Metabolism and Structure of the Bacterial Sacculus, Plenum, New York, pp. 409-18.
22. Giesbrecht, P., Kersten, T., Madela, K., Grob, H., Blümel, P. and Wecke, J. (1993) in de Pedro et al. op. cit. pp. 393-407.
23. Ealick, S., Thiel, D., le Du, M., Walter, R., D'Arcy, A., Chene, C., Fontoulabis, M., Garotta, G. and Winkler, F. (1996) Prot. Science 5 (Suppl. 1) 59.
24. Karagozler, A.A., Ghenbot, G. and Day, R.A. (1993) Biopolymers 33, 687-692.
Use of Synthetic Peptides in Mapping the Binding Sites for hsp70 in a Mitochondrial Protein Antonio Artigues, Ana Iriarte, and Marino Martinez-Carrion Division of Molecular Biology and Biochemistry. School of Biological Sciences. University of Missouri-Kansas City, Kansas City, MO 64110
I. Introduction The members of the 70-kDa heat shock protein (hsp70) family perform functions that are essential for cell viability, both under normal and stress conditions. Constitutively expressed hsp70s are thought to be involved in the folding and assembly of newly synthesized proteins, disassembly of oligomeric proteins, protein degradation, and the transport of nascent peptide chains across membranes (l, 2). The structure of hsp70 consists of a variable C-terminal peptidebinding domain and a highly conserved N-terminal ATPase domain. The crystal structure of a peptide-C-terminal domain complex shows that the peptide substrate is bound in an extended conformation through numerous interactions from both side chains and backbone groups (3). As with all molecular chaperones, hsp70 shows a remarkable selectivity for unfolded structures, but low specificity towards the sequence of its potential peptide substrates. Among the few consensus features identified in peptides binding to hsp70 with high affinity is the presence of internal hydrophobic residues (3, 4), which agrees with the proposed role of this chaperone in binding to hydrophobic regions of unfolded proteins normally hidden in the native structure. However, a great variety of synthetic peptides, including organellar targeting sequences containing basic residues (5-7) also bind to hsp70. Perhaps the binding sites recognized by hsp70 depend on the specific role fulfilled by the chaperone with a given substrate. In any case, the precise characteristics of the targeting sequences that determine the recognition and binding of hsp70 to its multiple substrates remain largely obscure. The cytosolic (cAAT) and mitochondrial (mAAT) isozymes of aspartate aminotransferase share a significant degree of sequence homology TECHNIQUES IN PROTEIN CHEMISTRY VIII Copyright © 1997 by Academic Press All rights of reproduction in any form reserved.
(63%), and almost identical crystallographic structures (8, 9). The mitochondrial isozyme is synthesized in the cytosol as a precursor protein (pmAAT) with a 29-residue presequence peptide that is required for targeting and import into mitochondria. Despite the broad substrate specificity of hsp70 mentioned before, this chaperone is able to discriminate between the two highly homologous AAT isozymes. Following either synthesis in cell-free extracts (10) or refolding from acid-denatured states, the mitochondrial isozyme binds to hsp70, whereas its cytosolic counterpart does not interact with this molecular chaperone. Thus, these isozymes provide a particularly attractive system to delineate sequence elements that might be binding motifs for hsp70 in pmAAT, but are absent or hidden in cAAT. An initial screening with a series of tetradecameric peptides corresponding to the complete amino acid sequence of pmAAT has led to the identification of several binding regions scattered over the full length of the polypeptide chain.
II. Methods
A. Protein Purification
Purification of pmAAT and cAAT was carried out as previously described (11, 12). After concentration by ultrafiltration (Amicon Centricon, 30,000 molecular weight cutoff), the proteins were transferred to 2 mM Tris HCl, pH 7.5, a low ionic strength buffer suitable for the subsequent unfolding of the proteins at low pH (pH 2.0). Protein concentrations were estimated from the absorbance at 356 nm or 362 nm of the pyridoxal-5'-phosphate (PLP) cofactor bound to either pm- or cAAT, respectively (molar absorption coefficient of 8,500 M⁻¹ cm⁻¹), and Mr = 46,597 for pmAAT or Mr = 46,399 for cAAT. Hsp70 was purified following published procedures (13). Following the last ammonium sulfate precipitation, the protein was exhaustively dialyzed against 25 mM Tris HCl, pH 7.5, 20 mM NaCl, 10 mM β-mercaptoethanol and kept at 4 °C until use. Hsp70 protein concentration was measured using a molar absorption coefficient at 280 nm of 47,800 M⁻¹ cm⁻¹ and Mr = 70,000 (14).
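The concentration estimates above follow the Beer-Lambert law. A minimal sketch of that arithmetic is given below; the 1 cm path length and the absorbance reading of 0.17 are assumptions chosen for illustration, not values from this chapter, and the function names are hypothetical.

```python
# Beer-Lambert sketch: c = A / (epsilon * l).
# Assumptions (not from the chapter): 1 cm cuvette, A356 = 0.17.

EPS_PLP = 8500.0      # M^-1 cm^-1, bound PLP cofactor (from the text)
MR_PMAAT = 46597.0    # g/mol per subunit (from the text)


def molar_concentration(absorbance, epsilon, path_cm=1.0):
    """Return the molar concentration (mol/L) for a given absorbance."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_cm)


def mass_concentration_mg_per_ml(molar, mr):
    """mol/L times g/mol gives g/L, which is numerically mg/mL."""
    return molar * mr


subunit_molar = molar_concentration(0.17, EPS_PLP)            # about 20 uM
subunit_mg_ml = mass_concentration_mg_per_ml(subunit_molar, MR_PMAAT)
print(f"{subunit_molar * 1e6:.1f} uM, {subunit_mg_ml:.2f} mg/mL")
```

The same two functions apply to hsp70 by substituting the 280 nm coefficient (47,800 M⁻¹ cm⁻¹) and Mr = 70,000.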
B. Unfolding and Refolding of pmAAT and cAAT
For the reversible acid unfolding of pmAAT, a stock solution of the enzyme in 2 mM Tris HCl, pH 7.5, was denatured by addition of diluted HCl to pH 2.0, followed by incubation for 90 min at room temperature (15). Refolding of pmAAT was performed by rapid dilution of the unfolded protein to 1.8 µM final concentration in refolding buffer (40 mM
Hepes, 0.1 mM EDTA, 1 mM DTT, 10 µM PLP, pH 7.5) at 10 °C. When studying the effect of hsp70 on the refolding of the enzyme, hsp70 (1.8 µM) was present in the refolding mixture before initiation of refolding by addition of the unfolded protein. For competition studies, hsp70 (1.8 µM) was preincubated with a 70-fold molar excess of each peptide (120 µM) in refolding buffer for 2 to 16 h at 10 °C before addition of acid-denatured pmAAT (1.8 µM). The reaction mixture was incubated for an additional 2 h to allow complete refolding of the pmAAT molecules that were not complexed with hsp70. Binding of a peptide to hsp70 would result in an increase in the fraction of pmAAT molecules that are allowed to recover full catalytic activity. Therefore, peptide competition was examined by determining the recovery of pmAAT activity in the presence of hsp70 and in the presence or absence of peptide. The effect of the peptides on the spontaneous refolding of pmAAT was analyzed by measuring the yield of reactivation in samples containing 120 µM peptide but no chaperone. The yield of reactivation was determined by measuring the recovery of transaminase activity. Data are expressed as percentage relative to the activity of a control sample maintained under identical conditions.
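The readout of the competition assay can be sketched as follows. The first function is the percentage yield described above. The second, which rescales a yield between the hsp70-only baseline and the chaperone-free maximum, is an illustrative normalization introduced here and is not the index the authors report; the example yields of 20% (hsp70 alone) and 75% (no hsp70) are the values quoted with Table I.

```python
def reactivation_yield(sample_activity, control_activity):
    """Percent activity recovered relative to the native control sample."""
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * sample_activity / control_activity


def fraction_of_inhibition_reversed(yield_with_peptide,
                                    yield_hsp70_only=20.0,
                                    yield_without_hsp70=75.0):
    """Hypothetical normalization: 0 = no competition, 1 = full reversal.

    Values above 1 are possible when a yield exceeds the quoted maximum
    within experimental error.
    """
    span = yield_without_hsp70 - yield_hsp70_only
    if span <= 0:
        raise ValueError("maximum yield must exceed the hsp70-only yield")
    return (yield_with_peptide - yield_hsp70_only) / span


# A peptide giving 65% recovery reverses about 82% of the inhibition.
print(round(fraction_of_inhibition_reversed(65.0), 2))
```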
C. Synthesis and Purification of Peptides
The collection of 14-residue peptides spanning the complete amino acid sequence of pmAAT was a generous gift from Dr. B.M. Conti-Tronconi (University of Minnesota). The peptides were synthesized according to Houghten (16). The purity of the peptides was assessed by reverse phase HPLC using a C18 column (Vydac 218TP, 250 x 4.6 mm) and an acetonitrile/water gradient (5 to 70% over 30 min) containing 1% trifluoroacetic acid. The purity of the different peptide preparations ranged from 50 to 95%. Most of the contaminants were truncated peptides randomly missing amino acids as a result of incomplete coupling. For screening purposes, these peptides were used without further purification. Selected peptides were further purified by reverse phase HPLC on a BioRad C18 Hi-Pore RP-318 semipreparative column (250 x 10 mm), using the same gradient as before. Major peaks were collected and the full-length peptide peak was identified by amino acid composition analysis. Peptide concentration was determined based on the molar composition obtained by amino acid analysis. The presequence peptide, MALLHSGRVLSGMAAAFHPGLAAAASARA, was also synthesized as a single 29-mer peptide in an Applied Biosystems 433A peptide synthesizer at the Molecular Core Facility of the School of Biological Sciences and purified by reverse phase chromatography as described above.
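The layout of the peptide collection (14-residue peptides whose ends overlap by 4 residues, as listed in Table I) can be reproduced with a short tiling routine. This is a sketch of the tiling only; the function name is hypothetical, and the test sequence is the first 44 residues of pmAAT as they can be read from peptides p-1 to p-4 in Table I.

```python
def tile_peptides(sequence, length=14, overlap=4):
    """Split a sequence into peptides of `length` residues whose
    consecutive members share `overlap` residues. The last peptide
    may be shorter than `length`."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be >= 0 and smaller than length")
    if not sequence:
        return []
    step = length - overlap
    # Stop once the remaining tail is fully contained in the previous peptide.
    last_start = max(len(sequence) - overlap, 1)
    return [sequence[i:i + length] for i in range(0, last_start, step)]


# Presequence plus the start of the mature protein (from Table I, p-1 to p-4).
N_TERMINUS = "MALLHSGRVLSGMAAAFHPGLAAAASARASSWWTHVEMGPPDPI"

for number, peptide in enumerate(tile_peptides(N_TERMINUS), start=1):
    print(f"p-{number}: {peptide}")
```

With these settings a 430-residue precursor gives 43 peptides, the last one 10 residues long, which matches the size of the collection and the length of p-43.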
D. Enzymatic Activities
The transaminase activity was measured at 37 °C with L-aspartate and α-ketoglutarate as substrates, using a coupled assay with malate dehydrogenase, as described previously (17). To measure the ATPase activity of hsp70, the chaperone (0.5 µM) was incubated at 37 °C in 40 mM Hepes, 45 mM KCl, 120 µM MgCl2, 60 µM ATP, pH 7.5, in the presence or absence of different synthetic peptides (120 µM). At different incubation times, a 20-µl aliquot was withdrawn and assayed for ATP content on a Turner TD 15-e Luminometer, using the ATP bioluminescence assay kit from Sigma and following the manufacturer's instructions. This assay measures the light emitted upon spontaneous decomposition of adenylate-luciferin produced by luciferase from ATP and luciferin substrates (18). When ATP is the limiting reagent, the light emitted is proportional to the ATP present in the sample, and the concentration of ATP can be calculated by reference to an ATP standard curve. The rate of spontaneous hydrolysis of ATP was estimated in samples incubated at 37 °C in the absence of hsp70.
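The data reduction implied by this assay can be sketched as: fit a linear ATP standard curve, convert light readings to ATP concentrations, fit the ATP decay over time, subtract the spontaneous hydrolysis rate, and express the result per mg of hsp70. The code below is one way to do this under stated assumptions: the standard curve is linear, the decay is linear over the sampled interval, and all light readings and time points are invented for illustration. Only the hsp70 concentration (0.5 µM), the starting ATP (60 µM) and Mr = 70,000 come from the text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def atp_from_light(light, std_slope, std_intercept):
    """Invert the standard curve: light = slope * [ATP] + intercept."""
    return (light - std_intercept) / std_slope


def specific_activity(times_min, atp_um, blank_rate_um_per_min, enzyme_um, mr):
    """ATPase activity in nmol ATP hydrolyzed per min per mg of enzyme."""
    slope, _ = fit_line(times_min, atp_um)           # uM/min, negative
    net_rate = -slope - blank_rate_um_per_min        # uM/min = umol/L/min
    enzyme_mg_per_l = enzyme_um * 1e-6 * mr * 1e3    # mol/L * g/mol * mg/g
    return net_rate * 1e3 / enzyme_mg_per_l          # nmol/min/mg


def turnover_per_second(spec_act_nmol_min_mg, mr):
    """Convert nmol/min/mg to mol ATP per mol enzyme per second."""
    return spec_act_nmol_min_mg * 1e-6 * mr / 60.0


# Hypothetical standard curve and time course (illustration only).
std_slope, std_intercept = fit_line([0, 20, 40, 60], [5, 205, 405, 605])
times = [0, 30, 60, 90]                               # min
light = [605.0, 595.34, 585.68, 576.02]               # arbitrary units
atp = [atp_from_light(v, std_slope, std_intercept) for v in light]
activity = specific_activity(times, atp, 0.0, enzyme_um=0.5, mr=70000)
print(round(activity, 2), round(turnover_per_second(activity, 70000), 4))
```

With these invented readings the sketch returns about 0.92 nmol/min/mg, the basal value listed in Table II, which corresponds to a turnover of roughly 0.001 s⁻¹.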
III. RESULTS AND DISCUSSION
A. Effect of hsp70 on the Refolding of pmAAT and cAAT
In vitro refolding of the acid-unfolded isozymes results in the reconstitution of native-like proteins. Figure 1 shows the yield of reactivation of cAAT and pmAAT (1.8 µM) in the absence or presence of hsp70 (1.8 µM). In the absence of hsp70, a significant recovery of activity (70-80%) can be achieved following spontaneous refolding of both proteins. However, when hsp70 is present, the yield of reactivation of pmAAT is reduced considerably, whereas the yield of reactivation of its cytosolic counterpart is not affected. Inhibition of pmAAT refolding by hsp70 results in the formation of insoluble aggregates of the hsp70-pmAAT complex. Hsp70 does not affect the activity of the native protein. This effect of hsp70 is specific, since addition of high concentrations (1 mg/ml) of other unrelated proteins such as bovine serum albumin, aldehyde dehydrogenase, or malic dehydrogenase does not prevent pmAAT refolding and reactivation (data not shown).
Fig. 1. The effect of hsp70 on the refolding of cAAT and pmAAT. Refolding of acid-unfolded cAAT or pmAAT was performed by rapid dilution of the denatured enzymes in the refolding buffer to a final protein concentration of 1.8 µM. When present, hsp70 (1.8 µM) was added to the refolding buffer before initiation of the refolding reaction. After incubation for 120 min at 10 °C, the transaminase activity recovered was measured as indicated under Methods. Reactivation data are expressed relative to that of the native enzyme incubated under identical conditions.
B. Competition by pmAAT Peptides of Binding of Intact Unfolded pmAAT to hsp70
Taking advantage of the fact that hsp70 binds to unfolded pmAAT and markedly reduces the yield of reactivation (from 70% to 20%), we developed a competition assay to search for putative binding sites for hsp70 in the pmAAT polypeptide. In this assay, each peptide in a collection of 43 synthetic tetradecamers spanning the entire amino acid sequence of pmAAT was tested for its ability to compete with unfolded pmAAT for binding to hsp70. Since binding to hsp70 stops refolding of pmAAT, competition by a given synthetic peptide should result in an increase in the fraction of pmAAT activity recovered. Thus, the relative affinity of the different 14-mer peptides for binding to hsp70 was established by comparing the yield of pmAAT reactivation in the presence of hsp70 alone, or hsp70 that had been preincubated with a 70-fold molar excess
Table I. Selective binding of pmAAT peptides to hsp70. Synthetic tetradecamer peptides corresponding to the amino acid sequence of rat liver pmAAT (p-1 to p-43 from the N-terminal to the C-terminal end) were tested for their ability to compete for the binding of unfolded pmAAT to hsp70 as described under Methods. The percentage of pmAAT activity recovered relative to that obtained in the presence of hsp70 alone (20%) represents an index of the magnitude of peptide competition of pmAAT binding to hsp70: [...] > 66 % (+++). The maximum yield of reactivation in the absence of hsp70 is 75 ± 5 %.

Peptide sequence    Number        Activity recovered (a)    Competition
                                  (+ hsp70, %)
none                              20
presequence         pre-p (b) *   58                        [...]
MALLHSGRVLSGMA      p-1 (c)       45                        [...]
SGMAAAFHPGLAAA      p-2 (c)       47                        [...]
LAAAASARASSWWT      p-3 (c)       50                        [...]
SWWTHVEMGPPDPI      p-4 (d)       nd                        nd
PDPILGVTEAFKRD      p-5 *         76                        +++
FKRDTNSKKMNLGV      p-6           37                        [...]
NLGVGAYRDDNGKP      p-7           32                        [...]
NGKPYVLPSVRKAE      p-8           35                        [...]
RKAEAQIAGKNLDK      p-9           20                        [...]
NLDKEYLPIGGLAD      p-10          6                         [...]
GLADFCKASAELAL      p-11          65                        [...]
ELALGENSEVLKSG      p-12          80                        +++
LKSGRFVTVQTISG      p-13          75                        +++
TISGTGALRVGASF      p-14          45                        [...]
GASFLQRFFKFSRD      p-15 *        80                        +++
FSRDVFLPKPSWGN      p-16          38                        [...]
SWGNHTPIFRDAGM      p-17          30                        [...]
DAGMQLQGYRYYDP      p-18          53                        [...]
YYDPKTCGFDFSGA      p-19          43                        [...]
FSGALEDISKIPEQ      p-20          62                        [...]
IPEQSVLLLHACAH      p-21 (e)      nd                        nd
ACAHNPTGVDPRPE      p-22          42                        [...]
PRPEQWKEMAAVVK      p-23 *        80                        +++
AVVKKKNLFAFFDM      p-24          52                        [...]
FFDMAYQGFASGDG      p-25          11                        [...]
SGDGDKDAWAVRHF      p-26          39                        [...]
VRHFIEQGINVCLC      p-27 (e)      nd                        nd
VCLCQSYAKNMGLY      p-28          52                        [...]
MGLYGERVGAFTVV      p-29 (e)      nd                        nd
FTVVCDKAEEAKRV      p-30          69                        +++
AKRVESQLKILIRP      p-31 *        80                        +++
LIRPLYSNPPLNGA      p-32          45                        [...]
LNGARIAATILTSP      p-33          52                        [...]
LTSPDLRKQWLQEV      p-34          34                        [...]
LQEVKGMADRIISM      p-35          46                        [...]
IISMRTQLVSNLKK      p-36          44                        [...]
NLKKEGSSHNWQHI      p-37          33                        [...]
WQHITDQIGMFCFT      p-38 (e)      nd                        nd
FCFTGLKPEQVERL      p-39          52                        [...]
VERLTKEFSVYMTK      p-40 *        70                        +++
YMTKDGRISVAGVT      p-41 *        74                        +++
AGVTSGNVGYLAHA      p-42 *        71                        +++
LAHAIHQVTK          p-43 *        20                        [...]

(a) Reactivation data are expressed relative to a sample of native pmAAT incubated under identical conditions.
(b) A 29-residue peptide corresponding to the entire presequence region of pmAAT.
(c) Tetradecameric peptides with 4-residue overlapping regions spanning the presequence and the first five residues of the mature sequence.
(d) Binding of peptide p-4 to hsp70 could not be analyzed by the competition assay due to its strong inhibition of pmAAT refolding. nd, not determined.
(e) Peptides p-21, p-27, p-29 and p-38 show very low solubility in aqueous solutions.
(*) These peptides were selected for further characterization of their hsp70 binding by ATPase activity stimulation assays.
of the peptide. The results of this competition assay are summarized in Table I. Addition of several synthetic peptides (p-5, p-12, p-13, p-15, p-23, p-30, p-31, p-40, p-41, p-42, rated as +++ in Table I) produced a complete reversal of the hsp70-induced reduction in pmAAT refolding, indicating that they bind strongly to the chaperone and thus prevent formation of a hsp70-pmAAT complex. Another group of peptides (labeled as ++ or + in Table I) induce only a partial recovery of pmAAT activity in the presence of hsp70, suggesting a lower affinity for binding to hsp70. Finally, equivalent concentrations of several peptides (those given the lowest ratings in Table I) had very little or no effect on the interaction of pmAAT with hsp70, indicating that they do not bind to the chaperone. This competition assay would not be feasible if the synthetic peptides interfered with the spontaneous refolding of pmAAT. This was tested by monitoring the yield of reactivation in the presence of concentrations of peptide similar to those used in the competition experiments (120 µM) but without hsp70. Among the 43 tetradecamers tested, only p-4, whose sequence corresponds to the N-terminal peptide of the mature portion of pmAAT, had a marked effect on the recovery of pmAAT activity (Table I). In the presence of this peptide, the yield was reduced from about 70% to 8%, and there was extensive aggregation of the refolding polypeptide. Consequently, binding of this peptide to hsp70 could not be tested using the competition assay. Since in the native pmAAT dimer this N-terminal peptide interacts strongly with a hydrophobic pocket on the surface of the neighboring subunit (8, 9), the presence of an excess of the synthetic peptide may interfere with the dimerization step in the folding pathway.
On the other hand, four peptides (p-21, p-27, p-29 and p-38) have a very limited solubility in aqueous solutions and therefore could not be tested at concentrations similar to those used for the other peptides. When used at a lower concentration, they did not show any effect on either the spontaneous refolding of pmAAT or its binding to hsp70.
The peptide sequences with highest affinity for binding to hsp70 are not clustered in a specific region of the pmAAT polypeptide chain, but rather are scattered over the entire amino acid sequence of the enzyme. The sequence of these regions shows several of the characteristics described for peptides with high binding affinity to hsp70 (6, 19), such as the presence of hydrophobic and positively charged residues. Moreover, with the exception of the presequence peptides, they are localized within regions of the enzyme that are normally hidden in the folded state of the protein. However, sequence homology analysis of the different high affinity peptides did not allow for the identification of a consensus sequence, which agrees with the known broad specificity of hsp70 for peptide substrates (1, 20, 21). In addition, the majority of the peptides with high binding affinity to hsp70 map to regions in the amino acid sequence of pmAAT having the lowest degree of homology with the corresponding position in the cytosolic homologue. In addition to the collection of tetradecamers having 4-residue overlapping ends (see Table I for sequences), we also tested the competition of a 29-residue peptide corresponding to the entire presequence peptide (pre-p in Table I). Interestingly, the effect of this peptide in the competition assay is more pronounced than that of each of the 14-mer peptides (p-1, p-2, and p-3) containing sequence elements from the same region (see first four entries in Table I). One possible explanation for this different behavior is that the targeting sequence recognized by hsp70 in the intact presequence peptide has been split in the three related shorter peptides. The effect of the presequence peptide is of particular interest since it is unique to the mitochondrial enzyme. The competition of the presequence peptide with pmAAT for binding to hsp70 is concentration dependent, with an apparent affinity constant of about 9.4 µM (data not shown).
Preincubation of hsp70 with saturating concentrations of the presequence peptide also stimulates the ATPase activity of hsp70 (see below, Table II). Binding of other mitochondrial presequences to hsp70 has been recently reported (5, 7).
C. Stimulation of the ATPase Activity of hsp70 by High Affinity Binding Peptides
Hsp70 has a weak ATPase activity, with turnover rates ranging from 0.0004 to 0.0012 s⁻¹ (14). Peptides binding to the C-terminal domain of hsp70 induce a conformational change in the N-terminal domain (6, 23, 24), which leads to a discrete stimulation of the ATPase activity. Therefore, binding of substrates to hsp70 can also be tested by monitoring changes in its ATPase activity. For this reason, we next examined the effect of several of the pmAAT tetradecamers on the ATPase activity of hsp70 using a sensitive bioluminescence assay to monitor the decrease in ATP concentration with time. All of the peptides assayed were repurified by RP-HPLC before use. Stimulation of the ATPase activity correlated well with peptide binding data obtained from competition experiments. The presequence peptide and several of the 14-mer peptides showing maximal competition with pmAAT for binding to hsp70 induced a 1.5 to 2-fold ATPase stimulation (Table II). In contrast, p-43, the C-terminal peptide from pmAAT which did not bind to hsp70 according to the competition assay, showed no stimulation of the chaperone ATPase activity.

Table II. The effect of peptides on the hsp70 ATPase activity. Hsp70 ATPase activity was measured by monitoring the disappearance of ATP substrate over time using a bioluminescence assay as described under Methods. The concentration of the various peptides in the assay mixture was 120 µM.

Peptide    ATPase activity (nmole/min/mg)    Stimulation (a)
none       0.92                              1.00
pre-p      1.70                              1.85
p-5        1.82                              1.98
p-15       1.75                              1.90
p-23       1.34                              1.46
p-31       1.63                              1.77
p-40       1.50                              1.63
p-41       1.30                              1.41
p-42       1.30                              1.41
p-43       1.00                              1.09

(a) Activity in the presence of peptide/basal activity in the absence of peptide.

IV. CONCLUSIONS
Possible hsp70 binding sites on the primary structure of the pmAAT polypeptide have been identified by competition studies in which, prior to the initiation of pmAAT refolding, hsp70 had been preincubated with a series of synthetic tetradecameric peptides spanning the complete sequence of pmAAT. The rationale of this approach was based on two assumptions: i) hsp70 binds peptides in an extended, or at least flexible, conformation, and ii) sequence homology analysis and the use of peptides derived from a known sequence will allow the identification of peptide motifs responsible for the differential interaction of hsp70 with two homologous proteins, pmAAT and cAAT. The first assumption has recently been strengthened by the publication of the crystal structure of the hsp70 peptide binding domain (3). The second has led to the mapping of putative binding sites to polypeptide regions that show maximum sequence divergence between the two isozymes.
[Figure 2: similarity score plotted against sequence position (x-axis: Position, 100 to 400).]
Fig. 2. Structural comparison between cAAT and pmAAT. The average sequence homology between cAAT and pmAAT was calculated using the Plotsimilarity program included in the Wisconsin Package of the Genetics Computer Group suite of programs (version 8.0, 1984) with a window size of seven residues, after the proteins were aligned inserting gaps where necessary to maximize homology. A score of 1.5 corresponds to a region of perfect homology. The dotted line represents the overall average similarity between the two proteins. Horizontal bars indicate the position of peptides showing maximum competition with pmAAT for binding to hsp70. The peptides are identified by numbers as assigned in Table I.
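The windowed similarity profile described in the legend can be approximated with a simple moving average over a pairwise alignment. The sketch below is not the Plotsimilarity algorithm: it scores positions by identity only (1 for a match, 0 otherwise), so perfect homology gives 1.0 rather than the 1.5 of the GCG comparison table. The function name and the toy sequences are hypothetical.

```python
def windowed_similarity(aligned_a, aligned_b, window=7):
    """Centered moving average of per-position identity for two
    aligned sequences of equal length ('-' marks a gap).
    Windows are truncated at the ends of the alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    scores = [1.0 if a == b and a != "-" else 0.0
              for a, b in zip(aligned_a, aligned_b)]
    half = window // 2
    profile = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half), min(len(scores), i + half + 1)
        profile.append(sum(scores[lo:hi]) / (hi - lo))
    return profile


# Toy alignment: one mismatch and one gap lower the local average.
profile = windowed_similarity("MKTAYIAKQR-LG", "MKTAWIAKQRELG", window=7)
print([round(p, 2) for p in profile])
```

Regions where such a profile dips below the overall mean are the ones the figure marks as candidate hsp70 binding sites.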
Mechanistic studies on the structure-function of hsp70 have shown that upon binding of peptides there is a conformational change in hsp70 that results in a slight stimulation of hsp70 ATPase activity. Release of peptide substrates is expected to be a slow step and may require coupling to ATP hydrolysis and possibly the cooperation of other molecular chaperones. Consequently, in the absence of any other cytosolic factors, the binding of peptides to hsp70 is basically irreversible. Considering these properties, several strategies have been used to identify substrate recognition features of hsp70. An initial screening of a battery of peptides derived from pmAAT for their ability to compete with pmAAT for the formation of a complex with hsp70 has allowed for a fast, easy, and accurate identification of protein sequences that efficiently bind to hsp70. Confirmation of the binding of selected peptides has been obtained by measuring the stimulation of the hsp70 ATPase activity as a consequence of the conformational change induced upon substrate binding.
With the exception of the presequence-containing peptides, and in agreement with the generally accepted mechanism of hsp70 action, the peptides that bind with high affinity to hsp70 comprise sequences that are hidden in the native state of the protein. These peptides contain central hydrophobic and basic carboxyl-terminal amino acids, but few acidic residues. More interestingly, a sequence homology comparison of the cytosolic and mitochondrial protein sequences shows that the mitochondrial peptides binding to hsp70 correspond to regions of major sequence dissimilarity between the two isozymes (Figure 2). This suggests that sequence divergences observed between the mitochondrial and cytosolic isozymes may have arisen as a consequence of biochemical specialization to ensure the different interaction of each enzyme with the cellular machinery responsible for protein folding and translocation in vivo, thus promoting efficient import into the organelle of pmAAT and rapid folding in the cytosol of cAAT. Detailed analyses of the binding properties of each peptide, including the accurate determination of the binding affinity of each region as well as the identification of the critical residues involved in the peptide-hsp70 interaction, are in progress. Information gathered from these studies should contribute to a better characterization of putative recognition sites responsible for the distinct interaction of the two isozymes with hsp70.
Bibliography
1. McKay, D. (1993) Advances in Protein Chemistry 44, 67-98.
2. Hendrick, J.D., and Hartl, F.U. (1993) Annu. Rev. Biochem. 62, 349-384.
3. Zhu, X., Zhao, X., Burkholder, W.F., Gragerov, A., Ogata, C.M., Gottesman, M.E., and Hendrickson, W.A. (1996) Science 272, 1606-1614.
4. Flynn, G.C., Pohl, M.T., Flocco, M.T., and Rothman, J.E. (1991) Nature 353, 726-730.
5. Endo, T., Mitsui, S., Nakai, M., and Roise, D. (1996) J. Biol. Chem. 271, 4161-4167.
6. Takenaka, I.M., Leung, S.M., McAndrew, S.J., Brown, J.P., and Hightower, L.E. (1995) J. Biol. Chem. 270, 19839-19844.
7. Schmid, D., Baici, A., Gehring, H., and Christen, P. (1994) Science 263, 971-973.
8. Malashkevich, V.N., Strokopytov, B.V., Borisov, V.V., Dauter, Z., Wilson, K.S., and Torchinsky, Y.M. (1995) J. Mol. Biol. 247, 111-124.
9. Jansonius, J.N., and Vincent, M.G. (1987) In Biological Macromolecules and Assemblies (Jurnak, F., and McPherson, A., Eds.) Vol. 3, pp. 187-285, John Wiley & Sons Inc., New York.
10. Lain, B., Iriarte, A., Mattingly, J.R., Jr., Moreno, J.I., and Martinez-Carrion, M. (1995) J. Biol. Chem. 270, 24732-24739.
11. Altieri, F., Mattingly, J.R., Jr., Rodriguez-Berrocal, F.J., Iriarte, A., Wu, T., and Martinez-Carrion, M. (1989) J. Biol. Chem. 264, 4782-4786.
12. Mattingly, J.R., Jr., Iriarte, A., and Martinez-Carrion, M. (1995) J. Biol. Chem. 270, 1138-1148.
13. Welch, W.J., and Feramisco, J.R. (1985) Molecular and Cellular Biology 5, 14949-14959.
14. Palleros, D.R., Welch, W.J., and Fink, A.L. (1991) Proc. Natl. Acad. Sci. U.S.A. 88, 5719-5723.
15. Artigues, A., Iriarte, A., and Martinez-Carrion, M. (1994) J. Biol. Chem. 269, 21990-21999.
16. Houghten, R.A. (1985) Proc. Natl. Acad. Sci. U.S.A. 82, 5131-5135.
17. Martinez-Carrion, M., Turano, C., Chiancone, E., Bossa, F., Giartosio, A., Riva, F., and Fasella, P. (1967) J. Biol. Chem. 242, 2397-2409.
18. Leach, F.R., and Webster, J.J. (1986) Methods in Enzymology 133, 51-70.
19. Fourie, A.M., Sambrook, J.F., and Gething, M.-J. (1994) J. Biol. Chem. 269, 30470-30478.
20. Glick, B.S. (1995) Cell 80, 11-14.
21. Hightower, L.E., Sadis, S.E., and Takenaka, I.M. (1994) In The Biology of Heat Shock Proteins and Molecular Chaperones (Morimoto, R.I., Tissieres, A., and Georgopoulos, C., Eds.) pp. 179-207, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.
22. Mattingly, J.R., Jr., Iriarte, A., and Martinez-Carrion, M. (1993) J. Biol. Chem. 268, 26320-26327.
23. Takeda, S., and McKay, D.B. (1996) Biochemistry 35, 4636-4644.
24. Park, K., Flynn, G.C., Rothman, J.E., and Fasman, G.D. (1993) Protein Science 2, 325-330.
Interfacing Biomolecular Interaction Analysis with Mass Spectrometry and the Use of Bioreactive Mass Spectrometer Probe Tips in Protein Characterization
Randall W. Nelson, Jennifer R. Krone, David Dogruel, Kemmons Tubbs
Department of Chemistry and Biochemistry, Arizona State University, Tempe AZ 85287-1604
Russ Granzow and Osten Jansson
Pharmacia Biosensor AB, S-751 82 Uppsala, Sweden

OVERVIEW
The past decade has seen the development of new and powerful technologies capable of the accurate characterization of biomolecules with extreme speed and sensitivity. Two of these techniques, Biomolecular Interaction Analysis (BIA) and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF), lend themselves particularly to such analyses; the former is ideally suited for the real-time investigation of biomolecular interactions, the latter finds much use in the qualitative assessment of analytes. Although the two analytical approaches operate on mutually exclusive detection principles (either surface plasmon resonance detection of a refractive index change or the physical determination of the molecular mass of a gas-phase ion), they can share a common denominator: the use of affinity interactions in selecting the analyte. Interfacing of the two thereby creates a unique approach for the investigation of the kinetic parameters of biomolecular interaction (using BIA), and the unambiguous confirmation of the presence of targeted affinity ligands by direct mass analysis (using MALDI-TOF). In other applications, MALDI-TOF analysis can be extended beyond the primary role of protein molecular weight determination by combination with analytical enzymologies. The simplest use of enzymes in combination with MALDI-TOF is digestion of analytes into smaller fragments using endoproteases. The masses of the fragments are then determined in order to confirm or deny the sequence of the protein (or the presence of a given variant of the analyte).
Traditionally, digestions are performed with both the analyte and enzymes in solution. As a result, autolysis signals are frequently observed in the mass spectra. Enzyme autolysis can be eliminated by using proteases immobilized to chromatographic supports, but generally at the expense of speed and sensitivity in analysis. An alternative to using enzymatically active chromatographic supports is to covalently attach enzymes to the surface of the mass spectrometer sample introduction device (probe). The probe device thus serves a two-fold purpose: as the enzymatic agent used for modification of the analyte, and as a sample introduction device into the mass spectrometer. Over the past few years we have been developing new mass spectrometric approaches for the rapid, sensitive, and accurate characterization of proteins. Reported here are some of our findings on the interfacing of Biomolecular Interaction Analysis with mass spectrometry, and the use of enzymatically active - or bioreactive - mass spectrometer probe tips in the characterization of analytes.
INTERFACING BIA WITH MASS SPECTROMETRY
I. Introduction
Biomolecular Interaction Analysis (BIA) is an acronym given to a number of techniques used in the characterization of bio-specific interactions. A form of the technique is based on the non-destructive detection principle of surface plasmon resonance (SPR), and is capable of monitoring the binding of an analyte to a surface-immobilized binding partner in real-time [1]. Briefly, a biosensor surface (chip) comprised of an affinity ligand-derivatized carboxylated dextran layer coupled to a thin gold surface is monitored using SPR while the chip surface is exposed to the complementary affinant. Differences in surface concentration resulting from ligand-affinant interaction are detected as a change in the SPR signal, expressed in resonance units (RU), with 1000 RU corresponding to a surface concentration of ~1 ng/mm². The resulting sensorgrams report the mass quantity of analyte bound to the chip surface as a function of time. Sensorgram data, as a function of analyte concentration, can then be used to determine kinetic parameters, and molar absorptivity constants, of the interaction [2].
[Figure 1: schematic of the sensor chip (100 nm dextran layer on 50 nm gold) probed by a polarized light source for surface plasmon resonance detection (resonance signal), followed by mass analysis (m/z).]
Fig. 1. Biomolecular interaction analysis/mass spectrometry (BIA/MS). Biosensor chips are derivatized with affinant (or used with an affinant of streptavidin) and used in the BIA analysis of biological fluids. The chips are then introduced into a MALDI time-of-flight mass spectrometer and retained ligands analyzed by virtue of molecular weight.
Although BIA is capable of providing pertinent information on ligand binding and kinetics, SPR detection is indirect. As a result, the identity of the bound affinant(s) may not always be certain. This situation can hold particularly true in complex systems where there exists the possibility of binding multiple, or unknown, affinants, either non-specifically or in competition for the surface-bound ligand. MALDI-TOF mass spectrometry differentiates between species by detection of analytes at precise mass-to-charge (m/z) values. When coupled with affinity isolation, this direct detection enables the unambiguous determination, or possible identification, of the retained affinants. Interfacing of BIA with MALDI-TOF thus affords a powerful combination of techniques capable of real-time monitoring of biospecific interactions, and absolute determination of retained analytes. The coupling of BIA with MALDI-TOF mass spectrometry has therefore been explored [3,4,5]. An approach was taken in which BIA analyses were first performed; the retained analytes were then mass analyzed directly from the sensor chips (see Fig. 1).
II. Materials and Methods
A. Biomolecular Interaction Analysis
BIA analyses were performed on a rabbit anti-human IgG/human myoglobin system using a Pharmacia Biosensor BIAcore 2000 (Uppsala, Sweden). Individual flow cells of CM5 (carboxylated dextran) sensor chips were derivatized with polyclonal rabbit anti-human IgG using an amine-coupling protocol described previously [6]. Cyano-stabilized human myoglobin (400 ng/mL) in the presence of human serum albumin (20 mg/mL) was flowed (10 µL/minute, 20 mM HEPES, 0.005% Tween 20, 150 mM NaCl, 5 mM EDTA, pH 7.4 (HBS)) over the anti-myoglobin-derivatized flow cells for times ranging from 30 seconds to 3 minutes while monitoring the SPR signal. After incubation, the flow cell surfaces were (flow) rinsed with HBS for an additional 3 minutes before the chips were de-blocked from the instrument. Chips were dried and stored at ambient temperature until mass spectrometric analysis.
B. MALDI Mass Spectrometry
Approximately 100 nL of a MALDI matrix, α-cyano-4-hydroxycinnamic acid (~50 mM dissolved in 1:2 acetonitrile:1.4% TFA), was applied to each of the four flow cells (500 µm x 2.0 mm) and allowed to air dry. The chips were next introduced into a prototype MALDI time-of-flight mass spectrometer built specifically for analysis of the BIA chips. Briefly, the instrument consists of a linear translation stage/ion source capable of the precise targeting of each of the four flow cells under a focused laser spot (with a spatial resolution on the order of the diameter of the laser spot, ~200 µm). Ions generated during a 4 ns laser pulse (355 nm; Q-switched frequency-tripled Nd:YAG) were accelerated to a potential of 25 kV (in a continuous extraction mode) over a single-stage ion extraction source distance of ~1 cm before entering a 1.5 m field-free drift region. Ion signals were detected using a 2-stage hybrid (channel plate/discrete dynode) electron multiplier. Time-of-flight spectra were produced by signal averaging the individual spectra from 50-100 laser pulses (using a 500 MHz; 500 MS/sec digital transient recorder). Custom software was used in acquisition and analysis of the mass spectra. Spectra were obtained in the positive ion mode and externally calibrated using equine cytochrome c (MW = 12360.7 Da) as a standard.
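The relation between flight time and m/z for the instrument geometry given above can be sketched with the ideal field-free drift equation, t = L·sqrt(m / (2zeV)). This is a simplified estimate: it ignores the time spent in the ~1 cm extraction region, the initial ion velocity and the mass of the ionizing proton, so it is not the calibration actually used by the authors. The function name is hypothetical; the physical constants are the standard SI values.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19
DALTON_KG = 1.66053906660e-27


def drift_time_s(mass_da, charge=1, accel_volts=25_000.0, drift_m=1.5):
    """Ideal field-free flight time for an ion of given mass and charge.

    Assumes the ion enters the drift tube with kinetic energy z*e*V
    and that time in the acceleration region is negligible.
    """
    if mass_da <= 0 or charge <= 0:
        raise ValueError("mass and charge must be positive")
    mass_kg = mass_da * DALTON_KG
    velocity = math.sqrt(2 * charge * ELEMENTARY_CHARGE_C * accel_volts / mass_kg)
    return drift_m / velocity


t_cyt_c = drift_time_s(12360.7)   # singly charged calibrant
print(f"cytochrome c: {t_cyt_c * 1e6:.1f} us")
```

Under these assumptions the calibrant arrives after roughly 76 µs, and because flight time scales with the square root of m/z, an ion four times heavier takes twice as long. At 500 MS/sec each digitizer sample spans 2 ns.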
III. Results/Discussion
Sensorgrams of the antibody immobilization and myoglobin binding are shown in Fig. 2. Fig. 2A shows a sensorgram obtained for one of the flow cells during the amine-coupling of the anti-myoglobin IgG to the surface of the CM5 sensor chip. Anti-human IgG (~2 mg/mL in HBS) was flow incubated over the chip surface for ~7 minutes before a ~2 minute rinse with HBS, followed by a ~7 minute blocking with ethanolamine. The final difference in the sensorgram reading of ~15,000 RU translates to ~15 ng of antibody covalently linked to the surface of the 1 mm² area of the flow cell. Considering two binding sites per antibody molecule, a myoglobin binding capacity of 200 fmol is estimated for the flow cell. All four flow cells of the sensor chip were derivatized using identical conditions and resulted in virtually identical sensorgrams (i.e., < 1% deviation in the amount of antibody bound). Fig. 2B shows sensorgrams obtained during the incubation of the anti-myoglobin-derivatized flow cells with human myoglobin. Sensorgrams for flow cells two and three are shown. A difference in the sensorgram signal of ~250 RU translates to approximately 20 fmol of myoglobin retained in flow cell 2 (during the 2.5 minute incubation). The sensorgram signal for flow cell 3 indicates roughly half that amount (~10 fmol) retained during the shorter (1 minute) incubation time.
[Fig. 2 panel titles: (A) Immobilization of anti-human myoglobin on CM5 chip, with the blocking step marked; (B) Human myoglobin bound on CM5 chip. x-axis: Time (sec).]
Fig. 2 Sensorgrams of CM5/anti-human myoglobin IgG/HSA; myoglobin system. (A) Covalent immobilization of IgG to flow cells. A sensorgram reading of 15,000 RU is indicated, corresponding to an antibody binding capacity of ~200 fmol myoglobin. (B) Myoglobin retained by flow cells 2 (FC2) and 3 (FC3). Retention of 20 fmol and 10 fmol of myoglobin is indicated for flow cells 2 and 3, respectively.
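The conversions from sensorgram response to amount of bound protein quoted in the text can be sketched as follows. The sketch assumes the proportionality implied by the text (15,000 RU corresponding to ~15 ng on 1 mm², i.e. ~1000 RU per ng/mm²) and an IgG molecular weight of ~150 kDa, which is an assumption not stated in the text; the function name `ru_to_fmol` is hypothetical.

```python
RU_PER_NG_PER_MM2 = 1000.0   # implied by 15,000 RU ~ 15 ng on 1 mm^2
IGG_MW_DA = 150_000.0        # assumed typical IgG molecular weight


def ru_to_fmol(response_ru, mw_da, area_mm2=1.0):
    """Convert an SPR response (RU) to femtomoles of bound protein."""
    mass_ng = response_ru / RU_PER_NG_PER_MM2 * area_mm2
    return mass_ng * 1e-9 / mw_da * 1e15   # ng -> g -> mol -> fmol


antibody_fmol = ru_to_fmol(15_000, IGG_MW_DA)
print(f"immobilized IgG:      {antibody_fmol:.0f} fmol")
print(f"binding capacity:     {2 * antibody_fmol:.0f} fmol (two sites per IgG)")
print(f"250 RU of myoglobin:  {ru_to_fmol(250, 17_080):.1f} fmol")
print(f"100 RU, 20 kDa:       {ru_to_fmol(100, 20_000):.1f} fmol")
```

With these assumptions the antibody loading gives the 200 fmol binding capacity stated above, and a 250 RU myoglobin response corresponds to roughly 15 fmol, the same order as the approximately 20 fmol quoted for flow cell 2.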
Fig. 3 shows the mass spectra obtained from the direct MALDI-TOF analysis of flow cells 2 and 3 of the anti-myoglobin-derivatized/myoglobin-incubated CM5 sensor chip. Fig. 3 (lower) was one of ca. 5 mass spectra taken from the area within flow cell 2. Significant signal is observed for both the singly- and doubly-charged ion species of the myoglobin. A measured molecular mass of 17,150 ± 15 Da was found for the myoglobin by averaging the centroided mass values of the 5 spectra acquired from flow cell 2. This molecular weight is significantly higher (~0.4%) than that calculated for the mono-derivatized (cyano) myoglobin (MW = 17,080 Da). Considering that the myoglobin ion signals are fairly broad, the shift to higher mass is consistent with the attachment of multiple cyano groups to the myoglobin (creating a heterogeneous sample). Fig. 3 (upper) shows a mass spectrum obtained from within flow cell 3. From the sensorgram it was estimated that ~10 fmol of myoglobin was present within the area of the flow cell. Again, ion signal is readily observed for the myoglobin. A measured mass of MW = 17,160 ± 15 Da was determined for the myoglobin using the average of ca. 5 mass spectra taken from within the area of flow cell 3.
Fig. 3 BIA/MS of CM5/anti-human myoglobin IgG/HSA; myoglobin system flow cells 2 (FC2) and 3 (FC3). Ion signals are observed in both spectra for the singly- and doubly-charged myoglobin. Retention of species other than the myoglobin is observed in flow cell 3 (marked by *), possibly due to non-specific interactions or the specific retention of myoglobin fragments.
A few issues of BIA/MS are worth noting. The first is the comparable sensitivities of the two techniques. BIA analyses registering above the ~100 RU level are generally considered significant. This sensorgram response translates to ~5 fmol of a 20 kDa protein retained over an area of ~1 mm² (the area of a flow cell), an amount generally at the limit of detection of MALDI-TOF analysis (this is, of course, a general statement, as the limits of detection observed during MALDI-TOF are highly dependent on acquisition factors, e.g., matrix and instrument, and on the nature of the analyte). Furthermore, the overall sensitivity of the BIA/MS approach reported here (analysis of retained analytes directly from the sensor chip) is not compromised by sample losses associated with eluting the retained affinants and transferring them to the mass spectrometer. In fact, there was no actual handling of the samples for mass spectrometry beyond the simple application of matrix solution to the sensor chip surface. While making no claims on the universality of the limits of detection, similar studies with other systems have demonstrated BIA/MS limits of detection comparable to, or lower than, those observed here [3,5]. A second aspect of the BIA/MS analysis is the observation of species in the mass spectra other than those targeted. Fig. 3 (upper) shows the presence of a number of lower molecular weight species retained along with the myoglobin. Blank analyses of flow cells derivatized with antibody and incubated with HBS/HSA buffer (no myoglobin) demonstrated the presence of a number of the lower molecular weight species, but not all of those observed in Fig. 3 (upper). A combination of both non-specific retention of background species and specific retention of myoglobin fragments (present in the starting solution) is suggested. Non-specific retention (while of obvious concern) can be compensated for during BIA analysis by blank subtraction or saturation of the sensor chip surface.
That is to say that the BIA analysis is concerned with the change in response, due to the biospecific interactions defined by the immobilized affinity ligand, after a baseline measurement is established. It is not easy, however, to compensate for the specific binding of non-targeted ligands while simultaneously analyzing for targeted ligands. By direct detection of retained species at defined molecular weights, and incorporation of quantitative methodologies [7,8], MALDI-TOF mass spectrometry has the potential to compensate for such competitive binding.
BIOREACTIVE PROBE TIPS IN PROTEIN CHARACTERIZATION
I. Introduction
A particular strength of MALDI-TOF mass spectrometry is the ability to analyze complex biological mixtures with little or no prior sample workup. This ability allows for a number of intricate analyses directed at the characterization of proteins (from primary to quaternary structure, and post-translational modifications). Several such analyses involve the use of enzymes to modify a protein or peptide prior to analysis of the resultant products using MALDI-TOF. More often than not, digestions are performed free in solution, a process which allows the possibility of enzyme autolysis. The resulting autolysis products appear in the mass spectrum as interferences and pose a hindrance to the analysis through potential misinterpretation, or masking, of true analyte signals. A way to eliminate such interferences is to covalently immobilize the enzymes to a solid support, with the complex then used as the enzymatic reagent. For MALDI-TOF analysis, the support of choice is in fact the mass spectrometer probe device, which, when enzymatically-derivatized, serves both to digest the analyte and to introduce the digestion mixture into the mass spectrometer [9,10].
There are several advantages to performing digestions using enzymatically-derivatized mass spectrometer probe tips. First is an overall increase in sensitivity, as sample losses (in transfer and handling) are minimized. Lack of sample loss is critical in maintaining limits of detection throughout the process which are comparable to conventional MALDI analyses (elimination of sample losses is also a contributing factor to the number of proteolytic fragments observed in the mass spectrum during mass mapping). A second advantage is (as stated) the absence of interfering, or background, signals due to autolytic digestion of the enzyme. The enzyme is covalently anchored to the probe surface, preventing association into the MALDI matrix (negating desorption/ionization), and also prohibiting the freedom necessary for autolysis (which would also produce interferences). Third, digestions can be performed on a time scale equivalent to that required for the MALDI analysis (a few minutes). Covalent anchoring of the enzymes is again largely responsible for the ability to perform digestions rapidly, because high effective enzyme concentrations can be used without introducing interferences. Digestion rates can be further increased by using the probe tips at elevated temperatures (accelerating diffusion limited processes and equilibrium kinetics). Lastly, use of enzymatically-derivatized probe devices is quite easy, requiring no more steps than those required for a normal MALDI analysis (application of analyte and matrix to the probe).
Reported here is the use of bioreactive mass spectrometer probe tips to serially digest myoglobin. The object of the serial digestion was to simultaneously view the relative stability of molecular fragments of myoglobin (generated during an initial, limited digestion of the myoglobin under denaturing conditions using pepsin-active tips at low pH), by exposing the fragment set to extensive digestion (using trypsin tips) under re-naturing conditions.
Fig. 4 General approach of the bioreactive MALDI mass spectrometer probe tips. Gold plated probe tips are activated through the covalent attachment of enzymes (the general terminology of Au/enzyme is used to indicate the nature of the activated surfaces). The probe tips are then used for protein characterization by direct application of the analyte and time given for digestion. The digestions are stopped with the addition of a MALDI matrix, the reaction product-matrix mixture is allowed to dry, and the probe tips are inserted into the mass spectrometer for MALDI-TOF analysis.
II. Experimental
A graphic depiction of the experimental process is given in Fig. 4. Stainless steel probe tips were first sputter-coated with ~300 nm of gold, and then activated by treatment with dithiobis(succinimidyl propionate) (DSP)/isopropanol solution (for ~30 minutes). Probe tips were then rinsed vigorously with isopropanol and either used directly (for amine linkage), or further derivatized (for carbodiimide mediated carboxylic acid linkage) by a 15 minute incubation with a solution of ethylene diamine (EDA):isopropanol:triethylamine (40:40:20%). Trypsin was linked through amine coupling by addition of the enzyme (0.1 mg/mL in 20 mM phosphate buffer; pH 8.0) directly to the DSP-derivatized probe tips. Pepsin was linked through carboxylate coupling by addition of the enzyme (0.1 mg/mL in acetate buffer; pH 4.5; 0.1 mg/mL 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide) to DSP/EDA-derivatized gold tips. Tips were prepared in batches (20 - 40) with the reactions performed in 50 mL conical tubes, generally overnight at ~4 °C, using volumes of enzyme solution equal to ~0.5 mL per probe tip. After incubation the tips were washed with liter volumes of ice-cold incubation buffer, dried, and stored at ambient temperature until needed. For clarity, tips are termed Au/enzyme to denote the gold surface and linked enzyme. Whale myoglobin (MW = 17,200.4 Da) was dissolved to 0.01 mg/mL (~0.6 µM) in 20 mM ammonium acetate buffer, pH 2.7, and allowed to stand for ~30 min. A one minute pepsin digestion was performed by application of 3 µL of the myoglobin solution directly to the surface of an Au/pepsin probe tip (maintained in a humidified environment at 60 °C). At the same time, 1.5 µL of a 20 mM phosphate buffer (pH 10) was applied to the surface of an Au/trypsin tip (maintained in high humidity at 60 °C).
After one minute, the tips were touched together, effectively transferring a portion of the peptic digest to the Au/trypsin tip (the combination of the two buffers resulted in a solution pH of ~7.5, as verified with pH paper). Immediately following, 1.5 µL of an α-cyano-4-hydroxycinnamic acid solution (in 1:2; acetonitrile:1.5% TFA (ACCA)) was applied to the ~2 µL of the digest mixture remaining on the Au/pepsin tip. Trypsin digestion was then allowed to proceed for 5 minutes before termination by addition of 1.5 µL of the ACCA matrix. Samples were allowed to air dry prior to insertion of the probes into the mass spectrometer. MALDI-TOF mass spectrometry was performed using a Vestec LaserTec ResearcH linear time-of-flight mass spectrometer (Vestec Corp., Houston, TX), modified to accommodate the probe tips (see Fig. 4), and equipped with a two-stage gridded ion source operating at 30 kV. The rest of the instrument remained unchanged from that described previously [11]. Mass spectra were acquired in the positive ion mode with each spectrum the sum of 50 - 100 individual laser desorption/ionization events. Spectra were externally calibrated using horse heart cytochrome c (MW 12,360.7 Da) as a standard. Mass data were analyzed using protein analytical worksheet software (PAWS) [12].
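The amount of protein consumed per analysis follows directly from the concentrations and volumes given above. A minimal sketch of the arithmetic (helper names are hypothetical, and the calculation assumes the full 3 µL aliquot is delivered to the tip):

```python
def mg_per_ml_to_micromolar(conc_mg_ml, mw_da):
    """mg/mL equals g/L; dividing by g/mol gives mol/L, scaled to micromolar."""
    return conc_mg_ml / mw_da * 1e6


def picomoles_applied(conc_um, volume_ul):
    """micromolar x microliters = picomoles."""
    return conc_um * volume_ul


MYOGLOBIN_MW = 17_200.4          # Da, whale myoglobin (from the text)

conc_um = mg_per_ml_to_micromolar(0.01, MYOGLOBIN_MW)
loaded_pmol = picomoles_applied(conc_um, 3.0)

print(f"myoglobin concentration: {conc_um:.2f} uM")
print(f"applied to Au/pepsin tip: {loaded_pmol:.2f} pmol")
```

This reproduces the ~0.6 µM concentration stated in the text and indicates that less than 2 pmol of myoglobin is applied to the Au/pepsin tip for the entire serial digestion.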
III. Results/Discussion
A combination of enzymatically-active probe tips was used to investigate the regional stability of myoglobin. A set of molecular fragments representing different regions of the protein was first prepared by a limited pepsin digestion of the protein under denaturing conditions (pH ~ 3). The set was then exposed to further, more extensive, degradation (using trypsin) under native conditions (pH ~ 8). Regions of myoglobin that do not exhibit an intrinsic steric shielding by the tertiary structure of the molecule (the molecule being either the intact myoglobin or one of the fragments) are more susceptible to digestion by the trypsin, and therefore, signals representing these fragments are expected to be attenuated in the final mass spectrum. Regions of the myoglobin possessing a tighter tertiary structure (when folded under native conditions) will exhibit a higher degree of immunity to the trypsin digestion, and representative signals in the mass spectrum will be attenuated to a lesser extent.
Fig. 5 One minute Au/pepsin digestion of whale myoglobin under denaturing conditions (pH 3, 60 °C) (A, grey). Peptic fragments digested for 5 minutes using an Au/trypsin probe tip (pH ~8, 60 °C) (B). Select fragments have been completely digested, indicating a relatively low degree of steric hindrance (to tryptic sites) in the final 46 residues of the protein. Ion signals are marked with residue numbers. Region indicated is shown in Fig. 7.
[Fig. 6 panels A to C: fragments plotted against residue number (N-terminus to residue 153); labeled fragments include 30-153, 110-153 and 138-153.]
Fig. 6 Coverage maps derived from the Au/pepsin-Au/trypsin serial digestion of whale myoglobin. (A) Au/pepsin digest fragments. (B) Peptic fragments exhibiting a relatively high immunity to tryptic digestion. (C) Peptic fragments eliminated during trypsin digestion. Residue numbers are as indicated.
Fig. 5A shows the results of myoglobin digested under denaturing conditions using an Au/pepsin tip. Strong ion signals representing fragments due to cleavage of the myoglobin at five sites, residues 29, 69, 106, 109, and 137, are observed. All peptic fragments contain between one and nine trypsin cleavage sites. Upon exposure to an Au/trypsin tip (Fig. 5B), select fragments in the peptic mixture are observed to undergo complete digestion, whereas others exhibit a relative immunity to digestion. Fig. 6 shows mass coverage maps of the fragments from the pepsin digestion, the fragments surviving the Au/trypsin digestion, and those completely digested by the trypsin. In general, minimal steric shielding of tryptic sites is observed in fragments comprised of the final two helices of the myoglobin. Fig. 7 shows an evolution of tryptic fragments derived from the original pepsin digest. Signals consistent with cleavage at three of the six possible trypsin sites present in the 107 - 153 region of the myoglobin are observed. There are no other strong ion signals in the Au/pepsin; Au/trypsin spectrum due to both pepsin and trypsin digestion of the myoglobin (other signals in the spectrum are consistent with cleavage at trypsin sites, as confirmed by Au/trypsin digestion of myoglobin). This observation, and the survival of numerous fragments containing residues 1 - 106, is consistent with the steric inaccessibility of sites within the central, heme-coordinated region of the molecule (independent of the final 46 residues of the molecule).
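Because the pepsin digestion is limited, any contiguous stretch between two of the observed cleavage positions (or a terminus) is a candidate fragment. The sketch below enumerates that candidate set for the five sites named above. It assumes that "cleavage at residue n" means cleavage on the C-terminal side of residue n (consistent with the 30-153, 110-153 and 138-153 labels in Fig. 6) and a chain length of 153 residues; the function name is hypothetical, and the list is a search space for mass assignment, not a statement of which fragments were actually observed.

```python
from itertools import combinations


def limited_digest_fragments(length, cleaved_after):
    """All contiguous fragments bounded by termini or cleavage positions.

    Returns (first_residue, last_residue) pairs, 1-based and inclusive.
    The intact chain (1, length) is included as the zero-cleavage case.
    """
    bounds = [0] + sorted(cleaved_after) + [length]
    return [(start + 1, end) for start, end in combinations(bounds, 2)]


PEPSIN_SITES = [29, 69, 106, 109, 137]   # from the text
fragments = limited_digest_fragments(153, PEPSIN_SITES)

print(len(fragments), "candidate fragments")
c_terminal = [f for f in fragments if f[1] == 153 and f[0] > 1]
print("C-terminal fragments:", c_terminal)
```

Each candidate's residue range can then be converted to a calculated mass from the protein sequence and compared with the observed ion signals, which is the kind of assignment a tool such as PAWS supports.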
[Fig. 7 spectra: x-axis m/z, 1000 to 6000.]
Fig. 7 Mass spectra showing the evolution of proteolytic fragments generated by the successive Au/pepsin-Au/trypsin digestion of myoglobin. Ion signals representing peptic fragments of the myoglobin originating between residues 107-153 (A) are observed to undergo complete tryptic digestion (B), indicating relatively free access to trypsin cleavage sites. Ion signals are marked to indicate proteolytic fragments (by residue).
Obviously, more data are needed in order to make any broader statements on the relative degree of intra-molecular interaction of the myoglobin. However, serial digestions are possible in numerous combinations, and are quite easy to perform using the bioreactive probe tips. Further, digestion of the myoglobin in the presence of denaturants (detergents, salts) is also possible to study the relative accessibility of proteolytic sites, yielding additional information on the overall structure of the molecule [13]. Finally, incorporation of quantitative MALDI-TOF techniques allows the tracking of digestions as a function of time, providing even further insight into the dynamics of digestion (i.e., determination of fragment precursors and final products) and molecular stability [13, 14]. Currently, we are exploring such uses of the bioreactive probe tips, in combination with the defined and accurate mass spectrometric identification of proteolytic fragments, in the study of higher-order protein structure.
FINAL REMARKS
The rapid advancement of analytical technologies such as SPR-based biomolecular interaction analysis (BIA) and MALDI-TOF mass spectrometry has allowed the routine characterization of biomolecules present in complex environments at physiological concentrations. Presented here has been the coupling of the two orthogonal techniques into a combined approach capable of observing real-time, solution-phase biospecific interactions (using BIA), and the rapid qualitative assessment of binding partners (using MALDI-TOF). The combined analysis is performed without compromise of the speed, sensitivity, or accuracy of the constituent techniques, and therefore demonstrates the inception of a new bioanalytical approach: Biomolecular Interaction Analysis Mass Spectrometry (BIA/MS). An additional approach to biomolecular analysis, bioreactive mass spectrometry probe tips, has also been given. These are conceptually and practically simple devices constructed to analytically modify biomolecules prior to mass spectrometric analysis. The bioreactive devices have proven quite convenient in use, and often necessary in maintaining high speed, sensitivity, and accuracy in the mass spectrometric analysis of proteolytic mixtures. An obvious next step is the combination of the two techniques, BIA/MS with bioreactive mass spectrometry probe tips. Such an approach would allow (all on a single surface) the real-time observation of affinity interaction followed by enzymatic modification and mass spectrometric characterization of retained ligands.
REFERENCES
1. Szabo, A., Stoltz, L., and Granzow, R. (1995) Curr. Opinion Struc. Biol. 5, 699-705.
2. Karlsson, R., Roos, H., Fargerstam, Persson, B. (1994) Methods: A Companion to Methods in Enzymology 6, 99-110.
3. Krone, J.R., Nelson, R.W., Dogruel, D., Granzow, R., Williams, P., in Proceedings of the 5th Annual European BIAsymposium, Stockholm, Sweden, September 27-29, 1995, Ed. R. Millett. Pages 173-179.
4. Krone, J.R., Nelson, R.W., Dogruel, D., Williams, P., Granzow, R. (1996) BIAjournal 3, 16-17.
5. Krone, J.R., Nelson, R.W., Dogruel, D., Williams, P., Granzow, R. (1996) Anal. Biochem. In press.
6. BIAapplications Handbook (1994). Chapter 4.
7. Nelson, R.W., McLean, M.A., Hutchens, T.W. (1994) Anal. Chem. 66, 1408-1415.
8. Nelson, R.W., Krone, J.R., Bieber, A.L., Williams, P. (1995) Anal. Chem. 67, 1153-1158.
9. Dogruel, D., Williams, P., Nelson, R.W. (1995) Anal. Chem. 67, 4343-4348.
10. Nelson, R.W., Dogruel, D., Krone, J.R., Williams, P. (1995) Rapid Comm. Mass Spectrom. 9, 1380-1385.
11. Vestec LaserTec ResearcH specification sheet, Vestec Corporation, Houston, TX (1992).
12. Beavis, R.C. Protein Analysis Worksheet Version 6.1.1 (1995).
13. Patterson, D.H., Tarr, G.E., Hines, W.M., Vestal, M.L., Proceedings of the 44th ASMS Conference on Mass Spectrometry and Allied Topics, Portland, Oregon, May 1996. In press.
14. Lewis, J.K., Krone, J.R., Dogruel, D., Williams, P., Nelson, R.W., Proceedings of the 44th ASMS Conference on Mass Spectrometry and Allied Topics, Portland, Oregon, May 1996. In press.
Transition-State Theory and Secondary Forces in Antigen-Antibody Complexes
Mark E. Mummert and Edward W. Voss, Jr.
Dept. of Microbiology, University of Illinois at Urbana-Champaign, Urbana, IL 61801
I. Introduction
Secondary forces, defined as those interactions exhibited outside of the classically defined antibody active site, have been demonstrated to modulate the conformation and free energy of binding of antifluorescein antibodies (1-3). Figure 1 defines and distinguishes primary from secondary interactive components. The ability of the epitopic environment to influence antibody binding has obvious immunological ramifications. Dissection of those interactions that influence the overall dynamics and thermodynamics of a given protein system is of general importance in understanding interfacial protein chemistry. The antifluorescein system is advantageous for evaluating and quantitating interfacial chemistry. Binding of fluorescein ligand in the antifluorescein active site results in bathochromic shifts of the ligand's absorption spectrum and a decrease in both the fluorescence quantum yield and lifetime. These properties allow sensitive spectral and kinetic measurements to be made (4). Changes in the spectral and kinetic properties of a given antifluorescein antibody upon interacting with fluorescein attached to a carrier molecule, compared to fluorescein devoid of carrier residues, thus provide important information about secondary force directed perturbations. Placement of the fluorescein moiety in various environments is easily achieved due to the availability of the highly reactive isothiocyanate derivative of fluorescein. Evaluations of secondary interactive components have been discussed (5-8). In general, the delineation between primary and secondary interactive components has been vague (9). An important advantage of the fluorescein system is that the ligand fills the active site (10-12), which has been conclusively demonstrated by X-ray crystallographic results for the monoclonal antifluorescein antibody (mAb) 4-4-20 (13-15).
Thus, interactions with carrier residues associated with the ligand-carrier complex are by necessity outside of the primary interactions. An understanding of interfacial protein chemistry requires evaluation of the thermodynamics of the system under investigation as well as the energetic barriers responsible for the observed kinetics and affinity. Due to the kinetic methodology available for the antifluorescein system, the energetic barriers for complex decomposition can be evaluated (Figure 2). It is important to note that the kinetic measurements can be conducted in solution under near physiological conditions. Thus, the results obtained can be extrapolated to biological situations. In this report, we summarize the results of a study in which the energetic barriers of several protein/complex decompositions were analyzed utilizing transition-state theory. In essence, fluorescein 5-isothiocyanate was covalently linked to a variety of synthetic peptides and allowed to bind with the well defined high affinity 4-4-20 mAb. Differences in the rates of decomposition were measured at 275 K and 291 K and the height of energetic barriers calculated using classical transition-state analysis (16).
[Figure 1 labels: Antibody variable domains; Carrier environment (highly charged protein or lipid membrane).]
Figure 1. Schematic representation differentiating primary and secondary interactions. Secondary interactions are the result of interactions between regions surrounding the mouth of the active site and regions of the carrier environment surrounding the ligand. A highly charged protein or lipid interface represents an example of substrate exerting secondary effects.
II. Methods and Materials
A. Monoclonal anti-fluorescein antibody 4-4-20
mAb 4-4-20 was produced in ascitic fluid from pristane treated Balb/c mice and affinity purified using a fluorescein Sepharose 4B adsorbent (17,18).
B. Peptide synthesis for use as carriers
Peptides of different chemical composition, acetylated in the amino-terminal position, were synthesized using an Applied Biosystems model 430A peptide synthesizer at the University of Illinois Genetic Engineering Facility (Urbana, IL) employing solid-phase Fmoc chemistry with standard amino acid protecting groups. The generic peptide design was as follows:
Ac-NH-(X)6-K-(X)6-COO⁻
where Ac-NH denotes the acetylated α-amino group, X represents glutamate or arginine, and K is the central lysine residue available for FITC (I) derivatization. Peptides were desalted and purity verified by RP-HPLC. Purified peptides were analyzed by mass spectrometry to verify composition.
[Figure 2 labels: Second Transition State; x-axis: Reaction Coordinate.]
Figure 2. Two dimensional reaction coordinate depicting the interaction of mAb 4-4-20 with homologous ligand. The x-axis is arbitrarily assigned reaction progression while the y-axis is the chemical potential. The height of the chemical potential barriers dictates the rate of the reaction. The encounter complex was included based on kinetic considerations (19).
Monofluoresceinated peptides were synthesized by adding an equimolar concentration of FITC(I) to peptides. The reaction was adjusted to a pH of 10.3 with K2CO3 and incubated at ambient temperature overnight. The resulting reaction mixture was resolved over a P-2 column (Bio-Rad) equilibrated in 0.1 M phosphate, pH 8.0, to remove unreacted fluorescein from the peptides. Fluorescently labeled peptides were analyzed by thin layer chromatography with water saturated methyl ethyl ketone as the solvent system.
C. Determination of unimolecular rate constants
Ligand dissociation rates were determined at 275 K and 291 K utilizing the methodology and analysis described in detail previously (19). This technique provides an essentially unidirectional displacement of the fluorescein/antibody complex.
D. Calculation of transition-state thermodynamic parameters
All calculations have been described in detail elsewhere (3). Transition-state equations can be found in most elementary physical chemistry texts or in the classical work of Wynne-Jones and Eyring (16).
III. Results
A. Monofluoresceinated peptides
Thin layer chromatographic analyses of monofluoresceinated peptides indicated a single fluorescent band for each of the labeled peptides. Rf values were 0.90, 0.85, 0.83 and 0.76 for FDS, D12KF1, R6D6KF1 and R12KF1, respectively.
Table I. Comparative unimolecular rate constants at 275 K and 291 K for the interaction of FDS and monofluoresceinated peptides with mAb 4-4-20

Ligand      k-1a                   k-1b                   k-1b/k-1a
FDS         1.63(±0.02) × 10⁻⁴     1.92(±0.09) × 10⁻³     11.8
D12KF1      3.52(±0.62) × 10⁻³     1.06(±0.19) × 10⁻¹     30.1
R6D6KF1     6.96(±1.02) × 10⁻³     1.15(±0.42) × 10⁻¹     16.5
R12KF1      6.79(±0.25) × 10⁻³     6.08(±0.81) × 10⁻²     8.9

k-1a = unimolecular rate constant at 275 K
k-1b = unimolecular rate constant at 291 K
B. Affinity of mAb 4-4-20 with various ligands
In previous studies (2), the affinity constants (Ka) for the interaction of mAb 4-4-20 with fluorescein and monofluoresceinated peptides were measured at 275 K. The affinities of mAb 4-4-20 for FDS, D12KF1, R12KF1 and R6D6KF1 were 3.14 × 10¹⁰ M⁻¹, 1.49 × 10^[?] M⁻¹, 7.49 × 10^[?] M⁻¹ and 7.55 × 10^[?] M⁻¹, respectively.
C. Unimolecular rate constants
Unimolecular rate constants for decay of the mAb 4-4-20/fluorescein complex and mAb 4-4-20/monofluoresceinated peptide complexes were determined at 275 K and 291 K. The 16 K differential resulted in significant changes in the individual decay rates of the various complexes. The largest change with temperature was with the mAb 4-4-20/D12KF1 complex (30.1-fold), while the smallest change was with R12KF1 (8.9-fold). Importantly, the R6D6KF1 ligand resulted in an approximate average (16.5-fold) of the polyanionic (D12KF1) and polycationic (R12KF1) environments. Table 1 summarizes these results.
D. Relationship between enthalpy and entropy
Table 2 summarizes the calculated transition-state thermodynamic parameters (ΔH‡, ΔS‡ and ΔG‡). The secondary effects that resulted from the carrier molecule caused an apparent enhancement in ΔH‡ and ΔS‡ relative to fluorescein devoid of carrier residues. The enhanced values of ΔS‡ offset the enhanced ΔH‡, with the net effect of lowering the overall energetic barriers (ΔG‡) of the 4-4-20/monofluoresceinated complexes relative to the 4-4-20/fluorescein complex (Table 3).
Table II. Comparative thermodynamic transition-state parameters and transition-state equilibria for the interaction of mAb 4-4-20 with FDS and monofluoresceinated peptides at 275 K

Ligand      ΔH‡             ΔS‡            ΔG‡             K‡
FDS         +23.96±0.06     +0.01±0.00     +20.82±0.07     2.84 × 10⁻¹⁷
D12KF1      +33.28±1.95     +0.05±0.00     +19.15±1.95     6.03 × 10⁻¹⁶
R6D6KF1     +27.32±3.36     +0.03±0.00     +18.77±3.36     1.21 × 10⁻¹⁵
R12KF1      N.A.            N.A.           N.A.            N.A.

ΔH‡ = transition-state enthalpy (kcal/mol)
ΔS‡ = transition-state entropy (kcal/mol/K)
ΔG‡ = transition-state free energy (kcal/mol)
K‡ = transition-state equilibrium (dimensionless)
N.A. = not applicable; does not conform to the theoretical assumptions of transition-state theory
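The entries of Table II can be approximately recovered from the rate constants of Table I with the standard Eyring relations, k = κ(k_B T/h)K‡, ΔG‡ = -RT ln K‡, and ΔH‡ from the temperature dependence of ln(k/T). The sketch below is an illustration of those textbook relations and is not the authors' own calculation (which is described in ref. 3). It assumes κ = 1, rate constants in s⁻¹ (units are not given in the table), and a temperature-independent ΔH‡ between 275 K and 291 K; the function name is hypothetical.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34 # Planck constant, J*s
R_KCAL = 1.987204e-3      # gas constant, kcal/(mol*K)


def eyring_parameters(k_low, k_high, t_low=275.0, t_high=291.0):
    """Return (dH, dS, dG, K_ts) at t_low, assuming kappa = 1.

    dH from the two-temperature Eyring relation
        ln[(k_high/t_high)/(k_low/t_low)] = (dH/R) * (1/t_low - 1/t_high)
    K_ts = k_low * h / (k_B * t_low); dG = -R*T*ln(K_ts); dS = (dH - dG)/T.
    """
    slope_term = math.log((k_high / t_high) / (k_low / t_low))
    d_h = R_KCAL * slope_term / (1.0 / t_low - 1.0 / t_high)
    k_ts = k_low * H_PLANCK / (K_B * t_low)
    d_g = -R_KCAL * t_low * math.log(k_ts)
    d_s = (d_h - d_g) / t_low
    return d_h, d_s, d_g, k_ts


# Rate constants at 275 K and 291 K, from Table I.
table_1 = {
    "FDS": (1.63e-4, 1.92e-3),
    "D12KF1": (3.52e-3, 1.06e-1),
    "R6D6KF1": (6.96e-3, 1.15e-1),
}

for ligand, (k275, k291) in table_1.items():
    d_h, d_s, d_g, k_ts = eyring_parameters(k275, k291)
    print(f"{ligand:8s} dH={d_h:6.2f}  dS={d_s:5.2f}  dG={d_g:6.2f}  K={k_ts:.2e}")
```

Under these assumptions the computed values should fall close to the tabulated ones; small differences are expected from rounding of the published rate constants and from κ values that differ slightly from unity.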
E. κ values
Values for the transmission coefficient (κ) at 275 K were 1.00, 1.02, 1.00 and 0.58 for FDS, D12KF1, R6D6KF1 and R12KF1, respectively. Transition-state theory assumes unity for κ. Deviations of κ from unity indicated poor approximation of the various transition-state thermodynamic parameters. Thus all complex decays were adequately described by transition-state theory, except for the R12KF1 peptide.
IV. Discussion
Understanding those components that influence the interfacial binding properties in protein/protein and protein/ligand interactions is of basic importance in protein chemistry. In this report, we have defined a system that should allow the dissection of those chemical properties that influence primary interactions via an evaluation of the transition-state thermodynamic components. It is important to realize the fundamental assumptions made in the calculations. At the temperatures utilized in these experiments (275 K and 291 K), it was assumed that complexes moved over energetic barriers with standard Arrhenius motion. Deviations from Arrhenius motion (e.g., tunneling) usually result as a consequence of low temperature (20-22). It is also important to realize that the values calculated for ΔH‡, ΔS‡ and ΔG‡ are the upper limits of the system, since solvent was considered as a part of the system (23). This study suggested that secondary forces of the mAb 4-4-20/monofluoresceinated peptide complexes modulated binding interactions via increased transition-state enthalpic and entropic contributions. The net result was a decreased energetic barrier that allowed modulation of the previously reported affinity constants of mAb 4-4-20 for the monofluoresceinated peptides due to variation of the unimolecular rate constant (2).
Mark E. Mummert and Edward W. Voss, Jr.
Table III. Comparative differences in thermodynamic transition-state parameters of monofluoresceinated peptides with respect to FDS at 275 K

Ligand      ΔΔH‡          ΔΔS‡          ΔΔG‡
D12KF1      +9.32±1.95    +0.04±0.00    -1.67±1.95
R6D6KF1     +3.36±3.36    0.02±0.00     -2.05±3.36
R12KF1      N.A.          N.A.          N.A.

ΔΔH‡ = change in transition-state enthalpy with respect to FDS (kcal/mol); ΔΔS‡ = change in transition-state entropy with respect to FDS (kcal/mol/K); ΔΔG‡ = change in transition-state free energy with respect to FDS (kcal/mol); N.A. = not applicable
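The three columns of Table III should be linked by the Gibbs relation ΔΔG‡ = ΔΔH‡ - TΔΔS‡. The short check below is a sketch under that assumption, using only the tabulated mean values at 275 K; the helper name is hypothetical, and the reported uncertainties are ignored.

```python
T = 275.0  # K, temperature of the comparison in Table III

# ligand: (ΔΔH‡ in kcal/mol, ΔΔS‡ in kcal/mol/K, reported ΔΔG‡ in kcal/mol)
table_iii = {
    "D12KF1": (9.32, 0.04, -1.67),
    "R6D6KF1": (3.36, 0.02, -2.05),
}


def ddg_from_components(ddh, dds, temperature):
    """Gibbs relation applied to differences: ΔΔG‡ = ΔΔH‡ - T·ΔΔS‡."""
    return ddh - temperature * dds


for ligand, (ddh, dds, reported) in table_iii.items():
    computed = ddg_from_components(ddh, dds, T)
    print(f"{ligand}: computed {computed:+.2f}, reported {reported:+.2f} kcal/mol")
```

Because ΔΔS‡ is tabulated to only two decimal places, the product TΔΔS‡ carries a rounding uncertainty of roughly ±1.4 kcal/mol at 275 K, so small differences between computed and reported ΔΔG‡ are expected.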
Increased values of ΔH‡ and ΔS‡ for the decay of the mAb 4-4-20/monofluoresceinated peptide complexes relative to the mAb 4-4-20/fluorescein complex were interpreted as resulting from inclusion of the carrier peptides. Increased enthalpic contributions may have resulted from actual binding interactions between the surface-accessible complementarity determining regions (CDRs) surrounding the mouth of the antibody active site and the amino acids of the peptides. Whitlow et al. (15) reported that a significant percentage of the amino acids that compose the mAb 4-4-20 CDRs were solvent accessible when fluorescein was in the active site. The increased values for ΔH‡ also may have been due to differences in hydration of the antibody complexes. Enhanced ΔS‡ values for the antibody/peptide complexes may have been a result of the greater rotational, translational and vibrational degrees of freedom as the complexes decayed relative to the mAb 4-4-20/fluorescein complex. As in the ΔH‡ argument, hydration may also be an important factor to consider. Hydration has been shown to significantly influence the free energy of binding (14). We interpreted the inability of transition-state theory to predict the decay of the mAb 4-4-20/R12KF1 complex to be a result of differential conformational changes. Deviations of κ from unity are a direct result of the inertial (solvent coupling) and diffusive (intramolecular dynamic) regimes (24-27). The frictional coefficient in both of these regimes dictates the value of κ (24,25). Both inertial and diffusive regimes modulate κ in proteins (27-29). We therefore proposed that the mAb 4-4-20/R12KF1 complex could not be evaluated by transition-state theory due to inertial and/or diffusive regimes. We conceived that the secondary forces dictated by R12KF1 resulted in greater perturbation of the antibody variable domains than the secondary forces dictated by either D12KF1 or R6D6KF1.
It was postulated that the greater van der Waals volume of arginine (R, ~148 Å³) as opposed to aspartic acid (D, ~91 Å³) resulted in greater variable domain atomic coordinate displacement and thus enhanced frictional components. In conclusion, the antifluorescein system provides a reasonable model with which to evaluate interfacial interactions utilizing transition-state theory. Evaluations like those presented herein provide means to develop mechanistic models to describe interfacial interaction from an energetic barrier viewpoint.
Transition-State Theory and Secondary Forces in Ag-AB Complexes
References
1. Mummert, M.E. and Voss, E.W., Jr. (1995) Mol. Immunol. 32, 1225-1233.
2. Mummert, M.E. and Voss, E.W., Jr. (1996) Mol. Immunol., in press.
3. Mummert, M.E. and Voss, E.W., Jr. (1996) Biochemistry 35, 8187-8192.
4. Voss, E.W., Jr. (1993) J. Mol. Recog. 6, 51-58.
5. vanOss, C.J. and Absolom, D.R. (1984) In "Molecular Immunology" (Atassi, M.Z., vanOss, C.J. and Absolom, D.R., eds.) pp. 337-360. Marcel Dekker, New York.
6. vanOss, C.J., Good, R.J. and Chaudhury, M.K. (1986) J. Chromatog. 376, 111-119.
7. vanOss, C.J. (1992) In "Structure of Antigens" (Van Regenmortel, M.H.V., ed.) vol. 1, pp. 179-208. CRC Press, Inc., Boca Raton, FL.
8. vanOss, C.J. (1994) In "Immunochemistry" (vanOss, C.J. and Van Regenmortel, M.H.V., eds.) pp. 581-613. Marcel Dekker, New York.
9. vanOss, C.J. (1995) Mol. Immunol. 32, 199-211.
10. Voss, E.W., Jr., Eschenfeldt, W. and Root, R.T. (1976) Immunochemistry 12, 745-749.
11. Omelyanenko, V.G., Jiskoot, W. and Herron, J.N. (1993) Biochemistry 32, 10423-10429.
12. Carrero, J. and Voss, E.W., Jr. (1996) J. Biol. Chem. 271, 5332-5337.
13. Herron, J.N., He, X-m., Mason, M.L., Voss, E.W., Jr. and Edmundson, A.B. (1989) Proteins: Struct., Funct., Genet. 5, 271-280.
14. Herron, J.N., Terry, A.H., Johnson, S., He, X-m., Guddat, L.W., Voss, E.W., Jr. and Edmundson, A.B. (1994) Biophys. J. 67, 2167-2183.
15. Whitlow, M., Howard, A.J., Wood, J.F., Voss, E.W., Jr. and Hardman, K.D. (1995) Prot. Eng. 8, 749-761.
16. Wynne-Jones, W.F.K. and Eyring, H. (1935) J. Chem. Phys. 3, 492-502.
17. Kranz, D.M. and Voss, E.W., Jr. (1981) J. Biol. Chem. 257, 6987-6995.
18. Weidner, K.M., Denzin, L.K., Kim, M.L., Mallender, W.D., Miklasz, S.D. and Voss, E.W., Jr. (1993) Mol. Immunol. 30, 1003-1011.
19. Herron, J.N. (1984) In "Fluorescein Hapten: An Immunological Probe" (Voss, E.W., Jr., ed.) pp. 50-75. CRC Press, Inc., Boca Raton, FL.
20. Frauenfelder, H., Nienhaus, G.U. and Johnson, J.B. (1991) Ber. Bunsenges. Phys. Chem. 95, 272-278.
21. Wolynes, P. (1987) In "Protein Structure: Molecular and Electronic Reactivity" (Austin, R., Buhks, E., Chance, B., DeVault, D., Dutton, P.L., Frauenfelder, H. and Gol'danskii, V.I., eds.) pp. 201-209. Springer-Verlag, Inc., New York.
22. Frauenfelder, H. (1979) In "Tunneling in Biological Systems" (Chance, B., DeVault, D.C., Frauenfelder, H., Marcus, R.A., Schrieffer, J.R. and Sutin, N., eds.) pp. 627-649. Academic Press, Inc., New York.
23. Beece, D., Eisenstein, L., Frauenfelder, H., Good, D., Marden, M.C., Reinisch, L., Reynolds, A.H., Sorensen, L.B. and Yue, K.T. (1980) Biochemistry 19, 5147-5157.
24. Chandler, D. (1978) J. Chem. Phys. 68, 2959-2970.
25. Northrup, S. and Hynes, J.T. (1978) J. Chem. Phys. 69, 5246-5260.
26. Hasha, D.L., Eguchi, T. and Jonas, J. (1982) J. Am. Chem. Soc. 104, 2290-2297.
27. Doster, W. (1983) Biophys. Chem. 17, 97-103.
28. Karplus, M.A. and McCammon, J.A. (1981) FEBS Lett. 131, 34-36.
29. McCammon, J.A. and Karplus, M. (1979) Proc. Natl. Acad. Sci. U.S.A. 76, 3585-3589.
Thermodynamic Investigation of Enzyme and Inhibitor Interactions with High Affinity¹
Yudu Cheng, Jacek Slon-Usakiewicz, Jing Wang, Enrico O. Purisima and Yasuo Konishi
National Research Council of Canada, Biotechnology Research Institute, Montreal, Quebec, Canada
I. Introduction
Enzyme and inhibitor binding interactions may be elucidated through thermodynamic functions such as the free energy (ΔG), enthalpy (ΔH), entropy (TΔS) and heat capacity (ΔCp). These thermodynamic functions are related through the following equation:

$$\Delta G^\circ(T) = \Delta H^\circ(T) - T\Delta S^\circ(T) = [\Delta H^\circ(T^\circ) - T\Delta S^\circ(T^\circ)] + \Delta C_p\,[(T - T^\circ) - T\ln(T/T^\circ)] \qquad (1)$$
In the above equation, ΔG°, ΔH°, ΔS° and ΔCp are the thermodynamic functions relative to a standard state (1.0 mol/L for all chemical species and 25 °C), and T° is the reference temperature (298.15 K in this work). Thermodynamic study plays a major role in assessing the molecular basis of enzyme and inhibitor interactions because the thermodynamic functions convey extensive information, from the binding affinity to the conformational change. In general, ΔG is the affinity between enzyme and inhibitor. ΔH is the binding energy arising from van der Waals interactions, hydrogen bonding, dehydration and other effects (e.g., deprotonation, ion bridges). ΔS measures the loss or gain in the rotational, translational and/or vibrational degrees of freedom in the conformational change and consists of both solvent and conformational contributions. ΔCp measures the temperature dependence of ΔH and ΔS. ΔCp may itself be temperature dependent; in eq 1 it is assumed to be temperature independent for simplicity. We have conducted thermodynamic studies on the interactions of thrombin with its bivalent inhibitors, in which the binding affinity ranges from Ki = 10⁻⁹ to Ki = 10⁻¹² M (Ki is the inhibition constant) (1,2). Thrombin is a key enzyme regulating thrombosis in cardiovascular disease. The synthetic bivalent thrombin inhibitors possess an active-site binding segment, a linker and a fibrinogen recognition exosite (FRE) binding segment, which is based on the C-terminal sequence of hirudin, AspH55-PheH56-GluH57-GluH58-IleH59-ProH60-GluH61-GluH62-TyrH63-LeuH64-GlnH65-OH (HirudinH55-H65; H stands for hirudin). Hirudin is a 65 amino acid protein and a naturally occurring thrombin inhibitor with a Ki value of 2.2 × 10⁻¹⁴ M (3). The crystal structure of the thrombin-hirudin complex (4) indicates that, besides the distinct electrostatic interactions of hirudin and thrombin, the complementary fit of the nonpolar residues seems to be of particular importance. Site-directed mutagenesis (5) and Gly substitution (6) studies of the five nonpolar residues PheH56, IleH59, ProH60, TyrH63 and LeuH64 in the FRE binding segment showed that PheH56 and IleH59 are crucial to binding at the FRE. In order to understand the molecular details of the interactions between the nonpolar residues and thrombin at the FRE, the complete thermodynamic profiles (ΔG°, ΔH°, TΔS° and ΔCp) of the analogs in which the five nonpolar residues PheH56, IleH59, ProH60, TyrH63 and LeuH64 of the thrombin inhibitor P552 (7) are substituted one at a time by Gly were measured and analyzed in conjunction with the structural features obtained from molecular modelling of these substitutions. The results show that the change in the binding free energy (ΔΔG°) due to the Gly substitution correlates linearly with the change in the molecular surface area (ΔΔA) around the Gly substitution site, evidencing the structural basis of the free energy. Meanwhile, the components of ΔΔG°, namely ΔΔH° and TΔΔS°, show no correlation with ΔΔA because of the linear compensation of these two quantities, but are specific to the conformational effects (e.g., the movement of the inhibitor's backbone and neighboring water molecules) due to the Gly substitution. In this article, we describe the procedures we use to measure and analyze the thermodynamic functions in eq 1 for the interaction of thrombin with its inhibitors.

¹NRC publication No. 39931. TECHNIQUES IN PROTEIN CHEMISTRY VIII. Copyright © Government of Canada. All rights of reproduction in any form reserved.
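Eq 1 can be evaluated directly once ΔH°(T°), T°ΔS°(T°) and ΔCp are known. The following is a minimal sketch, assuming a temperature-independent ΔCp as stated in the text; the function name is hypothetical, and the example inputs are the P552 values that the chapter reports later in Table I.

```python
import math

T_REF = 298.15  # K, reference temperature T° used in the text


def delta_g(temperature, dh_ref, tds_ref, dcp, t_ref=T_REF):
    """Free energy (kcal/mol) at `temperature` from eq 1, assuming constant ΔCp.

    dh_ref:  ΔH°(T°), kcal/mol
    tds_ref: T°ΔS°(T°), kcal/mol (the product, as tabulated in the chapter)
    dcp:     ΔCp, kcal/(mol*K)
    """
    ds_ref = tds_ref / t_ref  # recover ΔS°(T°) from the tabulated product
    return (dh_ref - temperature * ds_ref
            + dcp * ((temperature - t_ref)
                     - temperature * math.log(temperature / t_ref)))


# P552 values from Table I: ΔH° = -13.66, T°ΔS° = 1.31 kcal/mol, ΔCp = -0.644 kcal/mol·K.
for celsius in (10, 25, 45):
    kelvin = celsius + 273.15
    print(celsius, round(delta_g(kelvin, -13.66, 1.31, -0.644), 2))
```

At the reference temperature the ΔCp term vanishes and the function returns ΔH° - T°ΔS°, which is the tabulated ΔG°; away from T° the logarithmic term produces the curvature that makes van't Hoff plots nonlinear.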
II. Experimental method
A. Materials
Human α-thrombin and the fluorogenic substrate (Tos-Gly-Pro-Arg-AMC·HCl) were purchased from Sigma. Fmoc derivatives of amino acids were purchased from Advanced ChemTech and Novabiochem. N-α-Fmoc-N-γ-trityl-L-Gln-Wang resin was purchased from Applied Biosystems Inc. The solvents used in peptide synthesis were obtained from Anachemia Chemical Inc. and Applied Biosystems Inc.
B. Peptide synthesis and purification
The thrombin inhibitors are synthesized on a 396 Multiple Peptide Synthesizer (Advanced ChemTech) using a conventional Fmoc strategy of solid-phase peptide synthesis. Double couplings are performed throughout the synthesis. The peptides are purified by preparative HPLC using a linear gradient of 20 to 50% acetonitrile in 0.1% TFA (0.5%/min gradient, 33 mL/min flow rate). The purified products, with >98% purity as estimated by analytical HPLC, are lyophilized. The final peptides are identified using a Beckman 6300 amino acid analyzer and a SCIEX API III mass spectrometer.
C. Enzymatic assay
The inhibition of the amidolytic activity of human α-thrombin is measured using Tos-Gly-Pro-Arg-AMC as a fluorogenic substrate in 50 mM Tris-HCl buffer (pH 7.80) containing 0.1 M NaCl and 0.1% poly(ethylene glycol) 8000 at various temperatures from 10 to 45 °C. Human α-thrombin is stable in this temperature range in the presence of poly(ethylene glycol) (8). The temperature dependence of Km and Vmax is measured at 1-40 µM substrate and 30 pM thrombin. Ki is measured at each temperature at 40 µM substrate, 30 pM thrombin and varying concentrations of inhibitor (0.3- to 100-fold of the Ki value in the temperature range of 10-45 °C). The steady-state velocity of the enzyme reaction is measured at λex = 383 nm and λem = 455 nm in a Hitachi F2000 spectrophotometer and a Perkin Elmer LS50B luminescence spectrometer. The running solution is preincubated at the assay temperature for 15 min. The temperature is controlled and monitored using a HAAKE circulator and a YSI Series 400 probe (±0.1 °C). The reaction is started by adding thrombin, and the progress curve is traced for 5-15 min.
D. Molecular modelling
Energy minimization. The crystal structure of the thrombin-P500 complex (9) is used as the starting structure for molecular modelling. P500 is a bivalent thrombin inhibitor with the sequence dansyl-Arg-(D-Pip)-µAdod-Gly-HirudinH55-H65. Because the residues of interest (PheH56, IleH59, ProH60, TyrH63 and LeuH64) are located only in the FRE binding segment, the bound state of the inhibitors is modelled using the sequence Ac-HirudinH55-H65-NHCH3. The complex, which includes the FRE segment Ac-HirudinH55-H65-NHCH3 together with the thrombin residues and water molecules within 6 Å of any atom of the FRE segment, is energy-minimized again. This refined structure is the starting point for modelling the complexes of thrombin with the analogous inhibitors. The energy minimization for the analogs is conducted only for the atoms in the static substructure set, which includes the residues within 4 Å of the substituted residue. The AMBER force field (10) as implemented in SYBYL 6.1 (Tripos Inc.) is used with a non-bonded cutoff of 8 Å, a dielectric constant of 80 and a gradient convergence tolerance of 0.005 kcal/mol·Å.
Conformational search, Monte Carlo sampling and energy minimization. Because the Gly substitution for PheH56 gave a thermodynamic profile considerably different from those of the other P552 analogs (a small decrease in ΔH°, but large decreases in T°ΔS° and ΔCp), an alternative procedure, which includes a systematic conformational search, Monte Carlo sampling and energy minimization (2), is applied for further study. The rotatable bonds of the backbone and the side chains are varied in 15 degree increments for (PheH56 or GlyH56) and GluH57 and in 30 and 45 degree increments for AspH55 in the complex with thrombin in order to generate a database of sterically feasible conformations. The water molecules are not included in the conformational search but are included in the energy minimization. The conformers in the database are sampled and energy-minimized. This is followed by a clustering step in which all of the energy-minimized conformers are grouped into several clusters, each containing conformers with similar energies and structures. Further energy minimization is conducted only for the lowest-energy conformer in each cluster.
Molecular surface area calculation. The molecular surface is an envelope of a molecule from which the solvent is excluded (11). The molecular surface area is estimated using the GEPOL algorithm (12) with the van der Waals radii used in the AMBER force field (10) and a solvent probe radius of 1.4 Å. The polar molecular surface area is composed of oxygen, nitrogen and polar hydrogens (e.g., NH and OH), and the nonpolar molecular surface area is composed of all other atoms. The molecular surface area of the bound state is calculated using the energy-minimized complex structures. The molecular surface area of the free state is calculated using the geometry of a tripeptide Gly-Xaa-Gly, where Xaa stands for the residues studied. The backbone conformation of the tripeptide is set as ψ = 140° and φ = 140°. The side-chain conformations and their populations are determined from the statistical survey of side-chain conformations in 100 refined protein structures (13).
The change in the molecular surface area (ΔΔA) is estimated using the thermodynamic cycle shown in Scheme I, where E, I and I′ stand for the free enzyme, inhibitor and analog, respectively, and EI and EI′ stand for the complexes of thrombin with the wild-type and mutated inhibitors, respectively. The cycle satisfies ΔΔX = ΔX2 - ΔX1 = ΔX4 - ΔX3, so that the measured relative thermodynamic functions (ΔΔG°, ΔΔH°, T°ΔΔS°) can be analyzed using the predicted structural properties (conformations and ΔΔA).
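Scheme I

E + I    --ΔX1-->   EI
  | ΔX3               | ΔX4
  v                   v
E + I′   --ΔX2-->   EI′

Scheme II

(IIa)  E + S  <==>  ES  -->  P + E       (k1 forward, k2 reverse; kp for product formation)
(IIb)  E + I  <==>  EI  <==>  EI*        (k3 forward, k4 reverse; k5 forward, k6 reverse)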
III. Data analysis
A. Kinetic data transformation
For the system studied, the reactions between enzyme and substrate and between enzyme and inhibitor may be described by Scheme II. According to Scheme IIa, which represents the reaction of enzyme and substrate in the absence of inhibitor, the Michaelis constant (Km) and maximal velocity (Vmax) are given by

$$K_m = [E][S]/[ES] = (k_2 + k_p)/k_1 \qquad (2)$$

and

$$V_{max} = k_p[E] \qquad (3)$$

respectively. [E], [S] and [ES] are the concentrations of enzyme, substrate and enzyme-substrate complex, respectively. The enzymatic parameters, Km and Vmax, are estimated at each temperature by using the equation

$$v = V_{max}[S]/(K_m + [S]) \qquad (4)$$

where v is the velocity of the enzyme and substrate reaction. According to Scheme IIb, which represents slow-binding inhibition, the progress curves of the enzymatic assay in the presence of a competitive inhibitor are analyzed using the following equation (14):

$$P = v_s t + (v_0 - v_s)(1 - e^{-kt})/k \qquad (5)$$

where P is the fluorescence intensity, vs is the steady-state velocity, t is time, v0 is the initial velocity and k is a parameter relevant to the kinetic mechanism (15). The variation of the steady-state velocity (vs) with inhibitor concentration ([I]) obtained by using eq 5 is then used to determine the inhibition constant (Ki) through the following equation (16):

$$v_s = V_{max}[S]/\{K_m(1 + [I]/K_i) + [S]\} + v_c \qquad (6)$$

where Ki = k4k6/[k3(k5 + k6)] represents the overall inhibition constant and vc is a parameter used to account for the deviation from linearity (vc > 0).
Temperature dependence of Km and Vmax. Since both Km and Vmax enter the calculation of the inhibition constant (Ki) at various temperatures, their temperature dependence must be determined beforehand. Figure 1 shows the temperature dependence of the Michaelis constant (Km) and maximal velocity (Vmax) in the range of 10-45 °C. The temperature dependence of Km and Vmax is analyzed using the van't Hoff equation

$$\ln K_m \approx \ln K_d = \Delta G^\circ(T)/RT = [\Delta H^\circ(T) - T\Delta S^\circ(T)]/RT = \{[\Delta H^\circ(T^\circ) - T\Delta S^\circ(T^\circ)] + \Delta C_p[(T - T^\circ) - T\ln(T/T^\circ)]\}/RT \qquad (7)$$

and

$$V_{max} = k_p[E]_T = A[E]_T\,e^{-E/RT} \qquad (8)$$
respectively. The temperature dependence of Km is fairly weak at low temperatures (below 25 °C) but becomes strong at high temperatures. Vmax increases rapidly with temperature. The parameters estimated for eqs 7 and 8 are ΔH° = 12.3 ± 0.5 kcal/mol, T°ΔS° = -5.1 ± 0.5 kcal/mol, ΔCp = -0.80 ± 0.09 kcal/mol·K, A = 9.65 × 10¹¹ s⁻¹ and E = 10.4 kcal/mol. The values of ΔH° and T°ΔS° are in good agreement with those previously published for the same system (17).
Temperature dependence of Ki. Prior to determining the temperature dependence of Ki, the progress curve of the enzymatic assay is analyzed using eq 5 in order to obtain the steady-state velocity. Figure 2 shows the assay data and the fitting results for the thrombin and substrate reaction inhibited by an inhibitor at varying concentrations at 25 °C. The steady-state velocity becomes more evident with increasing inhibitor concentration ([I]) because the inhibitor slows the decrease of the substrate concentration. Figure 3 shows 1/vs vs. [I] for the same system in the inhibitor concentration range of 0-0.269 nM at 25 °C. The parameters of eq 6 for Figure 3 are Vmax = 28.8 µM/s, Km = 5.3 µM, Ki = 0.011 nM and vc = 0.00001 µM/s. As for Km, the variation of the inhibition constant with temperature may be analyzed using the van't Hoff equation (eq 7). Figure 4 shows ln Ki vs. T (K) for P552 and its analogs P552(F56G) and P552(I59G). In general, ln Ki vs. T (K) is nonlinear, which indicates that the binding enthalpy and entropy should be considered temperature dependent. In fact, the fitted values of the heat capacity are large and negative, ranging from -644 to -193 cal/mol·K for the inhibitors listed in Table I. It is noteworthy that a negative heat capacity is characteristic of binding interactions.

Figure 1. Temperature dependence of the Michaelis constant (Km) and maximal velocity (Vmax); horizontal axis, T (K). The solid and dashed lines are the fitting results by using eqs 7 and 8 in the text, respectively.
Figure 2. Progress curves of the reaction of thrombin, substrate and inhibitor (P552) at 25 °C; horizontal axis, t (sec). The solid lines are the fitting results by using eq 5 in the text.
Figure 3. Reciprocal of the steady-state velocity (1/vs) vs. the inhibitor concentration ([I], nM) for P552 at 25 °C. The data are fitted by using eq 6 in the text.
Figure 4. Variation of the inhibition constant (Ki) with temperature (T, K): (A) P552, (B) P552(F56G) and (C) P552(I59G). The solid lines are the fitting results by using eq 7, where Km is replaced by Ki.
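The two model functions used in the fits above (eq 5 for the progress curve and eq 6 for the steady-state velocity) can be written out as a short sketch. The function names are hypothetical; the eq 6 parameters are those quoted in the text for Figure 3 together with the 40 µM substrate concentration of the assay, while the progress-curve inputs and the intermediate inhibitor concentration are invented purely for illustration. No fitting is performed here; in practice these functions would be handed to a nonlinear least-squares routine.

```python
import math


def progress_curve(t, v0, vs, k):
    """Eq 5: signal P(t) for slow-binding inhibition."""
    return vs * t + (v0 - vs) * (1.0 - math.exp(-k * t)) / k


def steady_state_velocity(s, i, vmax, km, ki, vc=0.0):
    """Eq 6: steady-state velocity in the presence of a competitive inhibitor.

    s and km in µM; i and ki in nM; vmax, vc and the result share one unit.
    """
    return vmax * s / (km * (1.0 + i / ki) + s) + vc


# Parameters quoted in the text for Figure 3 (P552, 25 °C, 40 µM substrate).
VMAX, KM, KI, VC, S = 28.8, 5.3, 0.011, 0.00001, 40.0

for inhibitor_nm in (0.0, 0.011, 0.1, 0.269):
    vs = steady_state_velocity(S, inhibitor_nm, VMAX, KM, KI, VC)
    print(f"[I] = {inhibitor_nm:5.3f} nM  ->  vs = {vs:6.2f}")

# Hypothetical progress-curve parameters (v0, vs, k), for illustration only.
for t in (0.0, 100.0, 300.0, 700.0):
    print(t, round(progress_curve(t, v0=25.0, vs=5.0, k=0.01), 1))
```

Eq 5 starts with slope v0 and relaxes to slope vs on a time scale of 1/k, which is why the steady-state portion of the trace becomes easier to identify as the inhibitor concentration rises.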
Thermodynamic data analysis
Table I lists the thermodynamic functions of some inhibitors studied previously. The free energy change, ΔG°, is determined directly from the logarithm of the inhibition constant (i.e., RT ln Ki); the enthalpy, entropy and heat capacity are the fitting results obtained by using eq 7, where Km is replaced by Ki. Based on Scheme I, by using the relative thermodynamic functions (ΔΔG°, ΔΔH°, T°ΔΔS°) given by

$$\Delta\Delta X = \Delta X_m - \Delta X_w \qquad (9)$$

where "m" and "w" stand for the "mutated" and "wild-type" inhibitors, respectively, we are able to analyze the thermodynamic changes due to the mutation in conjunction with the structural features predicted from the molecular modelling. Interestingly, for most analogs the decrease in the binding affinity is attributable to a less favorable enthalpy, whereas for P552(F56G) it is attributable to an unfavorable entropy instead of enthalpy (see Table I). Molecular modelling for P552(F56G) suggests that the backbone located in AspH55-GlyH56-GluH57 undergoes a large movement, in which the Cα of GlyH56 shifts to the Cβ position of the original residue PheH56. Meanwhile, the water molecules with hydrogen bonds to AspH55 and GluH57 move towards thrombin and form some more hydrogen bonds. It is then believed that the movement of both the backbone and the water molecules towards thrombin must compensate for the loss in molecular interactions due to the removal of the phenyl ring of PheH56 and reduce the configurational entropy of the inhibitor around this area. For the analogs other than P552(F56G), no such conformational change is predicted. Correspondingly, the enthalpy and entropy of these analogs exhibit regular changes. Moreover, linear enthalpy-entropy compensation is observed for the congeneric series of bivalent thrombin inhibitors. Figure 5 shows the relative enthalpy (ΔΔH°) vs. the relative entropy (ΔΔS°). The relative free energy (ΔΔG°) and the overall change in the molecular surface area are found to be linearly correlated for the substitution of the nonpolar residues at the FRE:

$$\Delta\Delta G^\circ = 0.0154\,\Delta\Delta A_{net} + 1.34\ \text{(kcal/mol)} \quad (r = 1.00) \qquad (10)$$

where ΔΔAnet = ΔΔAnpl - ΔΔApol ("npl" and "pol" stand for the nonpolar and polar molecular surface area, respectively) and is calculated from the inhibitors in their bound and free states by using Scheme I. It is important to note that enthalpy-entropy compensation is an intrinsic property of solute-solute interaction but is largely enhanced by solvent participation, and this feature of the enthalpy-entropy compensation is essential for the linear correlation between the relative free energy and the change in molecular surface area due to the mutation (18).

Table I. Thermodynamic functions of thrombin and inhibitor interactions (25 °C)*

Inhibitor      Ki            ΔG°            ΔH°            T°ΔS°         ΔCp
P552           0.011±0.0     -14.97±0.02    -13.66±1.12    1.31±1.13     -0.644±0.211
P552(F56G)     5.20±0.32     -11.30±0.03    -12.74±0.69    -1.44±0.70    -0.193±0.131
P552(I59G)     1.94±0.03     -11.88±0.01    -8.08±1.23     3.80±1.24     -0.439±0.232
P552(P60G)     0.39±0.08     -12.84±0.10    -10.73±1.62    2.11±1.64     -0.573±0.305
P552(Y63G)     0.21±0.03     -13.19±0.11    -6.07±1.57     7.12±1.58     -0.623±0.294
P552(L64G)     0.32±0.02     -12.95±0.04    -10.22±1.04    2.73±1.05     -0.532±0.197

*Data are adapted from reference (1). Units: Ki, nM; ΔG°, ΔH°, T°ΔS°, kcal/mol; ΔCp, kcal/mol·K.

Figure 5. Enthalpy-entropy compensation for the congeneric series of bivalent thrombin inhibitors; horizontal axis, ΔΔS° (cal/mol·K). The filled circles are data from reference (1) and the open circles are data from reference (17).
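The relation between the inhibition constants and the free energies in Table I, and the relative free energy of eq 9, can be reproduced with a few lines. This is a sketch assuming ΔG° = RT ln Ki with Ki expressed in mol/L (1 M standard state) at 298.15 K; the function and variable names are hypothetical, and the Ki values are the tabulated means without their uncertainties.

```python
import math

R = 1.987204e-3  # gas constant, kcal/(mol*K)
T = 298.15       # K (25 °C)

# Ki values from Table I, in nM.
ki_nm = {
    "P552": 0.011,
    "P552(F56G)": 5.20,
    "P552(I59G)": 1.94,
    "P552(P60G)": 0.39,
    "P552(Y63G)": 0.21,
    "P552(L64G)": 0.32,
}


def binding_free_energy(ki_molar, temperature=T):
    """ΔG° = RT ln Ki (kcal/mol), with Ki in mol/L."""
    return R * temperature * math.log(ki_molar)


dg = {name: binding_free_energy(ki * 1e-9) for name, ki in ki_nm.items()}

for name, value in dg.items():
    ddg = value - dg["P552"]  # eq 9: mutated minus wild-type inhibitor
    print(f"{name:12s} ΔG° = {value:7.2f}   ΔΔG° = {ddg:5.2f} kcal/mol")
```

The computed ΔG° values agree with the ΔG° column of Table I to within a few hundredths of a kcal/mol, the residual difference coming from the rounding of the tabulated Ki values.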
IV. Summary
The method presented in this article is applicable to enzyme and inhibitor interactions and is particularly useful for systems with high binding affinity. Analysis of the thermodynamic functions in conjunction with the structural features predictable from molecular modelling may reveal the structural origin of the changes in enzyme and inhibitor interactions upon point mutation or substitution. The method may be used to develop strategies for rational inhibitor/drug design and protein/enzyme engineering.
References
1. Cheng, Y., Slon-Usakiewicz, J., Wang, J., Purisima, E., & Konishi, Y. (1996) Biochemistry 35, in press.
2. Wang, J., Szewczuk, Z., Yue, S.-Y., Tsuda, Y., Konishi, Y., & Purisima, E. O. (1995) J. Mol. Biol. 253, 473-492.
3. Stone, S. R., & Hofsteenge, J. (1986) Biochemistry 25, 4622-4628.
4. Rydel, T. J., Ravichandran, K. G., Tulinsky, A., Bode, W., Huber, R., Roitsch, C., & Fenton, J. W., II (1990) Science 249, 277-280.
5. Betz, A., Hofsteenge, J., & Stone, S. R. (1991) Biochemistry 30, 9848-9853.
6. Yue, S.-Y., DiMaio, J., Szewczuk, Z., Purisima, E. O., Ni, F., & Konishi, Y. (1992) Protein Eng. 5, 77-85.
7. Yuko, T., Cygler, M., Gibbs, B. F., Pedyczak, A., Fethiere, J., Yue, S.-Y., & Konishi, Y. (1994) Biochemistry 33, 14443-14451.
8. Borgne, S. L., & Graber, M. (1994) Appl. Biochem. Biotech. 48, 125-135.
9. Fethiere, J., Tsuda, Y., Coulombe, R., Konishi, Y., & Cygler, M. (1996) Protein Sci. 5, 1174-1183.
10. Weiner, S. J., Kollman, P. A., Nguyen, D. T., & Case, D. A. (1986) J. Comp. Chem. 7, 230-252.
11. Richards, F. M. (1977) Annu. Rev. Biophys. Bioeng. 6, 151-176.
12. Pascual-Ahuir, J. L., Silla, E., & Tuñón, I. (1994) J. Comp. Chem. 15, 1127-1138.
13. Blaber, M., Zhang, X., Lindstrom, J. D., Pepiot, S. D., Baase, W. A., & Matthews, B. W. (1994) J. Mol. Biol. 235, 600-624.
14. Segel, I. H. (1975) Enzyme Kinetics: Behavior and Analysis of Rapid Equilibrium and Steady-State Enzyme Systems, pp 100-160, John Wiley & Sons.
15. Morrison, J. F., & Stone, S. R. (1985) Comments Mol. Cell Biophys. 2, 347-368.
16. Morrison, J. (1988) Adv. Enzymol. 61, 201-301.
17. Di Cera, E., De Cristofaro, R., Albright, D. J., & Fenton, J. W., II (1991) Biochemistry 30, 7913-7924.
18. Cheng, Y., Wang, J., Slon-Usakiewicz, J., Purisima, E. O., and Konishi, Y., manuscript in preparation.
Development and characterization of a Fab fragment as a surrogate for the IL-1 receptor
Y. Cong, A. S. McColl, T. R. Hynes, R. C. Meckel, P. S. Mezes, C. L. Lane, S. E. Lee, D. J. Wasilko, K. F. Geoghegan, I. G. Otterness and G. O. Daumy
Central Research Division, Pfizer Inc., Eastern Point Road, Groton, Connecticut 06340

I. Introduction
Interactions between proteins account for a substantial number of biological signaling events. These include classical hormone-receptor interactions at the cell surface, as well as the great number of intracellular interactions revealed by analyses of signal transduction pathways. They represent an attractive but difficult set of potential targets for pharmaceutical intervention. Most successful drugs are compounds of relative mass below 500 Da that are bioavailable when taken orally. Experience has shown that such small compounds, while capable of being bound tightly to pocket-like sites that accommodate natural ligands of similar structure and size, usually cannot achieve binding with sufficiently high affinity to the molecular surfaces of proteins recognized by other proteins (1,2). As a result, it remains to be determined how drugs that disrupt protein-protein interactions can be developed. In these circumstances, it appears worthwhile to create model systems that allow aspects of this problem to be analyzed. We have elected to use a monoclonal antibody that binds to human interleukin-1β (IL-1β) by recognizing amino acid residues that are also recognized by the IL-1β receptor (IL-1R). A monoclonal antibody that recognizes the receptor-binding residues of a cytokine can be considered a surrogate for the cytokine's natural receptor. Such a reagent might be valuable in assessing the structural basis of cytokine-receptor affinity, and could furnish a starting point from which to attempt the design of smaller competitive agents (3, 4). Certain precautions need to be observed at the outset of this effort.
To be an appropriate subject for downsizing, an antibody must achieve critical interactions with at least part of the receptor-binding surface of IL-1 p. This is to exclude selection of an antibody that blocks access of IL-lp to the receptor merely by steric overlap of its molecular bulk with the space occupied by bound receptor (5). Such an antibody, on downsizing, would lead to a compound that fails to compete with IL-ip binding. Here we describe the selection and characterization of an antibody, and its Fab fragment, that provides a suitable starting point for this endeavor. n . Methods A. Cloning and expression oflLlp and its mutant derivatives A clone of the human IL-lp gene, modified to reflect the preferred codon usage of £. coli, was obtained from R&D Systems (Minneapolis, MN) and subcloned into the expression vector pET22b (Novagen, Madison, WI). Site-directed mutagenesis was performed to TECHNIQUES IN PROTEIN CHEMISTRY VIII Copyright © 1997 by Academic Press All rights of reproduction in any form reserved.
523
524
Y.Cong era/.
generate the desired mutations of IL-1β (6). The pelB leader encoded by the vector was eliminated by NdeI digestion and religation, resulting in the final expression construct. The mutations chosen for this study were a cysteine substitution of residue K138 and alanine substitutions of residues R4 and L6. While the K138C mutation does not affect receptor binding (7, 8), the R4A and L6A changes were expected to produce proteins defective in binding to the IL-1R Type I (9-11). Two mutant forms of IL-1β were constructed. One harbored only the single K138C substitution, and was termed "native" to denote the unaltered condition of its receptor-binding surface. The second, termed "mutant #1", harbored the R4A and L6A substitutions in addition to the K138C replacement. The mutations were verified by DNA sequencing.

B. Preparation of wild type rhIL-1β and rhIL-1β mutants

Recombinant wild type hIL-1β, the K138C mutant and the K138C, R4A, L6A triple mutant (mutant #1) were isolated from the soluble fraction of E. coli lysates by ammonium sulfate fractionation and hydrophobic interaction chromatography. The purified proteins were characterized by SDS-PAGE, western blots, N-terminal sequence, size exclusion chromatography (SEC), isoelectric focusing (IEF), matrix assisted laser desorption ionization mass spectrometry (MALDI-MS), and electrospray mass spectrometry (ESMS).

C. Biotinylation of K138C IL-1β mutants

The K138C mutants (~1.5 mg in 0.5 mL of PBS) were treated with 50 mM DTT (removed by gel filtration) and then biotinylated using biotin-maleimide (Sigma) (2:1 molar ratio). The biotinylated product was purified by gel filtration on Superdex 75. Evidence of biotinylation was routinely obtained by western blots probed with an avidin-HRP conjugate (Pierce) and by binding of the proteins to streptavidin-coated BIAcore chips (Pharmacia SA5).

D. Isolation of monoclonal antibody (mAb) and preparation of Fab fragments

BALB/c mice immunized with recombinant human interleukin-1β (rhIL-1β) were the source of splenocytes for the production of mAbs. The X63-Ag8.653 myeloma cell line was used as the fusion partner (12). The mAbs were selected by binding to biotinylated K138C IL-1β immobilized in streptavidin-coated plates. A clone (F18/1E3) that produced an anti-IL-1β (IgG1) with the slowest off-rate when tested on K138C IL-1β immobilized on BIAcore chips was scaled up in two 1 L spinners and the mAb (1E3) was isolated by protein G affinity chromatography. Fab fragments were prepared by digestion with immobilized papain and purified by size exclusion chromatography on Superdex 75 (Pharmacia) after undigested IgG and Fc fragments were first removed by affinity chromatography on protein A.

E. Determination of kinetic constants

Biotinylated K138C and mutant #1 were immobilized onto streptavidin-derivatized biosensor chips (Pharmacia SA5) by direct injection. Kinetic analysis of the binding of soluble IL-1R to the immobilized forms of IL-1β was carried out at low density, i.e. below 100 refractive units (RU). Each binding cycle, with either soluble rhIL-1R (Genzyme) or
Fab fragments as analytes, was performed at a constant flow of 5 µL/min in PBS containing 0.02% Tween. After the binding cycle, regeneration of the chip to its RU baseline was achieved with either 5 mM or 2.5 mM NaOH, depending on the amount of analyte bound (as gauged by RU). Rate constants of dissociation (koff) were determined from analyte-saturated chips in order to minimize rebinding. Kinject, as described in the BIAcore manual, was used to confirm that koff was being measured under conditions free of rebinding. Rate constants of association (kon) were determined at different analyte concentrations and averaged (ave kon). The ratio koff/ave kon was used to determine the dissociation constant (Kd) of the soluble IL-1 receptor and Fab fragments for immobilized K138C and mutant #1. All data analysis was carried out using the BIAevaluation software (Pharmacia). For competition experiments, Fab was chemically coupled to CM5 chips activated by treatment with 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride and N-hydroxysuccinimide, and the binding of wild type IL-1β (29 nM) was measured at different concentrations of soluble IL-1 receptor (0-167 nM).

III. Results

rhIL-1β and its mutant and biotinylated mutant derivatives were characterized biochemically before use. All the proteins appeared homogeneous by SDS-PAGE and SEC, but microheterogeneity was observed by ESMS, MALDI-MS, IEF and N-terminal sequencing. This was largely accounted for by N-terminal variation. Although all of the recombinant forms of IL-1β originally had an N-terminus of Met-Ala-Pro, products with N-termini of Met, Ala and Pro were obtained. Kinetics of the binding of soluble IL-1R to immobilized IL-1β K138C and mutant #1 were measured using a BIAcore system (Table I). Soluble IL-1R exhibited about 8-fold higher affinity for the K138C mutant than for mutant #1.
The higher Kd exhibited for mutant #1 resulted from both a 4-fold lower association rate constant (kon) and a 2-fold higher dissociation rate constant (koff). This experiment confirmed that mutant #1, with the R4A and L6A mutations, was defective in binding the receptor.

Table I. Kinetics of IL-1R binding to biotinylated IL-1β K138C and "mutant #1"

IL-1 form    Mutation           koff (sec^-1)   kon (M^-1 sec^-1)   Kd (nM)
"native"     K138C              0.0037          8.7 x 10^5          4.3
mutant #1    K138C, R4A, L6A    0.0087          2.4 x 10^5          35.5
In a similar fashion, the binding of Fab fragment 103095 to the same two ligands was also assessed (Table II). The Fab fragment bound to IL-1β K138C with Kd ~ 200 nM, i.e. about 50-fold more weakly than the IL-1R exhibited for this ligand. The higher Kd resulted primarily from a 20-fold lower kon. However, the Fab expressed the same relative preference as the IL-1R between the two forms of IL-1β; it recognized the K138C mutant, but failed to recognize the mutant defective in receptor binding.
Table II. Kinetics of Fab binding to biotinylated IL-1β K138C and "mutant #1"

IL-1 form    Mutation           koff (sec^-1)   kon (M^-1 sec^-1)   Kd (nM)
"native"     K138C              0.0069          3.5 x 10^4          197
mutant #1    K138C, R4A, L6A    no binding
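The relation Kd = koff/kon described in the Methods can be checked against the tabulated values with a minimal sketch. The powers of ten for kon are an assumption: they are not legible in this copy and are taken here as 10^5 (receptor) and 10^4 (Fab), the values that reproduce the reported dissociation constants. The dictionary keys and function name are illustrative only.

```python
# Rate constants as read from Tables I and II.
# ASSUMPTION: the kon exponents are inferred from the reported Kd values.
measurements = {
    "IL-1R / K138C": {"koff": 0.0037, "kon": 8.7e5},
    "IL-1R / mutant #1": {"koff": 0.0087, "kon": 2.4e5},
    "Fab / K138C": {"koff": 0.0069, "kon": 3.5e4},
}


def kd_nM(koff, kon):
    """Dissociation constant Kd = koff / kon, converted from M to nM."""
    return koff / kon * 1e9


kd = {name: kd_nM(**rates) for name, rates in measurements.items()}
for name, value in kd.items():
    print(f"{name}: Kd = {value:.1f} nM")

# Fold differences quoted in the text
fold_mutant_vs_native = kd["IL-1R / mutant #1"] / kd["IL-1R / K138C"]
fold_fab_vs_receptor = kd["Fab / K138C"] / kd["IL-1R / K138C"]
print(f"mutant #1 vs native (IL-1R): {fold_mutant_vs_native:.1f}-fold")
print(f"Fab vs IL-1R on K138C: {fold_fab_vs_receptor:.0f}-fold")
```

Under these assumptions the ratios come out close to the "about 8-fold" and "about 50-fold" differences stated in the text; small deviations from the tabulated Kd values would be expected from rounding of the rate constants.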
For competition studies, the Fab fragment was chemically coupled to the BIAcore chip and wild type rhIL-1β was used as analyte. Figure 1 presents the BIAcore sensorgrams obtained when different concentrations of wild type rhIL-1β (0.6-29 nM) were passed sequentially over the immobilized Fab chip. Soluble IL-1R was able to inhibit the binding of wild type rhIL-1β to immobilized Fab (Figure 2). Inhibition increased with increasing levels of soluble receptor until about 75% of the binding signal was suppressed. A residual level of apparent binding was attributed to nonspecific interaction by soluble proteins with the chip.
Figure 1. Sensorgram traces for binding experiments in which different concentrations of rhIL-1β were allowed to bind to anti-IL-1 Fab immobilized on a biosensor chip. The arrow indicates the time at which analyte was injected. (Horizontal axis: time, sec.)
Figure 2. Competitive inhibition by soluble IL-1 receptor of the binding of wild type rhIL-1β to immobilized Fab. (Horizontal axis: IL-1 receptor, nM.)
IV. Discussion

The receptor-binding site of IL-1β has been mapped extensively by site-directed mutagenesis (9-11). These studies have revealed that two regions located about 25 Å apart are important for receptor interactions (Figure 3, see color insert). One region (Patch A) is formed by the loop between β-strands 3 and 4, and is located on the side of the β-barrel. It is comprised of H30 and Q32. The other (Patch B) is located at the open end of the β-barrel, and encompasses a discontinuous set of residues including R4, L6, F46, I56, K93, K103 and E105. The spatial separation of the two regions makes it unlikely that they could be bridged by a designed small molecule or even by an antibody, since antibody epitopes typically encompass four to nine amino acids (13-15). Nevertheless, anti-IL-1β neutralizing antibodies have been shown to block the IL-1β:receptor interaction, and were also shown to bind at different but overlapping regions where the receptor binds (5). Thus, an antibody binding interaction to one of the two domains was strong enough to block the IL-1β:receptor interaction. We sought such an antibody as a first step toward designing a small molecular weight IL-1β antagonist. The traditional method of generating a blocking antibody has been to elicit a series of antibodies against the ligand and then determine which of the antibodies are neutralizing. Instead, we set out to generate a series of antibodies against one of the critical binding patches in the IL-1β:IL-1R interaction. Our criterion for antibody selection was that an antibody should bind well with wild type IL-1β, but poorly against a
mutant IL-1β in which prominent residues of the patch B binding region had been substituted by alanine. We also sought an efficient way to select for antibody that recognized the patch B binding region. It has previously been shown (5, 8) that the site-selectively monobiotinylated IL-1β mutant K138C retains its binding to IL-1R on a streptavidin surface, presumably because biotinylation occurs on a surface residue remote from the two binding regions (Figure 3). We bound monobiotinylated, oriented K138C IL-1β to a BIAcore streptavidin-coated chip and carried out binding studies with IL-1R in the presence of different concentrations of either K138C or wild type IL-1β as competitor. No appreciable differences in binding constants were apparent between K138C and wild type IL-1β, confirming that K138C was fully active in receptor binding using the E14 murine cell line and in the traditional IL-1/LAF bioassay (data not shown). Therefore, use of biotinylated K138C allows coating of a functional, orientated IL-1 on a streptavidin-coated microtiter plate. In a standard antibody selection protocol, IL-1β would have been applied to the wells of microtiter plates so that binding could occur in a random fashion. With much of the antigen denatured by interaction with the plastic, antibodies selected by the screen would include many capable of recognizing only denatured IL-1β. These would lack the ability to block receptor binding by the native cytokine. To focus monoclonal antibody selection on the native receptor-recognizing regions, only antibodies that bound to biotinylated K138C orientated on streptavidin plates were selected for further study. As a result, instead of the relatively large number of positive clones that might have been expected (potentially >100 in this case), only six antibodies were obtained, and all six of these bound to biologically active IL-1β.
This result appeared to validate the strategy of using a strategically oriented IL-1β as the antigen in the screening step, although no systematic comparison with a less specific method was performed to confirm this. Since the rate constant of dissociation (koff), rather than the rate constant for association, is the primary determinant of differences in the Kd, we determined the apparent koff for each of the antibodies. The antibody (1E3) with the lowest apparent koff, and therefore presumably the lowest Kd, was chosen for further study with the triple mutant K138C, R4A, L6A. The triple mutant K138C, R4A, L6A was prepared and its binding to IL-1R was compared to that of K138C. The results confirmed the importance of R4 and L6 for IL-1R binding. A roughly 8-fold increase in Kd was found in the triple mutant compared to K138C alone (Table I). To minimize the effect of steric hindrance and divalent binding of the IgG-1E3, a Fab fragment was prepared and its binding to the triple mutant was compared with its binding to biotinylated K138C. Fab-1E3 failed to bind to the triple mutant. This result demonstrated the successful selection of an antibody to the receptor-binding surface of the IL-1β molecule. It also demonstrated a fundamental difference between the IL-1β:antibody and the IL-1β:IL-1R binding interfaces. The IL-1R protein:protein interaction interface contains at least two spatially separated binding domains. Diminished binding due to mutation at one domain raises the Kd, but need not abolish binding, because residues elsewhere can still support a lower affinity interaction. By contrast, an antibody
binding domain encompasses a limited number of spatially contiguous residues. Changes in those spatially close critical residues more readily abolish antibody binding. Finally, although by the criterion of non-binding to the triple mutant, Fab-1E3 appeared likely to be a receptor antagonist, it was important to confirm this. Consistent with the direct involvement of R4 and L6 in receptor binding, IL-1R binding to wild type IL-1β decreased the binding of Fab-1E3, and conversely, the binding of Fab-1E3 to wild type IL-1β decreased the binding of IL-1R. The techniques developed during these studies are broadly applicable to selecting surrogate receptor (or ligand) antibodies toward other protein ligand:receptor pairs. First, the use of a biologically active, oriented ligand can result in a much more efficient first selection for blocking antibodies. Second, negative selection using an appropriate mutant will directly provide a blocking antibody that will also be a surrogate receptor (or ligand). We used K138C for the first selection and the triple mutant K138C, R4A, L6A for the second selection, and found the blocking antibody Fab-1E3. Replacing negative selection using an appropriate mutant with a traditional positive selection scheme based on blocking activity will, of course, provide blocking antibodies, but such a selection scheme will detect blocking antibodies that are not receptor surrogates and thus are poor candidates for downsizing. Fab-1E3 fits the criteria that it is a receptor surrogate and therefore should be suitable for downsizing.

References

1. Braisted, A. C. and Wells, J. A. (1996) Proc. Natl. Acad. Sci. USA 93, 5688-5692.
2. DeGrado, W. F. and Sosnick, T. R. (1996) Proc. Natl. Acad. Sci. USA 93, 5680-5681.
3. Smythe, M. L. and von Itzstein, M. (1994) J. Am. Chem. Soc. 116, 2725-2733.
4. Saragovi, H. U., Fitzpatrick, D., Raktabutr, A., Nakanishi, H., Kahn, M. and Greene, M. I. (1991) Science 253, 792-795.
5. Simon, P. L., Kumar, V., Lillquist, J. S., Bhatnagar, P., Einstein, R., Lee, J., Porter, T., Green, D., Sathe, G. and Young, P. R. (1993) J. Biol. Chem. 268, 9771-9779.
6. Kunkel, T. A., Roberts, J. D. and Zakour, R. A. (1987) Methods Enzymol. 154, 367-382.
7. Wingfield, P., Graber, P., Shaw, A. R., Gronenborn, A. M., Clore, G. M. and MacDonald, H. R. (1989) Eur. J. Biochem. 179, 565-571.
8. Chollet, A., Bonnefoy, J.-Y. and Odermatt, N. (1990) J. Immunol. Methods 127, 179-185.
9. Labriola-Tomkins, E., Chandran, C., Kaffka, K. L., Biondi, D., Graves, B. J., Hatada, M., Madison, V. S., Karas, J., Kilian, P. L. and Ju, G. (1991) Proc. Natl. Acad. Sci. USA 88, 11182-11186.
10. Grutter, M. G., van Oostrum, J., Priestle, J. P., Edelmann, E., Joss, U., Feige, U., Vosbeck, K. and Schmitz, A. (1994) Prot. Eng. 7, 663-671.
11. Evans, R. J., Bray, J., Childs, J. D., Vigers, G. P. A., Brandhuber, B. J., Skalicky, J. J., Thompson, R. C. and Eisenberg, S. P. (1995) J. Biol. Chem. 270, 11477-11483.
12. Kearney, J. F., Radbruch, A., Liesegang, B. and Rajewsky, K. (1979) J. Immunol. 123, 1548-1550.
13. Kabat, E. (1970) Ann. N. Y. Acad. Sci. 169, 43-54.
14. Schecter, I. (1971) Ann. N. Y. Acad. Sci. 190, 394-419.
15. Hodges, R. S., Heaton, R. J., Parker, J. M. R., Molday, L. and Molday, R. S. (1988) J. Biol. Chem. 263, 11768-11775.
16. Priestle, J. P., Schaer, H. P. and Gruetter, M. G. (1989) Proc. Natl. Acad. Sci. USA 86, 9667-9671.
SECTION VII Macromolecular Assemblies
Topology of Membrane Proteins in Native Membranes Using Matrix-assisted Laser Desorption Ionization/Mass Spectrometry

Kamala Tyagarajan¹, John G. Forte¹ and R. Reid Townsend²

¹Dept. of Molecular and Cell Biology, University of California, Berkeley, CA 94720-3200 and ²Dept. of Pharmaceutical Chemistry, University of California, San Francisco, CA 94143-0446
I. Introduction

Knowledge of the topological orientation of membrane proteins within native membranes is fundamental to establishing structure-function relationships. In particular, information on topology is important for understanding the structural basis underlying the translocation function of cation pumps like the Na,K-ATPase, Ca-ATPase or H,K-ATPase. Previous efforts to define topology have used both theoretical and experimental approaches, such as hydropathy plots, proteolysis of vesicles, binding of regio-specific antibodies, or labeling with group-specific, membrane-sided reagents followed by identifying the modified sites (1, 2). Proteolysis of sided vesicles followed by analysis of peptide products has been one of the most common approaches to determine exposed peptide sequences. Conversely, remaining membrane-associated peptides can be analyzed after exhaustive protease digestion. Many analyses have utilized SDS-PAGE to separate proteolytic fragments followed by Edman sequencing of peptides or identification using regio-specific antibodies (3, 4). However, these approaches are not useful for identifying small peptides (< 5 kDa) from proteolysis. Alternatively, HPLC separation of peptides followed by Edman sequencing is possible but time-consuming, and the coelution of multiple peptides makes identification by Edman sequencing difficult. More recently, mass spectrometry has been used in the identification of peptides and glycopeptides in topological studies (5-8). In this study, we used matrix-assisted laser desorption ionization/mass spectrometry (MALDI/MS) to identify the peptides released from gastric parietal cell microsomes. MALDI, because of its sensitivity and relative tolerance to the presence of salts and buffers, was examined for the analysis of unfractionated proteolytic digests (9, 10). MALDI with post-source decay (PSD) analysis was used to obtain sequence information on peptides even in crude digestion mixtures.
Our strategy (Figure 1) consisted of proteolysis of intact vesicles, centrifugation at high speeds to separate membrane-bound and soluble fractions, and analysis of the mixture of released peptides by MALDI/MS. In addition, to increase the
Figure 1. Methodology used to determine the topology of membrane proteins. (Flow chart: protein in vesicles; proteolysis; centrifugation into pellet and supernatant peptides; supernatant peptides analyzed by MALDI/MS, or separated by RP-HPLC with the HPLC fractions analyzed by MALDI/MS with PSD analysis; sequencing of peptides; topological models.)
sensitivity and breadth of analysis, supernatant peptides were separated by reverse-phase HPLC and individual fractions were analyzed by MALDI/MS. PSD analysis was also performed to obtain partial sequence information and identify peptides (11). On the basis of the released peptide products, a topological map for a major portion of the H,K-ATPase in gastric parietal cell tubulovesicles is proposed. We focused on the gastric H,K-ATPase as a test protein because i) purified gastric microsomal vesicles are highly enriched in the enzyme (> 85-90% purity), ii) the vesicles are oriented with a common asymmetry, i.e. cytoplasmic side out (12), iii) the vesicles are sealed, allowing selective cytoplasmic digestion, and iv) there is a pool of existing topological data from other methods (13-16).
Figure 2. The gastric H,K-ATPase in gastric microsomal vesicles. The H,K-ATPase is a heterodimer composed of an α-subunit and a glycoprotein β-subunit, which are asymmetrically oriented.
Thus, the H,K-ATPase in microsomes is a useful model to develop new methods to determine protein topology. The cartoon in Figure 2 illustrates that the H,K-ATPase is a heterodimer composed of two subunit proteins: an α-subunit of 1035 amino acids, traversing the membrane either 8 or 10 times (13), with most of its mass cytoplasmically disposed (and therefore outside the vesicles); and a glycosylated β-subunit of 300 amino acids, traversing the membrane once and, except for a short cytoplasmic tail, with most of its mass on the extracellular side (inside the vesicles).
II. Materials and Methods

Materials. Trypsin, Lys C, chymotrypsin and adrenocorticotropic hormone fragment (18-39) were purchased from Sigma (St. Louis, MO). Tris(hydroxymethyl)aminomethane (Tris), sucrose, acetonitrile, HPLC grade water and acetic acid were purchased from Fisher Scientific (Pittsburgh, PA). Matrix (α-cyano-4-hydroxycinnamic acid) was purchased from Hewlett Packard (Palo Alto, CA). The low-molecular weight calibration standard was purchased from Bio-Rad (Richmond, CA).

Preparation of H,K-ATPase-enriched microsomal vesicles. H,K-ATPase-containing gastric microsomal vesicles were isolated from rabbit stomach as previously described (12). Crude microsomes were harvested from homogenized mucosa of unstimulated rabbit stomach (H2 receptor-blocked) as the membrane pellet sedimenting between 10 min at 13,000 x g and 1 hr at 100,000 x g. The pellet was resuspended in 10% sucrose, brought to 40% sucrose (9 ml), and overlaid with successive layers of 30% sucrose (11 ml), 10% sucrose (16 ml) [300 mM sucrose, 5 mM tris(hydroxymethyl)aminomethane (Tris), and 0.2 mM EDTA, pH 7.4] in a 37 ml tube. After centrifugation at 80,000 x g for 4 hr, the purified gastric microsomal vesicles were collected from the interface between 10% and 30% sucrose and stored at 4°C until use.

Trypsinization of H,K-ATPase-enriched gastric microsomal vesicles. Tubulovesicles (~100 µg of protein) were treated with trypsin (5 µg) in Tris-HCl (20 mM, pH 7.5) at 37°C for 30 min. The vesicles were next centrifuged at 100,000 x g on a TL100 table top centrifuge for 1 hr at 4°C. The supernatant was carefully separated from the pellet. The supernatant was next boiled for 5 min and stored at -20°C until further analysis.

Reverse Phase-HPLC separation of tryptic digest. The tryptic digest (60%) was separated on an Aquapore OD-300 (Applied Biosystems Inc) C18 reverse phase column (7 µm, 1 x 250 mm) using a Michrom UMA Model 600 HPLC system with eluant monitoring at 214 nm. The first 5 min of the gradient was isocratic at 5% eluant B (98% CH3CN, 0.1% TFA) and 95% eluant A (2% CH3CN, 0.1% TFA). This was followed by a linear gradient of 5-15% B in 15 min, 15-50% B at 75 min and 50-75% B at 90 min. The flow rate was 50.0 µl/min. Individual fractions were collected and stored for subsequent use.

MALDI/MS analysis. Supernatant (1 µl) was diluted 1:2 with 50% CH3CN in water and this mixture was mixed with 2 µl of α-cyano-4-hydroxycinnamic acid, vortexed and centrifuged. One µl was spotted onto the target. MALDI/MS of samples was carried out on a TofSpec SE from Micromass (Manchester, UK), equipped with a reflectron and using a nitrogen laser (337 nm). Samples were
initially examined in the linear mode to determine whether signals > 5 kDa were present. An accelerating potential of 25 kV, a reflectron voltage of 28.5 kV and an extraction voltage of 10 kV in the reflectron-ion mode were typically used. Thirty shots were usually averaged. The instrument was calibrated with peptides from a low molecular weight peptide set from Bio-Rad (Richmond, CA). Molecular ions of bombesin and the 18-39 amino acid clip of adrenocorticotropic hormone fragment were used as calibration standards.
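The two-calibrant procedure can be illustrated with a minimal sketch. It assumes the common first-order time-of-flight relation in which the square root of m/z is linear in flight time; the text does not state which calibration function the instrument software applied. The flight times are hypothetical values chosen only for illustration, and the calibrant masses are the monoisotopic MH+ values commonly quoted for bombesin and ACTH (18-39).

```python
import math

# Monoisotopic MH+ values commonly quoted for the two calibrants
BOMBESIN_MH = 1619.8223
ACTH_18_39_MH = 2465.1983


def two_point_calibration(t1, mz1, t2, mz2):
    """Solve sqrt(m/z) = a * t + b from two calibrant peaks."""
    a = (math.sqrt(mz2) - math.sqrt(mz1)) / (t2 - t1)
    b = math.sqrt(mz1) - a * t1
    return a, b


def mz_from_time(t, a, b):
    """Convert a flight time to m/z using the fitted constants."""
    return (a * t + b) ** 2


# HYPOTHETICAL flight times in microseconds (not taken from the text)
t_bombesin, t_acth = 40.25, 49.60
a, b = two_point_calibration(t_bombesin, BOMBESIN_MH, t_acth, ACTH_18_39_MH)

# Any other peak is then assigned a mass from its flight time
t_unknown = 45.00
print(f"m/z at t = {t_unknown} us: {mz_from_time(t_unknown, a, b):.1f}")
```

With only two calibrants the fit passes exactly through both points, so accuracy for other signals depends on how well the assumed relation holds across the mass range of interest.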
III. Results and Discussion

The H,K-ATPase-enriched vesicles were trypsinized for 30 min using a trypsin:protein ratio of 1:25 and then centrifuged to separate the pellet from the supernatant fractions. An aliquot of the supernatant was analyzed by MALDI/MS as shown in Figure 3. The observed signals (m/z 600-4400) were assigned to masses (±2 Da) of the predicted tryptic digestion products for the gastric H,K-ATPase α-subunit as shown in Table 1. Since the exposure to trypsin was for a limited period of time (30 min), incompletely cleaved tryptic peptides were also observed, so it was important to include the possibility of these incompletely cleaved peptides in the search through the molecular mass signals.
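The assignment step can be sketched in a few lines: digest a sequence in silico, allow missed cleavages, compute monoisotopic MH+ values, and match observed signals within ±2 Da. This is an illustrative sketch, not the authors' software. The 28-residue segment is α-subunit residues 710-737, assembled from the two contiguous peptide sequences reported for fraction 13; the function names and the cleavage rule (after Lys or Arg, not before Pro) are assumptions of the sketch.

```python
import re

# Monoisotopic residue masses (Da)
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide."""
    return sum(RESIDUE[aa] for aa in peptide) + WATER + PROTON


def tryptic_peptides(sequence, first_residue=1, max_missed=1):
    """Cleave after K/R (not before P); return (start, end, peptide)."""
    pieces = re.findall(r".*?[KR](?!P)|.+$", sequence)
    starts, pos = [], first_residue
    for piece in pieces:
        starts.append(pos)
        pos += len(piece)
    peptides = []
    for i in range(len(pieces)):
        # join up to max_missed neighbouring pieces (incomplete cleavage)
        for j in range(i, min(i + max_missed, len(pieces) - 1) + 1):
            pep = "".join(pieces[i:j + 1])
            peptides.append((starts[i], starts[i] + len(pep) - 1, pep))
    return peptides


def assign(observed_mz, peptides, tol=2.0):
    """Return candidate peptides whose MH+ lies within tol of a signal."""
    return [(start, end, round(mh_plus(pep), 1))
            for start, end, pep in peptides
            if abs(mh_plus(pep) - observed_mz) <= tol]


segment = "LVIVESCQRLGAIVAVTGDGVNDSPALK"  # residues 710-737
candidates = tryptic_peptides(segment, first_residue=710)
for mz in (1047, 1798):
    print(mz, assign(mz, candidates))
```

For a full analysis the same functions would be run over the complete subunit sequences; a wide tolerance such as ±2 Da will then often return more than one candidate per signal, which is why sequence confirmation is needed.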
Figure 3. MALDI mass spectrum of the total supernatant from the tryptic digest of the H,K-ATPase-enriched tubulovesicles. The H,K-ATPase was digested with trypsin and the vesicles centrifuged to separate supernatant from the pellet. An aliquot of the supernatant was analyzed by MALDI/MS in the reflectron ion mode using α-cyano-4-hydroxycinnamic acid as a matrix. The signals are denoted by numbers and were assigned to α-subunit peptides (Table 1). (Horizontal axis: m/z.)
Table 1. Assignment of signals obtained by MALDI/MS of a tryptic digest supernatant from H,K-ATPase-enriched vesicles. The observed masses of the numbered signals shown in Figure 3 were assigned to the masses of α-subunit peptides.

Signal No.   Observed MH+   Calculated MH+   α-subunit peptide
1            663            662.4            Asp483-Lys487
2            819            818.5            Val435-Arg441
3            899            898.6            Leu78-Arg85
4            1043           1044.6           Ala32-Lys42
5            1047           1046.6           Leu710-Arg718
6            1056           1055.6           Gly206-Arg215
7            1076           1076.6           Ala673-Lys682
8            1088           1087.5           Thr695-Arg703
9            1093           1092.6           Asp86-Arg95
10           1196           1195.7           Leu659-Arg668
11           1239           1238.7           Asn174-Arg184
12           1283           1282.7           Tyr66-Arg77
13           1324           1323.8           Leu659-Lys669 or Ala838-Arg848
14           1370           1370.7           Gly536-Arg546
15           1375           1374.7           Asp683-Arg694
16           1458           1458.7           Phe470-Arg482
17           1485           1485.7           Ser239-Arg251 or Arg456-Lys469
18           1619           1619.8           Val224-Arg238 or Asp511-Arg524
19           1679           1678.9           Ala431-Lys445
20           1712           1711.9           Glu547-Arg562
21           1824           1823.9           Phe499-Arg513
22           2013           2013             Glu49-Lys65
23           2159           2159.1           Ala637-Arg658
24           2448           2449.2           Asn755-Arg777
25           2543           2544.2           Asn252-Arg275
26           2736           2737.4           Asn371-Arg396
27           2793           2792.3           Glu49-Lys72
28           3553           3553.9           Asn174-Lys205
29           4004           4005             Lys738-Arg777
Based on mass analysis by MALDI/MS of the unfractionated tryptic digest, 29 tryptic peptides from the α-subunit were tentatively identified, but only two signals corresponded to β-subunit peptides. These latter signals at m/z 791 and 1485 corresponded to peptides from the short cytoplasmic tail of the β-subunit, and included Met1-Lys7 and Lys^-Lys^^ from the N-terminus of the sequence. No signals that corresponded to masses of the extracellular domain of the β-subunit peptides were observed, consistent with the vesicles being oriented with their cytoplasmic side out and preservation of vesicular integrity during
proteolysis by trypsin. The α-subunit peptides tentatively identified by MALDI/MS are listed in Table 1, including: Ala32-Lys42, Glu49-Lys65, Glu49-Lys72, Tyr66-Arg77, Leu78-Arg85 and Asp86-Arg95 from the N-terminus (before membrane segment M1); peptides Asn174-Arg184, Asn174-Lys205, Gly206-Arg215, Val224-Arg238, Ser239-Arg251 and Asn252-Arg275 in the cytosolic loop between membrane segments M2 and M3; peptides in the large cytosolic loop between M4 and M5, which included Asn371-Arg396, Ala431-Lys445, Arg456-Lys469, Phe470-Arg482, Asp483-Lys487, Phe499-Arg513, Gly536-Arg546, Glu547-Arg562, Ala637-Arg658, Leu659-Lys669, Ala673-Lys682, Asp683-Arg694, Thr695-Arg703, Leu710-Arg718, Lys738-Arg777 and Asn755-Arg777; and a peptide Ala838-Arg848 from the cytosolic loop between membrane segments M6 and M7. All of these regions have previously been deduced to be cytoplasmic (15, 16). No peptides corresponding to any of the intramembrane segments or intravesicular (extracellular) regions of the α-subunit were observed. Thus the topological prediction obtained by analysis of the MALDI mass spectrum of the entire tryptic supernatant was consistent with the currently accepted topological model of H,K-ATPase (16). An assignment of the identified peptides to putative extravesicular (cytoplasmic) regions of the H,K-ATPase is schematically shown in Figure 4. Although analysis of peptide masses in the total supernatant allows a tentative identification, mass overlap at this resolution may lead to erroneous assignments. In addition, signal suppression can lead to low intensity or abolition of certain peptide signals. Since the peptides are tentatively identified on the basis of mass alone, it is prudent to perform PSD analysis to obtain sequence information and confirm the identity. PSD analysis could be performed on some peptides in the total mixture; however, it was difficult to obtain sequence information on low
Figure 4. Topological model of the gastric H,K-ATPase. The topological model shown is adapted from a proposal by Besancon et al. (16). The model depicts the α-subunit having ten transmembrane segments, denoted as M1-M10. Amino acid numbers are shown for the cytoplasmic ends of segments M1-M8. The glycoprotein β-subunit traverses the membrane once and has most of its mass luminally oriented. The darkened regions indicate peptides of the α-subunit that were identified by MALDI/MS analysis of the total tryptic digest supernatant of H,K-ATPase-enriched vesicles (Figure 3 and Table 1). (Labels in the figure: luminal solution, apical plasma membrane, cytoplasm.)
Figure 5. Reverse-phase HPLC of the supernatant from a tryptic digest of H,K-ATPase-enriched vesicles. Peak fractions were collected up to 60 min using the gradient described in "Methods". (Horizontal axis: time, min.)
intensity peptides and peptides that were separated by less than 14 Da. In order to obtain a series of purified peptides, we subjected the tryptic digest supernatant to RP-HPLC as described in "Methods". The RP-HPLC trace of the digest is shown in Figure 5. We collected 30 individual fractions and an aliquot of each was subjected to MALDI/MS. The MALDI/MS of each HPLC fraction showed the presence of several peptides which had sufficient mass differences for successful PSD analyses. Figure 6A shows the MALDI mass spectrum of a representative fraction, Fraction 13, from the HPLC preparation. Signals were observed at m/z 730, 1047, 1327, 1371, 1394, 1798, and 2141. The assignment of these signals to peptides of the α-subunit is summarized in Table 2. Although we had noted peptides at 1047 and 1371 in the total supernatant material, signals at m/z 730, 1327, 1394, 1798 and 2141 were apparent only after HPLC fractionation. The sequence and identity of the peptides were confirmed by PSD analyses. As an example, the PSD spectrum of the signal at m/z 1798 is shown in Figure 6B. It gave a series of y ions extending to y17 and b ions from b3-b6, confirming the amino acid sequence to be identical to peptide 719LGAIVAVTGDGVNDSPALK737 of the α-subunit. Interestingly, the presence of the y-ion series extending to y17 demonstrated that the sequon Asn731-Asp-Ser733 exists in a non-glycosylated form. It has been suggested that one of the Asn residues within the cytoplasmic domain of the α-subunit is glycosylated (17).
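The reasoning behind the glycosylation argument can be made concrete with a short sketch that computes the expected singly charged b and y fragment ions of peptide 719-737. It assumes standard monoisotopic residue masses and the usual b/y definitions; the function name is illustrative, and the values are theoretical ones rather than the peak list of Figure 6B.

```python
# Monoisotopic residue masses (Da); only residues present in this peptide
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "K": 128.09496,
}
WATER, PROTON = 18.01056, 1.00728


def fragment_ions(peptide):
    """Return dicts of singly charged b and y ion m/z, keyed by ion number."""
    masses = [RESIDUE[aa] for aa in peptide]
    n = len(masses)
    b = {i: sum(masses[:i]) + PROTON for i in range(1, n)}
    y = {i: sum(masses[-i:]) + WATER + PROTON for i in range(1, n)}
    return b, y


peptide = "LGAIVAVTGDGVNDSPALK"  # alpha-subunit residues 719-737
b, y = fragment_ions(peptide)

# Asn731 is the 7th residue from the C-terminus, so it first appears in y7.
for i in (5, 6, 7, 8):
    print(f"y{i}: {y[i]:.2f}")
print(f"y7 - y6 = {y[7] - y[6]:.2f} Da")
```

In this scheme the spacing between y6 and y7 equals the residue mass of unmodified asparagine (114.04 Da). An N-linked glycan on Asn731 would add its mass to y7 and to every larger y ion, so a y series observed at the unmodified masses is evidence against glycosylation at that site.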
Figure 6. MALDI mass spectrum of fraction 13 from RP-HPLC. The H,K-ATPase-enriched vesicles were trypsinized and centrifuged to separate supernatant from pellet. The supernatant was subjected to RP-HPLC and individual fractions collected and subjected to MALDI/MS. The MALDI mass spectrum (reflectron-ion mode) was obtained using α-cyano-4-hydroxycinnamic acid as a matrix (Panel A). The signals were assigned to α-subunit peptides (Table 2). The signal at m/z 1798, indicated by an arrow, was next subjected to PSD analysis. The PSD spectrum of MH+ 1798.4 is shown in Panel B. Only the peaks for the b and y fragment ions are labeled. The deduced amino acid sequence is shown at the top of the panel. (Horizontal axis: m/z.)
Table 2. Tryptic peptides of α-subunit in Fraction 13. The H,K-ATPase-enriched microsomes were trypsinized and centrifuged to separate the supernatant from the pellet. The supernatant was subjected to RP-HPLC and individual fractions were collected and analyzed by MALDI/MS. The MALDI mass spectrum of fraction 13 is shown in Figure 6A. The signals seen were assigned to α-subunit tryptic peptides, as shown below.

Observed MH+   Calculated MH+   α-subunit peptide
1047           1046.6           710LVIVESCQR718
1327           1329.8           457IVIGDASETALLK469
1371           1370.7           536GQELPLDEQWR546
1394           1396.8           661VPVDQVNRKDAR672
1798           1797.0           719LGAIVAVTGDGVNDSPALK737
2141           2141.1           48KEMEINDHQLSVAELEQK65
Use of alternative proteases

Proteases other than trypsin may be used to increase the coverage of the protein sequence or to resolve ambiguities from mass overlap. For example, Lys C digestion gave topological results that were complementary to those from trypsin (data not shown). From a Lys C digest it was determined that several peptides from regions Ala^-Lys^^^, Seri^4-Lys223, Arg^^^-Lys'^^^ and Asp^^s.LygSSi were cytoplasmic. Again, a signal at m/z 2824 corresponded to the mass of peptide Leu710-Lys737 (2825 Da) of the α-subunit, which includes Asn731. These data were again consistent with the accepted topological model of the H,K-ATPase (Figure 4). Treatment of vesicles with chymotrypsin using similar conditions as for trypsin (1:20, chymotrypsin:protein) and MALDI/MS analysis of the supernatant after centrifugation of the digest gave some interesting results. Signals at m/z 996, 1015, 1298, 1460 and 1678 were observed which corresponded to the masses of β-subunit peptides (Tyr219-Leu227, Ser151-Leu159, Leu251-Leu262, Cys58-Tyr69 and Arg^^-Tyr^^, respectively), which are known to have an intravesicular orientation. Further investigation including PSD analyses will be performed to confirm the identity of these peptides.
IV. Conclusions
We have demonstrated the utility of MALDI/MS in combination with proteolysis to investigate the topology of a heterodimeric membrane glycoprotein, the gastric H,K-ATPase, within its native microsomal membrane. MALDI/MS proved to be a rapid and sensitive method for topological analysis of membrane proteins in native membranes. The high sensitivity and relative tolerance of MALDI/MS to buffers and some detergents allowed rapid assessment of topology by examination of unfractionated supernatants from vesicular digests. The above approach may also be usefully employed to assess the reconstitution of proteins into vesicles and vesicular integrity. Analysis of HPLC fractions by MALDI with PSD analysis allowed the determination of partial peptide sequence and may prove suitable for identifying post-translational modifications of extravesicular peptides. Finally, this approach should provide a convenient, sensitive and rigorous assessment of protein topology in artificial and native membranes.
Acknowledgments
This project was supported in part by NIH grant DK38792. The mass spectra were obtained at the UCSF Mass Spectrometry Facility supported by the Biomedical Research Technology Program of the National Center for Research Resources (NIH NCRR BRTP RR01614 and RR08282). The VG TofSpec SE was partially supported by Micromass, Beverley, MA.
References
1. Modyanov, N., Lutsenko, S., Chertova, E., Efremov, R. and Gulyaev, D. (1992) Acta Physiol. Scand. Supplementum, 607, 49-58.
2. Loo, T.W. and Clarke, D.M. (1995) J. Biol. Chem., 270, 843-848.
3. Serrano, R., Monk, B.C., Villalba, J.M., Montesinos, C. and Weiler, E.W. (1993) Eur. J. Biochem., 212, 737-744.
4. Ban, W.J. Jr, Abbott, A., Sun, Y. and Malik, B. (1992) Ann. New York Acad. Sci., 671, 436-439.
5. le Maire, M., Deschamps, S., Moller, J.V., La Caer, J.P. and Rossier, J. (1993) Anal. Biochem., 214, 50-57.
6. Mel, S.F., Falick, A.M., Burlingame, A.L. and Stroud, R.M. (1993) Biochemistry, 32, 9473-9479.
7. Moore, C.R., Yates, J.R., Griffin, P.R., Shabnowitz, J., Martino, P.A., Hunt, D.F. and Cafiso, D.S. (1989) Biochemistry, 28, 9184-9191.
8. Poulter, L., Earnest, J.P., Stroud, R.M. and Burlingame, A.L. (1989) Proc. Natl. Acad. Sci., 86, 6645-6649.
9. Tsarbopoulos, A., Karas, M., Strupat, K., Pramanik, B.N., Nagabushan, T.L. and Hillenkamp, F. (1994) Anal. Chem., 66, 2062-2070.
10. Billeci, T.M. and Stults, J.T. (1993) Anal. Chem., 65, 1709-1716.
11. Spengler, B., Kirsch, D., Kaufmann, R. and Jaeger, E. (1992) Rapid Commun. Mass Spectrom., 6, 105-108.
12. Reenstra, W.W. and Forte, J.G. (1990) Meth. in Enzymol., 192, 151-165.
13. Bamberg, K. and Sachs, G. (1994) J. Biol. Chem., 269, 16909-16919.
14. Asano, S., Arakawa, S., Hirasawa, M., Sakai, H., Ohta, M., Ohta, K. and Takeguchi, N. (1994) Biochem. J., 299, 59-64.
15. Sachs, G., Besancon, M., Shin, J.M., Mercier, F., Munson, K. and Hersey, S. (1992) J. Bioenerg. Biomem., 24, 301-308.
16. Besancon, M., Shin, J.M., Mercier, F., Munson, K., Miller, M., Hersey, S. and Sachs, G. (1993) Biochemistry, 32, 2345-2355.
17. Tai, M.M., Im, W.B., Davis, J.P., Blakeman, D.P., Zurcher-Neely, H.A. and Heinrikson, R.L. (1989) Biochemistry, 28, 3183-3187.
Role of D-Ser46 in the P-type Calcium Channel Blocker, ω-Agatoxin-TK
Tomohiro Watanabe, Manabu Kuwada, Kumiko Y. Kumagaye*, Kiichiro Nakajima*, Yukio Nishizawa and Naoki Asakawa
Eisai Tsukuba Research Laboratories, 5-1-3 Tokodai, Tsukuba, Ibaraki 300-26, Japan and *Peptide Institute Inc., Protein Research Foundation, Osaka 562, Japan
I. Introduction
Multiple types of voltage-dependent calcium channels in mammalian neurons play important roles in controlling various nervous functions such as synaptic transmission, gene expression, neuronal development and differentiation. There are at least four subtypes of calcium channels, namely T-type, L-type, N-type, and P-type channels, classified on the basis of their electrophysiological and pharmacological properties. Among them, the P-type calcium channels have been reported to be primarily associated with neuronal transmission through regulating the release of excitatory amino acids and catecholamines (1-4). We have previously isolated a 48-amino-acid peptide, named ω-agatoxin-TK (ω-Aga-TK), from the venom of the funnel web spider, Agelenopsis aperta. It was found to be a potent blocker of the P-type calcium channels in rat cerebellar Purkinje neurons, but had no activity against T-type, L-type, or N-type channels in brain neurons. The peptide has a unique structural profile including a high-density disulfide core structure with four disulfide bonds and a D-form amino acid, D-Ser, at position 46 (Fig. 1). Interestingly, ω-Aga-TK contains two serine residues, at positions 28 and 46, of which only Ser46 is in the D-form (ω-[D-Ser46]Aga-TK) (5,6). We have also found in the spider venom a related peptide with the same amino acid sequence and disulfide pairings as those of ω-[D-Ser46]Aga-TK except for the L-configuration of Ser46 (ω-[L-Ser46]Aga-TK), though the L-Ser46 toxin is about six times less abundant than the D-Ser46 toxin (7). These findings raise the questions of why only Ser46 of the two serine residues is in the D-form and why the two ω-Aga-TKs containing opposite configurations at Ser46 are both present in the Agelenopsis aperta venom. Heck et al. (8) have reported the presence in the venom of a novel peptide isomerase that specifically converts the L-Ser46 residue of ω-Aga-TK to D-Ser46. We have recently reported the complete primary structure of the peptide isomerase, which is a 29-kDa glycoprotein consisting of a 243-residue heavy chain and an 18-residue light chain connected by a single disulfide bond (9). This was the first report to assign the structure of a peptide isomerase from a eukaryotic organism that converts the chirality of amino acid residues. ω-[D-Ser46]Aga-TK has very low solubility under neutral conditions, which precluded detailed studies of its tertiary structure by NMR spectroscopy.
However, Adams et al. (10) and Yu et al. (11) reported two-dimensional NMR analyses of ω-[D-Ser46]Aga-TK in acidic solution; they concluded that the cystine-rich region consists of a triple-stranded antiparallel β-sheet with four loops formed by four disulfides (Cys4-Pro38), but the carboxyl-terminal tail was very poorly defined since the carboxyl-terminal ten residues containing the D-Ser46 residue (Arg39-Ala48) adopt a disordered structure. Our structure-function relationship studies of ω-[D-Ser46]Aga-TK demonstrated that ω-[L-Ser46]Aga-TK has 80- to 90-fold less potency towards the P-type calcium channels compared with ω-[D-Ser46]Aga-TK. Two proteolytic fragments of ω-[D-Ser46]Aga-TK, namely ω-Aga-TK (1-43) and a carboxyl-terminal peptide fragment, ω-Aga-TK (44-48), did not exert any significant inhibition of P-type calcium channels or interfere with the blockade of the channels elicited by native ω-Aga-TK (12). Furthermore, molecular dynamics calculations showed that the carboxyl-terminal six-amino-acid peptide of ω-Aga-TK containing D-Ser46 assumes a different conformation from that containing L-Ser46. These data suggested that the specific conformation of the carboxyl-terminal tail generated by the D-Ser46 residue, together with the triple-stranded antiparallel β-sheet, might be essential for the blockade of the P-type calcium channels.
Figure 1. Schematic diagram of the high-density disulfide core and carboxyl-terminal tail containing D-Ser46 in ω-[D-Ser46]Aga-TK. The disulfide core structures are represented on the basis of the coordinates determined by NMR spectroscopy (11). Amino acid residues of the peptide are represented by single-letter abbreviations in the circles.
In this study, the conformations of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK were investigated by a combination of size-exclusion chromatography, circular dichroism (CD) measurement, and fluorescence spectroscopy in order to elucidate the structural and functional effects of the configuration of the Ser46 residue in ω-Aga-TK. We have found that ω-[D-Ser46]Aga-TK has a particularly compact molecular shape involving β-sheet structure, whereas ω-[L-Ser46]Aga-TK has a relatively unfolded or extended structure at physiological pH and ionic strength. These data are discussed in terms of the possible role of the configuration of the Ser46 residue in determining the molecular conformation of ω-Aga-TK.
II. Experimental Procedures
A. Peptides and Reagents
ω-[L-Ser46]Aga-TK and ω-[D-Ser46]Aga-TK were synthesized by Drs. K. Y. Kumagaye and K. Nakajima of Peptide Institute Inc. using an Applied Biosystems type 430A peptide synthesizer as described previously (5). Synthetic ω-[D-Ser46]Aga-TK is commercially available from the company. High-purity guanidine hydrochloride was obtained from ICN Biomedicals, Inc. (Aurora, OH). The phosphate-buffered saline, pH 7.4, was prepared by dissolving Dulbecco's PBS Powder (Nissui Pharmaceutical Co., Ltd., Tokyo) in Milli-Q water, and consists of 8.10 mM Na2HPO4, 1.47 mM KH2PO4, 2.68 mM KCl, and 137 mM NaCl. Other chemicals and reagents used were of reagent grade.
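Because the later results are discussed in terms of "physiological pH and ionic strength", it may help to estimate the ionic strength of the buffer from the stated composition. The sketch below assumes complete dissociation of each salt into the ions written in its formula and ignores phosphate acid-base equilibria at pH 7.4, so the result is only a rough figure; the function name `ionic_strength` is illustrative.

```python
def ionic_strength(salts):
    """Ionic strength I = 0.5 * sum(c_i * z_i**2), in the units of c.

    salts: list of (concentration, [(charge, ions_per_formula_unit), ...]).
    """
    total = 0.0
    for concentration, ions in salts:
        for charge, count in ions:
            total += concentration * count * charge ** 2
    return 0.5 * total


# Dulbecco's PBS as described in the text (concentrations in mM).
# Assumption: each salt dissociates fully into the ions shown.
PBS = [
    (8.10, [(+1, 2), (-2, 1)]),   # Na2HPO4 -> 2 Na+ + HPO4(2-)
    (1.47, [(+1, 1), (-1, 1)]),   # KH2PO4  -> K+ + H2PO4(-)
    (2.68, [(+1, 1), (-1, 1)]),   # KCl     -> K+ + Cl-
    (137.0, [(+1, 1), (-1, 1)]),  # NaCl    -> Na+ + Cl-
]

print(f"Approximate ionic strength: {ionic_strength(PBS):.1f} mM")
```

Under these assumptions the buffer comes out at roughly 165 mM, dominated by NaCl.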
B. Size-Exclusion Chromatography
The apparent molecular masses of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK were determined by size-exclusion chromatography (LKB GTI HPLC Systems) with a Pharmacia Superdex 75HR column (10 x 300 mm) or a TSK G3000SWXL column (10 x 300 mm) equilibrated with Dulbecco's phosphate-buffered saline, pH 7.4, with or without 5.2 M guanidine hydrochloride. The peptides were eluted from the columns with the buffer at a flow rate of 0.5 ml/min at 25 °C, and elution profiles were monitored by measuring the absorbance at 280 nm or 220 nm. The column was calibrated using a Pharmacia low-molecular-weight marker kit (blue dextran, bovine serum albumin, ovalbumin, chymotrypsinogen, and ribonuclease A), aprotinin (Sigma), and substance P (Peptide Institute Inc.).
C. Spectroscopic analysis
For CD and fluorescence spectroscopic analyses of ω-[L-Ser46]Aga-TK and ω-[D-Ser46]Aga-TK, peptide samples were prepared by freshly dissolving the lyophilized peptides at a concentration of 150 µg/ml in Dulbecco's phosphate-buffered saline, pH 7.4, in the presence or absence of guanidine hydrochloride. CD spectra were recorded with a Jasco J-720WI spectropolarimeter at room temperature using a 0.1 cm path-length cell. In all cases, the buffer base-line spectrum was subtracted, and the results were expressed in terms of the mean residue ellipticity [θ] in units of degrees cm2 dmol-1. Fluorescence spectra were determined with a Hitachi F4500 spectrofluorometer using a 1 cm path-length cell at 25 °C.
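The conversion from a measured ellipticity to mean residue ellipticity can be sketched as follows. This is an illustration only: the source does not state which convention was used for the mean residue weight, so the common choice M/(N - 1) is assumed here, and the reading of -1 mdeg is a hypothetical value rather than a measured one. The molecular mass (5273 Da), chain length (48 residues), concentration (150 µg/ml) and path length (0.1 cm) are taken from the text.

```python
def molar_concentration_uM(conc_ug_per_ml, molar_mass_da):
    """Convert a mass concentration in µg/ml to µmol/L."""
    # µg/ml equals mg/L; mg/L divided by g/mol gives mmol/L.
    return conc_ug_per_ml / molar_mass_da * 1000.0


def mean_residue_ellipticity(theta_mdeg, conc_mg_per_ml, path_cm,
                             molar_mass_da, n_residues):
    """Mean residue ellipticity in deg cm2 dmol-1.

    Assumes mean residue weight = M / (N - 1), a common convention.
    """
    mean_residue_weight = molar_mass_da / (n_residues - 1)
    return theta_mdeg * mean_residue_weight / (10.0 * path_cm * conc_mg_per_ml)


# Sample parameters from the text; theta is a hypothetical reading.
print(round(molar_concentration_uM(150.0, 5273.0), 1), "µM peptide")
print(round(mean_residue_ellipticity(-1.0, 0.150, 0.1, 5273.0, 48), 1),
      "deg cm2 dmol-1 per mdeg observed")
```

With these parameters, each millidegree of observed signal corresponds to several hundred deg cm2 dmol-1, which gives a sense of the raw signal size behind the plotted spectra.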
III. Results and Discussion
A. Different molecular shapes of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK
During the isolation and characterization of biologically active peptides from Agelenopsis aperta venom, we found that two stereoisomers of the P-type calcium channel blocker, ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK, were eluted in distinct fractions on size-exclusion chromatography (13). This finding was confirmed with synthetic standards of the two toxins on a Superdex HR75 column equilibrated with phosphate-buffered saline, pH 7.4, as shown in Fig. 2. ω-[D-Ser46]Aga-TK was found to be eluted from the column significantly later than ω-[L-Ser46]Aga-TK (the D-Ser toxin, 39.9 min; the L-Ser toxin, 31.5 min), in spite of their identical molecular mass of 5273 Da. The D-Ser toxin was eluted in close proximity to an 11-residue peptide, substance P (molecular mass of 1348 Da), but the L-form toxin was eluted at a similar position to aprotinin (molecular mass of 6512 Da). Under the conditions used, each toxin was eluted as a single peak at the same position at loading concentrations from 2 µM to 200 µM, whereas aggregated or oligomeric forms were observed at a concentration of 2 mM.
Figure 2. Size-exclusion chromatography of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK. The two toxins (200 µM) were analyzed on a Superdex HR75 column equilibrated with phosphate-buffered saline, pH 7.4, as described under Experimental Procedures. The molecular masses and elution positions of bovine serum albumin (67 kDa), ovalbumin (43 kDa), chymotrypsinogen (25 kDa), ribonuclease A (14 kDa), and aprotinin (7 kDa), used as calibration standards, are shown.
The calibration of the size-exclusion column with standard proteins demonstrated that the L-Ser toxin has an apparent molecular mass of 6 kDa, which is close to the real molecular mass of the toxin. The apparent molecular mass of the D-Ser toxin was too small to evaluate accurately from the calibration data. These results indicated that both ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK take a monomeric form at physiological pH and ionic strength, but the two toxins are significantly different in apparent molecular mass. The apparent molecular mass of the D-Ser toxin was dramatically increased by the addition of guanidine hydrochloride to the elution buffer, whereas that of the L-Ser toxin was not altered by the denaturing reagent. In the presence of 5.2 M guanidine hydrochloride, the D-form toxin was eluted at the same position as the L-form toxin, and the apparent molecular masses of the two toxins were estimated as 6 kDa based on calibration with the standard proteins. CD and fluorescence spectroscopic analyses revealed that the two toxins were unfolded and lost their secondary and tertiary structure in 5.2 M guanidine hydrochloride at pH 7.4, as described below. It therefore appears that the D-Ser toxin forms a compact folded structure, whereas the L-Ser toxin has a relatively unfolded or extended structure. In order to see whether or not the elution behavior of the two toxins depends on the specificity of the separation support used, the two toxins were also analyzed on a TSK G3000SWXL column under the same elution conditions as those of the Superdex column. Similar results were obtained, i.e., the D-form toxin was eluted later than the L-form toxin with phosphate-buffered saline. These results confirm that the different elution behavior of the two toxins was caused by the distinct molecular shapes of the two toxins.
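The way an apparent molecular mass is read from a size-exclusion calibration can be sketched as a straight-line fit of log10(mass) against elution time. The fitting approach is a common one and is assumed here, since the text does not describe how the calibration curve was constructed. The elution times assigned to the standards below are hypothetical placeholders chosen only for illustration; they are not measured values from this study. Only the toxin elution times (31.5 and 39.9 min) come from the text.

```python
import math


def fit_calibration(standards):
    """Least-squares line log10(mass) = a + b * time.

    standards: list of (elution_time_min, mass_da) pairs.
    """
    times = [t for t, _ in standards]
    logs = [math.log10(m) for _, m in standards]
    n = len(standards)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


def apparent_mass(time_min, intercept, slope):
    """Apparent mass (Da) for a given elution time on the fitted line."""
    return 10 ** (intercept + slope * time_min)


# Hypothetical elution times (min) for the standards named in the text.
standards = [
    (18.0, 67000.0),   # bovine serum albumin
    (20.0, 43000.0),   # ovalbumin
    (23.5, 25000.0),   # chymotrypsinogen
    (26.5, 14000.0),   # ribonuclease A
    (31.0, 6512.0),    # aprotinin
]
a, b = fit_calibration(standards)
for label, t in [("L-Ser toxin", 31.5), ("D-Ser toxin", 39.9)]:
    print(label, round(apparent_mass(t, a, b)), "Da (apparent)")
```

A peak that elutes well after the last standard, as the D-Ser toxin does, falls outside the fitted range, so any extrapolated mass is unreliable; this is consistent with the statement that its apparent mass was too small to evaluate accurately.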
B. Conformational analyses of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK
We examined the CD spectra of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK in phosphate-buffered saline, pH 7.4, to compare the secondary structures of the two toxins. As illustrated in Figure 3, the spectrum of the D-Ser toxin showed a negative peak at 208 nm, while the spectrum of the L-Ser toxin had both a negative peak at 200 nm and broad positive ellipticity centered near 220 nm.
Figure 3. CD spectra of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK in phosphate-buffered saline, pH 7.4, in the presence or absence of guanidine hydrochloride. 1, ω-[D-Ser46]Aga-TK in phosphate-buffered saline; 2, ω-[L-Ser46]Aga-TK in phosphate-buffered saline; 3, ω-[L-Ser46]Aga-TK in phosphate-buffered saline containing 5.2 M guanidine hydrochloride; 4, ω-[D-Ser46]Aga-TK in phosphate-buffered saline containing 5.2 M guanidine hydrochloride. CD spectra were recorded between 210 and 250 nm or 195 and 250 nm in the presence or absence of guanidine hydrochloride, respectively.
These features are characteristic of peptide random coil and β-sheet structures, and the magnitude of the positive ellipticity band revealed a significant difference in β-sheet contents between the two toxins. The secondary structures of the two toxins were found to be disrupted by the addition of 5.2 M guanidine hydrochloride at pH 7.4, since the spectra changed to a pattern typical of predominantly random coil structure. It was concluded that ω-[D-Ser46]Aga-TK has a significantly higher β-sheet content than ω-[L-Ser46]Aga-TK under neutral conditions. Intrinsic fluorescence of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK was determined to compare the tertiary structures around the Trp and Tyr residues between the two toxins. The two toxins have a single residue each of Trp and Tyr in the disulfide-rich region, at positions 14 and 9, respectively. As shown in Figure 4, tryptophan fluorescence with an emission maximum near 345 nm was strongly quenched in the D-Ser toxin, but not in the L-Ser toxin, whereas tyrosine fluorescence of the two toxins showed almost the same intensity at the emission maximum of 310 nm. Further, the intensity of tryptophan fluorescence of the D-form toxin, but not that of the L-form toxin, increased concomitantly with the increase of the concentration of guanidine hydrochloride at pH 7.4. These results clearly indicate that the Trp14 residue in the L-form toxin is exposed to the solvent, but this residue of the D-form toxin is in a relatively hydrophobic environment. Previously, Yu et al. reported the solution structure of ω-[D-Ser46]Aga-TK at pH 4.0, showing that the indole side chain of the Trp14 residue packs against the sulfur atoms of Cys^^-Cys^ and may serve to stabilize the loop formed by the disulfide bond (11).
It is, therefore, suggested that the tryptophan fluorescence of the D-Ser toxin may be quenched by the sulfur atoms of the disulfide bond, whereas the indole chromophore of the L-Ser toxin may not be affected by the sulfur atoms due to the greater distance between the two groups.
Figure 4. Intrinsic fluorescence spectra of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK in phosphate-buffered saline, pH 7.4. Emission spectra were recorded between the wavelengths of 250 and 450 nm at the excitation wavelength of 280 nm.
In conclusion, we have investigated the conformation of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK at physiological pH and ionic strength using size-exclusion chromatography and spectroscopic methods. We have found that the apparent molecular mass of ω-[D-Ser46]Aga-TK is significantly smaller than that of ω-[L-Ser46]Aga-TK as determined by size-exclusion chromatography. CD spectra of the two toxins also revealed that ω-[D-Ser46]Aga-TK has a higher β-sheet content than ω-[L-Ser46]Aga-TK. Furthermore, the intrinsic fluorescence of ω-[D-Ser46]Aga-TK showed that Trp14 of ω-[D-Ser46]Aga-TK is in a relatively hydrophobic environment compared with that of ω-[L-Ser46]Aga-TK. These data imply that the D-Ser46 residue of ω-[D-Ser46]Aga-TK may be involved in the formation of additional intramolecular β-sheet structure in the carboxyl-terminal region or between the disulfide core and carboxyl-terminal tail, which contributes to the compact folding of ω-[D-Ser46]Aga-TK. It is also likely that the additional β-sheet causes a change in the tertiary environment around the Trp14 residue of ω-[D-Ser46]Aga-TK. Additional experiments to assess the biological importance of the carboxyl-terminal tail seem worthwhile. For instance, it would be interesting to examine the effects of sequential truncation of the carboxyl-terminal region of ω-[D-Ser46]Aga-TK on the blockade of the P-type calcium channels. Studies are in progress to characterize further the carboxyl-terminal conformation of ω-[D-Ser46]Aga-TK.
Acknowledgments
We thank Dr. Kozaki for helpful discussions and Dr. Takakuwa (Jasco) for measuring CD.
References
1. Olivera, B.M., Miljanich, G.P., and Ramachandran, J. (1994) Annu. Rev. Biochem. 63, 823-867
2. Niidome, T., Teramoto, T., Murata, Y., Tanaka, I., Seto, T., Sawada, K., Mori, Y., and Katayama, K. (1994) Biochem. Biophys. Res. Commun. 203, 1821-1827
3. Kimura, M., Yamanishi, Y., Hanada, T., Kagaya, T., Kuwada, M., Watanabe, T., Katayama, K., and Nishizawa, Y. (1995) Neuroscience 66, 609-615
4. Teramoto, T., Niidome, T., Miyagawa, T., Nishizawa, Y., Katayama, K., and Sawada, K. (1995) NeuroReport 6, 1684-1688
5. Kuwada, M., Teramoto, T., Kumagaye, K. Y., Nakajima, K., Watanabe, T., Kawai, T., Kawakami, Y., Niidome, T., Sawada, K., Nishizawa, Y., and Katayama, K. (1994) Mol. Pharmacol. 46, 587-593
6. Kozaki, T., Kuwada, M., Narukawa, M., Nagai, Y., and Asakawa, N. (1996) in "Peptide Chemistry 1995" (Nishi, N., ed) Protein Research Foundation, Osaka, 245-248
7. Watanabe, T., Teramoto, T., Kuwada, M., Shikata, Y., Niidome, T., Kawakami, Y., Sawada, K., Nishizawa, Y., and Katayama, K. (1995) in "Peptide Chemistry 1994" (Ohno, M., ed) Protein Research Foundation, Osaka, 253-256
8. Heck, S. D., Siok, C. J., Krapcho, K. J., Kelbaugh, P. R., Thadeio, P. P., Welch, M. J., Williams, R. D., Ganong, A. H., Kelly, M. E., Lanzetti, A. J., Gray, W. R., Phillips, D., Parks, T. N., Jackson, H., Ahlijanian, M. K., Saccomano, N. A., and Volkmann, R. A. (1994) Science 266, 1065-1068
9. Shikata, Y., Watanabe, T., Teramoto, T., Inoue, A., Kawakami, Y., Nishizawa, Y., Katayama, K., and Kuwada, M. (1995) J. Biol. Chem. 270, 16719-16723
10. Adams, M. E., Mintz, I. M., Reily, M. D., Thanabal, V., and Bean, B. P. (1993) Mol. Pharmacol. 44, 681-688
11. Yu, H., Rosen, M. K., Saccomano, N. A., Phillips, D., Volkmann, R. A., and Schreiber, S. L. (1993) Biochemistry 32, 13123-13129
12. Teramoto, T., Kuwada, M., Niidome, T., Sawada, K., Nishizawa, Y., and Katayama, K. (1993) Biochem. Biophys. Res. Commun. 196, 134-140
13. Watanabe, T., Shikata, Y., Oda, Y., Nishizawa, Y., Kuwada, M., and Asakawa, N. The two dimensional HPLC purification of biologically active polypeptides and polyamines in funnel web spider venom, manuscript in preparation
Involvement of Basic Amphiphilic α-Helical Domain in the Reversible Membrane Interaction of Amphitropic Proteins: Structural Studies by Mass Spectrometry, Circular Dichroism, and Nuclear Magnetic Resonance
Nobuhiro Hayashi, Mamoru Matsubara, Koiti Titani, and Hisaaki Taniguchi
Division of Biomedical Polymer Science, Institute for Comprehensive Medical Science, Fujita Health University, Toyoake, Aichi 470-11, Japan
I. Introduction
A growing number of proteins have been shown to belong to the so-called "amphitropic" proteins, which are neither "pure" membrane proteins nor soluble proteins (1). Interestingly, many of them are involved in signal transduction, and their stimulation-dependent translocation plays important roles in the transmission of signals between the plasma membrane and the nucleus (2). They usually lack any apparent hydrophobic membrane-binding domain, but the importance of highly basic domains in the Src family proteins (2) and that of the basic amphiphilic domain in MARCKS (3) in membrane association have been well established. In the latter case, the direct phosphorylation of the domain by protein kinase C regulates the reversible membrane association of MARCKS (3, 4). It is of interest to note that some of these amphitropic proteins are fatty acylated, and the modification is also involved in the membrane interaction (2, 5). One of the major phosphoproteins in the neuronal growth cone, GAP-43 (growth-associated protein-43, also known as B50, F1, P56, or neuromodulin), which is found associated with membrane cytoskeletal fractions (6), is very hydrophilic and lacks any apparent hydrophobic membrane-binding domain (7). Palmitoylation of two cysteine residues near the N-terminus has been assumed to be involved in the interaction with membranes (8, 9). However, we have recently shown that GAP-43 isolated from the membrane fractions is not palmitoylated at all but still retains the ability to bind phospholipid membranes in vitro (10, 11). GAP-43 belongs to the MARCKS family of acidic hydrophilic membrane-associated proteins (12) and has a similar basic amphiphilic domain, which serves as the calmodulin-binding domain and the domain phosphorylated by PKC. The involvement of the domain in the membrane anchoring of GAP-43 has, in fact, been suggested (13, 14).
In the present study, we first show a detailed mass spectrometric analysis of the posttranslational modifications of GAP-43, which provides the basis for the understanding of the structures of the molecules involved. The interaction of GAP-43 and that of the basic amphiphilic domain with membrane phospholipids are then studied using circular dichroism (CD) and nuclear magnetic resonance (NMR) to understand the underlying structural mechanisms in the interaction.
II. Materials and Methods
A. Materials
GAP-43 (10) and PKC (15) were purified from bovine brain as described previously. A peptide (QASFRGHITRKKLKGEK) corresponding to the calmodulin-binding domain of GAP-43, named GAP peptide, was synthesized using conventional tBoc chemistry in an ABI 430A peptide synthesizer (Applied Biosystems), and purified over a C18 reversed-phase column (Vydac 218TP1010, The Separations Group) using a linear H2O-acetonitrile gradient in the presence of 0.1% trifluoroacetic acid. Lipids purchased from Avanti Polar Lipids were suspended in 5 mM phosphate buffer (pH 7.5), and sonicated in a Branson Sonifier 250 sonicator for 30 min. The supernatant obtained after centrifugation in a tabletop centrifuge for 20 min was used as unilamellar liposomes.
B. Preparation of Phosphorylated GAP-43 and GAP Peptide
Phosphorylation of the intact GAP-43 and GAP peptide by PKC was carried out in the reaction buffer (25 mM Tris-HCl buffer (pH 7.5), 10 mM MgCl2, 100 mM CaCl2, 80 µg/ml phosphatidylserine, 8 µg/ml dioleoyl glycerol, 1 mM ATP) at 35°C for 90 min, and was stopped by adding trifluoroacetic acid to a final concentration of 0.1%. The extent of the phosphorylation was analyzed by mass spectrometry as described previously (10, 16). The phosphorylated GAP peptide was purified over a reversed-phase HPLC column. The phosphorylated GAP-43 protein was purified by ion exchange chromatography on a Mono Q column (HR 5/5) using a linear gradient of NaCl (0 - 0.5 M) in 20 mM Tris-HCl buffer (pH 7.5) containing 1 mM EDTA and 1 mM dithiothreitol.
C. Mass Spectrometric Analysis
Electrospray mass spectra were recorded in a PE Sciex API-III mass spectrometer as described previously (10, 16). A capillary HPLC was connected on-line to the electrospray interface of the mass spectrometer.
D. Circular Dichroism (CD) Spectrometry
CD spectra were recorded at 25°C in a JASCO J-720 CD spectropolarimeter using a 0.1 cm cell. Concentration of the peptide was 20 µM in 5 mM phosphate buffer (pH 7.3). The contents of secondary structures were calculated from the CD spectra using a CONTIN program (17) modified by Dr. F. Arisaka, Tokyo Institute of Technology.
E. NMR Spectrometric Analysis
500 MHz proton NMR spectra were recorded on a Bruker DMX-500 spectrometer. Chemical shifts were measured relative to the methyl resonance of an internal reference, 4,4-dimethyl-4-silapentane-1-sulfonate. GAP peptide (5 mM) was dissolved in 90% H2O - 10% D2O, 99.98% D2O, 50% H2O - 10% D2O - 40% trifluoroethanol (TFE)-d3, or 60% D2O - 40% TFE-d3. The pH of the samples was 4.0 (direct meter reading). By using standard procedures for 2D proton NMR of proteins (18), the sequence-specific assignment of resonances was obtained from two-dimensional TOCSY (19), NOESY (20, 21), DQF-COSY with phase cycling (22) or with pulsed field gradient (23, 24), and TQF-COSY with pulsed field gradient (23, 24) spectra. All spectra were acquired at 25°C in the phase-sensitive mode using the time proportional phase increment technique. WATERGATE (25, 26) or presaturation was used for the water suppression. A total of 512 measurements with increasing t1 values were made, and 64 transients were accumulated for each measurement. For t2, 2048 data points were taken, and the spectral width along f2 was 5000 Hz. The data were zero filled once in the f1 dimension. A cosine window function and a Gaussian function were used in the f1 and f2 dimensions, respectively, before Fourier transformation. For the NOESY spectra, the time-domain data were multiplied by Gaussian functions in both dimensions. All spectra were processed using Bruker XWIN-NMR or MSI Felix95.0 software packages.
III. Results and Discussion
A. Mass Spectrometric Analysis on the in Vivo Posttranslational Modifications
Soft ionization techniques such as electrospray ionization and matrix assisted laser desorption are now routinely used to determine the mass of large hydrophilic polymers like proteins (27). However, as is usual for the ionization process, the presence of salts and detergents, which is common for biological samples, can affect the process significantly. The use of on-line capillary reversed-phase HPLC in combination with the electrospray mass spectrometer (LC/MS) has made it possible to analyze such samples directly (10, 16, 28). When GAP-43 isolated from the membrane fractions of bovine brain was analyzed, a single major peak with a minor peak corresponding to a phosphorylated species was observed (Fig. 1a). To study the posttranslational modifications in detail, the protein was digested with specific proteases such as lysyl endoprotease and trypsin, and the resulting mixtures were directly analyzed with the same LC/MS apparatus. Since the cDNA sequence is known, most of the peptides detected could be assigned solely from their masses, and the two peptides containing phosphorylation and a peptide corresponding to the N-terminal peptide were observed (10). Interestingly, the mass of the latter (796.3 Da) was slightly but significantly lower than the theoretical mass of 798.3 Da. Since the peptide contained two successive Cys residues, the peptide was treated with dithiothreitol, and directly analyzed with the LC/MS apparatus. As shown in Fig. 1b, c, the mass of the peptide increased by 2 Da after the dithiothreitol treatment, suggesting that the two Cys residues form an intrachain disulfide bridge. Since no palmitoylated N-terminal peptide was detected to a significant extent, we conclude that the isolated GAP-43 is not palmitoylated at the two Cys near the N-terminus.
Fig. 1. Mass spectrometric analysis of GAP-43 purified from membrane fractions of bovine brain. (a) A deconvoluted mass spectrum of GAP-43. A deconvoluted mass spectrum of the N-terminal peptide before reduction (b) and after reduction (c). Peaks formed by oxidation of Met were also observed.
B. Conformational Change of GAP-43 and GAP Peptide upon Phospholipid Binding
GAP-43 purified from bovine brain showed a CD spectrum with a single
Membrane Structure of GAP-43 Peptide
559
negative peak at around 197 nm in aqueous solution, which is typical for a random structure. At most 10% of the whole molecule seems to assume a-helix. Upon addition of acidic phospholipids such as phosphatidylglycerol (PG), however, a broad negative peak between 220 and 230 nm due to the increase in the a helix content was observed (Fig. 2a). All the acidic phospholipids tested but not neutral phospholipid such as phosphatidylcholine affected the CD spectrum in a similar way. A peptide corresponding to the calmodulin-binding domain of GAP-43 (GAP peptide) showed a similar random coil to a-helix conformational change upon phospholipid binding (Fig. 2b). The extents of the change in the CD spectra of the intact protein and the peptide are comparable, suggesting that only the domain interacts with the lipids and undergoes a conformational change to a-helix. This is reasonable, since the whole molecule of GAP-43 except for the calmodulin binding domain is hydrophilic and acidic without any hydrophobic amino acids. When ionic strength of the buffer was increased, the apparent affinity between the GAP peptide and the phospholipids decreased, suggesting that the interaction between the GAP peptide and the phospholipids involves electrostatic interaction (29, 30). The addition of TFE, a membrane
Fig. 2. Effects of phospholipids on CD spectra of GAP-43 and GAP peptide. CD spectra of GAP-43 (a) and GAP peptide (b) were measured in the absence (○) and in the presence (●) of phosphatidylglycerol or phosphatidylcholine (△). The horizontal axis shows wavelength (200-260 nm).
mimicking reagent, caused a concentration-dependent induction of the CD spectral component typical of an α-helix. The α-helical content reached almost 100% in the presence of 40% TFE.
C. Structural Analysis by Nuclear Magnetic Resonance
The structural characteristics of the domain were further studied in detail by NMR techniques. Compared with CD spectrometry, the NMR method gave more accurate and residue-specific information on the conformation. A large portion of the synthetic peptide formed a regular α-helix in the presence of TFE, as evidenced by the consecutive NOE connectivities (Fig. 3a) (18). Fig. 3a shows that rather strong medium-range ¹H-¹H NOEs of both αβ(i, i+3) and αN(i, i+3) are detected in the region from Phe4 to Lys^^. Furthermore, when the chemical shifts of the α protons of GAP peptide were compared with those obtained for a random-structure peptide (31), characteristic upfield shifts were observed for all α protons except those of two residues near the C terminus (Fig. 3b). This feature is characteristic of α-helical regions (32, 33). These results indicate that the region (Phe4-Lys^^) forms a "regular" α-helix in the presence of TFE.
In the absence of TFE, GAP peptide showed a CD spectrum typical of a random structure (Fig. 2b). Because of resonance overlap, only a few peaks in the NMR spectra could be uniquely assigned. However, as shown in Fig. 3c, the signals of the α protons again showed characteristic upfield shifts, although the shifts were smaller than those obtained in the presence of TFE. Because Ala2, Ile8, Thr9, and Leu13 each occur only once in GAP peptide, and their methyl groups give well-resolved signals in the higher magnetic field region, it was possible to assign these residues. Interestingly, the α proton chemical shifts of all the assigned residues showed values intermediate between those typical of a random coil and those of the α-helix obtained as above (Fig. 3b). Since the observed resonance overlap is characteristic
Fig. 3. NMR analysis of GAP peptide. (a) NOE connectivities of specified proton pairs observed in the NOESY spectra of GAP peptide in the presence of 40% TFE are marked with a bar (αN(i,i+1)), open (αβ(i,i+3)) and/or shaded (αN(i,i+3)) boxes. (b) Deviations in the chemical shifts of some of the α protons in the presence of TFE (open bars) and in the absence of TFE (shaded bars) from those observed with typical random coil (31) are indicated. (c) αH (F1)/NH (F2) region of the 500 MHz DQF-COSY spectrum of GAP peptide in 90% H2O/10% D2O. The regions in which Lys, Arg, Gln, Ser, His, Phe, and Gly in random coil are observed, and the signals observed in α-helix, are indicated.
of a random coil, and the chemical shifts of the α protons showed values intermediate between those of a random coil and those of an α-helical structure, the GAP peptide in aqueous solution appears to assume an intermediate state between a random coil and a regular α-helix. Such a "nascent" helical structure may deviate from ideal geometry, and/or the ends of the α-helix can fray (34, 35). The interaction of GAP peptide with phospholipids seemed to stabilize the conformation and induce an α-helix, as is often the case for "nascent" α-helical structures, which are usually induced or further stabilized by addition of the α-helix-promoting solvent TFE (36, 37).
IV. Conclusions
GAP-43 lacks the hydrophobic regions found in typical membrane proteins, and the palmitoylation that has been implicated in membrane anchoring is not present in the purified protein. However, the effector domain, which is basic and amphiphilic, has the ability to bind acidic phospholipids. The domain adopts an α-helical conformation when placed in hydrophobic environments, as shown by the CD and NMR analyses. A growing body of evidence suggests that the basic amphiphilic α-helical domain, which was initially found as a calmodulin-binding motif, serves as a reversible membrane-association signal.
References
1. Burn, P. (1988) Trends Biochem. Sci. 13, 79-83.
2. Resh, M. D. (1994) Cell 6, 411-413.
3. Taniguchi, H., and Manenti, S. (1993) J. Biol. Chem. 268, 9960-9963.
4. Kim, J., Shishido, T., Jiang, X., Aderem, A., and McLaughlin, S. (1994) J. Biol. Chem. 269, 28214-22821.
5. Peitzsch, R. M., and McLaughlin, S. (1993) Biochemistry 32, 10436-10443.
6. Meiri, K. P., and Gordon-Weeks, P. R. (1990) J. Neurosci. 10, 256-266.
7. LaBate, M. E., and Skene, J. H. P. (1989) Neuron 3, 299-310.
8. Zuber, M. X., Strittmatter, S. M., and Fishman, M. C. (1989) Nature 341, 345-348.
9. Skene, J. H. P., and Virag, I. (1989) J. Cell Biol. 108, 613-624.
10. Taniguchi, H., Suzuki, M., Manenti, S., and Titani, K. (1994) J. Biol. Chem. 269, 22481-22484.
11. Hayashi, N., Matsubara, M., Titani, K., and Taniguchi, H. (1996) in preparation.
12. Blackshear, P. J. (1993) J. Biol. Chem. 268, 1501-1504.
13. Houbre, D., Duportail, G., Deloulme, J. C., and Baudier, J. (1991) J. Biol. Chem. 266, 7121-7123.
14. Kim, J., Blackshear, P. J., Johnson, J. D., and McLaughlin, S. (1994) Biophys. J. 67, 227-237.
15. Manenti, S., Sorokine, O., Van Dorsselaer, A., and Taniguchi, H. (1992) J. Biol. Chem. 267, 22310-22315.
16. Taniguchi, H., Manenti, S., Suzuki, M., and Titani, K. (1994) J. Biol. Chem. 269, 18299-18302.
17. Provencher, S. W., and Glockner, J. (1981) Biochemistry 20, 33-37.
18. Wüthrich, K. (1986) NMR of Proteins and Nucleic Acids, J. Wiley, New York.
19. Bax, A., and Davies, D. G. (1985) J. Magn. Reson. 65, 393-402.
20. Jeener, J., Meier, B. H., Bachman, P., and Ernst, R. R. (1979) J. Chem. Phys. 71, 4546-4553.
21. Macura, S., Hyang, Y., Suter, D., and Ernst, R. R. (1981) J. Magn. Reson. 43, 259-281.
22. Rance, M., Sorensen, O. W., Bodenhausen, G., Wagner, G., Ernst, R. R., and Wüthrich, K. (1983) Biochem. Biophys. Res. Commun. 177, 479-485.
23. Baker, P., and Freeman, R. (1985) J. Magn. Reson. 64, 334-338.
24. Hurd, R. E. (1990) J. Magn. Reson. 87, 422-428.
25. Piotto, M., Saudek, V., and Sklenar, V. (1992) J. Biomol. NMR 2, 661-665.
26. Sklenar, V., Piotto, M., Leppik, R., and Saudek, V. (1993) J. Magn. Reson. 102 (Ser. A), 241-245.
27. Biemann, K. (1992) Annu. Rev. Biochem. 61, 977-1010.
28. Taniguchi, H. (1996) J. Mass Spectrom. Soc. Japan 44, 443-457.
29. McLaughlin, S. (1977) Curr. Top. Membr. Transp. 9, 1-144.
30. McLaughlin, S. (1989) Annu. Rev. Biophys. Biophys. Chem. 18, 13-136.
31. Bundi, A., and Wüthrich, K. (1979) Biopolymers 18, 285-298.
32. Pastore, A., and Saudek, V. (1990) J. Magn. Reson. 90, 165-176.
33. Wishart, D., Sykes, B., and Richards, F. (1991) J. Mol. Biol. 222, 311-333.
34. Dyson, H. J., Merutka, J., Waltho, J. P., Lerner, R. A., and Wright, P. E. (1992) J. Mol. Biol. 226, 795-817.
35. Manning, M. C., Illangasekare, M., and Woody, R. W. (1988) Biophys. Chem. 31, 77-86.
36. Munier, H., Blanco, F. J., Precheur, B., Diesis, E., Nieto, J. L., Craescu, C. T., and Barzu, O. (1993) J. Biol. Chem. 268, 1695-1701.
37. Shang, M., and Vogel, H. J. (1994) J. Biol. Chem. 269, 981-985.
Acknowledgements
We thank Mr. M. Suzuki for technical assistance. This work was supported in part by Grants-in-Aid from the Fujita Health University, the Science Research Promotion Fund from the Japan Private School Promotion Foundation, a Research Grant from the Naito Foundation for Medical Research, a Grant-in-Aid for Scientific Research (C) (06680773), and Grants-in-Aid for Scientific Research on Priority Areas (06253218, 06276218, 07268221, 07279242, 08249240 and 08260220) from the Ministry of Education, Science and Culture, Japan. M.M. is a Research Fellow of the Japan Society for the Promotion of Science.
One-Dimensional Diffusion of a Protein along a Single-Stranded Nucleic Acid
Bradley R. Kelemen and Ronald T. Raines
Department of Biochemistry, University of Wisconsin, Madison, WI 53706-1569
I. Introduction
One-dimensional diffusion can accelerate the formation of site-specific interactions within biopolymers by up to 10^-fold (Berg et al., 1981). Such facilitated diffusion is used by transcription factors and restriction endonucleases to locate specific sites on double-stranded DNA (von Hippel and Berg, 1989). The backbone of RNA, like that of DNA, could allow for the facilitated diffusion of proteins. Yet, the facilitated diffusion of a protein along RNA (or any single-stranded nucleic acid) has not been demonstrated previously. Bovine pancreatic ribonuclease A (RNase A; RNA depolymerase; EC 3.1.27.5) is a distributive endoribonuclease that catalyzes the cleavage of the P-O5' bond of RNA on the 3' side of pyrimidine residues. RNase A binds to polymeric substrates (Imura et al., 1965; Irie et al., 1984; Moussaoui et al., 1995), but the mechanism by which RNase A locates a pyrimidine residue within a polymeric substrate is not known. Binding to phosphoryl groups is important for the one-dimensional diffusion of proteins along DNA (Winter et al., 1981), and may likewise provide the nonspecific interactions necessary to generate one-dimensional diffusion by RNase A. RNase A has three defined phosphoryl group binding subsites, P0, P1, and P2, as well as three base binding subsites, B1, B2, and B3 (Pares et al., 1991). The subsite interactions in the RNase A·RNA complex are shown in Figure 1a. The P0 and P2 subsites interact with phosphoryl groups that remain intact during catalysis; the P1 subsite is the active site. The B1 subsite is responsible for the pyrimidine specificity of RNase A. RNase A cleaves poly(cytidine) [poly(C)] or poly(uridine) [poly(U)] 10^-fold faster than poly(adenosine) [poly(A)] as a result of the selectivity of the B1 subsite. In contrast to the B1 subsite, the B2 and B3 subsites prefer to bind purines. Previously, we demonstrated that enlarging the B1 subsite increases the rate of poly(A) cleavage by 10^-fold (delCardayre and Raines, 1994; delCardayre et al., 1994). This enlargement also converts the distributive mechanism of wild-type RNase A to a processive mechanism when poly(A) is the substrate.
Figure 1. a. Amino acid residues of RNase A that compose the subsites for binding phosphoryl groups (P0, P1, and P2) and bases (B1, B2, and B3) of single-stranded nucleic acids. Residues labeled in panel a: P0 subsite, Lys66; B1 subsite (Cyt/Ura), Thr45, Asp83, Phe120; P1 subsite (scissile bond), Gln11, His12, Lys41, His119, Asp121; remaining labels, Gln69, Asn71, Glu111, Lys7, Arg10, Lys1. b. Fluorescein-labeled deoxynucleotides used to assess binding to the B1 subsite.
Single-stranded DNA is an excellent substrate analog for RNase A, and this analogy is the basis for the work described here. First, we report on the use of DNA oligonucleotides and fluorescence polarization to probe the binding of adenine to the B1 subsite of RNase A. Then, we describe the use of DNA/RNA chimeric oligonucleotides to distinguish between three-dimensional and one-dimensional diffusion mechanisms for catalysis by RNase A. Our results provide a biophysical rationale as well as direct evidence for the diffusion of a protein along a single-stranded nucleic acid.
II. Materials and Methods
A. Oligonucleotide synthesis
DNA and DNA/RNA chimeric oligonucleotides were synthesized with a Model 392 DNA/RNA synthesizer from Applied Biosystems (Foster City, CA) with reagents from Glen Research (Sterling, VA). Oligonucleotides were purified by elution from an acrylamide gel after electrophoresis. To assess binding to the B1 subsite, we synthesized deoxynucleotides that differ only in the base that interacts with the B1 subsite (Figure 1a). The ligands have a uridine (U), adenosine (A), or abasic (0) residue at their 5' ends, followed by two adenosine residues to fill the enzymic B2 and B3 subsites. Each deoxynucleotide is labeled with fluorescein (Fl) so that binding can be detected by fluorescence polarization. The products of these syntheses are shown in Figure 1b. To probe for one-dimensional diffusion, we synthesized DNA/RNA chimeric oligonucleotides. Special precautions were taken to avoid ribonuclease contamination during synthesis, purification, and use of these chimeras. For example, all water was treated with diethylpyrocarbonate before exposure to the chimeras. Ribonucleotide 2'-hydroxyl groups were deprotected with 1 M tetrabutylammonium fluoride in dimethylformamide (Aldrich Chemical; Milwaukee, WI). Purified oligonucleotides were labeled on the 5' end with [γ-32P]ATP (duPont; Wilmington, DE) by T4 kinase (Promega; Madison, WI), and desalted with a NICK™ gel filtration column (Pharmacia; Uppsala, Sweden).
B. Binding
Fluorescence polarization (like fluorescence anisotropy) can be used to measure the rate of tumbling of a fluorescent molecule (Jameson and Sawyer, 1995; Royer, 1995). A receptor (e.g., RNase A) binding a fluorescent ligand (e.g., a labeled nucleic acid) slows the tumbling of the ligand. Accordingly, fluorescence polarization can reveal the fraction of a nucleic acid that is bound to RNase A. Fluorescence polarization experiments were performed as described elsewhere (B. M. Templer and R. T. Raines, unpubl. results). Briefly, RNase A (Sigma Chemical; St. Louis, MO) was dialyzed exhaustively at 4 °C against distilled water to remove salts. The enzyme was then lyophilized. The lyophilized enzyme was suspended in 0.90 mL of 0.10 M Mes-HCl buffer, pH 6.0, containing NaCl (0.10 M), such that the concentration was 1 - 2 mM (15 - 30 mg/mL). Fluorescein-labeled deoxynucleotides were dissolved in buffer and added to half of the enzyme solution to a final concentration of 2 - 3 nM. The sample volume was then raised to 1.00 mL with buffer. A blank containing enzyme but not DNA was made by raising the volume of the remaining enzyme solution to 1.00 mL with buffer. The precise concentration of enzyme was determined by assuming that A = 0.72 at 277.5 nm for a 1.0 mg/mL solution. At least five repetitive fluorescence polarization readings (with individual blank readings) were made at room temperature with a Beacon™ fluorescence polarization instrument (Panvera; Madison, WI). The average and standard deviation were calculated for the readings. The protein sample was then diluted by removing 0.25 mL and replacing it with buffer containing the same concentration of labeled deoxynucleotide as was in the original protein sample. The blank was diluted with buffer. The data collection and dilution steps were repeated up to thirty times. The resulting data were fit to eq 1 by a non-linear least squares analysis, which was weighted by the standard deviation of each reading.
$$P = \frac{P_{\max}\,[\text{RNase A}]}{K_d + [\text{RNase A}]} + P_{\min} \qquad (1)$$
In eq 1, P is the average of the measured fluorescence polarization, Pmin is the polarization of free deoxynucleotide, and Pmax is the polarization at deoxynucleotide saturation minus Pmin. [RNase A] is the protein concentration, and Kd is the equilibrium dissociation constant. For Fl-d(AAA) and Fl-d(0AA), the value of Pmax was poorly defined but apparently similar to that for Fl-d(UAA); therefore, the Pmax of Fl-d(UAA) was used to fit the Fl-d(AAA) and Fl-d(0AA) data.
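The dilution protocol and the weighted fit to eq 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis: the function names, the golden-section search on log10(Kd), the search bounds, and the synthetic Pmax and Pmin values are all hypothetical, and Pmax and Pmin are held fixed (as the text describes for Fl-d(AAA) and Fl-d(0AA)). Because the ligand is at nanomolar concentration, total RNase A is treated as free RNase A.

```python
import math


def polarization(rnase, kd, p_max, p_min):
    """Eq 1: expected polarization (mP) at RNase A concentration `rnase` (M)."""
    return p_max * rnase / (kd + rnase) + p_min


def dilution_series(start, steps, removed_ml=0.25, total_ml=1.00):
    """Protein concentrations after repeatedly replacing `removed_ml` of a
    `total_ml` sample with protein-free buffer (0.75-fold per step here)."""
    factor = (total_ml - removed_ml) / total_ml
    return [start * factor ** i for i in range(steps + 1)]


def fit_kd(concs, readings, sds, p_max, p_min, kd_bounds=(1e-9, 1.0)):
    """Weighted least-squares estimate of Kd (M) with p_max and p_min fixed.

    Minimizes sum(((P_obs - P_calc) / sd)**2) by golden-section search on
    log10(Kd); this simple search stands in for a general nonlinear fit.
    """
    def chi2(log_kd):
        kd = 10.0 ** log_kd
        return sum(((p - polarization(c, kd, p_max, p_min)) / s) ** 2
                   for c, p, s in zip(concs, readings, sds))

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = math.log10(kd_bounds[0]), math.log10(kd_bounds[1])
    for _ in range(100):
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if chi2(a) < chi2(b):
            hi = b  # minimum lies in [lo, b]
        else:
            lo = a  # minimum lies in [a, hi]
    return 10.0 ** ((lo + hi) / 2.0)


if __name__ == "__main__":
    # Synthetic, noise-free readings generated from eq 1 (hypothetical values).
    concs = dilution_series(1.5e-3, 30)
    synthetic = [polarization(c, 1.3e-4, 120.0, 40.0) for c in concs]
    print(fit_kd(concs, synthetic, [1.0] * len(concs), 120.0, 40.0))
```

Thirty dilution steps of 0.75-fold each span nearly four orders of magnitude in protein concentration, which is why a single starting sample can bracket millimolar dissociation constants.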
C. One-dimensional diffusion
Enzymes capable of one-dimensional diffusion should cleave a substrate with a long nonspecific binding region faster than a similar substrate with a short such region (Berg et al., 1981). The substrates used here derive from simpler substrates with long and short nonspecific binding regions (Figure 2a). By merging the simpler substrates into one, evidence for facilitated diffusion can be obtained directly in a single experiment. A conceptually analogous experiment has been performed with EcoRI endonuclease (Jeltsch et al., 1994).
Figure 2a, substrate sequences:
Simple substrates: d(AAAAA)Ud(AAAAA) and d(AAAAA)Ud(AAAAAAAAAAAAAAAAAAAAAAAAA)
Composite substrates:
Oligo 1: d(AAAAA)Ud(AAAAA)Ud(AAAAAAAAAAAAAAAAAAAAAAAAA)
Oligo 2: d(AAAAAAAAAAAAAAAAAAAAAAAAA)Ud(AAAAA)Ud(AAAAA)
Figure 2. a. DNA/RNA chimeric oligonucleotide substrates used to detect one-dimensional diffusion by RNase A. Oligo 1 and Oligo 2 are circular permutations containing two cleavage sites, one of which is proximal to a long nonspecific binding region. b. Products (P1D and P3D) of the cleavage of 5' 32P-labeled Oligo 1 and Oligo 2 by RNase A. P1D results from one-dimensional diffusion of RNase A along the long poly(dA) tract.
Oligo 1 and Oligo 2 are chimeric oligonucleotides that contain 35 DNA residues and 2 RNA residues. The RNA residues are uridine nucleotides, and are referred to as the 1D and 3D sites. We chose this naming system because the 1D site is closer to the long nonspecific binding region and will be cleaved faster if RNase A uses a one-dimensional diffusion mechanism. In both substrates, the 1D cleavage site is flanked on one side by 25 deoxyadenosine residues. The 1D and 3D cleavage sites are separated by 5 deoxyadenosine residues, and 5 more deoxyadenosine residues separate the 3D site from the end. Oligo 1 has the uridine nucleotides near the 5' end, whereas Oligo 2 has the uridine nucleotides near the 3' end. The use of composite substrates could complicate data interpretation because of the possibility of multiple catalytic events on the same substrate. Of course, diffusion in one dimension, like diffusion in three dimensions, cannot be directional (von Hippel and Berg, 1989). Thus, RNase A bound to the long nonspecific binding region should cleave the 1D site faster than the 3D site regardless of the site's proximity to the 5' or 3' end. Comparing the initial rates of cleavage of Oligo 1 and Oligo 2 therefore resolves the complications incurred from the consolidation of substrates. Only two detectable products are formed from the degradation of Oligo 1 or Oligo 2 because only a 5' 32P label is used for detection (Figure 2b). RNase A cleavage at the 1D site produces a detectable product, P1D. Cleavage at the 3D site forms a detectable product, P3D, of a different length. For Oligo 1, P1D is 12 nt and P3D is 6 nt. For Oligo 2, P1D and P3D are 26 and 32 nt, respectively. The ratio [P1D]/[P3D] is approximately equal to the ratio of the initial rates of cleavage at the 1D (k1D) and 3D (k3D) sites (i.e., [P1D]/[P3D] ≈ k1D/k3D). This ratio is an indicator of one-dimensional diffusion of RNase A along Oligo 1 and Oligo 2. A ratio of [P1D]/[P3D] > 1 is indicative of one-dimensional diffusion; [P1D]/[P3D] = 1 is indicative of three-dimensional diffusion.
Assays for one-dimensional diffusion were performed as follows. Reactions were initiated at room temperature by the addition of substrate. The reaction mixture consisted of 0.050 M Mes-HCl buffer, pH 6.0, containing RNase A (1 fmol - 0.1 pmol), NaCl (0.025, 0.12, or 1.0 M), and substrate (0.4 - 0.8 µM). Aliquots (2 µL) of the reaction were quenched at various times by addition to an equal volume of formamide (95% v/v) containing EDTA (20 mM), xylene cyanol (0.05% w/v), and bromophenol blue (0.05% w/v). Less than 10% of the substrate was cleaved during the course of an experiment. Reaction products were separated by electrophoresis on a denaturing 18% (w/v) acrylamide gel. To prevent shattering, these gels were soaked in an aqueous solution of acetic acid (7% v/v) and methanol (7% v/v), then in methanol, before drying under reduced pressure (Thomas et al., 1992). Detection and quantification of cleavage products were made using a PhosphorImager™ radioisotope imaging system from Molecular Dynamics (Sunnyvale, CA).
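The product lengths quoted for the composite substrates follow directly from the sequences in Figure 2a, and a short Python sketch can confirm the bookkeeping. The string encoding (lower-case `a` for deoxyadenosine, `U` for the ribonucleotide) and the function names are assumptions made for this illustration; cleavage is taken to occur on the 3' side of each uridine, so a 5'-labeled product ends with that uridine.

```python
# Composite substrates from Figure 2a: 'a' = deoxyadenosine, 'U' = uridine.
OLIGO_1 = "aaaaa" + "U" + "aaaaa" + "U" + "a" * 25
OLIGO_2 = "a" * 25 + "U" + "aaaaa" + "U" + "aaaaa"


def labeled_product_lengths(oligo):
    """Lengths (nt) of the 5'-labeled fragments released by a single cleavage
    on the 3' side of each uridine in `oligo`."""
    return [i + 1 for i, base in enumerate(oligo) if base == "U"]


def site_products(oligo):
    """Assign the two cleavage sites to 1D and 3D.

    The 1D site is the uridine adjacent to the longer terminal poly(dA)
    tract; the other uridine is the 3D site.
    """
    first, second = labeled_product_lengths(oligo)
    left_tract = first - 1             # dA residues 5' of the first uridine
    right_tract = len(oligo) - second  # dA residues 3' of the second uridine
    if left_tract > right_tract:
        return {"P1D": first, "P3D": second}
    return {"P1D": second, "P3D": first}


if __name__ == "__main__":
    for name, oligo in (("Oligo 1", OLIGO_1), ("Oligo 2", OLIGO_2)):
        print(name, len(oligo), site_products(oligo))
```

Because only the 5' fragment carries the label, each site yields exactly one detectable product, which is what allows the [P1D]/[P3D] band ratio to stand in for the ratio of initial cleavage rates while less than 10% of the substrate has been consumed.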
III. Results
A. Binding
Fluorescence polarization data for the binding of RNase A to Fl-d(UAA), Fl-d(AAA) and Fl-d(0AA) are shown in Figure 3. RNase A binds Fl-d(UAA) approximately 20-fold more tightly than Fl-d(AAA) or Fl-d(0AA), demonstrating that the B1 subsite has affinity for a pyrimidine base. The similarity in binding affinity for Fl-d(AAA) and Fl-d(0AA) indicates that the B1 subsite of RNase A does not bind adenine significantly, but does not discriminate against it.
Figure 3. Binding of RNase A to Fl-d(UAA) (●), Fl-d(AAA) (○), and Fl-d(0AA) (□) as assessed by changes in fluorescence polarization (mP) plotted against [RNase A] (M). Data were obtained in 0.10 M Mes-HCl buffer, pH 6.0, containing NaCl (0.10 M). Data were fit to eq 1, yielding Kd values of 0.13 mM, 3.3 mM, and 2.5 mM for Fl-d(UAA), Fl-d(AAA), and Fl-d(0AA), respectively.
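The "approximately 20-fold" difference in affinity can be reproduced from the Kd values reported for Figure 3. The following Python sketch is a worked arithmetic check only; the dictionary and function names are hypothetical, and the fraction-bound expression is the hyperbolic term of eq 1 with RNase A in large excess over ligand.

```python
# Kd values (M) as reported in the Figure 3 caption.
KD = {"Fl-d(UAA)": 0.13e-3, "Fl-d(AAA)": 3.3e-3, "Fl-d(0AA)": 2.5e-3}


def fold_weaker(ligand, reference="Fl-d(UAA)"):
    """How many times larger the Kd of `ligand` is than that of `reference`."""
    return KD[ligand] / KD[reference]


def fraction_bound(kd, rnase):
    """Fraction of ligand bound at RNase A concentration `rnase` (M)."""
    return rnase / (kd + rnase)


if __name__ == "__main__":
    for ligand in ("Fl-d(AAA)", "Fl-d(0AA)"):
        print(ligand, round(fold_weaker(ligand), 1))
    # Fraction of each ligand bound at 1 mM RNase A.
    for ligand, kd in KD.items():
        print(ligand, round(fraction_bound(kd, 1.0e-3), 2))
```

The two ratios bracket 20, consistent with the statement in the text, and the millimolar Kd values for the adenine and abasic ligands show why millimolar enzyme concentrations were needed to approach saturation.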
B. Facilitated diffusion
A typical time-course for the degradation of Oligo 1 and Oligo 2 by RNase A in the presence of 25 mM NaCl is shown in Figure 4. The concentration of P1D exceeds that of P3D at all times for both Oligo 1 and Oligo 2. These data provide evidence that RNase A uses one-dimensional diffusion to locate pyrimidine nucleotides within a polymeric substrate. The one-dimensional diffusion of RNase A is diminished by added NaCl. The ratio [P1D]/[P3D] for Oligo 1 and Oligo 2 at three concentrations of NaCl is shown in Figure 5. RNase A displays no indication of facilitated diffusion at high NaCl concentration, where [P1D]/[P3D] = 1. At 0.12 M NaCl, [P1D]/[P3D] > 1, indicating that RNase A can use one-dimensional diffusion at NaCl concentrations close to physiological. At 0.025 M NaCl, [P1D]/[P3D] is even greater, consistent with a facilitated diffusion mechanism that relies on nonspecific binding to the phosphoryl groups of poly(dA). Under these low-salt conditions, RNase A also shows the slowest turnover of substrate. As shown in Figure 4, the cleavage occurs in a burst but is then inhibited by products. The size of this burst increases with enzyme concentration (data not shown).
Figure 4. a. Reaction products 0, 1, 2, 5, and 10 min after addition of RNase A to Oligo 1 and Oligo 2. Reactions were performed in 0.050 M Mes-HCl buffer, pH 6.0, containing NaCl (0.025 M). b. Plots of product formation versus time (min) for Oligo 1 and Oligo 2.
Figure 5. The [P1D]/[P3D] ratio versus the log of the concentration of NaCl (M). Data were obtained in 0.050 M Mes-HCl buffer, pH 6.0, containing NaCl (0.025, 0.12, or 1.0 M).
IV. Conclusions
RNase A can use one-dimensional diffusion along a poly(dA) tract to accelerate the location of a uridine substrate. Use of this mechanism depends on the concentration of NaCl, as expected if the enzyme were binding to the nucleic acid by nonspecific interactions with phosphoryl groups. Binding of the enzymic active site to adenosine residues is 20-fold weaker than to uridine residues, which could enhance the ability of the enzyme to slide along the poly(dA) tract. A facilitated diffusion mechanism may have evolved for a sinister purpose. Some homologs of RNase A are cytotoxic because they are able to deliver ribonucleolytic activity to the cytosol of mammalian cells (Youle et al., 1993). Facilitated diffusion may enable these cytotoxic ribonucleases to use the poly(A) tail of mammalian mRNAs as a runway leading to substrates in the indispensable coding region.
References
Berg, O. G., Winter, R. B., and von Hippel, P. H. (1981). Biochemistry 20, 6929-6948.
delCardayre, S. B., and Raines, R. T. (1994). Biochemistry 33, 6031-6037.
delCardayre, S. B., Thompson, J. E., and Raines, R. T. (1994). In "Techniques in Protein Chemistry V" (Crabb, J. W., ed.) pp. 313-320, Academic Press, New York.
Imura, N., Irie, M., and Ukita, T. (1965). J. Biochem. 58, 264-272.
Irie, M., Mikami, R., Monma, K., Ohgi, K., Watanabe, H., Yamaguchi, R., and Nagase, H. (1984). J. Biochem. (Tokyo) 96, 89-96.
Jameson, D. M., and Sawyer, W. H. (1995). Methods Enzymol. 246, 283-300.
Jeltsch, A., Alves, J., Wolfes, H., Maass, G., and Pingoud, A. (1994). Biochemistry 33, 10215-10219.
Jensen, D. E., and von Hippel, P. H. (1976). J. Biol. Chem. 251, 7198-7214.
Moussaoui, M., Guasch, A., Boix, E., Cuchillo, C. M., and Nogues, M. V. (1995). J. Biol. Chem. 271, 4687-3692.
Pares, X., Nogues, M. V., de Llorens, R., and Cuchillo, C. M. (1991). Essays Biochem. 26, 89-103.
Royer, C. A. (1995). Methods Molec. Biol. 40, 65-89.
Thomas, M., Abedi, H., Farzaneh, F. (1992). Biotechniques 13, 533.
von Hippel, P. H., and Berg, O. G. (1989). J. Biol. Chem. 264, 675-678.
Winter, R. B., Berg, O. G., and von Hippel, P. H. (1981). Biochemistry 20, 6961-6977.
Youle, R. J., Newton, D., Wu, Y.-N., Gadina, M., and Rybak, S. M. (1993). Crit. Rev. Therapeutic Drug Carrier Systems 10, 1-28.
Acknowledgements
We thank B. M. Templer and C. A. Royer for advice on fluorescence polarization assays. This work was supported by NIH grant GM44783. BRK was supported by NIH Chemistry-Biology Interface training grant GM08505.
Metal-dependent Structure and Self Association of the RAG1 Zinc-Binding Domain
Karla K. Rodgers and Karen G. Fleming
Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520-8114
I. Introduction
Structural zinc-binding domains are often characterized by the requirement of zinc coordination for proper protein folding [1]. One specific class of zinc-binding motif that will be discussed here is the zinc C3HC4 motif, also known as the RING finger [2]. To date at least eighty proteins include a sequence of approximately 50 residues consistent with a RING finger motif. This conserved sequence, with minor variations in some cases, is defined as follows: C-X2-C-loopI-C-X-H-X2-C-X2-C-loopII-C-X2-C, where X represents any amino acid. A common function attributable to the RING finger module has remained elusive, although a role in protein-protein interactions has been speculated [2]. One of the first RING finger sequences was identified in RAG1, a protein expressed in developing lymphocytes by recombination activating gene-1 [3]. RAG1, along with RAG2, is an essential component of the V(D)J recombination reaction, which produces the genetic sequence encoding the variable regions of the T cell receptor and immunoglobulin chains. Briefly, V(D)J recombination is accomplished via selection and assembly of gene segments known as variable (V), joining (J), and sometimes diversity (D) in an ordered and precisely regulated process (for a review see [4]). The RING finger sequence of RAG1 is present within the N-terminal third of the protein, which contains a total of 1040 residues in the murine form. Besides the RING finger sequence, we have recently identified two C2H2 zinc finger sequences within RAG1 [5]. A domain in RAG1 containing one of the zinc finger modules plus the RING finger forms a highly specific dimer, as characterized by a variety of biophysical techniques [5]. The dimerization of this zinc-binding domain provides further support for the participation of RING fingers in protein-protein interactions.
This dimerization domain of RAG1, previously referred to as R121, will be referred to here as ZDD, zinc-binding dimerization domain. Here we focus on the role of metal binding to the ZDD dimer. In particular, we have investigated the stabilities of different species of ZDD with varying metal-to-protein stoichiometries. Combined with the metal-binding studies, we have further investigated dimer formation of this unique zinc-binding domain while providing additional detail on the techniques and methods used.
II. Materials and Methods
A. ZDD Purification, Metal Exchange and Analysis
ZDD and a fragment of RAG1 including only the RING finger sequence were expressed in E. coli as fusion proteins with maltose binding protein (MBP). These proteins are referred to as MBP-ZDD and MBP-RF, respectively. We have recently described the cloning, expression, and purification of MBP-ZDD and MBP-RF. In addition, the proteolytic cleavage of the MBP-ZDD chimera to generate the ZDD fragment, and its subsequent purification, was done as previously reported [5]. N-terminal amino acid sequencing was done at the W.M. Keck Foundation Biotechnology Resource Laboratory, and electrospray mass spectrometry was done by Walter McMurray at the Yale University School of Medicine. The Zn2-coordinated form of ZDD (Zn2-ZDD) was produced by dialysis of the native ZDD against nitrogen-saturated 20 mM Tris-HCl (pH 7.5), 100 mM NaCl, 5 mM 2-mercaptoethanol (BME), and 1 mM EDTA at 4°C for 17 hours. The Zn2-Cd1 form of ZDD was generated by dialysis of Zn2-ZDD against nitrogen-saturated 20 mM Tris-HCl (pH 7.5), 100 mM NaCl, 5 mM BME supplemented with one molar equivalent of CdCl2 at 4°C for 12 hours. The metal to protein stoichiometry was determined by atomic absorption spectroscopy using an Instrumentation Laboratory IL157 spectrometer. The concentration of metal ions in the RAG1 proteins was measured in solutions of 3 to 10 µM protein. These were compared to either a Zn or Cd calibration curve ranging from 1 to 15 µM, which was measured prior to each protein sample.
B. Circular Dichroism Spectroscopy
Circular dichroism (CD) spectra were collected on an AVIV model 62DS spectropolarimeter using a 0.2 mm path length cell. Protein samples were dialyzed against buffer containing 20 mM sodium phosphate (pH 7.0), 50 mM NaCl, 1 mM BME. Five separate spectra with a step size of 0.5 nm and a 1.5 nm bandwidth were averaged to obtain the final spectrum for each protein sample. The temperature during the scans was maintained at 25°C with a water-jacketed cuvette holder. The molar ellipticity was determined using protein concentrations obtained from amino acid analysis. Thermal denaturation studies were done by collecting data at a single wavelength and increasing the temperature in 1°C increments, equilibrating for 60 sec, and using a 30 sec signal integration time. The melting temperatures (Tm) were determined from the temperature at which the first derivative of the data was at a minimum.
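The Tm criterion described above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the authors' analysis software: the function names are hypothetical, the derivative is taken by central differences on the 1°C grid, and the melting curve is synthetic. Whether the transition appears as a minimum or a maximum of the derivative depends on the sign convention of the monitored signal; the sketch follows the text and looks for the minimum.

```python
import math


def first_derivative(temps, signal):
    """Central-difference derivative d(signal)/dT at the interior points."""
    return [(signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
            for i in range(1, len(temps) - 1)]


def melting_temperature(temps, signal):
    """Temperature at which the first derivative of the data is at a minimum."""
    deriv = first_derivative(temps, signal)
    i_min = min(range(len(deriv)), key=deriv.__getitem__)
    return temps[i_min + 1]  # +1 because the derivative skips the first point


if __name__ == "__main__":
    # Synthetic two-state curve sampled every 1 degree C (hypothetical data):
    # the signal falls from 1 to 0 with a midpoint at 60 degrees C.
    temps = [25.0 + i for i in range(71)]
    signal = [1.0 / (1.0 + math.exp((t - 60.0) / 2.0)) for t in temps]
    print(melting_temperature(temps, signal))
```

With real data, smoothing before differentiation is often needed because finite differences amplify noise; that step is omitted here for brevity.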
C. Analytical Ultracentrifugation
The details of the ultracentrifugation experiments have been described previously [5]. Briefly, equilibrium sedimentation experiments were performed in a Beckman XL-A analytical ultracentrifuge at multiple speeds in buffer containing 20 mM sodium phosphate (pH 7.0), 150 mM NaCl, and 5 mM BME at 20°C. The partial specific volumes were calculated using the values of Cohn & Edsall [6]. The data were analyzed using a modified version of IGOR Pro as well as the MacNONLIN program [7]. The applicable mathematical models for the equilibrium distributions of the isolated ZDD and the MBP-ZDD chimera, respectively, are given by
$$c_i = c_{ref}\exp(\sigma) + base \qquad (1)$$
for a single species, where $\sigma = M(1-\bar{v}\rho)\,\omega^2\,(r_i^2 - r_{ref}^2)/2RT$, $c_i$ is the total concentration at a radial position $r_i$, $c_{ref}$ is the concentration at a reference position, $r_{ref}$, $M$ and $\bar{v}$ are the monomer molecular weight (g/mol) and partial specific volume (ml/g), $\omega$ is the angular velocity (rad/sec), $\rho$ is the solvent density (g/ml), $r_i$ and $r_{ref}$ are the radial positions (cm) at an arbitrary position and at the reference position, $R$ is the universal gas constant (erg/(mol·K)), $T$ is the absolute temperature and $base$ is a term for non-sedimenting material; and
$$c_i = c_{ref}\exp(\sigma) + K\,(c_{ref})^2\exp(2\sigma) + base \qquad (2)$$
for a monomer/dimer distribution, where $K$ is the equilibrium constant, $K(c_{ref})^2$ is the concentration of dimer (using the law of mass action) and other terms are as previously defined. Velocity sedimentation experiments on both proteins at several concentrations were performed at 55,000 rpm in the same buffer at 20°C. The data were analyzed using the time derivative method of Stafford [8].
Calculation of s020,w
A detailed description of the calculation of the $s^0_{20,w}$ parameter used in shape estimations as well as in solution molecular weight determination in conjunction with the $D^0_{20,w}$ from dynamic light scattering has previously been reported [5]. Briefly, since at all concentrations the apparent sedimentation coefficient distributions were symmetrical and approximately Gaussian on the s* scale (data not shown here), the weight average sedimentation coefficient at a particular concentration, $s_{20,solvent}$, was calculated from the apparent distribution function [8]. These values were used to calculate the corresponding sedimentation coefficient, $s^0_{20,w}$, which is corrected to an infinitely dilute protein concentration and to water at 20°C.

2. Calculation of $s_{app}$ and $D_{app}$
An extended analysis of data using the time-derivative method provides for simultaneous determination of apparent sedimentation, $s_{app}$, and apparent diffusion coefficient, $D_{app}$, values at a particular concentration and temperature [9]. The apparent diffusion coefficient was calculated from the apparent sedimentation coefficient distribution by the following relationship:

$$D_{app} = (\sigma r_m \omega^2 t)^2 / 2t \qquad (3)$$

where $r_m$ is the radial position of the meniscus (cm), $t$ is the equivalent sedimentation time (sec), and $\sigma$ is the standard deviation of the g(s*) versus s* curve determined by fitting to the following equation:

$$g(s^*) = A\exp[-0.5((s^* - s_{max})/\sigma)^2] \qquad (4)$$

where $A$ and $\sigma$ are constants, and $s_{max}$ is the sedimentation coefficient given by the maximum position of the g(s*) versus s* curve.

D. Calculation of Molecular Weight from s and D
The Svedberg equation was used to calculate the molecular weight from the sedimentation and diffusion coefficients:

$$\frac{s}{D} = \frac{M(1-\bar{v}\rho)}{RT} \qquad (5)$$

E. Calculation of Shape Factor and Axial Ratio
Calculation of the frictional coefficient, shape factor and axial ratio for the RAG1 fragment (ZDD) from the sedimentation coefficient, $s^0_{20,w}$, has been previously described in detail [5].
III. Results and Discussion

A. The Zinc-binding Dimerization Domain of RAG1
The zinc-binding dimerization domain, ZDD, of RAG1 includes two different zinc-binding modules: a RING finger and a C2H2 zinc finger. ZDD was expressed as a fusion protein with the maltose-binding protein (MBP) in E. coli. It could be efficiently cleaved from the MBP-ZDD chimera after purification via limited proteolysis with trypsin [5]. This domain, previously referred to as R121, was originally believed to consist of 121 residues. However, from electrospray mass spectrometry and N-terminal amino acid sequencing we have determined that the domain corresponds to residues 265 to 380 in the RAG1 full-length sequence after cleavage from MBP, yielding a monomer molecular weight of 13.2 kDa. The position of ZDD relative to other proposed domains in the entire RAG1 sequence is shown in Figure 1. The dimerization domain is located immediately N-terminal to the core region of RAG1, the minimal RAG1 domain required for efficient recombination [10]. The locations of the proposed RING finger and zinc finger modules within ZDD are also illustrated in Figure 1. In addition to ZDD, a fragment containing only the RING finger of RAG1 has been cloned and expressed as a fusion protein with MBP and is referred to as MBP-RF.
B. Metal Binding in the Dimerization Domain
Figure 1. Schematic of proposed domains in RAG1. The top bar represents the full-length murine RAG1 sequence. Solid boxes are zinc-binding domains, with ZFA and ZFB representing two zinc-finger subdomains [5]. Hatched boxes represent positively charged potential nucleic-acid binding regions. Lines beneath the bar indicate the positions of the ZDD and core RAG1 domains. RAG1 clones (the MBP-ZDD and MBP-RF constructs) are represented as bars placed in the corresponding position relative to the full-length protein.

To determine the metal-to-protein stoichiometries of the zinc-binding domains of RAG1, atomic absorption spectroscopy was used. As expected, MBP-RF, which contains only the RING finger sequence of RAG1, binds approximately two zinc ions (1.7±0.2 Zn/protein molecule). The zinc binding stoichiometry of ZDD was found to be 3.2 zinc ions bound per monomer (3.0-3.5 for multiple determinations). Two of these zinc ions bind within the conserved RING finger sequence, with the third bound within the zinc finger module. With measurements ranging as high as 3.5 Zn/monomer, there remains the possibility for the coordination of a fourth zinc ion. Although the geometry of a fourth zinc site is unclear from observation of the primary amino acid sequence, there are several cysteine and histidine residues that could serve as additional coordinating ligands. For the purposes of this report we will refer only to three zinc sites: two in the RING finger and one in the zinc finger. A two zinc-coordinated form (1.80±0.2 Zn/protein molecule) of ZDD (Zn2-ZDD) is easily generated by dialysis against dilute concentrations of EDTA, indicating that one of the three zinc ions is weakly bound as compared to the other two metal ions. Under similar conditions one of the two zinc ions is removed from MBP-RF, which contains only the RING finger module of RAG1. Thus, the location of the labile zinc-binding site can be narrowed down to one of two sites present in the RING finger. Similar results in the COP1 protein indicate that one of its RING finger zinc ions is also relatively labile [11]. Zinc binding at the labile RING finger site is, however, reversible since dialysis of Zn2-ZDD against zinc- or cadmium-containing solutions restores the fully coordinated species. In the case of cadmium solutions, this results in a two zinc-one cadmium coordinated species (1.9±0.2 Zn and 0.9±0.2 Cd/protein molecule).
C. Structural Stability of the Dimerization Domain
The circular dichroism (CD) spectrum of the fully zinc-coordinated (native) ZDD as well as of an apo form is shown in Figure 2A. Removal of all zinc ions to produce the apo form of the domain results in extensive loss of ordered secondary structure as judged by the reduction of molar ellipticity in the CD spectrum. As in other structural zinc-binding domains, we conclude that the free energy associated with the coordination of metal ions is necessary for correct folding of ZDD. We also used CD to ascertain the effects on structure and stability of the ZDD domain upon removal of the labile-bound RING finger zinc ion. In this case, the CD spectrum shows a 36% loss in molar ellipticity at 204 nm as compared to the native domain, indicating partial unfolding upon release of the metal ion from the labile site (shown in Figure 2A). These observations support the conclusion that the labile zinc plays an important role in the determination and stabilization of the local secondary structure in the RING finger subdomain.
The thermal stability of ZDD was determined by monitoring the temperature dependence of the molar ellipticity at 204 nm. ZDD was found to be quite stable with a melting temperature of 78±1°C (Figure 2B); however, a ΔG of unfolding could not be calculated since the denaturation was completely irreversible. The Zn2-ZDD fragment lacking the labile zinc ion was found to be significantly less stable than the native form. Its melting temperature was 67±2°C, 11°C lower than that of the fully metal-coordinated fragment. Further, the melting curve of Zn2-ZDD displayed less cooperativity than the native form. Native-like stability and cooperativity could be recovered upon dialysis against cadmium-containing solutions to produce the triply liganded Zn2Cd1 species (data not shown). The reduced stability and decreased cooperativity observed for ZDD missing one of the RING finger zinc ions suggests that the extent of zinc binding can fine-tune structural properties of not only the RING finger subdomain, but also of the entire ZDD domain.

Figure 2. Circular Dichroism of ZDD. A, Spectra of the fully zinc-coordinated form (native), a Zn2 species, and an apo-form of the fragment. B, Thermal denaturation curves of native ZDD as compared to a Zn2 species, plotted against temperature (°C).

That removal of one of the coordinated zinc ions from the RING finger can have a major influence on structural stability is supported by previous structural studies of homologous RING finger domains. Specifically, high resolution structures of two different RING finger modules from equine herpes virus (EHV) gene 63 and a putative human transcription factor, PML, have been solved by nuclear magnetic resonance [12, 13]. Common features of both structures reveal two separate zinc-binding sites, with the zinc ions separated by approximately 14 Å. The polypeptide chain alternately winds between the two zinc sites, such that the first and third pairs of Cys ligands coordinate to one zinc ion, with the second zinc ion ligated by the second and fourth pairs of Cys and His ligands. This unique feature is accomplished via an antiparallel β sheet situated between the individual zinc-binding sites. Thus, removal of one of the zinc ions from the RING finger module would most likely result in partial disruption of this β sheet.

The variation in affinity for zinc ions between multiple zinc-binding sites has been demonstrated with other zinc-binding subdomains. One example is the zinc binuclear cluster in the GAL4 DNA binding domain, in which one of the two zinc ions is bound with higher affinity. The single zinc species of GAL4 shows a marked decrease in the free energy of binding to its specific DNA sequence, showing that differential affinities for zinc coordination have consequences for protein function as well [14].
D. Solution Properties of the Zn-Binding Domain
Sedimentation equilibrium analytical ultracentrifugation of isolated ZDD was used to determine its solution molecular weight. These experiments revealed the presence of a single species in solution corresponding to the molecular mass of the dimeric ZDD fragment. Global analysis of the equilibrium data using equation 1 yielded a molecular weight within 2% of that predicted from the amino acid sequence [5]. From the observed concentration range, an upper limit for the equilibrium dissociation constant could be estimated as 14 µM.

Figure 3. Sedimentation Equilibrium of MBP-ZDD. B, Equilibrium distribution of MBP-ZDD at 15000 rpm, plotted against radius (cm). The monomer and dimer exponentials, whose sum gives rise to the model fit, as well as the sum itself are shown by the solid lines. The circles are the data points. A, Residuals of the fit. C, Relative concentrations of monomer and dimer plotted against total concentration, M (monomer). The thickened portions of the curves indicate the concentration range wherein the analysis was carried out. The thin portions of the curves are extrapolated from analysis of those data.

Sedimentation equilibrium of the MBP-ZDD chimera, which has a larger extinction coefficient at 280 nm, permitted experiments to be done at low enough molar concentrations to detect significant amounts of monomeric protein using the absorbance optics. An exponential distribution of the chimeric protein at 15,000 rpm is shown in Figure 3B. The observed data are best described by equation 2, which is the sum of two exponentials corresponding to the distributions of the monomeric and dimeric chimeric proteins. The equilibrium dissociation constant was found to be 3.12 µM (±16%). Using the parameters derived from the monomer/dimer fit, Figure 3C shows the relative concentrations of the RAG1 monomer and dimer as a function of total monomer concentration, where it can be seen that the protein is predominantly dimeric at concentrations above 5 µM. Although all ultracentrifugation measurements were done in buffers which contained no excess zinc, bound zinc ions are required for the specific homodimer formation of ZDD, as the apo form of the domain was shown to be unfolded and nonspecifically aggregated [5]. Atomic absorption spectroscopy of the samples in the buffers used for these experiments confirmed the expected stoichiometry of zinc binding.
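The relationship between the dissociation constant and the monomer/dimer populations plotted in Figure 3C can be sketched with the law of mass action. The Python snippet below is an illustration only: it assumes the dissociation constant is defined as Kd = [M]^2/[D] with concentrations expressed on a molar monomer scale, and the function name `monomer_dimer_fractions` is hypothetical. Only the Kd of 3.12 µM and the 5 µM reference concentration come from the text.

```python
import math


def monomer_dimer_fractions(total_uM, kd_uM):
    """Return (monomer fraction, dimer fraction) of total protein.

    Assumes Kd = [M]^2 / [D] and total = [M] + 2[D], with total expressed
    as monomer equivalents. Solving 2[M]^2/Kd + [M] - total = 0 gives [M].
    """
    if total_uM <= 0 or kd_uM <= 0:
        raise ValueError("concentrations must be positive")
    monomer = kd_uM * (math.sqrt(1.0 + 8.0 * total_uM / kd_uM) - 1.0) / 4.0
    f_monomer = monomer / total_uM
    return f_monomer, 1.0 - f_monomer


KD_UM = 3.12  # equilibrium dissociation constant reported in the text

for total in (0.5, 3.12, 5.0, 20.0):
    f_mono, f_dimer = monomer_dimer_fractions(total, KD_UM)
    print(f"total {total:6.2f} uM: monomer {f_mono:.2f}, dimer {f_dimer:.2f}")
```

Under this definition the two populations are equal when the total monomer concentration equals Kd, so the protein is more than half dimeric above roughly 3 µM, in line with the statement that it is predominantly dimeric above 5 µM.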
1. Combination of Hydrodynamic Parameters for Molecular Weight Determination
The dimeric molecular weight of the isolated ZDD fragment was further confirmed by combining sedimentation and diffusion coefficient measurements. Even though both of these coefficients are hydrodynamic measures of a macromolecule, the ratio of s to D is proportional to the molecular weight by the Svedberg equation (equation 5). By using the Svedberg relationship, the shape and hydration factors inherent in each coefficient cancel out, and the molecular weight can be calculated.

2. Application of the Svedberg Equation Using $s^0_{20,w}$ and $D^0_{20,w}$ Values
We first applied the Svedberg equation to calculation of the solution molecular weight from the $s^0_{20,w}$, as measured by velocity sedimentation, and the $D^0_{20,w}$, as measured by dynamic light scattering. As previously described, we determined values of 2.44 S and 7.97 F for the $s^0_{20,w}$ and the $D^0_{20,w}$ coefficients, respectively [5]. Combining these two coefficients, obtained by two independent experimental approaches, in the Svedberg equation yielded a solution molecular weight of 29.2 kDa for purified ZDD, which is within 10% of the dimeric mass determined by electrospray mass spectrometry.

3. Application of the Svedberg Equation Using $s_{app}$ and $D_{app}$ Values
The molecular weight was also calculated from velocity sedimentation analysis alone by simultaneous determination of the apparent sedimentation, $s_{app}$, and diffusion, $D_{app}$, coefficients using an extended analysis of the time-derivative method. It has recently been shown that the diffusion coefficient of the macromolecule is related to the standard deviation of the g(s*) versus s* curve fitted to equation 4 [9]. Figure 4B shows such a fit to ZDD sedimentation velocity data. Using equations 3 and 4, we calculated an $s_{app}$ of 2.33 S and a $D_{app}$ of 8.21 F. Combining these simultaneously determined parameters in the Svedberg equation yielded a solution molecular weight of 26.8 kDa, within 2% of the molecular weight as measured by electrospray mass spectrometry. A major advantage of using the time derivative method is the rapidity with which one can determine the solution molecular weight of an ideal, monodisperse macromolecule. Essentially, the data collection and analysis can both be done in one afternoon to yield an estimate of the solution oligomeric state.
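The arithmetic behind equations 3 and 5 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' analysis code: the partial specific volume (0.74 ml/g) and solvent density (1.00 g/ml) are placeholder values, since the actual values are not given in this section, and the inputs to `diffusion_from_sigma` (peak width, meniscus position, run time) are hypothetical. Only the s and D values and the 55,000 rpm rotor speed are taken from the text. The function names are invented for this example.

```python
import math

R_ERG = 8.314462e7  # gas constant in erg/(mol K), matching the cgs units used here


def diffusion_from_sigma(sigma_svedberg, r_meniscus_cm, rpm, t_sec):
    """Equation 3: D_app = (sigma * r_m * omega^2 * t)^2 / (2 t), in cm^2/sec."""
    omega = rpm * 2.0 * math.pi / 60.0   # angular velocity, rad/sec
    sigma = sigma_svedberg * 1e-13       # svedbergs -> seconds
    return (sigma * r_meniscus_cm * omega ** 2 * t_sec) ** 2 / (2.0 * t_sec)


def svedberg_mass(s_svedberg, d_fick, vbar, rho, temp_k=293.15):
    """Equation 5 solved for M: M = s R T / (D (1 - vbar * rho)), in g/mol."""
    s = s_svedberg * 1e-13   # svedbergs -> seconds
    d = d_fick * 1e-7        # ficks -> cm^2/sec
    return s * R_ERG * temp_k / (d * (1.0 - vbar * rho))


VBAR_ASSUMED = 0.74  # ml/g, placeholder value (assumption)
RHO_ASSUMED = 1.00   # g/ml, placeholder value (assumption)

# s and D values quoted in the text
print("M from s_app, D_app (g/mol):",
      round(svedberg_mass(2.33, 8.21, VBAR_ASSUMED, RHO_ASSUMED)))
print("M from s0_20,w, D0_20,w (g/mol):",
      round(svedberg_mass(2.44, 7.97, VBAR_ASSUMED, RHO_ASSUMED)))

# Hypothetical peak width, meniscus radius and run time, for illustration only
d_app = diffusion_from_sigma(0.9, 6.0, 55000, 5000.0)
print("D_app from a 0.9 S wide peak (cm^2/sec):", d_app)
```

Because the result scales with 1/(1 - v̄ρ), the computed mass is sensitive to the assumed partial specific volume, so the printed masses should only be expected to land near the reported 26.8 and 29.2 kDa, not to reproduce them exactly.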
Figure 4. Sedimentation Velocity Analysis of ZDD. A, Primary data collected at 1 mg/ml (10 scans), plotted against radius (cm). B, Apparent sedimentation coefficient distribution function, g(s*) versus s* (svedbergs). The error bars represent the standard error of the mean. The solid line is the fit to equation 4. Apparent s, D, and $M_{s,D}$ values were calculated as described. Panel annotations: $s_{app}$ = 2.33 S, $D_{app}$ = 8.21 F, $M_{s,D}$ = 26.8 kDa, $M_{mass\ spec.}$ = 26.4 kDa.
E. Interpretation of Shape Parameters
Insight into the overall shape of the ZDD dimer in solution was obtained by interpretation of sedimentation velocity and small-angle X-ray scattering (SAXS) experiments [5]. An experimental frictional coefficient, $f_{20,w}$, was calculated from the sedimentation coefficient, $s^0_{20,w}$, and using an estimate of the protein hydration, the shape factor of ZDD was found to be 1.14. When modelled as a prolate ellipsoid of revolution using Perrin's law, this shape factor corresponds to an axial ratio of 3.2, indicating a quite elongated structure. We have previously reported small-angle X-ray scattering results that gave values for the radius of gyration (Rg = 23.4 Å), as well as the maximum dimension (dmax = 89 Å), for the ZDD dimer [5]. Again, modelled as a prolate ellipsoid of revolution, a major axis of 89 Å (from dmax) would require equivalent minor axes of 27 Å in order to enclose a volume consistent with the molecular weight and partial specific volume of the dimer. Thus, these studies gave an axial ratio of 3.3, consistent with that obtained from sedimentation velocity experiments. Although the values obtained from these separate techniques cannot be directly compared, as velocity sedimentation is a hydrodynamic measure of the molecule in contrast to small-angle X-ray scattering, both results indicate that the ZDD dimer is likely to be more elongated than spherical in overall shape.
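The SAXS-based axial ratio follows from simple geometry, which the Python sketch below illustrates. It is a rough check under stated assumptions: the partial specific volume (0.74 ml/g) is a placeholder, the dimer mass of 26.4 kDa is taken from the mass spectrometry value quoted in Figure 4, hydration is ignored, and the function names are hypothetical. It is not the authors' calculation.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def anhydrous_volume_A3(mass_g_per_mol, vbar_ml_per_g):
    """Volume of one molecule in cubic angstroms (1 ml = 1e24 cubic angstroms)."""
    return mass_g_per_mol * vbar_ml_per_g / AVOGADRO * 1e24


def prolate_minor_axis_A(volume_A3, major_axis_A):
    """Minor axis (full length) of a prolate ellipsoid with the given volume.

    V = (4/3) * pi * a * b^2, where a and b are the semi-major and
    semi-minor axes, so b = sqrt(3 V / (4 pi a)).
    """
    a = major_axis_A / 2.0
    b = math.sqrt(3.0 * volume_A3 / (4.0 * math.pi * a))
    return 2.0 * b


DIMER_MASS = 26400.0  # g/mol, electrospray value quoted in Figure 4
VBAR_ASSUMED = 0.74   # ml/g, placeholder value (assumption)
D_MAX = 89.0          # angstroms, maximum dimension from SAXS

volume = anhydrous_volume_A3(DIMER_MASS, VBAR_ASSUMED)
minor = prolate_minor_axis_A(volume, D_MAX)
print("volume (A^3):", round(volume))
print("minor axis (A):", round(minor, 1))
print("axial ratio:", round(D_MAX / minor, 2))
```

With these placeholder inputs the minor axis and axial ratio come out close to the 27 Å and 3.3 quoted in the text; including bound water would enlarge the volume and lower the ratio slightly.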
IV. Conclusions

Using a combination of biophysical techniques we have defined the solution properties of the amino terminal zinc-binding domain of the recombination activating protein, RAG1. The ZDD domain consists of two types of zinc-binding subdomains: a RING finger and a zinc finger, both of which appear to be intimately involved in the structural determination and stability of this domain. Full metal coordination is required for proper folding, since even the loss of one zinc ion results in significant alterations to the structure and stability of this protein. This zinc-binding RAG1 domain self-associates in solution to form a stable dimer. The dimeric oligomeric state was confirmed by combining complementary hydrodynamic parameters in the Svedberg equation to yield the solution molecular weight as well as by direct measurement in equilibrium sedimentation experiments. We were further able to measure the equilibrium dissociation constant of the dimerization reaction by equilibrium sedimentation of the MBP-ZDD fusion protein, which allowed us to access much lower concentrations than were possible with the ZDD fragment alone. The free energy of the interaction shows that the dimer forms with relatively high affinity, suggesting that dimerization may play an important role in the physiological function of RAG1.

The overall shape of the RAG1 zinc-binding dimerization domain is elongated as modelled by a prolate ellipsoid of revolution. Both sedimentation velocity and small-angle X-ray scattering experiments yielded axial ratios consistent with an extended molecule in solution for the ZDD dimer. This zinc-binding dimerization domain in RAG1 is positioned immediately N-terminal to the essential core region of the entire RAG1 protein (Figure 1). Stable dimerization of such an elongated structure ensures a specific positioning of the zinc-binding domain monomers with respect to each other. In such a manner, this domain is poised to orient and bring together the core region of RAG1 for optimum function. Given the strong influence of zinc coordination on the structure and stability of this domain, it is plausible that the extent of zinc binding may modulate the tertiary and quaternary structure of RAG1, possibly contributing to mechanisms of effective cellular control for V(D)J recombination.
Acknowledgements

We thank Charles B. Millard and Clarence A. Broomfield for generous use of their ultracentrifuge. We thank Joseph E. Coleman, David G. Schatz, Preston Hensley, and Walter F. Stafford, III for helpful discussions. This work was supported by NIH grants DK09070 to JEC, AI32524 to DGS, GM16039 to KKR and GM16769 to KGF.
References

1. Schwabe, J.W.R. and Klug, A. (1994) Nature Struct. Biol. 1: 345-349.
2. Saurin, A.J., Borden, K.L.B., Boddy, M.N., and Freemont, P.S. (1996) Trends Biochem. Sci. 21: 208-214.
3. Schatz, D.G., Oettinger, M.A., and Baltimore, D. (1989) Cell 59: 1035-1048.
4. Lewis, S.M. (1994) Advan. Immunol. 56: 27-150.
5. Rodgers, K.K., Bu, Z., Fleming, K.G., Schatz, D.G., Engelman, D.M., and Coleman, J.E. (1996) J. Mol. Biol. 260: 70-84.
6. Cohn, E.J. and Edsall, J.T. (1943) in Proteins, Amino Acids and Peptides. Reinhold Publishing Corporation: New York. pp. 370-381.
7. Johnson, M.L., Correia, J.J., Yphantis, D.A., and Halvorson, H.R. (1981) Biophys. J. 36: 575-588.
8. Stafford III, W.F. (1992) Anal. Biochem. 203: 295-301.
9. Stafford III, W.F. (1996) Biophys. J. 70: M-Pos452.
10. McBlane, J.F., van Gent, D.C., Ramsden, D.A., Romeo, C., Cuomo, C.A., Gellert, M., and Oettinger, M.A. (1995) Cell 83: 387-395.
11. von Arnim, A.G. and Deng, X.W. (1993) J. Biol. Chem. 268: 19626-19631.
12. Barlow, P.N., Luisi, B., Milner, A., Elliot, M., and Everett, R. (1994) J. Mol. Biol. 237: 201-211.
13. Borden, K.L.B., Boddy, M.N., Lally, J., O'Reilly, N.J., Martin, S., Howe, K., Solomon, E., and Freemont, P.S. (1995) EMBO J. 14: 1532-1541.
14. Rodgers, K.K. and Coleman, J.E. (1994) Protein Sci. 3: 608-619.
Localizing Flexibility within the Target Site of DNA-bending Proteins

Anne Grove and E. Peter Geiduschek
Department of Biology and Center for Molecular Genetics, University of California, San Diego, La Jolla, CA 92093-0634
I. Sequence-specific DNA Bendability

DNA is not the perfect double helix of traditional textbooks. Slight but significant structure variations have been demonstrated from comparison of crystal structures of oligonucleotides. It has also become evident that these structure variations are not entirely determined by the individual base-steps (AA, AT, GC, etc.), but are influenced by sequence contexts (1,2). The emerging picture of the DNA duplex, in fact, suggests a dynamic structure that is continuously contorted in a sequence-dependent manner. Macroscopic DNA bending, which is a frequent consequence of interaction with proteins, is generated by the cumulative effects of changes in local variables such as twist and roll. Substantial DNA bending usually involves a change in roll angles that results in a compression of the major groove, presumably because charge repulsion between the sugar-phosphate backbones opposes a compression of the minor groove (3).

Accommodation of DNA in a complex that involves significant DNA curvature or looping must reflect its propensity for bending (i.e. its anisotropic flexibility). Analysis of the distribution of DNA sequences in nucleosome structures has yielded a statistical profile of trinucleotide sequences that are more tolerant of bending (2,4,5). A similar data set has been obtained by analysis of the relative accessibility of DNA to cleavage by DNase I, as variations in cutting frequency may be interpreted in terms of the widening of the minor groove that accompanies DNA bending away from the enzyme (6). The TA step has received particular attention due to its frequent use in binding sites for DNA-bending proteins, a preference that has been rationalized by a greater range of allowable roll angles (7). For the nucleosome core particle, bendability is a major determinant of specific positioning. For proteins that introduce sharp kinks in DNA upon binding, bending appears to supplement sequence-specific recognition of binding sites (8).

The variable flexibility that is built into the DNA sequence is obviously not the sole determinant of protein binding. Specificity in binding site selection must derive from interactions with the protein that are peripheral to the bending locus. We have reported a strategy that aims to evaluate the contribution of DNA flexibility to complex formation by measuring the binding of DNA-bending proteins to DNA in which flexibility has been imposed by tandem mismatches (9,10), here discussed in the context of the prokaryotic type II DNA-binding proteins.
II. Architectural DNA-Binding Proteins: Preference for Prebent DNA

Bacterial nucleoids are dense structures in which DNA supercoiling and compaction is assisted by DNA-bending proteins (11,12). Several abundant proteins are associated with this nucleoid, forming what is somewhat loosely referred to as "bacterial chromatin". In Escherichia coli, four abundant proteins are associated with the nucleoid: H-NS, Fis, HU and Integration Host Factor (IHF) (12,13), all of which bend DNA. HU binds preferentially to bent or deformed DNA, such as four-way (cruciform) junctions and DNA with nicks and gaps (14-16), and in that respect resembles the eukaryotic HMG-domain proteins, which have increased affinities for cruciform structures and cisplatin DNA adducts (17,18). Both Holliday junctions and cisplatin adducts are thought to cause two helical DNA segments to form a sharp angle (19) and may allow increased binding by lessening the energetic cost of DNA bending.

HU and IHF are members of the ubiquitous prokaryotic family of type II DNA-binding proteins, all dimers of 90- to 99-amino acid subunits. Most HU proteins are homodimers, whereas IHF is heterodimeric. The structure of Bacillus stearothermophilus HU (20-23) revealed that a flexible, anti-parallel β-hairpin arm extends from each monomer, as though poised to embrace a DNA double helix. Whereas HU binds to DNA non-specifically, E. coli IHF binds relatively tightly (Kd in the nM range) to unique sites. Comparison of numerous IHF binding sites established a 9 bp interrupted consensus sequence (AATCAAxxxxTTA), asymmetrically disposed within a ~30 bp site (24-26). The large binding site, determined by DNase I footprinting, and the sharp DNA bending suggest that DNA bends toward the protein and wraps around it.

The genome of the B. subtilis bacteriophage SPO1 contains hydroxymethyluracil (hmU) entirely replacing T. Phage SPO1 encodes TF1, which is homologous to HU and possesses similar structural features (27,28). Like HU and IHF, TF1 is very abundant, with more than 50,000 dimers accumulating in an SPO1-infected cell. TF1 binds preferentially to hmU-containing DNA relative to T-containing DNA, prefers double-stranded over single-stranded DNA, and binds to selected sites in the phage genome. Only a few TF1 sites have been sequenced, and no strong consensus has been found. TF1 sharply bends DNA and, in so doing, wraps the DNA around the body of the protein to allow interactions with a ~30 bp site, very similar to IHF (29).

The type II DNA-binding proteins in prokaryotes and the HMG box proteins in eukaryotes are regarded as architectural, as they are thought to mold the DNA into a conformation that facilitates the formation of higher order protein-DNA assemblies (30). The ability to interchange these proteins in processes such as phage λ site-specific recombination (with variable and in some cases limited efficiency) suggested a common function and motivated this designation (e.g. ref. 31).
III. DNA Loops: Effect on Protein Binding

Since the type II DNA-binding proteins sharply bend DNA, it follows that sequence-dependent DNA deformability may contribute to the selection of preferred binding sites. This direction of thinking was explored by designing and synthesizing DNA with specifically placed loops, which should confer site-specific flexibility, and by analyzing these loop-containing duplexes for protein binding. The strategy is outlined below for two members of the family of type II DNA-binding proteins: TF1, which exhibits sequence-specific DNA binding only in the context of hmU-DNA, and IHF, for which a consensus sequence is known (26). Details of the experimental design have been reported (9,10).

Several properties were considered in designing loop-constructs. (i) Loops consisting of three consecutive mismatches were reported to enhance DNA flexibility (32). Based on protein binding studies, we concluded that tandem mismatches (4-nt loops) reproduce the effect of 6-nt loops in terms of increased DNA flexure but are preferable for studying protein binding, as the introduction of three consecutive base-substitutions is more likely to disrupt specific contacts (10). Only constructs with 4-nt loops were used for subsequent analyses. (ii) For the dimeric type II DNA-binding proteins, structural analysis indicated that the DNA was likely to be distorted at two sites; loop-constructs in which sets of loops were separated by variable spacings were therefore evaluated for protein binding. (iii) The asymmetrical disposition of the IHF consensus sequence required that sets of loops with optimized spacing be differently positioned across the binding region. (iv) Differences in affinity between constructs with separate loop-spacings or placements were compared to variations contributed by loops of different nucleotide composition; such sequence variations had only secondary effects on affinity (10). (v) The length of the DNA construct was selected to accommodate only one protein molecule, to reduce opportunities for alternative placements.
A. TF1

For TF1, the reference DNA sequence corresponds to a preferred binding site in the SPO1 genome (Figure 1). A set of 37-mer T-containing DNA constructs was prepared with pairs of 4-nt loops placed symmetrically about the center of the binding site, spaced apart by 7-11 bp. Protein binding was evaluated by electrophoretic mobility shift assays, and equilibrium dissociation constants, Kd, were determined from the slopes of Scatchard plots (10). Four-nt loops separated by 9 bp of duplex DNA are optimal for TF1 binding (Kd ~3 nM; Figure 2); other loop-separations generate suboptimal binding. When the formation of TF1 complexes with short duplex DNA is monitored by gel electrophoresis, the discrimination is effectively absolute, because affinity differences are compounded by the greater rate of dissociation of less stable complexes in the gel. To the extent that loops generate partly single-stranded regions, this would not be expected to increase the affinity of TF1, which prefers duplex DNA. We interpret our results to suggest that increased binding of TF1 to loop-containing duplexes is due to recognition based on DNA deformability and that DNA in a complex with TF1 is distorted at two sites separated by 9 bp of duplex (10).
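For readers unfamiliar with Scatchard analysis, the following Python sketch shows how a dissociation constant is read from the slope of a plot of bound/free against bound. It is a generic illustration, not the procedure of ref. 10: the data are synthetic, generated from a single-site model with a hypothetical Kd of 3 nM and a hypothetical 1 nM of binding sites, and the free protein concentrations simply reuse the lane concentrations listed for Figure 2. The function name `scatchard_kd` is invented here.

```python
def scatchard_kd(bound, free):
    """Estimate Kd from a Scatchard plot (bound/free versus bound).

    For single-site binding, bound/free = (Bmax - bound) / Kd, so the
    least-squares slope of the plot equals -1/Kd.
    """
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return -1.0 / slope


# Synthetic single-site data (hypothetical values, nM)
TRUE_KD, BMAX = 3.0, 1.0
free_nM = [3.0, 9.0, 14.0, 20.0, 27.0, 41.0, 54.0]
bound_nM = [BMAX * f / (TRUE_KD + f) for f in free_nM]

print("Kd recovered from the Scatchard slope (nM):", scatchard_kd(bound_nM, free_nM))
```

With noisy gel-shift data the points would scatter about the line, and nonlinear fitting of the binding isotherm is often preferred; the linear form is shown here only because it is the analysis named in the text.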
A  No loop
   5'-CCTAGGCTACACCTACTCTTTGTAAGAATTAAGCTTC-3'
   3'-GGATCCGATGTGGATGAGAAACATTCTTAATTCGAAG-5'

B  4-nt loops (bottom strands)
   4-nt (spacing 7)    3'-GGATCCGATGTGGTAGAGAAACTATCTTAATTCGAAG-5'
   4-nt (spacing 8)    3'-GGATCCGATGTGCTTGAGAAACTATCTTAATTCGAAG-5'
   4-nt (spacing 9)    3'-GGATCCGATGTGCTTGAGAAACAAACTTAATTCGAAG-5'
   4-nt (spacing 10)   3'-GGATCCGATGTCCATGAGAAACAAACTTAATTCGAAG-5'
   4-nt (spacing 11)   3'-GGATCCGATGTCCATGAGAAACATAGTTAATTCGAAG-5'

Figure 1. Sequences of 37-mer oligonucleotides corresponding to a preferred binding site for TF1. The position of a short inverted repeat flanking the center of the binding site is indicated by arrows, and two TA steps 9 bp apart are noted by asterisks. For loop-containing duplexes, the sequence of the bottom strand is altered to generate mismatches of identical nucleotides. Sequences generating loops are underlined. Oligonucleotides with T-content were purchased and purified by denaturing polyacrylamide gel electrophoresis. The top strand (shared among all DNA constructs) was 32P-labeled at the 5'-end using T4 polynucleotide kinase. Complementary oligonucleotides were mixed stoichiometrically, heated to 90°C and slowly cooled to 4°C over several hours to form duplex DNA.
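The loop positions in Figure 1 can be checked directly from the listed sequences. The Python sketch below pairs the shared top strand with each bottom strand, reports the mismatched positions, and counts the base pairs between the two loops. One assumption is made: the top strand contains a single garbled character in this copy of the figure, which is restored here as A by complementarity with the no-loop bottom strand. The helper names are hypothetical.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Top strand, 5'->3'. Position 25 is restored as A by complementarity (assumption).
TOP = "CCTAGGCTACACCTACTCTTTGTAAGAATTAAGCTTC"

# Bottom strands written 3'->5', so each base sits under its partner in TOP.
BOTTOMS = {
    7: "GGATCCGATGTGGTAGAGAAACTATCTTAATTCGAAG",
    8: "GGATCCGATGTGCTTGAGAAACTATCTTAATTCGAAG",
    9: "GGATCCGATGTGCTTGAGAAACAAACTTAATTCGAAG",
    10: "GGATCCGATGTCCATGAGAAACAAACTTAATTCGAAG",
    11: "GGATCCGATGTCCATGAGAAACATAGTTAATTCGAAG",
}


def mismatch_positions(top, bottom):
    """Return 1-based positions where the two strands are not Watson-Crick paired."""
    return [i for i, (t, b) in enumerate(zip(top, bottom), start=1) if PAIR[t] != b]


def loop_spacing(positions):
    """Number of paired positions between the two runs of mismatches."""
    gaps = [b - a - 1 for a, b in zip(positions, positions[1:]) if b - a > 1]
    if len(gaps) != 1:
        raise ValueError("expected exactly two runs of mismatches")
    return gaps[0]


for label, bottom in BOTTOMS.items():
    positions = mismatch_positions(TOP, bottom)
    print(f"spacing {label}: mismatches at {positions}, "
          f"{loop_spacing(positions)} bp between loops")
```

Each construct carries two tandem mismatches (two 4-nt loops), and the number of paired positions between them matches the spacing given in its label.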
Flexibility within the DNA Target Site
589
[Figure 2 gel images. Panel A: bands labeled Complex, dsDNA and ssDNA; lanes at 14, 20, 27, 41, 54, 68 and 95 nM TF1. Panel B: bands labeled Complex, dsDNA and ssDNA; lanes at 3, 9, 14, 20, 27, 41 and 54 nM TF1.]
Figure 2. Electrophoretic mobility shift analysis of TF1 binding to (A) perfect duplex or (B) duplex with two 4-nt loops separated by 9 bp. Protein concentrations are indicated below.
B. IHF
Unlike TF1, IHF exhibits sequence specificity in T-DNA, yet is anticipated to interact with DNA in a comparable fashion. The approach to evaluating the contribution of sequence-dependent DNA flexibility to complex formation must therefore consider not only the optimal spacing between sets of 4-nt loops, but also the location of loops with respect to the consensus sequence. The resulting iterative process showed that IHF has highest affinity for loops separated by 8-9 bp, even if the DNA sequence does not have a strong consensus (9). Placing sets of 4-nt loops separated by 8 bp across a consensus binding region (a 37-mer duplex representing the H' site of the phage λ genome) indicated that an increase in affinity requires that loops do not disrupt the consensus sequence. Optimal binding is generated by an off-center placement with one of two 4-nt loops at the edge of the upstream consensus block (Kd = 0.25 nM compared to 3.7 nM for the perfect duplex). Re-evaluating the optimal separation between loops in the context of the consensus sequence confirmed the 8-9 bp optimal spacing (9). The preferred separation between loops is similar for TF1 and IHF, indicating that the two proteins indeed engage their DNA targets in similar fashions. For IHF, the contribution of direct base contacts is evidenced by the distinct preference for loop placement with respect to consensus sequence elements.
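To put the two IHF dissociation constants quoted above on a common scale, the short Python sketch below converts them into a fold change in affinity and an approximate difference in binding free energy using the standard relation ΔΔG = RT ln(Kd,loop / Kd,duplex). The temperature of 298.15 K is an assumption made only for illustration, since the binding temperature is not given in this section, and the function names are hypothetical.

```python
import math

R_GAS = 8.314          # gas constant, J mol^-1 K^-1
TEMPERATURE = 298.15   # K; assumed for illustration, not stated in the text


def fold_change(kd_reference_nM, kd_variant_nM):
    """How many times tighter the variant binds than the reference."""
    return kd_reference_nM / kd_variant_nM


def delta_delta_g_kj(kd_reference_nM, kd_variant_nM, temperature=TEMPERATURE):
    """Change in binding free energy in kJ/mol; negative means tighter binding."""
    return R_GAS * temperature * math.log(kd_variant_nM / kd_reference_nM) / 1000.0


# Values quoted in the text: perfect duplex 3.7 nM, optimal loop placement 0.25 nM.
print(round(fold_change(3.7, 0.25), 1))
print(round(delta_delta_g_kj(3.7, 0.25), 2))
```

Under these assumptions the optimally placed loops correspond to a roughly 15-fold gain in affinity, or a little under 7 kJ/mol of additional binding free energy.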
Anne Grove and E. Peter Geiduschek
IV. DNA-Bending Proteins and Hydroxymethyluracil-Containing DNA
The decreased affinity of TF1 for T-DNA is correlated with reduced bending, suggesting that the substitution of hmU for T might affect deformability. Binding to hmU-containing loop constructs was therefore compared to results obtained with T-containing DNA. Most loop placements diminish the affinity of TF1 for hmU-DNA. For DNA with optimal placement of 4-nt loops (9 bp separation), the affinity is identical to that of perfect hmU-duplex (~3 nM). Remarkably, the discrimination between hmU and T essentially disappears with the optimal loop separation. Since site-specific flexure qualitatively and quantitatively substitutes for hmU preference, we propose that hmU content and loops offer the same or similar contributions to complex formation (10). A similar analysis was extended to three other DNA-bending proteins: IHF, HU and HMG1. The affinity of IHF for one of its preferred sites is increased ~6.5-fold by substituting hmU for T. Both HU and HMG1, which bind DNA non-specifically, have increased affinity for hmU-DNA relative to T-DNA of otherwise identical sequence (9).
There is relatively little information about the effect of substituting T with hmU on DNA bending. HmU-DNA melts ~10°C lower than does T-containing DNA of otherwise identical composition, but has been thought to have a normal B-type structure (33,34). A measurement of the torsional rigidity of hmU-containing DNA by time-resolved fluorescence polarization anisotropy of intercalated ethidium failed to show differences from similar measurements on T-DNA, indicating that hmU-DNA does not possess freely flexible joints on a length scale of ~10^ bp (35). However, the relationship between wedge models of localized DNA bending and the hydrodynamic models of long-range cooperative motions, which form the basis for interpreting fluorescence polarization experiments, has not been worked out.
The structure and dynamical properties of hmU-DNA are being re-examined by the group of D. R. Kearns (36). It is a striking finding of our experiments that a substitution for the hmU preference of TF1 can be made by suitably placing flexible loops in T-DNA. To our thinking, the implication is that hmU selectivity is at least partly due to differences in the energetics of DNA deformation between T- and hmU-DNA. We surmise that these differences are sequence-specific.
Acknowledgments
We greatly appreciate the contributions of our collaborators L. Mayol and A. Galeone and the continuing interest of and discussions with V. L. Hsu and D. R. Kearns. This research was supported by a grant from the NIGMS.
References
1. Dickerson, R. E., Goodsell, D. & Kopka, M. L. (1996). MPD and DNA bending in crystals and in solution. J. Mol. Biol. 256, 108-125.
2. Wolffe, A. P. & Drew, H. R. (1996). DNA structure: implications for chromatin structure and function. In: Frontiers in Molecular Biology. IRL Press. In press.
3. Travers, A. A. (1995). Reading the minor groove. Nature Struct. Biol. 2, 615-618.
4. Satchwell, S. C., Drew, H. R. & Travers, A. A. (1986). Sequence periodicities in chicken nucleosome core DNA. J. Mol. Biol. 191, 659-675.
5. Travers, A. A. & Klug, A. (1990). Bending of DNA in nucleoprotein complexes. In: DNA Topology and its Biological Implication (Cozzarelli, N. R. & Wang, J. C., eds.), pp. 57-106. Cold Spring Harbor Laboratory Press, NY.
6. Brukner, I., Sanchez, R., Suck, D. & Pongor, S. (1995). Sequence-dependent bending propensity of DNA as revealed by DNase I: parameters for trinucleotides. EMBO J. 14, 1812-1818.
7. Quintana, J. R., Grzeskowiak, K., Yanagi, K. & Dickerson, R. E. (1992). Structure of a B-DNA decamer with a central T-A step: C-G-A-T-T-A-A-T-C-G. J. Mol. Biol. 225, 379-395.
8. Gartenberg, M. R. & Crothers, D. M. (1988). DNA sequence determinants of CAP-induced bending and protein binding affinity. Nature 333, 824-829.
9. Grove, A., Galeone, A., Mayol, L. & Geiduschek, E. P. (1996a). Localized DNA flexibility contributes to target site selection by DNA-bending proteins. J. Mol. Biol. 206, 120-125.
10. Grove, A., Galeone, A., Mayol, L. & Geiduschek, E. P. (1996b). On the connection between inherent DNA flexure and preferred binding of hydroxymethyluracil-containing DNA by the type II DNA-binding protein TF1. J. Mol. Biol. 206, 196-206.
11. Kellenberger, E. (1996). Structure and function at the subcellular level. In: Escherichia coli and Salmonella: cellular and molecular biology. (Neidhardt, F. C., ed. in chief), pp. 17-28. ASM Press, Washington, DC.
12. Pettijohn, D. E. (1996). The nucleoid. In: Escherichia coli and Salmonella: cellular and molecular biology. (Neidhardt, F. C., ed. in chief), pp. 158-166. ASM Press, Washington, DC.
13. Finkel, S. E. & Johnson, R. C. (1992). The Fis protein: it's not just for DNA inversion anymore (Published erratum: Mol. Microbiol. 1993, 7, 1023). Mol. Microbiol. 6, 3257-3265.
14. Pontiggia, A., Negri, A., Beltrame, M. & Bianchi, M. E. (1993). Protein HU binds specifically to kinked DNA. Mol. Microbiol. 7(3), 343-350.
15. Bonnefoy, E., Takahashi, M. & Rouviere-Yaniv, J. (1994). DNA-binding parameters of the HU protein of Escherichia coli to cruciform DNA. J. Mol. Biol. 242, 116-129.
16. Castaing, B., Zelwer, C., Laval, J. & Boiteux, S. (1995). HU protein of Escherichia coli binds specifically to DNA that contains single-strand breaks or gaps. J. Biol. Chem. 270, 10291-10296.
17. Bianchi, M. E., Beltrame, M. & Paonessa, G. (1989). Specific recognition of cruciform DNA by nuclear protein HMG1. Science 243, 1056-1059.
18. Pil, P. M. & Lippard, S. J. (1992). Specific binding of chromosomal protein HMG1 to DNA damaged by the anticancer drug cisplatin. Science 256, 234-237.
19. Lilley, D. M. J. (1992). HMG has DNA wrapped up. Nature 357, 282-283.
20. Tanaka, I., Appelt, K., Dijk, J., White, S. W. & Wilson, K. S. (1984). 3-Å resolution structure of a protein with histone-like properties in prokaryotes. Nature 310, 376-381.
21. White, S. W., Appelt, K., Wilson, K. S. & Tanaka, I. (1989). A protein structural motif that bends DNA. Proteins: Struct. Funct. Genet. 5, 281-288.
22. Vis, H., Boelens, R., Mariani, M., Stroop, R., Vorgias, C. E., Wilson, K. S. & Kaptein, R. (1994). 1H, 13C, and 15N resonance assignments and secondary structure analysis of the HU protein from Bacillus stearothermophilus using two- and three-dimensional double- and triple-resonance heteronuclear magnetic resonance spectroscopy. Biochemistry 33, 14858-14870.
23. Vis, H., Mariani, M., Vorgias, C. E., Wilson, K. S., Kaptein, R. & Boelens, R. (1995). Solution structure of the HU protein from Bacillus stearothermophilus. J. Mol. Biol. 254, 692-703.
24. Craig, N. L. & Nash, H. A. (1984). E. coli integration host factor binds to specific sites in DNA. Cell 39, 707-716.
25. Yang, C.-C. & Nash, H. A. (1989). The interaction of E. coli IHF protein with its specific binding sites. Cell 57, 869-880.
26. Nash, H. A. (1996). The E. coli HU and IHF proteins: accessory factors for complex protein-DNA assemblies. In: Regulation of gene expression in Escherichia coli. (Lin, E. C. C. & Lynch, A. S., eds.), pp. 149-179. R. G. Landes Company.
27. Jia, X., Reisman, J. M., Hsu, V. L., Geiduschek, E. P., Parello, J. & Kearns, D. R. (1994). Proton and nitrogen NMR sequence-specific assignments and secondary structure determination of the Bacillus subtilis SPO1-encoded transcription factor 1. Biochemistry 33, 8842-8852.
28. Jia, X., Grove, A., Ivancic, M., Hsu, V. L., Geiduschek, E. P. & Kearns, D. R. (1996). Structure of the Bacillus subtilis phage SPO1-encoded type II DNA-binding protein TF1 in solution. J. Mol. Biol. In press.
29. Schneider, G. J., Sayre, M. H. & Geiduschek, E. P. (1991). DNA-bending properties of TF1. J. Mol. Biol. 221, 777-794.
30. Grosschedl, R. (1995). Higher-order nucleoprotein complexes in transcription: analogies with site-specific recombination. Curr. Biol. 7, 362-370.
31. Segall, A. M., Goodman, S. D. & Nash, H. (1994). Architectural elements in nucleoprotein complexes: interchangeability of specific and non-specific DNA binding proteins. EMBO J. 13, 4536-4548.
32. Kahn, J. D., Yun, E. & Crothers, D. M. (1994). Detection of localized DNA flexibility. Nature 368, 163-166.
33. Kallen, R. G., Simon, M. & Marmur, J. (1962). The occurrence of a new pyrimidine base replacing thymine in a bacteriophage DNA: 5-hydroxymethyl uracil. J. Mol. Biol. 5, 248-250.
34. Mellac, S., Fazakerley, G. V. & Sowers, L. C. (1993). Structure of base pairs with 5-(hydroxymethyl)-2'-deoxyuridine in DNA determined by NMR spectroscopy. Biochemistry 32, 7779-7786.
35. Hard, T. & Kearns, D. R. (1990). Reduced DNA flexibility in complexes with a type II DNA binding protein. Biochemistry 29, 959-965.
36. Pasternack, L. B., Bramham, J., Mayol, L., Galeone, A., Jia, X. & Kearns, D. R. (1996). 1H NMR studies of the 5-(hydroxymethyl)-2'-deoxyuridine containing TF1 binding site. Nucleic Acids Res. 24, 2740-2745.
Assembly of the multifunctional EcoKI DNA restriction enzyme in vitro
David T. F. Dryden*, Laurie P. Cooper and Noreen E. Murray
Institute of Cell and Molecular Biology, The University of Edinburgh, The King's Buildings, Edinburgh, EH9 3JR, United Kingdom
I. Introduction
Type I DNA restriction/modification systems have been found in many strains of Escherichia coli and Salmonella enterica (Bickle & Kruger, 1993; King & Murray, 1994; Barcus et al., 1995) and several other gram negative and positive bacteria (Dybvig & Yu, 1994; Fleischmann et al., 1995; Stein et al., 1995; Valinluck et al., 1995; Xu et al., 1995). They maintain the modification of the host chromosome after DNA replication by methylating adenine bases on the newly synthesised DNA strand within specific DNA target sequences. This methylation reaction is triggered by the recognition of targets which are methylated on the parental DNA strand. If methylation is not detected on either strand then the restriction reaction is triggered. Unmodified target sequences will exist on foreign DNA, usually of viral origin. A type I system cleaves the foreign DNA, thereby preventing (restricting) its replication and propagation. In contrast to the widely used type II restriction/modification systems, which have separate restriction endonucleases and modification methyltransferases (mtases), the type I systems combine both activities in one large oligomeric enzyme. The archetypal type I system is that of E. coli K12, EcoKI. This enzyme comprises three different subunits: the specificity (S) subunit, which recognises the DNA sequence 5'-AAC(N)6GTGC-3', the modification mtase (M) subunit, and the restriction endonuclease (R) subunit. Two M subunits bind to one S subunit to form an active modification mtase which has a strong preference for methylating a target sequence which already contains one methylated adenine base (Figure 1). The binding of two R subunits to the mtase gives rise to a nuclease activity which is triggered only when the target sequence is not methylated at either position. The molecular weights of the S, M, and R subunits are 51 kDa, 59 kDa and 134 kDa respectively; thus the complete EcoKI enzyme has a molecular weight of 437 kDa. It has proved difficult to produce more than 1 or 2 milligrams of the complete nuclease for biophysical analysis by in vivo expression of the EcoKI genes; however, it has been possible to produce large amounts of the mtase and its subunits (Dryden et al., 1993) and milligram quantities of the R subunit. Therefore, we have examined the possibility of assembling the complete EcoKI enzyme in vitro using intramolecular crosslinking combined with denaturing gel electrophoresis to detect subunit-subunit contacts, and the method of continuous variation titration to confirm subunit stoichiometries (Job, 1928; Agmus, 1961).
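The molecular weight quoted for the complete enzyme follows directly from the subunit masses and the stated stoichiometry. A minimal Python sketch of that arithmetic is given below; the dictionary layout and function name are illustrative choices, and only the three subunit masses come from the text.

```python
# Subunit molecular weights in kDa, as quoted in the Introduction.
SUBUNIT_KDA = {"S": 51, "M": 59, "R": 134}


def complex_mass_kda(stoichiometry):
    """Sum subunit masses for a complex given as {subunit: copy number}."""
    return sum(SUBUNIT_KDA[name] * copies for name, copies in stoichiometry.items())


mtase = complex_mass_kda({"M": 2, "S": 1})              # M2S1 methyltransferase
nuclease = complex_mass_kda({"R": 2, "M": 2, "S": 1})   # R2M2S1 complete enzyme
print(mtase, nuclease)
```

The same helper can be reused for any other candidate stoichiometry when comparing expected masses with values estimated from gels.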
Unmodified target
5'-AAC(N)6GTGC-3'
3'-TTG(N)6CACG-5'

Fully modified target (CH3 on both strands)
5'-AAC(N)6GTGC-3'
3'-TTG(N)6CACG-5'

Hemimethylated targets
5'-AAC(N)6GTGC-3' (CH3 on the top strand only)
3'-TTG(N)6CACG-5'

5'-AAC(N)6GTGC-3'
3'-TTG(N)6CACG-5' (CH3 on the bottom strand only)
Figure 1. The DNA target sequence for EcoKI in its different methylated forms. The unmodified form elicits the restriction reaction, the two hemimethylated forms elicit the modification reaction and the fully methylated form causes no reaction. EcoKI binds with the same affinity to all of these forms (Powell et al., 1993).
II. Materials and Methods
The M2S1 mtase and the partially assembled, inactive M1S1 form were purified as described (Dryden et al., 1993). The M subunit was purified using DEAE ion exchange and gel filtration chromatography from cells containing an overexpression plasmid, pJFM, derived from the mtase overexpression plasmid, pJFMS. This was made by excising the SmaI-HindIII fragment containing the M gene from pJFMS and ligating it into plasmid vector pJF118EH. The R subunit was purified from cells containing the multicopy plasmid pJK2 (Kelleher et al., 1991) using DEAE ion exchange, heparin affinity and gel filtration column chromatography. The proteins were all at least 95% pure as judged by SDS-PAGE with Coomassie blue or silver staining. Protein concentration was measured by absorption at 280 nm using extinction coefficients calculated from the tyrosine and tryptophan content of the subunits (Sober, 1970).
The buffer used in all crosslinking experiments was 20 mM tris, 20 mM MES, 10 mM MgCl2, 7 mM β-mercaptoethanol, 0.1 mM EDTA, pH 8, and experiments were all performed at room temperature. Glutaraldehyde crosslinking was performed by adding a 25% stock solution of glutaraldehyde to the protein solution to obtain a final concentration of 1% glutaraldehyde and approximately 0.2 mg/ml of protein in a volume of 100 µl. The reaction was terminated by the addition of 2.5 µl of 2 M NaBH4 freshly prepared in 0.1 M NaOH. After 20 minutes, the samples were mixed with an equal volume of SDS-PAGE loading buffer, boiled for 5 minutes and applied to the gel.
The crosslinked samples were subjected to electrophoresis on 10% acrylamide gels with a stacking gel or on 0.8% agarose, 3.5% acrylamide gels without a stacking gel. The agarose significantly improved the strength of the acrylamide gels without affecting their resolution. These gels were made by dissolving the agarose in hot gel electrophoresis buffer, followed by the addition of the acrylamide solution, TEMED and ammonium persulphate (Sambrook et al., 1989). The mixture was rapidly poured between pre-warmed glass plates. Gel casting and subsequent electrophoresis used either the standard tris-glycine buffer (Sambrook et al., 1989) or a 20 mM sodium phosphate, 2% SDS, pH 7 buffer (Sigma technical note MWS-877X). Molecular weight markers of up to 205 kDa (Sigma) were used with the tris-glycine buffer system, but it was possible to use crosslinked phosphorylase b and crosslinked bovine serum albumin markers (Sigma)
with molecular weights up to 600 kDa with the phosphate buffer system (Sigma technical note MWS-877X). The molecular weights of the crosslinked complexes were estimated by comparison to a calibration plot of log(molecular weight) versus migration for the molecular weight markers. This calibration for the markers used with the phosphate buffer system and the agarose/acrylamide gels was completely linear over the extraordinarily large range of 27 kDa to at least 700 kDa.
To obtain a measure of the number of R subunits which would bind in vitro to the mtase we used the continuous variation titration method (Job, 1928; Agmus, 1961). In this method, the two components are mixed such that the sum of their concentrations remains constant but the mole fraction, x, of each component is varied. If one chooses a total concentration, C, substantially greater than the dissociation constant (Kd) for the binding of the components under investigation, then a plot of the amount of complex formed as a function of mole fraction will give the stoichiometry of the complex and an estimate of the Kd. The method can be applied to any associating system at equilibrium using any appropriate technique to measure the amount of complex formed. From the usual equation describing the binding of n molecules of B to one of A to form a complex AB_n one can write

$$K_d = \frac{[A]\,[B]^n}{[AB_n]}, \qquad [A] = C(1-x) - [AB_n], \qquad [B] = Cx - n[AB_n],$$

where x is the mole fraction of B. Therefore,

$$\{C(1-x) - [AB_n]\}\,\{Cx - n[AB_n]\}^n = K_d\,[AB_n].$$

The solution of this equation is simple for n = 1 but is more complicated for other values of n. However, when one plots the amount of complex formed versus mole fraction, one can immediately estimate the ratio of A to B from the value of the mole fraction which gives the maximum amount of complex AB_n. It can be calculated that $x_{max} = n/(n+1)$, so that the maxima for n = 1, 2 and 3 are at x = 0.5, 0.667 and 0.75 respectively.
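The continuous variation relation above has no simple closed form for n > 1, but it is easy to solve numerically. The Python sketch below is one way to do so, under stated assumptions: it solves the equilibrium equation for [AB_n] by bisection at each mole fraction and reports where the amount of complex peaks. The total concentration and Kd values are arbitrary illustrative numbers, not values from the experiments, and the function names are hypothetical.

```python
def complex_conc(x, total, kd, n, iterations=200):
    """Solve {C(1-x) - y}{Cx - n*y}^n = Kd*y for y = [ABn] by bisection.

    x is the mole fraction of B, total is C, and kd has units of
    concentration**n. The left side minus the right side falls
    monotonically from >= 0 at y = 0 to <= 0 at the largest physical y.
    """
    low, high = 0.0, min(total * (1.0 - x), total * x / n)

    def residual(y):
        return (total * (1.0 - x) - y) * (total * x - n * y) ** n - kd * y

    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if residual(mid) > 0.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


def job_maximum(total, kd, n, steps=1000):
    """Mole fraction of B at which the computed amount of ABn is largest."""
    xs = [i / steps for i in range(steps + 1)]
    ys = [complex_conc(x, total, kd, n) for x in xs]
    best = max(range(len(xs)), key=ys.__getitem__)
    return xs[best]


# Illustrative values only: C = 1 uM and Kd = (10 nM)**n, so that C >> Kd.
for n in (1, 2, 3):
    print(n, job_maximum(1e-6, (1e-8) ** n, n), round(n / (n + 1), 3))
```

Plotting `complex_conc` against x for n = 1, 2 and 3 gives theoretical curves of the kind drawn in Job plots; the position of the maximum depends only on n, while the sharpness of the peak depends on how far C exceeds Kd.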
[Figure 2 plots: % transmission (%T) versus NaCl concentration (moles/litre).]
Figure 2. Elution profile (A) of the EcoKI mtase from a heparin agarose chromatography column showing the presence of two peaks, one major and one minor, corresponding to the M2S1 and M1S1 forms. Elution profile (B), obtained by reapplying the smaller of the two peaks from elution run A to the heparin agarose column, shows the re-equilibration of the mtase into M2S1 and M1S1 forms. % transmission at 280 nm was monitored (Dryden et al., 1993).
[Figure 3 gel image: lanes 1-8; molecular weight markers at 205, 116 and 97.4 kDa.]
Figure 3. SDS-PAGE, on a 0.8% agarose, 3.5% acrylamide gel run in tris-glycine buffer, of samples after crosslinking with glutaraldehyde. Bands were stained with silver. Lane 1, M2S1 mtase; lane 2, M2S1 mtase + R subunit; lane 3, M1S1 + M subunit; lane 4, M1S1 visible at very bottom of gel; lane 5, M1S1 + R subunit; lane 6, M subunit + R subunit; lane 7, M subunit dimer with the predominant M subunit monomer having migrated off the base of the gel; lane 8, R subunit in a monomeric form.
III. Results and Discussion
It has been found that the M2S1 mtase can dissociate during ion exchange and heparin affinity chromatography (Dryden et al., 1993) to give a mixture of M2S1, M1S1 and M subunit (Figure 2). The dissociation has been confirmed by determination of the mtase molecular weight as a function of protein concentration by both gel filtration and sedimentation equilibrium measurements (results not shown). The Kd for this process is approximately 15 nM.
Figure 3 shows the effect of glutaraldehyde on our preparations of M2S1, M1S1, M and R. The most intense band in each case is that of the lowest molecular weight and corresponds to the normal multimeric state of each protein, i.e. a trimer, dimer, monomer and monomer respectively. The less intense bands of higher molecular weight are due to intermolecular crosslinking between different protein molecules rather than intramolecular crosslinking. The amount of intermolecular crosslinking can be minimised by reducing the amount of crosslinker; however, this will also lead to the presence of some free subunits which have not undergone intramolecular crosslinking (Klotz et al., 1975).
Lane 3 shows that the mtase can be reconstituted in vitro by mixing M1S1 with the M subunit. The crosslinked mixture shows a band not present in either of the individually crosslinked samples, lanes 4 and 7, that migrates at the same position as the crosslinked mtase, lane 1. The apparent molecular weight of this band is 150 kDa, slightly less than the 170 kDa expected for the mtase trimer. This slightly lower molecular weight can be attributed to the crosslinks preventing complete unfolding of the protein and resulting in faster migration of the more compact structure through the gel.
Lanes 2, 5 and 6 show that the R subunit can be crosslinked to M2S1, M1S1 and the M subunit, giving rise to complexes of very high molecular weight. Electrophoresis of these complexes on agarose/acrylamide gels with the phosphate buffer system allows their weights to be estimated at 400-450 kDa. Further analysis of these complexes using gel filtration chromatography suggests that these complexes are of the form R2M2S1, R2M2S2 and R2M2 respectively (data not shown). Only the complex between R and M2S1 shows full nuclease activity.
The estimation of subunit stoichiometry of such large complexes is a rather uncertain process, so we used the continuous variation titration method to examine the binding of R to M2S1 in more detail. Figure 4 shows a typical result of the titration of M2S1 with R after crosslinking of the samples and electrophoresis through the agarose/acrylamide gel. The amount of crosslinked mtase decreases with increasing mole fraction of R and a high molecular weight band of the complete nuclease appears. The amount of nuclease reaches a maximum at a mole fraction of R = 0.7 and then disappears at higher mole fractions, when free R subunit becomes visible. Densitometry of this gel and several others allowed Figure 5, showing the amount of nuclease formed as a function of mole fraction of R, to be plotted. This graph clearly shows that more than one molecule of the R subunit binds to each molecule of the mtase, and the most likely stoichiometry is that predicted from the molecular weight determination, i.e. R2M2S1. This stoichiometry agrees with that observed for EcoKI nuclease purified from cells expressing all three genes. The nuclease assembled in vitro has the same enzymatic activities as the nuclease isolated from in vivo sources (data not shown).
[Figure 4 gel image: lanes at mole fractions of R subunit from 0 to 1.0; bands labeled (M2S1)2, R2M2S1 and M2S1; molecular weight markers at 205 kDa and 116 kDa.]
Figure 4. SDS-PAGE, using the same gel system as in Figure 3, of samples from the continuous variation titration of the M2S1 mtase with the R subunit after crosslinking with glutaraldehyde. The maximum amount of high molecular weight complex corresponding to the EcoKI nuclease is visible between 0.6 and 0.8 mole fraction of R subunit.
[Figure 5 plot: amount of nuclease complex (arbitrary units) versus mole fraction of R subunit.]
Figure 5. A plot of the amount of nuclease complex formed in the continuous variation titration experiments versus mole fraction of the R subunit as determined by densitometry of silver stained gels such as that shown in Figure 4. The lines drawn are the theoretical curves expected for the association of 1 (...), 2 (-), or 3 (—) R subunits per molecule of M2S1 mtase. The error bars are +/- one standard deviation.
IV. Conclusions
Our results show the effectiveness of intramolecular crosslinking coupled with SDS-PAGE in analysing a complex assembly process. The continuous variation titration method of Job (Job, 1928; Agmus, 1961) is particularly useful for determining subunit stoichiometries in situations where the high molecular weight of the complexes potentially permits many different subunit stoichiometries.
The ability to assemble the EcoKI nuclease in vitro is a great advantage in mutagenesis studies, since one can very easily assemble different combinations of M1S1, M, M2S1 and R containing single amino acid changes and possessing altered activities. This is particularly valuable if one wishes to make a nuclease proficient in restriction but deficient in modification, which would be lethal if expressed in the cell.
Acknowledgements
We would like to thank Peter Thorpe for the construction of the pJFMS and pJFM plasmids. This work was supported by grants from the Medical Research Council and The Royal Society. David Dryden thanks the Royal Society for a University Research Fellowship.
References
Agmus, E. (1961) Z. Analyt. Chem. 183, 321-333.
Barcus, V. A., Titheradge, A. J. B., & Murray, N. E. (1995) Genetics 140, 1187-1197.
Bickle, T. A., & Kruger, D. H. (1993) Microbiol. Rev. 57, 434-450.
Dryden, D. T. F., Cooper, L. P., & Murray, N. E. (1993) J. Biol. Chem. 268, 13228-13236.
Dybvig, K., & Yu, H. (1994) Molec. Microbiol. 12, 547-560.
Fleischmann, R. D., Adams, M. D., White, O., Clayton, R. A., Kirkness, E. P., Kerlavage, A. R., Bult, C. J., Tomb, J. F., Dougherty, B. A., Merrick, J. M., McKenney, K., Sutton, G., FitzHugh, W., Fields, C., Gocayne, J. D., Scott, J., Shirley, R., Liu, L-L, Glodek, A., Kelley, J. M., Weidman, J. F., Phillips, C. A., Spriggs, T., Hedblom, E., Cotton, M. D., Utterback, T. R., Hanna, M. C., Nguyen, D. T., Saudek, D. M., Brandon, R. C., Fine, L. D., Fritchman, J. L., Fuhrmann, J. L., Geoghagen, N. S. M., Gnehm, C. L., McDonald, L. A., Small, K. V., Fraser, C. F., Smith, H. O., & Venter, J. C. (1995) Science 269, 496-512.
Job, P. (1928) Annls. Chim. (Ser. 10) 9, 113-134.
Kelleher, J. E., Daniel, A. S., & Murray, N. E. (1991) J. Mol. Biol. 221, 431-440.
King, G., & Murray, N. E. (1994) Trends in Microbiol. 2, 465-469.
Klotz, I. M., Darnall, D. W., & Langerman, N. R. (1975) in The Proteins, 3rd ed. (Neurath, H., Hill, R. L., & Boeder, C-L., eds) pp 293-411, Academic Press, New York.
Powell, L. M., Dryden, D. T. F., Willcock, D. F., Pain, R. H., & Murray, N. E. (1993) J. Mol. Biol. 234, 60-71.
Sambrook, J., Fritsch, E. F., & Maniatis, T. (1989) Molecular Cloning: A laboratory manual. Cold Spring Harbor Press, NY.
Sober, H. A. (1970) Handbook of Biochemistry, 2nd ed. pp B75-B76, CRC Press, Boca Raton, FL.
Stein, D. C., Gunn, J. S., Radlinska, M., & Piekarowicz, A. (1995) Gene 157, 19-22.
Valinluck, B., Lee, N. S., & Ryu, J. (1995) Gene 167, 59-62.
Xu, G., Willert, J., Kapfer, W., & Trautner, T. A. (1995) Gene 157, 59.
SECTION VIII Three Dimensional Structure
Strategies for NMR Assignment and Global Fold Determinations Using Perdeuterated Proteins
Ronald A. Venters, Hai M. Vu, Robert M. de Lorimier, and Leonard D. Spicer
Departments of Biochemistry and Radiology and the Duke University NMR Center, Duke University, Durham, NC 27710
I. Introduction
NMR is proving to be a very useful tool in structural studies of small- to medium-sized proteins in solution. For larger proteins, however, magnetic relaxation becomes a limiting factor. Here we show the benefits of using uniform high-level (> 96%) deuteration to inhibit relaxation processes. This facilitates assignment of larger proteins for structural studies and enables, via edited NOESY experiments, the determination of medium- to long-range distance constraints important in establishing the tertiary organization or global fold of proteins. The proteins studied are human carbonic anhydrase II (HCA II), a 29 kDa metalloenzyme recently assigned in our lab (1), and a 12 kDa core packing mutant of thioredoxin (L78K-TRX) for which we have characterized motional dynamics (2).
NMR pulse sequences utilized for protein ¹³C, ¹⁵N, and ¹H assignment (3,4) rapidly lose sensitivity as the size of the protein under study increases above 25 kDa, due mainly to fast ¹³C transverse relaxation via the strong dipolar coupling between a ¹³C nucleus and its directly bonded protons (5,6,7). Since the gyromagnetic ratio of ²H is 6.5 times smaller than that of ¹H, perdeuteration dramatically reduces this relaxation. We have successfully ¹³C-, ¹⁵N- and ²H-labeled the protein HCA II (8) and have demonstrated significant advantages in signal-to-noise ratios for heteronuclear NMR experiments compared to a fully protonated ¹³C/¹⁵N protein (1,9). Using this protein we have also developed a general strategy for the complete mainchain, as well as carbon and NHx sidechain, assignments of perdeuterated proteins (1,9,10). In addition, for both HCA II and L78K-TRX we have obtained 3D and 4D ¹⁵N/¹⁵N-separated NOESY data which show anticipated long-range interactions from which distance constraints can be derived. These are currently being evaluated in establishing the global folding patterns for these proteins (11), and here we show initial results for L78K-TRX that confirm the importance and utility of these data in establishing tertiary organization. The rapid determination of protein global folds can enhance the comparison of mutant proteins with their wild-type counterparts and can significantly speed up efforts in drug discovery. In addition, the global fold may subsequently be utilized in more detailed structural studies by helping to resolve ambiguities in 4D ¹³C/¹³C-separated and ¹³C/¹⁵N-separated NOESY data.
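The Introduction's statement about gyromagnetic ratios can be turned into a rough estimate of how much deuteration slows dipolar relaxation of an attached ¹³C. The Python sketch below assumes the standard scaling of heteronuclear dipolar relaxation with γ²S(S+1) for the attached spin, with all other factors (bond length, correlation time) held equal; that simplification, the constant names, and the function name are assumptions for illustration and are not taken from the chapter.

```python
# Gyromagnetic ratios in rad s^-1 T^-1 (standard physical constants).
GAMMA_1H = 267.522e6
GAMMA_2H = 41.066e6

SPIN_1H = 0.5   # spin quantum number of the proton
SPIN_2H = 1.0   # spin quantum number of the deuteron


def dipolar_scaling(gamma, spin):
    """Factor gamma^2 * S(S+1) that scales dipolar relaxation by an attached spin."""
    return gamma ** 2 * spin * (spin + 1.0)


ratio = GAMMA_1H / GAMMA_2H
reduction = dipolar_scaling(GAMMA_1H, SPIN_1H) / dipolar_scaling(GAMMA_2H, SPIN_2H)
print(round(ratio, 2), round(reduction, 1))
```

Under these assumptions the ratio of gyromagnetic ratios comes out close to the factor of 6.5 quoted in the text, and the dipolar contribution from a directly bonded hydrogen drops by roughly an order of magnitude on replacing ¹H with ²H.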
II. Experimental Conditions
High-level expression of HCA II in E. coli (12) has been achieved by the construction of vectors (pACA) which contain the protein gene subcloned behind a phage T7 RNA polymerase promoter vector (13). Transcription was initiated by the addition of isopropyl-β-D-thiogalactopyranoside (IPTG), inducing a chromosomal copy of T7 RNA polymerase (behind a lac UV promoter) in the cell line BL21(DE3) (14). HCA II was purified using sulfonamide affinity chromatography with slight modifications to the published procedures (15). HCA II activity was measured by assaying enzyme-catalyzed hydrolysis of p-nitrophenyl acetate at 348 nm (16).
[Figure 1 flow chart. Strains: pHH121/XL1BLU (L78K-TRX) and pACA/BL21 (HCA II). Steps: rich medium plate (LB/Amp, 37°C, pH 7); rich medium (H2O); rich medium (100% D2O), A600 ~0.8, 37°C, pH 7; then either minimal/acetate medium (99% D2O) with induction at OD ~0.4 and growth at 34°C for 16 hours, or minimal/glucose medium (99% D2O, 37°C, pH 7) with induction at OD ~0.8 and growth at 37°C for 8 hours; harvest. Carbon sources: protonated glucose for L78K-TRX and protonated sodium (1,2-13C2) acetate for HCA II. Nitrogen source: NH4Cl.]
Figure 1. Growth of E. coli in D2O for biosynthetic labeling of HCA II and L78K-TRX.
The flow chart for biosynthetic labeling of HCA II and L78K-TRX is shown in Figure 1 above. Uniform ~H and ^^N labeled HCA II was obtained by growing BL21(DE3)pACA E. coli in defined media containing essentially 100% D2O, 3 g/L sodium acetate as the sole carbon source, and 1 g/L [^^N, 99%] ammonium chloride as the sole nitrogen source (8). In addition, the defined media contained M9 salts (17), 2 mM MgS04, 1 jiM FeCls, 10 mL/L vitamin mixture (containing 10 mg/100 mL each of biotin, choline chloride, folic acid, n i a c i n a m i d e , Dpantothenate, and pyridoxal and 1 mg/100 mL riboflavin), 5 mg/L thiamine, 100 jiM CaCL, 50 |iM ZnS04, and 50 [Xg/mL ampicillin. Stock reagents were prepared in D2O and filter sterilized. To minimize ^H/^H exchange, the media were used immediately after preparation and were never autoclaved. In order to obtain maximum sensitivity in heteronuclear 3D experiments it is essential that all amide "H be exchanged with ^ I . To achieve this, deuterated HCA II was unfolded in the presence of H2O by incubation in 3 M guanidine-HCl at pH 7.5 and room temperature for 1 hour followed by a rapid 20-fold step dilution with 0.1 M tris sulfate at pH 7.5 and subsequent refolding for 2 hours (18). Furthermore, we have optimized HCA II growth conditions for maximum protein yields in the defined acetate media described above (8). Conditions optimized included IS^QQ at time of induction, induction time, growth temperature, antibiotic levels, and pH. Doubling times for cells in 98.8% D20/acetate media increased slightly compared to H20/acetate. Optimum protein yields were obtained using the conditions we found optimum for acetate growths in H2O, with two exceptions: maximum yield was achieved when the cells were induced at A600 ^^^ 0.3-0.5, and when induction times were increased from 8 hours to 16 hours. The total mass of protein produced per liter of medium decreased approximately 33% to 50 mg compared with the same fully protonated medium. 
For ¹⁵N/²H L78K-TRX a different strain of E. coli was utilized (pHH121/XL1BLU) and growth was carried out in a minimal glucose medium. Temperature and pH were as indicated in the flow chart and induction was initiated when A600 ~ 0.8. Induction was optimum at 8 hours compared with 4 hours in H2O. The total yield was 15 mg/L compared with 14 mg/L in protonated media.

We have also determined the upper limit of deuterium incorporation in HCA II. For this purpose milligram quantities of ²H-labeled protein were produced in defined media containing 98.8% D2O and [²H3, 98%] sodium acetate as the sole carbon source using the optimized procedures outlined above. To quantitate the level of deuterium incorporation, we analyzed the molecular mass of purified HCA II by mass spectrometry. The molecular mass of fully protonated HCA II was measured to be 29102 ± 2.4 (theoretical mass = 29098.9). At low pH the protein contains 2018 protons; therefore, one would predict a theoretical mass increase of 2030.5 mass units upon complete deuteration. The molecular mass of protein produced in 98.8% D2O and [²H3, 98%] sodium acetate was measured to be 31133 ± 13, an increase of 2034 ± 15 mass units, indicating above 96% deuterium incorporation.
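The mass-shift arithmetic above can be checked with a short calculation. The following is a minimal sketch, not the authors' procedure: it assumes standard atomic masses for ¹H and ²H (which give a shift slightly different from the 2030.5 quoted in the text), and the function names are hypothetical.

```python
# Standard atomic masses in unified atomic mass units (assumed values,
# not taken from the chapter).
MASS_1H = 1.007825
MASS_2H = 2.014102


def expected_mass_shift(n_protons: int) -> float:
    """Mass gained if every one of n_protons is replaced by deuterium."""
    return n_protons * (MASS_2H - MASS_1H)


def incorporation_range(observed_shift: float, uncertainty: float, n_protons: int):
    """Return (low, high) fractional deuteration implied by a measured mass shift."""
    full = expected_mass_shift(n_protons)
    low = (observed_shift - uncertainty) / full
    high = min(1.0, (observed_shift + uncertainty) / full)  # cap at 100%
    return low, high


# Numbers quoted in the text: 2018 protons, measured shift 2034 +/- 15.
full_shift = expected_mass_shift(2018)
low, high = incorporation_range(2034.0, 15.0, 2018)
print(f"expected shift for full deuteration: {full_shift:.1f}")
print(f"implied incorporation: {low:.1%} to {high:.1%}")
```

The lower bound obtained this way is consistent with the statement that incorporation is above 96%.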
Ronald A. Venters et al
[²H3, 98%] sodium acetate, [¹⁵N, 99%] ammonium chloride, and D2O were obtained from Cambridge Isotope Laboratories. NMR experiments were carried out on a 3-channel Varian Unity 600 spectrometer using a ¹H/¹³C/¹⁵N triple-resonance probe equipped with an actively shielded B^ gradient coil.
III. Results and Discussion

A. Backbone and Aliphatic Sidechain Resonance Assignments

Since there are no aliphatic protons present in perdeuterated proteins, new strategies must be employed and new pulse sequences developed for the NMR assignment and structure determination. The sequential mainchain assignment of perdeuterated proteins is achieved by collecting and analyzing 3D HNCACB, 3D HN(CO)CACB, and 4D HN(CACO)NH data (1). These sequences include ²H decoupling when ¹³C is transverse and work best if H2O flip-back pulses and pulsed field gradients are employed. Complete aliphatic deuteration increases both resolution and sensitivity in these experiments by eliminating partially deuterated CHnDm moieties, which have different ¹³C chemical shifts due to the ²H isotope shift.

Sidechain carbon assignments are obtained from a 3D C(CC)(CO)NH data set (9). This sequence is a modified version of the HC(CC)(CO)NH sequence in which magnetization originates on aliphatic ¹³C and not aliphatic ¹H. Theoretical calculations and experimental evidence indicate an approximate 3.5-fold increase in sensitivity for methine groups and an approximate 7-fold increase in sensitivity for methylene groups using the C(CC)(CO)NH experiment on perdeuterated HCA II. Sidechain NHx assignments are obtained using modified 2D versions of the ¹H-¹⁵N HSQC, HNCO, HNCACB, and HN(CO)CACB experiments (10) to provide through-bond correlations of these sidechain ¹HN/¹⁵N resonances to the previously assigned sidechain ¹³C resonances.

Subsequent to the assignment of the perdeuterated protein, inter-residue Cα/Cβ and Hα/Hβ chemical shifts can be obtained from the CBCA(CO)NH and HBHA(CO)NH experiments using a fully protonated ¹³C/¹⁵N-labeled protein sample. These data allow for the parameterization of the ²H isotope shifts on the Cα and Cβ carbons and allow for the reasonable estimation of the ²H isotope shifts at sidechain ¹³C resonances (1). Sidechain ¹H resonances can then be assigned from a 4D HCCH-TOCSY data set collected on the fully protonated protein sample. The ¹³Cα and ¹³Cβ chemical shift values obtained directly from the protonated sample and the corrected ¹³C chemical shift values of the additional sidechain carbons should facilitate the analysis of the TOCSY data.
B. Secondary Structure Determination

The relationship between NMR chemical shifts and the secondary structure of a protein has been well established (19,20,21). The Cα and carbonyl carbons experience an upfield shift in extended structures, such as a β-strand, and a downfield shift in helical structures. Both the Cβ and the Hα proton chemical shifts exhibit the opposite correlation. These shifts have proven to be sufficiently consistent to permit the prediction of secondary structural elements for a number of proteins (1,19,20). Knowledge of the secondary structure of a protein can be useful in identifying spin-diffusion effects during the analysis of 4D ¹⁵N/¹⁵N-separated NOESY data collected with long mixing times as described below. The secondary structure can also be used as a constraint in the calculation of protein global folds.
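The sign conventions described above can be turned into a simple per-residue classifier. The sketch below is illustrative only: it assumes the inputs are secondary shifts (observed minus random-coil values, in ppm), and both the combined score and the 1.0 ppm threshold are assumptions rather than values taken from the chapter.

```python
def classify_residue(delta_ca: float, delta_cb: float, threshold: float = 1.0) -> str:
    """Classify one residue from its Calpha and Cbeta secondary shifts (ppm).

    Helices shift Calpha downfield (positive) and Cbeta upfield (negative);
    extended strands do the opposite, so the difference amplifies both trends.
    """
    score = delta_ca - delta_cb
    if score > threshold:
        return "H"  # helix-like
    if score < -threshold:
        return "E"  # extended, strand-like
    return "C"      # no clear preference


# Hypothetical secondary shifts for three residues.
examples = [(2.5, -0.5), (-1.5, 2.0), (0.2, 0.1)]
labels = [classify_residue(ca, cb) for ca, cb in examples]
print(labels)
```

In practice, runs of consecutive residues with the same label are more informative than any single residue.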
C. Global Fold Determination

A global fold of a protein may be determined from the analysis of a 4D ¹⁵N/¹⁵N-separated NOESY spectrum collected on perdeuterated protein once the mainchain and sidechain ¹HN/¹⁵N resonances have been assigned (11). Detection of ¹HN-¹HN NOEs in a perdeuterated protein can provide longer distance constraints than in a fully protonated protein. This is due to greater control of alternate relaxation pathways and a reduction in the number of possible spin-diffusion routes which would otherwise compete with direct ¹HN-¹HN cross-relaxation at long mixing times. Results in perdeuterated HCA II and L78K-TRX suggest that NOEs are detected between amides separated by 7 Å or more in the crystal structure. For example, Figures 2 and 3 show planes from ¹⁵N/¹⁵N-separated NOESY spectra of HCA II and L78K-TRX, respectively. Labeled peaks correspond to amide-amide NOEs for Leu 118 (HCA II) and Val 55 (L78K-TRX), with inter-proton distances (from the crystal structures of wild type protein) given in Å, several of which are greater than 5 Å. This increased range leads not only to more total constraints, but also to highly informative constraints between different structural elements, which should allow more accurate prediction of the global folding pattern (11,22-25). For example, in the crystal structure of wild type thioredoxin there are 205 mainchain amide-to-amide distances less than 5 Å. Extending this range up to 7 Å gives an additional 134 constraints, many of which link different substructures. In many respects medium and long range constraints are particularly important in determining precise protein structures by NMR (22), and such constraints are crucial for determining an accurate global fold of a protein.

Figure 2. ¹H donor/¹⁵N donor planes from a 4D ¹⁵N/¹⁵N-separated NOESY spectrum on a 2.8 mM perdeuterated ¹⁵N-labeled HCA II sample.

Figure 3. ¹H acceptor plane from a 3D ¹⁵N/¹⁵N-separated NOESY spectrum on a 4.0 mM perdeuterated ¹⁵N-labeled L78K-TRX sample. [Horizontal axis: ¹H donor (ppm); labeled peaks include V55 (diagonal peak), A46, and T54.]

An example of the utility of using ¹⁵N/¹⁵N-separated NOE data from perdeuterated L78K-TRX to establish the tertiary organization of the protein is illustrated below. The 4D ¹⁵N/¹⁵N-separated NOESY data were collected using a mixing time of 400 ms and the resulting spectrum was referenced using previously determined resonance assignments (2). Data on spectral peaks were tabulated using the peak-picking and volume measurement routines of a modified version of the FELIX program (Hare Research), then assigned as NOEs between specific amides. The number of inter-residue mainchain NOEs was 381, of which 80 were sequential (i to i+1). The remaining 301 were i to i+2 or greater. Also assigned were approximately 60 sidechain to mainchain NOEs involving Trp indole and Asn/Gln primary amide groups. Some mainchain amides had no detectable NOEs, perhaps because their direct ¹⁵N-¹H correlations are weak, as observed in ¹H/¹⁵N-HSQC data.
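The comparison of constraint counts at 5 Å and 7 Å cutoffs amounts to counting amide proton pairs within distance shells of a reference structure. A minimal sketch follows; the coordinates are made-up placeholders, not thioredoxin data, and the function name is hypothetical.

```python
from itertools import combinations
from math import dist


def count_pairs_in_shell(coords, lower, upper):
    """Count unordered pairs whose separation d satisfies lower <= d < upper."""
    return sum(1 for a, b in combinations(coords, 2) if lower <= dist(a, b) < upper)


# Placeholder amide proton coordinates in Angstroms (illustrative only).
amide_protons = [
    (0.0, 0.0, 0.0),
    (3.0, 0.0, 0.0),
    (6.0, 0.0, 0.0),
    (0.0, 4.0, 0.0),
]

short_range = count_pairs_in_shell(amide_protons, 0.0, 5.0)
extra_to_7A = count_pairs_in_shell(amide_protons, 5.0, 7.0)
print(short_range, extra_to_7A)
```

Applied to real amide proton coordinates from a crystal structure, the two counts correspond to the "less than 5 Å" and "additional constraints up to 7 Å" figures discussed above.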
The relationship between NOE volume and ¹H-¹H distance was examined for all NOEs where both symmetry-related cross peaks were observed. A plot of the logarithm of the average volume of the two peaks, versus the inter-proton distance as measured from the crystal structure of wild type thioredoxin, is shown in Figure 4. The data suggest an approximately linear overall relationship, as expected from a dipolar interaction. Of the 280 NOEs in Figure 4, 153 (55%) correspond to distances less than 5 Å, while the remaining 127 (45%) NOEs occur between protons separated by 5 Å or more. It is also noted that cross peaks corresponding to distances as large as 9 Å are clearly observed. NOE volume appears to be less well correlated with distance at longer distances, perhaps because of a greater contribution of spin-diffusion to peak volume (25).

[Figure 4: NOE volume (logarithmic axis, 1.0E+07 to 1.0E+09) versus inter-proton distance.]
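For a rough sense of how NOE volume maps to distance, the isolated spin-pair approximation (volume proportional to r⁻⁶) can be used with a reference pair of known separation. This is a sketch under that stated assumption, not the analysis used in the chapter, and it ignores the spin-diffusion contribution that the text notes becomes important at longer distances.

```python
def distance_from_noe(volume: float, ref_volume: float, ref_distance: float) -> float:
    """Estimate a distance from an NOE volume, assuming volume ~ r**-6.

    ref_volume and ref_distance describe a calibration pair of known
    separation (hypothetical values in the example below).
    """
    return ref_distance * (ref_volume / volume) ** (1.0 / 6.0)


# Hypothetical calibration: a 3.0 Angstrom pair with volume 1.0e8.
same = distance_from_noe(1.0e8, 1.0e8, 3.0)
weaker = distance_from_noe(1.0e8 / 64.0, 1.0e8, 3.0)  # 64-fold weaker peak
print(same, weaker)
```

Because of the sixth-root dependence, a 64-fold drop in volume only doubles the estimated distance, which is why long-range NOEs are weak but still informative.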
5 Å) proved important in determining the tertiary fold, as illustrated in Figure 5 by four such NOEs observed in L78K-TRX (drawn on the wild type structure) that link different types of secondary structure. One segment, the C-terminal helix, lacked NOEs to other secondary structural elements and is the only substructure not properly positioned. Thus, using as distance geometry constraints only amide proton NOEs (3.6 per residue), a generally correct mainchain fold of L78K-TRX was obtained.
Figure 5. X-ray crystal structure of wild type thioredoxin (A) and the calculated structure of L78K-TRX (B). Distances between mainchain amide protons are given in Å.

A more detailed evaluation of the extent to which spin-diffusion contributes to NOEs in perdeuterated HCA II and L78K-TRX is under way. Modeling studies with three spins suggest that NOESY mixing times of up to 600 ms may be employed without significant (
Figure 6. NOE build-up curves for a sample of ten long range (A) and ten short range (B) amide pairs in L78K thioredoxin. [Horizontal axis in both panels: Mixing Time (ms), 200 to 600.]

Once a global fold has been established for a protein, a complete high-resolution 3D structure of the protein can then be calculated using distance constraints derived from 4D ¹³C/¹⁵N-separated and ¹³C/¹³C-separated NOESY data. The analysis of these 4D NOESY data sets should be facilitated by the previous determination of the protein global fold.
IV. Conclusions

These studies indicate that perdeuteration can be achieved in proteins expressed in several different E. coli strains by growing selected cells in D2O media. Complete deuteration provides significant signal-to-noise enhancement in heteronuclear NMR assignment and structure determination experiments which use the amide proton for detection. Using a perdeuterated ¹³C/¹⁵N sample, we have completed the ¹H, ¹³C, and ¹⁵N mainchain and ¹HN, ¹⁵N, and ¹³C aliphatic sidechain assignments for the 259-residue protein HCA II utilizing the strategies outlined above. We are in the process of analyzing 4D ¹⁵N/¹⁵N-separated NOESY data on both perdeuterated HCA II and a mutant of thioredoxin in order to generate distance constraints which will be used to determine the global folds of these proteins. The data include particularly useful longer range constraints, often extending to greater than 7 Å. Our initial application of the strategy to the L78K-TRX mutant protein is highly encouraging and illustrates the importance of long range NOE constraints in tertiary structure evaluation.
The strategies we have outlined here should be applicable to proteins with rotational correlation times substantially longer than HCA II.
Acknowledgments

The Duke University NMR Center was established with grants from the NIH, NSF, and the North Carolina Biotechnology Center, which are gratefully acknowledged. This work was supported in part by the NIH research grant GM 41829. The authors thank Homme W. Hellinga for providing the expression strain and facilities to purify L78K-TRX.
References

1) Venters, R.A., Farmer, B.T. II, Fierke, C.A., and Spicer, L.D. (1996) J. Mol. Biol., in press.
2) de Lorimier, R.M., Hellinga, H., and Spicer, L.D. (1996) Protein Science, in press.
3) Bax, A., and Grzesiek, S. (1993) Acc. Chem. Res. 26, 131-138.
4) Muhandiram, D.R., and Kay, L.E. (1994) J. Magn. Reson., Series B 103, 203-216.
5) Grzesiek, S., Anglister, J., Ren, H., and Bax, A. (1993) J. Am. Chem. Soc. 115, 4369-4370.
6) Yamazaki, T., Muhandiram, R., and Kay, L.E. (1994) J. Am. Chem. Soc. 116, 8266-8278.
7) Yamazaki, T., Lee, M., Revington, M., Mattiello, D.L., Dahlquist, F.W., Arrowsmith, C.H., and Kay, L.E. (1994) J. Am. Chem. Soc. 116, 6464-6465.
8) Venters, R.A., Huang, C.-C., Farmer, B.T. II, Trolard, R., Spicer, L.D., and Fierke, C.A. (1995) J. Biomol. NMR 5, 339-344.
9) Farmer, B.T. II, and Venters, R.A. (1995) J. Am. Chem. Soc. 117, 4187-4188.
10) Farmer, B.T. II, and Venters, R.A. (1996) J. Biomol. NMR 7, 59-71.
11) Venters, R.A., Metzler, W.J., Spicer, L.D., Mueller, L., and Farmer, B.T. II (1995) J. Am. Chem. Soc. 117, 9592-9593.
12) Nair, S.K., Calderone, T.L., Christiansen, D.W., and Fierke, C.A. (1991) J. Biol. Chem. 266, 17320-17325.
13) Rosenberg, A.H., Lade, B.N., Chui, D.S., Lin, S.W., Dunn, J.J., and Studier, F.W. (1987) Gene 56, 125-135.
14) Studier, F.W., and Moffatt, B.A. (1986) J. Mol. Biol. 189, 113-130.
15) Khalifah, R.G., Strader, D.J., Bryant, S.H., and Gibson, S.M. (1977) Biochemistry 16, 2241-2247.
16) Verpoorte, J.A., Mehta, S., and Edsall, J.J. (1967) J. Biol. Chem. 242, 4221-4229.
17) Sambrook, S., Fritsch, E.F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, New York.
18) Carlsson, U., Henderson, L.E., and Lindskog, S. (1973) Biochim. Biophys. Acta 310, 376-387.
19) Wishart, D.S., and Sykes, B.D. (1994) J. Biomol. NMR 4, 171-180.
20) Metzler, W.J., Constantine, K.L., Friedrichs, M.S., Bell, A.J., Ernst, E.G., Lavoie, T.B., and Mueller, L. (1993) Biochemistry 32, 13818-13829.
21) Spera, S., and Bax, A. (1991) J. Am. Chem. Soc. 113, 5490-5492.
22) James, T.L. (1994) Methods in Enzymology 239, 416-439.
23) Zhao, D., and Jardetzky, O. (1994) J. Mol. Biol. 239, 601-607.
24) Clore, G.M., Robien, M.A., and Gronenborn, A.M. (1993) J. Mol. Biol. 231, 82-102.
25) Hoogstraten, C.G., and Markley, J.L. (1996) J. Mol. Biol. 258, 334-348.
¹H-NMR EVIDENCE FOR TWO BURIED ASN SIDE-CHAINS IN THE cMYC-MAX HETERODIMERIC α-HELICAL COILED-COIL
Pierre Lavigne, Matthew P. Crump, Stephane M. Gagne, Brian D. Sykes, Robert S. Hodges and Cyril M. Kay
Department of Biochemistry and the Protein Engineering Network of Centres of Excellence, University of Alberta, Edmonton, Alberta CANADA T6G 2S2
I. INTRODUCTION

The Leucine Zipper (LZ) is a dimerization motif found in the b-LZ and b-HLH-LZ transcription factor families (1,2). Upon dimerization, LZs fold into parallel, two-stranded α-helical coiled-coils (3-6). The primary structure of coiled-coil-forming proteins is characterized by the heptad repeat (abcdefg)n, where Leu residues are conserved at positions d, positions a are mostly occupied by β-branched and hydrophobic residues, and positions e and g are often occupied by acidic or basic residues (7,8). The tertiary interactions of the dimeric LZ, or parallel and two-stranded α-helical coiled-coils, are described by the knobs-into-holes model (3,9). In the b-LZ family (e.g., GCN4 and c-Jun), Asn residues are found to be conserved at an a position in the heptad repeat (1,10). A pair of Asn side-chains destabilizes the homodimeric LZ coiled-coil compared to the hydrophobic side-chains otherwise conserved at this position (3,10). From a biological point of view, a lower stability for homodimeric species will facilitate the reassortment of LZs, which is desirable in the light of their regulatory (heterodimerization) role (3,11). It has also been shown in a series of GCN4 LZ mutants and de novo designed LZs that replacement of the Asn residue by aliphatic residues leads to the formation of oligomers, namely trimeric and tetrameric species (12-14). In addition to decreasing the stability of dimeric LZs, Asn side-chains can impose the correct dimer orientation (parallel and in-register) and specify folding of dimeric species over oligomeric ones (3,12,13). The crystal structure of the GCN4 homodimeric LZ indicates that the Asn side-chains pack asymmetrically at the interface of the dimer, where an interhelical H-bond between the δNH2 of one Asn side-chain and the Oδ of the other is formed (3). On the other hand, solution NMR studies on the GCN4 (15) and cJun homodimeric LZ (16) have shown that they are symmetric.
It has been shown that the Asn side-chains are most likely flipping between two distinct, symmetry-related H-bonded conformations in the fast chemical exchange regime at room temperature (14). The oncoprotein c-Myc (a b-HLH-LZ protein) heterodimerizes specifically with the protein Max (another b-HLH-LZ protein) to bind DNA and activate transcription (17,18). The LZ domain of Max contains two Asn residues at a positions (Fig. 1A). In two previous studies (19,20) it has been shown that the c-Myc and Max LZs form a heterodimeric α-helical coiled-coil with high specificity. This has led us (19) and others (20) to propose that the LZ domains of Max and c-Myc are responsible for the specificity of molecular recognition in vivo. Molecular models describing interhelical salt bridges and hydrogen bonds that might be responsible for the specificity have been proposed (19,20). Amongst other features, the two Asn side-chains found on the Max LZ are proposed to be buried and to form interhelical side-chain/side-chain and side-chain/main-chain hydrogen bonds. In this paper we focus, using proton NMR spectroscopy, on the interactions of the two Asn side-chains at the interface of the c-Myc-Max heterodimeric LZ. We report interhelical NOEs between the Hδ protons of Max Asn5a and Asn19a and protons from the side-chains of c-Myc LZ forming the holes in which they are proposed to pack according to the knobs-into-holes model, indicating that they are indeed buried. Moreover, Max Asn19a Hz shows an interhelical NOE to a backbone amide proton of c-Myc LZ as well as slow amide exchange, indicating that Max Asn19a is potentially forming side-chain/main-chain hydrogen bonds. As discussed, these results support the molecular models for the c-Myc-Max heterodimeric coiled-coil and shed more light on the putative role of the conserved Asn residues in the mechanism of heterodimerization in this b-HLH-LZ subfamily of transcription factors.

II. MATERIAL AND METHODS

Solid phase peptide synthesis of the c-Myc and the Max LZs, characterization by mass spectrometry, purification by reversed-phase HPLC and the formation of the disulfide-linked c-Myc-Max heterodimeric LZ have been described elsewhere (19). All proton NMR spectra were recorded on a Varian Unity 600 at 25 °C.
6 to 10 mg of the disulfide-linked c-Myc-Max heterodimeric LZ were dissolved in 0.5 mL of potassium phosphate buffer (50 mM, 10% D2O / 90% H2O, pH 4.7) containing 100 mM KCl and 1 mM 2,2-dimethyl-2-silapentane-5-sulfonic acid (DSS) to yield solutions ranging from 0.75 to 1.25 mM. Proton resonances were assigned from two-dimensional double quantum filtered correlation spectroscopy (DQF-COSY; (21)), two-dimensional total correlation spectroscopy (TOCSY; mixing time = 50 ms; (22)) and two-dimensional nuclear Overhauser enhancement spectroscopy (NOESY; mixing times = 150 and 200 ms; (23)) experiments. Sequential assignment of the proton resonances was performed as described by Wüthrich (24). The spectra were acquired with 2048 t2 complex data points and 256 t1 increments in the phase sensitive mode with quadrature detection using the method described by States et al. (25). The water resonance was suppressed by continuous irradiation at its resonance frequency during the 1.5 s relaxation period used in the NOESY, DQF-COSY and TOCSY experiments and during the mixing period of the NOESY experiments. The amide exchange experiments were carried out by acquiring 1-D spectra as described elsewhere (19) after dissolving the lyophilized sample in 100% D2O. pH readings were not corrected for the isotopic effect.

III. RESULTS

We present in Fig. 1A the primary sequences of the c-Myc and Max LZs. Fig. 1B shows the arrangement of the heterodimeric LZ in a helical wheel representation.
A
       defgabcdefgabcdefgabcdefgabcd
    CGGMRRKNDTHQQDIDDLKRQNALLEQQVRAL   Max LZ
    CGGVQAEEQKLISEEDLLRKRREQLKHKLEQL   c-Myc LZ
B
    [Helical wheel diagram]
Figure 1. A. Primary structures of the c-Myc and Max LZs. Sequences are taken from Zervo et al. (26) and renumbered. B. Helical wheel diagram of the c-Myc-Max heterodimeric LZ. Potential interhelical electrostatic interactions have been discussed elsewhere (19,20). In the knobs-into-holes model (9), side-chains (knobs) at position a in the heptad repeat pack in the holes formed by consecutive g and a residues and two d positions. Accordingly, Max Asn5a is proposed to pack in the hole formed by Val1 [...] 90%) between pH 4.0 and pH 7.0.

We present in Fig. 2 the amide-amide region from a NOESY spectrum recorded at 25 °C and pH 4.7. Extensive sequential dNN (i, i+1) NOEs typical for α-helices (24) can be seen. Despite poor chemical shift dispersion of the α-protons, a significant portion of the short range dαN (i, i+3 and i, i+4) and dαβ (i, i+3) α-helical connectivities (24) could be unambiguously identified. In summary, enough α-helical connectivities encompassing all the primary structure of both LZs were observed to ensure that the heterodimeric disulfide-linked c-Myc-Max LZ has an extensive α-helical secondary structure.
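The residue numbering and heptad labels used throughout (for example Max Asn5a or c-Myc Leu8d) can be reproduced from the Figure 1A sequences. The sketch below assumes that numbering starts after the three-residue CGG linker and that the first numbered residue sits at heptad position d, as the Figure 1A register suggests; the function name is hypothetical.

```python
HEPTAD = "abcdefg"


def heptad_positions(sequence: str, first_position: str = "d", linker_length: int = 3):
    """Return (residue, number, heptad letter) for each residue after the linker."""
    start = HEPTAD.index(first_position)
    zipper = sequence[linker_length:]
    return [
        (residue, i + 1, HEPTAD[(start + i) % 7])
        for i, residue in enumerate(zipper)
    ]


MAX_LZ = "CGGMRRKNDTHQQDIDDLKRQNALLEQQVRAL"
MYC_LZ = "CGGVQAEEQKLISEEDLLRKRREQLKHKLEQL"

max_a = [(res, num) for res, num, pos in heptad_positions(MAX_LZ) if pos == "a"]
myc_a = [(res, num) for res, num, pos in heptad_positions(MYC_LZ) if pos == "a"]
print("Max a positions:  ", max_a)
print("c-Myc a positions:", myc_a)
```

Under these assumptions the two Max Asn residues fall at a positions 5 and 19, matching the labels used in the text.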
Figure 2. Backbone and side-chain amide region of a 600 MHz NOESY spectrum of the disulfide-linked c-Myc-Max LZ at 25 °C. Mixing time = 200 ms, pH 4.7. Labelled is the interhelical NOE between Max Asn19a Hδ (Hz) and c-Myc Arg19a backbone HN. [Horizontal axis: F2 (ppm).]

B. Tertiary interactions involving the two Asn side-chains at a positions

The spin systems of Max Asn5a and Max Asn19a have been completely assigned. As these residues are proposed to be buried at the interface of the heterodimer, long range (interhelical) NOEs involving their side-chains should enable us to probe their tertiary interactions and verify whether they are indeed buried. Figure 2 shows an NOE between Max Asn19a Hδ (Z) and c-Myc Arg19a backbone HN. In addition, both Hδ side-chain protons show NOEs to c-Myc Arg19a Hα and one of c-Myc Arg19a Hβ (Fig. 3). Max Asn5a Hδ side-chain protons show NOEs with c-Myc Glu5a Hα, one of c-Myc Glu4g Hβ, and the protons of one of the c-Myc Leu8d methyl groups (Fig. 3). The NOEs from Hδ protons of Max Asn5a and Max Asn19a connect these side-chains to the residues on the c-Myc LZ that form the holes in which they would pack according to the knobs-into-holes model (see legend of Fig. 1). This strongly supports the proposition that both Asn side-chains are buried at the interface of the c-Myc-Max heterodimeric LZ.
Figure 3. (a) Major clusters identified through α-carbon distances for E-L2-E linker; (b) (φ,ψ) plot for one of the clusters (Fig. 3a) for E-L2-E linker. The turn types I & T are shown to isolate; (c)

$$
\begin{array}{ccc}
\text{Folded} & \underset{k_{\text{fold}}}{\overset{k_{\text{unfold}}}{\rightleftharpoons}} & \text{Unfolded} \\
\downarrow\, k^{\text{folded}}_{\mathrm{H \to D}} & & \downarrow\, k^{\text{unfolded}}_{\mathrm{H \to D}} \\
\text{Folded/Deuterated} & \underset{k_{\text{fold}}}{\overset{k_{\text{unfold}}}{\rightleftharpoons}} & \text{Unfolded/Deuterated}
\end{array}
$$
The model simply proposes that hydrogen exchange may occur (with different rates) in both the folded and unfolded forms of the protein, and that the folding/unfolding rates are not affected by replacement of amide hydrogens by deuteriums. (Note that $k_{\mathrm{D \to H}}$ is zero for experiments in D2O, since there are no H2O solvent protons available for back-exchange; this rate constant does become observable by separate pulse-labeling experiments discussed later. Also, $k_{\mathrm{D \to H}}$ is the same as $k_{\mathrm{H \to D}}$ except for kinetic isotope effects.) Let $k_{\text{unfold}}$ be the rate constant for unfolding, $k_{\text{fold}}$ the rate constant for folding, and $k^{\text{unfolded}}_{\mathrm{H \to D}}$ the hydrogen/deuterium exchange rate for an amide hydrogen in the unfolded state.

If $k^{\text{unfolded}}_{\mathrm{H \to D}} \ll k_{\text{fold}}$, the protein will fold and unfold many times before it can be fully deuterated, and the observed exchange rate is $k_{\text{obs}} = K_{\text{unfold}}\, k^{\text{unfolded}}_{\mathrm{H \to D}}$, in which $K_{\text{unfold}} = k_{\text{unfold}}/k_{\text{fold}}$. This mechanism represents the EX2 limit. Under these conditions, the deuteration sites are randomly distributed about the average, so that a single-envelope isotope distribution is observed. (Values of $k^{\text{unfolded}}_{\mathrm{H \to D}}$ for different amide hydrogens have previously been found to range over only about one order of magnitude (27).) We observed such behavior when C22A-FKBP was incubated in 2 M urea (data not shown).

On the other hand, if $k^{\text{unfolded}}_{\mathrm{H \to D}} \gg k_{\text{fold}}$, then once the protein unfolds, all amide hydrogens in the unfolded region will rapidly be replaced by deuteriums, to yield a second isotopic envelope with high deuterium content (unfolded/deuterated) distinct from that of the folded/deuterated protein. Under these conditions, known as EX1, $k_{\text{obs}} = k_{\text{unfold}}$. We observe this type of behavior for C22A-FKBP in the presence of 3.5 or 4.5 M urea (Figure 1, bottom two panels). In this limit, the rate of increase in relative abundance of the high-m/z envelope provides a direct measure of $k_{\text{unfold}}$.
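The two limits can be seen as special cases of a single expression. The sketch below uses the standard two-state (Linderstrøm-Lang) result, k_obs = k_unfold · k_ex / (k_fold + k_ex), which assumes the folded state dominates and exchange from the folded state is negligible; that general formula is an assumption brought in for illustration and is not quoted from the chapter. The rate values are arbitrary.

```python
def observed_exchange_rate(k_unfold: float, k_fold: float, k_ex: float) -> float:
    """Observed exchange rate for a two-state opening model.

    k_ex is the intrinsic exchange rate of the amide in the unfolded state.
    All rates must share the same units.
    """
    return k_unfold * k_ex / (k_fold + k_ex)


# EX2 limit: refolding is much faster than exchange (arbitrary numbers).
ex2 = observed_exchange_rate(k_unfold=1.0, k_fold=1000.0, k_ex=1.0)
ex2_limit = (1.0 / 1000.0) * 1.0  # K_unfold * k_ex

# EX1 limit: exchange is much faster than refolding.
ex1 = observed_exchange_rate(k_unfold=1.0, k_fold=1.0, k_ex=1000.0)
ex1_limit = 1.0  # k_unfold

print(ex2, ex2_limit)
print(ex1, ex1_limit)
```

In the EX2 regime the observed rate reports on the unfolding equilibrium, whereas in the EX1 regime it reports directly on the unfolding rate constant.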
Zhongqi Zhang et al.
If the equilibrium constant for the unfolding can also be determined, then the refolding rate constant, $k_{\text{fold}}$, may also be determined. By the same reasoning, if the isotopic distributions for different segments of the deuterated protein can be determined, then the unfolding and refolding rate constants for each segment can be determined independently. A detailed unfolding and refolding mechanism may then be developed from the kinetics of those individual exchange processes, providing a powerful probe of detailed unfolding/refolding pathways. For the present example, our observation of similar unfolding kinetics for intact C22A-FKBP (Fig. 1) and all of its fragments (e.g., Fig. 2 shows the V^-M^^ segment) in 3.5 or 4.5 M urea further confirms that the protein unfolds with a single cooperative transition.

Figure 4 shows the relative abundance of the high-m/z envelope (corresponding to the unfolded form) vs. H/D exchange period for C22A-FKBP and its V^-M^^ segment, at either 3.5 M or 4.5 M urea. The unfolding rate constant, $k_{\text{unfold}}$, of the protein and of its V^-M^^ segment can be determined by fitting each data set to a single-exponential curve. At either 3.5 M or 4.5 M urea, the unfolding rates of the intact protein and its V^-M^^ segment are nearly identical. Similar results (not shown) were obtained for numerous other C22A-FKBP segments. Observation of a common unfolding rate for each of many segments of the protein backbone further establishes a two-state (folded/unfolded) equilibrium for C22A-FKBP in 3.5 and 4.5 M urea. The unfolding rate constant is $k_{\text{unfold}}$ = 1.8 ± 0.2 h⁻¹ at 3.5 M urea and 7.2 ± 0.4 h⁻¹ at 4.5 M urea, corresponding to an unfolding half-life of 23 min at 3.5 M urea and 6 min at 4.5 M urea. If the unfolding equilibrium constant $K_{\text{unfold}} = k_{\text{unfold}}/k_{\text{fold}}$ can be determined, then the refolding constant, $k_{\text{fold}}$, for different regions of the protein can also be determined as follows.
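The quoted half-lives follow from the first-order relation t½ = ln 2 / k. The sketch below reproduces that conversion and shows one simple way to extract a rate constant from envelope abundances, assuming ideal single-exponential growth of the high-m/z envelope; the fitting approach (a log-linear least-squares line through the origin) and the synthetic data are illustrative assumptions, not the authors' fitting procedure.

```python
import math


def half_life_minutes(k_per_hour: float) -> float:
    """Half-life of a first-order process, converted from hours to minutes."""
    return math.log(2) / k_per_hour * 60.0


def fraction_unfolded_once(k_per_hour: float, t_hours: float) -> float:
    """Fraction of molecules that have unfolded at least once by time t."""
    return 1.0 - math.exp(-k_per_hour * t_hours)


def fit_rate_constant(times, fractions):
    """Least-squares slope through the origin of -ln(1 - f) versus t."""
    y = [-math.log(1.0 - f) for f in fractions]
    return sum(t * yi for t, yi in zip(times, y)) / sum(t * t for t in times)


# Rate constants quoted in the text (per hour).
print(round(half_life_minutes(1.8)), "min at 3.5 M urea")
print(round(half_life_minutes(7.2)), "min at 4.5 M urea")

# Synthetic, noise-free data generated from k = 1.8 per hour.
times = [0.1, 0.2, 0.4, 0.8]
fractions = [fraction_unfolded_once(1.8, t) for t in times]
k_fit = fit_rate_constant(times, fractions)
print(k_fit)
```

With real data, noise near f = 1 is strongly amplified by the logarithm, so a direct nonlinear fit of the exponential would usually be preferable.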
The fully deuterated protein equilibrated in 3.5 M and 4.5 M urea/D2O solution was rapidly diluted (10-fold) by H2O buffer, and after 5 seconds, further deuterium exchange-out was quenched by decreasing the pH to 2.4. During the brief 5 s period, amide deuteriums in unfolded protein molecules are replaced with hydrogen faster than are amide deuteriums in folded protein molecules. If the D/H exchange rate is faster than or comparable to the refolding rate during those five seconds, then two distinct envelopes will be seen in the isotopic distribution, and the relative abundances of the two envelopes provide a direct measure of the relative concentrations of the folded and unfolded proteins.

Figure 4. Relative abundance of high-m/z isotopic envelope (i.e., relative number of FKBP molecules that have unfolded at least once) vs. exchange period for C22A-FKBP and its V^-M^^ segment in either 3.5 M or 4.5 M urea. Each data set is fitted to a single exponential curve. The close agreement between the data for intact protein and its V^-M^^ segment and all of its other segments (not shown) supports a two-state folded/unfolded protein equilibrium model. [Horizontal axis: Exchange Period (h), 0 to 0.8.]

Figure 5 shows the isotope distribution of C22A-FKBP after such pulse-labeling in 3.5 M and 4.5 M urea. For the 4.5 M urea experiment, two isotopic envelopes are clearly evident. The low-m/z envelope represents the unfolded (rapidly D/H-exchanged) protein and the high-m/z envelope is the folded (slowly D/H-exchanged) protein. Thus, the D/H exchange rate must be much faster than the refolding rate. The relative abundance of the two envelopes thus provides a direct measure of the relative concentrations of folded and unfolded protein. (Although the two distributions do not overlap in this example, deconvolution of the natural-abundance isotope distribution clearly narrows the observed distributions, and promises to increase the power of mass spectrometry to resolve conformational states more similar than shown here.)

Figure 5. Isotope distributions for C22A-FKBP, [M+9H]⁹⁺, following equilibration in 3.5 M and 4.5 M urea in D2O, followed by pulse-labeling for 5 s with H2O, and quenching to stop the D/H exchange. The low-m/z envelope thus represents the unfolded form of the protein. The bottom panel represents the distribution of incorporated deuterons computed by deconvolving the natural-isotopic distribution from the experimental mass spectra. [Panel titles: C22A FK506-Binding Protein; Equilibrate in D2O; Exposed to H2O (5 seconds); Quench; 3.5 M Urea; 4.5 M Urea; Deuterium Distribution, 4.5 M Urea. Horizontal axes: m/z (1310 to 1318); Number of Deuterons (10 to 70).]

From the distributions shown in Fig.
5, we calculated an unfolding equilibrium constant, Kunfold» of 40 h"! (half-life r (7). However, these wavelengths are not ideal for studies of biological systems because they are not strongly absorbed by aqueous solutions. Absorbing dyes were therefore needed to transfer heat to the solvent. To generate larger temperature jumps, more suitable wavelengths (between 1100 to 2000 nm) were needed. Holzwarth et al. used an iodine laser with an output at 1.315 |im for direct heating of water (8). Turner et aL (9, 10) made use of the Stimulated Raman effect to shift the Nd:YAG fundamental to 1.41 \xm in liquid N2. However, difficulties with using Uquid N2 (unstable pulses due to boiling at room temperature, self focussing, and the need for an expensive dewar) prompted use of H2 gas as the Raman active medium (11). H2 produces a frequency shift from 1.06 jim to 1.89 |Lim. Not only were larger temperature jumps achieved but very rapid heating was accomplished since the radiation was directly absorbed by the solvent. Nearinfrared heating also eliminated the complication of affecting the reactant chemical relaxation since it is unable to excite most electronic transitions in a single photon process. One remaining difficulty was the possibility of producing temperature gradients in the sample since the amount of light absorbed at any point depends exponentially on the depth. This effect can however be minimized by using short pathlength cells ( 0 . 1 - 1 mm), or reflecting the heat pulse multiple times through the sample, or by decreasing the IR optical density by either using a laser wavelength with lower absorption or by mixing water and D2O (10). Recently, interesting results have emerged from nanosecond or faster temperature jump experiments studying the initial events during protein folding/unfolding with 10-30 °C temperature changes. Williams et al. 
have used difference frequency generation of a Nd:YAG laser and pulsed dye laser to produce a 20 ns temperature jump pulse at 2 µm that remained elevated for ~1 millisecond (5c). The unfolding reactions of a small helical peptide (5c) and apomyoglobin (5d) were monitored with transient infrared spectroscopy. The Stimulated Raman effect was used by Ballew et al. (5f) to generate a temperature jump pulse at 1.54 µm that has similar time resolution as reported by Williams et al. (5c). Intrinsic tryptophan fluorescence was used to follow the refolding reaction of apomyoglobin from its cold denatured state. Phillips et al. have used a different approach to generating a temperature jump (5a). The energy from a laser pulse was absorbed by homogeneously dispersed dye molecules that subsequently released energy as heat to cause a temperature jump of up to 10 °C within 70 ps. They have studied the unfolding of RNase A by picosecond transient infrared spectroscopy. One interesting problem that can be addressed with this technique is the kinetics of helix-coil conformational change. The α-helix is the most commonly occurring form of secondary structure found in proteins, and understanding the kinetics of the helix-coil transition will undoubtedly contribute to understanding the mechanism of protein folding. Studies have shown that for long polypeptides (>200 residues), α-helix formation occurs at a rate faster than 1x10^ s"^ (12); however, α-helical segments in proteins are much smaller than 200 residues. It was initially thought that small peptides would not be stable enough to form α-helical structures in solution. Recently,
Laser Temperature Jump in Protein Folding
however, there have been several reports of helix formation in small peptides in aqueous solution (13). The fact that small peptides are stable enough to form secondary structure may support theoretical postulates that secondary structure forms early in the folding process (14). Tertiary structure could then form through the relative diffusion of fluctuating secondary structural regions. Simulations of small α-helical peptides have provided insight into the helix folding mechanism and suggest that folding and unfolding can occur in less than ten nanoseconds (15). Applying recently developed rapid initiation techniques to the helix-coil transition of small peptides will allow these theoretical predictions to be examined. This paper describes a nanosecond laser temperature-jump instrument with time resolution suitable for investigations of the early events in protein and peptide folding/unfolding, as well as kinetic events that occur out to ~2 milliseconds. As an example of the capabilities of the instrument described here, some results are presented on the kinetics of the helix-coil transition of a synthetic, 21-residue α-helical peptide studied by Lockhart and Kim (16). The peptide has a fluorescent probe, 4-methylaminobenzoic acid (MABA), covalently attached to the N-terminus, giving the structure: MABA-AAAAA(AAARA)3A-CONH2, where A is alanine and R is arginine. The carbonyl oxygen of MABA forms a hydrogen bond with the amide NH of residue 4 only in the helical conformation (16), dramatically affecting the fluorescence intensity.
II. METHODS
As determined from HPLC, the purity of the peptide was greater than 90%. Steady-state fluorescence spectra of this peptide were collected from 310-480 nm with an excitation wavelength of 264 nm. A 1 cm pathlength cuvette was used with concentrations of ~8.6 µM. The emission quantum yields were determined relative to N-acetyl-L-tryptophanamide at pH = 6.9 (Φf = 0.13) (17). Steady-state circular dichroism spectra were obtained using a 0.5 mm pathlength cylindrical cell and concentrations of ~0.26 mM. The mean molar ellipticity [θ] (deg cm² dmol⁻¹) was calibrated with (+)-10-camphorsulfonic acid. Concentrations of the solutions were determined by measuring the absorbance of 4-methylaminobenzoic acid.
A. Laser Temperature Jump Instrument
A near-infrared laser pulse at 1.54 µm was used to rapidly heat the aqueous peptide solution 10-20 °C, while a cw ultraviolet probe beam excited the fluorescence of the labeled peptide and monitored the relaxation kinetics. A schematic of the instrument is shown in Figure 1. To produce the temperature jump pulse the fundamental (1064 nm) of a Nd:YAG laser (Continuum Surelite I), operating at 1.67 Hz, was focused with a 0.75 m lens into a one meter Raman cell (Princeton Optics, Inc.). The Raman cell contained 600 psi of CH4 and 500 psi of He and had a conversion efficiency of up to 20% for the first Stokes line (1.54 µm). The 1.54 µm wavelength
Peggy A. Thompson
(pulse width of 3-5 ns FWHM) was separated from the fundamental and anti-Stokes lines with a Pellin-Broca prism. The T-jump pulse was then focused onto the sample with a 0.75 m lens to give a spot size of 1 mm, and directly heated a small volume of water (~0.4 µL) by vibrational excitation. This wavelength corresponds to the near-IR absorption band of the OH stretching overtone with ε = 5.2 cm⁻¹. Using a 500 µm pathlength cuvette, 38% of the infrared beam was absorbed in an aqueous medium. To ensure uniform heating at the front and back of the cuvette, the remaining 62% of the 1.54 µm light was reflected back onto the sample in a double pass configuration. An ultraviolet fluorescence excitation beam was generated with an intracavity frequency doubled argon ion laser (Coherent, FRED) providing a tunable wavelength range of 229-264 nm. The excitation beam was focused onto the sample with a 10 cm lens to give a spot size of 60-70 µm. For fluorescence experiments, front-face illumination geometry was used. The
[Figure 1 component labels: Prism; Nd:YAG; Photodiode; Frequency Doubled Argon Ion Laser; Digital Oscilloscope; Computer; Trigger]
Figure 1. Schematic of the temperature jump instrument. A 3-5 ns (FWHM) temperature jump pulse is generated using the Stimulated Raman effect to shift the Nd:YAG fundamental (1064 nm) to 1.54 µm in a mixture of CH4 and He gas. 10 mJ of 1.54 µm light is focused onto the 500 µm pathlength sample cell to give a spot size of 1 mm. The remaining near-infrared light is reflected back onto the sample cell to ensure uniform heating. A cw ultraviolet beam from an intracavity frequency doubled argon ion laser is focused to a spot size of 60-70 µm and excites the fluorescent sample. The fluorescence signal is detected 90° from the excitation beam with a photomultiplier tube.
fluorescence signal was collected with a 3.9 cm focal length lens, filtered from residual reflected excitation and infrared light, and detected 90 degrees from the excitation beam with a photomultiplier tube. The signal was amplified by cascading two channels (5x gain per channel) of a fast preamp module (Stanford Research Systems, SR240) and processed by a digital oscilloscope (Tektronix, TDS 620). The waveforms from the oscilloscope were transferred to a personal computer for analysis. The size of the laser temperature jump was characterized by monitoring the change in tryptophan fluorescence intensity, which decreases 1% for each degree of temperature increase. Temperature jumps of 20 degrees can consistently be generated with this instrument using 10 mJ of laser energy. Since the vibrational relaxation time of water in the near-IR is about 10"^^ s (18), the time it takes to reach a maximum temperature jump is limited only by the laser pulse width. Figure 2a shows that the instrument response time for a temperature jump from 0 °C to 20 °C is ~5 ns and that thermal diffusion from the reaction volume takes several milliseconds (Figure 2b).
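The tryptophan calibration described above can be illustrated with a short calculation. This is a minimal sketch, assuming that the 1% decrease per degree is linear and referenced to the pre-jump intensity; the function name and the example intensities are hypothetical and are not values taken from the instrument.

```python
def temperature_jump_from_fluorescence(intensity_before, intensity_after,
                                       fractional_drop_per_degree=0.01):
    """Estimate a temperature jump (degrees C) from a tryptophan fluorescence drop.

    Assumes a linear 1% loss of intensity per degree, measured relative to
    the pre-jump intensity (an assumption made for this sketch).
    """
    if intensity_before <= 0:
        raise ValueError("intensity_before must be positive")
    fractional_drop = (intensity_before - intensity_after) / intensity_before
    return fractional_drop / fractional_drop_per_degree


# Hypothetical intensities (arbitrary units) chosen to give a 20% drop.
jump = temperature_jump_from_fluorescence(0.055, 0.044)
print(f"Estimated temperature jump: {jump:.1f} degrees")
```

If the true temperature dependence is not linear over the range of the jump, a measured calibration curve would replace the single coefficient used here.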
[Figure 2 plots: fluorescence intensity versus time; panel a) time in nanoseconds, panel b) time in milliseconds.]
Figure 2. a) The decrease in tryptophan fluorescence intensity after a 0 °C to 20 °C temperature jump is shown. The fluorescence intensity can be fit with a 5 ns time constant giving the time resolution of the instrument. b) Monitoring the tryptophan fluorescence intensity shows that the temperature remains elevated for several milliseconds. Relaxation kinetics can be measured over a time range of 5 ns to ~2 ms with this instrument.
III. RESULTS AND DISCUSSION
A. Steady State Circular Dichroism and Fluorescence
The equilibrium properties of the peptide were characterized by circular dichroism and fluorescence spectroscopies. At low temperatures, the CD spectrum of this peptide has the characteristic double minima at 222 nm and 208 nm indicative of an α-helix. An isodichroic point observed at 202 nm implies that each residue of the peptide exists in either a helix or coil conformation. The extent of helix formation is most easily monitored by following the 222 nm minimum, -[θ]222. In Figure 3, the CD thermal unfolding curve monitored at 222 nm shows that the helix content decreases with increasing temperature. At 0 °C the peptide is ~70% α-helix, whereas at 70 °C it is ~10% helical. The midpoint of the thermal transition occurs at ~25 °C, consistent with what has been previously reported (16). Similar to other small helical peptides, the helix-to-coil transition is very broad as a function of temperature, >70 °C (19). However, it is still possible to induce a significant change in the helical population with a 20 degree temperature change. The quantum yields from the fluorescence spectra of MABA-peptide within the thermally induced helix-coil transition are shown in Figure 3 for
[Figure 3 plot: Circular Dichroism and fluorescence quantum yield data versus Temperature, °C.]
Figure 3. Equilibrium data for the MABA-peptide as a function of temperature. The thermal unfolding curve for the peptide monitored by CD at 222 nm shows that the helical content decreases with increasing temperature (Circles). The helix-coil transition is very broad as a function of temperature with the mid-point occurring at ~25 °C. The fluorescence quantum yield of MABA attached to the peptide has a strong temperature dependence (Triangles), in marked contrast to free MABA in solution (Squares). There are significant differences between the CD and fluorescence thermal transition curves. It is expected that the two experimental techniques provide different measures of the helical content. (Lines through the data points are provided to guide the eye.)
temperatures between -3.5 °C and 65 °C. The total fluorescence intensity for MABA bound to the peptide is strongly dependent on temperature, decreasing 55% as the temperature increases from 0 °C to 65 °C. No spectral shift (emission λmax = 368 nm) is detected in this temperature range. By comparison, very little fluorescence temperature dependence is observed for free MABA in solution (Figure 3), indicating that the fluorescence intensity for the MABA-peptide monitors the helix-coil conformational transition at the N-terminus. Equilibrium helix-coil theories (20) predict that the probability of forming a helical segment is higher in the middle of a peptide sequence than at the termini. Spectroscopic techniques sensitive to helical content may then be expected to respond differently to perturbations in the helix population. Because the CD signal has contributions from all amino acids in this peptide, the helical fraction as determined by [θ]222 measures the average helical content. By contrast, the fluorescence signal should be sensitive only to the helical population of the N-terminal amino acid residues owing to the location of the fluorescent probe. A comparison of the fluorescence quantum yields and the CD data (222 nm) shows differences in the thermal transition curves (see Figure 3), consistent with the expectation that the two experimental techniques provide different measures of the peptide's helical content.
B. Temperature Jump Kinetics
The laser temperature jump instrument was used to rapidly initiate the helix-coil transition for constant initial temperatures between -8 °C and 50 °C. The unfolding reaction kinetics were monitored by detecting the fluorescence intensity change of the MABA labeled peptide for the wavelength range 320-400 nm. Figure 4 shows an example of the relaxation kinetics for a temperature jump from 10 °C to 29 °C. An average time constant of 18 (± 4) ns was measured for the unfolding reaction. A maximum relaxation time of 21 (± 4) ns is observed near the mid-point of the helix-coil transition for
[Figure 4 plot: fluorescence intensity versus time, nanoseconds.]
Figure 4. The time constant for the change in fluorescence of the MABA-peptide after a temperature jump from 10 °C to 29 °C is 18 (± 4) ns. This is interpreted as the relaxation time for the change in helix content at the N-terminus.
this peptide. The relaxation times for final temperatures 30 °C above and below the mid-point temperature are ~3 times faster (7-9 ns). All of the relaxation data could be fit with a single exponential decay. It is interesting to compare these results with those from previous work on the same (but not MABA-labeled) peptide. Williams et al. studied the helix-coil transition by infrared spectroscopy for a temperature jump from 9 °C to 27 °C (5c). They observed a relaxation time of about 160 (± 60) ns, which is approximately 8 times longer than what is observed in the present experiment under similar conditions. However, infrared spectroscopy measures an average helix content, similar to what is expected from CD, whereas the fluorescence monitors the change in helix population at the N-terminus. It has been observed that the helix probability is lower at the termini than in the middle of the peptide sequence (21). Simulations of the kinetics suggest that the relaxation time for the average helical content will be longer than for the N-terminus (22). Previous investigations of helix-coil transition kinetics, which used a variety of fast relaxation methods (electric field jump, ultrasonic absorption, dielectric relaxation and temperature jump), encountered many difficulties (12). The systems studied were long homopolymers (>200 residues) that often had hydrolyzable side chains. Controversial results have been reported, depending on the experimental technique employed, because unwanted side chain reactions or molecular reorientation were often difficult to distinguish from the helix-coil conformational change. However, as observed here, a maximum in the relaxation times was detected for these experiments ranging from 15 µs to 20 ns and was attributed to the helix-coil transition.
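The single-exponential analysis mentioned above can be sketched as follows. This is an illustrative sketch only, not the fitting procedure used in the study: it assumes the model I(t) = offset + amplitude * exp(-t / tau), scans a grid of trial time constants, and solves for amplitude and offset by linear least squares at each trial value. The function name, the grid, and the synthetic trace (built with an 18 ns time constant) are hypothetical.

```python
import math


def fit_single_exponential(times, signal, tau_grid):
    """Fit signal(t) = offset + amplitude * exp(-t / tau) by scanning tau.

    For each trial tau the amplitude and offset follow from closed-form
    linear least squares; the tau with the smallest residual is returned.
    """
    n = len(times)
    best = None
    for tau in tau_grid:
        basis = [math.exp(-t / tau) for t in times]
        sum_b = sum(basis)
        sum_bb = sum(b * b for b in basis)
        sum_y = sum(signal)
        sum_by = sum(b * y for b, y in zip(basis, signal))
        det = n * sum_bb - sum_b ** 2
        if abs(det) < 1e-12:
            continue  # basis is indistinguishable from a constant
        amplitude = (n * sum_by - sum_b * sum_y) / det
        offset = (sum_y - amplitude * sum_b) / n
        residual = sum((offset + amplitude * b - y) ** 2
                       for b, y in zip(basis, signal))
        if best is None or residual < best[0]:
            best = (residual, tau, amplitude, offset)
    if best is None:
        raise ValueError("no usable tau in tau_grid")
    _, tau, amplitude, offset = best
    return tau, amplitude, offset


# Synthetic, noise-free trace (arbitrary units) with a known 18 ns time constant.
times_ns = [float(t) for t in range(0, 101)]
trace = [0.050 + 0.005 * math.exp(-t / 18.0) for t in times_ns]
tau_grid_ns = [0.5 * k for k in range(2, 201)]  # 1.0 to 100.0 ns in 0.5 ns steps

tau_fit, amplitude_fit, offset_fit = fit_single_exponential(times_ns, trace, tau_grid_ns)
print(f"tau = {tau_fit:.1f} ns")
```

With real data the trace would also carry the ~5 ns instrument response, so time constants approaching that limit would need deconvolution rather than a direct fit.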
IV. CONCLUSIONS
The laser temperature jump instrument can effectively be used to initiate and observe the fast events in protein/peptide folding and unfolding as well as those events that extend out to several milliseconds. In the present study, the unfolding of a helical peptide was determined to occur within tens of nanoseconds, supporting the need for nanosecond or faster initiation techniques. Promising results obtained by the laser temperature jump method will continue to stimulate the development of additional monitoring techniques such as UV absorption and circular dichroism.
ACKNOWLEDGEMENTS
This work was carried out in collaboration with James Hofrichter and William Eaton at the National Institutes of Health. The peptide was a kind gift from Peter Kim.
REFERENCES
1. (a) M. Karplus and D. L. Weaver, Prot. Sci. 3, 650 (1994); (b) J. D. Bryngelson, J. N. Onuchic, N. D. Socci, P. G. Wolynes, Proteins 21, 167 (1995); (c) K. A. Dill et al., Prot. Sci. 4, 561 (1995); (d) D. Thirumalai, J. de Phys. I 5, 1457 (1995); (e) A. A. Mirny, V. Abkevich, E. I. Shakhnovich, Folding & Design 1, 103 (1996).
2. C. M. Jones, E. R. Henry, Y. Hu, C.-K. Chan, S. D. Luck, A. K. Bhuyan, H. Roder, J. Hofrichter, W. A. Eaton, Proc. Natl. Acad. Sci. USA 90, 11860 (1993).
3. C.-K. Chan, Y. Hu, S. Takahashi, D. L. Rousseau, W. A. Eaton, J. Hofrichter, in preparation.
4. T. Pascher, J. P. Chesick, J. R. Winkler, H. B. Gray, Science 271, 1558 (1996).
5. (a) C. M. Phillips, Y. Mizutani, R. M. Hochstrasser, Proc. Natl. Acad. Sci. USA 92, 7292 (1995); (b) B. Nolting, R. Golbik, A. Fersht, Proc. Natl. Acad. Sci. USA 92, 10668 (1995); (c) S. Williams, T. P. Causgrove, R. Gilmanshin, K. S. Fang, R. H. Callender, W. H. Woodruff, R. B. Dyer, Biochemistry 35, 691 (1996); (d) R. B. Dyer, S. Williams, W. H. Woodruff, R. Gilmanshin, R. H. Callender, Biophys. J. 70, A177 (1996); (e) P. A. Thompson, W. A. Eaton, J. Hofrichter, Biophys. J. 70, A177 (1996); (f) R. M. Ballew, J. Sabelko, M. Gruebele, Proc. Natl. Acad. Sci. USA 93, 5759 (1996).
6. (a) G. W. Flynn, N. Sutin, in Chemical and Biochemical Applications of Lasers, C. Bradley Moore, ed., Academic Press, New York, p. 309, 1974; (b) C. F. Bernasconi, in Relaxation Kinetics, Academic Press, New York, p. 180, 1976; (c) D. H. Turner, in Investigation of Rates and Mechanisms of Reactions, Part 2, C. F. Bernasconi, ed., John Wiley & Sons, Inc., Vol. 6, p. 141, 1986.
7. H. Staerk, G. Czerlinski, Nature 205, 63 (1965); H. Hoffmann, E. Yeager, J. Stuehr, Rev. Sci. Instrum. 39, 649 (1968).
8. J. F. Holzwarth, A. Schmidt, H. Wolff, R. Volk, J. Phys. Chem. 81, 2300 (1977).
9. J. V. Beitz, G. W. Flynn, D. H. Turner, N. Sutin, J. Am. Chem. Soc. 92, 4130 (1970).
10. D. H. Turner, G. W. Flynn, N. Sutin, J. V. Beitz, J. Am. Chem. Soc. 94, 1554 (1972).
11. S. Ameen, Rev. Sci. Instrum. 46, 1209 (1975).
12. (a) R. Zana, Biopolymers 14, 2425 (1975); (b) B. Gruenewald, C. U. Nicola, A. Lustig, G. Schwarz, H. Klump, Biophys. Chem. 9, 137.
13. (a) J. E. Brown, W. A. Klee, Biochemistry 10, 470 (1971); (b) J. M. Scholtz, R. L. Baldwin, Annu. Rev. Biophys. Biomol. Struct. 21, 95 (1992).
14. (a) O. B. Ptitsyn, A. A. Rashin, Biophys. Chem. 3, 1 (1975); (b) M. Karplus, D. L. Weaver, Nature 260, 404 (1976).
15. (a) V. Daggett, P. A. Kollman, I. D. Kuntz, Biopolymers 31, 1115 (1991); (b) V. Daggett, M. Levitt, J. Mol. Biol. 223, 1121 (1992); (c) W. Schneller, D. L. Weaver, Biopolymers 33, 1519 (1993); (d) S.-S. Sung, Biophys. J. 66, 1796 (1994); (e) S.-S. Sung, X.-W. Wu, Proteins: Struct., Funct., and Genet. 25, 202 (1996).
16. (a) D. J. Lockhart, P. S. Kim, Science 257, 947 (1992); (b) D. J. Lockhart, P. S. Kim, Science 260, 198 (1993).
17. R. W. Cowgill, Biochim. Biophys. Acta 168, 431 (1968).
18. (a) D. M. Goodall, R. C. Greenhow, Chem. Phys. Lett. 9, 583 (1971); (b) L. Genberg, F. Heisel, G. McLendon, R. J. D. Miller, J. Phys. Chem. 91, 5521 (1987); (c) P. A. Anfinrud, C. Han, R. M. Hochstrasser, Proc. Natl. Acad. Sci. USA 86, 8387 (1989).
19. (a) K. R. Shoemaker, P. S. Kim, E. J. York, J. M. Stewart, R. L. Baldwin, Nature 326, 563 (1987); (b) J. M. Scholtz, S. Marqusee, R. L. Baldwin, E. J. York, J. M. Stewart, M. Santoro, D. W. Bolen, Proc. Natl. Acad. Sci. USA 88, 2854 (1991); (c) J. M. Scholtz, H. Qian, E. J. York, J. M. Stewart, R. L. Baldwin, Biopolymers 31, 1463 (1991).
20. (a) B. H. Zimm, J. K. Bragg, J. Chem. Phys. 31, 526 (1959); (b) S. Lifson, A. Roig, J. Chem. Phys. 34, 1963 (1961); (c) D. Poland, H. A. Scheraga, in Theory of Helix-Coil Transitions in Biopolymers, Academic Press, New York, 1970.
21. (a) E. K. Bradley, J. F. Thomason, F. E. Cohen, P. A. Kosen, I. D. Kuntz, J. Mol. Biol. 215, 607 (1990); (b) S. M. Miick, A. P. Todd, G. L. Millhauser, Biochemistry 30, 9498 (1991); (c) A. Chakrabartty, J. A. Schellman, R. L. Baldwin, Nature 351, 586 (1991); (d) M. I. Liff, P. C. Lyu, N. R. Kallenbach, J. Am. Chem. Soc. 113, 1014 (1991); (e) C. A. Rohl, R. L. Baldwin, Biochemistry 33, 7760 (1994).
22. P. A. Thompson, W. A. Eaton, J. Hofrichter (manuscript in preparation).
Biophysical and Structural Analysis of Human Acidic Fibroblast Growth Factor
Michael Blaber, Daniel H. Adamek, Aleksandar Popovic and Sachiko I. Blaber
Institute of Molecular Biophysics and Department of Chemistry, Florida State University, Tallahassee, FL 32306-3015
I. Introduction
II. Materials and Methods
A. Expression and Purification of Human aFGF
B. Calorimetric Analysis
III. Results
A. Purification of Human aFGF
B. Calorimetric Analysis
IV. Discussion
A. Calorimetric Analysis
Acknowledgments
References
I. Introduction
Acidic fibroblast growth factor (aFGF) is one of nine known members of the FGF family (1, 2, 3, 4). It is the only member which is able to bind with high affinity to all four characterized FGF receptors (FGFRs), and variants produced by alternative mRNA splicing (5). Since expression of the various FGFRs is distributed over a wide variety of cell types, including cells of mesodermal and ectodermal origin, aFGF is probably one of the broadest specificity mitogens known. FGFs have also been termed "heparin binding" growth factors due to their binding specificity for heparin and heparan proteoglycans (6, 7). Complexation with heparin has been demonstrated to protect aFGF from inactivation by heat, acid (8), proteolysis (9) and oxidation (10). Thermal inactivation appears to be a physiologically relevant phenomenon. Circular dichroism and differential calorimetric studies have suggested that the thermal transition midpoint (Tm) may be near to physiological temperature, and that interaction with heparin can stabilize aFGF by some 20 °C (11). In addition to an apparently low thermal stability, FGF appears to face additional problems in maintaining its native, functional structure. Human aFGF contains three cysteine residues and the related basic FGF (bFGF) contains four cysteines. These residues are present in the active protein as free cysteine residues and oxidation, to form either inter- or intra-chain disulfide bonds, has been demonstrated to inactivate the protein
(10). Mutation of these residues has demonstrated that they are not functionally important and substitution by serine can extend the in-vitro protein half-life considerably (12, 13). Although this increase in half-life has been interpreted as the result of stabilization of the structure, there is no evidence for this in the formal sense, i.e. that the mutation has increased the Tm value. An alternative interpretation is that a disulfide mediated irreversible denaturation pathway has been effectively eliminated. Formulation studies of aFGF have identified yet another contribution to inactivation, namely irreversible aggregation of the unfolded state. Thermal denaturation studies indicate that after unfolding, the protein aggregates and precipitates (14). This does not appear to be related to the formation of mixed disulfide bonds, but is instead a non-covalent association of the protein while in the unfolded state. Unfolded, or partially folded, forms of aFGF have very low solubility and aggregate irreversibly (15). Formulations which minimize or postpone aggregation have been interpreted as stabilizing the structure (14). Again, in the formal sense, this may not be the case. It could very well be that useful formulation additives are able to solubilize the unfolded state without any influence upon the Tm. In any case, the FGFs (and particularly aFGF) may, in fact, utilize stability as a regulatory mechanism. This is achieved by combining inherently low thermal stability with irreversible denaturation (both covalent and non-covalent in origin). Furthermore, as a true regulation mechanism, under specific circumstances (e.g. in the presence of heparin) the stability (Tm) can be significantly increased. While this increase in stability would not necessarily have any effect upon the irreversible mechanisms, at physiological temperatures it would effectively minimize the fraction of the protein population which would be in the unfolded state at any given time. 
Thus, the irreversible denaturation mechanisms would occur at a significantly lower effective rate. The relationship between the reversible and irreversible denaturation pathways for human aFGF is diagrammed in figure 1.
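The argument that a higher Tm lowers the unfolded population at physiological temperature can be illustrated numerically. The following is a rough sketch under strong assumptions: a reversible two-state equilibrium and a temperature-independent unfolding enthalpy, neither of which is established for aFGF, whose denaturation is largely irreversible. The Tm and calorimetric enthalpy values are those listed in Table I of this chapter for the buffer without added ions and with 10 mM sulfate; sulfate is used here only as a stand-in for a stabilizing ligand, and the function name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987e-3  # kcal / (mol K)


def fraction_unfolded(temp_c, tm_c, delta_h_kcal):
    """Two-state fraction unfolded at temp_c.

    Assumes a temperature-independent enthalpy (no heat capacity term),
    so that ln K = (dH / R) * (1/Tm - 1/T) with K = 1 at Tm.
    """
    temp_k = temp_c + 273.15
    tm_k = tm_c + 273.15
    ln_k = (delta_h_kcal / GAS_CONSTANT_KCAL) * (1.0 / tm_k - 1.0 / temp_k)
    k_unfold = math.exp(ln_k)
    return k_unfold / (1.0 + k_unfold)


# Tm (deg C) and calorimetric enthalpy (kcal/mol) as listed in Table I.
low_tm = fraction_unfolded(37.0, 35.2, 61.0)   # no added phosphate or sulfate
high_tm = fraction_unfolded(37.0, 46.2, 72.0)  # with 10 mM sulfate

print(f"Fraction unfolded at 37 C, Tm 35.2 C: {low_tm:.2f}")
print(f"Fraction unfolded at 37 C, Tm 46.2 C: {high_tm:.2f}")
```

Under these assumptions the unfolded fraction at 37 °C falls from well over half to a few percent when the Tm rises by about 11 °C, which is the qualitative point of the paragraph above; the absolute numbers should not be read as measured populations.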
[Figure 1 scheme labels: Native State (active); Denatured; low Tm (absence of heparin); high Tm (presence of heparin); aggregation of unfolded state; Precipitation; oxidation of cys residues.]
98% pure material.
B. Calorimetric Analysis
Calorimetric data for human aFGF at neutral pH (50 mM HEPES, 0.5 mM EDTA and 2 mM DTT), with and without the addition of 10 mM phosphate or sulfate ion, are listed in table I. Also listed in this table are the calorimetric data for the addition of 0.6 M GuHCl to phosphate buffered saline (plus 0.5 mM EDTA and 2.0 mM DTT) in the presence of 10 mM (NH4)2SO4.

Table I. Thermodynamic parameters of unfolding for human aFGF in the presence of phosphate and sulfate ions. Also listed is the effect of 0.6 M guanidine hydrochloride on the thermodynamic parameters of unfolding.

Sample Buffer                                            Tm (°C)   ΔHcal (kcal/mol)   ΔHvH (kcal/mol)   Reversibility (%)
50 mM HEPES, 0.5 mM EDTA, 2.0 mM DTT, pH 7.0             35.2      61                 98                0
  +10 mM NaH2PO4                                         40.9      67                 143               0
  +10 mM (NH4)2SO4                                       46.2      72                 160               0
20 mM NaH2PO4, 0.15 M NaCl, 0.5 mM EDTA, 2.0 mM DTT,
10 mM (NH4)2SO4, pH 7.3                                  46.2      86                 144               0
  +0.6 M GuHCl                                           46.0      66                 67                88
IV. Discussion
A. Calorimetric Analysis
With the exception of the sample containing 0.6 M GuHCl (discussed below) all DSC samples exhibited irreversible denaturation, as judged by an absence of an endotherm on the second run, even in the presence of DTT. Furthermore, all samples upon removal from the calorimeter were opaque, indicating that precipitation had occurred during or after denaturation. A denaturation endotherm for human aFGF at neutral pH is shown in figure 2. Initial studies at various pH values suggest that the thermodynamic parameters do not vary much over the range pH 7.0 to 8.0; thus, this endotherm is representative of the physiological pH range. The Tm of human aFGF at pH 7.0 is approximately 35 °C, a value similar to that reported by Middaugh and coworkers (11). The profile of the endotherm indicates that a significant fraction of the protein is actually unfolded at physiological temperature. In practical terms, this information also suggests that yields from fermentation would be expected to be low unless the temperature upon induction is lowered from the typical value of 37 °C. The ΔHcal is 61 kcal/mol and ΔHvH is 98 kcal/mol. Normally, a ΔHvH/ΔHcal ratio greater than 1.0 is indicative of a protein which is present in a multimeric state in solution. However, there is no evidence for stable multimer formation of aFGF. The most likely explanation for the observed ΔHvH/ΔHcal ratio is that it is related to the associated aggregation and precipitation under these conditions.
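The enthalpy ratio discussed above can be tabulated for every condition in Table I. This is a small illustrative sketch; the enthalpies are transcribed from Table I, while the dictionary keys and the function name are hypothetical labels introduced here.

```python
# (calorimetric enthalpy, van't Hoff enthalpy) in kcal/mol, transcribed from Table I.
TABLE_I_ENTHALPIES = {
    "HEPES, pH 7.0": (61.0, 98.0),
    "HEPES + 10 mM phosphate": (67.0, 143.0),
    "HEPES + 10 mM sulfate": (72.0, 160.0),
    "PBS + 10 mM sulfate, pH 7.3": (86.0, 144.0),
    "PBS + 10 mM sulfate + 0.6 M GuHCl": (66.0, 67.0),
}


def vant_hoff_to_calorimetric_ratio(delta_h_cal, delta_h_vh):
    """Return the van't Hoff to calorimetric enthalpy ratio.

    A ratio near 1.0 is what a simple two-state transition would give.
    """
    return delta_h_vh / delta_h_cal


ratios = {
    condition: vant_hoff_to_calorimetric_ratio(cal, vh)
    for condition, (cal, vh) in TABLE_I_ENTHALPIES.items()
}

for condition, ratio in ratios.items():
    print(f"{condition}: {ratio:.2f}")
```

Only the sample containing 0.6 M GuHCl, which is also the only reversible one, gives a ratio close to 1; the irreversible samples all give ratios well above 1.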
[Figure 2 plot: DSC endotherms versus temperature (°C). Legend values (Tm, Hcal, HvH): Control 35.2, 6.14E4, 9.80E4; 10 mM Pi 40.9, 6.72E4, 1.43E5; 10 mM SO4 46.2, 7.20E4, 1.60E5.]
Figure 2. DSC denaturation endotherms for human aFGF in 50 mM HEPES, 0.5 mM EDTA, 2 mM DTT, pH 7.0 (short dashed line). Overlaid onto this plot are the endotherms for the same conditions but with the addition of either 10 mM NaH2PO4 (long dashed line) or 10 mM (NH4)2SO4 (solid line).
The effects of either phosphate or sulfate ion on the stability of human aFGF at neutral pH are shown in figure 2. The addition of 10 mM phosphate ion increases the Tm by 5.7 °C to 40.9 °C, and the presence of 10 mM sulfate ion increases the Tm by 11.0 °C to 46.2 °C. The structure of human aFGF was solved with crystals grown in the presence of 10 mM (NH4)2SO4 and 20 mM phosphate buffer (17). A region of positive density was observed on the surface of the molecule near the residues asparagine 18, lysine 113 and lysine 118 (figure 3). This density was interpreted as an ordered sulfate ion (17). Near this region are additional basic residues including lysine 112, arginine 116, and arginine 122. Thus, this region can be described as a clustering of like-charged (i.e. basic) residues. In the unfolded state these residues are separated from one another along the polypeptide chain. Thus, due to charge repulsion, they may actually contribute to instability of the native structure, and the introduction of an appropriate counter ion (e.g. sulfate) stabilizes the structure. The lack of reversibility, and the presence of precipitation, makes thermodynamic analysis of aFGF particularly challenging. Precipitation in the presence of DTT indicates that precipitation is not dependent upon the formation of mixed disulfides. Structural analysis of human aFGF (17) shows that the three free cysteine residues are located at solvent inaccessible positions (figure 4). Thus, formation of mixed disulfides would be expected to destabilize the protein because a) structural changes would be required to expose the cysteines for oxidation and b) covalent adducts of the cysteine residues would have to be tolerated within the packing constraints of the interior of the protein for the native state to be adopted.
[Figure 3 stereo image; residue labels: Lys 112, Arg 122.]
Figure 3. X-ray crystal structure of human aFGF in the region of lysine 118. Shown is an Fobs-Fcalc difference density map (phases from the model), contoured at 4 σ, into which a sulfate ion has been built (17). The region around this site contains several other basic residues, including lysine 112, lysine 113, arginine 122 and lysine 128.
Figure 4. Stereo Cα trace of human aFGF (17) showing the locations of the three free cysteine residues at positions 16, 83 and 117.
The addition of GuHCl had little effect upon either the stability or the reversibility of thermal denaturation until a concentration of approximately 0.6 M. At this concentration the reversibility of the thermal denaturation went from 0% to 88%, as judged by a comparison of ΔHcal values for repetitive scans (figure 5). The addition of this amount of GuHCl did not appear to significantly destabilize the protein, as judged by the similar Tm value in comparison to the sample in the absence of GuHCl (46.0 °C versus 46.2 °C). Furthermore, the ratio of the van't Hoff to the calorimetric enthalpy was much closer to unity (table I). In comparison to the DSC analysis in the absence of GuHCl, the effect was primarily upon the apparent van't Hoff enthalpy. Thus, for those DSC analyses demonstrating aggregation and precipitation, the calorimetric enthalpy is the more reliable value. How is reversibility of folding achieved by the addition of a relatively small amount of GuHCl? Since there is almost no change in the Tm, and the calorimetric enthalpy is approximately 80% that of the sample in the absence of GuHCl (table I), it would appear that this amount of GuHCl has little effect upon the native state of the protein. Therefore, the GuHCl appears to be affecting primarily the unfolded state of the protein, i.e. it helps to prevent aggregation of the unfolded state, resulting in reversible folding upon cooling. The discovery that the addition of a relatively small amount of GuHCl can allow reversible denaturation will, for the first time, allow accurate determination of the thermodynamic parameters of unfolding for human aFGF. We are currently constructing a series of alanine and serine mutants at the three cysteine residues in human aFGF. DSC analyses of these mutants will allow the determination of their specific contribution to stability, separate from their effects upon irreversible denaturation.
[Figure 5 plots: panels A and B, DSC endotherms versus Temperature (°C). Panel A legend: Tm 46.23, Hcal 8.58E4, HvH 1.44E5.]
Figure 5. DSC denaturation endotherms for human aFGF in 20 mM NaH2PO4, 0.15 M NaCl, 0.5 mM EDTA, 2.0 mM DTT, 10 mM (NH4)2SO4, pH 7.3. Panel A shows repetitive scans (the first scan is on the left and the second on the right). Panel B shows repetitive scans, as in panel A, but with the addition of 0.6 M guanidine hydrochloride to the buffer.
Acknowledgments The authors would like to thank Drs. Ken Thomas, C. Russell Middaugh and John Brandts for helpful discussions. This work was supported in part by the Markey Foundation, Florida State University Council on Research and Creativity, and N.I.H. grant GM54429-01.
A thermodynamic analysis discriminating loop backbone conformations
Jean-Luc Pellequer and Shu-wen W. Chen
Department of Biochemistry and Molecular Biophysics, Columbia University, 630 W 168th Street, New York, NY 10032
I. INTRODUCTION
Antibodies are soluble molecules that specifically recognize antigens through their antigen-binding sites, called complementarity determining regions (CDRs). CDRs were originally characterized as regions having a high variability in amino-acid sequence (Kabat et al., 1977). The X-ray crystal structures of antibodies revealed that CDRs are loops connecting β-strands, located at the extremity of a highly conserved β-barrel fold known as the framework (FR) (Padlan & Davies, 1975; Amzel & Poljak, 1979; Davies et al., 1990; Barre et al., 1994; Bork et al., 1994). Moreover, X-ray crystal structures of protein antigen-antibody complexes demonstrate that CDRs provide almost all intermolecular contacts with the antigenic determinant or epitope (Sheriff et al., 1987; Padlan et al., 1989; Fischmann et al., 1991; Herron et al., 1991; Tulip et al., 1992; Chitarra et al., 1993; Prasad et al., 1993; Ban et al., 1994; Bhat et al., 1994; Braden et al., 1994; Malby et al., 1994; Braden et al., 1996). In order to understand the intimate details of the specificity of the recognition process between antibodies and antigens, the three-dimensional structures of these molecules are required. Although X-ray diffraction studies provide an accurate description of molecules at the atomic level, they are time-consuming to undertake. Because of the very high structural similarity of the framework conformation, attempts have been made to model new antibody conformations using homology modeling techniques. Moreover, these modeling experiments provide a basis for integrating and testing our understanding of antibody structure. The major challenge in this approach is to search conformational space adequately for the six hypervariable loops or CDRs (three for the light chain and three for the heavy chain) in order to obtain accurate models.
Several methods have been used to model antibody CDRs; they can be divided into two categories: (1) a knowledge-based approach that uses CDRs from known crystal structures of antibodies and (2) an ab initio approach that builds CDR loops. All of these approaches must fulfill one criterion: to identify the conformation that is best adapted to the framework of a current model. Unfortunately, none of the current methods has an appropriate energy function that allows the discrimination of an incorrect from a correct CDR conformation. Such a discrimination is of great significance, especially for methods employing the knowledge-based approach by using canonical structures of CDRs. Indeed, methods relying on CDR structural knowledge identify the class to which the modeled CDR belongs, but do not discriminate which CDR from the known crystal structures of antibodies one should select. Moreover, when critical residues are not present, modeling using canonical structures becomes less efficient (Steipe et al., 1992; Bell et al., 1995). In this paper we report results concerning the development of a complete physical treatment that allows the screening of loop conformations in order to identify the most suitable ones for a particular antibody model. Here, we establish a formalism that allows the computation of the conformational free energies of loops by combining a molecular mechanics treatment of a loop with a continuum treatment of the solvent (Smith & Honig, 1994). We simulate a modeling study by removing the three light chain CDRs from a recently solved crystal structure of an antibody in a bound conformation, namely Fab R4545-11 (R45) (Altschuh et al., 1992; Vix et al., 1993), then inserting loops from our database (Pellequer & Chen, 1996) in their place and calculating the conformational free energy of each conformation. Our results reveal that the loops in the database having the lowest conformational energy are the loops with the smallest RMSD compared to the CDRs of R45. We expect our thermodynamic analysis to be generally useful for antibody modeling.
Present address: Department of Molecular Biology, The Scripps Research Institute, La Jolla, CA 92037.
II. MATERIALS AND METHODS
A. Insertion of a loop from the database into the Fv R45
1. Replacing loop side-chains
Side-chains were substituted to match the sequence of the Fv R45. We set the dihedral angles χ1 and χ2 (Table I) according to the highest probability found in the rotamer library established by Tuffery et al. (1991). Dihedral angles χ3 and χ4 were set to 180°. Consequently, each amino acid side chain displays the same starting conformation.
2. Optimizing the inserted loop
As compared to loop building, the use of a database requires an additional step, which is the insertion of a loop into the antibody framework. For example, we could superimpose residues from the framework onto the flanking residues of each loop (N-1 and C+1). However, this would require an additional constraint on the distances between atoms from residues N-1 and C+1 during the building of the database. Such a constraint is inconsistent with our definition of antibody CDRs (Pellequer & Chen, 1996). Indeed, this constraint could have been adopted only for loops in which residues N-1 and C+1 were in β-conformations in their original molecule. Moreover, a recent study concluded that superimposing flanking peptides of loops onto an antibody framework failed to provide accurate models for CDRs (Tramontano & Lesk, 1992). An alternative method is a docking procedure, e.g. Carlacci & Englander (1993). However, this method is computationally costly and introduces an additional variable, which is the identification of the best docking orientation. Although we could have used the conformational free energy calculation to identify the best docking orientation, we found experimentally that it is more appropriate to start with a simple insertion protocol, namely a least squares superimposition of the backbone atoms (N, CA, C) of each loop onto those of the native crystal structure.
Table I. Original and rotamer dihedral angles of light chain CDRs from R45
                 χ1 (degrees)            χ2 (degrees)
Residue      Fab R45    Rotamer      Fab R45    Rotamer
CDRL1
SER26          58.81        65           -          -
GLN27         166.61       -63       -164.22       178
ASP28        -171.73       -70        132.32       -32
ILE29          59.22       -62        168.06       163
SER30         -63.97        65           -          -
THR31         -83.74       -61           -          -
TYR32         -46.51       -64        -46.33       102
CDRL2
TYR50        -144.37       -64        -23.9        102
THR51        -155.63       -61           -          -
SER52          65.03        65           -          -
ARG53        -168.13      -176       -176.31       156
LEU54         -74.7        -62         62.12       170
ARG55         100.98      -176        110.83       156
SER56         -54.98        65           -          -
CDRL3
GLY91            -          -            -          -
SER92        -176.11        65           -          -
ARG93         -55.36      -176        -60.19       156
ILE94         -60.92       -62        176.79       163
PRO95          35.86        27a       -45.97       -29a
PRO96          25.61        27a       -42.22       -29a
a Values for Pro were obtained from Ponder & Richards (1987).
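The insertion protocol described before Table I, a least squares superimposition of the loop backbone atoms (N, CA, C) onto those of the native structure, can be illustrated with the standard Kabsch algorithm. This is a hedged sketch rather than the procedure actually run by the authors (the text does not say which program performed the fit); the function name and the use of NumPy are assumptions made for illustration.

```python
import numpy as np


def kabsch_superpose(mobile, target):
    """Least squares fit of `mobile` onto `target` (both N x 3 arrays).

    Returns the transformed copy of `mobile` and the RMSD after fitting.
    Rows must correspond atom by atom (e.g. N, CA, C of each residue).
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    if mobile.shape != target.shape or mobile.shape[1] != 3:
        raise ValueError("coordinate arrays must both have shape (N, 3)")

    # Remove the translation by centering both sets on their centroids.
    mobile_center = mobile.mean(axis=0)
    target_center = target.mean(axis=0)
    p = mobile - mobile_center
    q = target - target_center

    # Covariance matrix and its singular value decomposition.
    h = p.T @ q
    u, _, vt = np.linalg.svd(h)

    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0.0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    fitted = (rotation @ p.T).T + target_center
    rmsd = float(np.sqrt(((fitted - target) ** 2).sum(axis=1).mean()))
    return fitted, rmsd
```

The RMSD returned by the fit is the same kind of quantity the Introduction uses to compare database loops with the CDRs of R45, although which atoms enter that comparison is not restated here.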
Explicit hydrogens were built on each loop when it was inserted into the Fv R45. We used the HBUILD command (Brünger & Karplus, 1988) provided with the Xplor program (Version 3.0) (Brünger, 1992). The Xplor topology and parameter files were TOPALLH6x.PRO and PARMALL3x.PRO, respectively. Sixty cycles of conjugate gradient minimization (Powell method) were then carried out while fixing all atoms of the Fv and all heavy atoms of the loop. We used a dielectric constant of 1 and a non-bonded cut-off of 9 Å. The cut-on and cut-off of both the van der Waals switch and the electrostatic shift functions were 6.5 Å and 8 Å, respectively. The total energy used in this evaluation includes bond distance, valence angle, dihedral and improper angle, van der Waals, and electrostatic energies. Then, we carried out two types of optimization (including hydrogens and heavy atoms): (1) side chains only and (2) all atoms. (1) Side chain conformations were minimized by 600 cycles of conjugate gradient minimization (Powell method) and saved. We observed that 600 cycles of minimization allow convergence in a reasonable time. (2) Starting from the conformations in (1), we applied 600 cycles of conjugate gradient minimization to all atoms of the loop.
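To make the role of the cut-on and cut-off distances concrete, the sketch below evaluates a switching function and a shifting function between 6.5 Å and 8 Å. The text does not give the functional forms, so the commonly used CHARMM-style expressions are assumed here; the function names are hypothetical and this is not Xplor code.

```python
def vdw_switch(r: float, r_on: float = 6.5, r_off: float = 8.0) -> float:
    """Assumed CHARMM-style switching factor for van der Waals energies.

    Equals 1 up to r_on, decays smoothly to 0 at r_off, and is 0 beyond.
    """
    if r <= r_on:
        return 1.0
    if r >= r_off:
        return 0.0
    numerator = (r_off**2 - r**2) ** 2 * (r_off**2 + 2.0 * r**2 - 3.0 * r_on**2)
    return numerator / (r_off**2 - r_on**2) ** 3


def elec_shift(r: float, r_off: float = 8.0) -> float:
    """Assumed CHARMM-style shifting factor for electrostatic energies.

    Scales the Coulomb term so that it vanishes smoothly at r_off.
    """
    if r >= r_off:
        return 0.0
    return (1.0 - (r / r_off) ** 2) ** 2


# Scaling factors at a few distances (in Å) around the switching region.
for distance in (6.0, 6.5, 7.25, 8.0):
    print(distance, round(vdw_switch(distance), 4), round(elec_shift(distance), 4))
```

Both factors multiply the pairwise energy term, so interactions fade out smoothly instead of being truncated abruptly at the cut-off.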
3. Optimization of the loop closure
For both minimization procedures described above, loop closure was handled identically: the peptide bond atoms located at both extremities of a loop were allowed to move during the minimizations (even when only side chains were optimized). Only four atoms per extremity were needed for loop closure because of the restricted distances between N and C termini imposed when the database was established (Pellequer & Chen, 1996).
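B. Relative stability of loops
We assessed the stability of loops by evaluating their relative conformational free energy compared with the CDRs from Fv R45. The conformational free energy in solution can be described in terms of a thermodynamic cycle and equation 1.
\[
\begin{array}{ccc}
\mathrm{Fab}^{gas}_{nat} & \xrightarrow{\;\Delta G^{gas}_{conf}\;} & \mathrm{Fab}^{gas}_{mod} \\
\downarrow \Delta G^{solv}_{nat} & & \downarrow \Delta G^{solv}_{mod} \\
\mathrm{Fab}^{sol}_{nat} & \xrightarrow{\;\Delta G^{sol}_{conf}\;} & \mathrm{Fab}^{sol}_{mod}
\end{array}
\]
\[
\Delta G^{sol}_{conf} = \Delta G^{gas}_{conf} + \Delta\Delta G_{solv} \tag{1}
\]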
The conformational free energy in the gas phase is obtained from a molecular mechanics force field (CHARMm): it includes internal coordinate energies as well as non-bonded interactions (van der Waals and electrostatic). The solvation free energy for transferring a molecule from the gas phase to the aqueous phase is calculated with a continuum model of the solvent (Jean-Charles et al., 1991; Honig et al., 1993). Addition of the gas phase conformational free energy to the solvation free energy gives the conformational free energy in solution. It should be stressed that in this thermodynamic cycle there is no double counting of interactions due to the combination of a continuum model and a molecular force field (Smith & Honig, 1994). The solvation free energy difference between modeled and native loops is
\[
\Delta\Delta G_{solv} = \Delta G^{solv}_{mod} - \Delta G^{solv}_{nat} \tag{2}
\]
The solvation free energy change can be written as
\[
\Delta G^{solv} = \Delta G^{gas \to water}_{elec} + \Delta G^{gas \to water}_{np} \tag{3}
\]
where \Delta G^{gas \to water}_{elec} is the difference in electrostatic free energy of transferring the Fv from gas to water, obtained from finite difference Poisson-Boltzmann calculations (Delphi Version 3.0; Sridharan et al.; Nicholls & Honig, 1991), which is the difference between the reaction field energy in vacuum and in water (Gilson & Honig, 1988; Jean-Charles et al., 1991). \Delta G^{gas \to water}_{np} is the transfer free energy of an uncharged molecule of the same size and shape as the Fv from gas to water. It is commonly assumed that \Delta G^{gas \to water}_{np} is proportional to the total accessible surface area A of the Fv (equation 4):
\[
\Delta G^{gas \to water}_{np} = \gamma A \tag{4}
\]
where γ is the vacuum-to-water transfer free energy coefficient. In our study we used a value of 5 cal/mol/Å² as determined from solubility experiments (Ben-Naim & Marcus, 1984). The reaction field energy in vacuum and in water was calculated with the Delphi program using a cubic grid of 129 points per side, three focusing runs per calculation (24%, 48% and 96%), and a dipolar boundary condition for the first run. The final resolution was 2 grid points per Å, which has been shown to be sufficient for convergence. At such a resolution, the relative energy is almost insensitive to the orientation of the molecule inside the grid (Smith & Honig, 1994). The internal dielectric constant was 2 and the external dielectric constant was 80 for water. In the gas phase calculation, the external dielectric constant was 1. We used the newly derived PARSE parameters for radii and atomic charges (Sitkoff et al., 1994), as these parameters have been optimized for accurate reproduction of the hydration free energy of amino acids upon transfer from the gas phase to water. These PARSE parameters allow the assignment of particular values for N and C terminal residues as well as for disulfide bridges. Only Asp, Glu, Lys, and Arg were charged.
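The bookkeeping of equations 1 to 4 can be written out in a few lines. The sketch below is illustrative only: the function names are hypothetical, all numerical inputs in the example are invented placeholders rather than results from this work, and the sign convention for the electrostatic term (reaction field energy in water minus that in vacuum) is an assumption about how the two Delphi energies are combined.

```python
GAMMA_CAL_PER_MOL_A2 = 5.0  # transfer coefficient quoted in the text


def nonpolar_term(area_a2: float) -> float:
    """Equation 4: gamma times accessible surface area, in kcal/mol."""
    return GAMMA_CAL_PER_MOL_A2 * area_a2 / 1000.0


def solvation_free_energy(rf_water: float, rf_vacuum: float, area_a2: float) -> float:
    """Equation 3: electrostatic plus nonpolar transfer terms (kcal/mol).

    Assumes the electrostatic term is the reaction field energy in water
    minus the reaction field energy in vacuum.
    """
    return (rf_water - rf_vacuum) + nonpolar_term(area_a2)


def conformational_free_energy_in_solution(
    dg_conf_gas: float, dg_solv_mod: float, dg_solv_nat: float
) -> float:
    """Equations 1 and 2: gas phase term plus the solvation difference."""
    ddg_solv = dg_solv_mod - dg_solv_nat
    return dg_conf_gas + ddg_solv


# Placeholder numbers (kcal/mol and square angstroms), for illustration only.
dg_solv_nat = solvation_free_energy(-1200.0, -400.0, 10000.0)
dg_solv_mod = solvation_free_energy(-1190.0, -395.0, 10000.0)
dg_conf_sol = conformational_free_energy_in_solution(12.0, dg_solv_mod, dg_solv_nat)
```

A modeled loop is then ranked by its value of the conformational free energy in solution relative to the native CDR, for which the value is zero by construction.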
III. RESULTS
Two ways of optimizing the modeled CDR loops were tested: (1) only the side chains of the loops were minimized, and (2) all atoms of the loops were minimized. In the first case, the backbone was kept fixed in the original loop conformation. In the second case, all atoms were minimized in order to obtain a "clash-free" loop conformation. To reduce computational requirements, only a subset of the lowest conformational free energy loops [...]
\[
m_{eff,j} = \langle m \rangle_{nf,j} - \langle m \rangle_{f,j} \tag{6}
\]
As in the case of the van't Hoff enthalpy per residue, the effective m values are statistical quantities equal to the difference between the average m value for the conformations in which the residue is not folded and the average m value for the conformations in which that residue is folded. At low denaturant concentrations the two averages are about equal and cancel each other to a large extent. This explains why it is possible for some residues to exhibit high stability constants or protection factors and simultaneously m values close to zero (Hilser & Freire, 1996a; Hilser et al., 1996). This behavior should be contrasted with that expected for the logarithm of a two-state equilibrium constant, which should be linear with denaturant concentration.
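The effective m value per residue described above is a difference of two sub-ensemble averages, which can be made concrete with a toy ensemble. This is a sketch under stated assumptions: each average is taken as a probability-weighted mean over the states in the relevant sub-ensemble, the state tuples and function name are hypothetical, and the three-state ensemble is invented for illustration.

```python
def effective_m(states, residue):
    """Effective m value of one residue.

    `states` is an iterable of (probability, m_value, folded_residues),
    where folded_residues is the set of residues folded in that state.
    Returns <m> over states where the residue is not folded minus <m>
    over states where it is folded, each weighted by state probability.
    """
    weight_f = weight_nf = 0.0
    sum_f = sum_nf = 0.0
    for probability, m_value, folded in states:
        if residue in folded:
            weight_f += probability
            sum_f += probability * m_value
        else:
            weight_nf += probability
            sum_nf += probability * m_value
    if weight_f == 0.0 or weight_nf == 0.0:
        raise ValueError("residue must be folded in some states and unfolded in others")
    return sum_nf / weight_nf - sum_f / weight_f


# Invented three-state ensemble: native, partially unfolded, fully unfolded.
toy_ensemble = [
    (0.7, 0.0, {1, 2}),
    (0.2, 1.0, {1}),
    (0.1, 2.0, set()),
]
m_eff_residue_1 = effective_m(toy_ensemble, 1)
m_eff_residue_2 = effective_m(toy_ensemble, 2)
```

In this toy case residue 1 is unfolded only in the fully unfolded state, so its effective m value is larger than that of residue 2, which also unfolds in the partially unfolded state.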
III. Hydrogen Exchange Protection Factors. While the residue stability constants are purely thermodynamic quantities defined for all residues, the protection factors also contain non-thermodynamic contributions and are defined only for a subset of residues. For example, proline residues lack the amide group and therefore are not included. From a statistical standpoint, the protection factor for any given residue j can be defined as the ratio of the sum of the probabilities of the states in which residue j is closed, to the sum of the probabilities of the states in which residue j is open:
\[
PF_j = \frac{\sum\limits_{\text{(states with residue } j \text{ closed)}} P_i}{\sum\limits_{\text{(states with residue } j \text{ open)}} P_i} = \frac{P_{closed,j}}{P_{open,j}} \tag{7}
\]
It is obvious that not all residues that are folded are protected from exchange, since they can be exposed to the solvent in the native state or become exposed because adjacent or complementary surfaces become unfolded. The statistical definition of the protection factors has the same form as that of the stability constants (equation 1) and can be expressed in terms of the folding probabilities as follows:
\[
PF_j = \frac{P_{f,j} - P_{f,xc,j}}{P_{nf,j} + P_{f,xc,j}} \tag{8}
\]
where the correction term P_f,xc,j is the sum of the probabilities of all states in which residue j is folded, yet exchange competent. It is evident that the hydrogen exchange protection factors, PF_j, are equal to the stability constants per residue, κ_f,j, only when the P_f,xc,j terms are small.
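The statistical definition of the protection factor, including the correction for states in which a residue is folded yet exchange competent, can be evaluated directly from a list of states. The sketch below is an illustration, not the COREX implementation: the state representation and function name are hypothetical, and the three-state ensemble is invented.

```python
import math


def protection_factor(states, residue):
    """Protection factor of one residue from an ensemble of states.

    `states` is an iterable of (probability, folded, exchange_competent):
    `folded` is the set of residues folded in that state, and
    `exchange_competent` is the set of folded residues whose amide is
    nevertheless exposed to solvent in that state.
    """
    p_f = p_nf = p_f_xc = 0.0
    for probability, folded, exchange_competent in states:
        if residue in folded:
            p_f += probability
            if residue in exchange_competent:
                p_f_xc += probability
        else:
            p_nf += probability
    denominator = p_nf + p_f_xc
    if denominator == 0.0:
        return math.inf  # never open in this ensemble
    return (p_f - p_f_xc) / denominator


# Invented ensemble: residue 2 is folded but solvent exposed in the native state.
toy_states = [
    (0.90, {1, 2}, {2}),
    (0.08, {1}, set()),
    (0.02, set(), set()),
]
pf_residue_1 = protection_factor(toy_states, 1)
pf_residue_2 = protection_factor(toy_states, 2)
```

Residue 1 is protected with a factor near 49, while residue 2, although folded in 90% of the ensemble, shows no protection because its amide is exposed whenever it is folded.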
The most common situations in which a residue is folded but exposed to the solvent occur when: 1) the amide group of the residue is exposed in the native state; and 2) the amide group of the residue becomes exposed by being located in a region of the protein that is structurally complementary to an unfolded region. Of course, amide protons that exchange via different mechanisms (e.g. solvent penetration) will not be accounted for by this formalism. Structural thermodynamic analysis of the protection factors of several globular proteins (staphylococcal nuclease, hen egg white lysozyme, equine lysozyme, turkey ovomucoid third domain, BPTI) (Hilser & Freire, 1996a; Hilser et al., 1996) suggests that for most residues that show protection the contribution of P_f,xc,j is small and the protection factors are similar to the stability constants.
Finally, the prediction of hydrogen exchange protection factors requires knowledge of the limiting exchange rates that can be measured under a given set of experimental conditions. This constraint sets a limit on the magnitude of the protection factors that can be determined for a given amino acid in the sequence. It is a purely experimental constraint, the magnitude of which depends on the actual experimental setup (Radford et al., 1992a). In our calculations, the expected exchange rates for each amide were estimated by using the intrinsic exchange rates calculated according to the method of Bai et al. (1993).
IV. The COREX Algorithm
Analysis of protein equilibrium in terms of the formalism described above involves an approximation of the ensemble of conformational states available to a protein. In our laboratory, the ensemble of partially folded states is approximated with the computer by using the high resolution structure as a template. In the COREX algorithm (Hilser & Freire, 1996a; Hilser & Freire, 1996b) the entire protein is considered as being composed of different folding units and partially folded states are generated by folding and unfolding those units in all possible combinations.
The division of the protein into a given number of folding units is called a partition. In order to maximize the number of distinct partially folded states, different partitions are included in the analysis. Each partition is defined by placing a block of windows over the entire sequence of the protein. The folding units are defined by the location of the windows irrespective of whether or not they coincide with specific secondary structure elements. By sliding the entire block of windows one residue at a time, different partitions of the protein are obtained. For two consecutive partitions the first and last amino acids of each folding unit are shifted by one residue. This procedure is repeated until the entire set of partitions has been exhausted. Usually, 50,000 - 150,000 states are generated for a typical globular protein.
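The partition and enumeration scheme just described can be sketched in a few lines. This is a simplified illustration and not the published COREX code: the function names are hypothetical, a fixed window size is assumed, and the treatment of the shortened units at the sequence ends is an assumption.

```python
from itertools import product


def partition_units(n_residues: int, window: int, offset: int):
    """Split residues 0..n_residues-1 into consecutive folding units.

    The block of windows is shifted by `offset` residues; the residues
    before the first full window form a shorter unit of their own.
    """
    starts = sorted({0} | set(range(offset, n_residues, window)))
    ends = starts[1:] + [n_residues]
    return [list(range(start, end)) for start, end in zip(starts, ends)]


def enumerate_states(n_residues: int, window: int):
    """Return the set of distinct states over all partitions.

    Each state is a frozenset of folded residues, obtained by folding
    and unfolding the units of a partition in all possible combinations.
    """
    states = set()
    for offset in range(window):
        units = partition_units(n_residues, window, offset)
        for mask in product((False, True), repeat=len(units)):
            folded = frozenset(
                residue
                for unit, is_folded in zip(units, mask)
                if is_folded
                for residue in unit
            )
            states.add(folded)
    return states


# Toy protein of 6 residues with a window of 3 residues.
toy_states = enumerate_states(6, 3)
```

Because the fully folded and fully unfolded states appear in every partition, collecting states in a set counts them only once; for this toy case the three partitions yield 16 distinct states.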
Each of the states generated by the COREX algorithm is characterized by having some regions folded and some other regions unfolded. There are two basic assumptions in this algorithm: 1) the folded regions in partially folded states are native-like; and 2) the unfolded regions are assumed to be devoid of structure. While these assumptions appear drastic at first, it has been shown that the resulting ensemble accounts well for hydrogen exchange protection patterns, suggesting that non-native-like intermediates have vanishingly small probabilities and do not contribute measurably to the experimental values under normal equilibrium conditions. The Gibbs energy (ΔG) and probability of each state (equation 2) are calculated using a structural parametrization of the folding energetics as described elsewhere (Hilser & Freire, 1996a; Hilser et al., 1996).
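Once a Gibbs energy has been assigned to each state, the probability of each state follows from Boltzmann weighting. The sketch below shows the standard statistical thermodynamic relation; it is assumed, not confirmed by this section, that this matches the form of the equation 2 referred to above. The function name is hypothetical and the Gibbs energies are taken relative to the native state in kcal/mol.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol K)


def state_probabilities(delta_g, temperature_k):
    """Boltzmann probabilities for states with Gibbs energies `delta_g`.

    `delta_g` is a sequence of Gibbs energies (kcal/mol) relative to a
    common reference state; the result sums to 1.
    """
    rt = R_KCAL_PER_MOL_K * temperature_k
    lowest = min(delta_g)  # subtract the minimum for numerical stability
    weights = [math.exp(-(g - lowest) / rt) for g in delta_g]
    partition_function = sum(weights)
    return [w / partition_function for w in weights]


# Two states separated by RT ln 3: populations of 75% and 25%.
temperature = 298.15
gap = R_KCAL_PER_MOL_K * temperature * math.log(3.0)
example_probabilities = state_probabilities([0.0, gap], temperature)
```

These probabilities are the inputs needed by the per-residue sums that define stability constants and protection factors.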
V. The Pattern of Hydrogen Exchange Protection for Staphylococcal Nuclease
Figure 1 shows the predicted and experimental hydrogen exchange protection factors for Staphylococcal Nuclease (SNase).
[Figure 1 plot: two panels plotted against Residue.]
Figure 1. Comparison of predicted and experimental (Loh et al., 1993) protection factors at 37°C. For better comparison, the negative value of the experimental protection factors has been plotted in the figure. Shown at the top of both panels are the corresponding elements of secondary structure (adapted from Hilser & Freire, 1996b).
Inspection of the figure reveals three major protection levels. The highest ln PF_j values correspond to the second and third β strands (residues 21-39), the central residues of the fourth β strand (residues 73-75) and the last portion of the fifth β strand through the second α helix (residues 91-106). Within these regions, higher values are found for the highly hydrophobic clusters Leu36, Leu37, Leu38 and Val39 in β3 and Ala102, Leu103 and Val104 in α2. The second level corresponds to the first β strand and the adjacent turns (residues 10-20), the second half of the first α helix (residues 62-68) along with the beginning of the fourth β strand (residues 71-73), and the region from the loop following the second α helix through the third helix (residues 107-135). The third level corresponds to the amino and carboxyl terminal residues (7-10 and 136-141), the loop region from residue 41 to 53, the first half of the first α helix (residues 54-61) and the loop region defined by residues 77-89.
Of the 49 protected residues, 44 are correctly predicted to exhibit protection. In addition, 62 are correctly predicted to show no protection: 6 are prolines, 26 are solvent accessible, and 30 (residues 9, 10, 35, 41, 44-46, 49, 50, 52, 54, 55, 57-60, 77-80, 83, 85-88, 118, 119, 121, 138, and 139) are predicted to have protection factors below the experimental limit of detection. This relatively large number of residues beyond experimental detection is primarily due to the high temperature (37°C) at which the experiments were performed (Loh et al., 1993). This gives a total of 100 residues (excluding prolines), or 78%, for which the prediction matches the experimental results. Of the 29 mispredictions, the vast majority (24) represent cases in which protection was predicted but not observed. This pattern suggests that many of those residues may indeed be thermodynamically stable but able to exchange by a different mechanism. For those residues that exhibit protection, the average difference between predicted and experimental protection factors, expressed as differences in the apparent free energies per residue, amounts to 0.3 ± 0.6 kcal/mol.
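The residue counts quoted above can be checked with a few lines of arithmetic, and the same snippet shows how a difference in ln PF maps onto an apparent free energy through ΔG = RT ln PF. The counts come from the text; the variable and function names are hypothetical, and the conversion uses the experimental temperature of 37°C.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol K)

protected_total = 49
protected_correct = 44
unprotected_correct = 62  # 6 prolines + 26 solvent accessible + 30 below detection
prolines = 6
mispredicted = 29

# Residues scored once prolines are excluded.
correct_non_proline = protected_correct + unprotected_correct - prolines
scored = correct_non_proline + mispredicted
percent_correct = 100.0 * correct_non_proline / scored

# Split of the mispredictions into missed and over-predicted protection.
missed_protection = protected_total - protected_correct
overpredicted_protection = mispredicted - missed_protection


def apparent_free_energy(ln_pf: float, temperature_k: float = 310.15) -> float:
    """Apparent free energy (kcal/mol) corresponding to a value of ln PF."""
    return R_KCAL_PER_MOL_K * temperature_k * ln_pf


# A difference of one natural log unit at 37 degrees C, in kcal/mol.
one_ln_unit = apparent_free_energy(1.0)
```

The arithmetic reproduces the 100 correctly predicted non-proline residues, the 24 cases of predicted but unobserved protection, and a success rate of 77.5%, which rounds to the 78% quoted.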
VI. Predicted Temperature Dependence of Hydrogen Exchange Protection for Staphylococcal Nuclease
Figure 2 shows the temperature dependence of the predicted individual folding constants for SNase. When the individual ln κ_f,j values of each residue are plotted against the inverse temperature, a noticeable trend emerges. Specifically, there are groups of residues which show identical ln κ_f,j values and identical temperature dependencies, suggesting that these residues define cooperative folding units. These predicted trends agree with experimental results obtained for other proteins and reproduce the general behavior described by Bai et al. (1995).
[Figure 2 plot: x-axis, Temp (°C), 20 to 90; curves labeled β2 β3 β5 α2, 112-122, and 77-89.]
Figure 2. The temperature dependence of the natural logarithm of the apparent residue stability constants. For clarity a single line is shown for groups of residues that exhibit similar behavior.
VII. Predicted Urea Dependence of Hydrogen Exchange Protection for Staphylococcal Nuclease
The urea concentration dependence of the natural logarithm of the apparent residue stability constants is shown in Figure 3.
[Figure 3 plot: x-axis, [Urea]; curves labeled by groups of residues, including β2 β3 β5 α2, 112-122, and 77-89.]
Figure 3. The urea concentration dependence of the natural logarithm of the apparent residue stability constants. For clarity a single line is shown for groups of residues that exhibit similar behavior. These calculations simulate the energetic conditions at 25°C where higher resolution between protection factors is expected (adapted from Hilser & Freire, 1996a).
In general, as the urea concentration increases and the stability of the protein diminishes, the magnitude of the stability constants decreases. At low urea concentration, the rate of decrease is not the same for all residues; several groups of residues with similar stability constants and similar m values can be recognized, as shown in the figure. At increasing urea concentrations the stability constants progressively merge into a single curve characterized by the parameters corresponding to the global unfolding of the protein. This is the same type of behavior observed experimentally for the denaturant dependence of the protection factors (Bai et al., 1995), which have been used to define cooperative folding units or partially unfolded forms (PUFs) (Bai et al., 1995).
As shown in the figure, the β barrel, particularly strands 2, 3 and 5, as well as α-helix 2 define the group of residues with the highest stability constants and m values. These residues define the most stable core of SNase. The unfolding of these residues only occurs by complete unfolding of the protein. α-Helix 1, the first β strand and the loop region defined by residues 112-122 come next, followed by the loop region defined by residues 77-89. The loop between β strand 3 and α-helix 1 (residues 42 - 57) is unstable at all urea concentrations and is not shown in the figure.
VIII. Conclusions
The agreement between predicted and experimental hydrogen exchange protection factors for SNase and other proteins (Hilser & Freire, 1996a; Hilser & Freire, 1996b; Hilser & Freire, 1996c) suggests that the probability distribution of partially folded states generated with the COREX algorithm mimics the general features of the ensemble of conformations under native conditions and that this approach can be used to examine general aspects of the equilibrium ensemble of partially folded states. It has also been shown that this approach accounts for the cooperativity of folding/unfolding transitions and successfully predicts the apparent two-state behavior observed in temperature and denaturant induced folding/unfolding reactions.
Acknowledgments This work was supported by grants RR04328 and GM51362 from the National Institutes of Health.
References
Bai, Y., Milne, J. S., Mayne, L. & Englander, S. W. (1993). Primary structure effects on peptide group hydrogen exchange. Proteins 17, 75-86.
Bai, Y., Sosnick, T. R., Mayne, L. & Englander, S. W. (1995). Protein Folding Intermediates: Native-State Hydrogen Exchange. Science 269, 192-197.
Freire, E. (1995). Thermodynamics of Partly Folded Intermediates in Proteins. Ann. Rev. of Biophys. and Biomolec. Struct. 24, 141-165.
Hilser, V. J. & Freire, E. (1996a). Predicting the Equilibrium Protein Folding Pathway: Structure-Based Analysis of Staphylococcal Nuclease. Proteins, in press.
Hilser, V. J. & Freire, E. (1996b). Structure based calculation of the equilibrium folding pathway of proteins. Correlation with hydrogen exchange protection factors. J. Mol. Biol., in press.
Hilser, V. J. & Freire, E. (1996c). Structure-Based Statistical Thermodynamic Analysis of T4 Lysozyme Mutants: Structural Mapping of Cooperative Interactions. Biophysical Chem.
Hilser, V. J., Gomez, J. & Freire, E. (1996). The Enthalpy Change in Protein Folding and Binding. Refinement of Parameters for Structure Based Calculations. Proteins, in press.
Jacobs, M. D. & Fox, R. O. (1994). Staphylococcal nuclease folding intermediate characterized by hydrogen exchange and NMR spectroscopy. Proc. Natl. Acad. Sci. USA 91, 449-453.
Jennings, P. A. & Wright, P. E. (1993). Formation of a Molten Globule Intermediate Early in the Kinetic Folding Pathway of Apomyoglobin. Science 262, 892-896.
Kim, K.-S., Fuchs, J. A. & Woodward, C. K. (1993). Hydrogen exchange identifies native-state motional domains important in protein folding. Biochemistry 32, 9600-9608.
Kim, K.-S. & Woodward, C. (1993). Protein internal flexibility and global stability: Effect of urea on hydrogen exchange rates of bovine pancreatic trypsin inhibitor. Biochemistry 32, 9609-9613.
Kuwajima, K., Nitta, K., Yoneyama, M. & Sugai, S. (1976). Three-state denaturation of α-lactalbumin by guanidine hydrochloride. J. Mol. Biol. 106, 359-373.
Loh, S. N., Prehoda, K. E., Wang, J. & Markley, J. L. (1993). Hydrogen exchange in unligated and ligated staphylococcal nuclease. Biochemistry 32, 11022-11028.
Morozova, L. A., Haynie, D. T., Arico-Muendel, C., Van Dael, H. & Dobson, C. M. (1995). Structural Basis of the Stability of a Lysozyme Molten Globule. Nature Structural Biology 2, 871-875.
Privalov, P. L. (1979). Stability of Proteins: Small Globular Proteins. Adv. Protein Chem. 33, 167-239.
Privalov, P. L. (1982). Stability of Proteins: Proteins Which Do Not Present a Single Cooperative System. Adv. Protein Chem. 35, 1-104.
Radford, S. E., Buck, M., Topping, K. D., Dobson, C. M. & Evans, P. A. (1992a). Hydrogen exchange in native and denatured states of hen egg-white lysozyme. Proteins 14, 237-248.
Radford, S. E., Dobson, C. M. & Evans, P. A. (1992b). The Folding of Hen Lysozyme Involves Partially Structured Intermediates and Multiple Pathways. Nature 358, 302-307.
Roder, H., Elove, G. A. & Englander, S. W. (1988). Nature 335, 700-704.
Schulman, B. A., Redfield, C., Peng, Z., Dobson, C. M. & Kim, P. S. (1995). Different subdomains are most protected from hydrogen exchange in the molten globule and native states of human α-lactalbumin. J. Mol. Biol. 253, 651-657.
Udgaonkar, J. B. & Baldwin, R. L. (1988). NMR evidence for an early framework intermediate on the folding pathway of ribonuclease A. Nature 335, 694-699.
Woodward, C. (1993). Is the slow-exchange core the protein folding core? TIBS 18, 359-360.
An Evaluation Of Protein Secondary Structure Prediction Algorithms
Georgios Pappas Jr. and Shankar Subramaniam
Department of Physiology and Biophysics, Beckman Institute, University of Illinois, Urbana, Illinois 61801
I. Introduction
Over the past years, several algorithms based on very distinct theoretical approaches have been developed to predict the secondary structure of proteins (Fasman, 1989; Eisenhaber et al., 1995). This list keeps growing, with authors eager to improve the predictive power of their methods. This search exposes the current state of prediction accuracy, which is far from sufficient for making reasonable inferences about tertiary structure, although this does not invalidate the use of these methods as rough starting points for modeling purposes (Rost and Sander, 1995; Schultz, 1987). The level of success reported by the authors ranges from 60% to 72%. Most of these results represent an overestimation due to incomplete cross-validation (Holley and Karplus, 1989) or to the lack of a reasonable number of test cases (Burgess et al., 1974). All the methods developed so far try to extract information, directly or indirectly (Lim, 1974), from the ever-growing databases of protein structures solved by X-ray crystallography. Unfortunately, the rate at which new structures are added to the structure databases is far from optimal. Chothia (1992) estimated that all proteins, when their structures are known, would fall into about one thousand folding classes, more than half of them yet to be discovered. If so, a great deal of the information contained in forthcoming structures is not available to the current methods, and a coherent and realistic increase in the accuracy of secondary structure prediction methods must await those structures. Comparative analysis of the performance of various algorithms has been carried out in the past (Kabsch and Sander, 1983). However, this task can be deceptive if the selection of proteins for the testing set and the choice of the scoring index are not carried out properly.
The present work aims to provide an updated evaluation of several predictive methods, using a testing set large enough to yield more accurate statistics. These statistics can in turn measure the usefulness of the information gathered by the methods and identify trends that characterize the behavior of individual algorithms. Further, we test the methods uniformly vis-à-vis the size of the datasets, the measure of accuracy, and proper cross-validation procedures.

TECHNIQUES IN PROTEIN CHEMISTRY VIII Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved.
II. Material and Methods
A. Secondary Structure Prediction Algorithms
Algorithms for secondary structure prediction are based upon diverse theoretical approaches. There are three mainstream classes of methods (Garnier and Levin, 1991):
1. Statistical: These rely on the assumption that amino acids have intrinsic propensities to form a specific type of secondary structure. This information is collected by analysis of proteins with known tertiary structure, using either simple statistical principles (Chou and Fasman, 1974) or more elaborate ones such as information theory (Garnier et al., 1978).
2. Neural networks: These are highly nonlinear pattern recognition devices that try to mimic the organization of nervous systems. They are trained by adaptively learning a set of patterns and can extract high-order features of the input space, with the ability to generalize to unknown input events.
3. Sequence similarity: These are based on the comparison of the protein to be predicted with an available database of known structures. The prediction is made by assigning the secondary structure of the fragment in the database that displays the most sequence similarity to a segment in the test protein.
Other variations that use different methodologies often appear, but with no relative gain in accuracy. These include methods based on hidden Markov models (Sasagawa and Tagima, 1993), stereochemical principles (Lim, 1974) and statistical mechanics (Ptitsyn and Finkelstein, 1983), to cite just a few. From the large list of available algorithms for secondary structure prediction, nine were selected to represent the main classes described above. They were chosen mainly because they are the most often cited in the literature and because they can be implemented relatively safely in a computer program. The selected methods are summarized in table I.
Table I. List of secondary structure prediction methods utilized

Method Code   Type                  Reference
BPS           Statistical           Burgess et al. (1974)
C_F           Statistical           Chou and Fasman (1974)
D_R           Statistical           Deléage and Roux (1987)
G_G           Statistical           Gascuel and Golmard (1988)
GGR           Information Theory    Gibrat et al. (1989)
GOR           Information Theory    Garnier et al. (1978)
H_K           Neural Networks       Holley and Karplus (1989)
L_G           Sequence Similarity   Levin and Garnier (1988)
Q_S           Neural Networks       Qian and Sejnowski (1988)
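As a rough illustration of the statistical class of methods listed above, the following sketch computes a conformational propensity for each amino acid: the fraction of its occurrences found in a given state, divided by the overall fraction of residues in that state. The function name, the one-letter state codes (H, E, C) and the input layout are assumptions made for this example; this is not the code of any of the nine methods.

```python
from collections import Counter

def propensities(sequences, structures, state):
    """Return {amino acid: propensity} for one secondary structure state.

    sequences and structures are parallel lists of equal-length strings,
    e.g. sequence "AAGG" with structure "HHHC" (hypothetical state codes).
    """
    residue_total = Counter()
    residue_in_state = Counter()
    for seq, ss in zip(sequences, structures):
        if len(seq) != len(ss):
            raise ValueError("sequence and structure lengths differ")
        for aa, s in zip(seq, ss):
            residue_total[aa] += 1
            if s == state:
                residue_in_state[aa] += 1
    n_total = sum(residue_total.values())
    n_state = sum(residue_in_state.values())
    if n_state == 0:
        return {}  # the state never occurs, so no propensity is defined
    overall = n_state / n_total
    # Values above 1 indicate a residue found in the state more often than average.
    return {aa: (residue_in_state[aa] / residue_total[aa]) / overall
            for aa in residue_total}
```

A value above 1 marks a residue that favors the state in the chosen set of structures, which is why such values depend on the size and composition of the database they are derived from.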
A software package called MultPred (Multiple Predictions) was developed in C++. It implements all of the methods except C_F (Chou and Fasman, 1974) and GGR (Gibrat et al., 1987), which were taken from the program ANTHEPROT (Deléage and Roux, 1989). Additionally, a joint prediction scheme (JOI, joint prediction) was used, in which the predictions from the different methods were combined and the structure predicted by the majority was assigned to the respective residue, in the same fashion as implemented by Nishikawa and Noguchi (1991). Some methods provide three-state predictions (i.e., locating helices, sheets and coil regions) and others four-state predictions (the former plus β-turns). For the present analysis only three-state predictions were analyzed, and the four-state predictions were transformed to three-state ones by assigning the coil state to predicted turn regions. For the BPS method (Burgess et al., 1974), the secondary structural propensities of the amino acids were recalculated following the paper, because the original values were based on just 9 proteins. For the L_G method (Levin and Garnier, 1988), the database used to make the prediction was the database used in this work (see below) instead of the original one given in the paper. This was necessary to avoid over-predictions owing to high sequence homology between the two datasets. However, we note that the L_G parameters are not optimized for the current protein database.
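The four-state to three-state conversion and the majority-vote joint prediction described above can be sketched as follows. This is a minimal illustration, not the MultPred code: the one-letter state codes (H, E, C, T), the function names and the rule of assigning coil on a tie are all assumptions made for the example.

```python
from collections import Counter

def to_three_state(prediction: str) -> str:
    """Collapse a four-state prediction to three states by treating turns (T) as coil (C)."""
    return prediction.replace("T", "C")

def joint_prediction(predictions, tie_state="C"):
    """Assign to each residue the state predicted by the largest number of methods.

    predictions: list of equal-length strings, one per method.
    Ties are resolved as tie_state; this tie rule is an assumption of the sketch.
    """
    three_state = [to_three_state(p) for p in predictions]
    length = len(three_state[0])
    if any(len(p) != length for p in three_state):
        raise ValueError("all predictions must cover the same residues")
    consensus = []
    for column in zip(*three_state):
        counts = Counter(column).most_common()
        best_state, best_count = counts[0]
        tied = [state for state, count in counts if count == best_count]
        consensus.append(best_state if len(tied) == 1 else tie_state)
    return "".join(consensus)
```

For example, three predictions "HHHT", "HHEC" and "HEEC" combine to "HHEC": the turn in the first prediction is first converted to coil, and each position then takes the most frequent state.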
B. Protein Database Selection
The testing set is composed of 148 proteins with resolution better than 2.4 Å and less than 25% sequence homology between each other (Hobohm et al., 1992). The total number of residues analyzed was 36229. Secondary structure assignments were taken from the DSSP program (Kabsch and Sander, 1983a) and were transformed to a three-state form according to the rules given by Levin and Garnier (1988). Protein class assignments were based on the SCOP database (Murzin et al., 1995), dividing the set of proteins into 31 all-alpha, 31 all-beta, 51 alpha/beta, 21 alpha+beta and 14 irregular or multi-domain. The relative secondary structure composition for each class is given below:
Table II. Relative composition in terms of structure for the current database distributed in terms of protein classes

Protein Class   α-helix (%)   β-sheet (%)   Coil (%)   Number of proteins
All Proteins    31.93         19.88         48.19      148
All-Alpha       56.74          3.22         40.04       31
All-Beta         8.64         52.98         38.39       31
Alpha/Beta      35.78         17.65         46.57       51
Alpha+Beta      26.63         23.77         49.60       21
C. Accuracy Measurements
One key element in the performance analysis of secondary structure prediction methods is the proper selection of the accuracy measurement to be employed. Three different types of predictive accuracy measurement were used (Schultz and Schirmer, 1979):

1. Q3: The number of correctly predicted residues in the sequence divided by the total number of amino acids. This index often produces high values because it does not penalize over-predictions.

2. Matthews correlation coefficient (C_s): The correlation coefficient between predicted and observed residues, counting both positive and negative predictions. The index is specific to one structure type s, and the formula is

\[ C_s = \frac{w x - y z}{\sqrt{(x + y)(x + z)(w + y)(w + z)}} \]

where
s = α, β or coil
w = number of residues correctly predicted to be in structure s
x = number of residues correctly predicted not to be in structure s
y = number of residues under-predicted for structure s
z = number of residues over-predicted for structure s

3. Entropy-related information: This measure was introduced by Rost and Sander (1993) and is related to the probability of deviation between a random prediction and the actual prediction. Unlike Q3, the value of this index is affected by over- and under-predictions. It therefore provides a more reliable estimate of the significance of the accuracy. The formula is given by

\[ \mathrm{Info} = 1 - \frac{\sum_{i=1}^{3} a_i \ln a_i - \sum_{i,j=1}^{3} A_{ij} \ln A_{ij}}{N \ln N - \sum_{i=1}^{3} b_i \ln b_i} \]

where
N = number of residues
a_i = number of residues predicted to be in secondary structure i
b_i = number of residues observed to be in secondary structure i
A_ij = number of residues predicted to be in i and observed to be in j
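The three accuracy measures defined above can be sketched in a few lines. The code below is an illustration under stated assumptions, not the evaluation code used in this work: the state codes (H, E, C), the function names, and the convention of returning 0 when a denominator vanishes are all choices made for the example.

```python
import math

STATES = ("H", "E", "C")  # hypothetical codes for helix, sheet and coil

def q3(observed, predicted):
    """Fraction of residues whose predicted state equals the observed state."""
    return sum(o == p for o, p in zip(observed, predicted)) / len(observed)

def matthews(observed, predicted, state):
    """Matthews correlation coefficient for one state (0.0 if undefined, by assumption)."""
    w = x = y = z = 0
    for o, p in zip(observed, predicted):
        if o == state and p == state:
            w += 1  # correctly predicted in the state
        elif o != state and p != state:
            x += 1  # correctly predicted not in the state
        elif o == state:
            y += 1  # under-predicted
        else:
            z += 1  # over-predicted
    denom = math.sqrt((x + y) * (x + z) * (w + y) * (w + z))
    return (w * x - y * z) / denom if denom else 0.0

def information(observed, predicted):
    """Entropy-related information (0.0 if the denominator vanishes, by assumption)."""
    def xlogx(v):
        return v * math.log(v) if v > 0 else 0.0
    n = len(observed)
    a = {s: predicted.count(s) for s in STATES}
    b = {s: observed.count(s) for s in STATES}
    A = {(i, j): 0 for i in STATES for j in STATES}
    for o, p in zip(observed, predicted):
        A[(p, o)] += 1  # predicted in p, observed in o
    num = sum(xlogx(a[s]) for s in STATES) - sum(xlogx(v) for v in A.values())
    den = xlogx(n) - sum(xlogx(b[s]) for s in STATES)
    return 1.0 - num / den if den else 0.0
```

Predicting every residue as coil makes the contrast between the indices concrete: Q3 then equals the coil fraction of the observed structure, while the correlation coefficients and the information value are zero.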
D. Multidimensional Scaling
To help visualize how the secondary structure prediction methods relate to each other, the statistical technique called multidimensional scaling (MDS) was used. Given a matrix of dissimilarities between objects (in our case, the algorithms for secondary structure prediction), MDS finds a low-dimensional (here 2-dimensional) representation of the data, with one point per object, such that the distances in the new coordinate system match the original dissimilarities as closely as possible (Cox and Cox, 1994). A crucial step in this kind of analysis is the definition of the dissimilarity between two predictive algorithms. Q3 values and Matthews' correlation coefficients were calculated for all proteins in the database, resulting in an accuracy vector for each predictive method (${}^{m}X_i$, where m = predictive method, i = protein, and the accuracy index is Q3, Cα, Cβ or Cc). Given those vectors, dissimilarity matrices were calculated for each accuracy index over all predictive methods using Guttman's μ2 coefficient (Guttman, 1968), which is a measure of simple monotonic relationship between variables and is given by

\[ \mu_{r,s} = \frac{\sum_{i}\sum_{j} ({}^{r}X_i - {}^{r}X_j)({}^{s}X_i - {}^{s}X_j)}{\sum_{i}\sum_{j} |{}^{r}X_i - {}^{r}X_j|\,|{}^{s}X_i - {}^{s}X_j|} \]

where
μ_{r,s} = dissimilarity between methods r and s
${}^{m}X_i$ = accuracy index of method m for protein i
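A monotonicity coefficient of this kind can be computed directly from two accuracy vectors. The sketch below assumes the standard form of Guttman's μ2 (sum of products of pairwise differences divided by the sum of their absolute products); the function name is hypothetical, and how the coefficient is then turned into the entries of the dissimilarity matrix is not reproduced here.

```python
def guttman_mu2(x, y):
    """Guttman's mu2 monotonicity coefficient between two equal-length vectors.

    x[i] and y[i] are the accuracies of two methods on protein i.
    Returns 0.0 when every pairwise product is zero (an assumption of this sketch).
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    num = 0.0
    den = 0.0
    for i in range(len(x)):
        for j in range(len(x)):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            num += dx * dy
            den += abs(dx) * abs(dy)
    return num / den if den else 0.0
```

The coefficient is 1 when the two methods rank the proteins in the same order and -1 when the order is exactly reversed, regardless of whether the relation between the two accuracy vectors is linear.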
III. Results And Discussion A. Analysis Of Predictive Accuracy The first step in the analysis was to obtain the secondary structure prediction for the 148 proteins in the test database with the selected methods. The accuracy results in terms of the Q3 index can be examined in table III.
Table III. Newly calculated accuracies and accuracies reported in the original papers, in terms of Q3

Method   New Accuracy (Q3 %)   Original Accuracy (Q3 %)
BPS      53.1 ± 7.5            61.3
C_F      55.5 ± 8.7            59.2
D_R      59.6 ± 9.4            61.3
G_G      55.3 ± 7.5            62.3
GGR      60.4 ± 8.1            63.0
GOR      57.4 ± 8.4            58.0
H_K      60.1 ± 8.4            63.2
L_G      55.2 ± 9.7            63.0
Q_S      59.6 ± 8.7            64.3
When analyzing the Q3 values it is clear that they are lower than those claimed by the authors in the original papers. This can be attributed to two factors: poor statistics as a consequence of the low number of test proteins, and lack of cross-validation of the results. The discrepancy between the reported values and the newly calculated ones is variable, indicating different degrees of prediction generalization attained by each method. Additionally, it must be kept in mind that the database used in this work may contain proteins used in the training sets of some of the methods, which leads to an overestimation of the accuracies. Nevertheless, the accuracies of all methods decreased. However, although Q3 is the most popular and widespread measure, it has serious problems in providing a reliable and significant accuracy estimate. The main drawback of Q3 is that it does not take into account under- and over-predictions, and so fails to capture the real significance of the results. For example, if we predict all the residues in the test database as coil, an average Q3 value of 48.19% is obtained, but the correlation coefficients and information values are null. As an alternative way to analyze the accuracies, it is possible to use the average Matthews' correlation coefficients and information values reported in table IV for all predicted proteins. These two measures are rarely used in the secondary structure prediction literature, despite their obvious superiority over Q3. In one of the few publications that use Matthews' correlation coefficients, Holley and Karplus (1989) reported values of Cα=41%, Cβ=32% and Cc=36%. In the new analysis those values are appreciably lower (Cα=32%, Cβ=25% and Cc=31%), clearly indicating that the method generalizes poorly to a larger set of proteins. It also reinforces an important point for secondary structure evaluation, already noted by Rost and Sander (1994): the need for a testing set that is representative in size and structural composition, so that reliable statistical information can be gathered from the results.

Table IV. Average Matthews' correlation coefficients and information values for the 148 chains in the testing set. Standard deviation values are shown in parentheses

Method   Cα (%)          Cβ (%)          Cc (%)          Information (%)
BPS      24.08 (16.51)   13.73 (14.22)   20.00 (12.56)    8.23 (5.34)
C_F      24.33 (18.61)   21.24 (17.12)   29.25 (12.38)   11.10 (6.69)
D_R      29.15 (14.98)   23.27 (16.68)   36.58 (11.65)   12.85 (6.30)
G_G      20.68 (13.49)   28.23 (18.04)   23.32 (11.27)    9.63 (5.07)
GGR      32.15 (16.92)   26.87 (17.98)   36.17 (11.87)   14.08 (6.68)
GOR      29.31 (17.32)   26.16 (17.35)   34.65 (11.66)   13.71 (6.75)
H_K      31.96 (19.66)   24.62 (16.69)   30.99 (13.10)   13.01 (7.34)
L_G      30.56 (12.75)   25.64 (17.54)   33.67 (10.22)   12.69 (5.31)
Q_S      28.66 (19.34)   22.63 (17.46)   33.31 (13.21)   12.85 (9.02)
JOI      34.68 (18.71)   27.67 (17.86)   35.44 (12.56)   14.97 (8.32)
To further extend the analysis, accuracies were measured in terms of correlation coefficients and information independently for each protein structural class, in order to check whether there are biases particular to specific chain folds. The results are shown in figures 1 and 2.

Figure 1. Predictive accuracy in terms of information and Cα. The values are averaged over the respective class of proteins. [Two bar charts, "Predictive Accuracies (information) of secondary structure algorithms" and "Predictive Accuracies (Cα) of secondary structure algorithms"; horizontal axis: algorithm code (BPS, C_F, D_R, G_G, GGR, GOR, H_K, L_G, Q_S, JOI).]
Figure 2. Variation of predictive accuracies (average) according to the protein class as measured by the Cβ and Cc values. [Two bar charts, "Predictive Accuracies (Cβ) of secondary structure algorithms" and "Predictive Accuracies (Cc) of secondary structure algorithms"; horizontal axis: algorithm code (BPS, C_F, D_R, G_G, GGR, GOR, H_K, L_G, Q_S, JOI); vertical axes: Cβ (%) and Cc (%); the legend includes the ALPHA/BETA and ALPHA+BETA classes.]
The first observation from this analysis is that, for all of the measures used, the behavior of the predictive methods varies significantly with the protein fold family. This can help to identify which method performs best in predicting a given structural element for a given protein class. Conversely, it also makes it possible to diagnose critical points where the algorithms fail. From the information values in figure 1 it can be seen that, in general, prediction is more successful for the all-alpha and alpha+beta classes than for the all-beta and alpha/beta classes. The information values also show considerable variability both among the methods and among the fold classes. Figures 1 and 2 reinforce this view with Matthews' correlation coefficients for each type of secondary structure. However, a more striking observation arises. When the prediction is done for all-beta proteins the Cα is extremely low ( 0) and the residues with conserved properties in corresponding positions of γB- and βB2-crystallins.
Domain Binding Sites: βA3- and βB2-Crystallins
III. Results
A. The domain-binding site
The structural surface template was obtained for the asymmetric part of the molecular surface involved in the domain-binding site. The common part of the template, consisting of the accessible residues of the β-sheet surface, is represented in Fig. 2a. Equivalent surface residues from the N- and C-domains were aligned with residues as shown in Fig. 2b. The domain-binding site can be divided into two parts: the highly conserved common part (residue positions c1 - c7) and the part with higher variability of the accessibility change (positions v1 - v8). The average total accessibility changes calculated from Fig. 2c for the variable and common parts are about the same: 416 Å² and 404 Å², respectively. Residues in columns containing β-strands β3 and β11 are not involved in close interactions in the interface (Fig. 1). Residues with the most significant accessibility change are mainly located in β-strands β6, β14, β8 and β16.

Figure 2. Average accessibility changes and the sequence variability for the interface surface residues for both domains in γB- and βB2-crystallins: (a) surface residue positions v1-v8, c1-c7 are labelled as shown in Fig. 1; (b) numbers of the equivalent surface residues from N- and C-terminal domains; (c) accessibility changes averaged for both domains in γB-crystallin (1gcs file) and two βB2-dimers (1blb file), the ratio f of the mean change to its standard deviation is shown after the slash; (d) the sequence similarity score for aligned surface positions (b) of the γB- and βB2-crystallins estimated from the mutation matrix (Gonnet et al., 1992) as score = Σ p(i,j), where the sum was calculated over all i, j = 1, ..., 4, and p(i,j) is the score for the single mutation.
Yuri V. Sergeev and J. Fielding Hejtmancik
Residues in positions c1, c3, c4, c6 and c7 show both lower variability of the accessibility change (f > 3.0) and higher sequence similarity score (Fig. 2c, d). Hydrophobic residues are preferably located in positions c1 and c6, and charged residues are located in positions c3 and c4. Only a few residues have an accessible area change greater than 50 Å², which is considered significant (Fig. 2c). Residues c1, c3, c6 and c7 show a remarkably low sequence variability and a significant change in accessible area in both proteins. This suggests that they might serve as structural determinants in the interface. Significant accessibility changes were also observed for the linker and the C-terminus (residues v1 - v4). However, these residues show a weak sequence similarity, calculated from the mutation data (Fig. 2d).
... 99.9% probability of containing the most common Leu Leu Leu sequence.
C. Selection of Functional Mutants
To select for functional random mutants, E. coli XL1-Blue cells containing the plasmid library to be tested were streaked on the surface of an LB agar plate that contained 1 mg ml⁻¹ ampicillin (Sigma Chemical Co.). The agar plate was incubated overnight at 37 °C. Clearly isolated single colonies were picked the next day, and cultures were grown from the single colonies to isolate ssDNA for sequencing. Alternatively, the isolated single colonies were picked and used directly for PCR to amplify the coding region of the blaTEM-1 gene. The amplified PCR product was then sequenced directly (Hanke and Wink, 1992).
Complete Mutagenesis of a Gene
Timothy Palzkill et al.

III. Results
A. Randomization Procedure
Two different site-directed mutagenesis approaches were used to generate the set of 88 random libraries that encompass the blaTEM-1 gene. Ten libraries were constructed using the random replacement mutagenesis protocol, which has been described in detail (Palzkill and Botstein, 1992). The remaining 78 libraries were constructed by a combination of linker insertion mutagenesis and oligonucleotide-directed mutagenesis (Huang et al., 1996) (Fig. 1). A set of 78 linker insertion mutants was generated throughout the blaTEM-1 gene by oligonucleotide-directed mutagenesis (Kunkel et al., 1987). Each linker insert lies within a set of codons to be randomized. In addition, each linker contains a Sal I restriction enzyme site. Importantly, the Sal I site is not present elsewhere on the plasmid used for these experiments (Fig. 1).

Figure 1. Randomization procedure used to construct 78 of the 88 bla random libraries. First, nine bases, corresponding to three contiguous codons, are selected for randomization. In this example, codons 75-77 are targeted for mutagenesis. Single-stranded plasmid DNA is shown in (A). A β-lactamase insert mutant is next created, using the Kunkel method of mutagenesis, to insert a unique Sal I restriction site within the region targeted for mutagenesis (B) (Kunkel et al., 1987). Single-stranded insert mutant DNA is then isolated from an ung-, dut- strain of E. coli (C). Randomization is accomplished by annealing an oligonucleotide, designed to replace the target with nine base pairs of random sequence, to the template (D). Second strand synthesis and transformation into an ung+, dut+ strain results in randomization of the targeted codons 75-77 (E). Since mutagenesis is not 100% efficient, some cells will have plasmids still containing the Sal I site. To remove non-mutagenized DNA, pooled plasmid DNA is restricted with Sal I. Non-mutagenized DNA is linearized, and only randomized bla DNA is transformed in the last step.
Sequences shown in the figure (pBG66 plasmid, bla gene, codons 75-77):
(A) 5'-ATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGT-3' (M S T F K V L L C G A V L S R)
(E) 5'-ATGAGCACTTTTAAAGTTNNSNNSNNSGGCGCGGTATTATCCCGT-3'

An oligonucleotide was then designed and synthesized that would replace the Sal I site and the three codons with random sequence DNA. Oligonucleotide mutagenesis was carried out using the method of Kunkel et al. (1987) and the reactions were electroporated into E. coli. The transformants were pooled and the plasmid DNA was extracted. The number of colonies pooled at this step is an indication of the probability that the library contains all possible amino acid substitutions. For each of the 88 libraries, greater than 75,000 colonies were pooled. Therefore, all of the libraries in which three codons were mutagenized have a >90% probability of containing all possible amino acid substitutions. The final step was to digest the pooled plasmid DNA with Sal I and electroporate E. coli again (Fig. 1). This procedure effectively eliminates all non-mutagenized molecules and leaves only the random substitutions. This strategy is termed site-selection mutagenesis and has been used by others to create a variety of site-directed mutations (Deng and Nickoloff, 1992). The advantage for the functional selection strategy is that the starting molecule is itself a linker insert mutant of β-lactamase. Thus, there are no unmutagenized, wild-type molecules present in the final libraries. However, it was necessary to perform all DNA manipulations using barrier micropipette tips to eliminate contamination from aerosols originating from the micropipettor. In the absence of these tips, extensive contamination of the random libraries with wild-type plasmid DNA was observed.

Each of the 88 random libraries was used to transform E. coli, and functional random mutants were selected by spreading the transformed cells on agar plates containing 1 mg ml⁻¹ ampicillin. This is the maximal concentration on which E. coli containing the wild-type blaTEM-1 gene on the plasmid used to construct the random libraries can grow. Thus, phenotypically wild-type mutants are selected. The specific activity of enzyme from a number of mutants indicated that, on average, the selected mutants possess 80% of the activity of the wild-type enzyme (Huang et al., 1996).
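The link made above between the number of pooled colonies and library completeness can be illustrated numerically. The sketch below assumes that each randomized codon is NNS (N = any base, S = G or C; 32 codons), that all 32 codons are equally likely at each position, and that clones are sampled independently. None of these assumptions is stated in the text, and the function names are hypothetical. Under them, the probability that one particular three-residue combination is present among n clones is 1 - (1 - p)^n, where p is the probability of drawing that combination in a single clone.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order for each position.
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# NNS codons: any base at positions 1 and 2, G or C at position 3 (assumed scheme).
NNS_CODONS = [codon for codon in CODON_TABLE if codon[2] in "GC"]

def codon_fraction(amino_acid):
    """Fraction of NNS codons that encode the given one-letter amino acid."""
    hits = sum(1 for codon in NNS_CODONS if CODON_TABLE[codon] == amino_acid)
    return hits / len(NNS_CODONS)

def prob_present(peptide, n_clones):
    """Probability that a given amino acid combination occurs at least once in n_clones."""
    p = 1.0
    for amino_acid in peptide:
        p *= codon_fraction(amino_acid)
    return 1.0 - (1.0 - p) ** n_clones
```

With these assumptions, a combination encoded by a single codon triple (for example Met Met Met) is present with a probability of roughly 0.9 in 75,000 clones, while Leu Leu Leu, which has three NNS codons per position, is essentially certain to be present.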
B. Comparison of Tolerance to Amino Acid Substitutions in Mutagenesis Experiments and Sequence Conservation in the Gene Family
To determine the identity of allowable substitutions at each residue position, the DNA sequences of an average of 9 functional random mutants from each library were determined. In total, 43 out of the 263 (16%) mutated residues are inferred to be critical for TEM-1 β-lactamase structure and function, since only the wild-type amino acid is found at these positions among the sequenced mutants. This set of essential residues includes catalytic residues and a number of other amino acids that are buried in the hydrophobic core of the enzyme. A detailed description and analysis of these results has been published elsewhere (Huang et al., 1996). A large number of class A β-lactamases have now been identified and sequenced, and an alignment of 20 class A β-lactamases has been published (Ambler et al., 1991). These aligned sequences permit a comparison between the conserved amino acid residue positions among class A β-lactamases and the conserved positions among the functional random mutants in TEM β-lactamase (Fig. 2). In general, there is agreement between the tolerance of a residue in TEM β-lactamase to amino acid substitutions and the extent to which that position is substituted in the class A family.

Figure 2. Comparison of sequence variability among functional TEM-1 β-lactamase mutants and twenty aligned class A β-lactamases. The wild type TEM-1 β-lactamase primary sequence is shown. Above the sequence are the different amino acids that were identified at that sequence position among functional random mutants. Below the TEM-1 primary sequence are the different amino acids that appear at these positions in an alignment of 20 class A β-lactamases (Figure adapted from Huang et al., 1996).

However, a detailed analysis reveals several important differences. The patterns of substitutions between TEM-1 and the class A enzymes can be grouped into 4 classes based on tolerance to amino acid substitutions.

Class 1. Positions substituted in class A and TEM-1 enzymes. 210 residues.

Class 2. Positions not substituted in class A or TEM-1. 13 residues. F66, S70, K73, P107, S130, D131, A134, R164, E166, D179, T180, D233, G236

Class 3. Positions substituted in TEM-1 but not class A. 10 residues. E37, G45, L81, N136, G144, G156, D157, L169, L199, L207

Class 4. Positions substituted in class A but not TEM-1. 30 residues. L30, Y46, P67, T71, L76, L122, A125, N132, D176, T181, W210, D214, V216, A217, L220, R222, P226, W229, A232, K234, S235, G242, R244, G245, L250, G251, M272, N276, I282, W290
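The grouping above amounts to a two-way comparison for each position: is it substituted among the functional TEM-1 mutants, and is it substituted in the class A alignment? A minimal sketch follows, with hypothetical names and descriptive labels in place of class numbers. The bookkeeping at the end uses only the counts given in the text: the 43 positions invariant in TEM-1 are the 13 positions invariant in both data sets plus the 30 positions that vary only in the family alignment.

```python
def substitution_pattern(substituted_in_tem1: bool, substituted_in_class_a: bool) -> str:
    """Label a residue position by where substitutions are observed."""
    if substituted_in_tem1 and substituted_in_class_a:
        return "both"
    if not substituted_in_tem1 and not substituted_in_class_a:
        return "neither"
    return "TEM-1 only" if substituted_in_tem1 else "class A only"

# Counts reported in the text for the 263 mutated positions.
COUNTS = {"both": 210, "neither": 13, "TEM-1 only": 10, "class A only": 30}

total_positions = sum(COUNTS.values())                          # all mutated positions
invariant_in_tem1 = COUNTS["neither"] + COUNTS["class A only"]  # essential in TEM-1
fraction_essential = invariant_in_tem1 / total_positions
```

For Leu76, which tolerates no substitution among the functional TEM-1 mutants but differs in other class A enzymes, the function returns "class A only".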
As evidenced by the large size of Class 1 above, the majority of positions in TEM-1 and class A enzymes may be substituted to some extent. Class 4 is also a large class and represents residues that make essential interactions in TEM-1 but are not conserved among other class A enzymes. An example of a Class 4 residue is Leu76. Leu76 is part of a buried, hydrophobic cluster of residues including Phe72, Ala126, and Ala135 (Jelsch et al., 1993) (Fig. 3). The B. licheniformis and S. aureus enzymes have a threonine and glutamine residue, respectively, at position 76 (Fig. 3) (Herzberg, 1991; Knox and Moews, 1991). In the B. licheniformis and S. aureus enzymes, the residues surrounding residue 76 are predominantly hydrophilic: residue 76 interacts with its neighbors by means of hydrogen bonds rather than hydrophobic interactions, as in TEM-1. Such a change in environment may have occurred by coupled, compensating substitutions that altered the character of the region. For example, introduction of an Asn residue at position 76 of the TEM-1 enzyme would, based on the mutagenesis results, create a non-functional enzyme. Selective pressure would then favor additional mutations that either revert residue 76 to Leu or provide a new hydrogen bonding partner for Asn76.
C. Identification of Substitutions that Correct the Asn76 Defect To test the hypothesis that compensating mutations have occurred in the class A P-lactamase gene family in the vicinity of residue 76, position 76 was first converted to an asparagine residue so that residue 76 is identical to the 5. aureus enzyme. As expected, the function of this enzyme was drastically reduced. Whereas E. coli containing the wild-type enzyme is able to grow on agar plates containing 1 mg/ml ampicillin, E. coli containing the Asn76 enzyme can only grow on agar plates containing 50 |ig/ml or less of ampicillin. Western blots have shown that the Asn76 enzyme is poorly expressed relative to the wild-type enzyme (data not shown). To assess whether compensating mutations can occur in the vicinity of the Asn76 residue, suppressor mutants of the Asn76 mutant were isolated. These mutants were isolated by introducing the plasmid containing the bla^EM g^^^
Figure 3. Ribbon diagram of the Leu76 region of the TEM-1 (Jelsch et al., 1993), S. aureus (Herzberg, 1991) and B. licheniformis (Knox and Moews, 1991) structures. For simplicity, only those residues that are hydrophobic in TEM-1 and hydrophilic in the S. aureus or B. licheniformis structures are shown. Figure prepared with MOLSCRIPT (Kraulis, 1991) (Figure adapted from Huang et al., 1996).
encoding the Asn76 substitution into E. coli ES1301. This E. coli strain is defective in mismatch repair and accumulates mutations at a rate approximately 100-fold higher than wild-type E. coli (Siegel et al., 1982). The strain was grown for 20 generations and the plasmid DNA was isolated and used to electroporate E. coli XL1-Blue (Bullock et al., 1987). Suppressor mutants were isolated by spreading the transformed cells on agar plates containing 500 µg/ml ampicillin. Ten colonies were picked and plasmid DNA was isolated and used to re-transform E. coli XL1-Blue. In each case the ampicillin resistance phenotype was linked to the plasmid containing the blaTEM gene. The DNA sequence of the blaTEM gene of six mutants was determined. In one case the codon at position 76 had accumulated two nucleotide substitutions to revert to a leucine residue (AAC Asn -> CTC Leu). Two mutants exhibited a single nucleotide change at codon 76 that introduced an isoleucine (AAC Asn -> ATC Ile). Finally, three mutants contained a nucleotide substitution at codon 182 that resulted in the replacement of methionine with threonine (Met182 ATG -> Thr182 ACG). Interestingly, the α-carbons of residues 76 and 182 are 17 Å apart and thus do not directly interact. These results do not eliminate the possibility that compensating mutations can occur in the immediate vicinity of the residue 76 side chain, as appears to have occurred in the gene family, but they do indicate that there are multiple mutational pathways for correcting a defect in a protein. Studies are in progress to introduce substitutions at positions whose side chains directly interact with Asn76 to determine if they can also suppress the defect caused by the Asn76 substitution.
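The codon changes reported for the six sequenced suppressors can be tallied mechanically. The following is a minimal sketch, not part of the original study: the function names are hypothetical, and the codon table is deliberately restricted to the five codons named in the text (their assignments follow the standard genetic code).

```python
# Standard genetic code entries for the codons named in the text only.
CODON_TO_AA = {
    "AAC": "Asn",
    "CTC": "Leu",
    "ATC": "Ile",
    "ATG": "Met",
    "ACG": "Thr",
}


def nucleotide_differences(old_codon, new_codon):
    """Count positions at which two codons differ (Hamming distance)."""
    if len(old_codon) != 3 or len(new_codon) != 3:
        raise ValueError("codons must be three nucleotides long")
    return sum(a != b for a, b in zip(old_codon, new_codon))


def describe_change(position, old_codon, new_codon):
    """Return (position, old residue, new residue, number of base changes)."""
    return (
        position,
        CODON_TO_AA[old_codon],
        CODON_TO_AA[new_codon],
        nucleotide_differences(old_codon, new_codon),
    )


# The three classes of suppressor described in the text.
changes = [
    describe_change(76, "AAC", "CTC"),
    describe_change(76, "AAC", "ATC"),
    describe_change(182, "ATG", "ACG"),
]

for position, old_aa, new_aa, n_changes in changes:
    print(f"codon {position}: {old_aa} -> {new_aa}, {n_changes} nucleotide change(s)")
```

The tally reproduces the counts given in the text: the reversion to Leu requires two base changes, whereas the Ile and Thr182 suppressors each require one.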
IV.
Discussion
Improvements in oligonucleotide synthesis methods in recent years have greatly reduced the cost of oligonucleotide-directed mutagenesis experiments. This has enabled large-scale mutagenesis projects, such as that described here, to be conducted. In a few months, one can create 50-100 site-directed mutations. This allowed us to create SalI linker inserts by oligonucleotide-directed mutagenesis at 78 sites in the blaTEM-1 gene. An additional 78 oligonucleotides were then used to randomize the codons that had been targeted by the linker insertions. In addition, the widespread use of automated DNA sequencing should facilitate the determination of mutant sequences obtained using the functional selection approach. Finally, the development of phage display technology has made functional selections possible for a large variety of proteins and targets. Taken together, these improvements in technology will permit large-scale oligonucleotide mutagenesis studies to be performed routinely.
References
Ambler, R. P., Coulson, F. W., Frere, J.-M., Ghuysen, J.-M., Joris, B., Forsman, M., Levesque, R. C., Tiraby, G. & Waley, S. G. (1991). Biochem. J. 276, 269-272.
Bowie, J. U., Reidhaar-Olson, J. F., Lim, W. A. & Sauer, R. T. (1990). Science 247, 1306-1310.
Bullock, W. O., Fernandez, J. M. & Short, J. M. (1987). BioTechniques 5, 376-379.
Datta, N. & Kontomichalou, P. (1965). Nature 208, 239-241.
Deng, W. P. & Nickoloff, J. A. (1992). Anal. Biochem. 200, 81-88.
Hanke, M. & Wink, M. (1994). BioTechniques 17, 858-860.
Herzberg, O. (1991). J. Mol. Biol. 217, 701-719.
Huang, W., Petrosino, J., Hirsch, M., Shenkin, P. S. & Palzkill, T. (1996). J. Mol. Biol. 258, 688-703.
Jacoby, G. A. & Medeiros, A. A. (1991). Antimicrob. Agents and Chemother. 35(9), 1697-1704.
Jelsch, C., Mourey, L., Masson, J.-M. & Samama, J.-P. (1993). Proteins 16, 364-383.
Joris, B., Ghuysen, J.-M., Dive, G., Renard, A., Dideberg, O., Charlier, P., Frere, J.-M., Kelly, J. A., Boyington, J. C., Moews, P. C. & Knox, J. R. (1988). Biochem. J. 250, 313-324.
Knox, J. R. & Moews, P. C. (1991). J. Mol. Biol. 220, 435-455.
Kraulis, P. J. (1991). J. Appl. Crystallogr. 24, 946-950.
Kunkel, T. A., Roberts, J. D. & Zakour, R. A. (1987). Methods Enzymol. 154, 367-382.
Loeb, D. D., Swanstrom, R., Everitt, L., Manchester, M., Stamper, S. E. & Hutchison, C. A. (1989). Nature 340, 397-400.
Markiewicz, P., Kleina, L. G., Cruz, C., Ehret, S. & Miller, J. H. (1994). J. Mol. Biol., 421-433.
Palzkill, T. & Botstein, D. (1992). Proteins 14, 29-44.
Palzkill, T., Le, Q.-Q., Venkatachalam, K. V., LaRocco, M. & Ocera, H. (1994a). Mol. Microbiol. 12, 217-229.
Rennell, D., Bouvier, S. E., Hardy, L. W. & Poteete, A. R. (1991). J. Mol. Biol. 222, 67-87.
Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989). Molecular cloning: a laboratory manual. 2nd edit. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
Siegel, E. C., Wain, S. L., Meltzer, S. F., Binion, M. L. & Steinberg, J. L. (1982). Mutation Research 93, 25-33.
Terwilliger, T. C., Zabin, H. B., Horvath, M. B., Sandberg, W. S. & Schlunk, P. M. (1994). J. Mol. Biol. 236, 556-571.
Wiedemann, B., Kliebe, C. & Kresken, M. (1989). J. Antimicrob. Chemother. 24, 1-24.
Characterization of Truncated Kirsten-Ras Purified from Baculovirus Infected Insect Cells Indicates Heterogeneity due to N-terminal Processing and Nucleotide Dissociation
Lisa M. Churgay, Nancy B. Rankl, John M. Richardson, Gerald W. Becker and John E. Hale
Lilly Research Labs, Indianapolis, IN
I.
Introduction
Kirsten-ras is the most frequently activated oncogene in human tumors (1). Activated K-ras is bound to GTP and possesses intrinsic GTPase activity, which leads to its inactivation. Oncogenic K-ras has dramatically lower GTPase activity. The importance of K-ras in tumorigenesis makes it an important target for drug development. Determination of the crystal structure of K-ras will aid in the rational design of anti-cancer therapeutics. To this end, truncated K-ras was expressed in baculovirus infected insect cells, which have been suggested as an appropriate system for its production (1). K-ras was highly expressed in these cells and was purified to apparent homogeneity as evaluated by SDS-PAGE. Analytical DEAE chromatography and electrospray-ionization mass spectrometry (ESI-MS) of the purified protein indicated substantial heterogeneity. The different proteins were characterized by tryptic mapping utilizing LC-MS. This analysis indicated the presence of at least 4 different N-terminal variants of K-ras and additional heterogeneity due to dissociation of bound nucleotide, indicating that unwanted cellular processing of proteins may occur in baculovirus infected cells. This processing may impact the further usefulness
of the protein, particularly in the case of protein crystallography, in which N-terminal variants may impair the ability to obtain crystals suitable for diffraction studies. Thus, additional purification steps must be included in order to separate these very similar molecules, significantly reducing yields. Preferably, other cell lines or growth conditions will be evaluated in an effort to minimize the heterogeneity from this cellular processing and to obtain protein suitable for structural studies. These results demonstrate that careful characterization of purified recombinant proteins must be undertaken in order to understand the extent of cellular modification and determine the impact on the biological and physical behavior of these proteins.
II.
Materials and Methods
A. Kirsten ras protein production
A truncated analog containing residues 1-166 of K-ras 4B (val 12) and a C-terminal Arg-Ser dipeptide was produced in Sf9 (Spodoptera frugiperda) insect cells. Cells were cultured in Grace's insect medium supplemented with 3.33 mg/ml yeastolate, 3.33 mg/ml lactalbumin hydrolysate (JRH Biosciences), 10% FBS (Atlanta Biologicals), 1% antibiotic/antimycotic (Sigma), and 0.1% pluronic F-68 (JRH Biosciences) using magnetic spinner flasks. Infections were performed in 9 L stirred vessels as previously described (2) by seeding the vessels at a final density of 8.5X10^ cells/ml using virus at an MOI of 5.
B. Preparative purification of K-ras
Cell pellets from 4 L of baculovirus infected insect cells were resuspended in 25 mM Tris-HCl, pH 8.0, 5 mM DTT, 250 mM sucrose and protease inhibitor cocktail tablets (Boehringer Mannheim). Cells were homogenized and centrifuged at 38,000 × g for 20 min. The supernatant was ultracentrifuged at 100,000 × g for 2 h, filtered, and loaded over a 21.5 mm ID × 15 cm DEAE column
(Tosohaas) equilibrated in 25 mM Tris-HCl, pH 8.0, and 5 mM DTT. The protein was eluted with a NaCl gradient from 0-0.5 M developed over 85 min. Fractions containing the K-ras were identified on 4-20% tris-glycine gels (Novex), pooled, and concentrated to 10 ml in an Amicon stirred cell. Protein was passed over a Superdex 75 Prepgrade 35/600 column (Pharmacia) equilibrated in 10 mM MOPS, pH 7.0, 100 mM NaCl, 5 mM DTT, and 1 mM MgCl2. Column fractions were analyzed by SDS-PAGE and a peak fraction was analyzed by mass spectrometry.
C. Analysis of K-ras
The peak fraction was dialyzed overnight into 10 mM MOPS, pH 7.0, 5 mM DTT and 1 mM MgCl2, and approximately 5 mg was injected onto a 7.5 mm × 7.5 cm DEAE column (Tosohaas). The K-ras was eluted with a NaCl gradient from 0-250 mM developed over 75 min. Protein peaks were isolated and digested overnight at 37°C with trypsin at an enzyme to substrate ratio of 1:25. Reversed phase HPLC was done on the digested protein using a Vydac C18 (4.5 mm × 25 cm) column, and peaks were eluted with a linear gradient from 0-50% acetonitrile (0.1% TFA) in 60 min. Intact protein was analyzed by reversed phase HPLC on a Vydac C18 column with a linear gradient from 30-60% acetonitrile (0.1% TFA) over 60 min. All protein sequence analysis was performed on a Procise sequencer (Applied Biosystems, Foster City, CA).
D. Electrospray-mass spectral analysis
All mass spectra were obtained on a PE-Sciex triple quadrupole instrument (model API III) as described (3). Collisionally induced dissociation (CID) MS/MS experiments were performed in the positive ion detection mode with the orifice potential set at +50 V and the argon collision gas thickness maintained at 315 × 10^12 molecules/cm^2. Product ion scans were averaged over a range of 50-600 u in 0.1 u intervals with a dwell time of 1 ms per interval.
III.
Results
A. Purification of K-ras from insect cells and initial characterization
Insect cell cytoplasm was initially purified by preparative DEAE chromatography. Fractions containing K-ras were pooled and fractionated over a Superdex-75 column. The protein obtained from this purification appeared to be homogeneous by SDS-PAGE (figure 1A). This protein preparation was subjected to ESI-MS analysis and multiple masses were noted (figure 1B). The mass expected for the K-ras protein was detected (19146); however, additional masses were seen, including those at 19012, 19055 and 19186. N-terminal sequence analysis of the protein preparation yielded a major sequence of MTEY and a minor sequence of TEYK, indicating that some N-terminal processing of the K-ras, resulting in removal of the methionyl residue, had occurred.
B. Identification of the N-terminally processed forms of K-ras
The K-ras mixture was analyzed on a TSK-DEAE HPLC column. Four major peaks were seen to elute from this column (figure 2) and these peaks were analyzed by ESI-MS. Peak 1 was primarily mass 19146 with minor components of 19012 and approximately 19590. Peak 2 was primarily 19055 with minor components of 19189 and approximately 19500. Peak 3 was primarily 19146 with a minor component of 19012, and peak 4 was primarily 19055 with a minor component of 19189. We digested peaks 1 and 2 with trypsin and separated the peptides by reversed phase HPLC. A single peptide was seen to be shifted between the digests of peaks 1 and 2 (figure 3). LC-MS of these digests indicated that this peptide was the N-terminal tryptic peptide in peak 1, with a mass of 671. This peptide was absent in the peak 2 digest, and a new peak was present eluting slightly earlier in the gradient with a mass of 582. Thus the N-terminus of the protein in peak 2 was modified. This peptide was
Figure 1. A, SDS-PAGE of K-ras produced in baculovirus infected insect cells. Protein samples were electrophoresed on a 4-20% SDS gel under reducing conditions. 1, MW standards (kDa: 200, 116, 97, 66, 55, 35, 31, 21, 14, 6); 2, preparative DEAE pool; 3, purified Kirsten ras. B, Mass distribution of the purified K-ras reconstructed from the ESI-MS using PE-Sciex MacSpec software (horizontal axis: Molecular Weight, 18,800-19,600; labeled peaks: 19,012, 19,146, 19,186, 19,497 and 19,586).
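The peptide masses of 671 and 582 quoted in the Results can be compared with simple calculated values. The sketch below makes several assumptions that are not stated in this excerpt: that the N-terminal tryptic peptide is MTEYK (inferred from the reported N-terminal sequences MTEY and TEYK and from cleavage by trypsin after lysine), that the quoted values are singly protonated monoisotopic masses, and that N-terminal acetylation is one candidate modification worth testing. The function name is hypothetical, and the calculation is offered only as an arithmetic check, not as the authors' assignment.

```python
# Monoisotopic residue masses (Da) for the residues needed here.
RESIDUE_MASS = {
    "M": 131.04049,
    "T": 101.04768,
    "E": 129.04259,
    "Y": 163.06333,
    "K": 128.09496,
}
WATER = 18.01056    # added once per peptide for the free termini
PROTON = 1.00728    # assumption: masses in the text are [M+H]+ values
ACETYL = 42.01056   # assumption: candidate N-terminal modification (C2H2O)


def mh_plus(sequence, n_terminal_mod=0.0):
    """Singly protonated monoisotopic mass of a peptide."""
    residues = sum(RESIDUE_MASS[aa] for aa in sequence)
    return residues + WATER + n_terminal_mod + PROTON


full_length = mh_plus("MTEYK")                 # initiator Met retained
des_met = mh_plus("TEYK")                      # Met removed
acetyl_des_met = mh_plus("TEYK", ACETYL)       # Met removed, N-terminus acetylated

for label, mass in [
    ("MTEYK", full_length),
    ("TEYK", des_met),
    ("acetyl-TEYK", acetyl_des_met),
]:
    print(f"{label:12s} [M+H]+ = {mass:.2f}")

print(f"shift MTEYK -> acetyl-TEYK = {full_length - acetyl_des_met:.2f} Da")
```

Under these assumptions the 89 Da difference between the two observed peptides (671 versus 582) is matched by loss of methionine combined with acetylation, whereas loss of methionine alone would give a shift of about 131 Da.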
[Figure: chromatogram with absorbance at 280 nm on the vertical axis; remaining labels and caption not recovered]
... Cys 54 -> Thr plus Cys 97 -> Ala (C54T/C97A), which will be referred to as WT* (Matsumura and Matthews, 1989). Methionines were introduced singly at each of the core sites listed in Table II. Multiple methionine variants were then made from different combinations of the single core sites. Among these are a seven-methionine mutant that includes the substitutions L84M/L91M/L99M/L118M/L121M/L133M/F153M, and a ten-methionine mutant that includes the additional replacements I78M/I100M/V103M.
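The slash-separated substitution notation used for the multiple mutants is easy to handle programmatically. The following minimal sketch parses the two lists given above and combines them into the ten-methionine set; the function and variable names are hypothetical and no lysozyme sequence is assumed.

```python
import re

SEVEN_MET = "L84M/L91M/L99M/L118M/L121M/L133M/F153M"
ADDITIONAL_THREE = "I78M/I100M/V103M"

# One-letter wild-type residue, position, one-letter replacement.
SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_substitutions(spec):
    """Turn 'L84M/L91M' into [('L', 84, 'M'), ('L', 91, 'M')]."""
    parsed = []
    for token in spec.split("/"):
        match = SUBSTITUTION.fullmatch(token)
        if match is None:
            raise ValueError(f"not a substitution: {token!r}")
        wild_type, position, replacement = match.groups()
        parsed.append((wild_type, int(position), replacement))
    return parsed


seven_met = parse_substitutions(SEVEN_MET)
# The ten-methionine mutant carries the seven substitutions plus three more.
ten_met = sorted(
    seven_met + parse_substitutions(ADDITIONAL_THREE),
    key=lambda substitution: substitution[1],
)

print(len(seven_met), "substitutions in the seven-methionine mutant")
print(len(ten_met), "substitutions in the ten-methionine mutant")
print("sites:", [position for _, position, _ in ten_met])
```

Keeping the substitutions as (wild type, position, replacement) tuples makes it straightforward to confirm that every replacement is methionine and that no site is listed twice.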
C.
Protein Purification
The single-Met and 7-Met variant proteins were purified according to standard methods (Alber and Matthews, 1987; Poteete et al., 1991; Muchmore et al., 1989). The ten-methionine variant was expressed as inactive inclusion bodies (Rudolph and Lilie, 1996), which were observed directly by phase contrast microscopy in the E. coli cells as refractile bodies. To harvest the inclusion bodies, the cells were pelleted, weighed, and resuspended in 50 mM NaCl, 50 mM Tris-HCl, 10 mM NajHEDTA, 2.5 mM benzamidine-HCl, 1 mM phenylmethanesulfonyl fluoride and 0.1 mM 1,4-dithiothreitol, pH 8.0, at 23°C, using a volume in milliliters equivalent to the number of grams of pellet. After disrupting the cells by three passages through a French press at 20,000 psi, the lysed cells were pelleted for 10 minutes at 8,000 rpm in a JA-20 rotor at 4°C. The pellet was resuspended in ten-fold the apparent pellet volume of Nanopure water, repelleted, resuspended in 0.1 M KCl, 10 mM Tris-HCl, 5 mM MgCl2, pH 8.0, and stirred at room temperature for 30 minutes with DNase I (10 µg/ml). The sample was again pelleted, then resuspended and pelleted two times with ten-fold the pellet volume of water. Ten-fold the pellet volume of water was then added, and the suspension was brought to 10% Triton X-100 and stirred for one hour at 4°C. It was repelleted and washed twice as above. The pellet was resuspended in ten-fold the pellet volume of water, brought to 2.5% octyl-β-D-glucopyranoside and stirred at 4°C for one hour. The sample was again washed twice in water and repelleted. The pellet was solubilized in ten-fold its own volume with 4 M urea, 10 mM glycine and the pH adjusted to 2.0 with concentrated
phosphoric acid at 23°C. The sample was refolded by rapid dilution with one hundred times the pellet volume of 25 mM NaCl, 20 mM Tris-HCl, pH 5.5, at 23°C. Alternatively, refolding was accomplished by slow dialysis of the solubilized sample against 25 mM NaCl, 50 mM Na citrate, pH 5.5, plus 10% glycerol. Any remaining insoluble material was removed by centrifugation for 5 minutes at 8000 rpm. Protein purification from this point was according to standard methods previously described for other soluble lysozyme variants (Poteete et al, 1991). To test for the possible oxidation of the sulfur in the methionine side-chains, the molecular masses of the multiple methionine proteins were determined by electrospray mass spectrometry and were found to agree to within 5 Da with the theoretical values expected on the basis of natural isotopic abundance (7-Met variant expected 18694.82 Da, observed 18698.75 Da; 10-Met variant expected 18762.82 Da, observed 18767.54 Da). This indicates little if any oxidation of the introduced methionines.
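The oxidation check described above amounts to comparing each observed mass with its expected value and asking whether the difference could accommodate an added oxygen atom. The sketch below uses the four masses quoted in the text; the comparison against roughly 16 Da per oxygen (the increment for conversion of one methionine to its sulfoxide) is an interpretive assumption added here, and the variable names are hypothetical.

```python
OXYGEN = 15.999  # Da, average mass of one oxygen atom (one Met sulfoxide)

# (expected, observed) molecular masses in Da, as quoted in the text.
MASSES = {
    "7-Met": (18694.82, 18698.75),
    "10-Met": (18762.82, 18767.54),
}

# Observed minus expected, rounded to the precision of the reported values.
deviations = {
    name: round(observed - expected, 2)
    for name, (expected, observed) in MASSES.items()
}

for name, deviation in deviations.items():
    fraction = deviation / OXYGEN
    print(f"{name}: observed - expected = {deviation:+.2f} Da "
          f"({fraction:.2f} of one oxygen)")
```

Both deviations are below 5 Da, in line with the agreement stated in the text, and well short of the mass that a single oxidized methionine would add.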
D.
Preparation of Cell Wall Fragments
E. coli cell wall fragments were isolated by modification of a previous method (Becktel and Baase, 1985) and without the use of trypsin. Thirty-six grams of freeze-dried E. coli B (Sigma, EC 11303) were suspended in 250 ml of 1.5 M NaCl at 23°C. All subsequent steps were at 23°C except as noted. After French-pressing the slurry at 16,000 psi, the cells were centrifuged at 16,000 g in a Beckman JA-20 rotor for one hour. Upon resuspension in 200 ml of 1 M NaCl, the cells were brought to a boil, cooled, then pelleted at 5,000 g. Suspension and pelleting were repeated three times, the supernatant becoming more clear and less viscous with each cycle. The final resuspension was into Nanopure water to a total volume of 150 ml. The suspension was added slowly to 250 ml of boiling 10% SDS with constant stirring. After cooling in an ice bath to 23°C (below room temperature SDS precipitates), the cell walls were centrifuged at 37,000 g for one hour at room temperature. After the supernatant was discarded, the cell walls were resuspended in water using a teflon pestle tissue grinder (0.15 mm clearance, at least twenty strokes), then centrifuged. The cycle of resuspension and centrifugation was repeated six times with the penultimate and final resuspensions in 66.7 mM
K3/2H3/2PO4, pH 6.88, 0.02% sodium azide. Cell walls made in this manner were stable at 4°C for one month, but will aggregate if frozen.
E.
Enzyme Activity Assays
Activity of the mutant enzymes was initially assessed by halo size on lysis plates (Streisinger et al., 1961) and by a modification of the lysoplate assay of Becktel and Baase (1985) in which a chloroform treated bacterial lawn was used in lieu of embedded peptidoglycan. In order to obtain greater sensitivity, an assay was used based on the increase of circular dichroism (CD) signal observed after peptidoglycan fragments are exposed to lysozyme (Zhang et al., 1995). CD assays were carried out in 1 cm pathlength QS-IU (Hellma) birefringence-free optical cells with a 3 x 6.5 mm teflon-coated stirring flea (VWR). Two ml of 66.7 mM K3/2H3/2PO4, pH 6.8, were mixed with cell wall fragments, prepared as above, sufficient to result in a postreaction signal of -30 millidegrees at 223 nm as measured in a J-720 spectropolarimeter (Jasco). Stirring speed was maintained at 700 rpm using the PTC-348W temperature controller (Jasco) to keep the peptidoglycan fragments in suspension. After thermal equilibration to 20°C, reactions were initiated by manual addition of enzyme and the reaction course was followed by the time evolution of the CD signal at 223 nm. The change in CD with time was analyzed by fitting the early linear portion of the sigmoidal decay curve. These rates scaled with WT* concentration. Relative activities were calculated by dividing the rate of the mutant by the rate of WT* at the same protein concentration.
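The rate analysis described above (a linear fit to the early part of the CD time course, followed by division of the mutant rate by the WT* rate) can be sketched as follows. This is an illustration only: the time courses are invented, idealized straight lines, the 50 s fitting window is an arbitrary assumption, and the function names are hypothetical rather than taken from the original work.

```python
def least_squares_slope(times, signal):
    """Slope of the ordinary least-squares line through (time, signal) points."""
    n = len(times)
    mean_t = sum(times) / n
    mean_s = sum(signal) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, signal))
    return sxy / sxx


def initial_rate(times, signal, window):
    """Fit only the points with time <= window (the early linear portion)."""
    early = [(t, s) for t, s in zip(times, signal) if t <= window]
    if len(early) < 2:
        raise ValueError("need at least two points inside the fitting window")
    early_times, early_signal = zip(*early)
    return least_squares_slope(early_times, early_signal)


def relative_activity(mutant_rate, wild_type_rate):
    """Mutant rate divided by the WT* rate at the same protein concentration."""
    return mutant_rate / wild_type_rate


# Hypothetical CD traces at 223 nm (millidegrees) sampled every 10 s.
times = [0, 10, 20, 30, 40, 50]
wt_signal = [-2.0 - 0.10 * t for t in times]       # invented WT* trace
mutant_signal = [-2.0 - 0.07 * t for t in times]   # invented mutant trace

wt_rate = initial_rate(times, wt_signal, window=50)
mutant_rate = initial_rate(times, mutant_signal, window=50)
activity = relative_activity(mutant_rate, wt_rate)
print(f"relative activity = {activity:.2f}")
```

With real data the fitting window would have to be chosen so that only the linear region of the sigmoidal curve is included, which is a judgment the sketch leaves to the user.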
F.
Thermal Measurements
Protein thermal stabilities were determined by monitoring the circular dichroism at 223 nm as a function of sample temperature (Eriksson et al., 1993; Zhang et al., 1995). The buffer was 0.1 M sodium chloride, 1.4 mM acetic acid, 8.6 mM sodium acetate, pH 5.42, with protein present at 15 to 30 µg/ml. The melting temperature, Tm, of WT* was 65.3°C. Free energy values were computed at 59°C assuming a constant change in heat capacity, ΔCp, of 2.5 kcal/mol-deg.
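The text does not print the expression used to compute free energies at 59°C. One common choice for a two-state unfolding transition with constant ΔCp is the integrated Gibbs-Helmholtz relation, ΔG(T) = ΔHm(1 - T/Tm) + ΔCp[(T - Tm) - T ln(T/Tm)], and the sketch below assumes that form. The enthalpy of unfolding at Tm is not given in this excerpt, so the value of 130 kcal/mol used here is a hypothetical placeholder chosen only to make the example run; the function name is likewise hypothetical.

```python
import math

KELVIN_OFFSET = 273.15


def unfolding_free_energy(temp_c, tm_c, delta_h_m, delta_cp=2.5):
    """Free energy of unfolding (kcal/mol) at temp_c for a two-state transition.

    Assumes the integrated Gibbs-Helmholtz form with a constant heat capacity
    change: delta_h_m in kcal/mol at Tm, delta_cp in kcal/(mol*deg).
    """
    t = temp_c + KELVIN_OFFSET
    tm = tm_c + KELVIN_OFFSET
    enthalpy_term = delta_h_m * (1.0 - t / tm)
    heat_capacity_term = delta_cp * ((t - tm) - t * math.log(t / tm))
    return enthalpy_term + heat_capacity_term


TM_WT = 65.3                      # degrees C, from the text
REFERENCE_TEMP = 59.0             # degrees C, from the text
HYPOTHETICAL_DELTA_H_M = 130.0    # kcal/mol, placeholder, not from the text

dg_at_tm = unfolding_free_energy(TM_WT, TM_WT, HYPOTHETICAL_DELTA_H_M)
dg_at_59 = unfolding_free_energy(REFERENCE_TEMP, TM_WT, HYPOTHETICAL_DELTA_H_M)

print(f"dG at Tm    = {dg_at_tm:.3f} kcal/mol")
print(f"dG at 59 C  = {dg_at_59:.2f} kcal/mol (placeholder enthalpy)")
```

By construction the free energy is zero at Tm; evaluating every variant at a common temperature such as 59°C then allows stabilities to be compared directly, provided each variant's own Tm and enthalpy are supplied.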
G.
Crystallography and Structure Analysis
Determination and evaluation of the structure of the seven-methionine mutant was as described (Gassner et al., 1996). In order to estimate the volume available for each substituted methionine side-chain, a model calculation was carried out based on the coordinates of WT* lysozyme. The coordinates were displayed on a graphics system using FRODO (Jones, 1982), and the side-chain in question truncated to alanine. The volume of the cavity that resulted was then calculated using the program of Connolly (1983). In a prior report (Eriksson et al., 1992) the same procedure was used except that the radius of the β-carbon was assumed to correspond to a methylene rather than a methyl group. This explains why the model cavity volumes quoted here are not identical with those given by Eriksson et al. (1992).
H.
Chemicals
All reagents were "Baker analyzed". pH was measured using a Radiometer PHM84 and Radiometer "S" series standards with a GK2421C electrode.
III. Results and Discussion
A.
Activity
As judged by halo size using modified lysis assay plates (see Materials and Methods), all single variants had close to wild-type levels of activity. The multiple methionine variant halos were 50-70% smaller than those of the single mutants. As measured by the CD-based assay (Figure 1) the activity relative to WT* varied from 70% (I78M) to 106% (L133M) for the single methionine mutants (Table II). Despite alteration of the core composition, the ten methionine variant retained activity equal to about 20% that of wild-type. The activity measurements therefore suggest that the single methionine substitutions described here have a small effect on biological function.
[Figure: plot with data labeled L99M, L133M, L91M, F153M and L118M; axes and caption not recovered]