PROGRESS
IN
Nucleic Acid Research and Molecular Biology Volume 47
edited by

WALDO E. COHN
Biology Division
Oak Ridge National Laboratory
Oak Ridge, Tennessee

KIVIE MOLDAVE
Department of Biology
University of California
Santa Cruz, California

ACADEMIC PRESS, INC.
Harcourt Brace Jovanovich, Publishers
San Diego New York Boston London Sydney Tokyo Toronto
This book is printed on acid-free paper.

Copyright © 1991 BY ACADEMIC PRESS, INC. All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.

Academic Press, Inc.
San Diego, California 92101

United Kingdom Edition published by
ACADEMIC PRESS LIMITED
24-28 Oval Road, London NW1 7DX

Library of Congress Catalog Card Number: 63-15847
ISBN 0-12-540041-1 (alk. paper)

PRINTED IN THE UNITED STATES OF AMERICA
91 92 93 94 9 8 7 6 5 4 3 2 1
Contents

ABBREVIATIONS AND SYMBOLS
SOME ARTICLES PLANNED FOR FUTURE VOLUMES

Molecular Structure and Transcriptional Regulation of the Salivary Gland Proline-Rich Protein Multigene Families
Don M. Carlson, Jie Zhou and Paul S. Wright
I. Background
II. PRP mRNAs and Cell-free Translation Analysis
III. PRP cDNAs and Amino-acid Sequences
IV. Sequence and Structural Analyses of PRP Genes
V. Regulation of Expression of PRP Genes
VI. Functional Aspects of PRPs
...

Recognition of tRNAs by Aminoacyl-tRNA Synthetases
LaDonne H. Schulman
I. Recognition versus Identity
II. Assays of the Amino-acid-acceptor Specificity of ...
III. Role of the Anticodon
IV. Role of the Acceptor Stem and the "Discriminat... at Position 73
V. Other Recognition Profiles
VI. Role of Modified Bases
VII. The Complex of E. coli Glutamine tRNA and Glut...
VIII. tRNA Binding Domains of Other Synthetases
IX. Concluding Remarks
References

Ribosome Biogenesis in Yeast
H. A. Raué and R. J. Planta
I. Transcription of Ribosomal-RNA Genes
II. Expression of Ribosomal-protein Genes
III. Processing and Assembly of Ribosomal Constituents
References

Structural Elements in RNA
Michael Chastain and Ignacio Tinoco, Jr.
I. Secondary Structure
II. Predicting Secondary Structure
III. Tertiary Interactions
IV. Predicting Tertiary Interactions
V. Three-dimensional Structure
VI. Determining RNA Structure
VII. Protein-RNA Interactions
VIII. RNA-RNA Interactions
IX. RNA-DNA Interactions
References

Nuclear RNA-binding Proteins
Jack D. Keene and Charles C. Query
...
RRM Family of Proteins
...

Amplification of DNA Sequences in Mammalian Cells
Joyce L. Hamlin, Tzeng-Horng Leu, James P. Vaughn, Chi Ma and Pieter A. Dijkwel
I. Historical Development of the Amplification Field
II. Occurrence of Amplified DNA Sequences
III. Properties of Amplified DNA
IV. Possible Mechanisms and Ways to Discriminate among Them
V. Usefulness of Cell Lines Bearing Amplified Genes
VI. Conclusions and Future Directions
References

Molecular-Biology Approaches to Genetic Defects of the Mammalian Nervous System
J. Gregor Sutcliffe and Gabriel H. Travis
I. Neural Mutants
II. The rds Gene
III. Secretogranin III
IV. Making Mutants
V. Getting All of the Genes
VI. Reprise
References

Lens Proteins and Their Genes
Hans Bloemendal and Wilfried W. de Jong
I. The Lens and Its Proteins
II. The Lens and Its DNA
III. Concluding Remarks
References

INDEX
Abbreviations and Symbols

All contributors to this Series are asked to use the terminology (abbreviations and symbols) recommended by the IUPAC-IUB Commission on Biochemical Nomenclature (CBN) and approved by IUPAC and IUB, and the Editors endeavor to assure conformity. These Recommendations have been published in many journals (1, 2) and compendia (3) and are available in reprint form from the Office of Biochemical Nomenclature (OBN); they are therefore considered to be generally known. Those used in nucleic acid work, originally set out in section 5 of the first Recommendations (1) and subsequently revised and expanded (2, 3), are given in condensed form in the frontmatter of Volumes 9-33 of this series. A recent expansion of the one-letter system (5) follows.

SINGLE-LETTER CODE RECOMMENDATIONSᵃ (5)

Symbol   Meaning                 Origin of symbol
G        G                       Guanosine
A        A                       Adenosine
T(U)     T(U)                    (ribo)Thymidine (Uridine)
C        C                       Cytidine
R        G or A                  puRine
Y        T(U) or C               pYrimidine
M        A or C                  aMino
K        G or T(U)               Keto
S        G or C                  Strong interaction (3 H-bonds)
Wᵇ       A or T(U)               Weak interaction (2 H-bonds)
H        A or C or T(U)          not G; H follows G in the alphabet
B        G or T(U) or C          not A; B follows A
V        G or C or A             not T (not U); V follows U
Dᶜ       G or A or T(U)          not C; D follows C
N        G or A or T(U) or C     aNy nucleoside (i.e., unspecified)
Q        Q                       Queuosine (nucleoside of queuine)

ᵃ Modified from Proc. Natl. Acad. Sci. U.S.A. 83, 4 (1986).
ᵇ W has been used for wyosine, the nucleoside of "base Y" (wye).
ᶜ D has been used for dihydrouridine (hU or H2Urd).

Enzymes

In naming enzymes, the 1984 recommendations of the IUB Commission on Biochemical Nomenclature (4) are followed as far as possible. At first mention, each enzyme is described either by its systematic name or by the equation for the reaction catalyzed or by the recommended trivial name, followed by its EC number in parentheses. Thereafter, a trivial name may be used. Enzyme names are not to be abbreviated except when the substrate has an approved abbreviation (e.g., ATPase, but not LDH, is acceptable).
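As an illustration of how the single-letter code tabulated earlier in this section can be applied, the following minimal sketch encodes the table as a lookup and tests whether a sequence is compatible with an ambiguous pattern. The names `IUB_CODES`, `matches`, and `expand` are hypothetical and chosen only for this example; Q (queuosine) is omitted, and U is treated as T, which is an assumption of the sketch rather than part of the Recommendations.

```python
# Single-letter nucleoside codes from the table above, written with T for T(U).
# Names are illustrative only; Q (queuosine) is deliberately left out.
IUB_CODES = {
    "G": "G", "A": "A", "T": "T", "C": "C",
    "R": "GA", "Y": "TC", "M": "AC", "K": "GT", "S": "GC", "W": "AT",
    "H": "ACT", "B": "GTC", "V": "GCA", "D": "GAT",
    "N": "GATC",
}


def matches(sequence: str, pattern: str) -> bool:
    """Return True if every base of `sequence` is allowed by the code
    at the same position of `pattern` (U is read as T)."""
    sequence = sequence.upper().replace("U", "T")
    pattern = pattern.upper()
    if len(sequence) != len(pattern):
        return False
    return all(base in IUB_CODES[code] for base, code in zip(sequence, pattern))


def expand(pattern: str) -> list:
    """List every unambiguous sequence described by `pattern`."""
    results = [""]
    for code in pattern.upper():
        results = [prefix + base for prefix in results for base in IUB_CODES[code]]
    return results


if __name__ == "__main__":
    print(matches("GATC", "RWYS"))  # each base is checked against its code
    print(expand("RY"))             # all purine-pyrimidine dinucleotides
```

A lookup keyed by symbol keeps the table and the code in one-to-one correspondence, so a revised Recommendation would only require editing the dictionary.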
REFERENCES

1. JBC 241, 527 (1966); Bchem 5, 1445 (1966); BJ 101, 1 (1966); ABB 115, 1 (1966), 129, 1 (1969); and elsewhere.† General.
2. EJB 15, 203 (1970); JBC 245, 5171 (1970); JMB 55, 299 (1971); and elsewhere.†
3. "Handbook of Biochemistry" (G. Fasman, ed.), 3rd ed. Chemical Rubber Co., Cleveland, Ohio, 1970, 1975, Nucleic Acids, Vols. I and II, pp. 3-59. Nucleic acids.
4. "Enzyme Nomenclature" [Recommendations (1984) of the Nomenclature Committee of the IUB]. Academic Press, New York, 1984.
5. EJB 150, 1 (1985). Nucleic Acids (One-letter system).†

Abbreviations of Journal Titles

Journals                                   Abbreviations used
Annu. Rev. Biochem.                        ARB
Annu. Rev. Genet.                          ARGen
Arch. Biochem. Biophys.                    ABB
Biochem. Biophys. Res. Commun.             BBRC
Biochemistry                               Bchem
Biochem. J.                                BJ
Biochim. Biophys. Acta                     BBA
Cold Spring Harbor                         CSH
Cold Spring Harbor Lab                     CSHLab
Cold Spring Harbor Symp. Quant. Biol.      CSHSQB
Eur. J. Biochem.                           EJB
Fed. Proc.                                 FP
Hoppe-Seyler's Z. Physiol. Chem.           ZpChem
J. Amer. Chem. Soc.                        JACS
J. Bacteriol.                              J. Bact.
J. Biol. Chem.                             JBC
J. Chem. Soc.                              JCS
J. Mol. Biol.                              JMB
J. Nat. Cancer Inst.                       JNCI
Mol. Cell. Biol.                           MCBiol
Mol. Cell. Biochem.                        MCBchem
Mol. Gen. Genet.                           MGG
Nature, New Biology                        Nature NB
Nucleic Acid Research                      NARes
Proc. Natl. Acad. Sci. U.S.A.              PNAS
Proc. Soc. Exp. Biol. Med.                 PSEBM
Progr. Nucl. Acid. Res. Mol. Biol.         This Series

† Reprints available from the Office of Biochemical Nomenclature (W. E. Cohn, Director).
Some Articles Planned for Future Volumes

Phosphotransfer Reactions of Plant Virus Satellite RNAs
GEORGE BRUENING

Positive and Negative Regulation of Gene Expression by Steroid Agonists and Antagonists
ANDREW C. B. CATO, H. PONTA AND P. HERRLICH

Regulation of Gene Expression in Trypanosomes
CHRISTINE CLAYTON

Oligonucleotides as Antisense Inhibitors of Gene Expression
JACK S. COHEN AND M. GHOSH

The DNA Binding Domain of the Zn(II)-containing Transcription Factors
JOSEPH E. COLMAN AND T. PAN

Specific Hormonal and Neoplastic Transcriptional Control of the Alpha 2u Globulin Gene Family
PHILIP FEIGELSON

Cellular Transcriptional Factors Involved in the Regulation of HIV Gene Expression
RICHARD GAYNOR AND C. MUCHARDT

Correlation between tRNA Structure and Efficient Aminoacylation
RICHARD GIEGE, C. FLORENTZ AND J. PUGLISI

snRNA Genes: Transcription by RNA Polymerase II and RNA Polymerase III
NOURIA HERNANDEZ AND S. LOBO

Regulation of mRNA Stability in Yeast
ALLAN JACOBSON

Recombination Enzymes from E. coli and S. cerevisiae
RICHARD KOLODNER

Cell Delivery and Mechanisms of Action of Antisense Oligonucleotides
BERNARD LEBLEU, J. P. LEONETTI AND G. DEGOLS

Signal-transducing G Proteins: Basic and Clinical Implications
MICHAEL A. LEVINE

Synthesis of Ribosomes
LASSE LINDAHL AND J. M. ZENGEL

Enzymes of DNA Repair
STUART LINN

RNA Replication of Plant Viruses Comprising an RNA Genome
ANNE-LISE HAENNI, R. GARGOURI-BOUZID AND C. DAVID

Nitrogen Regulation in Bacteria and Yeast
BORIS MAGASANIK

Alkylation Damage Repair Genes: Molecular Cloning and Regulation of Expression
SANKAR MITRA

An Analysis of Intron Splicing in Monocot Plants
RALPH SINIBALDI AND I. METTLER

trp Repressor, A Ligand-activated Regulatory Protein
RONALD L. SOMMERVILLE

Immunochemical Analyses of Nucleic Acids
DAVID STOLLAR

The Structure and Expressions of the Insulin-like Growth-factor Gene
LYDIA VILLA-KOMAROFF AND K. M. ROSEN
Molecular Structure and Transcriptional Regulation of the Salivary Gland Proline-Rich Protein Multigene Families

DON M. CARLSON,¹ JIE ZHOU² AND PAUL S. WRIGHT³

Department of Biochemistry and Biophysics
University of California-Davis
Davis, California 95616

¹ To whom correspondence may be addressed.
² Present address: Neurological Sciences Institute, Good Samaritan Hospital and Medical Center, Portland, Oregon 97209.
³ Present address: Merrell Dow Pharmaceuticals, Inc., Cincinnati, Ohio 45215.

I. Background
II. PRP mRNAs and Cell-free Translation Analysis
III. PRP cDNAs and Amino-acid Sequences
IV. Sequence and Structural Analyses of PRP Genes
V. Regulation of Expression of PRP Genes
VI. Functional Aspects of PRPs

The proline-rich proteins (PRPs) in mammalian salivary glands are encoded by tissue-specific multigene families whose members have diverged with respect to structure and regulation of expression. A common evolutionary origin of the PRP genes is evident from the extensive conservation of 5'-untranslated regions, coding sequences, and intron/exon organizations. The 42-nucleotide repeat unit CCA CCA CCA CCA GGA GGC CCA CAG CCG AGA CCC CCT CAA GGC has been proposed (1) as the ancestral unit, multiples of three bases probably being recruited into, or deleted from, this ancestral sequence during gene duplication. Gene conversion was possibly the mechanism by which divergence among the internal repeats was homogenized. Two nonallelic mouse PRP genes (MP2 and M14) have essentially identical sequences, with two major differences (2). MP2 has 13 tandemly arranged 42-nucleotide repeats, whereas M14 has 17 repeats. M14 has an insertion by transposition of a two-kilobase member of the long, interspersed elements of repeated mouse DNA (LINE family) into intron I. The 5'-untranslated sequences and regions encoding the signal peptides of all PRP mRNAs, regardless of source, are nearly identical. In another multigene family from rat submandibular glands that encodes contiguous repeat proteins (CRPs) or glutamic acid/glutamine-rich proteins (Glx-rich proteins), the 5'-untranslated sequences and the regions encoding the signal peptides of the mRNAs are 91% identical (nucleotides) and 92% identical (amino acids) to the PRP mRNAs (3, 4).

Two mRNA size-classes, each containing multiple PRP mRNAs, are transcripts from PRP gene families of mice (5), hamsters (6), rats (7), and humans (8). The CRP or Glx-rich multigene family also encodes two size-classes of mRNAs, and this multigene family has the same intron/exon organization as the mouse and rat PRP genes. Cell-free translations show some unusual differences in PRPs encoded by mRNAs from parotid glands of four mouse strains (BALB/cJ, DBA/2J, CD-1, and C57BL/6J) after isoproterenol treatment (5). Reasons for the variations of translation products in these mouse strains after induction of the PRP gene families are unknown.

Repeated administration of the β-agonist isoproterenol causes hypertrophy and hyperplasia of rat and mouse parotid and submandibular glands (9, 10). The morphological changes are accompanied by a dramatic increase, or induction, in the synthesis of PRPs. Typically, these proteins contain 25-45% proline, 18-22% glycine, and 18-22% glutamine and glutamic acid. Aromatic and sulfur-containing amino acids are either very low in amount or absent. Generally, PRPs can be divided into acidic and basic groups, and both groups may be glycosylated and phosphorylated. PRPs may compose more than 70% of the protein in salivary gland extracts after treatment with isoproterenol. All proteins derived from the nucleotide sequences of PRP cDNAs and PRP genes are characterized by four general regions: a signal peptide region, a transition region, a repeat region, and a carboxyl-terminal region (11).
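The 42-nucleotide repeat unit quoted earlier in this introduction can be translated with the standard genetic code to see which 14-residue peptide it encodes. The sketch below is only an illustration: the names `CODON_TABLE`, `REPEAT_UNIT`, and `translate` are hypothetical, and the repeat-region lengths it prints assume that every repeat in MP2 (13 repeats) and M14 (17 repeats) is exactly 42 nucleotides long, which the text does not guarantee for each individual repeat.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# The 42-nucleotide repeat unit as quoted in the text (1).
REPEAT_UNIT = "CCA CCA CCA CCA GGA GGC CCA CAG CCG AGA CCC CCT CAA GGC"


def translate(dna: str) -> str:
    """Translate a DNA coding sequence (spaces ignored) into one-letter amino acids."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore any incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    peptide = translate(REPEAT_UNIT)
    print(peptide)
    # Assumed repeat-region sizes if every repeat is exactly 42 nucleotides long.
    for gene, repeats in (("MP2", 13), ("M14", 17)):
        print(gene, repeats * 42, "nucleotides,", repeats * len(peptide), "residues")
```

The translated peptide, PPPPGGPQPRPPQG, agrees with the prototype 14-amino-acid repeat reported for MP2 in Section IV.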
The apparent tissue-specific synthesis and the appearance of PRPs in saliva in such large quantities, either constitutive (as in humans) or induced by isoproterenol, suggest biological functions in the oral cavity and the gastrointestinal tract. Several functions, such as calcium binding, inhibition of hydroxylapatite formation, and formation of the dental-acquired pellicle, have been attributed to the human salivary PRPs (12). PRPs have an unusually high affinity for such multihydroxylated phenols as tannins; feeding tannins to rats and mice mimics the effects of isoproterenol on the parotid glands (13). The induction of PRP synthesis by dietary tannins clearly results in a protective response against the detrimental effects of the tannins (13). Unlike mice and rats, hamsters do not respond to tannins in the diet by the induction of PRPs. Pronounced detrimental effects are observed in weanling hamsters specifically. When these animals are maintained on a 2% tannin diet for 6 months, they fail to grow (6). Tannins are unusually toxic to weanling hamsters; an increase of tannin in the diet to 4% causes death to most animals within 3 days. The association of tannins with pathological problems, including carcinogenesis and hepatotoxicity, and the influences on growth and toxicity in hamsters, have led to the proposal that PRPs may act as a first line of defense against these multihydroxylated phenols (13).

This review focuses on the biochemistry and molecular biology of the salivary PRPs; it is not intended to be an overall or complete review of PRPs. To those who have contributed to the PRP literature and whose work is not mentioned, we apologize. Previous reviews are used for many references and studies.
I. Background⁴

Salivary glands of various animals synthesize, or can be induced to synthesize, a group of proteins unusually high in proline, the so-called proline-rich proteins (PRPs) (12, 14-20). These proteins collectively constitute the largest group of proteins in human salivary secretions, making up more than 70% of the secreted proteins (12). PRPs may be divided into acidic and basic groups, and members of each group may be phosphorylated or glycosylated, or both. These unusual proteins are constitutive in human saliva, but families of similar proteins are dramatically increased or induced in parotid and submandibular glands of rats, mice, and hamsters by isoproterenol treatment (6, 18, 19, 21).

⁴ Reviews describing mainly the human PRP families are available (12, 22, 23). These unusual proteins were first observed in human saliva by Mandel, Thompson, and Ellison (24) and were first purified and characterized by Bennick and Connell (14) and by Oppenheim, Hay, and Franzblau (15). The genetics of this human multigene family were described in a review by Bennick (23). Other than for comparisons of the human cDNAs and multigene families, this review focuses primarily on the tissue-specific inducible multigene PRP families of mouse, rat, and hamster.

Profound morphological effects on rat parotid glands by isoproterenol treatment were first observed in 1961 (9, 10). Repeated pharmacological doses cause dramatic glandular hypertrophy (Fig. 1). The increase in DNA synthesis with isoproterenol treatment (25, 26) probably results mainly from polyploidy; by 4-5 days, more polyploid than diploid nuclei are seen (Fig. 2) (see 27 for a review on the regulation of salivary gland size and the effects of isoproterenol).

FIG. 1. Hypertrophic effects of isoproterenol treatment on rat salivary glands. Rats (150-200 g of body weight) were injected intraperitoneally with 5 mg of isoproterenol daily for 7 days. The parotid glands (p), submandibular glands (sm), and sublingual glands (sl) were removed from control (bottom) and isoproterenol-treated animals (top). No changes were noted for the sublingual glands, which secrete principally mucous glycoproteins. Parotid glands, which are serous secretors, showed a dramatic increase in weight of about 6- to 10-fold. Submandibular glands are of a mixed cell population and showed an intermediate response to isoproterenol.

The dramatic accumulation of PRPs in the parotid glands of rats treated with isoproterenol was first described in 1974 (16, 18, 28). After 7-10 days of treatment (5 mg of isoproterenol per day), PRPs composed about 70% of the total soluble proteins in parotid gland extracts. Initially, an acidic PRP (pI = 4.5) was identified (Ipr-1A2), and this protein was phosphorylated and glycosylated (16, 18, 19). Subsequently, six basic PRPs unusually high in proline (40-44%), glutamine plus glutamate (22-25%), and glycine (18-20%), containing varying amounts of lysine plus arginine (7-9%), were isolated and characterized (18, 19). Aromatic and sulfur-containing amino acids were either absent or present in very low amounts. Therefore, PRPs have little or no absorbance at 280 nm. Neither hydroxylysine nor hydroxyproline is present, and the treatment of these PRPs with purified prolyl hydroxylase failed to convert proline into hydroxyproline. The molecular weights of the basic proteins, from sedimentation equilibrium, ranged from 15,000 to 18,000, and that of PRP Ipr-1A2 was 25,000. A high apparent molecular weight (71,000) was observed following chromatography on Sephadex G-100, but the unusually high axial ratio (>25) of these proteins undoubtedly caused this value to be substantially overestimated. S values ranged from 1.1 to 1.4. Circular dichroism spectra showed no α-helical or polyproline conformations.
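The statement that PRPs have little or no absorbance at 280 nm follows directly from their composition, because absorbance at that wavelength comes almost entirely from tryptophan and tyrosine, with a small contribution from disulfide bonds. The sketch below illustrates the reasoning with commonly used approximate molar extinction coefficients (5500 for Trp, 1490 for Tyr, and 125 per cystine, in M⁻¹ cm⁻¹); these constants and the names `EPSILON_280` and `extinction_280` are assumptions introduced for this example and are not taken from the studies reviewed here. The 14-residue repeat used as input is the MP2 prototype given in Section IV, and the second peptide is a made-up comparison.

```python
# Approximate molar extinction coefficients at 280 nm (M^-1 cm^-1).
# These are widely used textbook values, assumed here for illustration only.
EPSILON_280 = {"W": 5500, "Y": 1490}
EPSILON_CYSTINE = 125  # contribution per disulfide bond


def extinction_280(sequence: str, disulfides: int = 0) -> int:
    """Estimate the molar extinction coefficient at 280 nm from a
    one-letter amino-acid sequence and a count of disulfide bonds."""
    seq = sequence.upper()
    aromatic = sum(seq.count(aa) * eps for aa, eps in EPSILON_280.items())
    return aromatic + disulfides * EPSILON_CYSTINE


if __name__ == "__main__":
    prp_repeat = "PPPPGGPQPRPPQG"  # prototype 14-residue repeat of MP2
    print(extinction_280(prp_repeat * 13))  # repeat region: no Trp, Tyr, or Cys
    print(extinction_280("YLWF"))           # hypothetical peptide with one Tyr and one Trp
```

A repeat region built only from proline, glycine, glutamine, and arginine therefore contributes nothing at 280 nm, which is why such proteins must be detected by other means.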
FIG. 2. Karyotypes of (a) a mouse bone marrow cell and (b) a mouse parotid gland cell. The chromosomal display of the mouse bone marrow cell showed the normal 2n (= 40) chromosomes after 2 days of isoproterenol treatment. The mouse parotid gland cells (>50% of the cells) showed 4n chromosomes after 2 days of isoproterenol treatment. (Courtesy of Christopher Bidwell.)
II. PRP mRNAs and Cell-free Translation Analysis

Studies by cell-free translation analysis using the reticulocyte lysate system and labeling with [3H]proline or [35S]methionine showed dramatic and definitive changes in the patterns of protein synthesis in parotid glands of isoproterenol-treated rats, and PRP mRNAs were highly elevated in the treated animals (29). There was very little synthesis of PRPs from poly(A)+ RNAs from glands of control rats; poly(A)+ RNAs from the glands of treated animals synthesized mainly PRPs; translation patterns with [3H]proline and [35S]methionine gave identical labeling patterns; and PRPs from cell-free translations were all precipitated by antibodies to PRPs. [35S]Methionine was incorporated only into the initiation site, as determined by sequence analysis and by the fact that PRPs synthesized by tissue slices of parotid glands of isoproterenol-treated rats in the presence of [35S]methionine contained no 35S label. Because most PRPs are acid-soluble, a property first used in the purification procedures of rat submandibular gland PRPs (30), it is imperative that cell-free translation products be precipitated with a solution containing both trichloroacetic and phosphotungstic acids (29).

The induction of PRP mRNAs in the parotid and submandibular glands of both rats and mice by isoproterenol treatment has been demonstrated by Northern and dot-blot hybridizations (21). PRP mRNAs either are very low or are not detectable in the glands of untreated rats and mice. After 4-5 days of isoproterenol treatment, mRNAs encoding these unusual proteins compose over 50% of the total glandular mRNAs (5). For example, plasmid pRP25 does not hybridize with RNAs from control rats (Fig. 3A), but does hybridize with PRP mRNAs of two size-classes, ranging from 600 to 1100 bases, from isoproterenol-treated animals. These size ranges of mRNAs are consistent with all rat RNA preparations tested.
The multiplicity of PRPs encoded by the PRP mRNAs from treated rats is evident from Fig. 3B, since about 12 PRPs were identified by cell-free translation analysis and immunoprecipitation.

FIG. 3. Northern blot of parotid gland RNA from normal and isoproterenol-treated rats and cell-free translations of "sized" PRP mRNAs. (A) Parotid gland RNAs (10 µg) from normal and isoproterenol-treated rats were electrophoresed on a 1.5% agarose gel containing 5 mM methyl mercury hydroxide and transferred to nitrocellulose. The blot was probed with 32P-labeled pRP25 (11). (B) RNA was isolated from a methyl mercury denaturing low-melting-point agarose gel and translated in vitro with [35S]methionine. The translation products were separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. Lanes 1 and 8 show 35S label incorporated in the absence of RNA and with total RNA from the parotid glands of isoproterenol-treated rats, respectively. Lanes 2-7 are the translation products obtained from RNA indicated in (A). Molecular-weight standards (× 10⁻³) are indicated at the right, and nucleotide standards are indicated at the left. [Reprinted with permission from the Journal of Biological Chemistry (5).]

The PRP cDNA insert of pUMP40 (11), prepared from mRNAs from BALB/cJ mice, has been tentatively identified as the transcript of the mouse PRP gene MP2 (1). However, the nucleotide sequences of MP2 and the PRP insert of pUMP40 showed only 98% homology (1). MP2 was cloned from a genomic library prepared from chromosomal DNA from the CD-1 mouse strain. In an attempt to reconcile the heterologous regions and base differences between the CD-1 mouse gene MP2 and the BALB/cJ mouse mRNA, we isolated mRNAs from four mouse strains. Northern blots of total RNA from the parotid glands of mouse strains CD-1 and BALB/cJ and from strains DBA/2J and C57BL/6J, from both control and isoproterenol-treated mice, were probed with 32P-labeled exon IIb (see Fig. 10) of PRP gene MP2 (5). Two major classes of PRP mRNAs were detected in the treated animals. RNA species of about 1050 and 1300 bases for BALB/cJ and DBA/2J mice and about 1100 and 1200 bases for CD-1 and C57BL/6J mice were observed. Cell-free translations of total RNA from these four mouse strains showed interesting and unusual differences in the PRPs synthesized (Fig. 4). Similar labeling patterns were observed with both [3H]proline and [35S]methionine. The amounts or levels of incorporation varied considerably between controls and treated animals, and α-amylase (upper arrow, Fig. 4) and parotid-specific protein (lower arrow, Fig. 4) were dramatically reduced.

FIG. 4. Translation products of RNAs prepared from four different mouse strains. Ten micrograms of total RNA from parotid glands of the mouse strains indicated, both before and after isoproterenol treatment, was translated with either [35S]methionine or [3H]proline; 200,000 cpm of 35S and 50,000 cpm of 3H were applied to gel electrophoresis. α-Amylase and parotid-specific protein are indicated by the upper and lower arrows, respectively. Molecular-weight standards (× 10⁻³) (M.W. Std.) are indicated at the left. [Reprinted with permission from the Journal of Biological Chemistry (5).]

α-Amylase and parotid-specific protein expressions appear to be regulated in concert (31). Earlier cell-free translation experiments (21, 29) and RNA/DNA hybridization results (5) show that α-amylase mRNA is dramatically decreased by isoproterenol treatment. In a related study, a polymorphism in an androgen-regulated single-copy mouse gene (RP2) produced three major mRNAs (32). These RP2 mRNAs differed in the lengths of their untranslated 3' regions as a result of using different polyadenylation sites, and additional variability resulted from the insertion of a member of the mouse B1 family. However, these RP2 polymorphisms had no effect on the translation product. Whether the PRP polymorphisms are the result of different PRP genes or are caused by differential RNA splicing remains to be determined.
III. PRP cDNAs and Amino-acid Sequences

Plasmids containing cDNAs for PRPs were first isolated from a cDNA library prepared from RNA isolated from the parotid glands of isoproterenol-treated rats (7). Four recombinant plasmids (pRP8, pRP18, pRP25, and pRP33) were selected. Several mRNAs hybridized to each PRP cDNA, which emphasized the similarities in nucleotide sequences of the PRP mRNAs. This could have resulted from the expression of a family of closely related genes or from the production of multiple mRNAs from the same or similar genes by different splicing patterns. Whether one or both possibilities are responsible for the multiple PRP mRNAs has not been unequivocally demonstrated. Subsequently, several more PRP cDNAs were cloned from mouse and rat parotid gland mRNAs after isoproterenol treatment (11) and from the human parotid gland (33).⁵

⁵ Differential splicing is considered to contribute to the multiple PRP mRNAs in the human salivary gland (33).

The nucleotide sequence of the PRP cDNA insert of pRP33 (7) encodes the acidic PRP, Ipr-1A2 (Fig. 5). The 13 amino-terminal amino acids are highly hydrophobic and are probably part of a signal peptide (the signal peptide region) (see Fig. 8). The next 60 amino acids (the "transition" region) contain numerous acidic residues, with 10 aspartic acids in the 16-amino-acid sequence of Asp-58 to Asp-73. The "proline-rich" region (residues 80-189) is high in proline, glycine, and glutamine, and includes six repeats of 18 or 19 amino acids (the "repeat" region). The 17 carboxyl-terminal amino acids (the carboxyl-terminal region) contain single residues of tyrosine (-201), tryptophan (-203), and phenylalanine (-204) clustered close to the carboxyl-terminal serine (-206). These data, derived from the nucleotide sequence of pRP33, gave the first complete amino-acid sequence of a PRP. This sequence has been compared (7) with the partial amino-acid sequences reported for the human acidic (34) and basic (12) PRPs.

FIG. 5. The amino-acid sequence derived from primer-extended pRP33, arranged to align similar sequences. [Reprinted with permission from the Journal of Biological Chemistry (7).]

Subsequent data derived from several PRP cDNA and PRP gene sequences show that the first 100 nucleotides in the 5' regions of PRP mRNAs, which contain the 5'-flanking sequence and encode exon I (or the putative signal peptide), have unusually strong homologies (>95% identity) (11) (Fig. 6). Sequence data obtained from a multigene family encoding CRPs from rat submandibular glands (3), proteins that are unusually high in glutamine and glutamate (4), are very similar in this region, 91% of the nucleotides and 92% of the amino acids being identical to the sequences of the PRPs (3).
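The degree of conservation summarized in Fig. 6 can be checked from the per-position difference row in that figure's legend, where each digit gives the number of base changes at one aligned position (0, 1, 2, or 3 for three or more). The sketch below counts the positions with at most one change. The row is copied as it appears in the extracted figure text and may carry transcription errors, and the names `DIFFERENCE_ROW` and `fraction_conserved` are hypothetical.

```python
# Per-position difference row from the Fig. 6 legend, copied as extracted.
# Each digit is the number of base changes at that aligned position
# (3 stands for three or more). Transcription errors are possible.
DIFFERENCE_ROW = (
    "0203000000033022000100301121301000000023013030013033"
    "0003301030010123311010010230111123333333300010"
)


def fraction_conserved(row: str, max_changes: int = 1) -> float:
    """Fraction of aligned positions with no more than `max_changes` base changes."""
    digits = [int(ch) for ch in row if ch.isdigit()]
    return sum(d <= max_changes for d in digits) / len(digits)


if __name__ == "__main__":
    share = fraction_conserved(DIFFERENCE_ROW)
    print(f"{share:.0%} of positions show 0-1 base change")
    print(f"{fraction_conserved(DIFFERENCE_ROW, max_changes=0):.0%} of positions are invariant")
```

The result can be compared with the figure legend's statement that about 65% of the positions show only 0-1 base change.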
IV. Sequence and Structural Analyses of PRP Genes

One of the family of mouse PRP genes was isolated on a cloned 3600-bp EcoRI/BglII-generated DNA fragment from a partial Sau3A bacteriophage library of CD-1 mouse chromosomal DNA (1). The transcriptional unit included three exonic sequences separated by 1434 bp (intron I) and by 450 bp (intron II). The upstream sequence (Fig. 7) had putative induction sites for cAMP (box III and box I) and an activator or enhancer sequence (box II), Z-DNA sequences that flanked an 86-bp sequence, a TATA box, and a CAAT box (1). The derived amino-acid sequence of this PRP gene (MP2) revealed a protein that contained 13 tandemly arranged repeats of 14 amino acids with the prototype sequence P P P P G G P Q P R P P Q G (Fig. 8). Each amino
[Figure 6: nucleotide alignment (positions -30 to 60) with a per-position conservation score; the alignment is not recoverable from the source.]
FIG. 6. Comparisons of the 5’-flanking regions and sequences encoding exon I of PRP cDNAs and PRP genes from mouse (MP2, pUMP40, pUMP12, and pUMP1), rat (pRP25, pRP33, and pRP18), human (CPTI, CPT3, and CP6), hamster (H29), and rat CRPB. The legend indicating differences of 0, 1, 2, and ≥3 denotes the relative conservation of bases at each position. In 65% of the positions, there is only 0-1 base change.

[Figure 7: aligned upstream nucleotide sequences of MP2 and H29; the alignment is not recoverable from the source.]
FIG. 7. Comparisons of upstream sequences of mouse PRP gene MP2 and hamster PRP gene H29. The upstream sequences of MP2 and H29 are aligned to maximize sequence similarities. Putative regulatory regions are indicated. Boxes III and I, MP2, -640 to -623 and -218 to -203, respectively; arrows, AP-1 binding sites; GCCAAT, -513 to -508, CCAAT box.

    1  M L V V L F T V A L L A L S S
   16  A Q G P R E E L Q N Q I Q I P N Q R
   34  P P P S G F Q P R P P V N G S Q Q G
   52  P P P P G G P Q P R P P Q G
   66  P P P P G G P Q P R P P Q G
   80  P P P P G G P Q P R P P Q G
   94  P P P P G G P Q P R P P Q G
  108  P P P P G G P Q Q R P P Q G
  122  P P P P G G P Q P R P P Q G
  136  P P P P G G P Q L R P P Q G
  150  P P P P A G P Q P R P P Q G
  164  P P P P A G P Q P R P P Q G
  178  P P T T - G P Q P R P T Q G
  191  P P P T G G P Q Q R P P Q G
  205  P P P P G G P Q P R P P Q G
  219  P P P P G G P Q P S P T Q G
  233  P P P T G G P Q Q T P P L A G N T ? G
  252  P P Q G R P Q G P R STOP (261)

(Regions labeled in the figure: signal peptide, transition region, repeat region, carboxyl terminus. One residue in the row beginning at 233 is illegible in the source and is shown as "?".)
FIG. 8. Amino-acid sequence of PRP GPMsm derived from the nucleotide sequence of mouse PRP gene MP2.
acid within the repeat had its “favored” codon (1) (Table I), and six amino acids had a total conservation of codons for all 13 repeats. Subsequent studies (2) showed that two nonallelic PRP genes (MP2 and M14) are tandemly arrayed and separated by about 30 kbp. Analysis of DNA sequences suggested that MP2 and M14 arose via gene duplication of a common ancestor. A homology matrix, or “dot-plot,” showed virtually no spurious background, and, aside from three differences, the sequences of the two genes, including the introns, were nearly identical. The differences observed were two additional sequences in intron I of M14 of 223 and 2005 bp, four additional repeats (17 repeats total) in M14, and fractional sequence differences of the simple repetitive sequences (CA, TA, and TAGA) of intron
TABLE I
COMPARISON OF CODON USAGE(a)

  Amino acid(b)   Codon   MP2   Other mouse genes
  Pro (47)        CCU      16    31
                  CCC      16    25
                  CCA      60    38
                  CCG       8     6
  Gly (19)        GGU       0    21
                  GGC      62    26
                  GGA      31    32
                  GGG       7    21
  Gln (17)        CAA      53    35
                  CAG      47    65
  Arg (7)         CGU       6     7
                  CGC       0    13
                  CGA       6    13
                  CGG       6    12
                  AGA      64    31
                  AGG      18    24
  Thr (4)         ACU      23    25
                  ACC       0    36
                  ACA      77    32
                  ACG       0     7
  term            UAA     100     0
                  UAG       0     0
                  UGA       0   100

(a) Reprinted with permission from the Journal of Biological Chemistry (1).
(b) Amino-acid composition in mol%.
I. The additional 2005-bp sequence was the 3’ portion of the mouse LINE element, and it apparently had been transposed into intron I, but in the opposite orientation to M14 (Fig. 9). This mouse LINE sequence (L1Md-PRP), like most mouse LINE elements, is truncated at the 5’ end (35). L1Md-PRP contains the typical polyadenylation signal (AATAAA) and an adenine-rich sequence, and it is flanked by a pair of 10-bp imperfect repeats (TGTCTTTTTT and TGTCTTTCTT). This 10-bp sequence is present only once in MP2. These and other data are strong evidence that L1Md-PRP entered this PRP locus via transposition. Both MP2 and M14 are transcriptionally active in the parotid gland when the mouse is treated with isoproterenol (11). PRP cDNAs pUMP4 and pUMP40 are encoded by M14 and MP2, respectively. The number of tandem repeats within each gene varies, as we indicated (1, 36), and is similar to those reported (37) using variable number of tandem repeats (VNTRs) as markers for mapping human genes. However, PRP tandem repeats are the major body of the active gene, and no sequence similarity exists between this repeat and the invariant core sequence of VNTRs (37, 38).

[Figure 9: restriction map (HindIII, SalI, BamHI, EcoRI) on a 0-80 kb scale showing the genes and the clones; graphic not recoverable from the source.]
FIG. 9. Linkage of mouse PRP genes MP2 and M14. The organization of MP2 and M14 is shown by the expanded scales, and the relative lengths are indicated by the kilobase bar. Solid bars show the three exonic regions, and the open arrow (M14) represents the LINE insert. Arrows show the direction of transcription. [Reprinted with permission from the Journal of Biological Chemistry (2).]

Sequence analysis of a hamster PRP gene (H29) showed that the hamster, rat, mouse, and human PRP genes are all closely related. Mouse and rat PRP genes have two exons encoding PRPs (I and IIb) (Fig. 10), while hamster and human genes have three exons (I, IIa, and IIb). Exon IIa of both the hamster and the human PRP genes comprises 36 bp, seemingly coming from the 5’ sequences of exon IIb of the mouse and rat. Whether this difference in PRP gene organization resulted from a separation or combination of exonic regions is unknown. Upstream regulatory regions of mouse, hamster, and human PRP genes are discussed in Section V. Unlike PRPs from mice and rats, which are all blocked on the amino terminus, hamster PRPs Hp43a and Hp43b were partially sequenced from the amino terminus (39). The open reading frame of exon I (H29) encodes hydrophobic residues, the putative signal peptide. Exon IIa contains only 36 bp, and the derived amino-acid sequence is A T I Y E D S I S Q L S, which is exactly the sequence of the amino terminus of Hp43a, except in position 8, which is D instead of I. Exon IIb contains 514 bp and encodes the mature protein, except for the first 12 amino acids. Exon IIb has 10 HaeIII sites and six Sau96I restriction enzyme sites, which is one of the unusual characteristics of PRP genes (1). One open reading frame of exon IIb encodes a 20-
[Figure 10: schematic of exon organization (exons I, IIa, IIb) for the genes compared, including hamster H29 and human PRH1, drawn 5’ to 3’ with a 1-kb scale bar; graphic not recoverable from the source.]
FIG. 10. Comparison of PRP gene organizations in mice, hamsters, and humans. Related exonic regions (bars) are connected by dashed lines. The gene organizations of hamster H29 (36) and human PRH1 (8) show the additional 36-bp exonic sequence. [Reprinted with permission from the Journal of Biological Chemistry (2).]
amino-acid peptide that is repeated five times and has a prototype sequence of P P Q Q E G Q Q Q N R P P K P Q N Q E G. The first 43 amino acids and the last 12 amino acids derived from the nucleotide sequences of exons IIa and IIb diverge from this prototype repeat pattern and give rise to the transition region and the carboxyl-terminal region, respectively. While the 5’-noncoding regions and the sequences encoding the putative signal peptides (about 100-110 bp) are highly conserved in all PRP mRNAs (Fig. 6), there is a discrepancy in the apparent cleavage site by the signal peptidase in hamster PRPs, as suggested by the amino-acid sequence of Hp43a. From the amino-acid sequence derived from PRP gene H29, the nascent polypeptide chain has the sequence M L V V L L T A A L L A ↓ E H ↑ A T I Y E ----, with Glu (E) and His (H) preceding the apparent site of cleavage (↑) (see the amino-terminal sequence encoded by exon IIa, above, A T I Y E D -----). Histidine has not been observed in position -1 (counting from the cleavage site) (40). We have proposed that the signal peptidase cleaves between Ala and Glu (↓) and that Glu and His are removed by further processing (36). However, this proposal predicts an unusually short signal peptide of only 12 amino acids.
V. Regulation of Expression of PRP Genes

The upstream sequences of mouse PRP genes MP2 and M14 and hamster PRP gene H29 contain potential regulatory elements (Figs. 7 and 11). In each of these genes, three highly conserved regions were identified (boxes I-III) (1). These regions include two putative cAMP-response elements (boxes
[Figure 11: sequence alignment. Only the PRP box III and box I rows are legible in the source and are reproduced below; the rows for the E. coli CRP binding site (41), bovine α-gonadotropin (55), human VIP (56), and human PRH1 and PRH2 (8), and the relative base frequencies of A, T, G, and C, are not recoverable.]

  PRP genes, box III
    Mouse MP2     -640   A A A T G T A A C A G T C A A - A C A   -623   (1)
    Mouse M14     -640   C A A T G T A A C A G T C A A - A C A   -623   (2)
    Hamster H29   -435   G C A T G T A A C A - T C A T G C C A   -418   (36)

  PRP genes, box I
    Mouse MP2     -218   A A A T G T C A G C - A A A T - G C A   -203   (1)
    Mouse M14     -217   A A T T G T C A G C - T A A T - G C A   -202   (2)
    Hamster H29   -114   A A A T G T C A G T - - G A T - G C A    -99   (36)

FIG. 11. Sequence comparisons of the E. coli CRP binding site with putative cAMP regulatory sites (boxes I and III) of PRP gene MP2. Similar sequences in the E. coli CRP binding site are reported in bovine α-gonadotropin, human vasoactive intestinal peptide (VIP), and human PRH1 (PRP) genes. The relative frequencies of A, T, G, and C at each position are indicated. Positions 4, 5, 6, and 8 (TGT-A) are totally conserved. Positions 12, 18, and 19 each show only one substitution: A for T (12) in MP2, and A for C (18) and C for A (19), both found in human PRH1.
I and III) with sequences similar to the CRP binding site required for transcriptional activation of cAMP-regulated genes in Escherichia coli (1, 41) (Fig. 11). Also, box III of the mouse PRP genes contains a sequence (-637 TAACAGTCA -629) which resembles the 8-bp palindromic sequence TGACGTCA, a cAMP response element (CRE) in eukaryotes (42). The palindrome is imperfect by one base (A for G), and it is interrupted by an A. This sequence in box III of the hamster gene is similar, but it lacks a G (TAAC-TGA). Such overlapping sequences of CRP and CRE have been shown to be functionally related (43). Mammalian activating transcription factors (ATF) can bind specifically to some E. coli CRP sites, and, conversely, E. coli CRP-binding protein specifically binds to some mammalian ATF sites (43). Of considerable interest is the observation that multiple AP-1 binding sites (44) are present immediately 3’ to box III in PRP genes MP2 and M14 (Fig. 11). These sequences are not present in the hamster gene. A perfect copy of the AP-1 heptamer, TGACTCA, is located at positions -594 to -588. A totally conserved CCAAT box (GCCAAT) is located at positions -516 to -509 in MP2 and M14. Proteins known to bind to the CCAAT box (CTF/NF-1 proteins) activate both transcription and replication (45). The proline-rich transcriptional activator of CTF/NF-1 is distinct from the replication and DNA binding domain in that it requires an additional carboxyl-terminal domain (46). Preliminary studies using gel-mobility-shift assays (47) have shown that nuclear extracts from the parotid glands of isoproterenol-treated mice have about a 6-fold increase in protein(s) binding to the upstream sequence (-702 to -574 bp) of MP2. “Footprint” assays indicate that the nuclear protein(s) binds to the AP-1 repeats.

Adding Bt2cAMP or forskolin to hamster parotid gland primary-cell cultures resulted in a large increase (i.e., 15- to 30-fold) in PRP mRNA levels (48). The increase was most dramatic between 10 and 18 hr of treatment. Treatment of the cells with cycloheximide blocked this induction of PRP mRNAs, which is added evidence that the synthesis of a trans-acting factor is necessary for the dramatic increase in transcriptional activation of the PRP genes. α-Amylase mRNA was not significantly affected by the cycloheximide treatment.

Transfections have been performed using the plasmid pUMP2BE, which contains the complete MP2 gene, and with various constructs containing deletions of the upstream sequence of MP2. Constructs containing the sequence -702 to -574 bp of MP2 in tandem with the Rous sarcoma virus (RSV) promoter and the chloramphenicol acetyltransferase (CAT) gene showed induction of PRP mRNAs of 2- to 4-fold. Various cell types have been used for the transfection experiments, including PC-12, AtT20, and LM cells. Presently, we are attempting transfection experiments with a parotid-hepatoma cell line prepared by fusion of FTO-2B cells and parotid gland primary cells (49). This may be the only “immortalized” cell line of parotid gland cells available, and these cells may respond more dramatically to transfections with the PRP gene regulatory sequence.
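The comparison of the box III sequence TAACAGTCA with the CRE palindrome TGACGTCA described in this section ("imperfect by one base (A for G)" and "interrupted by an A") can be checked mechanically. The sketch below is illustrative only: the helper names (`reverse_complement`, `mismatches`, `best_single_deletion`) are hypothetical, and the approach of deleting each base in turn and keeping the best fit is our assumption, not a method described in the cited work.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a: str, b: str):
    """List (1-based position, base in a, base in b) where equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def best_single_deletion(site: str, consensus: str):
    """Delete each base of `site` in turn and keep the deletion that leaves
    the fewest mismatches against `consensus` (which is one base shorter)."""
    candidates = []
    for i in range(len(site)):
        trimmed = site[:i] + site[i + 1:]
        candidates.append((len(mismatches(trimmed, consensus)), i, trimmed))
    _, index, trimmed = min(candidates)
    return index, site[index], mismatches(trimmed, consensus)


CRE = "TGACGTCA"        # eukaryotic cAMP response element, from the text
BOX_III = "TAACAGTCA"   # mouse box III sequence (-637 to -629), from the text

# A palindromic DNA site reads the same on both strands.
print("CRE is palindromic:", reverse_complement(CRE) == CRE)

index, extra_base, diffs = best_single_deletion(BOX_III, CRE)
print(f"interrupting base: {extra_base} at 0-based index {index}")
print("remaining mismatches (position, box III, CRE):", diffs)
```

Under these assumptions the best fit removes the central A and leaves a single mismatch at position 2 (A in box III, G in the CRE), in agreement with the description in the text.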
VI. Functional Aspects of PRPs

The high conservation of the sequences and structures of PRP genes and PRPs argues for specific biological functions for these unusual proteins. Some of the proposed functions, such as calcium binding, hydroxylapatite binding, formation of the dental-acquired pellicle, and agglutination of oral bacteria, have been reviewed, especially for human PRPs (12, 22, 23). In 1983, we showed that PRPs, which were dramatically induced in rat parotid glands by feeding tannins, are beneficial to the rat (50). Condensed tannins (proanthocyanidins, oligomers of flavan-3-ols) and hydrolyzable tannins (oligomers of gallic acid) are present in many foods. Antinutritional effects and other toxic and pathological properties, such as carcinogenicity and hepatotoxicity, have been associated with the ingestion of tannins and with the medicinal use of tannins. The general properties of tannins, their effects on biological systems, and specifically their roles in the induction of PRPs have recently been reviewed (13). Seeds of bird-resistant cultivars of sorghum, a major cereal crop of the semi-arid tropics, contain high levels of tannin (high-tannin sorghum), which diminishes the nutritional value of the grain. Studies designed to define the interactions of tannins and proteins show that tannins have an extremely high affinity for proteins rich in proline (51), and that the salivary PRPs have the highest affinity (50). Because the gastrointestinal tract, specifically the oral cavity, is the source of PRPs, it was suggested that salivary PRPs might interact with tannins and serve as a defense against the detrimental effects of dietary tannins.

While the dramatic induction of PRPs in the rat following isoproterenol treatment clearly offsets the usual detrimental effects of dietary tannins (50), parotid glands from rats fed high-tannin sorghum (i.e., 2% of their diet) without isoproterenol treatment were also enlarged about 4-fold, and there was a dramatic increase in PRPs within 3 days. Thus, tannins in the diet mimic the effects of the β-agonist isoproterenol on the parotid glands. There was an initial weight loss on the 2% tannin diet, reversed at 3 days, or at the time of maximal stimulation of PRP synthesis. After this time, the animals grew at close to the normal rate. Amino-acid analysis, electrophoretic patterns of proteins, and cell-free translations of mRNAs all confirmed that the PRPs induced in parotid glands by feeding tannin are identical to those induced by isoproterenol.

Subsequent studies show that the β-antagonists propranolol (a mixed β1,β2-antagonist) and atenolol (a β1-antagonist), when included in the diet, block the induction of PRPs by dietary tannin. Butoxamine, a β2-specific blocker, had no effect, and therefore the β-adrenergic receptor affected by tannin feeding is the β1-receptor. The addition of either propranolol or atenolol to the diet of rats also causes substantial increases in four proteins in the submandibular glands, of MW 145,000 (GP145), 42,000 (P42), 40,000 (P40), and 39,000 (P39) (52). GP145 is glycosylated. These proteins are tissue-specific, as they were not detected in the parotid or sublingual gland, lung, liver, pancreas, kidney, heart, or small intestine either before or after propranolol treatment. We believe that this is the first report on the induction or regulation of protein synthesis by a β-adrenergic blocker.

The hamster was used as another animal model to study the regulation and expression of PRPs. The hamster responded to isoproterenol by the induction of a series of proteins (39). However, the protein encoded by “PRP” gene H29 was unusually high (34%) in Gln and was only 15% Pro.
Also, there was no evidence of a hypertrophic response in the hamster salivary glands. Subsequent studies of feeding tannins to hamsters also showed essentially no hypertrophic response, and PRPs were not induced. Weanling hamsters fed a diet of 2% tannin lose weight for about 3 days, as do rats and mice, but then an unusual growth inhibition occurs (39). Hamsters maintained on a 2% tannin diet failed to grow, and even at 60 days were essentially the same body weight as at 3 days after starting the feeding trial. When diets were switched, the experimental animals gained weight at close to the normal rate for young hamsters, while the control animals, then on a 2% tannin diet, lost about 20% of their weight. Clearly, the detrimental effects of tannins are reversed or inhibited by the induction of PRPs in rats and mice, but hamsters are unusually susceptible to tannins. In fact, increasing the tannin content of the diet to 4% was fatal to most hamsters within 3 days.
VII. Discussion

Specialized cells in eukaryotes variably express different genes during differentiation and development. Exocrine glands, such as the pancreas and the salivary glands, have served as models of secretory tissues. Under ordinary conditions, the salivary glands of adult animals are relatively stable and do not change appreciably in cell size or number (27). However, administration of the catecholamine isoproterenol causes dramatic morphological, cytological, and biochemical changes. Morphologically, the parotid glands can increase up to 10-fold in size. Cytologically, about 50% of the acinar cells are polyploid within 2 days of treatment. Biochemically, a dramatic induction of the multigene family encoding the PRPs is observed. The expression of PRPs for the parotid and submandibular glands is tissue-specific or, possibly more correctly, cell-specific. PRPs have been identified immunochemically in the trachea (53) and the pancreas (54), but there is no evidence that PRP genes in these tissues respond as in the salivary glands to isoproterenol treatment. Small amounts of PRP mRNAs are observed in the mouse pancreas after isoproterenol treatment (C. A. Bidwell and D. M. Carlson, unpublished), but these results were variable. Current data suggest that transcriptional controls, tissue-specific factors, and post-translational modification share the role of principal modulators of the expression of the PRP gene families.
ACKNOWLEDGMENT

These studies were supported in part by NIH grant DK 36812. P.S.W. was supported by NIH training grant T32 HL 7013-13.
REFERENCES

1. D. K. Ann and D. M. Carlson, JBC 260, 15863 (1985).
2. D. K. Ann, M. K. Smith and D. M. Carlson, JBC 263, 10887 (1988).
3. G. Heinrich and J. F. Habener, JBC 262, 5262 (1987).
4. L. Mirels, G. S. Bedi, D. P. Dickison, K. W. Gross and L. A. Tabak, JBC 262, 7289 (1987).
5. D. K. Ann, S. Clements, E. M. Johnstone and D. M. Carlson, JBC 262, 899 (1987).
6. H. Mehansho, D. K. Ann, L. G. Butler, J. Rogler and D. M. Carlson, JBC 262, 12344 (1987).
7. M. A. Ziemer, W. F. Swain, W. J. Rutter, S. Clements, D. K. Ann and D. M. Carlson, JBC 259, 10475 (1984).
8. H. S. Kim and N. Maeda, JBC 261, 6712 (1986).
9. H. Selye, M. Cantin and R. Veilleux, Growth 25, 243 (1961).
10. K. Brown-Grant, Nature 191, 1076 (1961).
11. S. Clements, H. Mehansho and D. M. Carlson, JBC 260, 13471 (1985).
12. A. Bennick, MCBchem 45, 83 (1982).
13. H. Mehansho, L. G. Butler and D. M. Carlson, Annu. Rev. Nutr. 7, 423 (1987).
14. A. Bennick and G. E. Connell, BJ 123, 455 (1971).
15. F. G. Oppenheim, D. I. Hay and P. Franzblau, Bchem 10, 4233 (1971).
16. A. Fernandez-Sorenson and D. M. Carlson, BBRC 60, 249 (1974).
17. D. L. Kauffman and P. J. Keller, Arch. Oral Biol. 24, 249 (1979).
18. J. Muenzer, C. Bildstein, M. Gleason and D. M. Carlson, JBC 254, 5623 (1979).
19. J. Muenzer, C. Bildstein, M. Gleason and D. M. Carlson, JBC 254, 5629 (1979).
20. R. S. C. Wong and A. Bennick, JBC 255, 5943 (1980).
21. H. Mehansho, S. Clements, B. T. Sheares, S. Smith and D. M. Carlson, JBC 260, 4418 (1985).
22. A. Bennick, J. Dental Res. 66, 457 (1987).
23. A. Bennick, J. Dental Res. 68, 2 (1989).
24. I. D. Mandel, R. H. Thompson and S. A. Ellison, Arch. Oral Biol. 10, 499 (1965).
25. T. Barka, Exp. Cell Res. 37, 662 (1965).
26. R. Baserga, FP 29, 1443 (1970).
27. C. A. Schneyer, in “Regulation of Organ and Tissue Growth” (R. J. Gross, ed.), 211 pp. Academic Press, New York, 1972.
28. M. R. Robinovitch, P. J. Keller, D. A. Johnson, J. M. Iverson and D. L. Kaufman, J. Dental Res. 56, 290 (1977).
29. M. A. Ziemer, A. Mason and D. M. Carlson, JBC 257, 11176 (1982).
30. H. Mehansho and D. M. Carlson, JBC 258, 6616 (1983).
31. H. O. Madsen and J. B. Hjorth, NARes 13, 1 (1985).
32. D. King, L. D. Snider and J. B. Lingrel, MCBiol 6, 209 (1986).
33. N. Maeda, H.-S. Kim, E. A. Azen and O. Smithies, JBC 260, 11123 (1985).
34. D. Kauffman, R. Wong, A. Bennick and P. Keller, Bchem 21, 6558 (1982).
35. M. F. Singer and J. Skowronski, TIBS 10, 119 (1985).
36. D. K. Ann, D. Gadbois and D. M. Carlson, JBC 262, 3958 (1987).
37. Y. Nakamura, M. Leppert, P. O’Connell, R. Wolff, T. Holm, M. Culver, C. Martin, E. Fujimoto, M. Hoff, E. Kumlin and R. White, Science 235, 1616 (1987).
38. A. J. Jeffreys, V. Wilson and S. L. Thein, Nature 314, 67 (1985).
39. H. Mehansho, D. K. Ann, L. G. Butler, J. Rogler and D. M. Carlson, JBC 262, 12344 (1987).
40. G. von Heijne, JMB 173, 243 (1984).
41. B. de Crombrugghe, S. Busby and H. Buc, Science 224, 831 (1984).
42. W. J. Roesler, G. R. Vanderback and R. W. Hansen, JBC 263, 9063 (1988).
43. Y.-S. Lin and M. R. Green, Nature 340, 656 (1989).
44. P. K. V. Vogt and T. J. Bos, TIBS 14, 172 (1989).
45. C. Santoro, N. Mermod, P. C. Andrews and R. Tjian, Nature 334, 218 (1988).
46. N. Mermod, E. A. O’Neill, T. J. Kelly and R. Tjian, Cell 58, 741 (1989).
47. J. Zhou and D. M. Carlson, FASEB J. 4, 2131 (1990).
48. P. S. Wright, C. Lenney and D. M. Carlson, J. Mol. Endocrinol. 4, 81 (1990).
49. P. S. Wright and D. M. Carlson, FASEB J. 2, 3104 (1988).
50. H. Mehansho, A. Hagerman, S. Clements, L. G. Butler, J. Rogler and D. M. Carlson, PNAS 80, 3948 (1983).
51. A. Hagerman and L. G. Butler, JBC 256, 4494 (1981).
52. V. N. Subramaniam and D. M. Carlson, FASEB J. 4, 1980 (1990).
53. T. F. Warner and E. A. Azen, Am. Rev. Respir. Dis. 130, 115 (1984).
54. S. Ito, S. Isemura, E. Saitoh, K. Sanada, T. Suzuki and A. Shibita, Acta Endocrinol. 103, 544 (1983).
55. R. G. Goodwin, C. L. Moncman, F. M. Rottman and J. H. Nilson, NARes 11, 6873 (1983).
56. T. Tsukada, J. S. Fink, G. Mandel and R. H. Goodman, JBC 262, 8743 (1987).
Recognition of tRNAs by Aminoacyl-tRNA Synthetases(1)

LADONNE H. SCHULMAN
Albert Einstein College of Medicine
Bronx, New York 10461

I. Recognition versus Identity
II. Assays of the Amino-acid-acceptor Specificity of tRNA
   A. In Vitro Assays
   B. In Vivo Assays
III. Role of the Anticodon
   A. Anticodon Recognition in E. coli tRNAs
   B. Anticodon Recognition in Yeast tRNAs
   C. Summary
IV. Role of the Acceptor Stem and the “Discriminator” Base at Position 73
   A. Alanine Synthetases
   B. E. coli Serine Synthetase
   C. E. coli Glutamine Synthetase
   D. Other E. coli Synthetases
   E. Summary
V. Other Recognition Profiles
   A. E. coli Arginine Synthetase
   B. Yeast Phenylalanine Synthetase
   C. E. coli Phenylalanine Synthetase
   D. Summary
VI. Role of Modified Bases
VII. The Complex of E. coli Glutamine tRNA and Glutamine Synthetase
VIII. tRNA Binding Domains of Other Synthetases
IX. Concluding Remarks
References
The highly specific selection of tRNA substrates by aminoacyl-tRNA synthetases is an intriguing problem in RNA-protein recognition. Synthetases specific for each of the 20 amino acids encounter a pool of tRNAs in the cell having similar overall structures (1-3). Selection of the appropriate tRNAs for attachment of each amino acid occurs by the formation of RNA-protein contacts unique to each cognate tRNA-synthetase pair. The sites in tRNAs that govern these interactions have been investigated by a variety of techniques for over 20 years, and numerous reviews cover this literature (4-6). However, recent technical advances have allowed a burst of new activity in this area, leading to the identification of nucleotide bases required for the recognition of a number of specific tRNAs (7-11). In addition, new studies have begun to reveal the corresponding sites in the synthetases that govern specific tRNA interactions (12-18). This article focuses on these exciting recent developments, dealing first with the tRNA studies, and following with a summary of data on tRNA recognition sites in synthetases.

1 Abbreviations: synthetase, aminoacyl-tRNA synthetase; MetRS, methionyl-tRNA synthetase; GlnRS, glutaminyl-tRNA synthetase, etc.; tRNAMet(CAU), methionine tRNA having the anticodon sequence 5’CAU3’; tRNATrp(CCA), tryptophan tRNA having the anticodon sequence 5’CCA3’, etc.; DHFR, dihydrofolate reductase.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 47. Copyright © 1991 by Academic Press, Inc. All rights of reproduction in any form reserved.
I. Recognition versus Identity

The terms “tRNA recognition” and “tRNA identity” have both been widely used in discussions of tRNA aminoacylation specificity. I use “recognition” here to refer to the specific identification of a cognate tRNA by its corresponding synthetase. The recognition elements in a given tRNA are defined as the set of structural features unique to that tRNA required for its efficient aminoacylation by the cognate synthetase. Recognition elements are identified by structural changes that significantly reduce the efficiency of aminoacylation by the cognate synthetase in in vitro assays. In addition, recognition elements can be transferred to noncognate tRNAs, allowing them to be charged with the corresponding amino acid in vitro and/or in vivo. Further, loss of an important recognition element from an essential tRNA in vivo should lead to a decrease in aminoacylation efficiency sufficient to impair cell growth. In the case of several tRNAs, recognition is now known to require only a small number of specific nucleotides. The prospects are good in the next few years for identification of the major recognition elements in tRNAs specific for all 20 amino acids in Escherichia coli.

“tRNA identity” (19) is the term currently in general use to indicate the amino-acid-acceptor specificity of a tRNA. Identity elements include recognition elements for the cognate synthetase plus additional structural features that prevent recognition of the tRNA by noncognate synthetases. The locations of these positive and negative identity elements coincide when two synthetases recognize the same sites in their cognate tRNAs, but diverge in cases in which the recognition patterns differ. Identity elements need not be conserved among tRNAs specific for the same amino acid, as isoacceptor tRNAs could require different negative elements to protect against mischarging by different noncognate synthetases. Analysis of an extensive set of mutant tRNAs is often required to distinguish between positive and negative identity elements, and mutants having dual or multiple identities can be created readily. At present, the size of the identity set for specific tRNAs is largely unknown. In addition, inherent difficulties in setting up appropriate in vivo assays may hamper complete solution of the tRNA identity problem for some time.
II. Assays of the Amino-acid-acceptor Specificity of tRNA

A. In Vitro Assays

In vitro assays of tRNA charging and mischarging by purified synthetases, together with sequence comparisons of the test tRNAs, were used in earlier studies to attempt to define tRNA recognition elements (6, 20-23). In addition, chemical modification experiments helped to identify sites where structural changes led to loss of recognition by cognate synthetases (5, 6, 24). Nonsense and missense suppressor tRNAs generated in genetic experiments were also isolated from cells for in vitro analysis (25-28). Studies using tRNAs synthesized in vivo were sometimes complicated by the fact that mutations, especially those in the anticodon loop, altered the normal patterns of post-transcriptional base modification, and the effects of the two changes could not easily be distinguished (28). In 1982, a method for in vitro anticodon replacement using T4 RNA ligase was reported that avoids this problem (29). T4 RNA ligase was also used subsequently to introduce mutations near the 3’ end of tRNAs (30) and to join synthetic oligoribonucleotides into intact tRNA structures (31). More recently, wild-type and synthetic tRNA genes have been cloned, and a variety of tRNAs have been overproduced and purified for in vitro studies (32-36). In addition, in vitro transcription of tRNAs using T7 RNA polymerase has allowed the preparation of milligram quantities of specific tRNA sequences for enzymatic assays and physical studies (37-39). Fortunately, these transcripts lacking base modifications have generally been efficient substrates for aminoacyl-tRNA synthetases (Table I), allowing quantitative comparisons of the effects of base changes at specific sites on recognition by purified cognate and noncognate enzymes. This technique has also provided tRNA sequences difficult to obtain from cells due to the effect of mutations on tRNA biosynthesis or cell viability.
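The "relative specificity" used in Table I to compare transcripts with native tRNAs is the ratio [kcat/Km]transcript/[kcat/Km]native (or the same ratio with relative Vmax in place of kcat). The short sketch below shows the arithmetic. The function name `relative_specificity` is hypothetical, and the pairing of the example values with the yeast and E. coli tRNAAsp rows is our reading of Table I, so the numbers should be taken as an illustration rather than as authoritative data.

```python
def relative_specificity(kcat_transcript: float, km_transcript: float,
                         kcat_native: float, km_native: float) -> float:
    """Ratio of specificity constants: (kcat/Km)transcript / (kcat/Km)native.

    kcat may be an absolute rate (s^-1) or a relative Vmax, provided the same
    units are used for both substrates; likewise Km (here in micromolar).
    """
    if min(km_transcript, km_native, kcat_native) <= 0:
        raise ValueError("Km values and native kcat must be positive")
    return (kcat_transcript / km_transcript) / (kcat_native / km_native)


# Assumed reading of Table I, yeast tRNA-Asp:
# native Km 0.044 uM, kcat 0.66 s^-1; transcript Km 0.028 uM, kcat 0.38 s^-1.
yeast_asp = relative_specificity(0.38, 0.028, 0.66, 0.044)

# Assumed reading of Table I, E. coli tRNA-Asp:
# native Km 0.33 uM, transcript Km 0.32 uM, relative Vmax 1.0 for both.
ecoli_asp = relative_specificity(1.0, 0.32, 1.0, 0.33)

print(f"yeast Asp: {yeast_asp:.2f}")
print(f"E. coli Asp: {ecoli_asp:.2f}")
```

A value near 1 indicates that the unmodified transcript is aminoacylated about as efficiently as the native tRNA; values well below 1 indicate a loss of efficiency, whether through a higher Km, a lower kcat, or both.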
TABLE I
COMPARISON OF AMINOACYLATION OF NATIVE tRNAs AND in Vitro SYNTHESIZED tRNA TRANSCRIPTS WITH COGNATE SYNTHETASESᵃ

Organism  tRNA             [Mg2+] (mM)  Apparent Km (µM)  kcatᵇ,ᶜ  Relative specificityᵈ (transcript/native)  Reference
E. coli   Asp  Native      10           0.33              1.0ᶜ     1.0                                        40
               Transcript  10           0.32              1.0ᶜ
Yeast     Asp  Native      15           0.044             0.66ᵇ    0.9                                        41
               Transcript  15           0.028             0.38ᵇ
E. coli   His  Native      10           3.5               1.0ᶜ     0.7                                        42
               Transcript  10           4.0               0.8ᶜ
E. coli   Met  Native      5            0.6               1.0ᶜ     0.5                                        43
               Transcript  8            1.1               0.9ᶜ
Yeast     Phe  Native      15           0.096             1.0ᶜ     0.2                                        37
               Transcript  15           0.380             0.8ᶜ
E. coli   Thr  Native      8            0.06              1.0ᶜ     0.4                                        44
               Transcript  8            0.20              1.2ᶜ
E. coli   Val  Native      5            0.5               1.0ᶜ     0.4                                        43
               Transcript  11           1.0               0.75ᶜ
               Native      15           1.6               1.0ᶜ     0.9                                        45
               Transcript  15           1.6               0.94ᶜ

ᵃ Kinetic parameters vary with aminoacylation conditions. Native tRNAs and tRNA transcripts were assayed in parallel in each case, except for Mg2+ concentration. Transcripts normally require higher [Mg2+] for optimal aminoacylation. It is not clear in all cases whether experiments were carried out at the Mg2+ optimum for the native tRNA or for the transcript. Apparent Km's are given since assays are carried out at suboptimal amino-acid concentrations. See specific references for details.
ᵇ kcat (s⁻¹).
ᶜ Relative Vmax.
ᵈ [kcat/Km]transcript/[kcat/Km]native or [Vmax/Km]transcript/[Vmax/Km]native.

B. In Vivo Assays
1. NONSENSE SUPPRESSION
While much can be learned about tRNA recognition from in vitro experiments, the complete set of tRNA identity elements can only be derived from in vivo studies in which synthetases specific for all 20 amino acids compete for tRNA substrates under normal physiological conditions. Unfortunately, no direct in vivo assay of tRNA amino-acid-acceptor specificity exists. In early in vivo studies using amber suppressor derivatives of E. coli tRNATyr, genetic selections were devised for tRNA base changes that allow suppression of amber codons at sites for which Tyr insertion fails to yield a biologically active protein (46, 47). A number of mutant tRNAs were isolated, all of which inserted Gln (48-51). In 1986, the powerful genetic approach to the study of tRNA identity was revived, taking advantage of automated DNA synthesis to construct genes of E. coli amber suppressor tRNAs containing base changes at any desired location (19, 52, 53). The genes were cloned in a tRNA expression vector, and the amino acid inserted by the suppressor tRNA at the site of specific amber codons in vivo was directly determined by protein sequencing. E. coli dihydrofolate reductase, containing an amber codon at position 10 of the gene, was used as the target protein due to its ease of purification by methotrexate affinity chromatography and the neutral
effect of amino-acid substitutions at position 10 on enzyme activity (Fig. 1). This approach has also recently been widely used by others (54-61).
Much important information on tRNA identity has resulted from these studies; however, there are several problems inherent in such suppression assays. First, the anticodon of the tRNA must be changed to one that is complementary to the nonsense codon. As discussed in greater detail in Section III, such changes frequently affect recognition of the tRNA by one or more synthetases. Second, the efficiency of suppression cannot be directly related to the efficiency of aminoacylation. Suppression efficiency is usually measured by the rate of translation of a nonsense codon by a given suppressor tRNA relative to the rate of translation of a wild-type codon at the same site by an endogenous tRNA. Suppression efficiency can be affected by the level of synthesis of the tRNA, the type and extent of base modifications (particularly in the anticodon loop), and the interaction of the tRNA with ribosomes and translation factors. Evaluation of the effects of some of these factors on suppression efficiency can readily be made (62); however, it is difficult to assess the effect of others.
Several types of data are particularly useful in maximizing the information to be obtained on tRNA identity from in vivo suppression studies. One is knowledge of the relative intracellular levels of tRNAs competing for the same synthetase. It is clear from theoretical considerations (63), as well as from a variety of experimental observations (59, 62, 64), that in vivo mischarging of tRNAs can be brought about by an imbalance of tRNAs and synthetases in the cell. Test tRNAs are normally overproduced in the suppression assay to maximize the amount of suppressed protein generated for sequencing. This "fixes the game" in the intracellular competition between the test tRNA and endogenous tRNAs. On the other hand, some mutations drastically reduce the amount of mature tRNA produced in the cell, complicating the interpretation of the effect of specific mutations on amino-acid specificity when comparing tRNAs expressed at different levels. Another valuable piece of information is knowledge of whether aminoacylation is rate-limiting in the suppression assay. This can be assessed by determining the effect on suppression efficiency of overproducing the synthetase(s) corresponding to the amino acid(s) inserted by the test tRNA.
Genetic screens have also been used to examine tRNA identity in vivo, using cells that require insertion of a specific amino acid at a nonsense codon for normal growth (19, 56, 58, 64). This assay is a minimal test of tRNA identity, however, since low levels of aminoacylated tRNA (below the amount detectable by sequencing dihydrofolate reductase) have been found to support growth in some cases, and in vivo charging with other amino acids is not assessed.
Interesting tRNA mutants with altered amino-acid-acceptor specificity can also be obtained by selection for changes in chromosomal tRNA genes that suppress missense mutations in specific proteins (65, 66). However, such mischarging mutants insert incorrect amino acids at normal translation codons; they are therefore expected to be significantly defective in aminoacylation or some subsequent step in translation, or else they would be highly toxic.

[Figure: aligned amino acid and codon sequences. Wild-type DHFR: Ile Ser Leu Ile Ala Ala Leu Ala Val Asp Arg (ATC AGT CTG ATT GCG GCG TTA GCG GTA GAT CGC). DHFR-amber: TAG codon. Pseudo-DHFR: Met Lys Leu Val Ser Ala (ATG AAG CTT GTA AGC GCG). DHFR-start: NNN codon. DHFR-shift: NNNN codon.]
FIG. 1. The sequence of the amino terminus of wild-type E. coli DHFR (118) and DHFR derivatives (19, 67a), used for in vivo assays of tRNA identity.
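The DHFR-amber target described above can be illustrated with a short Python sketch that translates the amino-terminal codons given in Fig. 1 and replaces the codon at position 10 with the amber codon TAG. Two assumptions are made here that are not stated in the text: the codon list is taken to begin at position 2 (the initiator Met at position 1 is not shown), and the names `WILD_TYPE_CODONS`, `translate`, and `make_amber` are hypothetical. Any additional sequence changes present in the real DHFR-amber construct are not modelled.

```python
# Only the codons needed for this fragment are listed; "amber" marks the
# UAG (TAG in DNA) nonsense codon read by the suppressor tRNA.
CODON_TABLE = {
    "ATC": "Ile", "ATT": "Ile", "AGT": "Ser", "CTG": "Leu", "TTA": "Leu",
    "GCG": "Ala", "GTA": "Val", "GAT": "Asp", "CGC": "Arg", "TAG": "amber",
}

# Amino-terminal codons of wild-type E. coli DHFR as given in Fig. 1.
WILD_TYPE_CODONS = "ATC AGT CTG ATT GCG GCG TTA GCG GTA GAT CGC".split()
FIRST_POSITION = 2  # assumption: position 1 is the initiator Met (not listed)


def translate(codons):
    """Translate a list of DNA codons into three-letter amino acid names."""
    return [CODON_TABLE[codon] for codon in codons]


def make_amber(codons, position, first_position=FIRST_POSITION):
    """Return a copy of `codons` with TAG at the given protein position."""
    index = position - first_position
    if not 0 <= index < len(codons):
        raise ValueError("position lies outside the listed fragment")
    mutant = list(codons)
    mutant[index] = "TAG"
    return mutant


wild_type = translate(WILD_TYPE_CODONS)
amber = translate(make_amber(WILD_TYPE_CODONS, position=10))
for offset, (wt_aa, mut_aa) in enumerate(zip(wild_type, amber)):
    print(FIRST_POSITION + offset, wt_aa, mut_aa)
```

In the assay itself, the residue found at the amber position by protein sequencing reports which amino acid the suppressor tRNA carried.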
2. NEW in Vivo ASSAYS
The inability to assay tRNAs containing wild-type anticodons has been a major problem in in vivo studies of tRNA identity. If tRNAs containing anticodon base changes are efficiently charged by their cognate synthetases or by more than one synthetase, they are toxic to cells, frequently inserting incorrect amino acids into proteins. If the tRNAs are completely converted to the identity corresponding to the new anticodon, the usual assay is lost, since proteins indistinguishable from those synthesized by endogenous tRNAs are made. My group has recently been attempting to develop new in vivo assays that allow studies of tRNAs containing wild-type anticodons (67, 67a). One such assay uses the E. coli initiator Met tRNA (tRNAfMet) as the test tRNA for the effect of specific mutations on amino-acid-acceptor specificity. Plasmid-borne tRNAfMet genes containing non-Met anticodons are expressed in cells harboring a compatible plasmid carrying a target protein gene with a complementary non-Met initiation codon. As in the nonsense suppression assay, dihydrofolate reductase has usually been used as the target protein; however, the gene has been altered to encode six additional
amino-terminal amino acids fused to the second amino acid of the wild-type protein (Fig. 1). Anticodon derivatives of tRNAfMet initiate DHFR synthesis in this system (see below) and produce biologically active proteins that retain the initiating amino acid (67a). The amino-terminal amino acid is identified by protein sequencing to determine the amino-acid-acceptor specificity of the tRNA. An advantage of this system is that anticodon mutants of tRNAfMet are not toxic to cells, because they are unable to participate in polypeptide chain elongation (68, 69). However, it remains to be determined whether this assay will be useful for all non-Met amino acids, and to what extent the structure of the tRNA outside of the anticodon can be altered without complete loss of its initiator function.
A second potentially useful in vivo assay now being explored is the use of frameshift-suppressor tRNAs having eight-membered anticodon loops to translate four-base codons near the amino terminus of the target protein. Several such tRNAs are aminoacylated quite efficiently in vitro (70, 71), and many more function as frameshift-suppressor tRNAs in vivo (72-76). The potential advantage of this system is that any tRNA structure that allows translation can be tested for its effect on tRNA identity in the presence of the wild-type anticodon sequence. The disadvantages include the typically low suppression efficiencies of frameshift tRNAs and the potential for toxic effects caused by frameshifts in essential cellular proteins.
Other in vivo experiments could provide useful information on tRNA identity. Mutant elongator tRNAs that contain the presumed recognition elements and the known anticodon sequence of a noncognate tRNA could be expressed in vivo and the effect on cell growth determined. Normal growth should be observed if the tRNA has been completely converted to a new identity.
A more stringent test of such an "identity swap" would be complementation of cells in which the chromosomal gene(s) for the original tRNA has been inactivated. The latter is the only kind of in vivo experiment that addresses the issue of aminoacylation efficiency as well as specificity. Such studies have not yet been attempted, and might fail in any case due to effects of the mutant tRNA on cell growth unrelated to tRNA identity.
III. Role of the Anticodon
As was pointed out early in the study of tRNA recognition (77), the anticodon is the most logical site to specify the amino acid to be coupled to a tRNA, since this would directly link aminoacylation to the genetic code. Despite considerable evidence in favor of anticodon recognition for a number of tRNAs (reviewed in 5), the importance of the anticodon was not widely accepted until quite recently. Indeed, some current textbooks still state that the anticodon is not a recognition element for tRNAs (78). Arguments against
the involvement of the anticodon in tRNA recognition were largely based on early studies of nonsense suppressors derived from Gln, Leu, Ser, and Tyr tRNAs, which insert the correct amino acid despite the presence of single-base changes in the anticodon. Nevertheless, one of the first amber suppressor tRNAs studied, su+7 tRNATrp containing a CCA→CUA anticodon change, has an ambiguous identity, inserting both Gln and Trp in vivo (79). Subsequent in vitro studies showed that this single mutation affects recognition of the tRNA by both TrpRS and GlnRS, decreasing interaction with the cognate synthetase and increasing interaction with the noncognate synthetase (80, 81). At about the same time, anticodon base changes generated in vitro were shown to have dramatic effects on the recognition of E. coli tRNAfMet (82), tRNAArg (83), and yeast tRNAVal (84), and missense suppressors derived from tRNAGly were found to be highly defective in aminoacylation by E. coli GlyRS (27, 28). These early results were just a prelude of things to come. Kisselev presented an extensive review of the literature on anticodon recognition in this series in 1985 (5). I will attempt to summarize here the data obtained in the following 5 years, with emphasis on E. coli tRNAs, which have been the focus of many of the recent investigations.
A. Anticodon Recognition in E. coli tRNAs
1. In Vitro STUDIES
Tables II and III summarize the effects of anticodon base changes on the recognition of tRNAs by cognate and noncognate synthetases in vitro. Significant reductions in the efficiency of aminoacylation of E. coli tRNAs specific for Arg, Gly, Ile, Met, Phe, Thr, Trp, Tyr, and Val result from base changes in the anticodon. The magnitude of the effect varies with the particular synthetase, and may be related to the total number of recognition elements in a given tRNA. MetRS and ThrRS also strongly protect the anticodons of their cognate tRNAs from chemical modification and nuclease attack (85, 86). Even stronger evidence for the role of the anticodon in recognition of tRNAs by specific synthetases is obtained when mischarging occurs in an anticodon-dependent manner. Substitution of the anticodons of noncognate tRNAs with anticodons corresponding to those of E. coli tRNAs for Arg, Met, Thr, and Val leads to increases of 10^4 to 10^6 in mischarging by the corresponding synthetases (Table III). In addition, 10^3- and 10^5-fold increases in the efficiency of mischarging of tRNAfMet and tRNATrp by E. coli GlnRS are observed when the wild-type anticodons of these tRNAs are replaced with the amber anticodon CUA. The efficiency of aminoacylation of noncognate tRNAs containing Met and Val anticodons by MetRS and ValRS is similar to that of the corresponding cognate tRNAs, suggesting that the
TABLE II
EFFECT OF ANTICODON BASE CHANGES ON in Vitro RECOGNITION OF tRNAs
Organism  tRNA  Anticodon  Apparent Km (µM)  kcat
E. coli
.41a, Ala,
2.2 ND
1.oe ND 1.W
Yeast
Ala,
VIGC (wt) G G C (wt) CUA IGC (wt) IGU AGC GGC GGCC GC ICG (wt) IUG G U C (wt)
E.coli
A%,
Yeast
Asp"
E. coli E. coli
Glu
Gly I
uuc AUC U8UC (wt) UUA U8UU
2.9
0.039 0.326 0.250
0.7~ 0.035e 0.036
100 >100 2.9 >100 >100
1.Of
0.030
Comments
Reduced rate
Mixture of species
1.0 Greatly reduced rate 1.0 1.0
10^2) of single anticodon base changes on the recognition of cognate yeast tRNAs have been observed for yeast Asp, Met, and Val synthetases. Genetic studies have also implicated the anticodon as a recognition element for yeast MetRS (119). Single-base changes in the anticodons of yeast tRNAPhe and tRNATyr decrease the efficiency of aminoacylation by the cognate synthetase in vitro; however, the magnitude of the effect of such changes on recognition by these enzymes is much smaller, in the range of 3- to 14-fold (Table II). Nevertheless, the effects of individual base substitutions are roughly additive, creating large effects on activity in tRNAs containing multiple mutations. In addition, the anticodon bases compose a significant part of the overall recognition set, at least in the case of tRNAPhe (see Section V,B), and switching the anticodons of tRNAPhe and tRNATyr leads to increased mischarging by the corresponding noncognate synthetase (Table III).
Base changes in the anticodon of yeast tRNAAla have little or no effect on the recognition by yeast AlaRS, as was observed for the E. coli Ala synthetase.
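The statement that individual base substitutions are "roughly additive" can be made concrete if additivity is read in the free-energy sense, which is our assumption here: each substitution contributes an independent ΔΔG = RT ln(fold decrease in kcat/Km), so the fold effects multiply. The sketch below uses the 3- and 14-fold limits quoted in the text as a hypothetical pair of substitutions; the temperature of 37 °C and the function names are also assumptions.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def ddg_kcal(fold_decrease, temperature_k=310.15):
    """Free-energy penalty (kcal/mol) equivalent to a fold decrease in kcat/Km."""
    return R_KCAL * temperature_k * math.log(fold_decrease)


def combined_fold(fold_decreases):
    """Fold decrease expected when independent free-energy penalties add."""
    return math.prod(fold_decreases)


single_changes = [3, 14]  # hypothetical pair spanning the quoted 3- to 14-fold range
total = combined_fold(single_changes)
print(f"combined fold decrease: {total}")
print(f"sum of ddG terms: {sum(ddg_kcal(f) for f in single_changes):.2f} kcal/mol")
print(f"ddG of combined effect: {ddg_kcal(total):.2f} kcal/mol")
```

Under this reading, two modest single-site effects already combine into a reduction of more than an order of magnitude, which is consistent with the large effects reported for multiply substituted anticodons.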
C. Summary
The anticodons of 11 E. coli tRNAs clearly contain one or more important recognition elements for cognate synthetases, and it is likely that this number will increase. In addition, two other E. coli tRNAs have been shown to have identity elements in the anticodon. Although the anticodon is not required for the recognition of tRNAAla, and possibly tRNASer, these and all other E. coli tRNAs probably contain identity elements in the anticodon that are crucial for the discrimination of cognate and noncognate tRNAs by synthetases in vivo. Where information is available, yeast tRNAs follow a similar pattern to that observed in E. coli. Thus, it is expected that many eukaryotic tRNAs have retained essential recognition elements in the anticodon as well, as directly demonstrated for bovine tRNATrp (103).
IV. Role of the Acceptor Stem and the "Discriminator" Base at Position 73
Aside from the anticodon, the region of tRNA structure most frequently implicated in synthetase recognition is the domain adjacent to the 3'-terminal CCA sequence, containing the first 3 bp of the acceptor stem (positions 1·72, 2·71, and 3·70) and the fourth base from the 3' end (position 73) (Fig. 2). The latter base has been suggested to be a universal "discriminator" site that assists synthetases in sorting tRNA substrates (120); the acceptor stem has been postulated to be the site of the earliest recognition elements in the evolution of tRNAs (121).
A. Alanine Synthetases
Among E. coli tRNAs, the acceptor stem contains recognition elements for Ala, Gln, His, and Ser, and this region has been implicated in recognition by several other synthetases. The most dramatic results have been obtained with Ala tRNAs, in which a single G3·U70 base-pair is a major recognition element for AlaRS both in vivo (35, 54) and in vitro (35, 36, 59). Amber suppressor tRNA derivatives of Ala tRNAs undergo a large loss of activity on conversion of G3·U70 to a standard Watson-Crick base-pair or on substitution of a U3·G70 sequence (Table VIII). In addition, transfer of the G3·U70 base-pair to amber suppressor derivatives of tRNAPhe, tRNACys, and tRNALys converts each of these tRNAs into efficient Ala-inserting tRNAs. The mutant Cys and Lys tRNAs insert only Ala; however, tRNAPhe(CUA)G3·U70 inserts both Ala and Phe, indicating that PheRS can still recognize the mutant tRNAPhe. Mutations at several additional sites (Table VIII) lead to almost complete conversion of the identity of tRNAPhe to that of Ala. This could result from a more favorable interaction of the altered tRNA with AlaRS and/or a less favorable interaction with PheRS. McClain et al. (56) have shown that sequences other than G3·U70 allow the insertion of Ala into protein by tRNAAla(CUA) with low efficiency. Some of these sequences contain neither G3 nor U70. These workers have therefore suggested that a "helix irregularity," rather than a specific sequence at position 3·70, is recognized by AlaRS. However, in vitro data argue that some feature of the G3·U70 base-pair is specifically recognized, as no activity is observed with tRNAs containing G3·C70, A3·U70, or U3·G70 using high concentrations of purified enzyme (Table IX). A minihelix consisting of the acceptor and T-stems of tRNAAla plus the unmodified T-loop (Fig. 4) is aminoacylated with a specificity only a fifth of that of the native tRNA, indicating that this truncated RNA contains the major determinants for recognition by AlaRS (122). A microhelix containing only the acceptor stem and a seven-membered loop is aminoacylated with a fiftieth of the specificity of the intact tRNA, indicating loss of contacts that improve both binding and the efficiency of aminoacylation.
In vitro footprinting of the tRNAAla·AlaRS complex shows that the enzyme protects phosphates in the acceptor stem on the 3' side of residues 64-70 from nuclease attack (123). Base changes at positions 1·72, 2·71, 5·68, 6·67, and 7·66 in the acceptor stem, and 49·65, 50·64, and 51·63 in the TψC-stem allowed insertion of Ala by tRNAAla(CUA) in vivo (35), indicating that none of these sites is essential for recognition by AlaRS. In addition, base-pairs 5·68, 6·67, 7·66, 49·65, 50·64, and 51·63 are not conserved in amber suppressor derivatives of B. mori and human tRNAAla, which are efficiently aminoacylated by E. coli AlaRS in vitro and insert only Ala into DHFR-amber in E. coli (60). It has been suggested that bases 16, 17, 20, and 60 play a role in Ala identity (54); however, these bases are not conserved in the B. mori and human Ala tRNAs. In addition, the low levels (4-6%) of insertion of Lys brought about by mutations at these sites in E. coli tRNAAla(CUA) would very likely be blocked by the presence of a wild-type Ala anticodon containing G35, making the significance of the Lys mischarging unclear. Sites outside of the acceptor and T-stems are likely to be involved in Ala identity by making negative contacts with noncognate synthetases.
TABLE VIII
EFFECT OF MUTATIONS ON THE RECOGNITION OF tRNAs in Vivo BY E. coli AlaRS
tRNA (anticodon change)
tRNAtL"(GGC)+(CUA)
tRNAPhe(GAA+(CUA)
Additional mutations
Amino acid inserted
(%)a
Ala, 96 Ala, 18; Lys, 29; Gln, 44 Ala, 89 Ala, 55; Lys, 29; Gln, 6 Ala, 83;Gln, 12 Ala, 90 Ala, 75; Lys, 12 Ala, 97
Phe, 100 Ala. 24; Phe, 76 Ala, 63; Phe, 37. Ala, 96; Lys, 4
Suppression efficiency ( % ) b . c 4.126 0.076 0.636 0.3Qb 0.556 0.36h
0.22" 21c Inactive' Inactivec Very weakly activec 12= 786d 106
ND 146
tRNACys(GCA+(CUA)
None c3
tRNAL~5(U,,UU+(CUA)
'
G70*G3
'
u70
None G3
. C70+G3
.
G3 . AiO A3 c i o '
tRNAz'Y(U*CC+(CUA)
None G , C,O+G, None None '
tRNATYr(QUA+(CUA)
u3
3
'
U7"
.
'
7 '0
u3 .
'
7 '0
(1 X (17 x (1 X (17 X
AlaRS)
AlaRS) AlaRS) AlaRS)
CYS Ala, > 9 Y
Xbd
LYS, 94 Ala, 94 Ala, 39; Lys, 49 Ala, 22; Lys, 69 Gly, 16; Gln, 84
31" 34" 1.66
Gly, 5; Gln, 95 Tyr onlyf Tyr onlyf Tyr, 95; Gln, 5.f Ala, 95; Gln, Y
ND
3.6" 24
3" ND ND ND ND
"Insertion into DHFR-amber. The percentage of each amino acid at position 10 is given. Note that this may not correspond to the percentage of aminoacylation of the tRNA in oioo. All tRNAs were overproduced. See Table 11, footnote d, for the definition of anticodon minor bases. bSuppression of the amber allele A,, in the hcZ-Z fusion gene (52). Data are from 56 unless otherwise noted. See this reference for additional mutants. (125). Data are from 35. cSuppression of TrpA(UAG,) dData are from 52. eData are from 35. /Data are from 59.
FIG. 4. (A) Structures of E. coli tRNAAla, minihelixAla, microhelixAla, and minihelixTyr. Base changes from the wild-type sequence are indicated by arrows. (From 8 with permission.) (B) Sites of known major recognition elements in E. coli Ala tRNAs are indicated by arrows.
Retention of very weak Ala-inserting activity by tRNAAla(CUA) mutants containing non-G3·U70 sequences may indicate the presence of additional weak recognition elements at other sites. Sequestering of low levels of aminoacylated tRNA by EF-Tu·GTP may also contribute to the in vivo activity of these mutants. However, it is likely that these weak suppressor tRNAs would show no measurable in vivo activity if they were not significantly overproduced. Two nonstandard base-pairs, G3·A70 and A3·C70, also lead to the insertion of low levels of Ala by tRNALys(CUA) (56), suggesting that mismatches at position 3·70 may assist in adapting the structure of a tRNA to the surface of AlaRS in a manner leading to inefficient aminoacylation in the absence of G3·U70. These results are somewhat reminiscent of those seen in the mischarging of tRNATyr(CUA) acceptor-stem mutants with Gln, where introduction of sequences different from those present in tRNAGln leads to mischarging by GlnRS (see Section IV,C). Several suppressor tRNAs to which the G3·U70 sequence was transferred failed to insert Ala in vivo, including tRNAGly(CUA) and tRNATyr(CUA) (Table VIII). The activity of tRNATyr(CUA)G3·U70 was examined in vitro and found to be a tenth of that of tRNAAla (Table IX). Comparison of the kinetic
TABLE IX
AMINOACYLATION OF RNAs WITH E. coli AlaRSᵃ
RNA  Apparent Km (µM)  kcat (s⁻¹)  kcat/Km (M⁻¹ s⁻¹ × 10⁻⁵)  Relative kcat/Km
tRNA:'"(UGC) tRNAf"(CUA) + U3,4+A, tRNATYr(CUA)+ C3G,o-*G,. U7, MinihelixA'a MinihelixTyrC, G7,+G3 . U7, Minihelixc~'C3. G7O+G3 . u70 + U,,+A7, MicrohelixAla tRNA;'"(CUA)G, U,,+G, C, '4, ' G,o '
MinihelixcW, . G7O+G3 MinihelixAlaA7,+ N7,
c70
u70
2.2 2.9 14.0 9.1 8.8 8.8
1.0 1.8 0.6 0.9 0.5 0.3
35.9
0.3 0.078 0.02 No activity at 4-pM tRNA, 20-pM AlaRS No activity at 4-pM tRNA, 20-pM AlaRS No activity at 4-pLM tRNA, 20-pM AlaRS No activity at 2-pM RNA, 0.75-pM AlaRS Rate A,, S- C7, > U,, > G7,
>90
4.5 6.2 0.43 1.0 0.53 0.32
1.0 1.38 0.10 0.22 0.12 0.07
ᵃ Data are from 36, 59, 122, and 126. The concentration of Ala in the assays is subsaturating; however, this does not greatly affect the kinetics of aminoacylation by AlaRS (127).
parameters for aminoacylation of the tRNA by TyrRS and AlaRS in vitro and estimates of the endogenous levels of the two synthetases suggested that the intracellular concentration of AlaRS was too low to compete effectively with TyrRS for tRNATyr(CUA)G3·U70. Elevation of the AlaRS concentration by introducing a plasmid carrying the alaS gene resulted in insertion of Ala by the tRNA in vivo (Table VIII). The failure of tRNAGly(CUA)G3·U70 to insert Ala in vivo could be due to more favorable interaction of this tRNA with GlnRS (Table VIII) and/or to the presence of negative elements for AlaRS.
A missense suppressor derived from wild-type tRNALys(U*UU) containing a G3·C70→G3·U70 mutation was isolated earlier in genetic studies and shown to insert either Gly or Ala in vivo (124). Suppressor activity is lost when cells containing the mutant tRNA and a temperature-sensitive AlaRS are grown at an intermediate temperature, indicating that Ala is inserted by this tRNALys derivative (F. T. Pagel and E. J. Murgola, unpublished). The mutant Lys tRNA has also been shown to accept Ala in vitro.
The G3·U70 base-pair was predicted by sequence analysis to be a recognition element for E. coli AlaRS (128, 129), as it is uniquely present in Ala tRNAs in E. coli. This structural feature has also been preserved in higher organisms (112). Early studies using reannealed acceptor-stem fragments derived from yeast tRNAAla showed that the acceptor stem contains sufficient information for in vitro aminoacylation by yeast AlaRS (130). The G3·U70 base-pair has recently been shown to be required for in vitro aminoacylation of human and B. mori Ala tRNAs by homologous and heterologous Ala synthetases (60), suggesting that this sequence is an important recognition element for all of the Ala enzymes.
Although tRNACys(CUA)G3·U70 inserts only Ala in vivo, this tRNA was inefficiently aminoacylated by AlaRS in vitro (35). Recent studies (126) indicate that this may be due largely to the presence of U73 in tRNACys. A minihelix containing the acceptor and T-arms of the Cys tRNA plus a change of U73 to A73 was aminoacylated in vitro by AlaRS with a specificity only a third of that of the Ala minihelix, while the U73-containing minihelix was inactive (Table IX). Substitution of any other base in place of the wild-type A73 sequence in the Ala minihelix also resulted in a significant reduction in both the rate and extent of aminoacylation, indicating that A73 is also a recognition element for AlaRS (Fig. 4). The effect of changes at A73 appears to be mainly on kcat, while changes at G3·U70 strongly affect both kcat and Km (36, 126). A change of A73 to U73 in the amber suppressor derivative of tRNAAla has no detectable effect on the identity of the tRNA in vivo, indicating that G3·U70 is a dominant recognition element for this tRNA (Table VIII).
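The argument that AlaRS was too dilute to compete with TyrRS for tRNATyr(CUA)G3·U70 can be sketched with a simple competition model. The model is our assumption, not the calculation reported in the cited work: at tRNA concentrations well below Km, each synthetase charges the tRNA at a rate proportional to (kcat/Km) × [enzyme], so the fraction of tRNA charged by each enzyme is its share of the summed rates. The kcat/Km value for AlaRS is taken from Table IX; the TyrRS value, the enzyme concentrations, and the 17-fold overproduction factor are hypothetical placeholders.

```python
def charging_fractions(enzymes):
    """Return the fraction of tRNA charged by each competing synthetase.

    `enzymes` maps a synthetase name to (kcat_over_km, concentration).
    Assumes tRNA is far below Km for every enzyme, so that each rate is
    proportional to (kcat/Km) * [enzyme].
    """
    rates = {name: spec * conc for name, (spec, conc) in enzymes.items()}
    total = sum(rates.values())
    return {name: rate / total for name, rate in rates.items()}


ALA_SPEC = 0.43e5  # kcat/Km (M^-1 s^-1) for tRNATyr(CUA)G3.U70 with AlaRS, Table IX
TYR_SPEC = 1.0e6   # hypothetical kcat/Km for the same tRNA with TyrRS

# Hypothetical equal enzyme levels (arbitrary units).
normal = charging_fractions({"AlaRS": (ALA_SPEC, 1.0), "TyrRS": (TYR_SPEC, 1.0)})
# Hypothetical 17-fold overproduction of AlaRS from a plasmid-borne alaS gene.
elevated = charging_fractions({"AlaRS": (ALA_SPEC, 17.0), "TyrRS": (TYR_SPEC, 1.0)})

print(f"Ala fraction, normal AlaRS:   {normal['AlaRS']:.2f}")
print(f"Ala fraction, elevated AlaRS: {elevated['AlaRS']:.2f}")
```

The qualitative point survives any reasonable choice of the placeholder values: raising the concentration of the weaker enzyme shifts the balance of charging toward it, which is the behavior seen when AlaRS is overproduced in vivo.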
B. E. coli Serine Synthetase
The in vivo amber suppression assay has been very effectively used to study the structural requirements for recognition of tRNAs by E. coli SerRS. The original identity swap type of experiment was carried out by Normanly et al. (19), inserting 12 base changes into the structure of tRNALeu(CUA) to convert it into a Ser amber suppressor tRNA. Subsequent studies show that only eight of these changes (in addition to the anticodon changes) are required for the Leu→Ser conversion (Fig. 5; 7). Six of the important sites are located near the acceptor end of the tRNA. Four generate sequences conserved in all E. coli Ser tRNAs and bacteriophage T4 tRNASer: G1·C72, G2·C71, and G73. Sequences found at position 3·70 in wild-type Ser tRNAs are either A3·U70 or U3·A70; however, only A3·U70 brought about the desired conversion to Ser identity, possibly by blocking interaction of the tRNA with LeuRS (7). Thus, position 3·70 contains an identity element, but not necessarily a recognition element for SerRS. Substitution of 2 bp at positions 1·72 and 3·70 in tRNASer(CUA) leads to a large loss of Ser suppressor activity and a gain of Gln-inserting activity (61) (Table X). The changes are to sequences important for the recognition of tRNAs by GlnRS (see Section IV,C); however, it is not clear that GlnRS would recognize native tRNASer(U*GA) containing the anticodon base G35 and having the same acceptor-stem mutations. Such a tRNA might retain Ser identity. In addition to the mutations in the acceptor-stem region, the complete conversion of tRNALeu(CUA) to a Ser-inserting tRNA required an additional change at position 11·24 in the D-stem (Fig. 5). A C11·G24 sequence is found
FIG. 5. (A) Base changes involved in the in vivo conversion of E. coli tRNALeu identity from Leu to Ser (7, 19). (B) Composite structure of E. coli and bacteriophage T4 Ser tRNAs (112). Due to uncertainty in the alignment of the D-stem, this region of the selenocysteine-inserting tRNASer has been omitted from the composite (see text). The large variable loop is not conserved in size or sequence. See the legend to Fig. 3 for definition of the symbols.
in T4 tRNASer and all four E. coli Ser tRNAs; however, the Ser tRNA that normally inserts selenocysteine at the site of UGA codons in specific E. coli proteins (131) has unusual D-stem and acceptor-stem structures (132). There are 8 bp in the acceptor stem, followed by two unpaired bases (9 and 10), and a 4-bp D-stem. By the conventional cloverleaf arrangement, the base-pair in the position equivalent to 11·24 in the other Ser tRNAs is a G·C sequence, raising questions about the exact role of the 11·24 base-pair in the recognition of tRNAs by SerRS. Three of the five Leu tRNAs contain C11·G24, suggesting that C·G does not inhibit LeuRS. Thus, structural features other than primary sequence may play a role in the interaction of SerRS with this region of its tRNA substrates. The long variable arm of Ser tRNAs also plays a role in discrimination between cognate and noncognate tRNAs by SerRS (101a). In vitro identity swap experiments designed to convert tRNATyr into a Ser-accepting tRNA suggest that the orientation of the variable arm, rather than its primary sequence, influences the interaction of tRNAs with SerRS. Comparison of the conserved sequences in Ser tRNAs (Fig. 5) suggests
TABLE X
MUTATIONS AFFECTING in Vivo AMINOACYLATION OF tRNAs
tRNA tRNA;*''
tRNAp
Amino acid inserted ( % ) c
Mutations" U*AA+CUA All (see Fig. 5) Omit G I . U,,+G, Omit C, G7,+GP Omit G , . C,,+A, Omit A,,+G,, Omit U,, Az4+Cl, VIGA-CUA VIGA+CUA + GI VIGA+CUA + A,
Leu, 99 Ser, 92 Leu, 15; Gln, Leu, 91; Gln, Leu, 72; Gln, Leu, 99 Leu, 38; Gln, Ser only
C,, C,,
, U . G,, '
'
C7z+U, U,O+G,
BY
'
A72
'
C70
78; Ser, < I 9 6; Ser, 20
39; Ser, 16
Gln, >90; Ser, 5-6
SerRSa Suppression efficiency (%)d.e 52-5gd 33-49d 12d 5-9d ll-12d 20-35d 35-48d 47e 47p
ᵃ Data for tRNALeu are from 7 and 19 and from Normanly and Abelson, unpublished. Data for tRNASer are from 61. ᵇ Unnumbered sequences are anticodon sequences. ᶜ Insertion into DHFR-amber at position 10. ᵈ Efficiency of suppression of derivatives of lacI-Z containing amber mutations at different sites in the lacI portion of the fusion gene. tRNAs were overproduced. ᵉ Efficiency of suppression at the A,, amber allele of lacI-Z. tRNAs were not overproduced.
that there are few other sites outside of the acceptor stem region that could contribute to base-specific recognition by SerRS.
C. E. coli Glutamine Synthetase
In addition to important sites in the anticodon, E. coli GlnRS also recognizes key structural features in the acceptor-stem region of tRNA substrates. Early evidence for this came from genetic studies, in which mutants of an amber suppressor Tyr tRNA were isolated that insert Gln at the site of UAG codons in vivo and accept Gln in vitro (Table XI; 46-52). The first mutation obtained converted A73 to the Gln sequence G73 and led to insertion of Gln, but not Tyr, in vivo. This mutation has the dual effect of increasing activity with GlnRS and reducing activity with TyrRS (49). Mutations at position 1·72 were subsequently isolated that converted the wild-type Tyr G1·C72 sequence to weak or mismatched base-pairs and led to the insertion of both Tyr and Gln in vivo. These sequences did not correspond to the Gln sequence U1·A72, suggesting that an easily disrupted base-pair, rather than a specific primary sequence, favors interaction with GlnRS. This was further suggested by later experiments with an amber suppressor derivative of E. coli tRNAfMet, which contains a C1·A72 mismatch at the 5' terminus and is also a substrate for GlnRS in vitro (97). An A·C mismatch at position 2·71 in
TABLE XI
AMINOACYLATION OF tRNAs BY E. coli GlnRS in Vivo AND in Vitroa

Columns: tRNA; additional mutations; relative Vmax/Kmb; in vitro activity; amino acid inserted in vivoc.
tRNAs: wild-type tRNA2Gln(CUG); tRNAfMet(CAU→CUA); tRNATyr(QUA→CUA); tRNATyr(QUA).
Additional mutations: None; None; A73→G73; G1·C72→A1·C72; G1·U72.
Relative Vmax/Km: 6325 29 9 1 5
In vitro activity: incomplete charging at high [GlnRS]; complete charging at high [GlnRS]; complete charging at high [GlnRS]; complete charging at high [GlnRS]; ND; active at high [GlnRS]; inactive at high [GlnRS]; inactive at high [GlnRS]; rate relative to tRNA2Gln at 10-~ M tRNAe.
Amino acid inserted in vivo: Gln; ND; ND; ND; ND; Gln > Tyrd; Gln; Gln, 20; Tyr, 80; neutral amino acid; Gln, 30; Tyr, 70; Gln and Tyr; Tyr; Tyr; Tyr; + Tyr.

aData on tRNAfMet are from 69 and data on tRNATyr are from 46-51. ND, Not determined.
bApparent Km at subsaturating amino acid concentration.
cAmino acids are inserted by tRNATyr derivatives into T4 am H36 head protein.
dNo Tyr, according to 48; some Tyr, 46.
eData are from 133.
tRNATyr(CUA) also allowed in vivo mischarging by GlnRS, although this sequence change is actually away from the wild-type G2·C71 sequence of tRNAGln. Again, the data suggest that unpairing of the acceptor stem facilitates interaction with GlnRS (49). tRNAfMet(CUA) and the position 1·72 mutants of tRNATyr(CUA) contain A73, indicating that G73 is not essential for GlnRS recognition. This conclusion is also consistent with the results on the Gln-inserting amber suppressor tRNAs (Table IV), where only three of the five tRNAs mischarged by GlnRS have G73 (two have A73). Conversion of the C1·A72 sequence in tRNAfMet(CUA) to a "glutamine" U1·A72 base-pair actually reduces the specificity for aminoacylation by GlnRS (Table XI; 69), suggesting that G73 plays a more important role when no mismatch is present at 1·72. The stronger C1·G72 base-pair further reduces the activity of tRNAfMet(CUA), and in this structural context, G73 is seen to enhance interaction with GlnRS fivefold. The X-ray structure of the tRNAGln·GlnRS complex (12) reveals that the base-pair at position 1·72 is broken, and base-specific contacts are made at both positions 2·71 and 3·70 by GlnRS (see Section VII). G73 is involved in an RNA·RNA contact that facilitates the conformational change at the 3' end of the tRNA. Each of the recognition elements in the acceptor stem contributes to the overall interaction between GlnRS and its tRNA substrates, but none is essential. Of the suppressor tRNAs mischarged by GlnRS, only tRNATrp(CUA) contains all of these elements. In addition, individual changes at each site do not eliminate the Gln acceptor activity of tRNAGln(CUA) in vivo (M. J. Rogers and D. Söll, unpublished). U35 in the anticodon makes a much larger contribution quantitatively to recognition by GlnRS, increasing the specificity of the enzyme for tRNATrp (C35→U35) by 10^5 and for tRNAfMet (A35·U36→U35·A36) by 10^3 (Table III).
Nevertheless, the sum of the recognition elements in the acceptor stem makes a significant contribution to the recognition of tRNAs by GlnRS.
D. Other E. coli Synthetases
tRNAHis is unique among E. coli tRNAs in having only three unpaired bases at the 3' terminus, plus an acceptor stem containing 8 bp (134) (Fig. 6). The role of this unusual structure has been investigated by examination of a series of tRNAHis derivatives prepared by in vitro transcription (Table XII; 42). In one set of experiments, the extra base at the 5' end (designated G-1), which is paired with C73, has been removed to generate a "standard" tRNA 3' terminus. This change causes a large decrease in the specificity for aminoacylation of the tRNA by HisRS, indicating the importance of the structure for recognition by the enzyme. The primary sequence at the -1·73 position is also important for efficient aminoacylation by HisRS. C73 is
FIG. 6. The unusual structure at the acceptor end of E. coli tRNAHis (134). Known major recognition elements (42) are indicated by arrows.
unique to His tRNA in E. coli (112), and conversion to any other nucleotide causes significant loss of activity, whether or not G-1 is present. Substitution of G-1·C73 by an A-1·U73 base-pair also reduces activity to below 0.1%. Most of the observed effect is on the maximal velocity of the reaction. These data indicate that the G-1·C73 base-pair is an important recognition element for E. coli HisRS, affecting the positioning of the 3' end in the catalytic step of the aminoacylation reaction. The extra 5' G, but not the base-pair, has been preserved in the His tRNAs of yeast and higher eukaryotes (112). The 5'-terminal G-1 is encoded in E. coli, but is added post-transcriptionally to the cytoplasmic tRNAHis of higher organisms (135, 136).
The nature of the base at position 73 plays an important role in the recognition of E. coli tRNAAsp by AspRS (Table XIII; 40). Alterations of G73 reduce activity to 1/200 or less. In this case, significant changes in both Km and Vmax are observed. The nucleotides that substitute best for G73 (U > A) share some functional groups with G, suggesting that direct contacts may be made by the enzyme at this site. The discriminator base also plays a role in the recognition of tRNATyr by E. coli TyrRS, since conversion of A73 to G73 greatly reduces the Tyr-insert-
TABLE XII
ROLE OF THE EIGHT-MEMBERED ACCEPTOR STEM IN THE AMINOACYLATION OF E. coli tRNAHis TRANSCRIPTSa
Sequence change: None; G-1·C73→G-1·A73; G-1·U73; G-1·G73; delete G-1; delete G-1 + C73→A73
Apparent Km (µM): 4.0, 3.7, 4.8
Vmax (pmol/min/mg × 10^-2): 100, 7.8, 1.5
10, 6.1, 10, 4.1
Relative V/Km: 1.0, 0.084, 0.013, loo
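The pattern in the legible entries of Table XII can be checked with a short calculation. The Python sketch below assumes that the first three apparent Km values (4.0, 3.7, 4.8 µM) belong to the same rows as the first three Vmax values (100, 7.8, 1.5) and that the first row is the unmodified transcript. That pairing, and all function and variable names, are illustrative assumptions and not part of the published table. The code separates each change in V/Km into a Vmax factor and a Km factor.

```python
# Legible (label, apparent Km in µM, Vmax in table units) entries of Table XII.
# ASSUMPTION: the three Km values and the three Vmax values pair row by row,
# and the first row is the unmodified (reference) transcript.
rows = [
    ("row 1 (reference)", 4.0, 100.0),
    ("row 2", 3.7, 7.8),
    ("row 3", 4.8, 1.5),
]


def relative_efficiency(km, vmax, ref_km, ref_vmax):
    """Return (Vmax/Km) of a variant divided by (Vmax/Km) of the reference."""
    return (vmax / km) / (ref_vmax / ref_km)


def split_effect(km, vmax, ref_km, ref_vmax):
    """Split the change in Vmax/Km into a Vmax factor and a Km factor."""
    vmax_factor = vmax / ref_vmax  # below 1 means a slower maximal velocity
    km_factor = ref_km / km        # below 1 means a higher apparent Km
    return vmax_factor, km_factor


_, ref_km, ref_vmax = rows[0]
results = {}
for name, km, vmax in rows:
    rel = relative_efficiency(km, vmax, ref_km, ref_vmax)
    vmax_factor, km_factor = split_effect(km, vmax, ref_km, ref_vmax)
    # By construction, rel equals vmax_factor * km_factor.
    results[name] = (rel, vmax_factor, km_factor)
    print(f"{name}: relative V/Km = {rel:.4f} "
          f"(Vmax factor {vmax_factor:.3f}, Km factor {km_factor:.2f})")
```

Under this pairing the computed ratios agree, to within rounding, with the printed values 0.084 and 0.013, and nearly all of the loss comes from the Vmax factor while the Km factor stays close to 1.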
V (+mol/min/mg)
Relative
VIK,
~
2.0 0.9 0.03 0.0007 -
10 x 1oR 5 x 106 40 X I@ 1 x 18 1
TABLE XVI
AMINOACYLATION OF tRNA TRANSCRIPTS WITH YEAST PheRSa
tRNA
Yeast tRNAPhe(GAA)
Yeast tRNAPhe(GAA) G20→U20
A7dJ7,
E. coli tRNAPhe(GAA) U,, + c, G7n+G, c70
E. coli tRNAPhe(GAA) Uz,+G,, + c, G7,-rC3 c7,
Yeast tRNAMet(GAA) A,,+G, + G,, . C,+C,, C , + A,g+U5g
Yeast tRNAArg(GAA) Cz~+G,o+ U,, + c59+u59 + C73+A7,
Yeast tRNAArg(GAA) Cm+Czo + U,, + c59+us9
Yeast tRNATyr(GAA) U,u~o_,U,-2+~,, + other changesb
Apparent Km (µM)    kcat (min-1)    Relative kcat/Km
0.35                160             1.0
2.10                80              0.083
                                    0.083
1.80                35              0.042
0.42                100             0.52
0.41                130             0.68
0.38                110             0.64
1.00                60              0.13
0.36                250             1.5
aData are from 37 and 139.
bOther changes: C , . G72+G, . c72, U S A7,+C2 U,, A,, A,, A,,-C,, . G,, and A4,pG&. 0
G,,,
c, . G7o+G3.
c70, c12. G,+
Conversion of G20 to U20 or of A73 to G73 reduces kcat/Km for Phe acceptor activity to 1/12th. G20 is unique to tRNAPhe in yeast and is one of the variable pocket nucleotides. High-resolution NMR studies of tRNAPhe transcripts containing G or U at position 20 indicate that the structures of the two tRNAs are nearly identical (142), suggesting that the change in activity on mutation of this site is due to the loss of specific contacts with PheRS. Footprinting studies of the yeast tRNAPhe·PheRS complex show that the protein contacts the entire surface of the tRNA (143), consistent with the location of widely separated recognition elements. Extensive studies of the interactions involved in maintaining the three-dimensional structure of tRNAPhe reveal that none of the specific bases involved is required for PheRS recognition; however, the proper tertiary structure must be maintained (141). In addition, changes in unbonded bases at positions 16, 17, 59, and 60 have less than a two-fold effect on activity, indicating that G20 is the only required base in the variable pocket. Transfers of the GAA anticodon, G20, and A73 to other yeast tRNAs yield mutants with nearly wild-type Phe acceptor activity (Table XVI). In addition, conversion of U20 in E. coli tRNAPhe to G20 makes the E. coli tRNA a good substrate for the enzyme. Comparison of the sequences of all of the tRNAs that are efficient substrates for PheRS (Fig. 8) indicates that there are few additional conserved sites.
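The relative kcat/Km column of Table XVI follows directly from the other two columns, and the 1/12th figure quoted above can be recovered the same way. The sketch below is a minimal Python check that uses only the rows for which both Km and kcat are printed, in the order in which they appear. Treating the first row as the wild-type reference is an assumption, as are all names in the code.

```python
# (apparent Km in µM, kcat in min^-1, printed relative kcat/Km), in table order.
# ASSUMPTION: the first row is the wild-type reference transcript.
TABLE_XVI = [
    (0.35, 160, 1.0),
    (2.10, 80, 0.083),
    (1.80, 35, 0.042),
    (0.42, 100, 0.52),
    (0.41, 130, 0.68),
    (0.38, 110, 0.64),
    (1.00, 60, 0.13),
    (0.36, 250, 1.5),
]


def specificity_constant(km, kcat):
    """Return kcat/Km, the usual measure of catalytic efficiency."""
    return kcat / km


def relative_specificity(rows):
    """Return kcat/Km of every row divided by kcat/Km of the first row."""
    ref = specificity_constant(rows[0][0], rows[0][1])
    return [specificity_constant(km, kcat) / ref for km, kcat, _ in rows]


computed = relative_specificity(TABLE_XVI)
for (km, kcat, printed), value in zip(TABLE_XVI, computed):
    print(f"Km={km:<5} kcat={kcat:<4} computed={value:.3f} printed={printed}")

# Fold reduction for the second row (printed 0.083, i.e. about 1/12).
fold_reduction = 1 / computed[1]
```

The recomputed ratios agree with the printed column to within the rounding of the inputs, and the second row corresponds to a twelvefold reduction in kcat/Km.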
FIG. 8. Composite structure of tRNAs aminoacylated by yeast PheRS with kinetics similar (within a factor of 2 or 3) to cognate yeast tRNAPhe (139-141). Arrowheads indicate major known recognition elements. See the legend to Fig. 3 for definition of the symbols. (Adapted from 139 with permission.)
C. E. coli Phenylalanine Synthetase
Both in vivo and in vitro studies indicate that the anticodon contains major recognition elements for E. coli PheRS (Tables V and XVII; 67, and E. F. Tinkle and O. C. Uhlenbeck, unpublished). In vivo experiments using the amber suppressor derivative of tRNAPhe have suggested that additional sites, including U20, G27·C43, G28·C42, G44, U45, U59, U60, and A73, contribute to Phe identity based on mischarging of tRNAs containing mutations at these sites (54, 57). Since tRNAPhe(CUA) retains its identity in vivo with only one
TABLE XVII
AMINOACYLATION OF E. coli tRNAPhe TRANSCRIPTS WITH E. coli PheRSa
Apparent Km (µM)
Mutation None C3 ' G7n-tC3 G o G3 C, + GAA-tAAA CUA '
+
ul6-tcl6
+ Cl7-tUl7
+ U2n-tGzo +
'27
'
c43+A27
+ U45+G, + u.5Ll-tc.59
+ u,+c, +
A73-tG73 c73
u73
'
'43
Relative
k,
0.20 0.11
100 70
0.22 0.19 1.10 0.30 0.11 0.95 0.19 0.23 0.35 0.35
80 80 110 100 110 70 70 80 80
70
Relative
k,llK, 1.0 1.2