PROGRESS IN
Nucleic Acid Research and Molecular Biology
edited by
WALDO E. COHN
Biology Division
Oak Ridge National Laboratory
Oak Ridge, Tennessee
KIVIE MOLDAVE
Department of Molecular Biology and Biochemistry
University of California, Irvine
Irvine, California
Volume 52
ACADEMIC PRESS San Diego New York Boston London Sydney Tokyo Toronto
This book is printed on acid-free paper.
Copyright © 1996 by ACADEMIC PRESS, INC.
All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.
Academic Press, Inc.
A Division of Harcourt Brace & Company 525 B Street, Suite 1900, San Diego, California 92101-4495
United Kingdom Edition published by Academic Press Limited 24-28 Oval Road, London NW1 7DX
International Standard Serial Number: 0079-6603
International Standard Book Number: 0-12-540052-7
PRINTED IN THE UNITED STATES OF AMERICA
96 97 98 99 00 01 BB 9 8 7 6 5 4 3 2 1
Abbreviations and Symbols
All contributors to this Series are asked to use the terminology (abbreviations and symbols) recommended by the IUPAC-IUB Commission on Biochemical Nomenclature (CBN) and approved by IUPAC and IUB, and the Editors endeavor to assure conformity. These Recommendations have been published in many journals (1, 2) and compendia (3); they are therefore considered to be generally known. Those used in nucleic acid work, originally set out in section 5 of the first Recommendations (1) and subsequently revised and expanded (2, 3), are given in condensed form in the frontmatter of Volumes 9-33 of this series. A recent expansion of the one-letter system (5) follows.
SINGLE-LETTER CODE RECOMMENDATIONS(a) (5)
Symbol    Meaning                 Origin of symbol
G         G                       Guanosine
A         A                       Adenosine
T(U)      T(U)                    (ribo)Thymidine (Uridine)
C         C                       Cytidine
R         G or A                  puRine
Y         T(U) or C               pYrimidine
M         A or C                  aMino
K         G or T(U)               Keto
S         G or C                  Strong interaction (3 H-bonds)
W(b)      A or T(U)               Weak interaction (2 H-bonds)
H         A or C or T(U)          not G; H follows G in the alphabet
B         G or T(U) or C          not A; B follows A
V         G or C or A             not T (not U); V follows U
D(c)      G or A or T(U)          not C; D follows C
N         G or A or T(U) or C     aNy nucleoside (i.e., unspecified)
Q         Q                       Queuosine (nucleoside of queuine)
(a) Modified from Proc. Natl. Acad. Sci. U.S.A. 83, 4 (1986).
(b) W has been used for wyosine, the nucleoside of "base Y" (wye).
(c) D has been used for dihydrouridine (hU or H2Urd).
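As an illustration of how the one-letter system can be applied, the following minimal Python sketch encodes the table for the RNA alphabet (U in place of T) and tests whether a concrete sequence is consistent with a pattern written in the ambiguity symbols. The names IUPAC_RNA, expand, and matches are hypothetical and chosen only for this example; Q (queuosine) is omitted because it denotes a modified nucleoside rather than a set of alternatives.

```python
# Single-letter nucleoside code from the table above, RNA alphabet (U for T).
# Names are illustrative only; Q (queuosine) is deliberately left out.
IUPAC_RNA = {
    "G": "G", "A": "A", "U": "U", "C": "C",
    "R": "GA",   # puRine
    "Y": "UC",   # pYrimidine
    "M": "AC",   # aMino
    "K": "GU",   # Keto
    "S": "GC",   # Strong interaction (3 H-bonds)
    "W": "AU",   # Weak interaction (2 H-bonds)
    "H": "ACU",  # not G
    "B": "GUC",  # not A
    "V": "GCA",  # not U
    "D": "GAU",  # not C
    "N": "GAUC", # aNy nucleoside
}


def expand(symbol):
    """Return the set of nucleosides that a one-letter symbol stands for."""
    return set(IUPAC_RNA[symbol.upper()])


def matches(pattern, sequence):
    """True if every residue of `sequence` is allowed by the symbol at the
    same position of `pattern` (both strings must have equal length)."""
    if len(pattern) != len(sequence):
        return False
    return all(base.upper() in expand(sym)
               for sym, base in zip(pattern, sequence))
```

For instance, the pattern GRY accepts GAU (A is a purine, U a pyrimidine) but rejects GCU, because C is not a purine.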
Enzymes
In naming enzymes, the 1984 recommendations of the IUB Commission on Biochemical Nomenclature (4) are followed as far as possible. At first mention, each enzyme is described either by its systematic name or by the equation for the reaction catalyzed or by the recommended trivial name, followed by its EC number in parentheses. Thereafter, a trivial name may be used. Enzyme names are not to be abbreviated except when the substrate has an approved abbreviation (e.g., ATPase, but not LDH, is acceptable).
REFERENCES
1. JBC 241, 527 (1966); Bchem 5, 1445 (1966); BJ 101, 1 (1966); ABB 115, 1 (1966), 129, 1 (1969); and elsewhere. General.
2. EJB 15, 203 (1970); JBC 245, 5171 (1970); JMB 55, 299 (1971); and elsewhere.
3. "Handbook of Biochemistry" (G. Fasman, ed.), 3rd ed. Chemical Rubber Co., Cleveland, Ohio, 1970, 1975, Nucleic Acids, Vols. I and II, pp. 3-59. Nucleic acids.
4. "Enzyme Nomenclature" [Recommendations (1984) of the Nomenclature Committee of the IUB]. Academic Press, New York, 1984.
5. EJB 150, 1 (1985). Nucleic Acids (One-letter system).
Abbreviations of Journal Titles
Journals                                   Abbreviations used
Annu. Rev. Biochem.                        ARB
Annu. Rev. Genet.                          ARGen
Arch. Biochem. Biophys.                    ABB
Biochem. Biophys. Res. Commun.             BBRC
Biochemistry                               Bchem
Biochem. J.                                BJ
Biochim. Biophys. Acta                     BBA
Cold Spring Harbor                         CSH
Cold Spring Harbor Lab                     CSHLab
Cold Spring Harbor Symp. Quant. Biol.      CSHSQB
Eur. J. Biochem.                           EJB
Fed. Proc.                                 FP
Hoppe-Seyler's Z. Physiol. Chem.           ZpChem
J. Amer. Chem. Soc.                        JACS
J. Bacteriol.                              J. Bact.
J. Biol. Chem.                             JBC
J. Chem. Soc.                              JCS
J. Mol. Biol.                              JMB
J. Nat. Cancer Inst.                       JNCI
Mol. Cell. Biol.                           MCBiol
Mol. Cell. Biochem.                        MCBchem
Mol. Gen. Genet.                           MGG
Nature, New Biology                        Nature NB
Nucleic Acid Research                      NARes
Proc. Natl. Acad. Sci. U.S.A.              PNAS
Proc. Soc. Exp. Biol. Med.                 PSEBM
Progr. Nucl. Acid. Res. Mol. Biol.         This Series
Structure, Reactivity, and Biology of Double-Stranded RNA¹
ALLEN W. NICHOLSON
Department of Biological Sciences
Wayne State University
Detroit, Michigan 48202
I. Biological Origins of dsRNA
II. Experimental Criteria for dsRNA
III. Structure and Dynamics of dsRNA
IV. Protein Recognition of dsRNA
V. Chemical Stability of dsRNA
VI. Enzymatic Cleavage of dsRNA
    A. Ribonuclease III
    B. Cobra Venom Ribonuclease (RNase V1)
    C. dsRNase Activities Mechanistically Related to Pancreatic RNase
VII. dsRNA Function in Prokaryotes
    A. Gene Regulation by Ribonuclease III
    B. dsRNA and Antisense Regulation
VIII. dsRNA Function in Eukaryotes
    A. dsRNA and hnRNA
    B. dsRNase Activities
    C. Other dsRNA-specific Activities
IX. dsRNA and the Interferon System
    A. The dsRNA-activated Protein Kinase
    B. The dsRNA-activated 2'-5'A Synthetase
    C. dsRNA and Mammalian Cell Signal Transduction
X. Cellular and Physiological Effects of dsRNA, and Therapeutic Applications
XI. Conclusions and Prospects
References
Note Added in Proof
¹ Abbreviations: AFM, atomic force microscopy; Da, dalton; ds, double-stranded; dsRBD, double-stranded RNA-binding domain; hnRNA, heterogeneous nuclear RNA; hnRNP, heterogeneous nuclear ribonucleoprotein; HIV, human immunodeficiency virus; IFN, interferon; IL, interleukin; M-MuLV, Moloney murine leukemia virus; RSV, Rous sarcoma virus; RTase, reverse transcriptase; SD, Shine-Dalgarno; snRNA, small nuclear RNA; ss, single-stranded; TIR, translation initiation region; ts, triple-stranded; 5'-UTR and 3'-UTR, 5' and 3' untranslated regions, respectively; UV, ultraviolet.
Progress in Nucleic Acid Research and Molecular Biology, Vol. 52
Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.
The RNA double helix is a ubiquitous structural motif in living organisms. Double-stranded (ds) RNA² is created by a number of biosynthetic pathways, and is subsequently degraded, denatured, or specifically modified by enzymatic activities. It also serves as a stable repository of genetic information for many viruses. The diverse functional roles of dsRNA have spurred intensive studies on the biochemical processes that involve dsRNA. dsRNA is also being examined as an agent that changes gene expression patterns and alters cell physiology, as well as a potential therapeutic agent in fighting disease.
In addition to providing answers to intriguing biological phenomena, ongoing studies on dsRNA have prompted new questions. How do the physical properties of the RNA double helix establish biological function? How is dsRNA specifically recognized by proteins? What are the pathways of dsRNA formation and breakdown in vivo? How does dsRNA participate in signal transduction pathways?
I intend to address these questions, and to frame new ones prompted by recent findings. I focus on the structure and physicochemical properties of dsRNA; on the enzymes that degrade, modify, or otherwise modulate dsRNA structure and function; and on protein motifs that specifically recognize dsRNA. The metabolism and regulatory functions of dsRNA in the prokaryotic cell are discussed, as are the functions of dsRNA and dsRNA-specific enzymes in eukaryotic cells. Finally, the mammalian cellular and physiological response to dsRNA and the prospects of dsRNA as a therapeutic agent are considered.
Due in part to space limitations, this review does not examine the role of dsRNA as a structural component of macromolecular complexes, nor (except for antisense RNA) does it discuss the myriad short, transiently formed dsRNA segments that are essential features of many biological processes (for example, the base-pairing of the prokaryotic mRNA translation initiation region with the 3' end of 16S rRNA, or between eukaryotic U1 snRNA and the 5' splice site of group II introns). I also do not discuss the structures, genetic organization, and replication strategies of viruses with dsRNA genomes, nor summarize the extensive studies on dsRNA isolated from virus-infected plants. Specific aspects of the structure and biological properties of dsRNA have been examined in several previous reviews (1-4).
² The term double-stranded (ds) RNA refers to the antiparallel right-handed double helix, in which the two Watson-Crick base-pairs (G-C and A-U) are predominantly, if not exclusively, present.
I. Biological Origins of dsRNA
Double-stranded RNA appears in many biological processes. Many viruses have dsRNA chromosomes, which on infection express their encoded genes, undergo amplification, and are subsequently encapsidated and transmitted to other cells. Following single-stranded (ss) RNA virus infection, dsRNA is generated as a probable by-product of replication. dsRNA can also arise from the symmetrical transcription of viral DNA, followed by RNA-RNA annealing. There is no strong evidence in the latter two instances that dsRNA production is essential to the viral infection strategy; in fact, intracellular viral dsRNA in nonsequestered form can provoke the interferon-mediated antiviral response (see Section IX).
Cells produce dsRNA in the normal course of gene expression. dsRNA structures can occur within primary transcripts, where they either persist in the mature species or are removed by RNA processing. Intramolecular dsRNA elements are present within local hairpin structures, or are created through long-distance base-pairing. The latter situation is seen in the primary ribosomal RNA transcript of Escherichia coli, where complementary sequences thousands of nucleotides apart engage to form specific processing sites for RNase III (Section VII,A). dsRNA structures can also arise through base-pairing between independent transcripts, such as the binding of antisense RNAs to their targets (Section VII,B).
II. Experimental Criteria for dsRNA
A number of experimental protocols can distinguish dsRNA from less structured species (5, 6). Several physicochemical methods are informative, the availability of sufficient material permitting. The base composition of a dsRNA preparation should exhibit equivalent amounts of A and U, and of G and C, which reflects the presence of Watson-Crick base-pairs. dsRNA also exhibits a distinct temperature-dependent UV absorbance profile, wherein a sharp hyperchromism at the wavelength of peak absorbance occurs over a narrow temperature range. The transition reflects the highly cooperative melting of the double helix to yield separated single strands (7). The midpoint for the dsRNA → ssRNA transition is characterized by a temperature value (Tm) that is sensitive to the salt concentration. In contrast, the absorbance-versus-temperature profile of less structured RNAs exhibits a significantly lower hyperchromicity and cooperativity.
Chromatographic fractionation can be used to separate and purify dsRNA, or RNA species that contain double-stranded regions. dsRNA preferentially binds to cellulose CF-11 in ethanol-containing buffers, such that ssRNA is eluted first as the ethanol concentration is lowered (8). The exact nature of the interaction of dsRNA with the cellulose matrix is not understood, but it may involve hydrogen bonds between the hydroxyl groups in dsRNA and cellulose. RNA purification procedures that include a cellulose CF-11 step can remove trace amounts of dsRNA from RNA preparations (9, 10).
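The two physicochemical criteria described above (equivalent A/U and G/C content, and a sharp melting transition) can be sketched numerically. The Python functions below are a minimal illustration under stated assumptions: the function names and the 2% tolerance are hypothetical, and the midpoint is taken simply as the temperature at which the absorbance lies halfway between the first and last readings, which ignores sloping baselines and is not the procedure of any cited study.

```python
from collections import Counter


def base_composition(seq):
    """Fraction of each of A, C, G, U in an RNA sequence (both strands of a
    duplex preparation can be supplied concatenated)."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGU")
    return {b: counts[b] / total for b in "ACGU"}


def composition_consistent_with_duplex(seq, tol=0.02):
    """True if A ~ U and G ~ C within `tol` (an arbitrary illustrative cutoff)."""
    f = base_composition(seq)
    return abs(f["A"] - f["U"]) <= tol and abs(f["G"] - f["C"]) <= tol


def melting_midpoint(temps, absorbances):
    """Estimate Tm as the temperature where absorbance is halfway between the
    first (helix) and last (melted) readings, by linear interpolation.
    Assumes temps are increasing and absorbance rises on melting."""
    half = (absorbances[0] + absorbances[-1]) / 2
    points = list(zip(temps, absorbances))
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        if a0 <= half <= a1 and a1 != a0:
            return t0 + (half - a0) * (t1 - t0) / (a1 - a0)
    raise ValueError("no transition through the half-maximal absorbance")
```

A composition test of this kind is necessary but not sufficient: a single strand can, by chance, contain equal amounts of A and U and of G and C.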
Enzymatic analysis of dsRNA is relatively rapid, and uses much smaller amounts of material, usually in radiolabeled form. A well-known enzymatic test is the resistance of dsRNA to degradation by pancreatic ribonuclease (RNase A) in high (>0.15 M) salt, and a corresponding sensitivity in low salt (5). The molecular basis for the differential reactivity is discussed in Section VI,C. Another enzymatic test uses E. coli RNase III, which degrades dsRNA species that are ≥20 bp, but does not cleave ssRNA, or dsRNA containing a significant number of mismatches or other structural irregularities (6) (Section VI,A). Cobra venom ribonuclease (RNase V1) cleaves dsRNA endonucleolytically, although helical ssRNA is also a substrate (11, 12) (Section VI,B). A sensitive biological test is provided by the ability of dsRNA (≥80 bp) to inhibit protein synthesis in reticulocyte lysates (6), due to the activation of the endogenous dsRNA-dependent protein kinase, whose action blocks an essential step in translation initiation (Section IX,A).
Establishing the existence of dsRNA species in vivo has been more problematic, and careful consideration must be given to the experimental protocol. For example, phenol extraction can promote dsRNA formation (13). Gentle fractionation procedures that omit phenol may afford an RNA preparation that retains much of its original secondary structure, and is largely devoid of artifactually generated dsRNA. The ssRNA component of an RNA preparation can be removed by RNase A digestion in high salt, and CF-11 cellulose chromatography can purify the dsRNA fraction. RNA fingerprinting or nucleotide sequence analysis would then be required to determine the complexity of the dsRNA preparation. Polyclonal antibodies have been used to detect dsRNA in cells and biological preparations (14). dsRNA-specific monoclonal antibodies that are largely insensitive to base-pair sequence have also been developed (15).
Photoreactive reagents such as psoralens, which form intermolecular crosslinks within a double helix, can detect and "freeze" dsRNA structures in vivo (16). However, these approaches are not expected to be successful in detecting dsRNAs that have a transient existence, and that therefore have a low steady-state concentration in vivo. If a mutational approach is feasible, nucleotide sequence changes expected to disrupt predicted base-pairs, together with secondary mutations that compensate for the initial disruption, can be used to verify dsRNA structures otherwise inaccessible to other types of analysis.
dsRNA molecules can be directly visualized by electron microscopy or by atomic force microscopy (AFM). AFM involves measuring the local contact forces between the scanning probe and the biological sample, which is stably fixed to a flat surface (17). AFM can provide images of dsRNA of a quality comparable to that obtained by electron microscopy, and can allow accurate length measurements of dsRNA without prior staining, shadowing, or other modifications (Fig. 1).
FIG. 1. Atomic force microscopy (AFM) image of purified dsRNA from reovirus. The scale is given in the lower right corner. Reprinted by permission of Oxford University Press from Ref. 275.
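The logic of the mutational approach mentioned in Section II (a disrupting change followed by a compensating change that restores pairing) can be illustrated with a short Python sketch. The sequences, the helper names paired_fraction and mutate, and the restriction to Watson-Crick pairs are assumptions made for this example only; G·U wobble pairs are deliberately not counted.

```python
# Watson-Crick pairs only; G.U wobble pairs are ignored in this sketch.
WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def paired_fraction(strand_5to3, partner_5to3):
    """Fraction of positions that form Watson-Crick pairs when the two
    equal-length strands are aligned antiparallel."""
    if len(strand_5to3) != len(partner_5to3):
        raise ValueError("strands must have equal length")
    partner_3to5 = partner_5to3[::-1]  # read the partner 3'->5'
    pairs = sum((a, b) in WATSON_CRICK
                for a, b in zip(strand_5to3.upper(), partner_3to5.upper()))
    return pairs / len(strand_5to3)


def mutate(seq, pos, base):
    """Return `seq` with the residue at 0-based `pos` replaced by `base`."""
    return seq[:pos] + base + seq[pos + 1:]


top = "GGACUC"      # hypothetical strand, 5'->3'
bottom = "GAGUCC"   # its reverse complement, 5'->3'

disrupted = mutate(top, 2, "C")           # A->C breaks the A-U pair
compensated = mutate(bottom, 3, "G")      # U->G restores a C-G pair
```

With these sequences the wild-type duplex is fully paired, the single mutant loses one of six pairs, and the double mutant is fully paired again, which is the pattern a compensatory-mutation experiment looks for.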
III. Structure and Dynamics of dsRNA
As with any macromolecule and its attendant physical complexity, the function of dsRNA is best understood through knowledge of its structure. By definition, the secondary structure of an RNA is its ensemble of base-paired elements. The secondary structure provides the framework for additional RNA folding, creating tertiary interactions that establish and stabilize the three-dimensional shape (for recent reviews, see Refs. 4 and 18). Regarding dsRNA as a canonical double helix is sufficient for many first-order analyses. Nevertheless, is dsRNA capable of displaying a range of conformations? This question has been prompted in part by the large body of evidence that DNA double helices exhibit pronounced conformational plasticity. The polymorphism of DNA is manifested within the structural context of antiparallel, complementary strands, and is influenced by specific base-pair sequence and physical environment (7). Many of the original investigations of dsRNA structure detected no pronounced conformational diversity, which prompted the conclusion that the RNA double helix is structurally conservative (1, 7, 19). However, these studies were limited by low resolution, and recent investigations are now revealing a significant degree of polymorphism.
A. Structure of dsRNA at the Atomic Level
The first structural information on dsRNA came from X-ray diffraction analyses of synthetic or naturally occurring dsRNA fibers (reviewed in Refs. 1 and 7). These studies confirmed the prediction that dsRNA consists of two antiparallel strands engaged in a right-handed double helix. In contrast to the various families of double-helical DNAs, dsRNA displays the A-helix motif, which exhibits an 11-fold helical pitch (Fig. 2). Raising the salt concentration in the fiber preparations causes a minor structural change to the A' double helix, which has a 12-fold helical symmetry. Because the noncrystalline nature of the RNA fibers limited the resolution to approximately 3 Å, no detailed information at the atomic level could be obtained.
X-Ray diffraction analyses of crystals of two self-complementary dinucleoside phosphates, ApU and GpC, provided the first high-resolution structural information on the RNA double helix (20, 21). The structures were refined to 0.8 Å resolution and displayed Watson-Crick base-pairing, with the ribose sugar in the C3'-endo conformation and the nucleobase torsion angles in the anti range. Extrapolation of the structures to infinite length yielded a right-handed double helix with 11-fold symmetry, in good agreement with the fiber diffraction studies. Because the dinucleoside phosphates are heavily hydrated, the crystal structures are defined by local interactions (e.g., sugar-phosphate backbone constraints), rather than by crystal packing forces. Two classes of sodium ion binding sites were observed: one site is positioned between adjacent phosphate groups and the other is close to the O2 atom of the uracil residue in the minor groove. The latter interaction provided the first example of specific ligand binding to the dsRNA minor groove.
X-Ray diffraction analysis of tRNA crystals also provided information on short double-helical structures within the context of a more complex tertiary structure (22). A statistical analysis did not find a correlation between the type of base-pair and local structural parameters of double-helical regions in tRNA (19), suggesting that specific base-pairs have a minor influence on dsRNA conformation.
The A-form double helix is distinguished in several ways from the other helix families (1, 4, 7). The two antiparallel strands wrap around the helix axis in a ribbonlike manner, and the base-pairs are tilted away from the axis. The base-pairs also exhibit a forward displacement from the helix axis, creating a hollow cylindrical core with a van der Waals diameter of approximately 3.5 Å (Fig. 2). The combination of base-pair tilt and forward displacement allows interstrand as well as intrastrand base stacking and creates a narrow, deep major groove and a shallow minor groove. The ribose conformation is C3'-endo, which reflects the necessity of accommodating the bulky 2'-hydroxyl group. The C3'-endo conformation causes the A-helix to be underwound with respect to the B-helix, and shortens the intrastrand phosphate-phosphate distance to 5.9 Å. dsRNA is therefore more compact than DNA, with a helical rise of 2.74 Å, compared to 3.4 Å for DNA, and exhibits a higher molecular mass per length (241 Da/Å) compared to DNA (195 Da/Å). The compact nature of dsRNA has a major influence on its gel electrophoretic mobility (Section III,B).
FIG. 2. Structure of the A-form RNA double helix. In B, the double helix is tilted by 32° with respect to the helix in A, in order to show more clearly the major (M) and minor (m) grooves. In C, the helix is rotated by 90°, and displays the central channel and extensive base-pair stacking. Reprinted with permission from Ref. 7.
The C3'-endo ribose conformation places the 2'-hydroxyl groups at the edge of the minor groove and within hydrogen bonding distance of the O4' oxygen of the 3' neighboring nucleotide. This network of hydrogen bonds may give additional stability to the A-helix. A computer-assisted analysis of the solvent-accessible surface of the RNA double helix gave further support to the exposed nature of the minor groove (23). A molecular modeling study of the A-form RNA double helix emphasized the depth and narrowness of the major groove and the shallow, exposed nature of the minor groove (24). With its border of 2'-hydroxyl groups, and accessible bases, the minor groove provides a richly interactive molecular surface that can confer specificity and binding energy for proteins, other nucleic acids, and small ligands.
The development of efficient methods to synthesize RNA chemically and enzymatically has allowed determination of the crystal and solution structures of dsRNAs of specific sequence and larger size.
X-Ray diffraction analysis (2.25 Å resolution) of the self-complementary oligoribonucleotide U(UA)6A revealed novel structural features, and provided an important model with which to understand how the RNA double helix engages in specific intermolecular interactions (25, 26). The [U(UA)6A]2 structure displays the overall features of the A-form double helix, but also exhibits local discontinuities (Fig. 3). The double helix is kinked at two specific sites, which define a central and two flanking helical domains. The central domain displays the structural features of the canonical A-form helix, whereas the terminal domains show a significant deviation. The angles defined by the helix axes of adjacent domains are 13° and 11°. The two kinks are not coplanar, and create a torsion angle of 70° between the helix axes of the terminal domains. Because the highly hydrated nature of the unit cell effectively minimizes crystal packing forces, it was argued that the kinks are an inherent feature of the [U(UA)6A]2 duplex (25, 26). Both intramolecular and intermolecular hydrogen bonds are observed, all of which involve 2'-hydroxyl groups. The intramolecular interactions include 2'-hydroxyl group bonding, via a bridging water molecule, to either the 3' neighboring ribose O4' oxygen, with an average distance of 3.3 Å, or to the minor groove-localized O2 or N3 atom of the adjacent base. These hydrogen bonds may stabilize the sugar C3'-endo conformation. The intermolecular hydrogen bonds are either direct or water mediated. One intermolecular interaction involves the 2'-hydroxyl group of the terminal ribose and the O2 atom of uracil in the minor groove of the neighboring duplex.
FIG. 3. Crystal structure of the [U(UA)6A]2 duplex, displayed in stereo view. The vertical lines indicate the three axes (see text for additional discussion). In A, the minor groove is emphasized, whereas in B, the major groove is displayed. Reprinted with permission from Ref. 26.
A crystallographic study of an irregular dsRNA revealed the ability of the RNA double helix to accommodate noncanonical base-pairs and provided additional insight into the role of 2'-hydroxyl groups in mediating intermolecular interactions. The ribo dodecamer, GGACUUCGGUCC, which exists as a monomeric hairpin in solution, crystallizes as a duplex containing two copies each of the noncanonical G·U and U·C base-pairs (27). The four base-pairs, which are adjacent and in the center of the duplex, are apparently stabilized through additional hydrogen bonds involving water molecules in the major and minor grooves. The dsRNA crystallizes as a pseudoinfinite helix, in which the unit duplexes are linked by four direct hydrogen bonds. Additional interactions between adjacent duplexes involve several water-mediated hydrogen bonds. Thus, the intermolecular interactions in the crystal lattice are established through hydrogen bonds involving 2'-hydroxyl groups, similar to what is seen in the [U(UA)6A]2 structure.
High-resolution nuclear magnetic resonance (NMR) analyses have provided information on the structure of dsRNA in solution and have provided support for the occurrence of significant sequence-dependent differences in local structure. A proton NMR study of the self-complementary hexamer, GCAUGC, was assisted by restrained dynamic molecular structure refinement to reveal an A-form double-helical geometry (28). The dsRNA exhibits local variations in structural parameters, including helix twist, as well as base-pair roll, slide, and propeller twist.
There is extensive intramolecular base stacking, involving R-Y steps, as well as interstrand stacking of the purine rings. The extensive base-pair stacking provides a significant stabilizing force. The dsRNA is bent by approximately 20°, which is less than the bending of the corresponding DNA duplex (approximately 50°) (28). NMR analysis of two self-complementary RNA dodecamers, CGCGAAUUCGCG and CGCGUAUACGCG, also revealed the canonical A-form helix and a significant amount of interstrand base overlap, but in addition uncovered several sequence-dependent variations in the roll angle between adjacent bases (29).
There is now a clearly defined example of an RNA double helix that exists outside the A/A' family. Incubation of poly(C-G) at 45°C in high salt causes a conformational transition from a right-handed to a left-handed Z-helix (30). The A → Z conversion is highly cooperative and the transition temperature increases with decreasing salt concentration (31). NMR, circular dichroic, and Raman spectroscopic analyses support the assignment of a left-handed structure (32, 33). Bromination of guanine C8 stabilizes the Z-RNA structure, which can be recognized by antibodies raised to Z-DNA (32). Methyl substitution at cytosine C5 destabilizes Z-RNA, in contrast to its stabilizing effect on Z-DNA (31). A glimpse of the structure of Z-RNA at the atomic level is provided by an X-ray analysis of the self-complementary, hexameric DNA-RNA copolymer d(CG)r(CG)d(CG), which crystallizes in the duplex Z-form (34). The cytosine 2'-hydroxyl groups within the two central r(C·G) base-pairs engage in intramolecular hydrogen bonds with the N2 atom of the 5' neighboring guanine residue, apparently stabilizing the purine syn conformation. Immunocytochemical experiments provided evidence for the existence of Z-RNA in the eukaryotic cell cytoplasm (35); however, it remains to be shown whether the Z-RNA has a function in vivo.
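The helical parameters quoted in Section III,A lend themselves to quick back-of-the-envelope estimates. The Python sketch below uses only the numbers given in the text (2.74 Å and 3.4 Å rise per base-pair, 11 base-pairs per A-form turn, 241 and 195 Da/Å); the constant and function names are hypothetical, and the calculation treats the helix as a perfectly regular rod, which the crystal structures discussed above show to be an idealization.

```python
# Values quoted in the text; names are illustrative only.
A_RNA_RISE = 2.74          # Å per base-pair, A-form dsRNA
B_DNA_RISE = 3.4           # Å per base-pair, B-form DNA
A_RNA_BP_PER_TURN = 11     # base-pairs per helical turn, A-form
A_RNA_MASS_PER_ANGSTROM = 241   # Da per Å
B_DNA_MASS_PER_ANGSTROM = 195   # Da per Å


def contour_length(n_bp, rise):
    """Length (Å) of an ideal straight helix of n_bp base-pairs."""
    return n_bp * rise


def helical_pitch(bp_per_turn, rise):
    """Axial length (Å) of one full helical turn."""
    return bp_per_turn * rise


def mass_per_base_pair(mass_per_angstrom, rise):
    """Mass (Da) per base-pair implied by a mass-per-length value."""
    return mass_per_angstrom * rise
```

Under these assumptions a 1000-bp dsRNA is about 2740 Å long against 3400 Å for DNA of the same size, and both mass-per-length figures imply roughly 660 Da per base-pair, so the higher linear mass density of dsRNA follows from its shorter rise rather than from heavier base-pairs.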
B. Molecular Properties of dsRNA
The macroscopic behavior of dsRNA derives from its microscopic features. A number of studies have shown that dsRNA is relatively inflexible, compared to DNA. The greater stiffness reflects the conformational rigidity of the ribose ring, which is imposed by the 2'-hydroxyl group. However, there has been some disagreement about the magnitude of the inflexibility. Analysis of the sedimentation coefficients of dsRNAs of defined size yields a persistence length (P value)³ of 1125 (±100) Å (36), compared with the P value of 500-600 Å for DNA. The dsRNA sedimentation coefficients were determined in high salt buffer, so that the P values would reflect the internal structural features of dsRNA, such as base stacking and hydrogen bonding, with minimal contribution from phosphate-phosphate repulsive electrostatic forces. The hydrodynamic behavior provides a qualitative description of the dsRNA molecule as an elastic cylinder, having a hydrated diameter of 30 Å. Gel electrophoresis has been applied to determine a P value for dsRNA of 1050 Å, which is approximately twice the value determined for DNA (37). In contrast, transient electric birefringence measurements (38) yielded a dsRNA persistence length of 500-700 Å, only slightly greater than that of DNA. The authors of the latter study remarked that the previously determined dsRNA P value (37) may have been an overestimate, due to the use of an s°20,w value (36) that was itself too large. The relatively constrained flexibility of dsRNA was demonstrated by hydrodynamic analysis as well as gel electrophoresis, which also showed that phased adenine tracts (known to induce DNA bending) do not bend dsRNA (38). An electron-microscope study also demonstrated the inability of adenine tracts to induce dsRNA bending (39). However, these studies do not rule out the possibility of intrinsic dsRNA bending by other base sequence elements.
³ The persistence length is defined as the tangential distance over which a double helix maintains its direction before a significant change occurs, caused by external or internal bending forces (36).
The flexibility of dsRNA can be increased by introducing local structural discontinuities. Viroids, for example, are highly base-paired, circular RNA pathogens of plants and contain many internal loops, bulges, and base-pair mismatches. The P value of a specific viroid species is about 300 Å, compared to >1000 Å for dsRNA (40). Site-specific bulge loops can kink dsRNA (41, 42, 42a). The kinking originates from a specific structural discontinuity at the bulge site, and analyses of RNA duplexes with two or more site-specific bulges show that kinks can exhibit phasing (41, 42). The kink-dependent phasing provided an alternative method of determining the helical pitch of dsRNA, measured as 11.3 bp in one study (42), whereas another analysis yielded 11.8 bp (41). These values are in accord with the A- and A'-form double-helical parameters, but because they are average measures, any local variation in pitch could not be discerned.
A recent study on dsRNA with bulge loops has tentatively revealed a natural curvature for the RNA double helix (approximately 30-40° over 80 bp) (43). The curvature was proposed in order to account for the measured helical repeat value of 10.2 bp, which is significantly smaller than observed in the other investigations. The persistent curvature of dsRNA is also seen in X-ray diffraction studies (Section III,A).
Double-stranded RNA electrophoreses more slowly than the corresponding DNA (38, 42, 44, 44a).
The slower mobility may arise from the greater amount of counterion condensation with dsRNA, compared to DNA. The smaller axial distance between phosphates (the A-form helix has a 1.4 Å rise per phosphate, whereas the B-form has a 1.7 Å rise per phosphate) results in less residual negative charge density following counterion condensation. The smaller charge-to-mass ratio causes a reduced gel electrophoretic mobility (38). Introducing internal loops and bulges significantly reduces gel electrophoretic mobility (44a). The differential dependence of the gel mobilities of regular and irregular dsRNA species on gel concentration and percent crosslinking results from specific but poorly understood interactions between RNA and the gel matrix (44a).
The formation of triple-stranded RNA (tsRNA) represents an important mode of interaction of a nucleic acid chain with dsRNA, and triple-helix structures are observed in biological RNAs (4, 45). Is the RNA triple helix similar to dsRNA?
DOUBLE-STRANDED RNA
The tsRNA structure can readily be formed, and the physical properties of short tsRNAs of defined sequence have been analyzed (46). The all-RNA triple helix is thermodynamically the most stable species, compared to the corresponding all-DNA form, or the several hybrid RNA-DNA species (46). Vibrational circular dichroism shows that under defined salt and polynucleotide concentrations, the thermal denaturation of poly(rA)·poly(rU) involves formation of an intermediate triple-stranded species, poly(rA)·poly(rU)·poly(rU) (47). Alternatively, poly(rA)·poly(rU) can undergo isothermal disproportionation in high salt, forming poly(rU)·poly(rA)·poly(rU), where the adenine residues simultaneously engage in Watson-Crick and non-Watson-Crick base-pairing with the two poly(rU) strands (7). The second poly(rU) strand fits into the major groove of the poly(rA)·poly(rU) double helix, running parallel to the poly(rA) strand. To accommodate the third strand, the duplex base-pair tilt is decreased, creating a greater axial rise per base-pair, which widens the major groove (7). One may anticipate the existence of Hoogsteen hydrogen bonds as a stabilizing force for tsRNA, such as seen in other situations (e.g., see Ref. 48).
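The counterion-condensation argument used earlier in this section to explain the slower gel mobility of dsRNA can be made semi-quantitative. The sketch below is one possible illustration, assuming the simplest (Manning) line-charge picture with monovalent counterions. The Bjerrum length of about 7.14 Å for water at 25 °C and the function name are assumptions introduced here; only the two rise-per-phosphate values come from the text.

```python
# Sketch of Manning counterion condensation for a line of monovalent charges.
# Assumption (not from the text): Bjerrum length of water at 25 C, in angstroms.
BJERRUM_LENGTH_A = 7.14

# Axial rise per phosphate quoted in the text (angstroms).
RISE_PER_PHOSPHATE_A = {"A-form dsRNA": 1.4, "B-form DNA": 1.7}


def residual_charge_per_phosphate(rise_a, valence=1):
    """Fraction of each phosphate charge left unneutralized after condensation."""
    xi = BJERRUM_LENGTH_A / rise_a  # Manning charge-density parameter
    if valence * xi <= 1.0:
        return 1.0  # below the condensation threshold: nothing condenses
    return 1.0 / (valence * xi)


if __name__ == "__main__":
    for name, rise in RISE_PER_PHOSPHATE_A.items():
        frac = residual_charge_per_phosphate(rise)
        print(f"{name}: residual charge per phosphate = {frac:.3f}")
```

Under these assumptions the closer phosphate spacing of the A-form helix leaves a smaller residual charge per phosphate than the B-form, which is the direction of the effect invoked above. The model ignores gel-matrix interactions and helix shape, so it should be read only as a consistency check.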
IV. Protein Recognition of dsRNA
The biological activity of dsRNA is manifested through its specific interactions with other nucleic acids, small molecules, and proteins. A growing body of experimental evidence shows that dsRNA associates with other nucleic acid chains through (i) 2'-hydroxyl-group-mediated hydrogen bonding, (ii) intermolecular coordination of phosphates by divalent metal ion bridges, or (iii) base-base interactions within the major groove (for reviews see Refs. 4 and 7). Regarding the binding of small molecules, numerous investigations provide a detailed picture of the intercalative binding of planar dye molecules to the double helix (7). Other modes of small-molecule binding can be anticipated; these would involve hydrogen bonding to ribose 2'-hydroxyl groups as well as ionic bonds with phosphate oxygens.
The specific binding of protein to dsRNA is not well understood, but recent studies provide insight into this important interaction. In principle, the twofold symmetry of dsRNA provides a surface appropriate for recognition by a twofold symmetric protein (e.g., a homodimer). However, asymmetric binding modes are also possible: recognition of one or both strands of the duplex could be accomplished by a single polypeptide. Sequence-independent protein binding would occur through recognition of the general features of the A-form helix, including the regular array of phosphate oxygens and 2'-hydroxyl groups, and the relatively nonpolar minor groove surface. The ordered spine of water molecules in the minor groove may also participate in hydrogen bonds with bound protein (7). The 2'-hydroxyl groups would also serve to distinguish dsRNA from A-form DNA, or an RNA-DNA hybrid.
FIG. 4. (A) The double-stranded RNA binding domain (dsRBD) motif. In (A), proteins that exhibit the full-length dsRBD are listed. "Ecrnac" is the sequence from RNase III. The conserved residues are highlighted, and the consensus sequence is provided at the bottom. In (B), proteins are listed that contain mainly the C-terminal portion of the dsRBD. (C) The location of the dsRBD in nine proteins. The larger, darker boxes indicate the occurrence of the full-length dsRBD; the shorter, lighter boxes indicate the presence of the shorter dsRBD motif. Reprinted with permission from Ref. 52.
[Sequence alignments (panels A and B) and domain diagrams (panel C) not reproduced. Search motif printed in panel B: GxGxSKKxAKxxAAxxALxxL. Proteins diagrammed in panel C: human DAI (551 aa), mouse TIK (518 aa), vaccinia E3L (190 aa), human TAR binding protein (345 aa), Xenopus rbpa (299 aa), Drosophila Staufen (1026 aa), E. coli RNase III (226 aa), human son-a (1523 aa), S. pombe pac1 (363 aa), porcine rotavirus ns34 (403 aa).]
Several problems are posed in principle by the sequence-specific recognition of dsRNA by protein. The major groove formally provides unambiguous sequence information, because each of the four base-pair arrangements presents a unique array of hydrogen bond acceptor and donor groups (49). However, the A-helix structure renders these groups sterically inaccessible, due to the narrowness and depth of the major groove. In the absence of any conformational change that would widen the major groove, the sequence-specific binding of protein would be expected instead to depend on information provided by base-pair groups in the minor groove. However, only the AU·UA base-pair set can be distinguished from the GC·CG base-pair set, in a recognition mechanism involving only minor groove-directed hydrogen bonds (49). The protein must therefore depend on additional interactions in order to read unambiguously the base-pair sequence. The major groove may nevertheless enable sequence-specific recognition. Because the degree of base-pair tilt establishes the helix rise value, it can dictate whether the major groove remains narrow and only accessible to water or metal ions, or whether it can widen to accommodate a protein structure (or another
nucleic-acid strand). In this regard, internal loops or bulge-loops can promote partial unwinding of adjacent double-helical regions, allowing specific protein-dsRNA contacts in the major groove (50).
Are there specific protein motifs that recognize dsRNA? An early molecular modeling study revealed a natural structural complementarity of the antiparallel β-sheet with the RNA double helix (51). The protein secondary structure motif displays a right-handed double-helical shape, which affords a precise interaction with dsRNA. Specific protein-RNA contacts can be established in the minor groove involving hydrogen bonds between the 2'-hydroxyl groups and peptide-bond carbonyl oxygens. Although not further explored, it was also suggested that sequence-specific recognition could be accomplished through the interaction of amino-acid side-chains with the base-pair groups exposed in the minor groove (51). Whether this protein motif is used in dsRNA recognition is not known.
Searches of protein sequence databanks have uncovered a motif that specifically recognizes dsRNA. Sequence comparison of proteins that bind dsRNA exposed an approximately 65- to 70-amino-acid sequence that contains about 36 conserved amino acids (Fig. 4) (52, 53). The consensus element, termed the dsRNA-binding domain (dsRBD), is present in E. coli ribonuclease III and the mammalian dsRNA-activated protein kinase (Sections VI,A and IX,A). In vitro assays demonstrated that the dsRBD can directly bind dsRNA (52). The extended length of the motif suggests that structure as well as sequence is important for dsRNA recognition by the dsRBD. There is no current evidence that a dsRBD exhibits sequence specificity in binding, but it is possible that specific nonconserved amino acids either within or outside the domain could confer such an ability. The zinc finger provides a motif wherein specific amino acids, adjacent to conserved amino acids within a local structure, can confer sequence specificity (54).
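The databank search that uncovered the dsRBD can be pictured as a pattern scan. The sketch below is only an illustration of that idea, not the procedure used in Refs. 52 and 53: it takes the search motif as printed in Fig. 4B, treats each "x" as any residue, and scans a sequence for exact matches. The function names and the test sequence are hypothetical; the sequence was invented for this example and is not a real protein.

```python
import re

# Search motif as printed in Fig. 4B; "x" stands for any amino acid.
SEARCH_MOTIF = "GxGxSKKxAKxxAAxxALxxL"


def motif_to_regex(motif):
    """Turn an x-wildcard motif into a compiled regular expression."""
    return re.compile("".join("." if ch == "x" else re.escape(ch) for ch in motif))


def find_motif(sequence, motif=SEARCH_MOTIF):
    """Return 1-based start positions of every (possibly overlapping) match."""
    lookahead = re.compile("(?=(" + motif_to_regex(motif).pattern + "))")
    return [m.start() + 1 for m in lookahead.finditer(sequence)]


# Hypothetical sequence, invented for illustration (not a real protein).
toy_sequence = "MSTPE" + "GSGPSKKEAKHAAAEKALRIL" + "QD"

if __name__ == "__main__":
    print(find_motif(toy_sequence))
```

Because the consensus is only loosely conserved, an exact-match scan of this kind would miss many genuine domains; a realistic search would have to tolerate mismatches or use a position-weighted profile.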
Direct information on protein recognition of dsRNA has been provided by the X-ray structural analysis of glutaminyl-tRNA bound to its cognate synthetase (55). The minor groove of the tRNA acceptor helix engages in several specific contacts with two β-turn motifs. A proline residue (Pro-181), present in a β-turn that separates two β-strands, engages in a hydrogen bond through its peptide carbonyl oxygen with the purine exocyclic amine group in the G2·C71 base-pair. The peptide bond of isoleucine 183 is hydrogen-bonded to a "buried" water molecule that is itself hydrogen-bonded to both the keto oxygen of C71 and the G2 exocyclic amine. The buried water molecule is also hydrogen-bonded to a carboxylate oxygen of aspartic acid 235, which is within a second β-turn. Asp-235 also engages in a hydrogen bond with the G3 exocyclic amine. In summary, three protein side-chains and a water molecule engage in a complex, highly specific hydrogen-bond pattern with nucleic-acid base groups in the minor groove. Both direct and water-mediated hydrogen bonds are present. The intricacy of this interaction may hint at the general complexity of protein-dsRNA contacts.
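The earlier statement that the major groove distinguishes all four base-pair arrangements, whereas the minor groove separates only the AU·UA set from the GC·CG set, can be checked by tabulating the groups each pair exposes. The table below uses the standard textbook assignment of acceptors (A), donors (D), and nonpolar hydrogens (H), which is an assumption supplied here rather than data printed in this chapter; the helper name is likewise hypothetical. The donor in the G·C minor-groove pattern is the guanine exocyclic amine contacted in the synthetase complex described above.

```python
# Groups presented by each Watson-Crick RNA base pair, read across the groove
# from the first-named base to its partner.
# A = hydrogen-bond acceptor, D = donor, H = nonpolar hydrogen.
# Assumed textbook assignment; not taken from the chapter itself.
MAJOR_GROOVE = {"A.U": "ADAH", "U.A": "HADA", "G.C": "AADH", "C.G": "HDAA"}
MINOR_GROOVE = {"A.U": "AHA", "U.A": "AHA", "G.C": "ADA", "C.G": "ADA"}


def distinguishable_classes(patterns):
    """Group base pairs that present identical patterns in a groove."""
    classes = {}
    for pair, pattern in patterns.items():
        classes.setdefault(pattern, []).append(pair)
    return sorted(sorted(group) for group in classes.values())


if __name__ == "__main__":
    print("major groove:", distinguishable_classes(MAJOR_GROOVE))
    print("minor groove:", distinguishable_classes(MINOR_GROOVE))
```

In this representation the major groove yields four classes and the minor groove two, in agreement with the argument attributed to Ref. 49.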
V. Chemical Stability of dsRNA
RNA chains break down in solution under conditions wherein DNA is stable. The 2'-hydroxyl group acts as an internal nucleophile, attacking the vicinal phosphodiester linkage and displacing the 5' oxygen of the neighboring 3' nucleotide. The breakdown proceeds by an in-line mechanism, wherein the nucleophilic 2'-oxygen and 5'-oxygen leaving group occupy apical positions within the trigonal bipyramidal phosphorane intermediate (56). An RNA strand in a helical conformation, whether single-stranded or engaged in a double helix, is more resistant to this reaction than the corresponding random coil. It was noted that a right-handed, antiparallel double helix is particularly well suited toward protecting the 3'-5' internucleotide linkage from 2'-hydroxyl attack (57). The RNA double helix imposes a significant structural constraint in that the attacking and leaving groups cannot simultaneously occupy the required apical positions. The reaction is therefore inhibited, due to a disfavored stereochemical arrangement. However, if the RNA double helix undergoes localized unwinding and strand separation, the stereochemical barrier would be lost, and chain scission would readily proceed. Disruption of the double helix may be important in the degradation of dsRNA by ribonucleases related to the pancreatic RNase family (Section VI,C), whose catalytic mechanism requires the 2'-hydroxyl group. It is predicted that a 2'-5' phosphodiester linkage within a right-handed double helix should undergo more facile cleavage, because the attacking 3'-oxygen is in line with the 5'-oxygen leaving group.
An experimental study using model oligonucleotides revealed an approximately 900-fold relative stability of the 3'-5' linkage over the 2'-5' linkage within the RNA double helix (58). The hydrolytic lability of the 2'-5' linkage is consistent with its facile formation when 3'-activated oligonucleotides are nonenzymatically polymerized on a complementary oligonucleotide template (59). The preferential formation of the 2'-5' linkage is predicted, because the pathway is formally the reverse of 3'-oxygen attack on the 2'-5' linkage. It also was noted that the use of 5'-activated nucleosides would favor nonenzymatic formation of the 3'-5' linkage over the 2'-5' linkage (57).
The hydrolytic stability of dsRNA has so far been considered from the standpoint of its relative resistance to 2'-hydroxyl group-mediated chain cleavage. dsRNA breakdown could occur instead by a hydrolytic mechanism, where the nucleophile is an activated water molecule. Depending on the identity of the leaving group (3' or 5' oxygen), hydrolysis would create RNA products with 5' phosphate or 3' phosphate termini, respectively. In either case, the double-helical structure does not have to be disrupted to permit the requisite stereochemistry for an in-line SN2(P) mechanism. However, as the uncatalyzed hydrolysis of dsRNA is very slow, enzymatic assistance is required to provide the necessary rate.
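The approximately 900-fold stability difference between the 3'-5' and 2'-5' linkages can be translated into an energy scale. The sketch below assumes that the factor is a ratio of cleavage rate constants measured under identical conditions, that simple transition-state reasoning applies, and that the temperature is 25 °C; none of these conditions is stated in the text, and the function name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987e-3  # kcal mol^-1 K^-1


def barrier_difference_kcal(rate_ratio, temperature_k=298.15):
    """Activation free-energy difference implied by a ratio of rate constants.

    Uses ddG = R * T * ln(k_fast / k_slow). The default temperature of
    298.15 K is an assumption, not a value taken from Ref. 58.
    """
    return GAS_CONSTANT_KCAL * temperature_k * math.log(rate_ratio)


if __name__ == "__main__":
    print(f"{barrier_difference_kcal(900):.2f} kcal/mol")
```

Under these assumptions a 900-fold ratio corresponds to roughly 4 kcal/mol, a modest energetic penalty for the stereochemical constraint that the double helix imposes on in-line attack.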
VI. Enzymatic Cleavage of dsRNA
Intracellular dsRNA species must turn over to avoid excessive accumulation, and to provide precursors for new RNA. There are several pathways by which dsRNA may be degraded (see also Ref. 60): (i) an enzyme could carry out a coordinated and nonspecific double cleavage of the double helix; (ii) an enzyme could bind directly and introduce random nicks in either strand, ultimately providing small, unstable dsRNA fragments; (iii) a ssRNA-specific enzyme could bind reversibly to locally melted regions, then cleave the single-stranded segments; (iv) an exonuclease (3' → 5' or 5' → 3') could attack the ends of the duplex and degrade each strand; (v) an RNA helicase could convert dsRNA to single-stranded species, which then would be degraded by ssRNA-specific exo- or endonucleases; and (vi) the dsRNA can be enzymatically modified (Section VIII,C), thereby weakening or destroying duplex structure, or providing a recognition signal for specific exo- or endonucleases. This section analyzes several enzymatic activities that degrade dsRNA directly, and compares and contrasts their mechanisms.
A. Ribonuclease III
Ribonuclease III was the first dsRNA-specific endoribonuclease to be discovered, and it has received continuous attention since its original characterization as a potent activity in E. coli cell-free extracts. RNase III was later identified as a prominent member of a group of enzymes involved in RNA maturation and decay (for a recent comprehensive review, see Ref. 61). RNase III exhibits a homodimeric structure and requires a divalent metal ion, preferably Mg2+, as an essential cofactor for its phosphodiesterase activity. Exhaustive digestion of synthetic dsRNAs yields double-stranded species, ranging in size from approximately 12 to 15 bp. RNase III creates 5'-phosphate, 3'-hydroxyl product termini, which exhibit two-nucleotide 3' overhangs. RNase III-catalyzed hydrolysis of dsRNA apparently proceeds through coordinated (but probably not concerted) double cleavage. Many of the natural RNase III substrates, also termed processing signals, exhibit specific deviations from regular dsRNA structure at or near the cleavage site. These irregularities can determine the specific pattern of processing (Section VII,A).
The structural gene for RNase III (rnc) lies between 55 and 56 minutes on the E. coli chromosome, and has been cloned and sequenced. The rnc polypeptide contains 226 amino acids and has a molecular mass of 25.6 kDa (62). Mutations in the rnc gene that exert specific effects on RNase III activity have been identified. The rnc70 mutation changes glutamic acid at position 117 to lysine, and blocks cleavage without inhibiting binding (61) (H. Li and A. W. Nicholson, unpublished). Changing the same residue to an alanine has essentially the same effect (H. Li and A. W. Nicholson, unpublished). Further evidence for carboxyl group involvement in the catalytic mechanism is provided by the observation that treatment of RNase III with a water-soluble carbodiimide abolishes cleavage, but does not affect substrate binding (H. Li and A. W. Nicholson, unpublished). The rnc97 mutation changes glycine at position 97 to a glutamic acid, and inhibits processing activity in vivo (63). The rnc97 mutation may weaken divalent-metal-ion binding, because elevated Mg2+ concentrations rescue processing activity in vitro (63). The rnc105 mutation also inhibits processing activity in vivo, and represents a glycine-to-serine change at position 44 (62). This residue occurs within a 10-amino-acid segment (NERLEFLGDS) that is also present within the yeast dsRNase, Pac1, and the RNase-III-like enzyme of Coxiella burnetii (64, 64a). The role of this conserved sequence in RNase III function has not been defined. The C-terminal third of the rnc polypeptide contains a consensus dsRBD motif (52, 61). The rev3 mutation changes alanine to a valine at position 211 (62), which corresponds to a conserved residue within the dsRBD. The rev3 mutation does not noticeably affect RNase III processing in vivo, although it suppresses a specific mutation in ribosomal protein S12 that causes a cold-sensitive defect in 30-S ribosomal subunit assembly and/or function (65).
The catalytic mechanism of RNase III is not known, but recent studies provide a framework for a description. RNase III is a low-abundance protein, but it is easily overexpressed and purified (66-68). RNase III processing obeys Michaelis-Menten kinetics, and its in vitro catalytic efficiency is comparable to that of other nucleic-acid processing enzymes, including E. coli RNase P, E. coli RNase H, and restriction endonuclease EcoRI (68). Given the requirement for a divalent metal ion and the apparent involvement of at least one carboxyl group in the chemical step, it is possible that RNase III utilizes the "two-metal-ion" mechanism (e.g., see Refs. 69 and 70). However, other mechanisms are equally likely. Because the 2'-hydroxyl group adjacent to the scissile bond is not required for cleavage (71), the unreactivity of DNA or RNA-DNA hybrids does not reflect the specific absence of this group at the scissile bond. Biological processing substrates of RNase III undergo precise enzymatic cleavage. A necessary but not sufficient requirement for reactivity is the
presence of approximately 20 bp of dsRNA (i.e., two turns of the A-form double helix), within which occur(s) the cleavage site(s). To rationalize the cleavage specificity, one model proposed that RNase III acts as a "molecular ruler," whereby the scissile bond is selected by its distance from one end of the dsRNA element (72, 73). However, mutational analysis of a T7 phage processing signal showed that the length of the dsRNA element does not dictate cleavage site choice, although it does determine overall reactivity (74). Other structural features can determine the reactivity pattern. For example, asymmetric internal loops can enforce single cleavage, whereas altering the internal loop to fully Watson-Crick base-paired form restores double cleavage (74) (Fig. 5). It was proposed that the internal loop folds into the major grooves of the adjacent double helices, forming a "dsRNA mimicry" structure, which allows only single cleavage (75). This model is not supported by mutational analysis and NMR studies of a representative substrate (74, 76). Internal loops in RNase III processing signals (and other RNAs) instead exhibit a more formal helical shape, which is most likely stabilized by non-Watson-Crick base-pairing interactions (76).
FIG. 5. Structure of the bacteriophage T7 R1.1 RNase III processing signal (B), which undergoes single enzymatic cleavage in the internal loop. Also shown are two R1.1 variants that exhibit fully Watson-Crick base-paired internal loops, and that undergo coordinate double cleavage.
FIG. 6. (A) The consensus model for an RNase III processing signal (see also Refs. 61 and 73). The overall length is approximately 22 bp, or two turns of the RNA double helix. The nucleotides in outlined form represent the conserved base-pairs; the N,N' pairs represent any base-pair combination; the W,W' pairs indicate U·A or A·U base-pairs, whereas the N,n pairs indicate that Watson-Crick base-pairing is not a strict requirement. "(NNNN)x" and "(nnnn)y" are used to indicate that the two opposed segments are not necessarily equal in length, nor necessarily complementary. For example, in the R1.1 processing signal, x = 5 and y = 4 (see Fig. 5). (B) The recognition sequence for restriction endonuclease DrdI. Note the similar pattern of cleavage and placement of the conserved base-pairs, which in this case spans one turn of the B-DNA helix.
The participation of base-pair sequence in establishing RNase III processing signal reactivity has been controversial. RNase III is not a base-specific enzyme, because the nucleotides that immediately flank the scissile bond are not conserved. A number of substrates exhibit a short, conserved base-pair sequence element (CUU·GAA) proximal to the cleavage site. However, base-pair substitutions within this element do not block accurate cleavage of a T7 phage substrate (77). It was therefore proposed that the processing signal identity elements, whether or not specific base-pairs are involved, are spatially dispersed and degenerate in nature (77). There now is evidence for base-pair sequence involvement in processing substrate reactivity. Alignment of the sequences of RNase III substrates with respect to their cleavage sites revealed a more extensive, albeit loosely conserved base-pair consensus motif (73) (Fig. 6A). The consensus base-pair set spans approximately two turns of the double helix, and exhibits a hyphenated dyadic symmetry centered about the cleavage sites. A single turn of the double helix would therefore contain one copy of the consensus base-pair set. The variability in
base-pair sequence establishes the degenerate character of the identity elements. Preliminary studies indicate that base-pair substitutions within the conserved sequence set can inhibit cleavage by weakening enzyme binding (K. Zhang and A. W. Nicholson, unpublished).
The studies summarized above provide a preliminary structure-function model of RNase III, and a qualitative description of the processing pathway. RNase III contains substrate-binding, catalytic, and subunit dimerization domains. The substrate-binding and catalytic domains are physically and functionally separable (Fig. 7). The C-terminal third of the rnc polypeptide, containing the dsRBD, is involved in substrate binding. Preliminary results indicate that the isolated dsRBD of RNase III can bind substrate, but cannot catalyze cleavage (A. Amarasinghe and A. W. Nicholson, unpublished). The location of point mutations that abolish cleavage suggests that the catalytic domain is contained within the N-terminal two-thirds of the enzyme. The separability of substrate-binding and catalytic domains also implies that recognition is not necessarily coupled to catalysis, and that under certain circumstances, RNase III may act as a dsRNA-binding protein. There is preliminary evidence for such an alternative function of RNase III in which specific RNA structures allow RNase III binding, but block cleavage (61, 78).
The twofold symmetries of RNase III and dsRNA imply that processing can occur within a symmetrical enzyme-substrate complex. The model proposes that the dsRBD of each subunit binds a substrate half-site (one turn of dsRNA), which contains a single consensus base-pair set (see Fig. 6A). Substrate binding is accompanied by a change in the enzyme-substrate complex, such that the catalytic site (one per subunit) is positioned next to one of the two scissile bonds (Fig. 8). The chemical step then occurs, followed by product release.
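The domain assignment above can be cross-checked against the point mutations described earlier in this section. The following sketch simply bins each mutated position into the N-terminal two-thirds or the C-terminal third of the 226-residue polypeptide. Placing the boundary at exactly two-thirds of the chain is an illustrative assumption, since the text gives only approximate fractions, and the helper name is hypothetical.

```python
RNC_LENGTH = 226  # amino acids, as stated in the text

# Point mutations described in the text: allele -> (wild type, position, mutant).
MUTATIONS = {
    "rnc70": ("E", 117, "K"),
    "rnc97": ("G", 97, "E"),
    "rnc105": ("G", 44, "S"),
    "rev3": ("A", 211, "V"),
}


def region(position, length=RNC_LENGTH):
    """Assign a residue to the N-terminal two-thirds or the C-terminal third.

    The boundary at exactly two-thirds of the chain is an assumption made
    for illustration; the true domain boundary is not given in the text.
    """
    if not 1 <= position <= length:
        raise ValueError("position outside the polypeptide")
    boundary = (2 * length) // 3
    return "N-terminal two-thirds" if position <= boundary else "C-terminal third"


if __name__ == "__main__":
    for allele, (wild_type, pos, mutant) in MUTATIONS.items():
        print(f"{allele}: {wild_type}{pos}{mutant} -> {region(pos)}")
```

With this crude boundary, the three mutations that impair processing fall in the N-terminal two-thirds, and rev3 falls in the C-terminal third that contains the dsRBD, which is consistent with the separable-domain model.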
FIG. 7. The primary structure of the RNase III polypeptide, indicating the dsRBD (shaded area) and catalytic domain. The black bars indicate sequence identity with the yeast Pac1 nuclease. The sites of specific mutations in RNase III are indicated, and the exact positions (amino-acid number) are given below the diagram. This model predicts that each subunit of the RNase III dimer has a separate substrate binding site and catalytic center (see Section VI,A for further discussion).
[Diagram not reproduced. Residue positions marked: 1, 36, 46, 97, 117, 152, 211, 226. Labeled regions: Catalytic Domain, dsRNA-Binding Domain.]
FIG. 8. The RNase III processing reaction, indicating that double cleavage of dsRNA is a coordinated but not necessarily a concerted reaction.
The involvement of two catalytic sites in the processing reaction means that each strand is cleaved independently. Thus, a substrate half-site may be sufficient to confer substrate reactivity, if the corresponding scissile bond is appropriately positioned in the active site of the bound subunit. This model can rationalize the influence of substrate structure on reactivity. Disruption of secondary structure immediately surrounding the cleavage site (for example, by the presence of an asymmetric internal loop) abolishes the local twofold symmetry in the enzyme-substrate complex. This would allow the placement of only one of the two scissile phosphodiesters in an active site, resulting in single-strand cleavage.
Are there other nucleic-acid-processing enzymes whose mechanisms are relevant to consider in thinking about the RNase III processing reaction? It has been useful to regard RNase III in light of what is known of the DNA restriction endonucleases (see also Ref. 61). Restriction enzymes can cleave at noncanonical sites (i.e., exhibit "star" activity) in low-salt buffers, in the presence of organic cosolvents, or in the presence of divalent metal ions other than Mg2+ (e.g., Mn2+ or Co2+) (79). The noncanonical sites are usually degenerate forms of the recognition sequence. RNase III exhibits star-cleavage activity under comparable conditions, in which secondary sites are cleaved in addition to the primary processing sites (68, 80, 81). Secondary cleavage sites are not normally used in vivo; they usually contain a
smaller dsRNA element, and often exhibit base-base mismatches or other deviations from regular dsRNA. Restriction enzymes show a diversity of primary structure, and it has been argued that the type of recognition site (e.g., the occurrence of hyphenated symmetry) and cleavage pattern (e.g., 5' or 3' overhang of one, two, three, or four nucleotides) dictate the relative placement and structures of the substrate-binding and catalytic sites (79). Therefore, assuming an involvement of base-pair sequence in RNase III substrate recognition, a formal relative of RNase III would be the restriction endonuclease DrdI. This enzyme recognizes the hyphenated sequence, GACNNNN/NNGTC, and cleaves to provide product ends with two-nucleotide 3' overhangs (Fig. 6B). It may be informative to compare and contrast the structures and mechanisms of RNase III and DrdI, with due attention given to the fundamental structural differences between the respective substrates.
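The DrdI cleavage geometry cited above as a formal analogue of RNase III can be sketched in a few lines. The code scans a DNA top strand for the hyphenated site GACNNNN/NNGTC and reports where each strand is cut. It assumes, from the twofold symmetry of the site, that the bottom-strand cut lies two nucleotides to the left of the top-strand cut; the function name and the test sequence are hypothetical and serve only as illustration.

```python
import re

# DrdI site as given in the text: GACNNNN/NNGTC ("/" marks the top-strand cut).
# A lookahead is used so that overlapping sites would also be reported.
SITE = re.compile(r"(?=(GAC[ACGT]{6}GTC))")
TOP_CUT_OFFSET = 7  # nucleotides from the start of the site to the top-strand cut
OVERHANG = 2        # two-nucleotide 3' overhang stated in the text


def drdi_cuts(top_strand):
    """Return (top_cut, bottom_cut) pairs in top-strand coordinates.

    A cut index i means the break falls between top_strand[i-1] and
    top_strand[i]. The bottom-strand cut is placed OVERHANG nucleotides to
    the left of the top-strand cut, which leaves 3' overhangs.
    """
    cuts = []
    for match in SITE.finditer(top_strand.upper()):
        top_cut = match.start() + TOP_CUT_OFFSET
        cuts.append((top_cut, top_cut - OVERHANG))
    return cuts


# Hypothetical DNA sequence containing a single site (invented for illustration).
toy_dna = "TTGACAAAACCGTCAA"

if __name__ == "__main__":
    for top_cut, bottom_cut in drdi_cuts(toy_dna):
        print(top_cut, bottom_cut, toy_dna[bottom_cut:top_cut])
```

The two nucleotides between the staggered cuts form the 3' overhang, the same product geometry described for RNase III, although the RNA enzyme acts on an A-form helix with only a loosely conserved consensus.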
B. Cobra Venom Ribonuclease (RNase V1)
RNase V1 is one of several nuclease activities present in the venom of the central Asian cobra, Naja naja oxiana (11, 82). RNase V1 preferentially degrades dsRNA, but also cleaves helical ssRNA, whereas DNA is not a substrate (12). The physical properties of RNase V1 are unknown, because the enzyme has not been purified to homogeneity. Studies using partially purified enzyme demonstrated that RNase V1 is a phosphodiesterase that requires Mg2+, creates 5'-phosphate termini, and is inhibited at salt concentrations above 100 mM (11, 12). Specific nucleotide sequences are not important for recognition (83). The minimum size for an RNase V1 substrate is approximately four to six nucleotides, which corresponds to the number of ionic contacts established on enzyme binding (12). To reconcile the ability of RNase V1 to cleave dsRNA as well as helical ssRNA, it was proposed that the enzyme recognizes the helical sugar-phosphate backbone (12). RNase V1 has been used to map helical or double-helical regions in RNA. Careful interpretation of RNase V1 structure mapping results is required because studies on tRNA reveal that RNA regions not engaged in a canonical double helix are sensitive to RNase V1, and that double-stranded regions are not uniformly reactive (11, 83). It is clear that the interaction of RNase V1 with its substrates depends on additional parameters that as yet are not well understood.
C. dsRNase Activities Mechanistically Related to Pancreatic RNase
As discussed in Section II, a key diagnostic feature of the RNA double helix is its resistance to RNase A in high salt, and a corresponding sensitivity in low salt. How can a ssRNA-specific nuclease degrade dsRNA? It was proposed that low salt increases interstrand coulombic repulsion between phosphate oxygens, such that the dsRNA is denatured to single-stranded form. However, the RNA double helix is stable under these conditions. A series of investigations analyzed the degradation of dsRNA by RNases mechanistically related to RNase A (i.e., the cyclizing-decyclizing phosphotransferases), including bovine seminal plasma ribonuclease (RNase BS-1) (2, 84, 85). It was initially proposed that the homodimeric structure of RNase BS-1 confers efficient recognition and cleavage of dsRNA, wherein each subunit cleaves one of the two RNA strands. In support of this hypothesis, it was shown that artificially dimerized RNase A can degrade dsRNA under conditions where the monomeric form is inactive (86). However, it was subsequently shown that the monomeric form of RNase BS-1, obtained through reduction/alkylation, exhibits a dsRNase activity comparable to that of the native dimer (87). Examination of the primary structures and dsRNase activities of a number of RNase A-related ribonucleases revealed a correlation between polypeptide basicity and dsRNA cleavage ability. Specifically, the more basic ribonucleases possess a more efficient dsRNase activity. Moreover, the dsRNase activity of RNase A is greatly enhanced by the covalent linkage of spermine residues (84). Studies on RNase A binding to double-helical DNA (which permits measurement of enzyme binding without cleavage) demonstrated that RNase A binds and stabilizes local single-stranded regions. RNase A and its relatives can therefore be regarded as nucleic-acid-melting proteins which can bind dsRNA by taking advantage of the dynamic "breathing" of the double helix. Binding to the ssRNA regions would be followed by cleavage. The low-salt enhancement of dsRNA cleavage by RNase A and its relatives would derive from an increased dsRNA breathing rate, due to increased internal electrostatic repulsion.
The two-step mechanism for dsRNA degradation by RNase A and RNase BS-1 is also consistent with the stereoelectronic restraints on the cyclization/cleavage pathway. Attack of a phosphodiester linkage by the adjacent 2'-oxygen would ordinarily be disallowed within the context of the double helix (Section V), but would proceed when a single-stranded segment is produced on enzyme binding. A study of the dsRNA-binding properties of catalytically inactive mutants of RNase A or BS-1 could determine how enzyme binding participates in helix destabilization, how the salt concentration influences the binding and cleavage of dsRNA, and how specific posttranslational modification (84) may stimulate the dsRNase activity of otherwise ssRNA-specific activities. In contrast to RNase A and its relatives, such phosphodiesterases as RNase III would not necessarily require a single-stranded segment as substrate. Because these enzymes employ an activated water molecule as the nucleophile, the phosphodiester linkage can be cleaved through an in-line mechanism, which would be stereoelectronically allowed within a double-helical structure.
ALLEN W. NICHOLSON
VII. dsRNA Function in Prokaryotes
A. Gene Regulation by Ribonuclease III
Insight into the role of RNase III in E. coli RNA metabolism was provided by the isolation of the rnc105 mutation, which abolishes RNase III processing in vivo (88). The 30-S RNA species that accumulates in rnc105 mutant strains represents the primary transcription product of the rRNA operons. RNase III processing of the primary transcript creates the immediate precursors to the 16-S and 23-S rRNAs (61). The viability of RNase III⁻ strains indicates that other processing activities provide alternate rRNA maturation pathways (89). A number of cellular mRNAs also are processed by RNase III (Table I). Although the list of RNase III targets is undoubtedly incomplete, their encoded functions indicate that RNase III regulates expression of components involved in the flow of genetic information (i.e., the synthesis, maturation, function, and decay of RNA). In addition to its role in the metabolism of specific cellular RNAs, RNase III processes transcripts expressed by a wide range of phage and accessory genetic elements. RNase III cleaves RNAs encoded by phage T7 and its relatives, as well as transcripts of phages T4 and lambda (61). Plasmids and transposons express RNAs that contain RNase III processing signals, and antisense RNA binding to their targets provides RNase III substrates (Section VII,B).⁴
RNase III processing can control gene expression by altering mRNA translational activity. The translation of most prokaryotic mRNAs depends on the accessibility of the mRNA Shine-Dalgarno (SD) sequence to the complementary (anti-SD) sequence at the 3’ end of the 16-S rRNA. RNase III processing within the 5’ untranslated region (5’-UTR) of an mRNA can enhance translation by disengaging the SD sequence from secondary structure, promoting 30-S subunit binding.
For example, RNase III cleavage within the 5’-UTR of the T7 polycistronic early transcript creates the mature 0.3 gene mRNA, and also stimulates the production of the 0.3 protein (90). RNase III cleavage within the 3’-UTR enhances translation of the T7 1.1/1.2 mRNA, apparently by disrupting a long-range RNA-RNA interaction (91). A
⁴ RNase III processing signals are relatively abundant in coliphages and accessory genetic elements. It was speculated that RNase III may protect the cell against infection by RNA phage (as well as other phage) by attacking dsRNA replicative intermediates or viral mRNAs (72). An original antiviral function of RNase III may have been subsequently subverted by phage and extrachromosomal elements to their advantage (61). To speculate further, RNase III may represent a modern version of a primitive cellular activity that restricted genetic exchange at the RNA level. Such an activity would have been potentially toxic to the cell, given the ubiquity of dsRNA structures, and would need to have been tightly regulated, or cellular dsRNAs subtly altered to avoid cleavage.
DOUBLE-STRANDED RNA
TABLE I. Escherichia coli RIBONUCLEASE III PROCESSING SIGNALS(a)

Operon | Encoded functions | No. of sites | Processing signal function
-------|-------------------|--------------|---------------------------
rrnA-H | 16-S, 23-S, 5-S rRNA; tRNAs | 2 | Maturation of rRNAs; tRNA
rnc-era-recO | RNase III, Era, RecO proteins | 1 | Initiation of mRNA decay
rpsO-pnp | r-Protein S15, PNPase | 1 | Initiation of mRNA decay
metY-nusA-infB | tRNA(fMet); NusA protein, IF2 | 1 | Initiation of mRNA decay; tRNA maturation
rplK,A,J,L-rpoB,C | r-Proteins L1, L7/L12, L10, L11; β, β’ RNA polymerase subunits | 1 | Modulation of mRNA expression (?)
secE-nusG | SecE, NusG proteins | 1 | Modulation of mRNA expression (?)

(a) See Section VII,A and Ref. 61 for further discussion of the structures, reactivities, and functions of the listed RNase III processing signals.
recent report describes an RNase III processing signal within an mRNA coding sequence (92), whose cleavage down-regulates expression of the encoded protein. RNase III may also control translation by binding to a specific site without concomitant cleavage (78). The binding event may induce an mRNA conformational change that enhances translation initiation. RNase III processing can also control gene expression by altering mRNA stability. Cleavage within mRNA 3’-UTRs can provide a 3’ hairpin that blocks the action of 3’→5’ exonucleases, such as polynucleotide phosphorylase (61). The in vivo stabilities of the T7 phage early mRNAs are established in part by 3’ hairpins, created by RNase III processing (93). Alternatively, cleavage within a 3’-UTR can remove an RNA hairpin or other secondary structure, thereby accelerating mRNA decay. For example, RNase III cleavage of the phage lambda sib regulatory element removes an RNA hairpin, thereby promoting 3’→5’ exonucleolytic digestion into the upstream integrase coding region, suppressing protein production (61, 94). RNase III cleavage within a 5’-UTR can also initiate RNA turnover. In this instance, RNase III processing can facilitate subsequent cleavage by degradative endonucleases, such as RNase E (95). This mechanism is involved in the autoregulated production of RNase III (96), and the negative control of polynucleotide phosphorylase (PNPase) (97). With regard to the latter event, RNase III⁻ strains exhibit altered RNA metabolism (98); this may result in part from the elevated levels of PNPase, which would accelerate the degradation of PNPase-sensitive mRNAs.
RNase III activity can be controlled through covalent modification. RNase III is phosphorylated on serine in the T7-infected cell by a phage-encoded protein kinase, which enhances processing activity (99). Because T7 infection shuts off host protein synthesis, the T7-directed phosphorylation may allow the limited amounts of RNase III to process efficiently the large quantities of the T7 mRNAs, many of which have RNase III cleavage sites (93). The phosphorylation may confer an additional degree of stability to the T7 messages by enhancing PNPase mRNA cleavage, thereby suppressing PNPase production (97), which may be involved in T7 mRNA degradation. It is not known whether RNase III is a target for a cell-encoded protein kinase, but some form of regulation is feasible to consider, as RNase III can bind ATP (66), and may interact with other proteins (e.g., see Refs. 67 and 100).
B. dsRNA and Antisense Regulation
Antisense RNAs bind to complementary sequences in target RNAs, forming specific RNA·RNA duplex structures, which can alter target function. Antisense RNAs can be generated through transcription of all or part of the target gene complementary strand, or expressed from an unlinked locus. Extensive studies on natural antisense RNAs have been spurred by the inherently interesting properties and mechanisms of action of these regulatory molecules, and by the development of antisense technology for the directed control of gene expression (for recent comprehensive reviews, see Refs. 101-103). Prokaryotic antisense RNAs act primarily as negative regulatory elements. For example, antisense RNA binding may directly sequester an mRNA translation initiation region, or inhibit target RNA function through an allosteric mechanism. Prokaryotic antisense RNA action does not necessarily require full-length duplex formation, and moreover, although the dsRNA product is formally an RNase III substrate, enzymatic degradation is often not necessary for regulation. This section reviews several natural antisense RNA-mediated regulatory mechanisms, and the role of dsRNA in antisense action.
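The complementarity that underlies antisense recognition can be illustrated with a minimal Python sketch. It assumes strict Watson-Crick pairing (G·U wobble pairs are ignored), and the target segment is a hypothetical sequence chosen only for illustration; the function names are not drawn from any cited work.

```python
# Watson-Crick partners for RNA bases (G-U wobble pairs are ignored here).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the antisense strand, written 5'->3', for an RNA written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def is_fully_paired(target: str, antisense: str) -> bool:
    """True if the two strands form an uninterrupted antiparallel duplex."""
    return (
        len(target) == len(antisense)
        and reverse_complement(target) == antisense.upper()
    )


# Hypothetical target segment (not taken from any real mRNA).
target_region = "GGAGGUAACAUGGCU"
antisense_rna = reverse_complement(target_region)
print(antisense_rna)
print(is_fully_paired(target_region, antisense_rna))
```

Natural antisense RNAs frequently pair with only part of their target, so a full-length check of this kind describes the formal dsRNA product rather than the regulatory requirement.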
1. ColE1 PLASMID REPLICATION CONTROL
Initiation of replication of plasmid ColE1 requires RNA primer formation, which is negatively controlled by an antisense RNA (104). The 3’ end of the RNA primer for leading strand DNA synthesis is created through site-specific RNase H cleavage of the precursor transcript, RNA II. Cleavage is inhibited by RNA I, a plasmid-encoded antisense transcript of 108 nt. Specifically, RNA I base-pairs with RNA II within a specific segment upstream of the RNase H cleavage site. Duplex formation causes a conformational change in RNA II, which suppresses stable formation of the RNA·DNA duplex target for RNase H. The RNA I·RNA II duplex is ultimately degraded by RNase III, but this event is not required for negative regulation (104).
Extensive investigations provide a detailed description of the specific structural features in RNA II and RNA I that promote duplex formation and a pathway for RNA I action (104). RNA I and RNA II initially engage in a “kissing” interaction, in which reversible base-pairing occurs between complementary hairpin-loop nucleotides in each RNA. The kissing reaction is the rate-limiting step for the association of the two RNAs, and mutations in the loops that abolish complementarity suppress negative regulation. A ColE1 plasmid-encoded protein (Rom) enhances negative control by stabilizing the kissing complex (105). Formation of a stable dsRNA complex involves pairing of the single-stranded 5’ end of RNA I with the complementary sequence in RNA II. The creation of a nucleation center for dsRNA formation at a location separate from the kissing site avoids the topological barrier to double-helix formation involving two closed, complementary loops (104).
2. R1 PLASMID REPLICATION CONTROL
The replication of plasmid R1 depends on the synthesis of the plasmid-encoded protein RepA, which participates in the initiation step (106). RepA protein production is negatively controlled at the translational level by the plasmid-encoded CopA RNA: an approximately 90-nt, constitutively synthesized antisense transcript (107). The steady-state levels of CopA RNA directly reflect plasmid copy-number, because CopA RNA has a short metabolic
[Figure 9 diagram; recoverable labels: loop I, loop II, binding of CopA, no binding, middle region, tail of CopA.]
FIG. 9. Mechanism of CopA antisense RNA action. (A) The secondary structure of CopA RNA, indicating hairpin loops I and II. (B) The overall mechanism for CopA interaction with its target, leading to repression of RepA protein production (see Section VII,B for further discussion). Reprinted by permission of Oxford University Press from Ref. 135.
half-life. CopA RNA exhibits two hairpins, the loop nucleotides of which are available for binding to complementary sequences within the repA mRNA 5’ leader region (termed CopT) (Fig. 9). A stable kissing interaction between complementary loops in the CopA RNA and the CopT sequence is followed by dsRNA formation at a site separate from the kissing loops. The binding of CopA RNA to CopT sequesters a short upstream reading frame, tap, preventing its translation and therefore also that of repA, which is translationally coupled to tap (108, 109) (Fig. 9). The kissing interaction alone may be sufficient for inhibition of repA translation (110), and although the CopA·CopT duplex represents a target for RNase III, the absence of RNase III has only a minor effect on the translational activity and metabolic stability of repA mRNA (110, 111).
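The statement that a constitutively synthesized, short-lived RNA tracks plasmid copy number can be made concrete with a simple kinetic sketch. The model below is an assumption introduced for illustration (constant synthesis per plasmid copy, first-order decay); the rate and half-life values are hypothetical and are not measurements for CopA RNA.

```python
import math


def steady_state_level(copy_number: int, synthesis_per_copy: float,
                       half_life_min: float) -> float:
    """Steady-state RNA molecules per cell under the assumed model.

    dC/dt = synthesis_per_copy * copy_number - k_decay * C,
    so at steady state C = synthesis_per_copy * copy_number / k_decay.
    """
    k_decay = math.log(2) / half_life_min
    return synthesis_per_copy * copy_number / k_decay


def time_to_fraction(fraction_of_gap: float, half_life_min: float) -> float:
    """Minutes needed to close the given fraction of the gap to a new steady state."""
    k_decay = math.log(2) / half_life_min
    return -math.log(1.0 - fraction_of_gap) / k_decay


# Hypothetical parameters: 5 transcripts per minute per copy, 2-min half-life.
for copies in (1, 2, 4):
    print(copies, round(steady_state_level(copies, 5.0, 2.0), 1))

# A short half-life means the RNA level readjusts quickly after a copy-number change.
print(round(time_to_fraction(0.9, 2.0), 2))
```

Under these assumptions the steady-state level is strictly proportional to copy number, and the time needed to reach a new level scales with the half-life, which is why a short-lived antisense RNA can report copy number promptly.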
3. REGULATION OF PLASMID KILLER-GENE EXPRESSION
Several plasmids are maintained through expression of killer genes, whose products destroy plasmid-free segregants. Analysis of the R1 plasmid hok/sok system provides insight into the mechanism of action of plasmid killer genes, and the regulation of their expression by antisense RNA. The R1 plasmid hok mRNA encodes the Hok (host-killing) protein, which causes cell death by damaging the cytoplasmic membrane (112, 113). An antisense transcript, termed Sok (suppressor-of-killing) RNA, down-regulates Hok protein expression. Specifically, Sok RNA (67 nt) is complementary to the translation-initiation region (TIR) of the mok (modulator-of-killing) gene, which overlaps the hok coding sequence in a separate reading frame. Sok RNA binding creates a duplex that sequesters the mok TIR, and suppresses Hok protein production, because the hok and mok cistrons are translationally coupled (114). Sok RNA binding also accelerates the decay of the RNA, presumably through the action of RNase III (115). Sok RNA binding to Hok mRNA does not proceed through the interaction of complementary loops, but involves a single-stranded region at the 5’ end of Sok RNA (115). The killing of cells that lack the R1 plasmid depends on (i) the persistence of the sok and hok/mok RNAs in the segregants and (ii) differential RNA decay rates. In the absence of continued transcription in plasmid-free cells, the more rapid decay of sok RNA allows translation of hok mRNA and production of the toxic Hok protein. An important additional facet of this mechanism is that the Hok mRNA must undergo enzymatic cleavage within the 3’-UTR in order to become active translationally (115). Cleavage allows translation by apparently disrupting a long-range RNA·RNA interaction between the mok TIR and the 3’-UTR. The RNA processing activity has not been identified. The 3’-UTR sequence therefore provides an important negative regulatory element, not only in preventing the inappropriate synthesis
of Hok protein in plasmid-containing cells, but in preventing premature Hok mRNA degradation resulting from Sok RNA binding and RNase III attack.
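The dependence of post-segregational killing on differential RNA decay can be sketched numerically. The Python example below is a deliberate simplification: it assumes that both RNAs decay exponentially once transcription stops, it ignores the 3’-UTR processing step described above, and all starting levels and half-lives are hypothetical.

```python
import math


def remaining_fraction(minutes: float, half_life_min: float) -> float:
    """Fraction of an RNA pool left after first-order decay for the given time."""
    return 0.5 ** (minutes / half_life_min)


def crossover_time(sok0: float, hok0: float,
                   sok_half_life: float, hok_half_life: float) -> float:
    """Minutes after plasmid loss at which Sok RNA falls to the hok mRNA level.

    Solves sok0 * 2^(-t/ts) = hok0 * 2^(-t/th) for t.
    """
    if sok0 <= hok0:
        return 0.0
    if sok_half_life >= hok_half_life:
        raise ValueError("antisense RNA must decay faster than its target")
    return math.log2(sok0 / hok0) / (1.0 / sok_half_life - 1.0 / hok_half_life)


# Hypothetical values: Sok in tenfold excess but short-lived, hok mRNA stable.
t_cross = crossover_time(sok0=100.0, hok0=10.0, sok_half_life=1.0, hok_half_life=20.0)
print(round(t_cross, 2))
print(round(100.0 * remaining_fraction(t_cross, 1.0), 3))
print(round(10.0 * remaining_fraction(t_cross, 20.0), 3))
```

In this sketch an excess of unstable antisense RNA protects the plasmid-containing cell, while the stable target outlasts it in a segregant; the calculated time has no quantitative meaning beyond the assumed parameters.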
4. CONTROL OF IS10 TRANSPOSASE EXPRESSION
Tn10 transposon movement is negatively regulated at the translational level by a 70-nt antisense transcript, termed RNA-OUT (116). RNA-OUT is complementary to the 5’ end portion of the transposase mRNA (RNA-IN). RNA-OUT binding to RNA-IN creates an approximately 35-bp duplex, which blocks translation by directly sequestering the TIR of the transposase cistron (117). The dsRNA segment is a substrate for RNase III, although RNase III is not required for negative regulation (118). RNA duplex formation is initiated by a kissing interaction involving the hairpin loop of RNA-OUT and the complementary sequence in RNA-IN. The secondary structure and mechanism of action of RNA-OUT are similar to those of several other plasmid antisense RNAs, the notable exception being that the kissing loop also serves as the nucleation site for full-length duplex formation (119). Tn10 transposition exhibits multicopy inhibition, wherein transposition frequency decreases with increasing Tn10 copy number. Effective multicopy inhibition is due to the metabolic stability of RNA-OUT, whose hairpin structure confers resistance to exo- and endoribonucleases (120).
5. CONTROL OF LYSOGENY IN BACTERIOPHAGE LAMBDA
Phage lambda expresses a 77-nt transcript (OOP RNA) that is complementary to a 55-nt segment containing the 3’ end of the lambda cII gene and the adjoining 22 nucleotides in the cII-O gene intercistronic region. Overexpression of OOP RNA from a plasmid reduces cII gene expression to approximately [value illegible], through destabilization of the cII coding sequence (121). OOP RNA binding to its target allows RNase III cleavage within the cII-O intercistronic region, and the new 3’ end provides an initiation site for 3’→5’ exonucleolytic digestion into the cII coding sequence (73). This mechanism is similar to the sib-dependent retroregulation of lambda int mRNA expression (Section VII,A). The precise pathway of RNA·RNA duplex formation is unknown, because the secondary structures of OOP RNA and the target cII-O sequence have not been determined. OOP RNA is not involved in the lysis/lysogeny decision following infection (122). However, OOP RNA production following prophage induction antagonizes cII expression, thereby down-regulating cI repressor synthesis. The suppressed cI levels serve to enforce the lytic pathway (122). The specific involvement of OOP RNA in prophage induction is consistent with the dependence of OOP promoter activity on the LexA repressor (122).
6. HIGHLIGHTS OF OTHER ANTISENSE RNA-DEPENDENT REGULATORY MECHANISMS
Fertility (F) plasmid conjugation requires expression of the plasmid tra (transfer) operon, which is controlled by the transcriptional activator protein, TraJ. TraJ production is negatively regulated by the product of the plasmid finP (fertility inhibition) gene, a 78-nt antisense RNA that is complementary to the 5’ leader of the TraJ mRNA (123). Binding of FinP RNA occludes the TIR of the TraJ mRNA, repressing TraJ synthesis. The dsRNA segment formally provides a substrate for RNase III, but it is not known whether repression requires cleavage. This mechanism is formally similar to the antisense regulation of IS10 transposase expression (see Section VII,B,4). FinP RNA is stabilized by the finO gene product, a protein that also enhances the binding of FinP RNA to TraJ mRNA (124). The c4 repressors of bacteriophages P1 and P7 are antisense RNAs of approximately 77 nt that regulate expression of the phage ant (antirepressor) gene (125). Upstream of and overlapping ant is an open reading frame, icd (formerly o f l), which is required for ant expression. c4 RNA binding to its complementary target sequence represses icd translation, which in turn represses ant expression through inducing early transcription termination (126). The c4 RNA is cotranscribed with icd and ant, and at least one processing event is required for the maturation of c4 antisense RNA (125). The E. coli FtsZ protein is involved in the septation step of cell division. The FtsZ protein levels are controlled by a variety of factors. A 53-nt RNA (DicF RNA), encoded by the dicF gene of a defective prophage, acts as a negative regulator of FtsZ protein production (127, 128). DicF RNA is complementary to a segment of the ftsZ mRNA containing the TIR (128). Preliminary experimental evidence indicates that DicF RNA inhibits FtsZ protein production by blocking 30-S subunit recognition of the ftsZ TIR (127, 128). An E. coli cell-encoded antisense RNA, MicF RNA, has been implicated in regulating the expression of the outer membrane protein, OmpF. MicF RNA is transcribed from an unlinked locus, and is complementary to the 5’ end of OmpF mRNA (129, 130). The MicF-dependent reduction in OmpF protein production precedes the drop in steady-state levels of OmpF mRNA, indicating that repression occurs through translation inhibition rather than by mRNA destabilization (130). There also is evidence for specific protein binding to the antisense RNA, suggesting that MicF RNA functions as an RNA-protein complex (131). Perhaps the protein stabilizes MicF RNA in a manner similar to the stabilization of FinP RNA by FinO protein.
7. ANTISENSE RNA DESIGN STRATEGIES
An important experimental objective is to achieve targeted control of gene expression. “Designer” antisense RNAs can provide such control at the
post-transcriptional level, and are particularly well-suited to negatively regulating the expression of genes essential or otherwise inaccessible to other forms of control. It was originally speculated that antisense RNAs with optimal activity would be relatively unstructured and specific for a comparably unstructured, functionally essential region in the target. Several studies examined the efficacy of artificial antisense RNAs, expressed from “reversed” copies of the target genes (132-134). Reversed gene expression was shown to inhibit target mRNA expression, and optimal inhibition was observed when the antisense transcript is complementary to the TIR of the target mRNA (132, 134). However, the requirement for relatively large amounts of the antisense RNA indicated an inherent inefficiency of action. Placing a TIR at the 5’ end of the reversed gene transcript increased the effectiveness of inhibition (134). The TIR may promote ribosome binding, which would block RNA degradation that initiates at the 5’ end. Incorporation of a transcriptional terminator structure at the 3’ end of the reversed RNA also increased the inhibition, and it was hypothesized that the terminator permits a higher rate of antisense RNA synthesis (134). Alternatively, the terminator hairpin may act as a 3’-end stabilizer, protecting the antisense RNA from 3’→5’ exonucleolytic decay. In contrast to “reversed gene” transcripts, natural antisense RNAs reflect sophisticated design principles. As evidenced by the examples described above, these RNAs are typically small (50-110 nt), with a high degree of secondary structure and specific noncanonical elements that afford protection against degradation (134a). The loop structures appear to provide optimal recognition of the target RNA, and bases within the stem can influence the antisense interaction (135, 135a). The precise nature of the kissing interaction between loops must be carefully considered for proper function.
Recognition loops typically contain five to seven nucleotides, and loops exhibiting fewer or a greater number of nucleotides usually exhibit a decreased rate of stable complex formation. However, antisense and target RNAs that contain significantly larger loops can interact productively, wherein duplex formation directly propagates from the site of initial binding (136). Finally, the ability of a small antisense RNA to hybridize to a model RNA hairpin is sensitive to the exact placement of the target sequence within the hairpin loop, and dependent on specific structural features of the stem (137).
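Two of the design considerations above, loop size and loop-loop complementarity, lend themselves to a simple screening sketch in Python. The functions and the example loop sequences are hypothetical; the five-to-seven nucleotide window is taken from the text, only Watson-Crick pairs are counted, and no claim is made that passing these checks predicts efficient complex formation.

```python
# Allowed Watson-Crick pairs; wobble and noncanonical pairs are not considered.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def loops_complementary(loop_a: str, loop_b: str) -> bool:
    """True if every base of loop_a pairs with loop_b read in antiparallel."""
    if len(loop_a) != len(loop_b):
        return False
    return all(
        (x, y) in PAIRS
        for x, y in zip(loop_a.upper(), reversed(loop_b.upper()))
    )


def loop_size_typical(loop: str, low: int = 5, high: int = 7) -> bool:
    """True if the loop length falls in the five-to-seven nucleotide window."""
    return low <= len(loop) <= high


# Hypothetical loop sequences for an antisense RNA and its target.
antisense_loop = "GGCAUC"
target_loop = "GAUGCC"
print(loop_size_typical(antisense_loop))
print(loops_complementary(antisense_loop, target_loop))
```

A candidate pair that fails either check would be expected, on the reasoning above, to form a stable complex more slowly, although the larger productive loops noted in Ref. 136 show that the size window is a tendency rather than a rule.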
8. dsRNA AND RIBOZYME FUNCTION IN PROKARYOTES
Ribozyme-catalyzed cleavage of RNA incorporates the essential features of antisense RNA action, in that trans-acting ribozymes recognize their target through complementary base-pairing. Because ribozymes act catalytically rather than stoichiometrically, a higher efficiency of action may be realized. Targeted cleavage of bacterial RNA by ribozymes in vivo has not
been extensively investigated, but a preliminary report suggested an ineffectiveness of a ribozyme in E. coli (138). An explanation for the observed inefficiency was that the coupled synthesis and translation of bacterial mRNA reduces the accessibility of the target sites (138, 139). A recent study demonstrates that a ribozyme can function with reasonable efficiency in the bacterial cell (140). A plasmid-encoded ribozyme was targeted to a site within the coding region of the A2 gene of the RNA coliphage, SP. Expression of the ribozyme suppressed phage growth. The inhibition presumably occurred through site-specific cleavage, because a catalytically inactive version of the ribozyme only weakly inhibited phage growth. The rapid in vivo turnover of the RNA prevented direct confirmation of cleavage at the predicted site. The corresponding antisense RNA was also able to inhibit SP phage infection, which may have been due to formation of the RNA·RNA duplex, followed by degradation by RNase III (140).
VIII. dsRNA Function in Eukaryotes
A. dsRNA and hnRNA
A significant fraction (approximately 5%) of the sequences in mammalian cellular heterogeneous nuclear RNA (hnRNA) can be isolated in double-stranded form (141). The dsRNA component has been identified by (i) resistance to RNase A in high salt, (ii) chromatographic behavior on CF-11 cellulose, and (iii) sensitivity to RNase III (141-143). Analysis of HeLa cell hnRNA revealed that the dsRNA component occurs, on average, every 2000-2500 nt, and is derived from the Alu family of repetitive sequence elements, of which there are approximately 300,000 copies per haploid genome (143-145). The size of the dsRNA ranges up to approximately 300 base-pairs (143, 146). Transcription of the Alu inverted repeat sequences would allow formation of intramolecular hairpin (“snap-back”) structures, as well as intermolecular duplexes. The latter process explains the tendency of hnRNA to aggregate, which can be reversed by brief heat treatment. A portion of the dsRNA fraction of mammalian hnRNA is resistant to RNase III (143). The resistance may be due to the natural sequence heterogeneity of the Alu sequence family, which would provide mismatched intermolecular duplexes not recognized by RNase III (143). Alternatively (or in addition), the cleavage resistance may reflect the action of the dsRNA adenosine deaminase (Section VIII,C), which converts A·U to I·U base-pairs. This assumes that the dsRNA elements are present in vivo, and that I·U base-pairs can block RNase III action. It may be informative to determine the inosine content of hnRNA-derived dsRNA and whether hnRNA-specific
dsRNA is a substrate for the dsRNA adenosine deaminase. The presence of dsRNA in purified hnRNA could have a trivial explanation, in that the dsRNA is a product of phenol extraction during isolation. Phenol accelerates nucleic-acid-reassociation reactions (13). However, there are several lines of evidence for the occurrence of dsRNA within the eukaryotic cell nucleus. One study isolated hnRNP (hnRNA associated with specific nuclear proteins) by a gentle extraction procedure that omitted phenol, and applied differential nuclease sensitivity to demonstrate the presence of dsRNA within the hnRNP preparation (146). A subsequent study obtained cross-linking of dsRNA regions in vivo, using a photoreactive psoralen derivative that could be taken up by the cell (147). These and related studies (16) provide strong evidence that dsRNA is an intrinsic component of hnRNP, and is relatively accessible to nuclease digestion and photocross-linking. The presence of dsRNA in vivo also has been supported by immunocytochemical studies. Immunofluorescent staining by dsRNA-specific antibodies was observed in the nucleus of Vero cells and mosquito cells (14). There was no detectable immunofluorescence of the nucleolus or the cytoplasm of these cells. However, it should be noted that under the same conditions, other cell lines, which included HeLa, KB, BHK, and CEF cells, did not provide a detectable reaction (14). The functional role of the dsRNA component of hnRNA is not known, but its nuclear localization has focused attention on several possibilities. The dsRNA component may provide a structure that organizes hnRNP and promotes specific interactions with the nuclear matrix, including those that facilitate nuclear-cytoplasmic transport.
Because dsRNA-binding proteins are implicated in developmental programs (52), it is possible that dsRNA elements in specific mRNA precursors, not necessarily related to the Alu sequences, provide protein binding sites or signals for trafficking, storage, and controlled expression. Alternatively, the nuclear dsRNA component may lack a specific function and be targeted for degradation by dsRNA-specific nucleases, the dsRNA adenosine deaminase, or the (2’-5’)A polymerase/RNase L system. Depending on their location within hnRNA, dsRNA elements may be removed along with introns, or by cleavage of 3’ trailer sequences. Normal cell function may require compartmentalization or masking of dsRNA. For example, given the lengths of the Alu-related dsRNA sequences (up to 300 bp), the inappropriate presence in the cytoplasm of these sequences could activate the dsRNA-dependent protein kinase and inhibit translation, as well as trigger interferon gene expression. It is also possible that during certain cellular events (e.g., nuclear envelope breakdown or altered RNA processing) nuclear-localized dsRNA may enter the cytoplasm and trigger specific changes in cell physiology.
B. dsRNase Activities
The discovery of E. coli RNase III and identification of its role in rRNA maturation prompted the search for a similar activity involved in eukaryotic rRNA processing. There is now good evidence for the existence of one or more dsRNase activities in mammalian cells (summarized in Table II), but there are scant data on their functional roles. Biochemical analyses provide limited information, because the activities have been difficult to purify. A cautionary note is provided by the observation that mycoplasmas, ubiquitous contaminants of mammalian cell lines, are a source of a dsRNase (148). A yeast dsRNase is here described first, because the enzyme bears a number of similarities to RNase III, and because there is some information available on its cellular role.
1. RNase III-RELATED ACTIVITIES IN YEAST
A dsRNA-specific nuclease in the yeast Saccharomyces cerevisiae was first detected using an in situ gel electrophoretic enzyme assay. The dsRNase activity degrades poly(rG)·poly(rC), and is associated with a 26-kDa polypeptide (149). Using a different approach, another study described a S. cerevisiae dsRNase of 27 kDa (150). This dsRNase required reducing agents for full activity, and was stimulated by KCl. Cell-growth experiments indicated that the dsRNase activity levels are higher in cells deprived of nutrients, and it was suggested that under these conditions the increased activity may enhance RNA turnover and ribonucleotide reutilization (150). The Schizosaccharomyces pombe pac1 gene encodes a 41-kDa polypeptide that degrades dsRNA in vitro (64). The C-terminal portion of the Pac1 enzyme has a 25% amino-acid similarity with the complete primary structure of E. coli RNase III (64, 151). However, antibodies to RNase III do not react with the Pac1 enzyme, and neither the pac1 nor the rnc gene exhibits measurable activity in reciprocal complementation experiments (64). The role of Pac1 enzyme in RNA metabolism has been partially defined. The pac1 gene product (Pac1) is essential for vegetative growth (64), and overexpression of the enzyme inhibits entry into meiosis. It is possible that the enzyme suppresses meiotic gene expression during vegetative growth, and must be down-regulated to allow entry into meiosis (64). Alternatively, Pac1 may be required for the maturation of meiosis-specific transcripts. The Pat1 protein kinase may regulate Pac1 enzyme activity. Because pat1 mutants exhibit uncontrolled meiosis, the Pat1 enzyme inhibits meiosis, and must be suppressed (probably by the mei3 gene product) to allow the cell to enter meiosis. Because overexpression of Pac1 enzyme permits normal vegetative growth and sexual development of a pat1(ts) mutant at the nonpermissive temperature (64), one scenario is that the Pac1 enzyme activity is stimu-
TABLE II. MAMMALIAN CELL AND VIRUS-ASSOCIATED dsRNA-CLEAVING ACTIVITIES(a)

Name(b) | Source(c) | Size(d) | Salt optima; other requirements(e) | Other features | Ref.
--------|-----------|---------|-------------------------------------|----------------|-----
FV3 dsRNase | FV3 virions; cytoplasm of FV3-infected BHK cells | ND | Requires Mg2+ (~5 mM) | | 165, 166
 | RSV virions | ND | Requires Mg2+ | | 167
 | Cytoplasm of Krebs II ascites cells | ND | ND | | 155
RNase D | Cytoplasm of Krebs II ascites cells | 50-150 kDa | ND | | 154
RNase DS | dsRNA-treated chick embryo cells | 34.5 kDa | 0.05-1.4 mM Mg2+; 0.3-30 mM M+ | Associated ssRNase | 168
RNase DII | Chick embryo cell extracts, or nucleolar fraction | 43-70 kDa (several species) | 0.5-1 mM Mg2+; 75-100 mM M+ | Associated ssRNase | 158
- | Cytoplasm of mouse embryo cells | 65 kDa | 2-5 mM Mg2+; 25-50 mM M+ | Associated ssRNase | 160, 161
RNase D | HeLa cell hnRNP | ND | ND | - | 157
- | Calf thymus (whole cell and nuclei) | 60 kDa | 2-4 mM Mg2+; | |

C > G, with no obvious 3’ neighbor preference (189). Short dsRNAs showed high site selectivity, whereas longer substrates were promiscuously deaminated. The placement of specific adenine residues relative to the duplex termini strongly influenced their ability to be deaminated (189). Thus, the size and sequence of the duplex substrate may be sufficient to confer the requisite editing specificity. Additional factors may regulate dsRAD activity (180). A cytoplasmic protein or protein complex can bind dsRNA and block the action of dsRAD (189a). Also, depending on the specific developmental stage, the enzyme can either be nuclear or cytoplasmically localized (176). The dsRAD may be involved in the cellular antiviral response and in cell development. The dsRAD mRNA is expressed in every human tissue tested and is especially prevalent in brain tissue (183b). Specific viral infection or dsRNA treatment causes a decrease in dsRAD activity (190), and it was proposed that the down-regulation of dsRAD may increase the cytoplasmic dsRNA levels, thereby enhancing the antiviral interferon response. Another study implicated the dsRAD in triggering the differentiation of pluripotent embryonal carcinoma cells through an autocrine signaling mechanism (191). Specifically, a programmed decrease in dsRAD activity would cause a corresponding rise in the cytoplasmic dsRNA levels. The cytoplasmic dsRNA would autoinduce interferon production, and force the cells to exit the proliferative state and terminally differentiate (191).
2. dsRNA UNWINDING AND ANNEALING ACTIVITIES
The existence of proteins that catalyze unwinding of dsRNA (RNA helicases) or, conversely, facilitate dsRNA formation (RNA annealing proteins) implies biological processes that involve the directed denaturation or formation of dsRNA. RNA helicase activities are ubiquitous, and use the free energy provided by nucleoside triphosphate hydrolysis to catalyze the unwinding and separation of RNA strands engaged in a duplex structure (e.g., see Refs. 192-202). Several prokaryotic RNA helicases have been identified that appear to be involved in the assembly and function of the translational apparatus and in mRNA utilization. The DbpA protein, encoded by the dbpA gene (193), hydrolyzes ATP specifically in response to binding 23S rRNA, and may manipulate a 23S rRNA structure during 50S subunit assembly (198). The product of the srmB gene suppresses a temperature-sensitive defect in
ribosomal protein L24, which inhibits proper ribosome assembly at the nonpermissive temperature (203). The deaD gene [also identified as the mssB gene (201)] was first identified as a multicopy suppressor of a temperature-sensitive mutation in ribosomal protein S2 (195). It has been speculated that the DeaD protein may alter mRNA structure during translation and/or participate in 30S subunit assembly (I%), although other functional roles are possible (201). Genetic evidence indicates that the DeaD and SrmB proteins do not share a common role (195). The transcriptional terminator protein Rho can be regarded as a helicase, because its action is directed toward unwinding RNA.DNA hybrid structures at Rho-dependent terminator sites (204).
Eukaryotic RNA helicases have been implicated in manipulating mRNA structure during translation initiation (192) or pre-mRNA structure during nuclear splicing reactions (199). The ATP requirement for spliceosome-catalyzed pre-mRNA splicing in part reflects the action of specific helicases that mediate interactions between snRNP particles (194). Several helicase activities have been purified from nuclear extracts of HeLa cells. One activity, termed RNA Helicase A, unwinds dsRNA with a 3' → 5' directionality (200), whereas the other enzyme (RNA Helicase II) exhibits a 5' → 3' directionality (205). Both enzymes catalyze multiple rounds of duplex unwinding. RNA helicase A is active in monomeric form and is closely related to the protein encoded by the Drosophila gene maleless (196). The exact role of these mammalian nuclear-localized helicases remains to be demonstrated. It was recently shown that the monomeric RNA helicases contain two copies of the dsRBD (53), and it was proposed that, for the monomeric helicases, two dsRNA-binding domains are necessary to generate the unwinding force and movement along the double helix (53).
A protein present in the mammalian cytoplasm and nucleus, termed La, can bind and unwind dsRNA by a mechanism that may not require NTP hydrolysis (202). Proposed roles of La protein include facilitation of translation by mRNA secondary structure melting, nuclear-cytoplasmic transport of mRNA, transcription termination by RNA polymerase III, and global regulation of translation by controlling the accessibility of dsRNA to the dsRNA-activated protein kinase (202). La protein may therefore be an important regulator of cell growth and development.
Specific proteins present in HeLa cell nuclei can catalyze RNA-RNA annealing (206). Several of the activities correspond to specific hnRNP proteins, and one of the proteins (hnRNP A1 protein) may be controlled by reversible phosphorylation (207). Two nonexclusive models have been proposed to describe how these species promote RNA-RNA annealing (206). In the “matchmaker” model, interaction of annealing proteins with bound RNA provides an increase in local RNA concentration, thereby facilitating duplex
formation by accelerating the nucleation step. In the “chaperone” model, the annealing proteins maintain the bound RNAs in an unstructured conformation and enhance the rate of duplex formation (206). One may regard both RNA helicases and RNA annealing proteins as molecular chaperones, possessing counterpoised activities that mediate the association and dissociation of complementary RNA chains.
IX. dsRNA and the Interferon System
A. The dsRNA-activated Protein Kinase
dsRNA is a potent inhibitor of mammalian protein synthesis in vitro (208). The inhibition is mediated by a protein kinase whose activity is stimulated by dsRNA binding, and which catalyzes the phosphorylation of the α subunit of initiation factor eIF2. The phosphorylated eIF2 sequesters the guanine nucleotide exchange factor (GEF), inhibiting the exchange of GDP for GTP. The double-stranded RNA-activated protein kinase has been termed the DAI (double-stranded RNA-activated inhibitor), p68 kinase, P1 kinase, P1/eIF2α kinase, PK-ds, and Dsl (10). A consensus has recently been reached on the name PKR (for Protein Kinase, dsRNA-dependent) (209). PKR (or DAI; see Fig. 10) is used in this discussion. Structure-function studies on PKR have been undertaken following the cloning of the PKR cDNA (210) and the ability to express the protein in vitro, as well as in vivo, in heterologous systems (e.g., see Ref. 211).
dsRNA and PKR play specific roles in the interferon response: dsRNA is a by-product of viral replication (see above), and the presence of dsRNA in the cytoplasm can activate PKR, which not only inhibits cell protein synthesis but also stimulates transcription of genes whose products participate in the interferon response (see below). Attention has also been focused on the role of dsRNA and PKR in normal cell development and proliferation (see below). Because there are recent excellent reviews on the dsRNA-activated protein kinase (10, 212), this section focuses on the structure-function aspects of the enzyme and how it interacts with dsRNA. The involvement of PKR in signal transduction and gene expression is discussed in Section IX,C.
PKR is normally a low-abundance protein, but treatment of cells with interferon or dsRNA greatly increases its levels, as a result of transcription of the PKR gene. The enzyme is cytoplasmic and may be ribosome-associated.
dsRNA binding to PKR triggers self-phosphorylation on multiple serine and threonine residues, which is believed to cause a protein conformational change. The autophosphorylated enzyme can phosphorylate the α subunit of eIF2 on a specific serine residue. There is evidence for at least one other
FIG. 10. Models for dsRNA activation of the dsRNA-dependent protein kinase, DAI (PKR). (A) The observed dependence of DAI activation on dsRNA concentration, and inhibition at high dsRNA concentrations. (B) In Model 1 (one site for dsRNA, DAI dimer, intermolecular autophosphorylation), the binding of two DAI monomers to a single dsRNA species stimulates autophosphorylation and activation. High dsRNA concentrations would disfavor binding of two proteins on a single dsRNA molecule, and therefore inhibit activation. (C) In Model 2 (two sites for dsRNA, DAI monomer, intramolecular autophosphorylation), binding of a monomer to dsRNA at a high-affinity activation site (low concentration) induces autophosphorylation, whereas at high dsRNA concentrations, a weaker, low-affinity inhibitory site is also utilized, which prevents autophosphorylation. Reprinted with permission from Ref. 10.
target, I-κB, whose phosphorylation promotes interferon gene transcription (see Section IX,C). There is in vitro evidence that PKR can autophosphorylate in an intermolecular fashion, and that the activated PKR phosphorylates its targets independent of continued dsRNA binding (213).
Whether PKR undergoes intramolecular phosphorylation remains to be demonstrated. PKR regulates its expression at the translational level; perhaps there is a cis-acting element (dsRNA?) on the PKR mRNA (213).
PKR inhibition of protein synthesis in vitro requires Watson-Crick base-paired dsRNA at least 50 bp in size, whereas ssRNA, DNA, or RNA.DNA hybrids are ineffective (208). More recent studies, which used purified protein and dsRNAs of defined lengths, have further characterized the dsRNA requirements for PKR activation. First, there is no apparent base-pair sequence specificity, but duplex length is important: dsRNA species that are shorter than approximately 30 bp interact only weakly with PKR, and fail to activate. Above 30 bp there is stronger binding, with a concomitantly increased ability to activate, until maximal effect is reached at approximately 85 bp (214). Short dsRNAs can inhibit activation by longer dsRNAs, and high concentrations of long dsRNAs also inhibit activation. These studies suggest that PKR activity is differentially responsive to the length of the bound dsRNA. In this regard, a number of viruses counteract the growth-inhibitory action of PKR by expressing specific RNAs that bind but do not activate PKR. The competitive binding of these RNAs can prevent subsequent activation of PKR by viral-infection-specific dsRNAs (10).
Based on these and other observations, two models have been proposed for the mechanism of PKR activation by dsRNA (Fig. 10) (10). The first model (Model 2 in Fig. 10) proposes that autophosphorylation is an intramolecular event that occurs within a binary complex of PKR and dsRNA. The suppression of phosphorylation by high dsRNA concentrations may result from dsRNA binding to a weaker, inhibitory site (Fig. 10). The second model (Model 1 in Fig. 10) proposes that phosphorylation is intermolecular, and occurs efficiently only when two PKR monomers are bound to the same dsRNA. 
The intermolecular model rationalizes the more efficient activation by longer dsRNAs, because these species would provide multiple binding sites. Moreover, high dsRNA concentrations would inhibit activation, because the excess dsRNA would favor binding of (at most) one PKR to a single dsRNA, thereby preventing intermolecular phosphorylation (10). Recent biochemical and genetic data support the proposal that PKR monomers cooperatively bind dsRNA, producing an autophosphorylated, dimeric species as the active enzyme (211a).
PKR contains an ATP-binding/phosphate transfer domain in the C-terminal region, which includes a lysine residue essential for catalytic activity (215). The N-terminal portion of the protein binds dsRNA and contains two consensus dsRBDs (motifs I and II) (52, 216-218). An in vivo analysis using specific PKR mutants showed that (i) both motifs are required for maximal PKR activity and (ii) the N-terminal-proximal dsRBD (motif I) is more important for activity than motif II (211a). In vitro experiments also demonstrated that motif I plays a greater role than motif II in dsRNA binding (217, 218, 218a). In this regard, motif I exhibits a better match with the consensus dsRBD than motif II (52). The activities in yeast of PKR variants exhibiting catalytic domain point or deletion mutations indicate that the respective catalytic domains of two monomers must specifically interact to activate the phosphotransferase mechanism (211a). It was reported that a PKR mutant lacking motifs I and II is activated in mammalian cells (218b). The evidence argued against a constitutively active mutant PKR, but supported a mechanism whereby the mutant PKR is activated by the endogenous PKR, or by another cofactor unrelated to dsRNA (218b).
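The inhibition of activation at high dsRNA concentrations that follows from the intermolecular model can be illustrated with a toy calculation. The Python sketch below is not taken from Ref. 10. It assumes simple hyperbolic binding of PKR to dsRNA and a random (Poisson) distribution of bound monomers over dsRNA molecules; the function name and the parameters `dsrna`, `pkr`, and `kd` (all in the same arbitrary concentration units) are hypothetical.

```python
import math


def active_pkr_fraction(dsrna, pkr=1.0, kd=1.0):
    """Toy estimate of the PKR fraction able to autophosphorylate in trans.

    Illustrative assumptions only: hyperbolic binding with dissociation
    constant kd, and bound monomers scattered at random over dsRNA molecules.
    """
    # Fraction of PKR monomers bound to dsRNA (simple hyperbolic binding).
    bound = dsrna / (dsrna + kd)
    # Mean number of other PKR monomers on the same dsRNA as a bound monomer:
    # (pkr * bound) / dsrna, which simplifies to pkr / (dsrna + kd).
    mean_partners = pkr / (dsrna + kd)
    # Poisson probability that at least one partner shares the dsRNA.
    has_partner = 1.0 - math.exp(-mean_partners)
    return bound * has_partner


if __name__ == "__main__":
    for conc in (0.01, 0.1, 1.0, 10.0, 100.0):
        fraction = active_pkr_fraction(conc)
        print(f"dsRNA = {conc:7.2f}  active fraction = {fraction:.4f}")
```

Under these assumptions the computed fraction is low at both extremes of dsRNA concentration and highest in between, which is qualitatively the behavior summarized in Fig. 10A. The sketch says nothing about which of the two models is correct.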
B. The dsRNA-activated 2'-5'A Synthetase
Interferon treatment of mammalian cells causes a 10- to 100-fold increase in the levels of a unique enzyme activity, (2'-5')oligo(A) synthetase (2-5A synthetase). The induction of 2-5A synthetase occurs at the transcriptional level, and new protein synthesis is not required for transcription (219). On binding dsRNA, the 2-5A synthetase polymerizes ATP to form the oligonucleotide species (2'-5')oligo(A) (2-5A). The 2-5A has 2'-5' phosphodiester linkages and ranges in size from 2 to 15 nt (for a review see Refs. 219 and 220). The 2-5A binds and activates the ssRNA-specific endoribonuclease RNase L (221), which can inhibit viral replication by degrading viral and cellular RNAs. The dsRNA species that activate 2-5A synthetase most likely derive from viral replication intermediates, and a recent study has demonstrated the binding of viral-specific dsRNA to the 2-5A synthetase isolated from interferon-treated, EMCV-infected cells (222).
Is there a role for 2-5A synthetase and RNase L in RNA metabolism in the uninfected cell? The two enzymes may be involved in the maturation and/or turnover of hnRNA, wherein internal dsRNA elements are cis-acting processing signals. hnRNA can activate the 2-5A synthetase in vitro (223), and there is a recent report that one of the 2-5A synthetase isoforms may participate in the mammalian nuclear pre-mRNA splicing pathway (224).
Are there specific sequence or structural features in dsRNA that are necessary for 2-5A synthetase activation? There is no apparent sequence requirement, and a low level of base-pair mismatch can be tolerated (225). The synthesis of 2-5A in cell-free extracts is maximally stimulated by dsRNA species longer than 65-80 bp, whereas dsRNAs less than 30 bp fail to activate (225). There is a good correlation between dsRNA size requirements for efficient induction of the interferon response and activation of 2-5A synthetase (225). 
2-5A synthetase activation by dsRNA exhibits a sigmoidal dependence on enzyme concentration (226), suggesting that efficient activation may require assembly of multiple proteins on the same dsRNA. However, unlike the behavior of the dsRNA-dependent protein kinase, 2-5A synthetase activation is not inhibited by high dsRNA concentrations (225).
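A sigmoidal dependence on enzyme concentration of this kind is often pictured with a Hill-type function. The Python sketch below is only an illustration: the Hill form itself, the coefficient `n`, and the half-maximal concentration `k_half` are assumptions chosen for the example, not values reported in Ref. 226.

```python
def synthetase_activity(enzyme, k_half=1.0, n=2.0):
    """Hill-type fractional activity as a function of enzyme concentration.

    n > 1 gives a sigmoidal curve (as if several proteins must assemble);
    n = 1 gives an ordinary hyperbolic curve. Parameters are hypothetical.
    """
    if enzyme < 0 or k_half <= 0:
        raise ValueError("concentrations must be non-negative, k_half positive")
    scaled = (enzyme / k_half) ** n
    return scaled / (1.0 + scaled)


if __name__ == "__main__":
    for conc in (0.1, 0.5, 1.0, 2.0, 10.0):
        cooperative = synthetase_activity(conc, n=2.0)
        hyperbolic = synthetase_activity(conc, n=1.0)
        print(f"[E] = {conc:5.1f}  n=2: {cooperative:.3f}  n=1: {hyperbolic:.3f}")
```

With `n` greater than 1 the curve lags at low enzyme concentration and then rises steeply, which is the qualitative signature that led to the suggestion of multiple synthetase molecules acting on one dsRNA.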
Immunoprecipitation experiments reveal at least three forms of the 2-5A synthetase in human cells (of approximately 40/46, 67/69, and 100 kDa), whose levels and activities apparently are regulated in a cell-type specific manner (227). The enzymatic activity of each synthetase isoform exhibits a different dsRNA concentration dependence, and the 100-kDa form has the highest affinity for dsRNA. The isoforms are expressed from at least two different genes, and alternative RNA splicing and post-translational modification provide further differentiation. The 40- and 46-kDa isoforms are encoded by the same gene (on human chromosome 12), but are expressed from separate mRNAs, which are generated by alternative splicing. The two isoforms therefore are identical for the first 346 amino acids, but have different carboxyl termini. The 69-kDa form of 2-5A synthetase is expressed from a separate gene, and there is no current information on the relationship of the 100-kDa species to the other 2-5A synthetases (228).
Gel filtration chromatography shows different aggregate forms of the 2-5A synthetases. It has been proposed that complexes containing multiple copies of the synthetases can synthesize more efficiently the longer forms of 2-5A, which in turn are better activators of RNase L (228). In support of this hypothesis, the monomeric 100-kDa enzyme produces mainly the dimeric form of 2-5A, whereas the tetrameric 40/46-kDa species and the dimeric 69-kDa species preferentially synthesize the longer forms (219). The physical proximity of multiple catalytic sites may more efficiently convert the dimeric 2-5A species to longer chains, and a preformed multisubunit complex may be more easily activated by dsRNA binding. The sigmoidal dependence of synthetase activation on protein concentration may reflect this requirement.
Chemical fractionation studies indicate different subcellular locations for the 2-5A synthetase isoforms. 
The 100-kDa enzyme is associated with ribosomes and the rough microsomal fraction; perhaps this isoform suppresses viral protein synthesis by mediating RNase L-dependent cleavage of viral and ribosomal RNA (see Ref. 227 and references therein). The 46-kDa enzyme is also associated with ribosomes, but the more hydrophobic 40-kDa isoform, which is myristylated, preferentially associates with the plasma membrane fraction (227, 229). The 67-kDa 2-5A synthetase isoform is plasma membrane associated, but also associates with the nuclear matrix (230). The activity levels of this synthetase are stimulated by HIV-1 infection, which is followed by an increase in RNase L activity that is also nuclear matrix associated. The involvement of the nuclear matrix in mRNA synthesis, maturation, and transport further supports the general involvement of 2-5A synthetase and RNase L in the normal metabolism of nuclear RNA. Perhaps the cell in the antiviral state exhibits enhanced turnover of nuclear-localized cellular as well as viral RNA.
C. dsRNA and Mammalian Cell Signal Transduction
dsRNA induces the transcription of the interferon beta gene, as well as other genes (3). The mechanism of transmission of the dsRNA-triggered signal from the cell surface to the nucleus has been the focus of recent studies, and several important features of the pathway have been established (reviewed in Ref. 3). One question concerns the primary event: does dsRNA binding to the cell surface generate the signal directly, or is the dsRNA first internalized? The experimental evidence supports the latter process. First, microinjected poly(I).poly(C) causes rapid lysis of interferon-treated mouse LM cells, whereas treatment of the cells with poly(I).poly(C) covalently linked to Sepharose beads, which physically blocks dsRNA internalization, is without effect (231). Second, Northern analysis was applied to show that interferon induction in mouse cells is directly correlated with the intracellular uptake of poly(I).poly(C) (232). Additional evidence indicates that dsRNA internalization occurs by an energy-dependent process involving an endocytic pathway. Specifically, prior treatment of cells with an endocytosis inhibitor inhibits dsRNA uptake, and other evidence indicates an acidic intracellular compartment as an important intermediary in potentiating the biological action of dsRNA (231).
Is there a specific cell surface receptor for dsRNA? Early studies suggested specificity in the interaction of dsRNA with the mammalian cell surface (233, 234). One study employed rabbit kidney cell lines that were either sensitive or unresponsive to dsRNA to provide evidence for a specific cell surface protein that may be a component of the putative dsRNA receptor (235). 
Characterization of the receptor has been complicated by the existence of both specific and nonspecific binding sites for dsRNA on the cell surface, and by the observation that only a small fraction of the bound dsRNA (presumably the specific receptor-bound component) is needed to generate the signal (235).
As discussed above, two cellular proteins that bind dsRNA are PKR and 2-5A synthetase. Because these enzymes are present at low levels in the absence of interferon treatment or viral infection, they provide targets for the internalized dsRNA. Does either of these proteins participate in the dsRNA signal transduction pathway? A clue is provided by the occurrence of signal amplification in the dsRNA-mediated interferon response. Specifically, infection of cells with defective-interfering vesicular stomatitis virus (VSV) particles containing covalently cross-linked (+) and (-) strands of VSV RNA (i.e., an encapsidated, noninfectious dsRNA) showed that essentially a single molecule of dsRNA is sufficient to invoke a full interferon response (236). Signal amplification may be attained by the binding of dsRNA to PKR,
because the autophosphorylated PKR could then activate other PKR molecules through intermolecular phosphorylation. The involvement of PKR in the dsRNA signal transduction pathway is also indicated by the ability of 2-aminopurine to block interferon induction in dsRNA-treated or virus-infected cells (237, 238). 2-Aminopurine suppresses PKR activation through competitive inhibition of ATP binding (239, 240).
Recent evidence indicates that the dsRNA signal transduction pathway does not involve PKR-catalyzed phosphorylation of eIF2α. Rather, dsRNA stimulates, through PKR phosphorylation of a separate target, the binding of nuclear transcription factor NF-κB to promoter elements of the human β-interferon and other genes (241). NF-κB is a heterodimeric species, containing 50-kDa (p50) and 65-kDa (p65) subunits. Because each subunit may be one of several subtypes, NF-κB can be regarded as a family of closely related transcription factors (for a recent review, see Ref. 242). NF-κB is normally present in the cytoplasm in an inactive complex with I-κB, a protein inhibitor of NF-κB activity, which also has specific isoforms (243). The inactive, I-κB-bound form of NF-κB also contains the precursor (p105) to the active p50 subunit. It has recently been shown that I-κB is a target for phosphorylation by PKR (244) [as well as several other protein kinases, including protein kinase C (245) and Raf-1 (246)]. On phosphorylation, the I-κB/NF-κB complex dissociates, and the p105 is proteolytically processed to the p50 form, which exposes a nuclear localization signal (243). NF-κB migrates to the nucleus, where it binds to promoter-specific sequences and activates interferon gene transcription. The PKR-catalyzed phosphorylation of I-κB apparently is followed by rapid proteolytic destruction of I-κB (243, 247). Figure 11 provides a summary of the present status of the dsRNA signal transduction pathway. The incomplete nature of this scheme is underscored by recent evidence that dsRNA mediates the action of transcription factors other than NF-κB, and that tyrosine-specific protein phosphorylation may be involved in a signal transduction pathway involving dsRNA (248, 249).
FIG. 11. Summary of the dsRNA-dependent signal transduction pathway for transcription of the IFN gene (as well as other genes). See Section IX,C for further discussion.
X. Cellular and Physiological Effects of dsRNA, and Therapeutic Applications
A. Viral Infection, dsRNA, and the Acute Phase Response
dsRNA exerts a variety of cellular and physiological effects in addition to (or as a consequence of) interferon production, which underscores the complexity of the response of the organism to dsRNA (reviewed in Refs. 3 and 250). The physiological effects of dsRNA stem from specific cellular events, which would include activation of PKR, activation of the various 2-5A synthetases, and expression of interferons. The dsRNA response may also vary significantly from cell type to cell type, adding to the complexity in interpreting the physiological effects.
It has been proposed that dsRNA plays a primary role in the physiological response to a cytolytic viral infection (250). Although a viral infection may be limited to a specific tissue, the effects of the infection can be systemic. Thus, viral dsRNA released from infected cells may be distributed by the bloodstream and affect the function of other tissues. In support of this proposal, dsRNA (either synthetic or isolated from influenza virus-infected lung tissue) can provoke in rabbits the constitutional symptoms of the acute phase response of influenza infection, which includes fever and drowsiness (250-252). Additional adverse effects of dsRNA include ocular and embryonal toxicity (in the rabbit), and suppression of hemopoietic stem cell proliferation and differentiation, along with spleen hypoplasia and thymic atrophy (in the mouse and rat) (250). dsRNA can also mimic a viral infection at the cellular level, which includes cell damage, as evidenced by vacuolation and cloudy swelling, and can induce apoptosis, as indicated by pyknosis and chromatin breakdown (250, 250a). dsRNA activation of PKR has been implicated in triggering apoptosis (250a).
B. dsRNA and Cell Proliferation
The antiproliferative ability of dsRNA was demonstrated shortly after its discovery as an interferon inducer (reviewed in Ref. 3). The ability of dsRNA
to limit neoplastic or normal cell growth might be a consequence of interferon production, but dsRNA may play a more direct role in modulating cell proliferation. Liver metastases quickly form when LSlBL ascitic tumor cells are injected into a mouse host (253). Administration of poly(rG).poly(rC) or poly(rI).poly(rC) prior to and following inoculation of the tumor cells significantly decreases the number of metastatic colonies in the liver, and prolongs survival of the mice (253). The growth of the same tumor cells in culture is not inhibited by dsRNA, suggesting that inhibition of metastatic growth is host mediated (253).
Poly(I).poly(C) suppresses the proliferation of human umbilical vein endothelial cells (254). It was noted that interleukin (IL)-1α mRNA production occurred concomitantly with the inhibition of cell division. The involvement of IL-1α in the dsRNA-mediated inhibition of endothelial cell proliferation is also indicated by the observation that an antisense oligonucleotide specific for IL-1α mRNA abrogates the dsRNA-dependent growth inhibition (254). A consequence of blocking endothelial growth would be a weakening of the lining of blood vessels, which may be responsible for the hemorrhage and edema seen following administration of dsRNA to chickens (250).
There is additional evidence that the antiproliferative effect of dsRNA is mediated by factors other than interferon. For example, interferon-specific antibodies do not relieve the growth inhibitory effects of dsRNA (3). Also, dsRNA suppression of human glioma cell growth may proceed through a pathway involving the cAMP-dependent protein kinase (255). Finally, dsRNA need not be strictly antiproliferative: poly(rI).poly(rC) can stimulate the growth of Balb/C 3T3 or human fibroblast cells (256, 257). In the study on human fibroblasts, antibody neutralization of interferon enhanced the proliferative response, indicating that interferon may self-limit cell growth and division (257). 
The cell type is a major determinant of the dsRNA response: a recent study demonstrates that dsRNA stimulates the growth of fibroblast MDBK cells while inhibiting the growth of epithelial HT-29 cells (258). The sequence of a dsRNA can influence its biological activity. A 300-bp dsRNA of defined sequence did not inhibit tumor cell growth, but under the same conditions, poly(I).poly(C) did (259). Because the 300-bp defined-sequence dsRNA was capable of activating PKR and the 2-5A synthetase in vitro, the lack of a cellular response may reflect differential binding and/or internalization of the dsRNA, or an increased nuclease sensitivity (259). It would be informative to screen additional dsRNAs of defined sequences and lengths for their ability to inhibit cell proliferation.
C. Therapeutic Applications of dsRNA
The potential of dsRNA as an anticancer agent has been investigated for a number of years. However, the adverse physiological effects of dsRNA have
limited its effectiveness as a therapeutic agent, prompting different approaches. For example, administration of poly(I).poly(C), complexed to the stabilizing agents polylysine and carboxymethylcellulose, has toxic effects (3). In contrast, the mismatched dsRNA, poly(I).poly(C12U), also termed Ampligen, can provide favorable biological responses with minimal side effects (3). The reduced toxicity of mismatched dsRNA may be due to its shorter physiological half-life (3). Administration of high doses (up to 600 mg) of poly(I).poly(C12U) to healthy human volunteers had minimal side effects, but the anticipated induction of interferon was not realized (260). In another clinical trial, high doses of poly(I).poly(C12U) were given to cancer patients. The most frequent response was suppression of tumor growth, with one reported instance of complete remission (discussed in Ref. 3). A synergistic enhancement was observed when IFN-α was given in combination with the poly(I).poly(C12U). It is not yet clear whether the dsRNA acts directly on the tumor cells or indirectly through the immune system (3).
A problem with the use of poly(I).poly(C12U) is the high dose requirement, necessitating the use of large volumes that must be infused into the bloodstream over a period of hours. Liposomes have been investigated as agents for delivery of poly(I).poly(C12U) and poly(I).poly(C) (261). However, poly(I).poly(C) is toxic in liposome-encapsulated form, compared to the unencapsulated form (261). For dsRNA to realize full potential as an anticancer agent, more efficient methods of delivery, increased efficiency of response, and decreased toxicity must be established. 
Double-stranded RNA may provide an effective base therapy for HIV disease, because it can stimulate immune cells, inhibit HIV infection, suppress growth of opportunistic tumors, and perhaps act in a synergistic manner with other anti-HIV drugs (262). Poly(I).poly(C12U) inhibits HIV replication in vitro and enhances the ability of azidothymidine (AZT) to suppress HIV replication and stabilize the T cell count (262). For example, HIV-infected human H9 T-cells shed virus approximately 3 days post-infection. However, a prior treatment with poly(I).poly(C12U) significantly delays the appearance of progeny HIV (263). Poly(A).poly(U) inhibits HIV infection of human lymphoid cells in culture and exhibits a synergistic inhibitory effect with AZT (264). Inhibition appears to occur at the level of HIV entry (265), and other polyanions, such as heparin and dextran sulfate, also inhibit entry. There are some biochemical data indicating that the 2-5A synthetase/RNase L enzyme system is altered in HIV-infected cells. Specifically, the levels of 2-5A synthetase in HIV-infected human T cells are inversely correlated with the amount of progeny virus (230), and poly(I).poly(C12U) may sustain the increased activity levels of 2-5A synthetase in HIV-infected cells (263).
XI. Conclusions and Prospects
Almost four decades of research have provided a wealth of information on dsRNA structure and biological function. Although the molecular features of the A-form RNA double helix are now well-defined, more studies are required to understand how mismatches, bulges, internal loops, and other structures perturb dsRNA structure and stability, and modulate function. Understanding the determinants that promote triple-helix formation, or that facilitate interconversion of the A and Z helices, would be useful in predicting the occurrence and function of these structures in biological RNAs and in evaluating their potential for the directed control of gene expression.
Complexity as well as diversity of protein recognition of dsRNA is anticipated. Determining the structure of the dsRBD motif would provide direct information on one mode of dsRNA recognition. Whether the dsRBD is capable of base-pair sequence-specific binding or whether other motifs exist that also recognize dsRNA remains to be determined. Structural analyses of RNase III, PKR, and dsRAD, with or without bound substrate, would provide a basis for describing how dsRNA binding triggers an enzymatic activity (dsRNA cleavage, protein phosphorylation, and deamination, respectively).
Although not directly addressed in this review, the importance of the RNA double helix in organizing macromolecular assemblages should be noted. A striking example is provided by the recently determined structures of a plant virus and an animal virus (266, 267). In each case, a specific dsRNA element within the viral chromosome mediates capsomere-capsomere interactions, thereby stabilizing the protein coat. The structure of the animal virus capsid and the location of the dsRNA element are shown in Fig. 12. 
Analyses of the structures of prokaryotic antisense RNAs and the precise interactions with their targets provide insight into the dynamics of dsRNA formation and how gene expression is regulated by RNA-RNA interactions. Elucidation of the factors that influence sense-antisense binding, such as RNA secondary and tertiary structure, base-pairing dynamics, and metabolic stability, should assist in the intelligent design of efficient antisense RNAs and ribozymes. There is a growing number of examples of natural antisense RNAs in eukaryotic cells and associated viruses (e.g., see Refs. 268-271). Although little is known of their functional roles, a recent report suggests a role of antisense transcripts in the temporal regulation of translation, through interaction with mRNA 3'-UTRs (272, 272a). A wider role of antisense RNA and duplex RNA elements in the regulation of eukaryotic cellular processes is anticipated (273, 274).
The characterization of activities that catalyze dsRNA formation, denaturation, movement, degradation, or specific covalent modification is beginning to reveal an intricate choreography of dsRNA in the mammalian cell. Understanding the mechanisms of these processes and how they change during normal or abnormal cell development or during viral infection presents an experimental challenge. A description of the dsRNA-specific signal transduction mechanism is currently incomplete, but ongoing studies should fill in the gaps and perhaps interrelate it with other signaling pathways. The successful application of dsRNA and specific analogs in fighting neoplastic and viral disease may only be realized when (i) more is learned about the cellular, physiological, and immunological responses to dsRNA, (ii) more efficient dsRNA delivery methods are developed, and (iii) next-generation dsRNA analogs with improved effectiveness, perhaps when coadministered with other drugs and/or biological response modifiers, are developed. It will be important to identify the cellular factors that determine whether dsRNA acts as a growth inhibitor or stimulator, because this would have an impact on the therapeutic use of dsRNA and antisense RNA. If the past is indeed prologue, then future research on dsRNA structure, reactivity, and biology should still hold many surprises.
FIG. 12. Involvement of dsRNA in capsid protein interactions in flock house virus (FHV). (A) The entire capsid is shown (T = 3). The positions of the dsRNA and the C subunit polypeptide arms are shown at the icosahedral twofold axis (indicated by the solid oval). (B) The interactions of dsRNA with the peptide arms are shown. (C) Another view of the subunit-subunit interactions near the dsRNA-binding cleft. Reprinted with permission from Nature (from Ref. 267). Copyright 1993 Macmillan Magazines Limited.
ALLEN W. NICHOLSON
ACKNOWLEDGMENTS
The author thanks the members of his laboratory for sharing their results and interest in dsRNase, and also thanks his colleagues for providing material and information. Thanks also to M. T. Murray and R. H. Nicholson for critically reading the manuscript. Research in the author’s laboratory is supported by the National Institutes of Health (Grant GM41283).
REFERENCES
1. N. R. Kallenbach and H. M. Berman, Q. Rev. Biophys. 10, 137 (1977).
2. M. Libonati, A. Carsana and A. Furia, MCBchem 31, 147 (1980).
3. D. S. Haines, K. I. Strauss and D. H. Gillespie, J. Cell. Biochem. 46, 9 (1991).
4. J. R. Wyatt and I. Tinoco, in “The RNA World” (R. F. Gesteland and J. F. Atkins, eds.), p. 465. CSHLab, Cold Spring Harbor, New York, 1993.
5. M. A. Billeter and C. Weissmann, Proc. Nucleic Acids Res. 1, 498 (1966).
6. H. D. Robertson and T. Hunter, JBC 250, 418 (1975).
7. W. Saenger, “Principles of Nucleic Acid Structure,” Springer-Verlag, New York, 1984.
8. R. M. Franklin, PNAS 55, 1504 (1966).
9. K. H. Mellits, T. Pe’ery, L. Manche, H. D. Robertson and M. B. Mathews, NARes 18, 5401 (1990).
10. M. B. Mathews, Sem. Virol. 4, 247 (1993).
11. R. E. Lockard and A. Kumar, NARes 9, 5125 (1981).
12. H. B. Lowman and D. E. Draper, JBC 261, 5396 (1986).
13. D. E. Kohne, S. A. Levison and M. J. Byers, Bchem 16, 5329 (1977).
14. B. D. Stollar, R. Koo and V. Stollar, Science 200, 1381 (1978).
15. J. Schonborn et al., NARes 19, 2993 (1991).
16. J. P. Calvet and T. Pederson, JMB 122, 361 (1978).
17. Y. L. Lyubchenko, A. A. Gall, L. S. Shlyakhtenko, R. E. Harrington, B. L. Jacobs, P. I. Oden and S. M. Lindsay, J. Biomol. Struct. Dynam. 10, 589 (1992).
18. M. Chastain and I. Tinoco, This Series 41, 131 (1991).
19. S. R. Holbrook, J. L. Sussman and S.-H. Kim, Science 212, 1275 (1981).
20. R. O. Day, N. C. Seeman, J. M. Rosenberg and A. Rich, PNAS 70, 849 (1973).
21. J. M. Rosenberg, N. C. Seeman, J. J. P. Kim, F. L. Suddath, H. B. Nicholas and A. Rich, Nature 243, 150 (1973).
22. S. H. Kim, F. L. Suddath, G. J. Quigley, A. McPherson, J. L. Sussman, A. H.-J. Wang, N. C. Seeman and A. Rich, Science 185, 435 (1974).
23. C. J. Alden and S.-H. Kim, JMB 132, 411 (1979).
24. S. Corbin, R. Lavery and B. Pullman, BBA 698, 86 (1982).
25. A. C. Dock-Bregeon, B. Chevrier, A. Podjarny, D. Moras, J. S. deBear, G. R. Gough, P. T. Gilham and J. E. Johnson, Nature 335, 375 (1988).
26. A. C. Dock-Bregeon, B. Chevrier, A. Podjarny, J. Johnson, J. S. deBear, G. R. Gough, P. T. Gilham and D. Moras, JMB 209, 459 (1989).
27. S. R. Holbrook, C. Cheong, I. Tinoco and S.-H. Kim, Nature 353, 579 (1991).
28. C. S. Happ, E. Happ, M. Nilges, A. M. Gronenborn and G. M. Clore, Bchem 27, 1735 (1988).
29. S.-H. Chou, P. Flynn and B. Reid, Bchem 28, 2422 (1989).
30. K. Hall, P. Cruz and M. J. Chamberlin, ABB 236, 47 (1984).
31. H. H. Klump and T. M. Jovin, Bchem 26, 5186 (1987).
DOUBLE-STRANDED RNA
32. C. C. Hardin, D. A. Zarling, J. D. Puglisi, M. O. Trulson, P. W. Davis and I. Tinoco, Bchem 26, 5191 (1987).
33. P. W. Davis, R. W. Adamiak and I. Tinoco, Biopolymers 29, 109 (1990).
34. M. Teng, Y.-C. Liaw, G. A. van der Marel, J. H. van Boom and A. H.-J. Wang, Bchem 28, 4923 (1989).
35. D. A. Zarling, C. J. Calhoun, C. C. Hardin and A. H. Zarling, PNAS 84, 6117 (1987).
36. R. Kapahnke, W. Rappold, U. Desselberger and D. Riesner, NARes 14, 3215 (1986).
37. M. A. Livshits, O. A. Amosova and Y. L. Lyubchenko, J. Biomol. Struct. Dynam. 7, 1237 (1990).
38. F.-U. Gast and P. J. Hagerman, Bchem 30, 4268 (1991).
39. Y.-H. Wang, M. T. Howard and J. D. Griffith, Bchem 30, 5443 (1991).
40. D. Riesner, J. M. Kaper and J. W. Randles, NARes 10, 5587 (1982).
41. R. S. Tang and D. E. Draper, Bchem 29, 5232 (1990).
42. A. Bhattacharyya, A. I. H. Murchie and D. M. J. Lilley, Nature 343, 484 (1990).
42a. C. Gohlke, A. I. H. Murchie, D. M. J. Lilley and R. M. Clegg, PNAS 91, 11660 (1994).
43. R. S. Tang and D. E. Draper, NARes 22, 835 (1994).
44. S. P. Edmondson and D. M. Gray, Biopolymers 23, 2725 (1984).
44a. F.-U. Gast and H. L. Sänger, Electrophoresis 15, 1493 (1994).
45. M. Chastain and I. Tinoco, NARes 20, 315 (1992).
46. R. W. Roberts and D. M. Crothers, Science 258, 1463 (1992).
47. L. Yang and T. A. Keiderling, Biopolymers 33, 315 (1993).
48. B. J. Rao and C. M. Radding, PNAS 91, 6161 (1994).
49. N. C. Seeman, J. M. Rosenberg and A. Rich, PNAS 73, 804 (1976).
50. K. M. Weeks and D. M. Crothers, Science 261, 1574 (1993).
51. C. W. Carter and J. Kraut, PNAS 71, 283 (1974).
52. D. St Johnston, N. H. Brown, J. G. Gall and M. Jantsch, PNAS 89, 10979 (1992).
53. T. Gibson and J. D. Thompson, NARes 22, 2552 (1994).
54. R. E. Klevit, Science 253, 1367 (1991).
55. M. A. Rould, J. J. Perona, D. Söll and T. A. Steitz, Science 246, 1135 (1989).
56. F. H. Westheimer, Accts. Chem. Res. 1, 70 (1968).
57. D. A. Usher, Nature (New Biol.) 235, 207 (1972).
58. D. A. Usher and A. H. McHale, PNAS 73, 1149 (1976).
59. D. A. Usher and A. H. McHale, Science 192, 53 (1976).
60. S. H. Hall and R. J. Crouch, JBC 252, 4092 (1977).
61. D. Court, in “Control of Messenger RNA Stability” (J. G. Belasco and G. Brawerman, eds.), p. 71. Academic Press, New York, 1993.
62. H. Nashimoto and H. Uchida, MGG 201, 25 (1985).
63. Y. Davidov, A. Rahat, I. Flechner and O. Pines, J. Gen. Microbiol. 139, 717 (1993).
64. Y. Iino, A. Sugimoto and M. Yamamoto, EMBO J. 10, 221 (1991).
64a. M. Zuber, T. A. Hoover, B. S. Powell and D. L. Court, Mol. Microbiol. 14, 291 (1994).
65. H. Nashimoto, A. Miura, H. Saito and H. Uchida, MGG 199, 381 (1985).
66. S.-M. Chen, H. E. Takiff, G. C. Dubois, A. M. Barber, J. C. A. Bardwell and D. L. Court, JBC 265, 2888 (1990).
67. P. E. March and M. Gonzalez, NARes 18, 3293 (1990).
68. H. Li, B. S. Chelladurai, K. Zhang and A. W. Nicholson, NARes 21, 1919 (1993).
69. E. E. Kim and H. W. Wyckoff, JMB 218, 449 (1991).
70. T. A. Steitz and J. A. Steitz, PNAS 90, 6498 (1993).
71. A. W. Nicholson, BBA 1129, 318 (1992).
72. H. D. Robertson, Cell 30, 669 (1982).
73. L. Krinke and D. L. Wulff, NARes 18, 4809 (1990).
74. B. S. Chelladurai, H. Li, K. Zhang and A. W. Nicholson, Bchem 32, 7549 (1993).
75. H. D. Robertson and F. Barany, in “Proceedings of the 12th FEBS Congress,” p. 285. Pergamon, Oxford, 1979.
76. D. C. Schweisguth, B. S. Chelladurai, A. W. Nicholson and P. B. Moore, NARes 22, 604 (1994).
77. B. S. Chelladurai, H. Li and A. W. Nicholson, NARes 19, 1759 (1991).
78. S. Altuvia, H. Locker-Giladi, S. Koby, O. Ben-Nun and A. B. Oppenheim, PNAS 84, 6511 (1987).
79. R. J. Roberts and S. E. Halford, in “Nucleases” (S. M. Linn, R. S. Lloyd and R. J. Roberts, eds.), 2nd Ed., p. 35. CSHLab, Cold Spring Harbor, New York, 1993.
80. J. J. Dunn, JBC 251, 3807 (1976).
81. G. Gross and J. J. Dunn, NARes 15, 431 (1987).
82. O. O. Favorova, F. Fasiolo, G. Keith, S. K. Vassilenko and J.-P. Ebel, Bchem 20, 1006 (1981).
83. P. E. Auron, L. D. Weber and A. Rich, Bchem 21, 4700 (1982).
84. M. Libonati and S. Sorrentino, MCBchem 117, 139 (1992).
85. S. Sorrentino and M. Libonati, ABB 312, 340 (1994).
86. M. Libonati, BBA 228, 440 (1971).
87. M. Libonati, M. C. Malorni, A. Parente and G. D’Alessio, BBA 402, 83 (1975).
88. P. Kindler, T. U. Keil and P. H. Hofschneider, MGG 126, 53 (1973).
89. P. Gegenheimer and D. Apirion, Microbiol. Rev. 45, 502 (1981).
90. F. W. Studier, J. J. Dunn and E. Buzash-Pollert, in “From Gene to Protein: Information Transfer in Normal and Abnormal Cells” (T. R. Russell, K. Brew, H. Farber and T. Schultz, eds.), p. 261. Academic Press, New York, 1979.
91. H. Saito and C. C. Richardson, Cell 27, 533 (1981).
92. G. Koraimann, C. Schroller, H. Graus, D. Angerer, K. Teferle and G. Hogenauer, Mol. Microbiol. 9, 717 (1993).
93. J. J. Dunn and F. W. Studier, JMB 166, 477 (1983).
94. M. Gottesman, A. Oppenheim and D. Court, Cell 29, 727 (1982).
95. P. Regnier and M. Grunberg-Manago, Biochimie 72, 825 (1990).
96. J. C. A. Bardwell, P. Regnier, S. M. Chen, Y. Nakamura, M. Grunberg-Manago and D. Court, EMBO J. 8, 3401 (1989).
97. E. Hajnsdorf, A. J. Carpousis and P. Regnier, JMB 239, 439 (1994).
98. D. R. Gitelman and D. Apirion, BBRC 96, 1063 (1980).
99. J. E. Mayer and M. Schweiger, JBC 258, 5340 (1983).
100. R. A. K. Srivastava, N. Srivastava and D. Apirion, Int. J. Biochem. 24, 737 (1992).
101. E. G. H. Wagner and R. W. Simons, ARBchem 48, 713 (1994).
102. Y. Eguchi, T. Itoh and J.-I. Tomizawa, ARBchem 60, 631 (1991).
103. K. M. Takayama and M. Inouye, CRC Crit. Rev. Biochem. Mol. Biol. 25, 155 (1990).
104. J.-I. Tomizawa, in “The RNA World” (R. F. Gesteland and J. F. Atkins, eds.), p. 419. CSHLab, Cold Spring Harbor, New York, 1993.
105. Y. Eguchi and J.-I. Tomizawa, JMB 220, 831 (1991).
106. H. Masai and K. Arai, NARes 16, 6493 (1988).
107. C. Persson, E. G. H. Wagner and K. Nordstrom, EMBO J. 7, 3279 (1988).
108. P. Blomberg, K. Nordstrom and E. G. H. Wagner, EMBO J. 11, 2675 (1992).
109. P. Blomberg, H. M. Engdahl, C. Malmgren, P. Romby and E. G. H. Wagner, Mol. Microbiol. 12, 49 (1994).
110. E. G. H. Wagner, P. Blomberg and K. Nordstrom, EMBO J. 11, 1195 (1992).
111. P. Blomberg, E. G. H. Wagner and K. Nordstrom, EMBO J. 9, 2331 (1990).
112. K. Gerdes, A. Nielsen, P. Thorsted and E. G. H. Wagner, JMB 226, 637 (1992).
113. T. Thisted and K. Gerdes, JMB 223, 41 (1992).
114. T. Thisted, N. S. Sorensen, E. G. H. Wagner and K. Gerdes, EMBO J. 13, 1960 (1994).
115. T. Thisted, A. K. Nielsen and K. Gerdes, EMBO J. 13, 1950 (1994).
116. R. W. Simons and N. Kleckner, Cell 34, 683 (1983).
117. C. Ma and R. W. Simons, EMBO J. 9, 1267 (1990).
118. C. C. Case, E. L. Simons and R. W. Simons, EMBO J. 9, 1259 (1990).
119. J. D. Kittle, R. W. Simons, J. Lee and N. Kleckner, JMB 210, 561 (1989).
120. C. C. Case, S. M. Roels, P. D. Jensen, J. Lee, N. Kleckner and R. W. Simons, EMBO J. 8, 4297 (1989).
121. L. Krinke and D. L. Wulff, Genes Dev. 4, 2223 (1990).
122. L. Krinke, M. Mahoney and D. L. Wulff, Mol. Microbiol. 5, 1265 (1991).
123. T. van Biesen, F. Soderbom, E. G. H. Wagner and L. S. Frost, Mol. Microbiol. 10, 35 (1993).
124. S. H. Lee, L. S. Frost and W. Paranchych, MGG 235, 131 (1992).
125. M. Citron and H. Schuster, NARes 20, 3085 (1992).
126. A. L. Biere, M. Citron and H. Schuster, Genes Dev. 6, 2409 (1992).
127. M. Faubladier, K. Cam and J.-P. Bouche, JMB 212, 461 (1990).
128. F. Tetart and J.-P. Bouche, Mol. Microbiol. 6, 615 (1992).
129. T. Mizuno, M.-Y. Chou and M. Inouye, PNAS 81, 1966 (1984).
130. J. Andersen, S. A. Forst, K. Zhao, M. Inouye and N. Delihas, JBC 264, 17961 (1989).
131. J. Andersen and N. Delihas, Bchem 29, 9249 (1990).
132. J. Coleman, P. J. Green and M. Inouye, Cell 37, 429 (1984).
133. S. Pestka, B. L. Daugherty, V. Jung, K. Hotta and R. K. Pestka, PNAS 81, 7275 (1984).
134. B. L. Daugherty, K. Hotta, C. Kumar, Y. H. Ahn, J. Zhu and S. Pestka, Gene Anal. Techn. 6, 1 (1989).
134a. T. A. H. Hjalt and E. G. H. Wagner, NARes 23, 571 (1995).
135. T. A. H. Hjalt and E. G. H. Wagner, NARes 20, 6723 (1992).
135a. T. A. H. Hjalt and E. G. H. Wagner, NARes 23, 580 (1995).
136. M. Homann, K. Rittner and G. Sczakiel, JMB 233, 7 (1993).
137. W. F. Lima, B. P. Monia, D. J. Ecker and S. M. Freier, Bchem 31, 12055 (1992).
138. J.-C. Chuat and F. Galibert, BBRC 162, 1025 (1989).
139. S. Altman, PNAS 90, 10898 (1993).
140. Y. Inokuchi, N. Yuyama, A. Hirashima, S. Nishikawa and J. Ohkawa, JBC 269, 11361 (1994).
141. W. Jelinek and J. E. Darnell, PNAS 69, 2537 (1972).
142. H. D. Robertson, E. Dickson and W. Jelinek, JMB 115, 571 (1977).
143. H. D. Robertson and E. Dickson, MCBiol 4, 310 (1984).
144. W. R. Jelinek, T. P. Toomey, L. Leinwand, C. H. Duncan, P. A. Biro, P. V. Choudary, S. M. Weissman, C. M. Rubin, C. M. Houck, P. L. Deininger and C. W. Schmid, PNAS 77, 1398 (1980).
145. C. M. Rubin, C. M. Houck, P. L. Deininger, T. Friedmann and C. W. Schmid, Nature 284, 372 (1980).
146. J. P. Calvet and T. Pederson, PNAS 74, 3705 (1977).
147. J. P. Calvet and T. Pederson, PNAS 76, 755 (1979).
148. P. I. Marcus and I. Yoshida, J. Cell. Physiol. 143, 416 (1990).
149. J. Huet, A. Sentenac and P. Fromageot, FEBS Lett. 94, 28 (1978).
150. D. J. Mead and S. G. Oliver, EJB 137, 501 (1983).
151. H.-P. Xu, M. Riggs, L. Rodgers and M. Wigler, NARes 18, 5304 (1990).
152. D. Frendewey, M. Gillespie and I. Tarnok, J. Cell. Biochem. 17C, 177 (1993).
153. R. Stern, BBRC 41, 608 (1970).
154. J. Rech, G. Cathala and P. Jeanteur, NARes 3, 2055 (1976).
155. H. D. Robertson and M. B. Mathews, PNAS 70, 225 (1973).
156. A. L. M. Bothwell and S. Altman, JBC 250, 1451 (1975).
157. J. Rech, C. Brunel and P. Jeanteur, BBRC 88, 422 (1979).
158. I. Grummt, S. H. Hall and R. J. Crouch, EJB 94, 437 (1979).
159. B. A. Peculis and J. A. Steitz, Cell 73, 1233 (1993).
160. G. Shanmugam, BBRC 70, 818 (1976).
161. G. Shanmugam, Bchem 17, 5052 (1978).
162. B. K. Saha and D. Schlessinger, BBRC 79, 1142 (1977).
163. B. K. Saha and D. Schlessinger, JBC 253, 4537 (1978).
164. K. Ohtsuki, Y. Groner and J. Hurwitz, JBC 252, 483 (1977).
165. P. Palese and G. Koch, PNAS 69, 698 (1972).
166. H. S. Kang and B. R. McAuslan, J. Virol. 10, 202 (1972).
167. P. P. Hung, Virology 51, 287 (1973).
168. J. M. Meegan and P. I. Marcus, Science 244, 1089 (1989).
169. H. Ben-Artzi, E. Zeelon, M. Gorecki and A. Panet, PNAS 89, 927 (1992).
170. H. Ben-Artzi, E. Zeelon, S. F. J. Le-Grice, M. Gorecki and A. Panet, NARes 20, 5115 (1992).
171. S. W. Blain and S. P. Goff, JBC 268, 23585 (1993).
172. Z. Hostomsky, G. O. Hudson, S. Rahmati and Z. Hostomska, NARes 20, 5819 (1992).
173. B. L. Bass and H. Weintraub, Cell 48, 607 (1987).
174. M. R. Rebagliati and D. A. Melton, Cell 48, 599 (1987).
175. R. W. Wagner and K. Nishikura, MCBiol 8, 770 (1988).
176. Y. A. W. Skeiky and K. Iatrou, JMB 218, 517 (1991).
177. B. L. Bass and H. Weintraub, Cell 55, 1089 (1988).
178. A. G. Polson, P. F. Crain, S. C. Pomerantz, J. A. McCloskey and B. L. Bass, Bchem 30, 11507 (1991).
179. R. W. Wagner, J. E. Smith, B. S. Cooperman and K. Nishikura, PNAS 86, 2647 (1989).
180. B. L. Bass, in “The RNA World” (R. F. Gesteland and J. F. Atkins, eds.), p. 383. CSHLab, Cold Spring Harbor, New York, 1993.
181. R. F. Hough and B. L. Bass, JBC 269, 9933 (1994).
182. U. Kim, T. L. Garner, T. Sanford, D. Speicher, J. M. Murray and K. Nishikura, JBC 269, 13480 (1994).
183. K. Nishikura, C. Yoo, U. Kim, J. M. Murray, P. A. Estes, F. E. Cash and S. A. Liebhaber, EMBO J. 10, 3523 (1991).
183a. M. A. O’Connell and W. Keller, PNAS 91, 10596 (1994).
183b. U. Kim, Y. Wang, T. Sanford, Y. Zeng and K. Nishikura, PNAS 91, 11457 (1994).
184. K. Nishikura, Ann. N.Y. Acad. Sci. 660, 240 (1992).
185. D. Kimelman and M. W. Kirschner, Cell 59, 687 (1989).
186. B. L. Bass, H. Weintraub, R. Cattaneo and M. Billeter, Cell 56, 331 (1989).
187. S. M. Rataul, A. Hirano and T. C. Wong, J. Virol. 66, 1769 (1992).
188. M. Higuchi, F. N. Single, M. Kohler, B. Sommer, R. Sprengel and P. H. Seeburg, Cell 75, 1361 (1993).
188a. S. M. Rueter, C. M. Burns, S. A. Coode, P. Mookherjee and R. B. Emeson, Science 267, 1491 (1995).
188b. J.-H. Yang, P. Sklar, R. Axel and T. Maniatis, Nature 374, 77 (1995).
189. A. G. Polson and B. L. Bass, EMBO J. 13, 5701 (1994).
189a. L. Saccomanno and B. L. Bass, MCBiol 14, 5425 (1994).
190. L. M. Morrissey and K. Kirkegaard, MCBiol 11, 3719 (1991).
191. P. Belhumeur, J. Lanoix, Y. Blais, D. Forget, A. Steyaert and D. Skup, MCBiol 13, 2846 (1993).
192. B. K. Ray, T. G. Lawson, J. C. Kramer, M. H. Cladaras, J. A. Grifo, R. D. Abramson, W. C. Merrick and R. E. Thach, JBC 260, 7651 (1985).
193. R. Iggo, S. Picksley, J. Southgate, J. McPheat and D. P. Lane, NARes 18, 5413 (1990).
194. D. A. Wassarman and J. A. Steitz, Nature 349, 463 (1991).
195. W. M. Toone, K. E. Rudd and J. D. Friesen, J. Bact. 173, 3291 (1991).
196. C.-G. Lee and J. Hurwitz, JBC 268, 16822 (1993).
197. H. Flores-Rozas and J. Hurwitz, JBC 268, 21372 (1993).
198. F. V. Fuller-Pace, S. M. Nicol, A. D. Reid and D. P. Lane, EMBO J. 12, 3619 (1993).
199. S. Teigelkamp, M. McGarvey, M. Plumpton and J. D. Beggs, EMBO J. 13, 888 (1994).
200. C.-G. Lee and J. Hurwitz, JBC 267, 4398 (1992).
201. K. Yamanaka, T. Ogura, E. V. Koonin, H. Niki and S. Hiraga, MGG 243, 9 (1994).
202. Q. Xiao, T. V. Sharp, I. W. Jeffrey, M. C. James, G. J. M. Pruijn, W. J. van Venrooij and M. J. Clemens, NARes 22, 2512 (1994).
203. K. Nishi, F. Morel-Deville, J. W. B. Hershey, T. Leighton and J. Schnier, Nature 336, 496 (1988).
204. E. J. Steinmetz and T. Platt, PNAS 91, 1401 (1994).
205. H. Flores-Rozas and J. Hurwitz, JBC 268, 21372 (1993).
206. D. S. Portman and G. Dreyfuss, EMBO J. 13, 213 (1994).
207. H. Idriss, A. Kumar, J. R. Casas-Finet, H. Guo, Z. Damuni and S. H. Wilson, Bchem 33, 11382 (1994).
208. T. Hunter, T. Hunt and R. J. Jackson, JBC 250, 409 (1975).
209. M. J. Clemens, J. W. B. Hershey, A. Hovanessian, B. C. Jacobs, M. G. Katze, R. J. Kaufman, P. Lengyel, C. E. Samuel, G. Sen and B. R. G. Williams, J. Interferon Res. 13, 241 (1993).
210. E. Meurs, K. Chong, J. Galabru, N. S. B. Thomas and A. Hovanessian, Cell 62, 379 (1990).
211. G. N. Barber, M. Wambach, M.-L. Wong, T. E. Dever and A. G. Hinnebusch, PNAS 90, 4621 (1993).
211a. P. R. Romano, S. R. Green, G. N. Barber, M. B. Mathews and A. G. Hinnebusch, MCBiol 15, 365 (1995).
212. A. G. Hovanessian, J. Interferon Res. 9, 641 (1989).
213. D. C. Thomis and C. E. Samuel, J. Virol. 67, 7695 (1993).
214. L. Manche, S. R. Green, C. Schmedt and M. B. Mathews, MCBiol 12, 5238 (1992).
215. M. G. Katze, M. Wambach, M. L. Wong, M. Garfinkel, E. Meurs, K. Chong, B. R. G. Williams, A. G. Hovanessian and G. N. Barber, MCBiol 11, 5497 (1991).
216. G.-S. Feng, K. Chong, A. Kumar and B. R. G. Williams, PNAS 89, 5447 (1992).
217. S. R. Green and M. B. Mathews, Genes Dev. 6, 2478 (1992).
218. S. J. McCormack, L. G. Ortega, J. P. Doohan and C. E. Samuel, Virology 198, 92 (1994).
218a. S. R. Green, L. Manche and M. B. Mathews, MCBiol 15, 358 (1995).
218b. S. B. Lee, S. R. Green, M. B. Mathews and M. Esteban, PNAS 91, 10551 (1995).
219. A. G. Hovanessian, J. Interferon Res. 11, 199 (1991).
220. C. E. Samuel, Virology 183, 1 (1991).
221. A. Zhou, B. A. Hassel and R. H. Silverman, Cell 72, 753 (1993).
222. G. Gribaudo, D. Lembo, G. Cavallo, S. Landolfo and P. Lengyel, J. Virol. 65, 1748 (1991).
223. T. W. Nilsen, P. A. Maroney, H. D. Robertson and C. Baglioni, MCBiol 2, 154 (1982).
224. J. Sperling, J. Chebath, H. Arad-Dann, D. Offen, P. Spann, R. Lehrer, D. Goldblatt, B. Jolles and R. Sperling, PNAS 88, 10377 (1991).
225. M. A. Minks, D. K. West, S. Benvin and C. Baglioni, JBC 254, 10180 (1979).
226. H. Samanta, J. P. Dougherty and P. Lengyel, JBC 255, 9807 (1980).
227. J. Chebath, P. Benech, A. Hovanessian, J. Galabru and M. Revel, JBC 262, 3852 (1987).
228. I. Marie and A. G. Hovanessian, JBC 267, 9933 (1992).
229. I. Marie, J. Svab, N. Robert, J. Galabru and A. G. Hovanessian, JBC 265, 18601 (1990).
230. H. C. Schroder, R. Wenger, Y. Kuchino and W. E. G. Muller, JBC 264, 5669 (1989).
231. P. G. Milhaud, M. Silhol, T. Salehzada and B. Lebleu, J. Gen. Virol. 68, 1125 (1987).
232. K. A. Kelley and P. M. Pitha, Virology 147, 382 (1985).
233. C. Colby and M. Chamberlin, PNAS 63, 160 (1969).
234. J. Vilcek, M. H. Ng, A. E. Friedman-Kien and T. Krawciw, J. Virol. 2, 648 (1968).
235. I. Yoshida, M. Azuma, H. Kawaii, H. W. Fisher and T. Suzutani, Acta Virol. 36, 347 (1992).
236. P. I. Marcus and M. J. Sekellick, Nature 266, 815 (1977).
237. K. Zinn, A. Keller, L.-A. Whittemore and T. Maniatis, Science 240, 210 (1988).
238. P. I. Marcus and M. J. Sekellick, J. Gen. Virol. 69, 1637 (1988).
239. P. J. Farrell, K. Balkow, T. Hunt, R. J. Jackson and H. Trachsel, Cell 11, 187 (1977).
240. Y. Hu and T. W. Conway, J. Interferon Res. 13, 323 (1993).
241. K. V. Visvanathan and S. Goodbourn, EMBO J. 8, 1129 (1989).
242. P. A. Bauerle, BBA 1072, 63 (1991).
243. K. H. Mellits, R. T. Hay and S. Goodbourn, NARes 21, 5059 (1993).
244. A. Kumar, J. Haque, J. Lacoste, J. Hiscott and B. R. G. Williams, PNAS 91, 6288 (1994).
245. S. Ghosh and D. Baltimore, Nature 344, 678 (1990).
246. S. Li and J. M. Sedivy, PNAS 90, 9247 (1993).
247. T. Henkel, T. Machleidt, I. Alkalay, M. Kronke, Y. Ben-Neriah and P. Bauerle, Nature 365, 182 (1993).
248. T. Decker, J. Interferon Res. 12, 445 (1992).
249. C. Daly and N. Reich, MCBiol 13, 3756 (1993).
250. W. A. Carter and E. De Clercq, Science 186, 1172 (1974).
250a. S. B. Lee and M. Esteban, Virology 199, 491 (1994).
251. J. A. Majde, R. K. Brown and M. W. Jones, Microb. Pathogen. 10, 105 (1991).
252. M. Kimura-Takeuchi, J. A. Majde, L. A. Toth and J. A. Krueger, J. Infect. Dis. 166, 1266 (1992).
253. V. Juraskova, N. Dyatlova and V. Brabec, Eur. J. Pharmacol. 221, 107 (1992).
254. S. Garfinkel, D. S. Haines, S. Brown, J. Wessendorf, D. H. Gillespie and T. Maciag, JBC 267, 24375 (1992).
255. H. R. Hubbell, J. E. Boyer, P. Roane and R. M. Burch, PNAS 88, 906 (1991).
256. J. N. Zullo, B. H. Cochran, A. S. Huang and C. D. Stiles, Cell 43, 793 (1985).
257. J. Vilcek, M. Kohase and D. Henrikson-DeStefano, J. Cell. Physiol. 130, 37 (1987).
258. M. K. Chelbi-Alix and C. E. Sripati, Exp. Cell Res. 213, 383 (1994).
259. D. S. Haines, R. J. Suhadolnik, H. R. Hubbell and D. H. Gillespie, JBC 267, 18315 (1992).
260. C. W. Hendrix, J. B. Margolick, B. G. Petty, R. B. Markham, L. Nerhood, H. Farzadegan, P. O. P. Ts'o and P. S. Lietman, Antimicrob. Agents Chemother. 37, 429 (1993).
261. P. G. Milhaud, P. Machy, S. Colote, B. Lebleu and L. Leserman, J. Interferon Res. 11, 261 (1991).
262. D. Gillespie and W. A. Carter, Med. Hypotheses 37, 1 (1992).
263. H. Ushijima, P. G. Rytik, F. Schacke, H. U. Scheffer, W. E. G. Muller and H. C. Schroder, J. Interferon Res. 13, 161 (1993).
264. A. G. Laurent-Crawford, B. Krust, E. Deschamps de Paillette, L. Montagnier and A. Hovanessian, AIDS Res. Human Retrovir. 8, 285 (1992).
265. B. Krust, C. Callebaut and A. Hovanessian, AIDS Res. Human Retrovir. 9, 1087 (1993).
266. S. B. Larson, S. Koszelak, J. Day, A. Greenwood, J. A. Dodds and A. McPherson, Nature 361, 179 (1993).
267. A. J. Fisher and J. E. Johnson, Nature 361, 176 (1993).
268. J. G. Stevens, E. K. Wagner, G. B. Devi-Rao, M. L. Cook and L. T. Feldman, Science 235, 1056 (1987).
269. S. Khochbin and J.-J. Lawrence, EMBO J. 8, 4107 (1989).
270. M. Hildebrandt and W. Nellen, Cell 69, 197 (1992).
271. B. J. Dolnick, NARes 21, 1747 (1993).
272. R. C. Lee, R. L. Feinbaum and V. Ambros, Cell 75, 843 (1993).
273. M. Wickens and K. Takayama, Nature 367, 17 (1994).
274. R. Nowak, Science 263, 608 (1994).
275. Y. L. Lyubchenko, B. L. Jacobs and S. M. Lindsay, NARes 20, 3983 (1992).

NOTE ADDED IN PROOF: (1) A dsRNA persistence length of 720 ± 70 Å was determined by transient electric birefringence (TEB) [Kebbekus et al., Bchem 34, 4354 (1995)], and is consistent with earlier measurements, in that dsRNA is stiffer than DNA. TEB was also used to measure bulge-loop bending of dsRNA [Zacharias and Hagerman, JMB 247, 486 (1995)]. The angles range from 7-93°, with an increasing number of nt (A or U) in the bulge loop. (2) Using immunocytochemical techniques, PKR was localized to the mammalian nucleus and nucleolus, in addition to the cytoplasm. Interferon treatment selectively increases cytoplasmic PKR levels, and a nuclear function of PKR is suggested [Jeffrey et al., Exp. Cell Res. 218, 17 (1995)]. (3) Additional evidence indicates accurate in vitro editing by dsRAD of GluR-B pre-mRNAs [Melcher et al., JBC 270, 8566 (1995)]. (4) Biochemical studies indicate that the PKR dsRBD interacts with one turn of dsRNA [Schmedt et al., JMB 249, 29 (1995)]. (5) NMR was used to solve the structure of the dsRBD of Drosophila Staufen protein [Bycroft et al., EMBO J. 14, 3563 (1995)] and of E. coli RNase III [Kharrat et al., EMBO J. 14, 3572 (1995)]. Both dsRBDs are closely similar, compact ellipsoids, and exhibit an α1β1β2β3α2 tertiary fold, with the two α helices packed on one side of the antiparallel β sheet. Direct interaction with dsRNA is proposed to occur near the N terminus of helix α2. (6) Yeast RNase H1 binds dsRNA in its N-terminal domain [Cerritelli and Crouch, RNA 1, 246 (1995)]. This domain, separate from the RNase H catalytic domain, contains two dsRBD-like motifs. Also, dsRNA binding is distinct from RNA-DNA hybrid binding and cleavage. (7) The HIV-1 reverse-transcriptase-associated RNase H can cleave dsRNA under conditions of arrested reverse transcription [Gotte et al., EMBO J. 14, 833 (1995)], or in the presence of Mn2+ [Cirino et al., Bchem 34, 9936 (1995)]. (8) dsRNA induces adherence of sickle erythrocytes to the vascular endothelium [Smolinski et al., Blood 85, 2945 (1995)], providing a connection between viral infection, dsRNA production, and resultant microvascular occlusion that precipitates sickle cell-associated pain.
Evolution, Expression, and Possible Function of a Master Gene for Amplification of an Interspersed Repeated DNA Family in Rodents

PRESCOTT L. DEININGER¹
Department of Biochemistry and Molecular Biology, Louisiana State University Medical Center, New Orleans, Louisiana 70112, and Laboratory of Molecular Genetics, Alton Ochsner Medical Foundation, New Orleans, Louisiana 70121

HENRI TIEDGE
Departments of Pharmacology and Neurology, State University of New York Health Science Center at Brooklyn, Brooklyn, New York 11203

JOOMYEONG KIM
Department of Biochemistry and Molecular Biology, Louisiana State University Medical Center, New Orleans, Louisiana 70112

JÜRGEN BROSIUS
Institut für Experimentelle Pathologie, Zentrum für Molekularbiologie der Entzündung (ZMBE), Westfälische Wilhelms-Universität, 48149 Münster, Germany

I. Evolution of the BC1 RNA Gene
II. The BC1 RNA Gene As a Master Gene for ID Repeats
III. Anatomical and Subcellular Distribution of BC1 RNA
IV. Transcriptional Regulation of the Rat BC1 RNA Gene
V. Speculations on BC1 RNA Function
References
¹ To whom correspondence may be addressed.
Progress in Nucleic Acid Research and Molecular Biology, Vol. 52. Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.
PRESCOTT L. DEININGER ET AL.
BC1 RNA was originally identified (1-4) as a small cytoplasmic RNA species, found primarily within the brain of rats, that hybridized with a major (short interspersed repetitive element; SINE) family of repetitive DNA sequences in the rat genome. Identifier (ID) elements were initially found associated with neural-specific genes, prompting the idea that they might be involved in cell-type-specific gene expression. Subsequently, ID elements were detected in various nonneural genes (including housekeeping genes), and the notion of an ID-dependent regulation of brain-specific gene expression was challenged (5). BC1 RNA is transcribed by RNA polymerase III (3). Because most SINE elements have RNA polymerase III promoters (6), it was originally thought that the abundant transcription product detected in Northern blots with ID sequence probes was due to cumulative transcription from many dispersed ID loci (1). It was later found, through cDNA cloning experiments designed to clone the full-length RNA polymerase III-derived transcript, that the BC1 RNA was actually generated almost exclusively from a single gene (7). All of the BC1 cDNAs cloned had not only the identical ID-related sequence at
A
ID region:
GGGGUUGGGGAUUUAGCUCAGUGGUAGAGCGCUUGCCUAGCAAGCGCAAGGCCCUGGGUUCGGUCCUCAGCUCCG
3' A-rich region:
AAAAAAAAAAAAAAAAAAAAAAGACAAAAUAACAAAAGACC-
Unique region:
CAAGGUAACUGGCACACACAACCUUU

B
[Schematic: ID body, A-rich, and unique segments drawn 5' to 3', with an arrow marking the transcript]
FIG. 1. Sequence and schematic of the rat BC1 RNA gene. (A) The coding region of the gene includes a 75-nt ID-body region, a 51-nt A-rich region, and a 26-nt unique region, and terminates with a typical RNA polymerase III terminator consisting of 4 T residues. The arrow in B shows the length and direction of the BC1 transcript. The A-rich region is not pure A and T, but includes a few other bases interspersed. Only a few bases at each end of the A-rich region are shown. The last three bases of the unique region are also shown, and typical transcripts terminate with two to four U residues coded from the terminator. (B) The sequence elements are shown schematically with the transcript and its orientation indicated with an arrow.
EVOLUTION AND EXPRESSION OF RODENT BC1 RNA
their 5' ends and the expected A-rich region found in SINE transcripts, but they also had a short segment at the 3' end that was not related to the ID repeats (see Fig. 1). When this segment was used to probe rat genomic Southern blots, it was found to be unique and was then used to isolate the BC1 genomic locus (8). A number of proposals have been made concerning the relationship of this single BC1 locus and the other dispersed ID elements, both in terms of functional models and evolutionary relationships. Although no specific function has been demonstrated for either the BC1 RNA or ID elements, a number of lines of investigation suggest that the BC1 RNA gene plays some functional role, probably within neurons, throughout the rodent order. The BC1 RNA gene has also been shown to be a master gene for ID repeat amplification and evolution (10).
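The three-part organization of the BC1 transcript described above (ID-related 5' segment, A-rich region, unique 3' segment) can be checked with a short sketch. It is only an illustration: the sequences are transcribed from Fig. 1A, the segment lengths are those quoted in the Fig. 1 legend, and the names `ID_BODY`, `UNIQUE`, `LEGEND_LENGTHS`, `segment_coordinates`, and `check_sequence` are hypothetical helpers introduced here, not part of the original work. The A-rich region is represented by its stated length only, on the assumption that the figure does not print it in full.

```python
# Segments of the rat BC1 RNA as printed in Fig. 1A (RNA alphabet, 5' -> 3').
# Only the ID body and the unique region are transcribed as sequence here.
ID_BODY = (
    "GGGGUUGGGGAUUUAGCUCAGUGGUAGAGCGCUUGCCUAGCAAGCG"
    "CAAGGCCCUGGGUUCGGUCCUCAGCUCCG"
)
UNIQUE = "CAAGGUAACUGGCACACACAACCUUU"

# Segment lengths (nt) as stated in the Fig. 1 legend, in 5' -> 3' order.
LEGEND_LENGTHS = {"ID body": 75, "A-rich": 51, "unique": 26}


def segment_coordinates(lengths):
    """Map each segment to 1-based, inclusive (start, end) coordinates."""
    coords = {}
    start = 1
    for name, length in lengths.items():
        coords[name] = (start, start + length - 1)
        start += length
    return coords


def check_sequence(name, seq, expected_length):
    """Raise ValueError if seq is not plain RNA or has an unexpected length."""
    if not set(seq) <= set("ACGU"):
        raise ValueError(f"{name}: non-RNA characters present")
    if len(seq) != expected_length:
        raise ValueError(f"{name}: {len(seq)} nt, expected {expected_length}")


# Confirm that the printed sequences agree with the legend.
check_sequence("ID body", ID_BODY, LEGEND_LENGTHS["ID body"])
check_sequence("unique", UNIQUE, LEGEND_LENGTHS["unique"])

coords = segment_coordinates(LEGEND_LENGTHS)
total_length = sum(LEGEND_LENGTHS.values())

for name, (start, end) in coords.items():
    print(f"{name:8s} {start:4d}-{end}")
print("total coding length (nt):", total_length)
```

Under these assumptions the three segments tile a coding region of 75 + 51 + 26 = 152 nt, with the unique segment, used as the locus-specific probe, occupying the final 26 positions.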
I. Evolution of the BC1 RNA Gene
A. The BC1 RNA Gene Is Rodent-specific
The BC1 RNA is a major RNA species in rat, mouse, hamster, guinea pig (11), squirrel, and Peromyscus (D. Kass and P. Deininger, unpublished). Thus, it appears to exist in all rodent genomes. A small transcript with similar expression patterns, found in the primate brain (12), is related to a totally independent repeated DNA family specific for primates, and has no direct relationship with the rodent BC1 RNA gene (13). Other extensive hybridization experiments at the RNA level also failed to discover a related sequence in rabbit, bovine, or primates (14). Although the sequencing of the region orthologous to the BC1 locus would be necessary to demonstrate unambiguously the rodent specificity of the BC1 RNA gene, the existing experiments make a very strong case for the origin of this gene specifically in the rodent genome. Some investigators have suggested that the guinea pig should not be considered a rodent (15). However, the presence of BC1 RNA and the specific BC1 genomic locus clearly identifiable in the guinea pig, but in no other nonrodent species, makes a strong argument, along with other data, for the guinea pig's relationship to the rodents (14).
B. Origin of BC1 RNA
If the BC1 RNA gene originated early in rodent evolution, where did it come from? It almost certainly arose by evolution from a tRNA progenitor (16). Although there is some question about the specific tRNA species that gave rise to the BC1 RNA gene, it seems most likely that it was derived from a tRNAAla gene or pseudogene. Figure 2 presents a comparison of the BC1 RNA sequence to a mouse tRNAAla gene (17). The origin of SINEs from different tRNA genes has been reviewed (18, 19). The presence of the A-rich region immediately adjacent to the 3' end of the region having homology to a mature tRNA transcript suggests that the tRNA gene copy that eventually gave rise to the BC1 RNA gene was generated through a retroposition process. However, although the BC1 RNA gene was one of the very first ID-related sequences in the rodent genome (10), it cannot be determined whether the BC1 RNA gene was directly derived
FIG. 2. The tRNA origin and RNA structure of the BC1 RNA. (A) The traditional tRNA cloverleaf structure of a mouse tRNAAla (dashes show the base-pairing). (Post-transcriptional base modifications are not shown.) The unpaired sequences at the 5' end of the transcript are from the gene sequence and are normally processed off in the mature tRNA. (B) A portion of the BC1 RNA placed into the same structure. Lowercase letters represent bases of the BC1 RNA that differ from the sequence of the tRNAAla. No gaps need be placed in the sequence to maintain this alignment. (C) An alternate, and much more stable, structure that can be drawn for the BC1 RNA. The arrows point to positions that have mutated, as shown for the specific ID subfamily sequences (subfamily shown in parentheses).
from a tRNA gene, or whether there were one or more intermediate gene duplications in the process. There is some similarity between the 5' end of the tRNA transcript that should be processed off the mature tRNA species, and the 5' end of the BC1 RNA gene. This region is too short to determine whether this sequence similarity is due to the BC1 RNA gene originating directly from this sequence, from chance, or from some selective constraints on the initiation sequences for some RNA polymerase III-directed transcripts. It should be noted that this tRNA gene and the BC1 RNA gene match throughout their length without the need for any gaps to align their sequence. Thus, it seems likely that if this was not the particular tRNAAla gene that gave rise to the BC1 RNA gene, it was a closely related one. However, if the BC1 RNA gene was created through a retroposition process, there are no longer any clear flanking direct repeats to serve as a typical hallmark of the retroposition process. These may have been lost, considering the gene's age; their absence would also mark it as much older than the majority of the ID copies. It is quite possible that the event giving rise to the original BC1 locus occurred prior to the divergence of rodents from the other mammalian orders. However, in this case, we must propose that it was lost from those other orders and specifically conserved in the rodents. As discussed in Section I,D, the BC1 RNA gene seems to be under selective pressure in rodents, consistent with the possibility of an older tRNA-derived pseudogene having obtained functional significance (20) only in the rodent lineage.
C. Duplication of the BC1 RNA Gene in Guinea Pig
As discussed in Section II, the BC1 RNA gene is responsible for a large portion of the amplification of ID repeats via retroposition. However, we know in the guinea pig that the BC1 RNA gene and its flanking regions were duplicated at least once by some other mechanism, probably via a DNA-mediated recombination mechanism (10). In contrast, retroposition events that generate an ID repeat derived from the BC1 RNA gene will only encode the ID portion of the repeat and a similar A-rich 3' region, not the unique portion of the BC1 RNA molecule, nor the flanking sequences. Thus, the ID copies will be missing any contribution of the BC1 3' unique region and the flanking regions as far as gene expression and potential function are concerned (see Section II,D). Both of the two guinea pig BC1 RNA genes were thus more likely to be functional initially. The coding regions of the genes have been relatively well conserved, but the flanking regions have been subjected to extensive deletions and mutations. We do not know whether any of these changes have significantly changed the expression pattern or potential functions of these genes. However, having a duplicate gene available has important implications for the generation of new master genes and the potential divergence of one copy to a slightly different expression pattern or function.
D. Conservation of the BC1 RNA Gene
There are two important aspects of the conservation of the BC1 RNA gene. The first is the conservation of tRNA-like (or other RNA structure) features within the BC1 RNA gene; the second is the evidence that conservation provides regarding regions that may be under functional selection. Figure 2 shows the relationship of the BC1 RNA gene to a tRNAAla gene. Although the sequence conservation is very strong, particularly in the 5' two-thirds of the RNA, many of the standard tRNA features have not been conserved. The anticodon and aminoacyl stems can no longer form a stable structure. On the other hand, the D-loop and pseudouridine (ψ) loop stems are still structurally sound. The ψ loop stem has apparently had at least one compensating mutation to maintain its structure. We must be careful because the presence of the A and B promoter elements associated with the D and ψ loops, respectively, may contribute to conservation of these regions rather than RNA stability. However, it is tempting to propose that the BC1 RNA has simply evolved to a somewhat modified structure. A much more stable possible structure for the BC1 RNA is shown in Fig. 2C. This structure has not been confirmed by biochemical studies. However, the relationship of the four ID subfamily mutations to this structure is quite interesting. The one base change in the Type 2 subfamily strengthens the base-pairing by changing a G·U to a G-C base-pair. The diagnostic mutation associated with the Type 3 subfamily would destabilize a base-pair near the loop region. Last, the two changes in the Type 4 subfamily would increase the base-pair stability in the same stem as the Type 2 mutation and would affect the same base-pair as the Type 3 mutation. Thus, it seems likely that RNA structural considerations have played an important role in evolution of both the BC1 RNA gene and the specific subfamilies of ID repeats in rat.
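The reasoning about subfamily mutations and stem stability can be illustrated with a minimal sketch. It assumes a purely qualitative ranking (Watson-Crick pair above G·U wobble above mismatch) and ignores stacking energies and loop context; the function names and the example bases are hypothetical and are not taken from the BC1 sequence.

```python
# Qualitative sketch: does a point mutation strengthen or weaken one
# base pair in an RNA stem? No thermodynamic model is implied.

WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}
WOBBLE = {("G", "U"), ("U", "G")}

# Assumed ranking: higher value means a more stabilizing pair.
RANK = {"mismatch": 0, "wobble": 1, "watson-crick": 2}


def pair_type(x: str, y: str) -> str:
    """Classify two RNA bases as watson-crick, wobble, or mismatch."""
    pair = (x.upper(), y.upper())
    if pair in WATSON_CRICK:
        return "watson-crick"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


def mutation_effect(x: str, y: str, new_y: str) -> str:
    """Compare the pair x:y with x:new_y under the assumed ranking."""
    before = RANK[pair_type(x, y)]
    after = RANK[pair_type(x, new_y)]
    if after > before:
        return "strengthens"
    if after < before:
        return "weakens"
    return "neutral"


if __name__ == "__main__":
    # A G.U wobble converted to G-C, the kind of change described
    # for the Type 2 subfamily.
    print(mutation_effect("G", "U", "C"))
```

Under this ranking, a change of the Type 2 kind is scored as strengthening, whereas a change that breaks a Watson-Crick pair is scored as weakening.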
Figure 3 shows the actual conservation between the rat BC1 locus and that of mouse, hamster, and guinea pig. In all cases, the RNA coding region is much better conserved than are the flanking sequences, with a decrease in sequence identity further from the coding region. This is very strong evidence that there is functional selection being placed on the coding region. Surprisingly, when analyzing the coding sequences in more detail (10), there is no obvious difference in the conservation between the ID body of the gene and the A-rich or unique regions. This suggests that all portions of the RNA
FIG. 3. Conservation of the BC1 RNA locus. The top line presents a schematic of the BC1 RNA gene locus, with the shaded box representing the RNA-coding portion of the gene and the numbering representing bases either 5' flanking to the gene (negative numbers) or 3' flanking to the gene (positive numbers). The scale on the left represents the percent identity between the various regions of the locus shown, with the key representing the comparison of various rodent genomes with those of rat. Bars represent sequence similarity in the coding region or in 100-bp flanking segments. The black bars represent comparisons between rat/mouse; gray bars, between rat/hamster; and open boxes, between rat/guinea pig. The rat/guinea pig 5' and 3' flanking sequences represent identity of only the first 52 and 70 bases, respectively, because alignments beyond those points were difficult without excessive gaps. In all cases, the coding region has diverged significantly less than either flanking region.
are subject to selection. The higher level of similarity with the immediate flanking sequences (Fig. 3) suggests that the flanking regions of the gene may also be under some selective constraints. These would almost certainly have to be selected because of effects on the expression of the gene. Analysis of the 5' flank of the rat BC1 gene shows a TATA-like sequence at position -28, which is also conserved in the other rodent genes. It seems likely that this and other conserved stretches play a role in the high levels of expression or tissue specificity of this gene (see Section IV).
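The comparison summarized in Fig. 3 rests on percent identity computed over the coding region and over successive flanking windows. A minimal sketch of that calculation follows. It assumes the two sequences have already been aligned to equal length, counts gap characters ("-") as non-matching, and uses hypothetical function names and toy sequences rather than the BC1 locus itself.

```python
# Sketch: percent identity for pre-aligned sequences, overall and in
# fixed-size windows (e.g., 100-bp flanking segments).

def percent_identity(a: str, b: str) -> float:
    """Percent of aligned columns holding the same non-gap base."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    if not a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for x, y in zip(a.upper(), b.upper()) if x == y and x != "-"
    )
    return 100.0 * matches / len(a)


def windowed_identity(a: str, b: str, window: int = 100) -> list:
    """Percent identity for consecutive windows along the alignment."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return [
        percent_identity(a[i:i + window], b[i:i + window])
        for i in range(0, len(a), window)
    ]


if __name__ == "__main__":
    # Toy alignment: first window identical, second half diverged.
    print(windowed_identity("AAAACCCC", "AAAACCGG", window=4))
```

Plotting such window values against distance from the coding region would give a profile of the kind shown in Fig. 3, with the caveat that real alignments require a gap-aware aligner upstream of this step.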
II. The BC1 RNA Gene As a Master Gene for ID Repeats
A. The BC1 RNA Gene Is a Master Gene
Two features suggest that the BC1 RNA gene might represent a master gene controlling ID family amplification. The most important is that the SINE amplification mechanism almost certainly requires an RNA intermediate (21). Thus, the very high levels of expression that can be generated from the BC1 RNA gene and, similarly, the relatively low levels of expression that must be coming from other ID loci make the BC1 RNA a likely intermediate in the amplification process. The presence of the BC1 RNA gene in all rodents and without the traditional direct repeats associated with retroposons also suggests that it has the appropriate age to have founded this SINE family. There are only about 200 ID sequences in the guinea pig genome. Thus, the BC1 RNA gene was among the first of the ID-related sequences. In fact, although the guinea pig genome has two BC1 RNA genes, it has the lowest copy number of ID repeats of any of the rodents examined (Section II,B). Analyses of guinea pig ID sequences show that some have diagnostic sequence differences specific to one of the BC1 RNA genes, and some to the other (10). This evolutionary pattern suggests that both copies of the guinea pig BC1 RNA gene have been able to make ID copies at some time during guinea pig evolution. The mouse genome has about 10,000 copies of ID elements. Once again, the sequences of these ID elements closely reflect the sequence of the BC1 gene in mouse. This confirms that the BC1 RNA gene has controlled the evolution of the ID family of repeats and represents a master gene for ID amplification (10). A similar analysis of ID repeats and the BC1 locus from Peromyscus (D. Kass and P. Deininger) is consistent with this role of BC1 as a master gene of rodent ID amplification. The dominance of the BC1 RNA gene as a master gene for ID elements in these rodents demonstrates the extremely low probability that a new ID insertion will be highly active at retroposition.
Most such insertions are probably pseudogenes from the start (22) and any copies that are initially active will be silenced relatively quickly. It is very likely that the selective evolutionary constraints placed on the BC1 RNA gene have been important in maintaining its amplification potential throughout rodent evolution. This has allowed it to continue to make copies and therefore dominate the amplification process (23). The relationship between BC1 and ID sequences in the rat is much more complicated. About 10,000 copies of the rat ID elements have sequences consistent with having been generated by a BC1 master gene. However, several newer subfamilies (see Section II,C) are inconsistent with amplifications using the BC1 RNA as the intermediate. Thus, although the BC1 RNA gene has dominated the evolution of ID family members in most rodent genomes, in the rat genome, other loci have also contributed significantly. This suggests that there may be one or more ID loci in the rat that became highly effective master genes.
B. Identifier-element Copy Numbers and Times of Amplification
There is tremendous variation in the copy number of ID repeats found within various rodent species (see Fig. 4). This ranges from a minimum of about 200 copies in the guinea pig to about 130,000 in the rat, with numbers
FIG. 4. Evolutionary relationship of the BC1 gene and ID repeats. The BC1 gene, represented as a heavy line, was founded early in the rodent lineage. Different mutations can be found in the modern BC1 gene in different species as shown by the different geometric shapes on the BC1 gene. Two BC1 genes are present in guinea pig, with independent mutations. The lighter lines represent the ID elements present in those genomes, with the number representing the approximate copy number. Essentially the same diagnostic mutations are found in the ID elements as have occurred in the BC1 gene. One exception occurs in rat, where only the 10,000 copy number Type 1 ID element matches the BC1 gene, and a series of subfamilies 2-4 have a successive series of newer diagnostic mutations.
around 2,000-10,000 found in mouse and hamster (24). This seems consistent with a steady increase in amplification rate in the lineage leading up to the rat species. However, the situation is much more complicated than this, with significant copy number differences found even within specific rodent families. The ID family members that have been analyzed in most species also appear to be quite homogeneous in sequence, suggesting very recent times of amplification. This is particularly true with the very high copy number of rat ID repeats (10, 24), but is also true for the mouse ID sequences studied. These two observations (the copy-number variation and the recent formation of most ID repeats) are most easily explained with a model in which the ID repeats had very little amplification capability early in rodent evolution, and in which certain stochastic events have increased amplification rates at different times and in different species.
C. Identifier-element Subfamilies
There are many ID sequences available in the rat database. Analysis of these sequences demonstrates that there are distinct subfamilies of ID sequences, similar to that seen for a number of other mammalian SINEs (22, 23). These subfamilies can also be arranged in a sequential manner, in which each subfamily sequence has one or more diagnostic positions relative to the previous one (10). These subfamilies also show progressively less sequence divergence, consistent with increasingly younger average sequence age. These younger subfamilies, termed Types 2-4, represent over 100,000 copies and show an average divergence of less than 3% from the consensus. This suggests extremely rapid and recent amplification. Thus, because these sequences are inconsistent with amplification with the BC1 RNA gene as the master gene, there must be one or more new master genes formed in rat that are even more efficient than the BC1 RNA gene. We believe that the major reason that rat has such a significantly higher copy number of ID repeats relative to other rodent genomes is the presence of additional master gene(s). We do not know whether such master genes were made through BC1 RNA gene-duplication events, such as seen for the guinea pig gene (10), or whether an ID element inserted into a favorable locus for transcription and for further amplifications. The possible influences and limits of such a site have been extensively reviewed (26a). However, as discussed in Section II,A, new highly active master genes for ID repeats have not been detected in other rodent genomes and seem to be a rare event for other SINEs (22). Thus, it seems that the chance formation of one or two new master genes in rat has been responsible for this rapid increase in amplification rate. Although these subfamilies have been made over a relatively short evolutionary period, it is tempting to consider the possibility that the master loci for these subfamilies represent duplicates of the BC1 locus or ID copies that have adapted to a slightly different function than that of the BC1 RNA. This would allow such elements to be maintained by selection and perhaps they could adapt to a new expression pattern that allowed higher expression in the germ line, where sequence amplifications must occur. J. Kim, D. H. Kass and P. Deininger (26b) and others (7) have not detected any other major RNA species in rat cells that might represent the RNA intermediate for these newer subfamily copies. We cannot be sure that such transcripts do not exist in some specific germ-line cell type. We have detected a particular variant locus that represents the major form of RNA present in the BC2 fraction of ID transcripts (26b). This is a relatively divergent ID copy that does not seem to be actively making ID copies, but does continue to show a significant level of brain specificity in its expression. Because of the lack of detection of any major transcript(s) specific to the newer subfamilies, we believe it likely that the master gene(s) making these ID elements must be relatively more efficient than the BC1 RNA gene at other steps in the retroposition process.
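The subfamily logic described in this section (sequential diagnostic positions, plus divergence from a subfamily consensus as a proxy for age) can be sketched as follows. This is an illustration under stated assumptions only: the diagnostic positions, bases, and sequences are invented toy values, not the published rat ID diagnostics, and the sequences are assumed to be pre-aligned without gaps.

```python
# Sketch: assign a sequence to the youngest subfamily whose cumulative
# diagnostic changes it carries, then measure divergence from consensus.

# Hypothetical diagnostic changes (0-based position -> derived base).
# Each subfamily adds its changes on top of the previous one.
DIAGNOSTIC_STEPS = [
    ("Type 2", {3: "C"}),
    ("Type 3", {7: "A"}),
    ("Type 4", {5: "G", 9: "T"}),
]


def classify(seq: str) -> str:
    """Return the subfamily label; Type 1 carries no derived changes."""
    subtype = "Type 1"
    for name, changes in DIAGNOSTIC_STEPS:
        if all(len(seq) > p and seq[p] == b for p, b in changes.items()):
            subtype = name
        else:
            break  # subfamilies are sequential, so stop at the first miss
    return subtype


def divergence(seq: str, consensus: str) -> float:
    """Percent of positions differing from the consensus."""
    if len(seq) != len(consensus) or not seq:
        raise ValueError("need non-empty sequences of equal length")
    diffs = sum(1 for a, b in zip(seq, consensus) if a != b)
    return 100.0 * diffs / len(seq)


def mean_divergence(seqs, consensus: str) -> float:
    """Average divergence of a set of subfamily members."""
    if not seqs:
        raise ValueError("no sequences given")
    return sum(divergence(s, consensus) for s in seqs) / len(seqs)


if __name__ == "__main__":
    print(classify("ACGTACGTACGT"))  # carries no derived bases
    print(classify("ACGCAGGAATGT"))  # carries all derived bases
```

A lower mean divergence within a subfamily is read, as in the text, as evidence of a younger average age; the sketch does not attempt to convert divergence into absolute time.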
D. Mechanistic Considerations of the BC1/ID Master Gene
The finding that the BC1 RNA gene has served as a master gene for ID amplification and evolution demonstrates that this gene has a significant advantage in amplification capability relative to the many dispersed ID loci. It is obvious that the expression of the BC1 RNA is a prerequisite to its amplification. However, in other tissues, BC1 RNA expression is significantly reduced and the relative level of expression from other dispersed ID loci is more likely to be important. Thus, it seems likely that other factors,
FIG. 5. Self-priming of BC1 RNA. The primary transcript shown in Fig. 1 is folded into a structure that would allow self-priming of reverse transcription. The putative reverse transcript is represented by the dotted line. The numbers of U residues at the 3' end are expected to vary somewhat.
in addition to transcription, may also be important in selecting the active copies. We find it likely that the 3' end of the BC1 RNA also plays a significant role in the amplification capability. One proposal for the remarkable amplification capabilities of SINEs was that the 3' terminal uridines on the RNA polymerase III transcripts might efficiently prime reverse transcription on the 3' A-rich region in an intramolecular reaction (26c) (see Figs. 1 and 5). We have tested the ability of BC1 RNA to undergo such an intramolecular priming event (M. R. Shen and P. Deininger, unpublished) and found that the RNA does undergo an extremely efficient self-priming reaction. However, this self-priming was found not to be a generalized priming on the A-rich region, but was instead found to involve a longer stretch of the 3' end of the RNA, forming a very specific hairpin structure at the extreme 3' end of the A-rich region (Fig. 5). It is likely that the templates for this self-priming are preferentially the subset of BC1 RNAs ending with only two U residues, as three or four U residues would result in mismatched bases near the 3' end. As each ID copy will have a different 3' end, depending on the site of integration, it is unlikely that most copies could undergo an efficient self-priming reaction. Although this finding does not demonstrate the use of self-priming in the authentic retroposition mechanism, the finding that a demonstrated master gene for amplification is able to carry out a self-priming reaction efficiently is strong circumstantial evidence for the importance of this process to SINE amplification in general. The potential involvement of 3' terminal sequences in SINE amplifications also implicates several aspects of RNA stability in the efficiency of amplification.
If retroposition requires a germ-line RNA intermediate, a more stable RNA will clearly build up to higher steady-state levels and therefore have a potential amplification advantage. Because the principal difference between the SINE transcripts that might form lies in the different 3' unique sequences they might contain, these sequences are the most likely to play a role in differential stability of ID transcripts. In addition, a number of SINE RNAs undergo a 3' processing or specific degradation reaction (27). If the 3' end of the RNA is removed in this way, the potential self-priming sequences will also be removed. The BC1 RNA is clearly very stable in some cells. We have studied both BC1 transcripts and ID transcripts in rat brain and testes and found little processing of the BC1 transcript, but extensive processing of other ID transcripts to forms with complete removal of the 3' sequences (26b). These studies suggest that the structure and stability of the BC1 RNA may be significant factors in its ability to serve as an ID master gene.
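The self-priming argument depends on whether the extreme 3' end of a transcript can base-pair with an upstream segment of the same molecule. A minimal sketch of that check follows, assuming Watson-Crick pairing only (no G·U pairs, no energy model), a hypothetical minimum loop size, and toy sequences that are not the BC1 3' end.

```python
# Sketch: find upstream segments able to pair with the 3'-terminal
# bases of an RNA, as needed for an intramolecular priming hairpin.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[b] for b in reversed(rna.upper()))


def self_priming_sites(rna: str, stem_len: int, min_loop: int = 3) -> list:
    """Start indices of upstream segments fully complementary to the
    last stem_len bases, leaving at least min_loop unpaired bases."""
    rna = rna.upper()
    if stem_len <= 0 or len(rna) < 2 * stem_len + min_loop:
        return []
    target = reverse_complement(rna[-stem_len:])
    last_start = len(rna) - 2 * stem_len - min_loop
    return [
        i for i in range(last_start + 1) if rna[i:i + stem_len] == target
    ]


if __name__ == "__main__":
    # Toy transcript: a U-rich 3' end facing an upstream A-rich run.
    print(self_priming_sites("GGAAAAACCCUUUUU", 5))
    # One altered base near the 3' end removes the perfect match.
    print(self_priming_sites("GGAAAAACCCUUUGU", 5))
```

In this toy case a single altered base near the 3' terminus abolishes the perfectly paired site, which mirrors the point that ID copies with different 3' ends are unlikely to self-prime efficiently.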
III. Anatomical and Subcellular Distribution of BC1 RNA
BC1 RNA was discovered about 13 years ago (1-4) in rat brains as a small cytoplasmic RNA. Subsequently, similar small cytoplasmic RNAs were found at much lower levels in a broad range of other cell types (28). Extensive mapping of BC1 RNA expression in the adult rat brain has established that it is expressed in neurons but not in glial cells and, significantly, that it is located not only in neuronal somata but also in dendrites of neuronal subpopulations (29). Studies with acutely isolated neurons have clearly confirmed the neuron-specific expression and the somatodendritic location of this RNA (29). Although primate BC200 RNA is not a homologue of rodent BC1 RNA (12, 13), it is interesting to note that in the human nervous system its distribution is very similar to that of BC1 RNA, even on a subcellular level (30). The onset of BC1 expression in the developing rat brain has also been extensively studied; significantly, we found that the beginning of BC1 expression in several types of neurons coincided with periods of developmental synaptogenesis (V. Liu, J. Brosius and H. Tiedge, unpublished). Using in situ hybridization techniques, the expression pattern of BC1 RNA in the adult rat nervous system was established with a probe that recognizes only BC1 RNA. Examples of these localizations are presented in Fig. 6. Strongly labeled were elements of the amygdaloid complex, including nuclei in the olfactory, medial, central, and basolateral amygdala, as well as the bed nucleus of the stria terminalis. Intense labeling was also observed in the septal nuclei; however, only moderate labeling was evident in the corpus striatum. The neocortex is labeled with medium intensities. A number of thalamic nuclei were strongly labeled, among them the paraventricular thalamic nucleus, the paratenial thalamic nucleus, and the medial habenular nucleus of the epithalamus.
A similarly strong hybridization signal was observed in several hypothalamic nuclei, including the supraoptic nucleus, the paraventricular hypothalamic nucleus, the dorso- and ventromedial hypothalamic nuclei, and several preoptic nuclei. In the visual system, intense labeling was observed in the ventral lateral geniculate nucleus (the dorsal lateral geniculate nucleus was only moderately labeled) and in the superior colliculus, here especially in the zonal layer. Other strongly labeled midbrain areas include the inferior colliculus, in particular the dorsal cortex, and the central gray. In the cerebellum, BC1 labeling is low to moderate. White matter areas throughout the brain, such as the lateral olfactory tract, the optic nerve, the anterior and posterior commissure, corpus callosum, the internal capsule, the sensory root of the trigeminal nerve, and the pyramidal tract, showed little or no labeling. This indicates that BC1 RNA is
FIG. 6. Location of BC1 RNA and GAP-43 mRNA in acutely isolated spinal cord neurons. Spinal cord neurons were isolated as described in Ref. 62. Epiluminescence micrographs (B, D-F, H) show the location of autoradiographic silver grains over individual neurons. B, D, F, and H show single neurons; E shows a group of neurons. Phase contrast micrographs (A, C, G) corresponding to epiluminescence micrographs B, D, and H, show the nerve cells with their processes. Overexposure (exposure times of > 8 weeks) of neurons hybridized with the probe complementary to GAP-43 mRNA produced little or no specific labeling of neurites, although it resulted in heavy labeling of neuronal perikarya and in higher levels of unspecific background labeling. The respective "sense strand" control probes (BC1 and GAP-43) failed to produce any specific labeling of acutely isolated cells (data not shown). Cells were counterstained with cresyl violet and methylene blue. Magnification, 240X. From Ref. 29 with permission.
expressed at low levels, if at all, in axons or glial cells. Likewise, no more than background labeling was detected in a number of nonneural tissues, including liver, lung, kidney, spleen, and skeletal and cardiac muscle. However, developing germ cells in male and female gonads were found to express BC1 RNA at appreciable levels (Z. Zakeri, J. Brosius and H. Tiedge,
unpublished data). Germ-line expression of BC1 RNA supports the role of the BC1 RNA gene as the founder of ID repetitive elements (see Section II,A). BC1 RNA was found to be localized in the inner plexiform layer of the rodent retina. It was then tested whether the BC1 labeling signal can be attributed to any particular type of neurite, in particular to differentiate between dendrites of ganglion cells and other neuritic processes in the inner plexiform layer. This area of the retina contains a dense neuritic plexus with synaptic contacts between axons of bipolar cells, dendrites of ganglion cells, and dendrite-like processes of amacrine cells. Because these processes cannot be differentiated by light-microscope observation alone, we used an electric-lesion protocol to sever the optic nerve unilaterally shortly after birth. This procedure results in the eventual degeneration of retinal ganglion cells, including their dendritic trees. When we compared the BC1 labeling signal in the inner plexiform layer of a retina 6 weeks after the operation with the signal in the contralateral control eye, we found a significant reduction of the grain density. The signal remaining in the inner plexiform layer after transection of the optic nerve may be attributable to dendritic processes of amacrine cells (31). We have recently found the only exception (thus far) to the somatodendritic location of BC1 RNA in neurons: BC1 RNA is axonally transported from magnocellular hypothalamic neurons to neurosecretory nerve endings in the posterior lobe of the rat pituitary (32). Recently, axonal messenger RNAs have also been identified in the pituitary. They include mRNAs for oxytocin, vasopressin, dynorphin, and neurofilament (33-38).
IV. Transcriptional Regulation of the Rat BC1 RNA Gene
On its discovery, rodent brain cytoplasmic BC1 RNA was thought to be a transcription product from many ID repetitive elements (1-4). This belief was, in part, based on the presence of internal RNA polymerase III promoter elements in the ID elements, at that time thought to be necessary and sufficient for all genes transcribed by RNA polymerase III. Later, it was shown that BC1 RNA is a homogeneous RNA transcribed from a single gene (7) and that most of the ID repetitive elements are transcriptionally silent and only found in transcripts when located on larger hnRNAs or mRNAs (5). The notion that BC1 RNA has been recruited (or exapted; see Ref. 39a) into a function and is not an RNA product that is fortuitously expressed in a few rodent species is furthermore supported by its cell-type-specific transcription. The prevalent expression of BC1 RNA in the nervous tissue of rodents (apart from lower-level expression in reproductive organs; Section II,D) occurs in both sciurognathid rodents and guinea pig (14). As tissue-specific expression patterns are thus identical in both rodent suborders (Sciurognathi and Hystricognathi), its transcriptional regulation is also a conserved feature prevailing for about 55 million years. Our in vitro studies indicate that there are several control elements within the gene and in the 5' flanking sequences (39b). Most of these elements, as shown by alterations on an individual basis, are necessary for efficient transcription. Thus, the persistence of the nerve-cell-specific transcription pattern of BC1 RNA for about 55 million years cannot merely be explained away by the presence of a "robust promoter." From analysis of the genomic structure, we identified several putative elements that previously had been shown to be necessary for RNA polymerase III transcription of various small RNA genes. Apart from the typical internal promoter elements, referred to as box A and box B, we detected octamer transcription-factor-binding sequences, a proximal sequence element (PSE, -53), and a TATA box (-27) upstream from the gene (Fig. 7). The latter three elements are also present upstream from the genes for 7SK RNA and U6 snRNA (40, 41) and, interestingly, they are not only necessary but also sufficient for transcription by RNA polymerase III. It was expected, therefore, that a subcloned SacI-MaeII fragment (pKK 415-1), spanning positions -429 to -4 relative to the BC1 RNA coding region, would analogously support RNA polymerase III transcription when used as a template for in vitro transcription in a HeLa cell extract (a gift from S. Murphy and R. Roeder). Surprisingly, no transcription was observed using the upstream region alone in a HeLa cell extract (Fig. 8A, lane 1) or in a rat brain extract at various conditions (not shown). Furthermore, the BC1 upstream region could not functionally replace the 7SK gene upstream sequences (Fig. 8A, lane 4; Fig. 8B, lane 8) whereas, conversely, the 7SK
FIG. 7. Map of upstream regulatory region of the BC1 RNA gene. The 433-bp segment between the SacI site and the 5' end of the gene (angled bar) is shown enhanced. The putative octamer transcription factor binding sites (OCTA), the proximal sequence element (PSE), and the TATA box are indicated. The positions that correspond to deletion points (deletions starting upstream) are marked above the enlarged map portion.
FIG. 8. In vitro transcription of the BC1 RNA gene. The transcripts, radiolabeled with [32P]GTP (800 Ci/mmol), were separated on 6% acrylamide, 0.3% (bis)acrylamide gels containing 7 M urea. After drying, the gels were exposed for about 12 hours with an intensifier screen. (A) HeLa cell extract was used for the following plasmid (p) templates (concentrations [mg/ml] in the reactions are given in brackets): (1) pKK415-1 [5]; (2) pBC1:KS [5]; (3) pBC1:SK [5]; (4) pBC1/7SK [10]; (5) p7SK/BC1 [10]; (6) p7SK [20]; (7) p7SK(pUC) [20]; (8) pBluescript KS [20]; (9) pBluescript SK [20]. pBC1 plasmids contain the entire 1453-bp SacI-BamHI fragment (see Fig. 7) in either orientation. pBC1/7SK and p7SK/BC1 are hybrid genes with swapped regulatory regions. (B) Rat-brain whole-cell extract with the following templates (all 5 mg/ml): (1) pBC1:KS; (2) pBC1:-Abox; (3) pBC1:-Bbox; (4) pBC1:-A/Bbox; (5) ptRNA:XP; (6) pBC1/tRNA; (7) p7SK; (8) pBC1/7SK; (9) p7SL (see also text for template descriptions). (C) Rat-brain whole-cell extract with the following deletion (see Fig. 7) templates (all 5 mg/ml): (1) pBC1:KS; (2) pBC1:0; (3) pBC1:-17; (4) pBC1:-33; (5) pBC1:-53; (6) pBC1:-73; (7) pBC1:-97; (8) pBC1:-129; (9) pBC1:-173; (0) pBC1:-186; (A) pBC1:-273; (B) pBC1:-313.
upstream region was active when fused to the BC1 gene (Fig. 8A, lane 5). Unlike with U6 or 7SK genes, the corresponding region from the BC1 gene is therefore not sufficient for transcription. The above results prompted us to test whether the internal promoter regions (box A and box B) were important for in vitro transcription of the BC1 RNA gene using the homologous rat brain extract. Deletions of either
PRESCOTT L. DEININGER ET AL.
box A or B alone, or in combination, virtually eliminated transcription (Fig. 8B, lanes 2-4). However, as is the case for the upstream promoter elements, the internal RNA polymerase III transcription elements are also not sufficient for transcription by themselves. Unlike many studied tRNA genes, the upstream region of the BC1 gene is clearly important, because a fragment corresponding to the coding region only (with box A + B intact) is not transcribed in vitro (Fig. 8C, lane 2). To further delineate which combinations of regulatory elements are necessary using the homologous rat brain extract in vitro, nested deletions of the 5' flanking region were generated. It could be demonstrated that the octamer sequences are not necessary in vitro, but that the PSE (to a lesser extent) and the TATA box are important for efficient transcription in vitro (Fig. 8C). In order to study the effect of various control elements on cell-type-specific BC1 RNA expression, we are currently using various rat BC1 RNA gene constructs in transgenic mice. In these in vivo experiments, we find that transcription efficiency strongly requires the presence of upstream sequences that include both octamers (42). This demonstrates that at least some of the elements found upstream from the BC1 RNA gene strongly modulate its expression in vitro. This fact is also supported by the following experiment. As shown in Fig. 8B, lane 5, a tRNA(Leu) gene (43) is only weakly transcribed in vitro. When the upstream sequence of the tRNA gene was replaced with that of the BC1 gene, a strong enhancement of transcription of the tRNA(Leu) gene was observed (Fig. 8B, lane 6). The BC1 gene (PSE, TATA, and box A + B necessary) belongs to the class of RNA polymerase III genes that shares elements with RNA polymerase II genes. However, it must be grouped into yet another subclass because it differs from the 7SK RNA and U6 RNA genes (no internal elements necessary; see Ref. 44) and the selenocysteine tRNA(Ser)Sec gene (PSE, TATA, and box B necessary; see Ref. 45) in that it requires, in addition to the upstream elements, at least the internal elements (box A and box B). Our results in this in vitro analysis are also consistent with the observation that ID repetitive elements per se are transcriptionally silent, as the retroposition process will not carry these important flanking sequences to the new insertion locus. In addition to the above identified elements (TATA box, internal box A, and box B) that are vital for BC1 RNA transcription in vitro, we expect additional promoter elements to be present that are responsible for the developmental and nerve-cell-specific RNA polymerase III transcription of the BC1 RNA gene. We are currently using transgenic mouse models to identify such element(s).
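The deletion analysis described above can be summarized in a short sketch. The Python below is an illustration only, not the authors' analysis: it takes the two element positions quoted in the text (PSE at -53, TATA box at -27) and the deletion end points listed in the legend to Fig. 8C, and reports which of those upstream elements each construct would still carry. Treating each element as a single coordinate, and counting an element as retained when the deletion end point lies at or upstream of it, are simplifying assumptions; the octamer sites are left out because their positions are not stated here. The names `UPSTREAM_ELEMENTS` and `retained_elements` are hypothetical.

```python
# Element positions (relative to the BC1 RNA coding region) quoted in the text.
UPSTREAM_ELEMENTS = {"PSE": -53, "TATA": -27}

# 5' deletion end points of the constructs listed in the legend to Fig. 8C.
DELETION_ENDPOINTS = [0, -17, -33, -53, -73, -97, -129, -173, -186, -273, -313]


def retained_elements(endpoint):
    """Return the upstream elements kept by a construct whose 5' flank
    begins at `endpoint`.

    Assumption: an element counts as kept when the deletion end point
    lies at or upstream of (is <=) the element's quoted position.
    """
    kept = [name for name, pos in UPSTREAM_ELEMENTS.items() if endpoint <= pos]
    # Order from most upstream to most downstream.
    return sorted(kept, key=lambda name: UPSTREAM_ELEMENTS[name])


if __name__ == "__main__":
    for endpoint in DELETION_ENDPOINTS:
        kept = retained_elements(endpoint) or ["none"]
        print(f"pBC1:{endpoint} retains: {', '.join(kept)}")
```

Under these assumptions, a construct ending at -33 keeps only the TATA box, whereas one ending at -73 keeps both the PSE and the TATA box, which is the comparison that the nested deletions were designed to make.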
V. Speculations on BC1 RNA Function

The concept of local protein synthesis in dendrites has received increasing experimental support in recent years (46). Polyribosomes are located beneath synaptic sites, most prominently at the base of dendritic spines, in dentate granule cells of the hippocampus (47, 48). It has also been demonstrated that RNA is actively transported into dendrites but not into axons of hippocampal neurons in culture (49, 50). Consistently, mRNAs for a limited number of dendritic proteins have recently been detected in dendrites (most mRNAs, whether they encode dendritic proteins or other components of nerve cells, are restricted to the cell body). One of the dendritic mRNAs codes for the large form of microtubule-associated protein 2 (MAP2; see Ref. 51). The large MAP2 is a tubulin-binding protein specifically associated with the dendritic cytoskeleton (52). Another dendritic mRNA encodes the α-subunit of Ca2+/calmodulin-dependent protein kinase type II (CaM-KII; see Ref. 53). CaM-KII is found at high concentrations in postsynaptic densities and has been implicated in signal transduction mechanisms and in the induction of long-term potentiation (54). Furthermore, the mRNA for the type I inositol 1,4,5-trisphosphate receptor has been detected at substantial levels in Purkinje cell dendrites in mice (55). Recent reports demonstrating active protein biosynthesis in a preparation of dendrites isolated from cultured hippocampal neurons or in a preparation containing synaptosomes (56, 57) strongly emphasize the importance of specialized protein synthetic machinery in postsynaptic domains of dendrites. Such a mechanism would enable neurons to synthesize selected dendritic proteins locally, close to the respective postsynaptic sites where they are required. This would facilitate a decentralized and more flexible regulation of protein repertoires in postsynaptic domains, for example, in response to local synaptic stimuli.
A precondition for localized protein synthesis is that components of the translation apparatus are also localized within the same subcellular compartment. In addition, for temporal and conditional regulation of this process, special mechanisms are required to prevent constitutive translation. Our current working hypothesis is that rodent BC1 RNA may (as ribonucleoprotein complexes; see Refs. 58-60) regulate translation in postsynaptic compartments. Synapses target both dendrites and cell bodies. Location of the small RNAs (such as BC1 RNA) both in dendrites and in cell bodies is therefore consistent with our hypothesis. BC1 RNA is derived from tRNA(Ala) (16). Thus its ancestry supports our hypothesis that the RNA molecule may be involved in regulatory aspects of dendritic protein biosynthesis, possibly before or during phases of translation. BC1 RNA, as an example of a recent gene duplication yielding an RNA
species with novel distributions and potentially novel function, demonstrates that RNA molecules are not merely remnants or fossils from the RNA world. In contrast, just as with proteins, new RNA species can be generated at any time during evolution (20, 39a). Another concept is emerging from our studies. In the past, retroposition has been thought to produce mainly “junk DNA” in the form of retropseudogenes and middle repetitive sequences, but it now seems likely that these mechanisms can occasionally give rise to novel genes or regulatory elements (20, 39a). Our work suggests that variants of existing RNAs have been co-opted into specialized functions by the evolving nervous system, just as variant proteins have been. Although many molecules that are important for nerve cell function are quite ancient (kinases, phosphatases, receptors, channels), other neuronal-specific molecules, such as microtubule-associated protein or growth-associated protein (GAP-43), have so far not been detected in invertebrates. The young age (on an evolutionary scale) of BC1 suggests two possible scenarios. Either these RNA molecules have been recruited into an existing functional protein or RNP complex to enhance efficiency, or these RNPs play a role that is entirely novel to nerve cells. This modification or novel role has become indispensable and is now under selective pressure. Although this hypothetical function may not be essential for all nerve cells from invertebrates to primates, it is tempting to consider that nervous systems and some of their neurons must have undergone significant changes and “improvements,” even over the last tens of millions of years, which hardly could have been achieved without recruitment of “novel” macromolecules from the existing repertoire in the genome.
The elucidation of the neural function for BC1 RNA is of particular interest, because (i) it constitutes the first neuron-specific nonmessenger RNA, and (ii) it exhibits an unusual subcellular distribution. In addition, knowledge of BC1 RNA function will foster studies on the more recent evolution of the nervous system in mammals. This may lead to recognition of parallels between the evolutionary appearance of this novel RNA and structural and/or functional features of the expressing neurons. As Arbas, Meinertzhagen and Shaw (61) have stated in their chapter on evolution in nervous systems, “Evolution is the unifying theme of biological thought. It is therefore surprising that until recently it has little shaped the ideas of those who have sought principles among the cells and circuits of nervous systems.”
Abbreviations

BC1 RNA: major small, discrete brain cytoplasmic RNA species related to ID elements in rodents
BC2 RNA: less abundant ID element-related RNA species that is smaller and more heterogeneous than BC1 RNA
BC200 RNA: small (200-base) brain cytoplasmic RNA related to Alu elements in primates
SINE: Short INterspersed repetitive Element in DNA
ID element: a SINE family found in rodents, termed “identifier elements,” initially thought to mark brain-specific genes
D-loop: dihydrouridine loop
Ψ-loop: pseudouridine loop
PSE: proximal sequence element
snRNA: small nuclear RNA
tRNA(Ser)Sec: selenocysteine transfer RNA
MAP2: microtubule-associated protein 2
CaM-KII: Ca2+/calmodulin-dependent protein kinase type II
GAP-43: a 43-kDa growth-associated protein
OCTA: octamer transcription factor binding site
REFERENCES

1. J. G. Sutcliffe, R. J. Milner, F. E. Bloom and R. A. Lerner, PNAS 79, 4942 (1982).
2. J. G. Sutcliffe, R. J. Milner, J. M. Gottesfeld and W. Reynolds, Science 225, 1308 (1984).
3. J. G. Sutcliffe, R. J. Milner, J. M. Gottesfeld and R. A. Lerner, Nature 308, 237 (1984).
4. R. J. Milner, F. E. Bloom, C. Lai, R. A. Lerner and J. G. Sutcliffe, PNAS 81, 713 (1984).
5. G. P. Owens, N. Chaudhari and W. E. Hahn, Science 229, 1263 (1985).
6. P. L. Deininger, in “Mobile DNA: SINES Short Interspersed Repeated DNA Elements in Higher Eucaryotes” (M. Howe and D. Berg, eds.), p. 619. American Society for Microbiology, Washington, D.C., 1989.
7. T. M. DeChiara and J. Brosius, PNAS 84, 2624 (1987).
8. J. A. Martignetti, Ph.D. Thesis. The Mount Sinai School of Medicine, City University of New York, New York, 1992.
10. J. Kim, J. A. Martignetti, M. R. Shen, J. Brosius and P. Deininger, PNAS 91, 3607 (1994).
11. K. Anzai, S. Kobayashi, Y. Suehiro and S. Goto, Mol. Brain Res. 2, 43 (1987).
12. J. B. Watson and J. G. Sutcliffe, MCBiol 7, 3324 (1987).
13. J. A. Martignetti and J. Brosius, PNAS 90, 11,563 (1993).
14. J. A. Martignetti and J. Brosius, PNAS 90, 9698 (1993).
15. D. Graur, W. A. Hide and W.-H. Li, Nature 351, 649 (1991).
16. G. R. Daniels and P. L. Deininger, Nature 317, 819 (1985).
17. T. Russo, F. Constanzo, A. Oliva, R. Ammendola, A. Duilio, F. Eposito and F. Cimino, EJB 158, 437 (1986).
18. P. L. Deininger and G. R. Daniels, Trends Genet. 2, 76 (1986).
19. N. Okada and K. Ohshima, J. Mol. Evol. 37, 167 (1993).
20. J. Brosius, Science 251, 753 (1991).
21. A. M. Weiner, P. L. Deininger and A. Efstratiadis, ARB 55, 631 (1986).
22. P. L. Deininger, M. A. Batzer, C. A. Hutchison, III and M. H. Edgell, Trends Genet. 8, 307 (1992).
23. P. L. Deininger and M. A. Batzer, in “Evolutionary Biology: Evolution of Retroposon” (M. H. Hecht, R. J. MacIntyre and M. T. Clegg, eds.), Vol. 27, p. 157. Plenum, New York, 1993.
24. C. Sapienza and B. St.-Jacques, Nature 319, 418 (1986).
26a. C. Schmid and R. Maraia, Curr. Opin. Genet. Dev. 2, 874 (1992).
26b. J. Kim, D. H. Kass and P. L. Deininger, NARes 23, 2245 (1995).
26c. P. Jagadeeswaran, B. G. Forget and S. M. Weissman, Cell 26, 141 (1981).
27. R. J. Maraia, NARes 19, 5695 (1991).
28. R. D. McKinnon, P. Danielson, M. A. D. Brow, F. E. Bloom and J. G. Sutcliffe, MCBiol 7, 2148 (1987).
29. H. Tiedge, R. T. Fremeau, Jr., P. H. Weinstock, O. Arancio and J. Brosius, PNAS 88, 2093 (1991).
30. H. Tiedge, W. Chen and J. Brosius, J. Neurosci. 13, 2382 (1993).
31. H. Tiedge, U. C. Drager and J. Brosius, Neurosci. Lett. 141, 136 (1992).
32. H. Tiedge, A. Zhou, N. Thorn and J. Brosius, J. Neurosci. 13, 4214 (1993).
33. D. Murphy, A. Levy, S. Lightman and D. Carter, PNAS 86, 9002 (1989).
34. G. F. Jirikowski, P. P. Sanna and F. E. Bloom, PNAS 87, 7400 (1990).
35. E. Lehman, J. Hanze, M. Pauschinger, D. Ganten and R. E. Lang, Neurosci. Lett. 111, 170 (1990).
36. J. T. McCabe, E. Lehman, N. Chastrette, J. Hanze, R. E. Lang, D. Ganten and D. W. Pfaff, Mol. Brain Res. 8, 325 (1990).
37. E. Mohr, A. Zhou, N. A. Thorn and D. Richter, FEBS Lett. 263, 332 (1990).
38. E. Mohr and D. Richter, Eur. J. Neurosci. 4, 870 (1992).
39a. J. Brosius and S. J. Gould, PNAS 89, 10,706 (1992).
39b. J. Martignetti and J. Brosius, MCBiol 15, 642 (1995).
40. S. Murphy, C. Di Liegro and M. Melli, Cell 51, 81 (1987).
41. P. Carbon, S. Murgo, J.-P. Ebel, A. Krol, G. Tebb and I. W. Mattaj, Cell 51, 71 (1987).
42. W. Chen, Ph.D. Thesis. City University of New York, New York, 1994.
43. D. R. Makowski, R. A. Haas, K. P. Dolan and D. Grunberger, NARes 11, 8609 (1983).
44. S. Murphy, B. Moorefield and T. Pieler, Trends Genet. 5, 122 (1989).
45. P. Carbon and A. Krol, EMBO J. 10, 599 (1991).
46. O. Steward and G. A. Banker, Trends Neurosci. 15, 180 (1992).
47. O. Steward and W. B. Levy, J. Neurosci. 2, 284 (1982).
48. O. Steward and T. M. Reeves, J. Neurosci. 8, 176 (1988).
49. L. Davis, G. Banker and O. Steward, Nature 330, 447 (1987).
50. L. Davis, B. Burger, G. Banker and O. Steward, J. Neurosci. 10, 3056 (1990).
51. C. C. Garner, R. P. Tucker and A. Matus, Nature 336, 674 (1988).
52. D. W. Cleveland, Cell 60, 701 (1990).
53. K. E. Burgin, M. N. Waxham, S. Rickling, S. A. Westgate, W. C. Mobley and P. T. Kelly, J. Neurosci. 10, 1788 (1990).
54. M. B. Kennedy, Cell 59, 777 (1989).
55. T. Furuichi, D. Simon-Chazottes, I. Fujino, N. Yamada, M. Hasegawa, A. Miyawaki, S. Yoshikawa, J.-L. Guénet and K. Mikoshiba, Receptors and Channels 1, 11-24 (1993).
56. E. R. Torre and O. Steward, J. Neurosci. 12, 762 (1991).
57. A. Rao and O. Steward, J. Neurosci. 11, 2881 (1991).
58. S. Kobayashi, S. Goto and K. Anzai, JBC 266, 4726 (1991).
59. S. Kobayashi, N. Higashi, K. Susuki, S. Goto, K. Yumoto and K. Anzai, JBC 267, 18,291 (1992).
60. J. G. Cheng, H. Tiedge and J. Brosius, Soc. Neurosci. Abstr. 18, 624 (1992).
61. E. A. Arbas, I. A. Meinertzhagen and S. R. Shaw, Annu. Rev. Neurosci. 14, 9 (1991).
62. K. Murase, P. Ryu and M. Rudic, Neurosci. Lett. 103, 56 (1989).
Nutritional and Hormonal Regulation of Expression of the Gene for Malic Enzyme¹

ALAN G. GOODRIDGE,² STEPHEN A. KLAUTKY, DOMINIC A. FANTOZZI, REBECCA A. BAILLIE, DEAN W. HODNETT, WEIZU CHEN, DEBBIE C. THURMOND, GANG XU AND CESAR RONCERO

Department of Biochemistry
University of Iowa
Iowa City, Iowa 52242
I. Nutritional State Regulates Fatty-acid Synthesis and Activities of Lipogenic Enzymes
II. The Animal Model
III. Physiological Mechanisms
IV. Molecular Mechanisms
V. Mechanisms for Regulating Transcription
VI. Chromatin Structure
VII. cis-Acting Elements in the Malic-enzyme Gene
VIII. Summary
References
¹ Abbreviations: T3, 3,3',5-triiodo-L-thyronine; IGF-1, insulin-like growth factor-1; CAT, chloramphenicol acetyltransferase; CMV, cytomegalovirus; RSV, Rous sarcoma virus; LTR, long terminal repeat; HSV-TK, herpes simplex virus thymidine kinase; CPT, chlorophenylthio; HNF4, hepatic nuclear factor 4; MLTF, major late transcription factor; CREB, cyclic-AMP response-element binding protein; T3RE, T3 response element.
² To whom correspondence may be addressed.
Progress in Nucleic Acid Research and Molecular Biology, Vol. 52
Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.
I. Nutritional State Regulates Fatty-acid Synthesis and Activities of Lipogenic Enzymes
The de novo synthesis of long-chain fatty acids is high in well-fed animals, especially if their diets contain high percentages of carbohydrate, and is low in starved animals and in those fed diets with a low percentage of carbohydrate (1, 2). Similarly, activities of the “lipogenic” enzymes are high in animals on high-carbohydrate diets and low in starved animals or those on low-carbohydrate diets (1, 2). We have concentrated on two lipogenic enzymes, malic enzyme and fatty-acid synthase. We work mainly with liver because this organ is the primary anatomic site for the de novo synthesis of fatty acids in birds (3-6). Malic enzyme [L-malate-NADP+ oxidoreductase (decarboxylating), EC 1.1.1.40] catalyzes the oxidative decarboxylation of malate to pyruvate, simultaneously generating NADPH from NADP+. Fatty-acid synthase (EC 2.3.1.85) is a multifunctional polypeptide that catalyzes the final reactions in the synthesis of long-chain fatty acids. Starting with a primer of one molecule of acetyl-CoA, the enzyme catalyzes condensation with one molecule of malonyl-CoA, producing a compound lengthened by two carbons plus a molecule of CO2. The lengthened chain is then reduced with two molecules of NADPH and dehydrated. This process is repeated seven times, thus producing a molecule of the 16-carbon saturated fatty acid, palmitate, seven molecules of CO2, eight molecules of CoA, and 14 molecules of NADP+. In chicken liver, virtually all of the 14 molecules of NADPH required for this reaction are furnished by the reaction catalyzed by malic enzyme (3, 6). In this essay, we concentrate on malic enzyme.
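The bookkeeping in the preceding paragraph can be checked with a few lines of Python. This is only an illustrative tally of the stoichiometry stated above (one acetyl-CoA primer, then one malonyl-CoA and two NADPH per elongation cycle); the function name `fatty_acid_synthase_balance` and the dictionary keys are hypothetical, chosen for this sketch.

```python
def fatty_acid_synthase_balance(cycles=7):
    """Tally substrates and products for `cycles` rounds of elongation
    starting from one acetyl-CoA primer (cycles=7 corresponds to palmitate)."""
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    return {
        "acetyl_CoA_used": 1,                  # the two-carbon primer
        "malonyl_CoA_used": cycles,            # one per condensation
        "NADPH_used": 2 * cycles,              # two reductions per cycle
        "CO2_released": cycles,                # one per condensation
        "CoA_released": cycles + 1,            # one per malonyl-CoA plus the primer
        "NADP_formed": 2 * cycles,             # one per NADPH oxidized
        "chain_length_carbons": 2 + 2 * cycles,
    }


if __name__ == "__main__":
    for name, count in fatty_acid_synthase_balance(7).items():
        print(f"{name}: {count}")
```

With `cycles=7` the tally reproduces the figures in the text: a 16-carbon chain, seven CO2, eight CoA, and 14 NADP+ formed from the 14 NADPH consumed.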
II. The Animal Model

Our objectives are to understand the physiologic and molecular mechanisms by which nutritional state regulates hepatic fatty-acid synthesis. An ideal system for this analysis would display a low basal rate of fatty-acid synthesis in the starved state and a high rate in the fed state. Unfortunately, enzyme activity or enzyme concentration decreases slowly, and starved animals do not survive long enough to achieve the basal state. We circumvented this problem by using unfed, newly hatched chicks as our model. The embryonic chick develops in a low-carbohydrate, high-fat environment; the rate of fatty-acid synthesis and the activities of the lipogenic enzymes are low in the liver and adipose tissue (5, 6). Chicks feed on a mash diet high in carbohydrate and low in fat almost immediately after they hatch. Furthermore, newly hatched birds grow rapidly, depositing most of the
[Figure 1: two panels, FATTY ACID SYNTHESIS (left) and MALIC ENZYME (right), each plotted against AGE (DAYS).]
FIG. 1. Correlation between fatty-acid synthesis from glucose and malic-enzyme activity in the liver of neonatal chicks. Left panel: incorporation of [U-14C]glucose into total fatty acids in liver slices. Right panel: total activity of hepatic malic enzyme. Results are for normally fed birds (0-0), starved birds (O...O), and birds refed after 2 or 3 days of starvation (x---x). Each point represents the average of 4 to 11 experiments (from Ref. 24, with permission). TPN (NADP), Triphosphopyridine nucleotide (nicotinamide adenine dinucleotide phosphate).
stored calories as fat. Selected meat chickens can grow from 50 g at hatching to 2 kg in 7 weeks and contain more than 85% of calories as triacylglycerol. When hatched chicks are fed, the rate of fatty-acid synthesis increases rapidly to 500 to 1000 times that in unfed chicks (5). Concomitantly, malic-enzyme activity increases 70-fold (Fig. 1) (6). There is little or no fatty-acid synthesis or lipogenic enzyme activity in adipose tissue of fed chicks (5, 6). Both the increased rate of hepatic fatty-acid synthesis and malic-enzyme activity are regulated by feeding, not by developmental state, because both processes were inhibited when food was withheld from newly hatched chicks or withdrawn from fed chicks (7). Thus, regulation of hepatic enzyme activity in the newly hatched chick is similar physiologically to that observed in adult animals undergoing the transition between the starved and fed states.
III. Physiological Mechanisms
A. Insulin and Glucagon

Macronutrients in the diet, or the products of their digestion in the gut, regulate the secretion of hormones that, in turn, regulate metabolic function in the liver and other organs. One of our goals has been to identify the humoral factors that regulate hepatic malic-enzyme activity during the transition between the starved and fed states. Insulin stimulates, and glucagon inhibits, fatty-acid synthesis both in vivo and in isolated liver preparations. Furthermore, administration of insulin in vivo causes an increase in the activity of malic enzyme, and glucagon blocks the increase in malic-enzyme activity caused by refeeding starved animals (1, 2). Consistent with roles in the regulation of fatty-acid synthesis, plasma insulin levels are elevated in fed animals and lowered in starved ones. The opposite pattern is true for glucagon (8-13). Thus, insulin and glucagon are candidates to communicate the state of alimentation of the whole animal to its liver.
B. Thyroid Hormone

The activity of hepatic malic enzyme and hepatic lipogenesis are elevated in hyperthyroid animals and decreased in hypothyroid animals (14, 15). Plasma levels of T3, the active form of thyroid hormone, are increased by feeding and decreased by starvation (13, 16, 17). Thus, T3 is also a candidate to mediate the effects of diet on malic-enzyme activity.

C. Unesterified Long-chain Fatty Acids

Hormones may not be the only agents that regulate malic-enzyme activity in the liver. The blood levels of several metabolic fuels are regulated by dietary state and are potential candidates for regulating fatty-acid synthesis in the liver. For example, the concentration of unesterified fatty acids in the blood is increased by starvation and decreased by feeding. Unesterified long-chain fatty acids inhibit the rate of fatty-acid synthesis in isolated hepatocytes (18). Furthermore, long-chain fatty-acyl-CoAs, which are direct metabolites of unesterified fatty acids, inhibit the activity of acetyl-CoA carboxylase, the probable rate-limiting enzyme for the de novo synthesis of long-chain fatty acids (19-21). Thus, unesterified fatty acids also are potential regulators of hepatic malic-enzyme activity.

D. Development of a Cell-culture System

When we began our studies, the direct effects of insulin, glucagon, T3, or long-chain fatty acids on malic-enzyme activity in hepatocytes had not been analyzed. The increase in malic-enzyme activity is not maximal for several days after feeding is initiated. Preparations of hepatocytes then in use survived only a few hours, so the direct effects of humoral agents on hepatic malic-enzyme activity could not be tested. With the technical advice of a colleague, B. P. Schimmer of the Banting and Best Department of Medical Research, University of Toronto, we developed a tissue-culture system for chick embryo hepatocytes in which the direct effects of hormones and fuels could be tested. Initially, these studies utilized a medium enriched with serum, but later we switched to a chemically defined but highly enriched
TABLE I
MALIC ENZYME IN HEPATOCYTES IN CULTURE(a)
Measurements (rows): Enzyme activity; Enzyme synthesis; Transcription rate
Additions (columns): No addition; Insulin; T3; Insulin + T3; Insulin + T3 + glucagon
Values, in the order extracted: 1, 1, 1, 2, 40, 1, 1, 73, -, 120, 125, 3, 50, 2, 4
(a) Hepatocytes were isolated from the livers of 19-day-old chick embryos and incubated in serum-free Waymouth medium MD 705/1 containing no additions, insulin (300 ng/ml), T3 (1 µg/ml), insulin plus T3, or insulin plus T3 plus glucagon (1 µg/ml) for 3 days. Enzyme activities and rates of enzyme synthesis were determined as described (27) and then recalculated as fold-increases by setting the results for hepatocytes incubated without hormones at 1.0. In the transcription experiments, cells were incubated with insulin (300 ng/ml) for about 20 hours. The medium was then changed to one of the same composition with or without T3 (1 µg/ml) or glucagon (1 µg/ml). This change in protocol did not change the magnitude of the effects of the hormones on malic-enzyme activity. Transcription rates were determined as described (41) and then recalculated as fold-increases by setting the values for insulin alone to 1.0.
medium, Waymouth MD 705 (22). With or without serum in the medium, T3 and insulin increased and glucagon inhibited the rate of fatty-acid synthesis and the activity of malic enzyme (Table I) (23-25). Stearate inhibited the increases in fatty-acid synthesis and malic-enzyme activity in cells incubated with serum- and albumin-supplemented medium (23). Rapid metabolism of unesterified long-chain fatty acids in hepatocytes made it difficult to analyze their effects. In subsequent experiments, described later (Section V,C,2) in this essay, transcription of the malic-enzyme gene was assayed during short incubations and was inhibited by unesterified long-chain fatty acids. These results are consistent with insulin and T3 being humoral agents that mediate stimulation of malic-enzyme activity by the fed state, and glucagon and unesterified long-chain fatty acids being humoral agents that communicate the starved state to the liver.
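The footnote to Table I describes results "recalculated as fold-increases" by setting one condition to 1.0. A minimal sketch of that normalization follows; the function name `fold_increases` and the raw numbers in the example are hypothetical and are not taken from Table I.

```python
def fold_increases(values, reference):
    """Express every measurement relative to the `reference` condition,
    so that the reference itself becomes 1.0."""
    ref = values[reference]
    if ref == 0:
        raise ValueError("reference value must be non-zero")
    return {condition: value / ref for condition, value in values.items()}


if __name__ == "__main__":
    # Hypothetical raw activities in arbitrary units (not data from the chapter).
    raw = {"no addition": 2.0, "hormone": 8.0}
    print(fold_increases(raw, reference="no addition"))
```

Choosing a different reference condition, as was done for the transcription-rate measurements (insulin alone set to 1.0), changes every reported value but not the ratios between conditions.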
IV. Molecular Mechanisms

A. Strategy

For hormones that regulate hepatic malic-enzyme activity during the transition between the starved and fed states, we want to determine the molecular nature of each event between binding of the hormone to its relevant hepatic receptor and altered malic-enzyme activity. For fuels such as fatty acids, we want to determine the molecular nature of events between uptake of the fuel by the hepatocyte and altered malic-enzyme activity. This includes determining whether the active signaling molecule is the fuel molecule itself or a molecule produced during the metabolism of the fuel. If the latter, we want to identify the metabolic intermediate that is the active signaling molecule. In other words, we want to define each of the intracellular signaling pathways that regulate malic-enzyme activity. Our strategy has been to work backward along the signaling pathways, starting with the change in enzyme activity.
B. Enzyme Activity

The activity of an enzyme can be regulated by controlling the catalytic efficiency of that enzyme, for example, by allosteric mechanisms or covalent modifications. Alternatively, enzyme activity can be regulated by controlling the number of enzyme molecules per cell. Chicken malic enzyme was purified, and a rabbit antibody was raised against the purified enzyme. Using immunological techniques, we showed that the increase in malic-enzyme activity that occurs when newly hatched chicks are fed is due exclusively to an increase in the concentration of malic enzyme in the liver (26). In culture, the increases in activity caused by insulin and T3 and the decrease in activity caused by glucagon also were due to altered enzyme concentration (27).

C. Enzyme Concentration

The concentration of an enzyme can be regulated by controlling the rate constants for either synthesis or degradation of that enzyme. Using the antibodies raised against chicken malic enzyme as reagents for rapid purification of the enzyme, we measured the rate constants for synthesis and degradation of hepatic malic enzyme in newly hatched chicks that were fed or unfed, and in hepatocytes in culture that were treated with no hormone or with insulin, T3, insulin plus T3, or insulin plus T3 plus glucagon. Degradation of malic enzyme was unaffected by either dietary manipulation in vivo (26) or hormonal manipulation in culture (27). The magnitudes and directions of the changes in rates of synthesis of malic enzyme were the same as those for malic-enzyme concentration during both dietary manipulations in vivo and hormonal manipulation in culture (Table I). Thus, the concentration of malic enzyme is controlled by regulating its rate of synthesis.
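The inference in this section can be illustrated with a simple steady-state relation. The sketch below assumes, as a simplification not spelled out in the text, that the enzyme is made at a constant rate and degraded with first-order kinetics, so that its steady-state concentration is the synthesis rate divided by the degradation rate constant. The function names and the numbers in the example are hypothetical and in arbitrary units.

```python
def steady_state_concentration(synthesis_rate, degradation_constant):
    """Steady state of dE/dt = synthesis_rate - degradation_constant * E."""
    if degradation_constant <= 0:
        raise ValueError("degradation_constant must be positive")
    return synthesis_rate / degradation_constant


def fold_change(basal, induced):
    """Ratio of the induced to the basal value."""
    return induced / basal


if __name__ == "__main__":
    k_deg = 0.5                                   # unchanged by treatment (assumed value)
    basal = steady_state_concentration(1.0, k_deg)
    induced = steady_state_concentration(50.0, k_deg)
    print(fold_change(basal, induced))            # equals the fold-change in synthesis
```

When the degradation constant is the same in both states it cancels from the ratio, which is why matching changes in synthesis rate and concentration point to control at the level of synthesis.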
D. Enzyme Synthesis

Synthesis of an enzyme can be regulated by controlling either the abundance of the mRNA for that enzyme or the efficiency with which that specific mRNA is translated into protein. We cloned the cDNA for avian malic enzyme and used that cDNA in hybridization assays to determine the abundance of malic-enzyme mRNA in starved and fed chicks and in hepatocytes in culture incubated with no hormones or various combinations of insulin, T3, and glucagon. The nutritionally and hormonally induced changes in enzyme synthesis were accompanied by comparable changes in the abundance of malic-enzyme mRNA (Figs. 2 and 3) (28, 29).
E. Other Animal Models
Similar studies have been carried out using intact rats and rat hepatocytes in culture. Changes in the activity of rat hepatic malic enzyme caused by starvation, refeeding, high-carbohydrate diets, induction of the diabetic state, and treatment of diabetic animals with insulin also correspond primarily to changes in enzyme concentration. These alterations, in turn, are due to changes in the synthesis rate of the rat enzyme that, in turn, are due to changes in abundance of rat malic-enzyme mRNA (30-33). Although the changes are much smaller in magnitude, insulin, T3, and glucagon also regulate the concentration, synthesis, and mRNA abundance of malic enzyme in adult rat hepatocytes in culture (34, 35).
[Figure 2: Northern blot of chick liver RNA; size markers at 3940 and 1830; DF, dye front.]
FIG. 2. The effects of feeding on the level of hepatic malic-enzyme mRNA. Total polyadenylylated RNA was separated by size by electrophoresis in agarose gels, blot-transferred to nitrocellulose, and hybridized to 32P-labeled, single-stranded malic-enzyme cDNA. RNA was extracted from the livers of 2-day-old chicks starved from hatching or fed for 24 hours as indicated on the figure. Each lane contained 20 µg of RNA. DF, Dye front; Or, origin (modified from Ref. 28; taken from Ref. 28a, with permission of The Journal of Nutrition).
[Figure 3: Northern blot of hepatocyte poly(A)+ RNA; probe: malic-enzyme cDNA; size markers at 3940 and 620; OR, origin; DF, dye front.]
FIG. 3. The effects of T3 and glucagon on the level of malic-enzyme mRNA. Total polyadenylylated RNA was separated by size and hybridized to 32P-labeled malic-enzyme cDNA as described in the legend to Fig. 2. RNA was extracted from hepatocytes isolated from the livers of 19-day-old chick embryos and incubated in serum-free Waymouth medium MD 705/1 containing insulin (300 ng/ml) (control), insulin plus T3 (1 µg/ml), or insulin plus T3 plus glucagon (1 µg/ml) for 3 days. Each lane contained 20 µg of RNA. DF, Dye front; Or, origin (modified from Ref. 28; taken from Ref. 28a with permission of The Journal of Nutrition).
F. Abundance of mRNA 1. APPEARANCERATE OF CYTOPLASMIC mRNA Our next objective was to determine the mechanism by which diet and hormones regulate mRNA abundance. This is formally similar to the analysis of mechanisms involved in regulation of enzyme concentration; the abundance of an mRNA can be regulated by controlling either synthesis or degradation of that mRNA. In actuality, however, it is somewhat different because
MALIC ENZYME GENE
the mRNA population relevant to synthesis rate of an enzyme is cytoplasmic mRNA. Synthesis of mRNA takes place in the nucleus. Thus, the abundance of a cytoplasmic mRNA is a function of its rate of appearance in the cytoplasm and its rate of degradation. The appearance rate of cytoplasmic mRNA is controlled by nuclear processes, including transcription of the gene, processing the primary transcript, and transport of the mature mRNA from the nucleus to the cytoplasm. We first examined degradation of cytoplasmic mRNA.
2. DEGRADATION OF mRNA in Vivo
A kinetic method was used to estimate the half-life of hepatic malic-enzyme mRNA in fed and starved chicks. The extent of the difference between basal and induced levels of an enzyme or mRNA is a function of changes in both the synthesis rate and the degradation rate constant (ln 2/t1/2) of that enzyme or mRNA. The time to progress from one steady-state concentration to another is exclusively a function of the half-life of the enzyme or mRNA. When the abundance of an mRNA is caused to change, the half-life of that mRNA can be calculated from the rate of approach of mRNA abundance to its new steady state (36). This was determined in birds that were refed after a period of starvation, or in starved birds after ad libitum feeding. The calculated half-life of hepatic malic-enzyme mRNA in fed chicks was 3 to 5 hours; in starved chicks, it was about 1 hour (36, 37). This result suggested that part of the more than 50-fold increase in mRNA level could be attributed to regulation of the rate constant for degradation of malic-enzyme mRNA.
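The kinetic reasoning in this subsection can be illustrated with a short numerical sketch. It assumes the simplest model consistent with the text: constant appearance rate and first-order degradation, so that abundance approaches its new steady state exponentially with rate constant ln 2/t1/2. The function names and the numbers (abundance rising from 1 to 50 arbitrary units with a 4-hour half-life) are hypothetical and chosen only for illustration; they are not data from the studies cited.

```python
import math

def abundance(t, start, new_steady_state, half_life):
    """mRNA abundance at time t during a first-order approach to a new steady state."""
    k = math.log(2) / half_life  # degradation rate constant, ln 2 / t1/2
    return new_steady_state + (start - new_steady_state) * math.exp(-k * t)

def half_life_from_approach(t, start, new_steady_state, observed):
    """Half-life implied by a single measurement taken during the transition."""
    # Fraction of the total change that has not yet occurred at time t.
    remaining = (observed - new_steady_state) / (start - new_steady_state)
    return t * math.log(2) / -math.log(remaining)

# Hypothetical example: abundance rises from 1 to 50 (arbitrary units)
# with a 4-hour half-life; a measurement at 6 hours recovers that half-life.
m6 = abundance(6.0, 1.0, 50.0, 4.0)
estimate = half_life_from_approach(6.0, 1.0, 50.0, m6)
print(round(estimate, 3))
```

Under this model the estimate depends only on how quickly the gap to the new steady state closes, not on the size of the change, which is the property the kinetic method relies on.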
3. DEGRADATION OF mRNA IN CULTURE
We used a similar approach to estimate the half-life of malic-enzyme mRNA in hepatocytes in culture incubated with and without glucagon (38). Malic-enzyme mRNA decayed with a half-life of 8 to 11 hours in cells treated with the transcription inhibitors, actinomycin D or α-amanitin. In glucagon-treated cells, malic-enzyme mRNA decayed with a half-life of 1.5 hours. These results suggested that part of the decrease in malic-enzyme mRNA caused by glucagon was due to an effect on mRNA stability.
4. TRANSCRIPTION in Vivo
We used the transcription “run-on” assay (39) to estimate the rate of transcription. We encountered technical problems in our measurements of the transcription rate of the malic-enzyme gene. A long “GC” tail added to the 5’ end of our largest cDNA during cloning and a small repetitive element in one of our genomic DNAs led to an initial, erroneous, conclusion that diet and hormones have no major effect on transcription of the malic-enzyme
gene (37). When our DNA probes were freed of repetitive elements, we discovered that diet had a major effect on transcription. Feeding caused a 40- to 50-fold increase in transcription of the malic-enzyme gene (Fig. 4); the maximum rate of transcription was achieved within 3 hours after feeding starved chicks. Starvation of fed birds caused an equally rapid decrease in transcription rate (40). The increase in transcription rate caused by feeding was paralleled by a comparable increase in the concentration of nuclear precursors of malic-enzyme mRNA, consistent with a primary action of feeding on transcription of the malic-enzyme gene (40).
[Figure 4: (A) Slot-blot autoradiograph; probes: ME-4.8-5', ME-4.8-3', pUC 19 wt, ME-2.6, M13 mp18 Rf wt, β-actin. (B) Map of the malic-enzyme gene showing probes 4.8-5', 4.8-3', and 2.6, the transcription start site, and the polyadenylation signal; scale bar, 5 kb.]
FIG. 4. Stimulation of transcription of the malic-enzyme gene by refeeding in chick liver. (A) Nuclei were isolated from livers of 12- to 14-day-old chicks that were starved for 48 hours and then either starved for an additional 6 hours or fed for 6 hours. Nuclear run-on assays were performed as described (40). Strips of Genescreen membrane containing identical amounts of the indicated probes in slots were hybridized with 2 × 10^7 cpm/ml each of 32P-labeled nascent RNA isolated from liver nuclei of starved or refed chicks. The membranes were washed and subjected to autoradiography. Vector DNAs (M13mp18RF and pUC 19) were controls for nonspecific hybridization. The cDNA for β-actin was a control for selectivity; the level of hepatic β-actin is unaffected by starvation or refeeding. wt, Wild type; ME, malic enzyme. (B) Location of the various DNA probes within the malic-enzyme gene. Numerical designations of the malic-enzyme DNA probes indicate their lengths in kilobases.
5. TRANSCRIPTION IN CULTURE
Our earliest measurements of transcription of the malic-enzyme gene in hepatocytes suggested that T3 and glucagon have little or no effect (38). Unfortunately, these experiments used a cDNA probe that contained a long G·C tail at the 5' end. Use of probes free of repetitive elements revealed a 30- to 40-fold stimulation of transcription by T3; 80% of the maximal increase was achieved within 1 hour after adding T3, and ongoing protein synthesis was not required for the effect. The T3-induced increase in transcription was completely blocked by dibutyryl cAMP, an analog of cAMP, the intracellular mediator of the action of glucagon (Fig. 5) (41). These results established that regulation of transcription of the malic-enzyme gene was responsible for the effects of T3 and glucagon on malic-enzyme activity (Table I). Our results also indicate that the gene for malic enzyme is an immediate-early response gene with respect to the T3- and cAMP-mediated increases in transcription.
We also analyzed the role of insulin in the transcriptional response of the malic-enzyme gene to T3 (41). Results of experiments measuring enzyme concentration and enzyme synthesis suggested that insulin had a small positive effect when added to the medium by itself but a much larger amplifying effect on the response caused by T3 (27). Insulin alone had no effect on transcription of the malic-enzyme gene. It amplified the response to T3 in the first few hours after adding T3 but did not alter its maximal effect. The time courses of the responses of the abundance of malic-enzyme mRNA to T3, or T3 plus insulin, suggested a similar conclusion; in the absence of insulin the T3-induced increase in abundance of malic-enzyme mRNA was delayed but eventually achieved essentially the same maximum level (D. A. Mitchell and A. G. Goodridge, unpublished results).
IGF-1 and insulin have similar effects on T3-induced accumulation of malic-enzyme mRNA and transcription of the malic-enzyme gene; IGF-1 acts at more physiological concentrations (41, 42). In vivo, IGF-1 may be more relevant than insulin with respect to the regulation of the malic-enzyme gene. The mechanisms involved in these effects are unknown.
V. Mechanisms for Regulating Transcription
A. Protein Phosphorylation
Before beginning an analysis of the promoter regions involved in regulating transcription of the malic-enzyme gene, we investigated the requirements for the T3- and glucagon-mediated responses and the extracellular factors that modulate those responses. When we began the phosphorylation experiments, there was debate as to whether the regulation of transcription
[Figure 5: Transcription run-on assay, hepatocytes in culture, with a map of the malic-enzyme gene showing probe locations; scale bar, 2 kb.]
FIG. 5. Stimulation of transcription of the malic-enzyme gene by T3 and inhibition by dibutyryl cAMP. Chick embryo hepatocytes were isolated as previously described (41). The hepatocytes were incubated for 20 hours in Waymouth medium supplemented with insulin (300 ng/ml). The medium was then changed to one of the same composition. After 42 hours of incubation, T3 (1 µg/ml) was added to some of the plates without a medium change. After an additional 24 hours, dibutyryl cAMP (50 µM) was added to some plates without a medium change, and the cells were harvested at 24 (control and T3), 24.5, and 26 hours after adding T3 (0, 0.5, and 1.0 hours after adding dibutyryl cAMP). Nuclear run-on assays were performed as described (40). Strips of Genescreen membrane containing identical amounts of the indicated probes in slots were hybridized with 2 × 10^7 cpm/ml each of 32P-labeled nascent RNA isolated from liver nuclei from either starved or refed chicks. The membranes were washed and subjected to autoradiography. ME, Malic enzyme; FAS, fatty-acid synthase; GAPD,
[Figure 6: Slot-blot autoradiographs; probes: ME-2.6, M13mp18Rf, ME-4.8-5', ME-4.8-3', pUC19. Left panel lanes: C, T3-1h, T3-1h H8-1h. Right panel lanes: C, T3-24h, T3-24h H8-1h.]
FIG. 6. H8 inhibits T3-induced transcription of the malic-enzyme gene. Chick-embryo hepatocytes were isolated and maintained in culture in Waymouth medium supplemented with insulin (300 ng/ml). After 48 hours in culture, the medium was changed to one of the same composition with or without T3 (1 µg/ml) for 1 hour (left panel) or 24 hours (right panel). H8 (25 µM) was added at the same time as T3 (left panel) or after 23 hours with T3 (right panel). Hepatocytes were harvested and nuclei prepared and incubated in vitro with [32P]UTP as described (46). Labeled transcripts were isolated and hybridized to 2 µg of DNA fixed to GeneScreen membranes. The membranes were washed and subjected to autoradiography. Control DNAs and abbreviations are the same as in the legend to Fig. 5, except that GAD is glyceraldehyde-3-phosphate dehydrogenase DNA and C is control (no T3 or H8) (from Ref. 46, with permission of The Journal of Biological Chemistry). Numerical designations of the malic-enzyme DNA probes indicate their lengths in kilobases and can be located on the gene maps in Figs. 4 and 5.
caused by cAMP used a phosphorylation mechanism, as observed for all known effects of cAMP on enzyme activities in eukaryotes, or a protein-binding mechanism, as observed for the effect of cAMP on transcription in prokaryotes (43). We tested the phosphorylation hypothesis with isoquinoline sulfonamides H8 and H7 (44) and microbial alkaloid (staurosporine) protein-kinase inhibitors (45). We were surprised to find that H8 (Fig. 6) and staurosporine were potent and selective inhibitors of the stimulatory effect of T3 on transcription of the malic-enzyme gene (46).
glyceraldehyde-3-phosphate dehydrogenase. Vector DNAs (M13mp18Rf and pUC 19) were controls for nonspecific hybridization. β-Actin, glyceraldehyde-3-phosphate dehydrogenase, and fatty-acid synthase cDNAs were controls for selectivity. Transcription rates of the hepatic genes for β-actin and glyceraldehyde-3-phosphate dehydrogenase are unaffected by T3 or cAMP. Transcription of the fatty-acid-synthase gene is stimulated by T3 and cAMP (modified from Ref. 41, with permission of The Journal of Biological Chemistry). The map at the bottom of the figure indicates the locations of the various DNA probes within the malic-enzyme gene. Numerical designations of the malic-enzyme DNA probes indicate their lengths in kilobases.
Because induction by T3 is required before the inhibitory effect of cAMP can be observed, we were unable to test the hypothesis as initially stated. From the work of others, it is now apparent that the positive transcriptional effects of cAMP in vertebrate tissues are mediated by the catalytic subunit of protein-kinase A, the same type of phosphorylation event that mediates the effect of this intracellular signaling agent on enzyme activity (47, 48). The negative effects of cAMP may use the same intracellular signaling pathway, but definitive experimental evidence is lacking. The selective requirement for ongoing phosphorylation suggests that some component of the T3 response machinery must be phosphorylated to be active. It also suggests a potential mechanism by which the T3 response could be regulated by other signaling pathways.
B. Regulation of Responsiveness to T3
1. RESPONSIVENESS TO T3 DECREASES WITH TIME IN CULTURE
If T3 is added to the culture medium between 20 and 68 hours after the isolated hepatocytes are put into culture, malic-enzyme activity increases 30- to 40-fold. This response decreases with time in culture; after 7 days, a 48-hour incubation with T3 has no effect on malic-enzyme activity (49). The change in responsiveness of enzyme activity is mediated by a decrease in the ability of T3 to stimulate transcription of the malic-enzyme gene. These results suggest that a protein or metabolite essential for the T3 response is present in excess in vivo before the isolated cells are prepared, but is not made, or is made only very slowly, in the hepatocytes in culture. Alternatively, a negative-acting protein or metabolite may accumulate in hepatocytes in culture. The rate at which responsiveness to T3 decreases (half-life of about 24 hours) is consistent with the rate-limiting component being a protein.
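As a rough illustration of what a 24-hour half-life implies, the sketch below treats the loss of responsiveness as a simple first-order decay. That is an assumption made only for this example (the text reports a half-life, not a kinetic model), and the function name is hypothetical.

```python
def fraction_remaining(hours, half_life_hours=24.0):
    """Fraction of the initial responsiveness left after the given time,
    assuming simple first-order loss with the stated half-life."""
    return 0.5 ** (hours / half_life_hours)

# After 7 days (168 hours), seven half-lives have elapsed.
print(fraction_remaining(7 * 24))
```

Under this assumption less than 1% of the initial responsiveness would remain after 7 days, which is in line with the observation that T3 then has no measurable effect on malic-enzyme activity.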
2. GLUCOCORTICOIDS PROLONG RESPONSIVENESS TO T3
Corticosterone has no effect on the activity, mRNA abundance, or transcription rate of the malic-enzyme gene when added in the absence of T3, whether the cells are incubated for 3 or 5 days. In cells incubated with T3 from 20 to 68 hours of incubation, corticosterone has little or no effect on the response to T3. In cells incubated from 68 to 116 hours with T3, however, corticosterone causes a substantial increase in the responsiveness (Fig. 7) (49). The response to T3 is eventually lost whether corticosterone is present or not; it just takes substantially longer when cells are incubated with the glucocorticoid.
[Figure 7: Malic-enzyme activity; panel titles: T3, 20 to 68 hours; T3, 68 to 116 hours.]
FIG. 7. Malic-enzyme activity in hepatocytes treated with T3 for 48 hours and treated with or without corticosterone for the entire incubation period. Hepatocytes were isolated and incubated with insulin (INS, 50 nM) or insulin plus corticosterone (CORT, 1 µM) (49). After 20 hours of incubation, the medium was changed to one of the same composition, and T3 (1.5 µM) was added to one set of plates with corticosterone and to one set without corticosterone. At 68 hours of incubation, sets of plates with or without corticosterone and with or without T3 were harvested. At 68 hours of incubation, media in additional sets of plates with or without corticosterone were changed to ones of the same composition with or without T3 (1.5 µM); cells from these sets of plates were harvested at 116 hours. Malic-enzyme activity and DNA were measured as described (49). The results are expressed as units of malic-enzyme activity per milligram of DNA and represent the mean ± SE of four experiments, each of which was performed in duplicate (from Ref. 49, with permission of Archives of Biochemistry and Biophysics).
Intracellular accumulation of long-chain fatty acids or long-chain acyl-CoAs probably does not cause the loss of responsiveness to T3 or the stimulation of that responsiveness by corticosterone, because adding 0.5% serum albumin (to lower the concentration of unbound fatty acids) or long-chain fatty acids (0.25-0.5 mM) to the medium is without effect at the concentrations of T3 used in these experiments. Nuclear binding of T3 did not decrease in these cells in the absence of corticosterone, nor did corticosterone cause an increase in T3 binding. Thus, changes in the levels of the T3 receptor are unlikely to be involved in the loss of responsiveness that occurs as a function of time in culture or in the increase in responsiveness caused by corticosterone. Our working hypothesis is that corticosterone stimulates production of a
factor required for T3 responsiveness, or inhibits production of an inhibitor of that process. In preliminary experiments, the glucocorticoid-sensitive cis-acting element appears in the same 200-bp fragment of the malic-enzyme gene that mediates the T3 response. Identification of the corticosterone-regulated factor may provide a greater understanding of the factors involved in the ability of the ligand-bound T3 receptor to stimulate transcription of linked genes.
3. CARNITINE PROLONGS THE RESPONSIVENESS TO T3
Carnitine, a cofactor involved in the oxidation of fatty acids, also stimulates responsiveness to T3 (49). The effects of carnitine and corticosterone are at least additive and possibly synergistic, suggesting different mechanisms. Carnitine may increase the rate of fatty-acid oxidation, suggesting that a fatty-acid metabolite may regulate responsiveness to T3. Alternatively, a metabolite, the concentration of which is controlled by the rate of fatty-acid oxidation, may regulate responsiveness to T3.
C. Unesterified Fatty Acids Inhibit T3-induced Transcription
1. LONG-CHAIN FATTY ACIDS
Unesterified long-chain fatty acids inhibit the de novo synthesis of fatty acids in hepatocytes incubated in simple solutions of buffered salts (50). In addition, the levels of fatty-acyl-CoAs, the immediate product of fatty-acid activation in hepatocytes, are elevated by starvation or induction of diabetes (51-53) or in hepatocytes in culture treated with glucagon (50), all conditions associated with inhibition of fatty-acid synthesis. The activity of the probable pace-setting enzyme in fatty-acid synthesis, acetyl-CoA carboxylase, also is inhibited by fatty-acyl-CoA (19-21). These observations suggest that the concentration of plasma unesterified fatty acids may play an important role in regulating fatty-acid synthesis. It seems reasonable, therefore, that these agents may regulate transcription of the lipogenic genes. However, at the concentrations of T3 used in most of our experiments, unesterified long-chain fatty acids have no effect on transcription of the malic-enzyme gene, even under conditions where the concentration of the fatty acid is unlikely to be affected significantly by its relatively rapid rate of metabolism (49, 54). The concentration of T3 in our experiments was 1.6 µM, about 10^3 times higher than that required to saturate the T3 receptor. In our early experiments we measured enzyme activity. Due to the enzyme's long half-life, the hepatocytes had to be incubated with T3 for 2 or 3 days to achieve a substantial degree of induction. T3 is degraded rapidly in serum-free Waymouth medium; after 24 hours in culture, with or without cells,
T3 is undetectable in the medium (A. G. Goodridge, unpublished results). In order to maintain a significant level of hormone for a prolonged period of time, we add high concentrations of the hormone to the hepatocytes in culture. When we discovered that T3 caused near maximal induction of transcription of the malic-enzyme gene within 2 hours after adding the hormone, we performed a series of experiments at physiological concentrations of T3. Binding of T3 to its nuclear receptor and transcription of the malic-enzyme gene were measured in parallel tissue-culture plates during a 2-hour incubation with T3. The dose-response relationships between binding of T3 to its receptor (Fig. 8A) and T3-mediated stimulation of transcription of the malic-enzyme gene (Fig. 8B) are very similar. Furthermore, at 200 pM T3 (sufficient to occupy 80% of the nuclear T3 receptors), 0.5 mM dodecanoate inhibited both transcription of the malic-enzyme gene and binding of T3 to its nuclear receptor (Fig. 9).
Long-chain fatty acids and their acyl-CoA derivatives inhibit binding of T3 to its nuclear receptor (55, 56). Their effects are competitive with T3, so that it is unlikely that inhibition would be observed at concentrations of T3 that are 10^3 times greater than those necessary to saturate the receptor. This may explain why long-chain fatty acids are inhibitory at physiological concentrations of T3 but ineffective at high concentrations. Our results suggest that, in vivo, the changes in plasma levels of unesterified fatty acids that occur during the transitions between the fed and starved states may play important roles in the regulation of transcription of the lipogenic genes.
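The argument that a competitive inhibitor loses its effect at saturating hormone concentrations can be sketched numerically. The example assumes simple one-site binding with a purely competitive inhibitor. The function name `occupancy` is hypothetical, the dissociation constant of 50 pM is chosen only so that 200 pM T3 reproduces the 80% occupancy quoted in the text, and the inhibition constant of 0.1 mM is an arbitrary illustrative value rather than a measured one.

```python
def occupancy(t3, kd, inhibitor=0.0, ki=1.0):
    """Fraction of receptor occupied by T3 when a competitive inhibitor is present.

    A competitive inhibitor raises the apparent dissociation constant for T3
    by the factor (1 + [inhibitor] / Ki).
    """
    apparent_kd = kd * (1.0 + inhibitor / ki)
    return t3 / (t3 + apparent_kd)

KD_PM = 50.0  # pM; assumption: gives 80% occupancy at 200 pM T3
KI_MM = 0.1   # mM; hypothetical inhibition constant for the fatty acid

for t3_pm in (200.0, 1.6e6):  # 200 pM and 1.6 µM T3
    without = occupancy(t3_pm, KD_PM)
    with_fatty_acid = occupancy(t3_pm, KD_PM, inhibitor=0.5, ki=KI_MM)
    print(t3_pm, round(without, 4), round(with_fatty_acid, 4))
```

With these assumed constants, 0.5 mM inhibitor halves occupancy at 200 pM T3 (0.8 to 0.4) but leaves it above 0.999 at 1.6 µM, which is the qualitative pattern described above.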
2. MEDIUM-CHAIN FATTY ACIDS
When we initiated the analysis of the actions of fatty acids, we were concerned about maintaining effective concentrations of long-chain fatty acids during incubations of 1 or 2 days duration. The physiologically important long-chain fatty acids are quite insoluble in aqueous media. To achieve even the modest concentrations found in the plasma of fed animals, it is necessary to bind the fatty acids to albumin. Medium-chain fatty acids such as hexanoate or octanoate, on the other hand, are much more soluble; it is possible to achieve concentrations of 5 or 10 mM without adding albumin. As a result, we decided to test the effects of unesterified medium-chain fatty acids in our cells.
Octanoate and hexanoate inhibit the induction of malic-enzyme activity by T3 (54). These effects are mediated at the level of transcription and are manifest within 30 minutes after adding the fatty acid. Inhibition by such fatty acids is specific with respect to the structure of the fatty acid, selective with respect to the genes that are inhibited, and readily reversible by changing the medium to one lacking the fatty acid (54). Saturated fatty acids with
[Figure 8: (A) T3 binding versus [T3] (nM), axis ticks 0 to 2.0. (B) Slot-blot autoradiograph at [T3] (nM) of 0.0, 0.16, 1.6, 16, 160, and 1600; probes: ME-2.6, M13mp18Rf, ME-4.8-5', ME-4.8-3', pUC19, FAS, GAD.]
FIG. 8. T3 binding to nuclear receptor (A) and transcription of the malic-enzyme and fatty-acid synthase genes (B) as a function of T3 concentration. Hepatocytes were isolated and incubated for 3 days with insulin (50 nM) plus corticosterone (1 µM) (49). On day 3, the medium was changed to one of the same composition; 1 hour later, [125I]T3 (A) or unlabeled T3 (B) was added for an additional 2 hours at the concentrations indicated. After the cells were harvested, nuclei were isolated and assayed for radioactivity (A) or transcription (B). Assay procedures were as described (49). Nonspecific binding was measured by simultaneous incubation of cells with a 1000-fold molar excess of nonradioactive T3. Nonspecific binding was less than 5% of total binding (at 200 pM T3) and was subtracted from total binding to obtain the specific binding shown in A. Numerical designations of the malic-enzyme DNA probes indicate their lengths in kilobases and can be located on the gene maps in Figs. 4 and 5.
chain lengths of six to eight carbons are the most effective inhibitors. Butanoate and decanoate are less effective than hexanoate or octanoate. 2-Bromooctanoate, 2-bromopyruvate, six- and eight-carbon dicarboxylates, and branched-chain fatty acids and keto acids derived from the metabolism of
[Figure 9: Slot-blot autoradiograph; probes: ME-2.6, M13mp18Rf, ME-4.8-5', ME-4.8-3', pUC 19, FAS, GAD, β-actin. [125I]T3 bound, in the order shown on the figure: 263 ± 7, 238 ± 1, 139 ± 16, 62 ± 12.]
FIG. 9. Inhibition of the transcription of the malic-enzyme and fatty-acid synthase genes in the presence of 200 pM T3. Hepatocytes were isolated and incubated in a chemically defined medium containing insulin (50 nM) and corticosterone (1 µM) (54). At about 20 hours of incubation, the medium was changed to one of the same composition. At 66 hours of incubation, T3 (200 pM), with or without fatty acid (0.5 mM), was added to the incubation medium. The cells were harvested at 68 hours of incubation. Isolation of nuclei, transcription run-on assays, and binding of [125I]T3 to nuclear receptors were carried out as described in the legend to Fig. 8. Results of the binding assays are expressed as femtomoles T3 bound per milligram of DNA. HEX, Hexanoate; OCT, octanoate; DODEC, dodecanoate; ME, malic enzyme; M13mp18Rf, replicative form of M13mp18 vector DNA; FAS, fatty-acid synthase; GAD, glyceraldehyde-3-phosphate dehydrogenase (from Ref. 54, with permission of The Journal of Biological Chemistry). Numerical designations of the malic-enzyme DNA probes indicate their lengths in kilobases and can be located on the gene maps in Figs. 4 and 5.
branched-chain amino acids are slightly stimulatory, ineffective, or only slightly inhibitory (54).
Subsequently, we tested a wide variety of related compounds for their effects on the stimulation of malic-enzyme activity by T3. Compounds with inhibitory effects similar in magnitude and potency to those of medium-chain fatty acids are those that are structurally similar to hexanoate or octanoate or that can be converted to hexanoate or octanoate by intracellular metabolism (Table II). Despite the fact that medium-chain fatty acids are not present in chicken blood at concentrations that inhibit transcription of the malic-enzyme gene, this inhibition may reflect a physiological regulatory mechanism.
The mechanism by which medium-chain fatty acids regulate transcription of the malic-enzyme gene is probably different from that for long-chain fatty acids. At 200 pM T3, hexanoate has no effect on binding of T3 to its
TABLE II
EFFECT OF 0.5 mM OF COMPOUND ON MALIC-ENZYME ACTIVITY*
Inhibit (>50%)
Hexanoate
Hexanal
Heptanoate
Octanoate
Octanal
1-Octanol
2-Hydroxyoctanoate
Lipoate
Dihydrolipoate
Monooctanoylglycerol
Octyl-β-glucoside
1,2-Dioctanoylglycerol
1,3-Dioctanoylglycerol
No effect (
15 bases) before initiating attack on another molecule (227). In addition, both enzymes are inhibited by hairpin structures, although PNPase can digest through a structured RNA, such as a tRNA, if given a sufficient unpaired 3' end (8). These features all have strong implications for the way in which these exonucleases might act in decay (227).
A recent analysis of RNase II catalysis addresses its processive mechanism (227). The result was to demonstrate, as proposed also for other RNA and DNA 3' exonucleases, that two sites exist: one, the “anchor,” holds the substrate; the second, which interacts with the substrate about 15 to 27 nucleotides from the anchor site, binds and hydrolyzes the 3'-terminal dinucleotide linkage. This analysis suggests that the substrate, bound at the anchor site, is pulled progressively through the anchor by the force developed by the cleavages occurring at the catalytic site (227). Neither PNPase nor RNase II has an activity unique in the bacterial cell, but there is no evidence to link the other 3' exonucleases to mRNA decay.
Early tracer studies on RNA turnover implicated nucleoside 5'-phosphates and, particularly, 5' NDPs as products of mRNA turnover (26, 203) (Fig. 6). In fact, subsequent studies of the turnover of RNA in the presence of H2(18)O were interpreted as indicating that turnover in E. coli is primarily hydrolytic, whereas that in B. subtilis is phosphorolytic (204, 205). These findings have gained support in recent work demonstrating that these bacteria differ markedly in their content of the two enzymes. Escherichia coli possesses 10-fold more RNase II activity than PNPase activity in extracts; B. subtilis has no detectable RNase II activity (71).
Although present in eukaryotic cells, no 5'-to-3' exoribonucleases have been identified in bacteria, in spite of the scrutiny resulting from the fact that many messengers appear to be degraded 5' to 3'. Indeed, an E. coli in vitro system manifesting 5'-to-3' decay that was coupled to translation was shown to depend on the RNase II in the extracts (228). Because of these and related observations, it has been postulated that endonucleases cleave messenger progressively from the 5' terminus, and that 3' exonucleases mediate the degradation of the freed RNA fragments to mononucleotides.
More direct evidence for the participation of RNase II and PNPase in messenger decay comes from diverse sources (2, 161, 229). For example, the presence
DECAY OF BACTERIAL MESSENGER RNA
of 3'-terminal hairpins on many transcripts is often attributable to the activity of these enzymes (157, 230). Escherichia coli strains that harbor mutations in either PNPase or RNase II alone do not differ significantly from wild type in growth rate, or in the decay of β-galactosidase or total mRNA (223, 231); however, double mutants are inviable (222, 232). Strains that lack PNPase activity and possess a temperature-sensitive RNase II have been constructed, although with difficulty. At the restrictive temperature, these strains weakly accumulate fragments ranging from 100 to 1500 nucleotides (196, 224), as well as some defined fragments of specific messengers. These include transcripts from the malEFG operon, the spc operon, the glyA operon, and of the ribosomal protein S20 mRNA (64, 103, 120, 161).
The observation that the E. coli pnp rnb(Ts) mutants do not have a markedly altered overall rate of mRNA decay is consistent with the presumption that these enzymes do not generally initiate decay but rather serve to remove intermediate fragments. However, in a triply mutant strain that includes a temperature-sensitive RNase E [pnp rnb(Ts) ams(Ts)], the bulk rate of mass decay is slowed to 4-4 at the nonpermissive temperature (196). Also, decay of certain specific messengers is much more strongly affected in the triple mutant. The synergism between pnp rnb mutations and ams might be explained in any of several ways. However, it is noteworthy that a complex of PNPase with RNase E has been demonstrated.
B. RNase E
1. IDENTIFICATION OF THE rne GENE AND PROPERTIES OF RNASE E
The determination that RNase E participates in messenger decay and the characterization of the purified enzyme have been perhaps the most significant recent advancements in the field. The rne gene is essential; no “knockout” mutations have been isolated. RNase E cleaves several bacterial and phage messengers, as well as influencing the stability of its own mRNA. Finally, RNase E appears to be both a large and complex enzyme and a component of a multiunit protein complex.
The rne locus was identified by Apirion by way of a temperature-sensitive mutation that mapped at min 23.5 on the E. coli genome and that was defective in the synthesis of 5-S rRNA at the nonpermissive temperature (233). Initial biochemical studies revealed that RNase E converts the 9-S rRNA precursor into the 5-S rRNA species (2, 201, 234-236), cleaves RNA I of ColE1, and processes several T4 RNA species (192, 200, 237). At about the same time, Kuwano and Ono isolated mutants that were temperature sensitive for growth and whose mRNA had significantly prolonged chemical lifetimes; one of the mutations, designated ams (altered mRNA stability), was
DONALD P. NIERLICH AND GEORGE J. MURAKAWA
chosen for further study (238-240). Subsequent work has shown that the original mutations, rne-3701 and ams (now also called rne-1), are encoded by the same gene and affect adjacent amino acids, being only six bases apart (241-244). The ams/rne-1 allele possesses a slightly stronger phenotype than does rne-3701; otherwise, the two appear much the same.
The structural gene for RNase E has been sequenced and encodes a 1061 amino-acid protein (245-247) (G. Mackie and I. B. Holland, personal communications). The initial sequences of the rne gene obtained were incomplete; the complete sequence (save for some recently posted corrections) was obtained from an E. coli clone that had been selected with antibody directed to yeast heavy-chain myosin. It is unlikely that this was a spurious cloning (199, 245, 247a). The product of the hmp1 (high molecular weight protein) gene, Hmp1, possesses both attributes of a cell-structure protein and RNase E. Although RNase E does not appear related to other ribonucleases, sequence analysis identified a region very similar to the highly conserved 70-kDa snRNP protein involved in eukaryotic RNA splicing in the hydrophilic C-terminal half, and a myosin-like sequence and potential membrane insertion site in the N-terminal half, which also contains the ams/rne mutations
(245).
The RNase E protein has been difficult to identify and purify, and different techniques have yielded quite different products (199, 201, 245, 247a). The size of the polypeptide, for example, has been variously reported as 60 to 180 kDa. RNase E, which, from DNA sequence analysis, has a subunit size of 118 kDa, migrates aberrantly as a 180-kDa polypeptide on SDS polyacrylamide gels, apparently because of a highly charged C terminus (199, 245, 247a). The protein, obtained following denaturing gel electrophoresis at about 95% purity, can, after renaturation, cleave natural substrates correctly, although it possesses a low specific activity (199).
When purified without denaturation, RNase E can be obtained as a defined complex of about 500 kDa (248). The complex contains PNPase as well as two unidentified proteins of 50 and 48 kDa. Initial experiments with antibodies directed at the four polypeptides suggest that all of the RNase E and PNPase of the cell are present in the complex; this is not the case for the two smaller proteins (248) (A. J. Carpousis, personal communication). At various times, RNase E has been reported to be either membrane associated or not, and in a complex with other RNA processing endonucleases (249-251). It is now generally considered not to be membrane bound, but given its physical characteristics, it may well be associated in some way with cellular structures.
RNase E does not appear to be an abundant protein in the cell, and what is present has an apparent size of 180 kDa on SDS gels, as described above (247a) (A. J. Carpousis, personal communication). In a number of studies,
active RNase E of 60-80 kDa was obtained; these polypeptides appear now to be in vitro degradation products that nonetheless retained some activity (202, 252). Similarly, the protein as isolated using different substrates for purification had distinct catalytic features. This “RNase K” activity also appears artifactual (Section IV,B,3). The bacterial chaperonin protein, GroEL, has also been reported to copurify with RNase E, although GroE subunits are not among the two unidentified polypeptides mentioned above (253). In an odd twist of research, a gene that complements the ams mutation was cloned and sequenced. The sequence was subsequently identified as that of GroEL, in keeping with the discovery that the complementing gene did not map at the same site as the ams gene (253, 254). Despite this finding, it has been difficult to establish that the association of the proteins is not fortuitous: GroEL has an affinity for other polypeptides. Nonetheless, highly purified GroEL appears to bind and protect RNA in vitro from RNase E cleavage in a reaction that is dependent on Mg2+ and ATP (255). However, a specific interaction with RNase E per se still remains to be established. Finally, EIF, which is inferred to bind hairpins as discussed in regard to 3’ stabilizers (Section III,B,1), purifies with a large complex that contains RNase E and PNPase (173, 256). These intriguing results lead to the conclusion that RNase E functions as part of a large multisubunit complex, perhaps a “degradosome.” However, they leave the composition of the complex unresolved.

2. CLEAVAGE OF RNA in Vivo AND in Vitro BY RNASE E

RNase E cleavage sites have been identified, first, by cataloging the RNAs that accumulate in rne(Ts) mutants, and second, by detailed comparisons of cleavage patterns of mRNAs in vivo and in vitro. Based on such analyses, RNase E appears to cleave many RNAs, perhaps on the order of once per 100 bases (2, 3, 201).
The two substrates first characterized, the 9-S rRNA precursor and RNA I, possess two and one cleavage sites, respectively. These facts, as well as the finding that two of these sites have in common a 9- or 10-base sequence, led to the initial idea that the enzyme is quite specific. However, as additional substrates were characterized, it became increasingly clear that there is minimal similarity among the sites. By tabulating cleavage sites identified as occurring in vivo and by analyzing target sequences by mutagenesis in T4 gene 32, E. coli r-protein S20, RNA I and other sequences, only a weakly defined pattern emerges (3, 92, 201, 257). Thus, it was concluded that RNase E preferentially cleaves regions of generally single-stranded RNA with a consensus sequence of RAUUW (R = A or G, W = A or U) (92) or simply AU (257); cleavage most often occurs 5’ to A. In some cases, the presence of a nearby secondary structure, often a
downstream hairpin, has appeared important to RNase E recognition, but this has not proved to be a consistent feature (92, 257a). Indeed, both randomly generated mutations and site-specific mutagenesis of the region surrounding the cleavage site of RNA I show that the sequence near the site of cleavage influences cleavage, whereas the relative locations of physical features (5’ end, hairpin loops) do not. However, there is a great deal of freedom in the sequence; e.g., in one case, the introduction of a G shifted and repressed the cleavage but did not abolish it (258-260). In keeping with this, recent studies of RNase E cleavage in vitro, using highly purified enzyme and, as substrates, both RNA I and synthetic oligonucleotides, indicate that it preferentially cleaves sites high in A + U content with little sequence constraint (260). Short, single-stranded oligoribonucleotides are readily cleaved by RNase E, and nearby stem-loop structures actually inhibit the activity in vitro. Many E. coli transcripts that have RNase E cleavage sites have been characterized; these include the mRNAs of the his operon (198), the rpsU-dnaG-rpoD operon (261), the pap operon (262), the rplKAJL-rpoBC operon (263), the dicB operon (264), the rpsO-pnp operon (143, 265), the unc operon (266), the r-protein S20 gene (257, 257a), and rne itself (140). Additional RNase-E-mediated cleavage sites occur in lacZ fusion transcripts (212), tetR-lacZ (267), a hybrid transcript of atpE/interferon-β (268), IS10 mRNA (269), the R. capsulatus puf mRNA in E. coli (270), and phage f1 mRNA (271).
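The weak consensus discussed above can be made concrete with a small pattern scan. The sketch below is illustrative only: it assumes the RAUUW motif (R = A or G, W = A or U) reported in (92), treats the input as a plain RNA string, and reports motif start positions and A + U content. It does not model secondary structure and does not predict actual cleavage; the function names and the example sequence are hypothetical.

```python
import re

# RAUUW consensus from the text: R = A or G, W = A or U.
# The zero-width lookahead lets overlapping motifs be reported.
RAUUW = re.compile(r"(?=[AG]AUU[AU])")


def find_rauuw_sites(rna: str) -> list[int]:
    """Return 0-based start positions of RAUUW motifs in an RNA string."""
    return [m.start() for m in RAUUW.finditer(rna.upper())]


def au_fraction(rna: str) -> float:
    """Fraction of A and U residues in the sequence (0.0 for empty input)."""
    rna = rna.upper()
    if not rna:
        return 0.0
    return sum(base in "AU" for base in rna) / len(rna)


if __name__ == "__main__":
    example = "GGAUUAACGAUUU"  # made-up sequence, not a real transcript
    print(find_rauuw_sites(example))
    print(round(au_fraction(example), 2))
```

A match here marks only a candidate region; as the text notes, the observed sites show a great deal of sequence freedom, so such a scan would both miss real sites and flag spurious ones.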
Although a large number of naturally occurring RNase E sites have been identified, it remains unclear whether the enzyme (1) mediates the functional inactivation of these transcripts or acts only to mediate chemical decay, (2) plays an important role in substrate-site selection vis-à-vis features of the substrate or the substrate’s environment, or (3) possesses multiple independent functions, or whether its myosin-like features are components of its function in RNA processing and decay. Only in the autoregulation of RNase E has this enzyme been shown to mediate functional inactivation (139, 140). RNase E cleavage sites have been identified in poorly transcribed chimeric lacZ fusions; however, we and others have found that RNase E mutations do not alter the functional or chemical decay of the natural lacZ transcript (61, 212) (S. H.-Y. Wei, unpublished results), although it appears that sites resembling RNase E cleavage sites are present in lacZ (S. H.-Y. Wei and D. P. Nierlich, unpublished). Interestingly, when rne-3701 was first characterized, its effect on general mRNA decay was missed, and further, when it was studied in the context of strains with defects also in RNase III and RNase P, 21 of 80 proteins had reduced, rather than increased, levels (272). It is the strong phenotype of the triple mutant, ams pnp rnb, that brings to focus the likely role of RNase E in
decay (196). Decay products then accumulate and the lac and other mRNAs are stabilized markedly (S. H.-Y. Wei, unpublished). It therefore appears that the rne/ams product will prove to have an important cellular role. In this regard, two reports present evidence for the presence of RNase-E-like enzymes in human cells. In one, an enzyme with the specificity of RNase E was isolated from cultured B-cells (273); in the second, an enzyme complementing rne mutations in E. coli was cloned from a human library and expressed. In the latter case, the protein obtained had enzymatic and immunological properties in common with E. coli RNase E
(274).

3. “RNASE K”

An enzyme was purified that cleaved the ompA transcript in vitro at the same sites cleaved in vivo and was designated RNase K (147, 275). This enzyme had some properties distinct from those reported for RNase E, although it was not present in rne mutants. For a period, it was believed that RNase K might be a natural proteolysis product of RNase E (201). The original investigators now suggest that the properties found for RNase K are largely the consequence of the presence of GroEL in preparations of RNase E (Section IV,B,1) (A. von Gabain, personal communication).
C. RNase III

RNase III is specific for double-stranded RNA. It was identified as the endonuclease that cleaves one or both strands of the stems of hairpins present in phage T7 transcripts and E. coli rRNA precursors (75, 276, 277). Although RNase III sites are relatively infrequent, both perfectly and imperfectly paired sequences are cleaved, and the sites observed share few features. Thus, the basis of the enzyme’s specificity remains elusive (278). Notwithstanding its role in rRNA processing, the structural gene for RNase III, rnc, is not essential; steps normally catalyzed in rRNA synthesis appear to be made unnecessary by cleavages at other sites (279). Mutation in rnc alters the relative expression of a substantial number of proteins in E. coli, some positively and others negatively (272), and affects the stability of a number of messengers, among them pnp (142), the rnc operon (141), the metY-nusA-infB operon (280), the dicF gene (264), and the secE-nusG and rplKAJL-rpoBC operons (263). RNase III cleaves the sib hairpin and thus destabilizes the int message of bacteriophage λ, as discussed above (281). It cleaves mRNAs of phages f1 (271) and T7, the latter cleavage stabilizing the mRNA (165). RNase III also cleaves the “killer” mRNAs of the hok/sok, srnB, and pnd transcripts for plasmid maintenance (282). In studies in vitro, RNase III cleaved the 5’ region of the lacZ transcript
at discrete sites and inactivated its capacity to synthesize the lacZ α-polypeptide (283, 284). Moreover, RNase III was the major activity inactivating the α-polypeptide template activity in E. coli extracts. Whether RNase III cleaves the lac mRNA in vivo is less clear. Strains carrying an rnc mutation produce β-galactosidase at a reduced rather than an enhanced rate, but this trait frequently reverts without affecting the RNase III deficiency (285). Although fusion of lacZ to an upstream sequence that possesses an RNase III cleavage site appreciably speeds the decay of lacZ messenger (286), decay of the native lac operon mRNA, when characterized on Northern blots, is not altered by rnc mutation (S. H.-Y. Wei, G. J. Murakawa and D. P. Nierlich, unpublished). These observations suggest that, in vivo, the RNase III site(s) in the 5’ coding region of lacZ is masked by some specific mechanism. Thus, it appears that RNase III cleaves a small but significant number of mRNAs in E. coli. Cleavage can result in the stabilization or destabilization of the mRNA, and such cleavages provide means by which decay may be modulated.
D. Other Endonucleases

RNase I is a broad-specificity endonuclease that degrades RNA to 3’ mononucleotides (287). Because of its periplasmic location, RNase I presumably does not participate in messenger degradation. A form of the enzyme, RNase I*, exists intracellularly in E. coli, and it was suggested that it might function in mRNA degradation (206, 288). Specifically, given that the exonucleases PNPase and RNase II leave short core products (Section IV,A), RNase I* might be particularly suited for their cleavage. If true, RNase I* can have only a participatory role, because it has been known for many years that rna (RNase I) mutants are not defective in mRNA decay (7). Another enzyme, oligonucleotidase, an RNase-II-like enzyme that acts on short substrates, could also perform this function (7). RNase M and RNase R were isolated in an effort to find other broad-specificity endonucleases (202). RNase M is a 26-kDa monomeric endoribonuclease that preferentially cleaves Y-A linkages in RNA (289). RNase R (for residual) is a minor activity in E. coli extracts. It is a 24-kDa endonuclease with a specificity resembling that of RNase I* (202). RNase I*, RNase M, and RNase R cleave RNA to produce 3’ nucleoside phosphates, which presumably could only be reutilized after dephosphorylation. This suggests that the role of these enzymes in decay might be quite limited, although observations supporting their action in vivo have been made (Section III,C,3) (206). On the other hand, there is work that suggests that the rna gene product(s) is involved in the degradation of rRNA in C-source-starved E. coli (290). This observation may mean that these enzymes function under those conditions in which the stable species are degraded. (This suggestion does not preclude the lysis of some fraction of the cells during starvation.)
E. Other RNA Binding Proteins

Bacteria contain a number of nonspecific RNA binding proteins and helicases. With the exception of the rho protein, which functions in transcriptional termination, and other termination factors, the roles of these proteins and helicases are largely unknown (291, 292). Almost certainly some of these will function in mRNA decay. Along this line, results of a recent study suggest that, in E. coli, poorly translated mRNAs are stabilized by the DEAD-box proteins encoded by deaD and srmB (219). The DEAD-box proteins are a ubiquitous group of RNA helicases. A still incompletely defined activity, exoribonuclease-impeding factor (EIF), is postulated to stabilize hairpins from attack by PNPase (Section IV,A). Recent attempts at its purification suggest that it is contained in an RNase E- and PNPase-containing complex of about 500 kDa that is much like that obtained by direct purification of the RNase E activity. Finally, GroEL, implicated as protecting mRNA from decay in slowly growing cells, binds RNA in vitro (255) (Section IV,B,1).
V. Mechanism of mRNA Decay

A. Models: Killer Ribosomes, the “Degradosome,” and the Runoff of Ribosomes
The long history of study of messenger decay is reflected in the models that have been proposed to explain its underlying mechanism. Based on the observations that messenger decay proceeds 5’ to 3’ and is in some way associated with translation, models were advanced in which a 5’ exonuclease degrades mRNA behind the last translating ribosome, or decay involves a ribosome with an associated exonuclease (a “killer” ribosome), which inactivates and degrades messenger after a random number of translations (293, 294). The observation that decay follows exponential kinetics pointed to a random attack on an exposed site (32). In this same vein, search for an mRNA-degrading activity in vitro led to a report of a new enzyme (RNase V) whose action was dependent on translation and had a 5’-to-3’ directionality (228, 294). Further research showed that this activity depends on the presence in the extracts of the 3’ exonuclease, RNase II (295, 296), and led in time to a focus on the 3’ ends of mRNAs as substrates for decay (297). When studies of rRNA processing revealed the number and importance of endonucleases in the cell, and as it became clearer that some mRNAs were degraded 5’ to 3’, the idea developed that decay came about through the combined action of endonucleases and exonucleases (10, 12). Further characterization of rRNA and tRNA processing activities [including RNase E, whose role in mRNA decay was not initially detected (272)] showed that some sedimented with other ribonucleases. Similarly, the construction of multiply ribonuclease-deficient strains showed that they possess distinct phenotypes that single mutants do not. This led to the idea of processing complexes, the processosomes and degradosomes (248, 250). Studies of the decay of the lac messenger strengthened the notion that mRNA is not degraded by random nucleolytic hits. This led to a reinforcement of the idea that mRNA is degraded by the combined action of endonucleases and 3’ exonucleases. In this model, their action constitutes a 5’-to-3’ “net wave” acting on molecules initially protected by translation and then, after messenger inactivation, exposed to endonucleolytic cleavage behind the last translating ribosome (4). Focus on the role of the 5’ end as a prime determinant of decay for many mRNAs has continued (114), and a role for RNase E has gained greater support (196). This is reflected in the proposal that RNase E, perhaps as part of a degradosome, enters the messenger and traverses it from 5’ to 3’, either by sliding or skipping (1, 115, 298). This model is particularly attractive in the light of the finding that RNase E might have spatially distinct cleavage and RNA binding sites (245). Another model, based on the synergism found between the mutations in RNase E and other genes affecting mRNA decay, proposes that a decay-enzyme complex anchors to the 3’ terminal poly(A) sequences and cleaves internally from the 3’ end (148).
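The exponential kinetics cited above (32) correspond to first-order decay, N(t) = N0·exp(−kt), with half-life ln 2/k. The following is a minimal sketch of that relationship, assuming a single rate constant and idealized measurements; the function names and any numbers fed to them are hypothetical, not data from the studies cited.

```python
import math


def half_life(k: float) -> float:
    """Half-life of a first-order decay with rate constant k (per unit time)."""
    if k <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / k


def fraction_remaining(t: float, k: float) -> float:
    """Fraction of mRNA remaining after time t under N(t) = N0 * exp(-k t)."""
    return math.exp(-k * t)


def rate_from_two_points(t1: float, n1: float, t2: float, n2: float) -> float:
    """Estimate k from two measurements (t1, n1) and (t2, n2) on one exponential."""
    if t2 == t1 or n1 <= 0 or n2 <= 0:
        raise ValueError("need two distinct times and positive amounts")
    return math.log(n1 / n2) / (t2 - t1)
```

In practice a rate constant would be fitted to many time points on a semilogarithmic plot; the two-point estimate is shown only because it makes the arithmetic explicit.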
B. Concerted Decay
We believe that these models, although accommodating many important features of decay, do not address three aspects that are strongly supported.
1. Messenger molecules are degraded rapidly once decay is initiated (55, 64, 203) (G. J. Murakawa, T. Thai and D. P. Nierlich, unpublished). This is contrary to the relatively slow movement of ribosomes, which are proposed to limit the 5’-to-3’ movement of the decay wave.
2. It is probable that there are alternative pathways of decay and participatory roles for additional enzymes. Even in double or triple mutants that combine mutations in pnp, rnb, and rne, messenger decay still occurs, sometimes only slightly slower than in the wild type, and there is only a modest increase in the accumulation of degradation intermediates of some mRNAs (146, 196).
3. Current models neither address the functional inactivation of messenger nor correctly reflect the role that translation plays in the overall process (83, 90, 94, 212a) (O. Fattal, T. M. F. Tuohy, J. F. Atkins and D. P. Nierlich, unpublished).

The model shown in Fig. 7 has as a key element the formation of a translationally inactive complex that initially encompasses the 5’ region of a messenger but then sequesters distal sequences. Formation of the complex is brought about by a protein factor that binds near the RBS and inactivates it. The protein factor (it might be an RNA binding protein similar to rho or a ribonuclease like RNase E in an inactive state) binds to RNA randomly and with broad sequence specificity. However, when binding near the RBS, competition occurs between the factor and ribosomes, and when binding to actively translated regions, the factor is displaced by the movement of ribosomes. Where the decay of a polycistronic mRNA such as lacZYA is concerned (94), the factor might inactivate a distal messenger by binding in an intercistronic region, without affecting the decay of the upstream region. Where the intercistronic region is short, the factor could be displaced by ribosomes in the same way as was postulated to occur in translated regions. Further formation of the RNA-protein complex involves gathering the
FIG. 7. A model for the decay of bacterial mRNA. An RNA binding protein or RNase (in an inactive state, arrow) binds randomly to mRNA. Where binding occurs in translated regions, the factor is displaced by ribosomes. Where binding overlaps with an RBS, further translation is inhibited and a complex forms. Association with RNases occurs to form a complex that leads to the rapid formation of fragments; transfer of 3’ exonucleases to newly formed ends leads to rapid removal of fragments (POOF) (94) (O. Fattal, T. M. F. Tuohy, J. F. Atkins and D. P. Nierlich, unpublished).
inactivated mRNA into folds, the structure of the folds reflecting secondary-structure constraints of the RNA molecule. This complex formation is accompanied by the association of several ribonucleases, including, presumably, RNase E, RNase II, and PNPase. Triggering of the complex leads to cleavage of the messenger at several sites and the transfer of 3’ exonuclease(s) to the newly formed 3’ ends. The processive action of the exonuclease(s) reduces the size of the fragments quickly. Triggering might be randomly initiated or it might, in some cases, require the formation of a specific complex. The protection afforded by ribosomes is thus indirect, due to interference with the formation of the complex, potentially at several points. By this model, cleavage of messengers might not be entirely 5’ to 3’. The rapid decay of the messenger might, to some extent, hide features of the process. At the same time, the 5’ end might be cleaved before translation is fully run out or before transcription of the 3’ end is complete, particularly with long mRNAs. However, by the introduction of a lag between functional and mass decay, most often (particularly with shorter messages) translation will have run out when chemical degradation begins. In some cases, secondary regions of the messenger might be degraded faster or slower than the 5’ end because of differences in the rates of formation and degradation of different domains. In still other cases, the triggering event might depend on the formation of a specific complex, and an event involving the 3’ or 5’ end, or both, might be rate-limiting. Similarly, the presence of a stabilizing 5’ hairpin might prevent the nascent complex from encompassing the RBS. This model is distinct from some prior models in that decay is not a passive process in which mRNA is attacked randomly by cytoplasmic RNases as it becomes exposed by the runoff of ribosomes.
Rather, current research, more and more forcibly, indicates that there is a core of alternative means to remove rapidly the bulk of an mRNA. The enzymes responsible for this appear to be contained in degradation complexes, and their activity is modulated in various ways, in part to make them physiologically responsive, but also, importantly, to direct the overall process to depend on the 5’ and 3’ determinants that are rate-limiting for different mRNAs. For this process to work, translated regions, even those infrequently translated, are not generally direct targets. On top of this core of processes are overlaid the processing and cleavage steps that distinguish individual mRNAs whose rate-limiting steps are dependent on RNase I11 or, perhaps, on RNase E or other endonucleases. Polyadenylylation is necessary for the complete degradation of one labile RNA species, but does not appear necessary for the general decay of mRNA. Thus we envision its role as being in an alternative path of decay, perhaps one that removes both translated and untranslated RNAs that persist after escaping other routes of degradation.
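One element of the model, the competition between the inactivating factor and ribosomes at the RBS, can be illustrated with a toy simulation. The sketch below is not part of the model as published and makes strong simplifying assumptions: time is divided into discrete steps; in each step the RBS is occupied by an initiating ribosome with probability `p_ribosome`; when the RBS is free, the factor binds with probability `p_factor` and functionally inactivates the message. Both parameters and all function names are hypothetical.

```python
import random


def simulate_lifetime(p_ribosome: float, p_factor: float, rng: random.Random) -> int:
    """Number of steps until the factor first finds the RBS free and binds it."""
    if not (0.0 <= p_ribosome < 1.0) or not (0.0 < p_factor <= 1.0):
        raise ValueError("need 0 <= p_ribosome < 1 and 0 < p_factor <= 1")
    steps = 0
    while True:
        steps += 1
        rbs_free = rng.random() >= p_ribosome  # no ribosome on the RBS this step
        if rbs_free and rng.random() < p_factor:
            return steps  # factor bound: message functionally inactivated


def mean_lifetime(p_ribosome: float, p_factor: float,
                  trials: int = 20000, seed: int = 0) -> float:
    """Average functional lifetime (in steps) over many simulated messages."""
    rng = random.Random(seed)
    total = sum(simulate_lifetime(p_ribosome, p_factor, rng) for _ in range(trials))
    return total / trials


def expected_lifetime(p_ribosome: float, p_factor: float) -> float:
    """Mean of the geometric lifetime distribution implied by the assumptions."""
    return 1.0 / ((1.0 - p_ribosome) * p_factor)
```

Under these assumptions lifetimes are geometrically distributed, which in the continuous limit gives exponential inactivation kinetics, and a more frequently initiated message is inactivated more slowly. The toy ignores everything downstream of inactivation (complex assembly, triggering, and exonucleolytic removal of fragments).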
ACKNOWLEDGMENTS

We thank the many colleagues who sent reprints and preprints, or who otherwise communicated unpublished results. We also thank Gregory Charlop, Daniel Behroozan, Wayne Chang, Diane Brandt, Mimi Chan and Rainier Griiang for help in the preparation of the manuscript and figures. D. P. N. thanks the students and associates who over the years have helped make these studies so pleasurable a pursuit. Work in his laboratory was supported by grants from the NIH and NSF and research funds of the University of California.
REFERENCES

1. J. G. Belasco, in “Control of Messenger RNA Stability” (J. G. Belasco and G. Brawerman, eds.), Chap. 1. Academic Press, San Diego, 1993.
2. C. P. Ehretsmann, A. J. Carpousis and H. M. Krisch, FASEB J. 6, 3186 (1992).
3. A. J. Carpousis and H. M. Krisch, in “Molecular Biology of Bacteriophage T4” (J. D. Karam, ed.), Chap. 14. American Society for Microbiology, Washington, D.C., 1994.
4. D. Kennell, in “Maximizing Gene Expression” (W. S. Reznikoff and L. Gold, eds.), p. 101. Butterworths, Stoneham, MA, 1986.
5. P. Geiduschek and R. Haselkorn, ARB 38, 647 (1969).
6. M. P. Deutscher, This Series 39, 209 (1990).
7. A. K. Datta and S. K. Niyogi, This Series 17, 271 (1976).
8. M. Grunberg-Manago and A. von Gabain, NATO ASI Series, in press (1995).
9. N. R. Pace, in “Processing of RNA” (D. Apirion, ed.), Chap. 1. CRC Press, Boca Raton, FL, 1984.
10. T. C. King, R. Sirdeshmukh and D. Schlessinger, Microbiol. Rev. 50, 428 (1986).
11. T. C. King and D. Schlessinger, in “Escherichia coli and Salmonella typhimurium, Cellular and Molecular Biology” (F. C. Neidhardt, ed.), Chap. 47. American Society for Microbiology, Washington, D.C., 1987.
12. D. Apirion and P. Gegenheimer, in “Processing of RNA” (D. Apirion, ed.), Chap. 2. CRC Press, Boca Raton, FL, 1984.
13. D. Apirion and A. Miczak, Bioessays 15, 113 (1993).
14. J. G. Belasco and G. Brawerman (eds.), “Control of Messenger RNA Stability.” Academic Press, San Diego, 1993.
15. D. P. Nierlich, Annu. Rev. Microbiol. 32, 393 (1978).
16. F. Jacob and J. Monod, CSHSQB 26, 193 (1961).
16a. F. Jacob and J. Monod, JMB 3, 318 (1961).
17. S. Spiegelman, CSHSQB 26, 75 (1961).
17a. E. Volkin and L. Astrachan, Virology 2, 149 (1956).
18. S. Brenner, CSHSQB 26, 101 (1961).
19. F. Gros, W. Gilbert, H. H. Hiatt, G. Attardi, P. F. Spahr and J. D. Watson, CSHSQB 26, 111 (1961).
20. C. Levinthal, D. P. Fan, A. Higa and R. A. Zimmermann, CSHSQB 28, 183 (1963).
21. M. Jacquet and A. Kepes, JMB 60, 453 (1971).
22. J. G. Belasco and G. Brawerman, in “Control of Messenger RNA Stability” (J. G. Belasco and G. Brawerman, eds.), Chap. 18. Academic Press, San Diego, 1993.
23. D. Nakada and B. Magasanik, JMB 8, 105 (1964).
24. A. Kepes, BBA 76, 293 (1963).
25. A. L. Koch, J. Theor. Biol. 32, 429 (1971).
26. W. Salser, J. Janin and C. Levinthal, JMB 31, 237 (1968).
27. N. S. Petersen, C. S. McLaughlin and D. P. Nierlich, Nature 260, 70 (1976).
28. J. R. Cole and M. Nomura, JMB 188, 383 (1986).
29. A. Kepes, BBA 138, 107 (1967).
30. L. Leive and V. Kollin, JMB 24, 247 (1967).
31. L. Hartwell and B. Magasanik, JMB 7, 401 (1963).
32. M. Adesnik and C. Levinthal, CSHSQB 35, 451 (1970).
33. D. Kennell and V. Talkad, JMB 104, 285 (1976).
34. A. L. Koch, JMB 60, 12 (1971).
35. R. O. Kaempfer and B. Magasanik, JMB 27, 475 (1967).
36. C. Petersen, J. Bact. 173, 2167 (1991).
37. R. L. Coffman, T. E. Norris and A. L. Koch, JMB 60, 1 (1971).
38. L. Leive, JMB 13, 862 (1965).
39. G. Lancini, R. Pallanza and L. G. Silvestri, J. Bact. 97, 761 (1969).
40. K. R. Sowers, T. T. Tung and R. P. Gunsalus, JBC 268, 23172 (1993).
41. A. N. Hennigan and J. N. Reeve, Mol. Microbiol. 11, 655 (1994).
42. T. Kivity-Vogel and D. Elson, BBA 138, 66 (1967).
43. C. Petersen, MGG 209, 179 (1987).
44. M. Wice and D. Kennell, JMB 84, 649 (1974).
45. J. K. Rose and C. Yanofsky, JMB 69, 103 (1972).
46. C. Yanofsky and I. P. Crawford, in “Escherichia coli and Salmonella typhimurium, Cellular and Molecular Biology” (F. C. Neidhardt, ed.), Chap. 90. American Society for Microbiology, Washington, D.C., 1987.
47. S. Pedersen, S. Reeh and J. D. Friesen, MGG 166, 329 (1978).
48. M. Blundell, E. Craig and D. Kennell, Nature NB 238, 46 (1972).
49. D. Kennell and C. Simmons, JMB 70, 451 (1972).
50. D. Schlessinger, K. A. Jacobs, R. S. Gupta, Y. Kano and F. Imamoto, JMB 110, 421 (1977).
51. S. Lin-Chao and S. N. Cohen, Cell 65, 1233 (1991).
52. P. J. Green and M. Inouye, JMB 176, 431 (1984).
53. M. L. Pato and K. von Meyenburg, CSHSQB 35, 497 (1970).
54. K.-O. Cho and C. Yanofsky, JMB 204, 51 (1988).
55. G. J. Murakawa, C. Kwan, J. Yamashita and D. P. Nierlich, J. Bact. 173, 28 (1991).
56. G. A. Mackie, J. Bact. 169, 2697 (1987).
57. G. Klug and S. N. Cohen, J. Bact. 172, 5140 (1990).
58. A. von Gabain, J. G. Belasco, J. L. Schottel, A. C. Y. Chang and S. N. Cohen, PNAS 80, 653 (1983).
59. G. Nilsson, J. G. Belasco, S. N. Cohen and A. von Gabain, Nature 312, 76 (1984).
60. T. Yamamoto and F. Imamoto, JMB 92, 289 (1975).
61. M. Ono and M. Kuwano, JMB 129, 343 (1979).
62. C. C. Case, E. L. Simons and R. W. Simons, EMBO J. 9, 1259 (1990).
63. M. Blundell and D. Kennell, JMB 83, 143 (1974).
64. G. A. Mackie, J. Bact. 171, 4112 (1989).
65. D. P. Nierlich, JMB 72, 751 (1972).
66. D. P. Nierlich, JMB 72, 765 (1972).
67. D. P. Nierlich, Science 158, 1186 (1967).
68. J. Neuhard and P. Nygaard, in “Escherichia coli and Salmonella typhimurium, Cellular and Molecular Biology” (F. C. Neidhardt, ed.), Chap. 29. American Society for Microbiology, Washington, D.C., 1987.
69. D. P. Nierlich, PNAS 60, 1345 (1968).
70. H. Bremer and P. P. Dennis, in “Escherichia coli and Salmonella typhimurium, Cellular and Molecular Biology” (F. C. Neidhardt, ed.), Chap. 96. American Society for Microbiology, Washington, D.C., 1987.
71. M. P. Deutscher and N. B. Reuven, PNAS 88, 3277 (1991).
72. G. Contesse, M. Crepin and F. Gros, in “The Lactose Operon” (J. R. Beckwith and D. Zipser, eds.), Chap. VI. CSHLab, Cold Spring Harbor, New York, 1970.
73. P. S. Cohen, K. R. Lynch, M. L. Walsh, J. M. Hill and H. L. Ennis, JMB 114, 569 (1977).
74. A. C. Walker, M. L. Walsh, D. Pennica, P. S. Cohen and H. L. Ennis, PNAS 75, 1126 (1976).
75. D. Court, in “Control of Messenger RNA Stability” (J. G. Belasco and G. Brawerman, eds.), Chap. 5. Academic Press, San Diego, 1993.
76. D. Kennell and H. Riezman, JMB 114, 1 (1977).
77. L. W. Lim and D. Kennell, JMB 135, 369 (1979).
78. G. J. Murakawa, Ph.D. Dissertation. University of California, Los Angeles, 1988.
79. V. J. Cannistraro, M. N. Subbarao and D. Kennell, JMB 192, 257 (1986).
80. J. R. McCormick, J. M. Zengel and L. Lindahl, NARes 19, 2767 (1991).
81. V. J. Cannistraro and D. Kennell, JMB 182, 241 (1985).
82. C. Petersen, in “Control of Messenger RNA Stability” (J. G. Belasco and G. Brawerman, eds.), Chap. 6. Academic Press, San Diego, 1993.
83. J. R. McCormick, J. M. Zengel and L. Lindahl, JMB 239, 608 (1994).
84. J. E. Toivonen and D. P. Nierlich, Nature 232, 74 (1974).
85. V. J. Cannistraro and D. Kennell, J. Bact. 161, 820 (1985).
86. C. J. Decker and R. Parker, Trends Biochem. Sci. 19, 336 (1994).
87. V. J. Cannistraro and D. Kennell, Nature 277, 407 (1979).
88. P. Stanssens, E. Remaut and W. Fiers, Cell 44, 711 (1986).
89. L. A. Wagner, R. F. Gesteland, T. J. Dayhuff and R. B. Weiss, J. Bact. 176, 1863 (1994).
90. O. Yarchuk, N. Jacques, J. Guillerez and M. Dreyfus, JMB 226, 581 (1992).
91. M. J. Hansen, L.-H. Chen, H. L. S. Fejzo and J. G. Belasco, Mol. Microbiol. 12, 707 (1994).
92. C. P. Ehretsmann, A. J. Carpousis and H. M. Krisch, Genes Dev. 6, 149 (1992).
93. D. H. Bechhofer and D. Dubnau, PNAS 84, 498 (1987).
94. D. P. Nierlich, O. Fattal and T. Tuohy, FP 7, A1090 (1993).
95. J. Forchhammer, E. N. Jackson and C. Yanofsky, JMB 71, 687 (1972).
96. Y. Kano, H. Nakamura, R. L. Somerville and F. Imamoto, MGG 176, 379 (1979).
97. R. D. Mosteller, J. K. Rose and C. Yanofsky, CSHSQB 35, 461 (1970).
98. R. S. Gupta and D. Schlessinger, JMB 92, 311 (1974).
99. C. Yanofsky and J. Ito, JMB 24, 313 (1968).
100. P. Ziemke and J. E. G. McCarthy, BBA 1130, 297 (1992).
101. J. E. G. McCarthy, B. Gerstel, B. Surin, U. Wiedemann and P. Ziemke, Mol. Microbiol. 5, 2447 (1991).
102. J. B. Owolabi and B. P. Rosen, J. Bact. 172, 2367 (1990).
103. S. F. Newbury, N. H. Smith, E. C. Robinson, I. D. Hiles and C. F. Higgins, Cell 48, 297 (1987).
104. S. F. Newbury, N. H. Smith and C. F. Higgins, Cell 51, 1131 (1987).
105. B. J. A. M. Jordi, I. E. L. op den Camp, L. A. M. de Haan, B. A. M. van der Zeijst and W. Gaastra, J. Bact. 175, 7976 (1993).
106. M. Baga, M. Goransson, S. Normark and B. E. Uhlin, Cell 52, 197 (1988).
107. D. Schumperli, K. McKenney, D. A. Sobieski and M. Rosenberg, Cell 30, 865 (1982).
108. D. Georgellis, S. Arvidson and A. von Gabain, J. Bact. 174, 5382 (1992).
109. J. G. Belasco, G. Nilsson, A. von Gabain and S. N. Cohen, Cell 46, 245 (1986).
110. G. Nilsson, J. G. Belasco, S. N. Cohen and A. von Gabain, PNAS 84, 4890 (1987).
111. D. Bechhofer, in “Control of Messenger RNA Stability” (J. G. Belasco and G. Brawerman, eds.), Chap. 3. Academic Press, San Diego, 1993.
112. L. Chen, S. A. Emory, A. L. Bricker, P. Bouvet and J. G. Belasco, J. Bact. 173, 4578 (1991).
113. V. Rosenbaum, T. Klahm, U. Lundberg, E. Homgren, A. von Gabain and D. Riesner, JMB 229, 656 (1993).
114. P. Bouvet and J. G. Belasco, Nature 360, 488 (1992).
115. S. A. Emory and J. G. Belasco, J. Bact. 172, 4472 (1990).
116. S. A. Emory, P. Bouvet and J. G. Belasco, Genes Dev. 6, 135 (1992).
117. C. F. Higgins, H. C. Causton, G. S. C. Dance and E. A. Mudd, in “Control of Messenger RNA Stability” (J. G. Belasco and G. Brawerman, eds.), Chap. 2. Academic Press, San Diego, 1993.
118. M. Nomura, R. Gourse and G. Baughman, ARB 53, 75 (1984).
119. J. M. Zengel and L. Lindahl, This Series 47, 332 (1994).
120. L. Mattheakis, L. Vu, F. Sor and M. Nomura, PNAS 86, 448 (1989).
121. A. M. Fallon, C. S. Jinks, G. D. Strycharz and M. Nomura, PNAS 76, 3411 (1979).
122. A. M. Fallon, C. S. Jinks, M. Yamamoto and M. Nomura, J. Bact. 138, 383 (1979).
123. M. O. Olsson and K. Gausing, Nature 283, 599 (1980).
124. P. Singer and M. Nomura, MGG 199, 543 (1985).
125. G. A. Mackie, J. Bact. 173, 2488 (1991).
126. P. Sandler and B. Weisblum, JMB 203, 905 (1988).
127. J. F. Dimari and D. H. Bechhofer, Mol. Microbiol. 7, 705 (1993).
128. M. Mayford and B. Weisblum, EMBO J. 9, 4307 (1989).
129. M. Mayford and B. Weisblum, JMB 206, 69 (1989).
130. K. K. Hue and D. H. Bechhofer, J. Bact. 173, 3732 (1991).
131. D. H. Bechhofer and K. H. Zen, J. Bact. 171, 5803 (1989).
132. P. Sandler and B. Weisblum, J. Bact. 171, 6680 (1989).
133. J. Dreher and H. Matzura, Mol. Microbiol. 5, 3025 (1991).
134. A. G. Shivakumar, J. Hahn, G. Grandi, Y. Kozlov and D. Dubnau, PNAS 77, 3903 (1980).
135. N. H. Albertson and T. Nystrom, Fed. Euro. Microbiol. Soc. Lett. 117, 181 (1994).
136. D. Georgellis, T. Barlow, S. Arvidson and A. von Gabain, Mol. Microbiol. 9, 375 (1993).
137. M. D. Henry, S. D. Yancey and S. R. Kushner, J. Bact. 174, 743 (1992).
138. O. R. Lagoni, K. von Meyenburg and O. Michelsen, J. Bact. 175, 5791 (1993).
139. E. A. Mudd and C. F. Higgins, Mol. Microbiol. 9, 557 (1993).
140. C. Jain and J. G. Belasco, Genes Dev. 9, 84 (1995).
141. J. C. A. Bardwell, P. Regnier, S. Chen, Y. Nakamura, M. Grunberg-Manago and D. L. Court, EMBO J. 8, 3401 (1989).
142. C. Portier, L. Dondon, M. Grunberg-Manago and P. Regnier, EMBO J. 6, 2165 (1987).
143. E. Hajnsdorf, A. J. Carpousis and P. Regnier, JMB 239, 439 (1994).
144. G. Barry, C. Squires and C. L. Squires, PNAS 77, 3331 (1980).
145. N. P. Ambulos, Jr., E. J. Duvall and P. S. Lovett, Gene 51, 281 (1987).
146. C. M. Arraiano, S. D. Yancey and S. R. Kushner, J. Bact. 175, 1043 (1993).
147. G. Nilsson, U. Lundberg and A. von Gabain, EMBO J. 7, 2269 (1988).
148. E. B. O’Hara, J. A. Chekanova, C. A. Ingle and S. R. Kushner, PNAS 92, 1807 (1995).
149. K. Gorski, J. M. Rocho, P. Prentki and H. M. Krisch, Cell 43, 461 (1985).
149a. D. S. McPheeters, G. D. Stormo and L. Gold, JMB 201, 517 (1988).
150. L. Melin, H. Friden, E. Dehlin, L. Rutberg and A. von Gabain, Mol. Microbiol. 4, 1881 (1990).
151. J. G. Belasco, J. T. Beatty, C. W. Adams, A. von Gabain and S. N. Cohen, Cell 40, 171 (1985).
DECAY OF BACTERIAL MESSENGER RNA
213
152. 6 . J. Murakawa and D. P. Nierlich, Bchem 28, 8067 (1989). 153. C. D. Bieger and D. P. Nierlich, J . Buct. 171, 141 (1989). 153a. F. Xu and S. N. Cohen, Nature 374, 180 (1995). 154. M. A. Hediger, D. F. Johnson, D. P. Nierlich and I. Zabin, PNAS 82, 6414 (1985). 155. D. P. Nierlich, C. Kwan, G. J. Murakawa, P. A. Mahoney, A. W. Ung and D. Caprioglio, in “The Molecular Biology of Bacterial Growth (M. Schaechter, F. C. Neidhardt, J. L. Ingraham and N. 0. Kjeldgaard, eds.), p. 185. Jones and Bartlett, Boston, 1985. 156. H. Aiba, A . Hanamura and H. Yamano, JBC 266, 1721 (1991). 157. J. E . Mott, J. L. Galloway and T. Platt, EMBO J. 4, 1887 (1985). 158. M. J. Stern, G. F. Ames, N. H. Smith, E. C. Robinson and C. F. Higgins, Cell37, 1015 (1984). 159. M. Gilson, J.-M. Clement, D. Brutlag and M. Hofnung, EMBOJ. 3, 1417 (1984). 160. B. J. Meyer and J. L. Schottel, Mol. Microhiol. 6, 1095 (1992). 161. M. D. Plamaiin and G. V. Stauffer, MGG 220, 301 (1990). 162. 6. Guarneros, C. Montana, T. Hernandez and D. Court, PNAS 79, 238 (1982). 163. G. Guarneros, L. Kameyama, L. Orozco and F. VelBzquez, Gene 72, 129 (1988). 164. G. Plunker, I11 and H. Echols, J . Bact. 171, 588 (1989). 165. N. Panayotatos and K. Truong, NARes 13, 2227 (1985). 166. P. Alifano, C . Piscitelli, V. Blasi, F. Rivellini, A. G. Nappo, C. B. Bruno and M. S . Carlomagno, Mol. Microbid. 6, 787 (1992). 167. G. Klug, C. W. Adams, J. Belasco, B. Doerge and S. N. Cohen, EMBOJ. 6, 3515 (1987). 168. C.-Y. A. Chen, J. T. Beatty, S. N. Cohen and J. G. Belasco, Cell 52, 609 (1988). 169. G. Klug and S. N. Cohen, J. Bact. 173, 1478 (1991). 170. H. C. Wong and S. Chang, PNAS 83, 3233 (1986). 171. J. M. Romeo and D. R. Zusman, Mol. Microbiol. 6, 2975 (1992). 172. M. N. Hayashi and M. Hayashi, NARes 13, 5937 (1985). 173. H. Cduston, B. Py, R. S. McLaren and C. Higgins, Mol. Microbid. 14, 731 (1994). 174. 6 . Brawerman, in “Control of Messenger RNA Stability” (J. G . Belasco and G. Brawerman, eds.), Chap. 7. 
Academic Press, San Diego, 1993. 175. N. Ohta, M. Sanders and A. Newton, PNAS 72, 2343 (1975). 176. H. Nakazato, S . Venkatesan and M. Edmonds, Nature 256, 144 (1975). 177. J. W. Brown and J. N. Reeve, J. Buct. 166, 686 (1986). 178. C. W. Kim, P. Markiewicz, J. J. Lee, C. F. Schierle and J. H. Miller, JMB 231,960 (1993). 179. Y. Gopalakrishna, D. Langley and N. Sarkar, NARes 9, 3545 (1981). 180. J. Taljanidisz, P. Karnik and N. Sarkar, J M B 193, 507 (1987). 181. P. Karnik, J. Taljanidisz, M. S.-Szekely and N. Sarkar, JMB 196, 347 (1987). 182. G.-J. Cao and N. Sarkar, PNAS 89, 7546 (1992). 183. G.-J. Cao and N. Sarkar, Fed. Euro. Microhiol. SOC.Lett. 108, 281 (1993). 184. J. W. Brown and J. N . Reeve, J . Bact. 162, 909 (1985). 185. R. Hanschke and M. Hecker, J . Basic Microbiol. 26, 317 (1986). 186. Y. Copalakrishna and N. Sarkar, Bchem 21, 2724 (1982). 187. Y. Gopalakrishna and N. Sarkar, ABB 224, 196 (1983). 188. F. Xu, S. L.-Chao and S. N. Cohen, PNAS 90, 6756 (1993). 189. 6.-J. Cao and N. Sarkar, PNAS 89, 10380 (1992). 190. M. Masters, M. D. Colloms, I. R. Oliver, L. He, E. J. Macnaugliton and Y. Charters, J. B a t . 175, 4405 (1993). 192. T. Tomcsanyi and D. Apirion, J M B 185, 713 (1985). 193. M. P. Kalapos, G.-J. Cao, S. R. Kushner and N. Sarkar, BBRC 198, 459 (1994). 194. B. K. Ray and D. Apirion, J M B 149, 599 (1981). 195. B. Pragai and D. Apirion, J M B 154, 465 (1982).
214
DONALD P. NIERLICH AND GEORGE J. MURAKAWA
196. C. M. Arraiano, S. D. Yancy and S. R. Kushner, J. Bact. 170, 4625 (1988). 197. M. N. Subbarao and D. Kennell, J. B a t . 170, 2860 (1988). 198. P. Alifano, F. Rivellini, C. Piscitelli, C. M. Arraiano, C. B. Bruni and M. S. Carlomagno, Genes Deu. 8 , 3021 (1994). 199. R. S. Cormack, J. L. Generaux and G. A. Mackie, PNAS 90, 9006 (1993). 200. E. A. Mudd, P. Prentki, D. Belin and H. M. Krisch, EMBOJ. 7, 3601, (1988). 201. 0. Melefors, U. Lundberg and A. von Gabain, in “Control of Messenger RNA Stability (J. 6. Belasco and G . Brawerman, eds.), Chap. 4. Academic Press, San Diego, 1993. 202. S. K. Srivastava, V. J. Cannistraro and D. Kennell, J. B a t . 174, 56 (1992). 203. D. P. Nierlich and W. Vielmetter, J M B 32, 135 (1968). 204. J. J. Duffy, S. G . Chaney and P. D. Boyer, J M B 64, 565 (1972). 205. S. G. Chaney and P. D. Boyer, JMB 64, 581 (1972). 206. V. J. Cannistraro and D. Kennell, EJB 213, 285 (1993). 207. D. P. Fan, A. Higa, and C. Levinthal, J M B 8, 210 (1963). 208. F. Imamoto, J M B 74, 113 (1973). 209. R. S. Gupta and D. Schlessinger, J. B a t . 125, 84 (1976). 210. M. Y. Graham, M. Tal and D. Schlessinger, J. B a t . 151, 251 (1982). 211. J. Guillerez, M. Gazeau and M. Dreyfus, NARes 19, 6743 (1991). 212. 0. Yarchuk, I. Iost and M. Dreyfus, Biochimie 73, 1533 (1991). 212a. L. R. Rapaport and G. A. Mackie, J . Bact. 176, 992 (1994). 213. N. Jacques, J. Guillerez and M. Dreyfus, JMB 226, 597 (1992). 214. P. J. Lopez, I. Iost and M. Dreyfus, NARes 22, 1186 (1994). 215. I. Iost, J. Guillerez and M. Dreyfus, J. Baet. 174, 619 (1992). 216. M. Chevrier-Miller, N. Jacques, 0 . Raibaud and M. Dreyfus, NARes 18,5787 (1990). 217. M. P. Deutscher, Cell 40, 731 (1985). 218. M. P. Deutscher, Trends Biochem. Sci. 13, 13 (1988). 219. I. Iost and M. Dreyfus, Nature 372, 193 (1994). 220. K. 0. Kelly and M. P. Deutscher, J. B a t . 174, 6682 (1992). 221. K. 0 . Kelly, N. B. Reuven, Z. Li and M. P. Deutscher, JBC 267, 16015 (1992). 222. T. G. Kinscherf and D. 
Apirion, MGG 139, 357 (1975). 223. W. P. Donovan and S. R. Kushner, NARes 11, 265 (1983). 224. P. Donovan and S. E. Kushner, PNAS 83, 120 (1986). 225. E. Hajnsdorf, 0. Steir, L. Coscoy, L. Teysset and P. Regnier. EMBO J. 13, 3368 (1994). 226. Z. Li and M. P. Deutscher, JBC 269, 6064 (1994). 227. V. J. Cannistraro and D. Kennell, J M B 243, 930 (1994). 228. M. Kuwano, C. N. Kwan, D. Apirion and D. Schlessinger, PNAS 64, 693 (1969). 229. G. Guarneros and C. Portier, Biochirnie 72, 771 (1990). 230. R. S. McLaren, S. F. Newbury, G. S. C. Dance, H. C. Causton and C. F. Higgins, JMB 221, 81 (1991). 231. A. M. Reiner, J. Bact. 97, 1437 (1969). 232. S. D. Yancey and S. R. Kushner, Biochimie 72, 835 (1990). 233. D. Apirion, Genetics 90, 659 (1978). 234. B. K. Ghora and D. Apirion, Cell 15, 1055 (1978). 235. T. K. Misra and D. Apirion, JBC 254, 11154 (1979). 236. T K. Misra and D. Apirion, J . Bact. 142, 359 (1980). 237. E. A. Mudd, A. J. Carpousis and H. M. Krisch, Genes Deu. 4, 873 (1990). 238. M. Kuwano, M. Ono, H. Endo, K. Hori, K. Nakamura, Y. Hirota and Y. Ohnishi, MGG 154, 279 (1977). 239. M. Ono and M. Kuwano, J M B 129, 3-43 (1979).
DECAY OF BACTERIAL MESSENGER RNA
215
240. M. Kuwdno and M . Ono, in “Microbiology-1983’’ (D. Schlessinger, ed.), p. 86. American Society for Microbiology, Washington, D.C., 1983. 241. E. A. Mudd, H. M . Krisch and C. F. Higgins, Mol. Microbiol. 4, 2127 (1990). 241a . L. Taraseviciene, A. Miczak and D. Apirion, Mol. Microbiol. 5, 851 (1991). 242. 0. Melefors and A. von Gabain, Mol. Microbiol. 5, 857 (1991). 243. P. Babitzke and S. R. Kushner, PNAS 88, 1 (1991). 244. K. J. McDowall, R. G. Hernandez, S. Lin-Chao and S. N. Cohen, J . Bact. 175, 4245 (1993). 245. S. Casarkgola, A. Jacq, D. Laoudj, G . McGurk, S. Margarson, M. Tempete, V. Norris and I. B. Holland, J M B 228, 30 (1992). 246. A. K. Chauhan, A. Miczak, L. Taraseviciene and D. Apirion, NARes 19, 125 (1991). 247. F. Claverie-Martin, M. R. Diaz-Torres, S. I). Yancey and S. R. Kushner, J B C 266, 2843 (1991). 247a. L. Taraseviciene, S. Naureckiene and B. E. Uhlin, J B C 269, 12167 (1994). 248. A. J. Carpousis, 6. Van Houwe, C. Ehretsmann and H. M. Krisch, CeZl 76, 889 (1994). 249. S. K. Jain, B. Pragai and D. Apirion, BBRC 106, 768 (1982). 250. A. Miczak, R. A. K. Srivastava and I). Apirion, Mol. Microhiol. 5, 1801 (1991). 251. R. A. K. Srivastava, N. Srivastava and D. Apirion, Biochem. Znt. 25, 57 (1991). 252. M. K. Roy and D. Apirion, BBA 747, 200 (1983). 253. B. Sohlberg, U. Lundberg, F. Hartl and A. von Gabain, PNAS 90, 277 (1993). 2!54. P. K. Chanda, M. Ono, M. Kuwano and H. Kung, J. Bact. 161, 446 (1985). 255. D. Georgellis, B. Sohlberg, F. U . Hartl and A. von Gabain, MoZ. Microbiol. 16, 1259 (1995). 256. B. Py, H. Causton, E. A . Mudd and C. F. Higgins, Mol. Microbiol. 14, 717 (1994). 257. G . A. Mackie, J B C 267, 1054 (1992). 257a. G . A. Mackie and J. L. Genereaiix, J M B 234, 998 (1993). 258. S. Lin-Chm, T. Wong, K. J. McDowall and S. N. Cohen, J B C 269, 10797 (1994). 259. K. J. McDowall, S. Lin-Chao and S. N. Cohen, JBC 269, 10790 (1994). 260. K. J. McDowall, V. R. Kaberdin, S.-W. Wu, S. N . Cohen and S. 
Lin-Chao, Nature 374, 287 (1995). 261. V. Yajnik and G. N. Godson, JBC 268, 13253 (1993). 262. P. Nilsson and B. E. Uhlin, Mol. Microbiol. 5, 1791 (1991). 263. J. Chow and P. P. Dennis, Mol. Microbiol. 11, 919 (1994). 264 M . Faubladier, K. Cam and J. Bouche, J M B 212, 461 (1990). 265. P. Regnier and E. Hajnsdorf, J M B 217, 283 (1991). 266. A. M. Patel and S. D. Dunn, J. Buct. 174, 3541 (1992). 267. R. Baunieister, P. Flache, 0. Melefors, A. von Gabain and W. Hillen, NARes 19, 4595 (1991). 268. G . Gross, JBC 266, 17880 (1991). 269. C . Jain and N. Kleckner, MoZ. Microbid. 9, 233 (1993). 270. 6. Klug, S . Jock and R. Rothfuchs, Gene 121, 95 (1992). 271. R. J. Kokoska, K. J. Blumer and D. A. Steege, Biochimie 72, 803 (1990). 272. D. R. Gitelnian and D. Apirion, BBRC 96, 1063 (1980). 273. A. Wennborg, B. Sohlberg, D. Angerer, 6. Klein and A. von Gabain, PNAS. 92, 7322 (1995). 274. M. Wang and S. N. Cohen, PNAS 91, 10591 (1994). 275. U . Lundberg, A. von Gabain and 0. Melefors, E M B O J. 9, 2731 (1990). 276. H. D. Robertson, R. E. Webster and N. D. Zinder, J B C 243, 82 (1968). 277. H. D. Robertson, Cell 30, 669 (1982).
216 278. 279. 280. 281. 282. 283. 284. 285. 286. 287. 288. 289. 290. 291.
292. 293. 294. 295. 296. 297. 298.
DONALD P. NIERLICH AND GEORGE J. MURAKAWA
A. M. Nicholson, This Series. 52, l(1995). P. Babitzke, L. Granger, J. Olszewski and S. R. Kushner, J. Bact. 175, 229 (1993). P. Regnier and M. G. Manago, JMB 210, 293 (1989). U. Schmeissner, K. McKenney, M. Rosenberg and D. Court, J M B 176, 39 (1984). K. Gerdes and A. Nielsen, JMB 226, 637 (1992). V. Shen, M. Cynamon, B. Daugherty, H. Kung and D. Schlessinger, JBC 256, 1896 (1981). V. Shen, F. Imamoto and D. Schlessinger, J. Bact. 150, 1489 (1982). V. Talkad, D. Achord and D. Kennell, J. Bact. 135,528 (1978). M. Robert-Le M e w and C. Portier, EMBO J. 11, 2633 (1992). J. Meador 111, B. Cannon, V. J. Cannistraro and D. Kennell, EJB 187, 549 (1990). V. J. Cannistraro and D. Kennell, J . Bact. 173, 4653 (1991). V. J. Cannistraro and D. Kennell, EJB 181, 363 (1989). R. Kaplan and D. Apirion, JBC 249, 149 (1974). T. D. Yager and P. H. yon Hippel, in “Escherichia coli and Salmonella typhimurilam, Cellular and Molecular Biology (F. C. Neidhardt, ed.), Chap. 76. American Society for Microbiology, Washington, D.C., 1987. C. G. Burd and G. Dreyfuss, Science 265, 615 (1994). D. E. Morse, R. D. Mosteller and C. Yanofsky, C S H S Q B 34, 725 (1969). G. Mangiarotti, D. Schlessinger and M. Kuwano, J M B 60, 441 (1971). R. Holnies and M. F. Singer, BBRC 44, 837 (1971). A. L. M. Bothwell and D. Apirion, BBRC 44, 844 (1971). D. Apirion, MGG 122, 313 (1973). J. G. Belasco and C. F. Higgins, Gene 72, 15 (1988).
The Linker Histones and Chromatin Structure: New Twists

JORDANKA ZLATANOVA
Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331, and Institute of Genetics, Bulgarian Academy of Sciences, 1113 Sofia, Bulgaria

KENSAL VAN HOLDE¹
Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon 97331

I. Linker Histones: Properties and Interactions with Other Chromatin Components
   A. Properties of the Linker Histones
   B. Interactions of Linker Histones with DNA
   C. Interactions with High-mobility-group Proteins
II. The Importance of Linker Histones in Chromatin Fiber Structure
   A. Location of Linker Histones in the Nucleosome
   B. Linker Histones and the Chromatin Fiber at Low Ionic Strength
   C. Linker Histones and the Condensed Chromatin Fiber
III. What Do We Know? What Do We Need to Learn?
¹ To whom correspondence may be addressed.

The problem of how the fibers of chromatin are folded in the eukaryotic nucleus has interested biologists and biochemists for decades. It has long been recognized that the histones play a major part in this folding (see Ref. 1 for the earlier history). However, the distinctly different roles of the histones H2A, H2B, H3, and H4 on one hand, and the lysine-rich histones such as H1 and its cognates on the other, were not understood until after the discovery of nucleosomes in the early 1970s. It then became clear that the first four were constituents of the nucleosome core particle, whereas the lysine-rich histones were somehow associated with the “linker” DNA between core particles. Accordingly, the former have come to be called “core” histones and the latter “linker” histones.

Wrapping of 146 bp of DNA about the histone octamer to form the core particle provides one level of folding (a compaction ratio of about 5:1), but this cannot account for the many thousandfold condensation afforded the DNA in the eukaryotic nucleus. The “string-of-beads” structure observed in early electron microscopic studies (2-4) obviously could not satisfy the compaction requirement. It soon became evident that there must exist some level or levels of higher order folding of the chromatin fiber. Accordingly, extensive research in the late 1970s was directed toward a search for the details of such structure. In a seminal paper, Finch and Klug (5) showed that the extended nucleosomal filaments condense into irregular fibers of about 30 nm diameter in the presence of even low concentrations (0.2 mM) of Mg²⁺. Based on earlier X-ray diffraction studies of chromatin fibers as well as their appearance in the electron microscope, these authors proposed a “solenoid” model, in which nucleosomes were wrapped into a regular helix with a pitch of about 11 nm. Later studies (6, 7) provided much more information. It was shown that increasing concentrations of either monovalent or divalent cations resulted in a progressive condensation of the fiber (Fig. 1). At very low ionic strength, the fibers appear to lie on the grid in a flattened zig-zag conformation, and contract to the condensed fiber at about 60 mM Na⁺ or 0.3 mM Mg²⁺. The formation of well-defined fibers requires the presence of lysine-rich histones such as H1. A kind of condensation could occur in the absence of such proteins, but this led only to irregular aggregates.

FIG. 1. Electron micrographs of rat liver chromatin fibers under three different ionic conditions: (A) 0 mM NaCl, (B) 20 mM NaCl, and (C) 75 mM NaCl. All samples were fixed in 0.1% glutaraldehyde, 5 mM triethanolamine, 0.2 mM EDTA, pH 7.4, at the corresponding ionic strength. Reproduced with permission from Ref. 7.

These observations provided the incentive for the generation of a series of models for the condensed fiber in the years that followed. These have often been described (see, for example, Refs. 1, 8, and 9). We believe that it is not important to describe such models here, for there exists no substantial evidence to suggest that any significant fraction of chromatin exists in any regular helical structure. A detailed argument is given elsewhere (10), but the main points can be summarized as follows. First, in none of the many electron-microscope studies of chromatin that have been published do we observe more than minute patches of what might be considered a regular helix. Second, the low-angle scattering patterns (X-ray or neutron) from chromatin fibers do not provide conclusive evidence for any significant amount of regular helix. The major features of the scattering patterns can be accounted for by a random arrangement of nucleosomes on a cylinder (11). Recent scanning-force microscope (SFM) studies (12-14) of chromatin fibers at low ionic strength reveal irregular, helixlike structures whose conformation is explicable in terms of a distribution of linker lengths (see Section II,B,1) (Fig. 2). It seems unlikely that such an irregular fiber should condense into a regular helix as the salt concentration is raised. In fact, recent electron-microscope studies on chromatin fibers in frozen-hydrated or low-temperature embedded sections of nuclei also show irregular structures (15, 16) (Fig. 3). Finally, studies of chromatin in aqueous solution, using the new technique of X-ray contact microscopy, also show primarily irregular structures (17).

FIG. 2. Model of the chromatin fiber, its simulated scanning-force microscope image, and an actual SFM image. (A) Model of a chromatin fiber used in these simulations. The DNA wraps in a left-handed fashion 1.75 turns around a histone octamer. The octamer is simulated by a disk, 5.5 nm high and 11 nm in diameter. The radius of curvature of the DNA wrapped around the core is 5.5 nm, and the pitch of the DNA is 2.86 nm. The DNA has an average of 10.15 bp/turn around the histone octamer and 10.4 bp/turn in the linker portion. The exit angle of the DNA is determined by the tangent at the point it leaves the nucleosome. The length of the linker DNA is determined using a uniform deviate random-number algorithm that generates linker lengths between 60 and 64 bp. The linker DNA is assumed to adopt a straight configuration between nucleosomes. The model generates three-dimensional, randomly organized fibers with an average diameter of 30 nm. (B) The model in A as it appears on a plane after partial flattening to simulate the process of deposition on the mica. (C) Simulated SFM image of the model in A, obtained by convoluting the plane projection in B, assuming a parabolic tip with a radius of curvature of 10 nm. (D) Image of C viewed from a 30° inclination angle. (E) A 30° view of an experimental SFM image of a fixed chromatin fiber deposited on mica from 5 mM triethanolamine-HCl, pH 7.0. Image sizes 400 nm × 400 nm (B-E). Reproduced with permission from Ref. 13.

FIG. 3. Single slices (a-d) from tomographic reconstructed volumes of electron-microscope images of nuclei embedded at low temperature and stained with a nucleic-acid-specific stain. Different views of nucleosome-associated and linker DNA (arrowheads) within in situ chromatin fibers from starfish sperm. Bars, 30 nm. Reproduced from The Journal of Cell Biology, 1994, Vol. 125, p. 1 by copyright permission of the Rockefeller University Press.

Whether the fiber has a regular structure or not, much experimental evidence shows that at least some portions of chromatin exist in a rodlike condensed form under physiological conditions. It is also clear that lysine-rich (linker) histones are essential for the proper condensation into this fiber. It is our purpose here to examine the following questions: How are these proteins arranged in chromatin? How do they aid in its condensation? What role do they play in regulation of dynamic chromatin processes like transcription? Although we cannot discuss the latter issue in detail (for recent reviews, see Refs. 18 and 19), we will, during the course of the presentation of the structural issues, touch on the consequences of H1 binding for the process of transcription and its regulation. To approach these questions, we begin with a description of the various linker histones.
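The geometric parameters quoted in the legend to Fig. 2 lend themselves to a small numerical illustration. The Python sketch below is not the simulation used for Fig. 2; it only draws linker lengths uniformly between 60 and 64 bp, as the legend describes, and converts each length into the rotation accumulated along a straight linker at 10.4 bp/turn. The rise of 0.34 nm per base pair, the function names, and the definition of the compaction ratio as DNA contour length divided by particle diameter are assumptions introduced here for illustration.

```python
import random

BP_RISE_NM = 0.34          # assumed B-DNA rise per base pair (not given in the text)
LINKER_BP_PER_TURN = 10.4  # helical repeat of linker DNA, from the legend to Fig. 2
CORE_BP = 146              # DNA wrapped in the core particle, from the text


def sample_linker_lengths(n, low=60, high=64, seed=0):
    """Draw n integer linker lengths (bp) uniformly from low..high, inclusive."""
    rng = random.Random(seed)
    return [rng.randint(low, high) for _ in range(n)]


def linker_twist_deg(linker_bp, bp_per_turn=LINKER_BP_PER_TURN):
    """Rotation in degrees (0 to 360) accumulated along a straight linker."""
    return (linker_bp / bp_per_turn * 360.0) % 360.0


def linker_length_nm(linker_bp, rise_nm=BP_RISE_NM):
    """Contour length of a straight linker in nanometres."""
    return linker_bp * rise_nm


def core_compaction_ratio(core_bp=CORE_BP, particle_diameter_nm=11.0,
                          rise_nm=BP_RISE_NM):
    """Contour length of the core DNA divided by the particle diameter (assumed definition)."""
    return core_bp * rise_nm / particle_diameter_nm


# Ten hypothetical linkers and the relative rotation each one imposes.
linkers = sample_linker_lengths(10)
twists = [linker_twist_deg(bp) for bp in linkers]
```

Under these assumptions a change of a single base pair in linker length rotates the next nucleosome by 360/10.4, or about 35°, so that even the narrow 60-64 bp range spreads successive nucleosomes over very different relative orientations. `core_compaction_ratio()` returns roughly 4.5, of the same order as the ratio of about 5:1 quoted above.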
I. Linker Histones: Properties and Interactions with Other Chromatin Components

A. Properties of the Linker Histones

1. PRIMARY STRUCTURE AND VARIANTS
The lysine-rich histones are characterized by a high lysine-to-arginine ratio (on the average, about 15), after which they are named. The amino-acid composition is dominated by basic amino acids: the net charge of the molecule is usually between +50 and +60. The different types of amino acids are distributed quite nonuniformly along the polypeptide chain. The N- and C-terminal regions contain many lysine, arginine, and proline residues, whereas the central part of the molecule is considerably less basic and contains the bulk of the hydrophobic amino-acid residues. The asymmetric distribution of these residues along the chain is reflected in the secondary and tertiary structures typical of these histones (see Section I,A,2). Representative primary structures are shown in Fig. 4.

H1-CHICK     1  setapvaapavSAPgakaA AKkpkkaaggAkprkpagPsvtelItkAvsAsKeRkGlSla
H1°-HUMAN    1  TEnstSAP A AK PKRaKASkkStdHPkYSdMIvAAIqAEKnRaGSSRQ
H5-CHICK     1  TEslvlsP ApAK PKRvKASrrSasHPtYSeMIaAAIrAEKsRgGSSRQ

H1-CHICK    61  alkKalaaggydvEknnSrIKLglKsLVskGtLvQTKGtGASGSFklnkKpgEtKakatkK
H1°-HUMAN   48  SIQKYIKSHYKVGENADSQIKLSIKRLVttGVLKQTKGVGASGSF RLAKSDEpKKsvafK
H5-CHICK    49  SIQKYIKSHYKVGhNADlQIKLSIrRLlaaGVLKQTKGVGASGSF RLAKSD kaKrspgK

H1-CHICK   122  KpaakpKKpaakkpaaaAkKPKKAAavkkspKkakkspKkakkPaaaAtKKaAksP~tkagrpkkt
H1°-HUMAN  108  KtKKeiKK vatPKKAsKPKKAAskaPtKKPKATPvkKAkKKlAAtPKKA KKPKTVK
H5-CHICK   108  K KK avrrstsPKKAarPrKA rsPaKKPKAT arKArKKsrAsPKKA KKPKTVK

H1-CHICK   183  AKsPaKAkavKPKaAKskaaKPKAakakKaAtKKK
H1°-HUMAN  164  AK PvKA SKPKKAK PvKPKAKSsAKrAqKKK
H5-CHICK   160  AK srKA SKaKKvK rsKPrAKSgArkspKKX

FIG. 4. Alignment of representative primary structures of individual members of the lysine-rich histone family: chicken H1 (187), human H1° (188), and chicken H5 (189). Alignment was carried out using Intelligenetics software. Capital letters denote amino-acid residues conserved in at least two of the three sequences. Courtesy of Dr. S. H. Leuba, University of Oregon, Eugene.

In each cell, the lysine-rich histones are represented by several molecular types (subfractions, isohistones), which differ in molecular mass, amino-acid composition and sequence, and physico-chemical and immunochemical properties (for a review, see Ref. 20). The microheterogeneity is species and tissue specific. In general, the tissue specificity is expressed more as a difference in the relative content of various subfractions than as presence or absence of some of them.

However, there are some examples of tissue-specific linker histones. The best studied is histone H5, which accumulates during the process of terminal differentiation of some nucleated erythrocytes and is believed to be involved in the strong chromatin compaction and transcriptional inactivation in this cell type (21, 22). Other examples of tissue-specific variants are the subfractions present in gametes or in somatic cells that differentiate into such cells. A well-studied representative of these proteins is H1t (23), which appears only in the mammalian testis during a specific stage of development of the spermatocyte. The sperm-specific proteins of a number of invertebrate species also belong to the tissue-specific members of the histone H1 class.

Some H1 subtypes are observed only during embryonic development. Typical examples of these are H1cs and H1α, which are expressed only during the earliest stages of embryonic development of the sea urchin Strongylocentrotus purpuratus (24). Similarly, the maternally inherited variant H1M (protein B4) is only available until the midblastula stage of development of Xenopus laevis, at which point it is replaced by somatic-type variants (25, 26).

Qualitative and quantitative differences in H1 subtypes have also been observed between normal and malignantly transformed tissues. Especially intriguing are the changes in the amount of histone H1°. This histone, first identified as characteristically present in mammalian tissues with little cell division, has been subsequently implicated in the establishment and maintenance of the terminally differentiated state (27).
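The two composition measures used above, the lysine-to-arginine ratio and the net charge, are simple to compute from a sequence. The following Python sketch assumes that the net charge can be approximated by counting Lys and Arg as +1 and Asp and Glu as -1, ignoring histidine and the chain termini; the function name and the short test peptide are hypothetical and are not taken from Fig. 4.

```python
def composition_stats(sequence):
    """Return length, Lys/Arg counts and ratio, and approximate net charge.

    Case is ignored, and any non-letter character (gaps, digits, spaces)
    is skipped, so aligned sequences can be passed in directly.
    """
    residues = [ch for ch in sequence.upper() if ch.isalpha()]
    lys = residues.count("K")
    arg = residues.count("R")
    asp = residues.count("D")
    glu = residues.count("E")
    # Avoid division by zero for sequences without arginine.
    ratio = lys / arg if arg else float("inf")
    # Assumed charge model: +1 for K and R, -1 for D and E, nothing else counted.
    net_charge = lys + arg - asp - glu
    return {
        "length": len(residues),
        "lys": lys,
        "arg": arg,
        "lys_to_arg": ratio,
        "net_charge": net_charge,
    }


# Hypothetical 15-residue peptide, for illustration only.
example = composition_stats("AKKPKKAARKSPEKD")
```

Applied separately to the N-terminal, globular, and C-terminal portions of a linker histone, the same function would give a rough picture of the asymmetric distribution of basic residues described above.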
2. TERTIARY STRUCTURE AND ASSOCIATION
In aqueous solution at physiological values of pH and ionic strength, the lysine-rich histones consist of three structurally distinct domains: a strongly basic, unstructured fragment at the N terminus (“nose”), a nonpolar central globular domain (“head”), and, again, a strongly basic unfolded domain at the C terminus (“tail”) (28) (Fig. 5A). The overall fold of the globular domains of histones H5 and H1 (GH5, GH1) has been determined using two-dimensional NMR (29, 30). The NMR results agree well with the recent crystal structure of GH5 solved to 2.5 Å resolution (31) (Fig. 5B). The domain consists of a three-helix bundle, with a β hairpin at the C terminus. The structure explains the strong conservation of certain residues at defined positions. Thus, for instance, Gly-79 is involved in making a sharp bend in the polypeptide chain, in going from the end of helix III into the β hairpin. Amazingly, the structure is very similar to that of the bacterial DNA-binding protein CAP (catabolite gene activator protein) and also to the structure of the DNA-recognition motif of the Drosophila transcription factor HNF-3 (32) (Fig. 5B), although the sequences of these proteins show little similarity.

FIG. 5. Tertiary structure of histone H5. (A) Schematic presentation of the “nose-head-tail” overall structure of the linker histones. (B) Schematic diagram of the globular domain of histone H5 built on the basis of crystallographic data: comparison with the structures of the DNA-binding domains of the bacterial protein CAP and the Drosophila transcription factor HNF-3. Fig. 5B kindly supplied by Dr. V. Ramakrishnan, Brookhaven National Laboratory, Upton, NY. Reprinted with permission from Nature (Ref. 31). Copyright 1993 Macmillan Magazines Limited.

The nose-head-tail structure of the members of the H1 family is so strongly conserved that it is often used as a criterion for identification of H1 proteins. It should be noted, however, that some proteins identified as H1 histones by a number of other criteria do not contain a typical globular domain. Such, for instance, are the Tetrahymena H1 (33), the H1 from the hypotrichous ciliate Euplotes (34), and also one form specific to a terminally differentiated cell type (35).

The linker histones constitute the most evolutionarily variable histone class. The various portions of these molecules are characterized by a different degree of variation, both in evolution and among subtypes from one and the same tissue or organism. The globular region is relatively well preserved during evolution and is almost identical in individual representatives of the H1 complement from a given source. The sequence differences are mainly located in the polycationic termini, both among subtypes and in evolution.

It is still not certain to what extent individual linker histone molecules interact with one another in solution. Chemical cross-linking can lead to the formation of H1 oligomers in solution (1). However, control experiments indicate that the oligomers could have arisen as a result of collision of molecules in solution with some contribution from short-lived complexes (36), in accordance with conclusions from earlier work (37, 38). On the other hand, H1 molecules have been shown to aggregate in solution at NaCl concentrations over 50 mM, mainly as a result of interactions among the globular domains (39). These results suggest that the action of H1 in compaction of the fiber (see Section II,C,3) could be due to specific interactions among globular domains. In apparent support of this idea, it has recently been reported that the globular domain of H5 may self-associate in a specific way in solution, whereas similar self-association was not observed for the globular domain of histone H1 (40). However, others report that the globular domains of H1 and H5 show little, if any, tendency to associate, even under a wide variety of ionic conditions and protein concentrations (41, 42). How these seeming discrepancies between results from different laboratories are to be resolved is unclear.
3. SYNTHESIS AND MODIFICATION

Like the other histones, H1 is synthesized most intensely during DNA replication. However, a considerable amount of H1 can also be synthesized outside the S phase, especially in G, (43). It is important to note that the synthesis of different linker histone variants may be subject to different types of regulation. For example, transcription of the H5 gene is not cell-cycle regulated (44). Similarly, the synthesis of H1° shows a complex pattern of regulation, depending on the cell type, the physiological state, and the status of differentiation (27). The molecular mechanisms governing the metabolism of the linker histones have recently been reviewed (45).

Histone H1 undergoes two major postsynthetic modifications: phosphorylation and poly(ADP-ribosyl)ation. Despite the intense effort that has gone into the study of these modifications, their role still remains obscure. Bradbury and co-workers (46) suggested that H1 phosphorylation might trigger the condensation of chromosomes during mitosis. However, exceptions to the long-held correlation between H1 phosphorylation and chromatin condensation have been reported in a number of systems, in which the modification correlates best with decondensation of chromatin (for a review, see Ref. 47). These exceptions call for a reconsideration of how the role of H1 phosphorylation might depend on the specific requirements of different systems. The role of the other major type of modification, poly(ADP-ribosyl)ation, is also far from clear. The prevailing view is that it is mainly connected to processes involved in DNA repair (1).
B. Interactions of Linker Histones with DNA

1. INTERACTION WITH LINEAR DNA
The belief that the linker histones interact primarily with the DNA in chromatin has led to numerous studies of complexes between pure DNA and these histones (48). Several lines of evidence suggest that artificial H1/DNA complexes can be used as appropriate model systems for studying the role of the histone in chromatin. First, the location of the histone molecules in the chromatin fiber is such as to allow chemical cross-linking, with the formation of H1 homopolymers (41, 49). Similar homopolymers can be derived from cooperatively formed complexes of H1 with linear DNA (50). Second, the saturation ratio of bound H1 to eukaryotic DNA in 0.5 M salt is about the same for naked and chromatin DNA (51), corresponding to one strong H1-binding site per 150 bp. This packing density corresponds to occupancy of only the strongest binding sites, because 0.5 M salt represents the borderline for dissociation of H1 from DNA. At lower salt concentrations, H1 molecules will bind nonspecifically to DNA with a density of one H1 for about 35-40 bp. The average nonspecific packing densities of different H1 subtypes in their complexes with DNA correlate well with the average linker DNA length of the chromatins with which the respective histone subtypes are associated in vivo (52). Finally, there is a correspondence between the salt level at which compaction of the nucleosomal fiber occurs and the salt concentration required for formation of aggregated H1/DNA complexes.
Depending on the ionic strength and the histone/DNA ratio, the appearance of the complexes may differ significantly: thin filaments of two DNA molecules (or possibly DNA duplex hairpins) bridged by histone molecules, forming soluble complexes at low salt concentration; rodlike or cablelike structures (consisting of thin filaments packed side by side) at higher salt concentrations; circles; and doughnut-shaped structures have all been observed (48, 53-55). The more condensed structures appear to form at higher ionic strength and higher DNA concentrations. The formation of “doughnuts” takes place under conditions of extensive neutralization of the negative charges of the DNA molecule (physiological ionic strength and high H1/DNA ratio). With the use of scanning-force microscopy (SFM) we have recently observed globular complexes between 146-bp core particle DNA fragments and chicken erythrocyte H1 at low ionic strength (S. Leuba and J. Zlatanova, unpublished). In these complexes the path of the DNA is not resolved, but from their general appearance it seems that it must be severely bent, perhaps wound about a core of histone H1 molecules.
The average diameter of these globules is around 50 nm; if the DNA is on the outside, this would correspond to a radius of curvature of the double helix of around 25 nm, which is of the order of the value reported for spontaneously bent DNA molecules whose negative charges had been neutralized by either divalent cations or polyamines (56-58). Inner radii of similar values have been observed in toroids formed by the interaction of isolated C-terminal domains of H1 with DNA (55). Under the conditions of the latter study, however, the intact protein complexed DNA in particles of much greater radius (about 60 nm). It is perhaps physiologically relevant that linker histone binding can cause the DNA to bend so significantly, forming doughnut or toroid structures of curvature close to the curvature of the DNA in the condensed chromatin fiber. The interaction of the linker DNA with the linker histones may be the major determinant of the minimal radius of the chromatin fiber, and hence of the average number of nucleosomes per turn in the irregular helical structure (see Section II,C,1; see Refs. 59-63 for theoretical analysis of the role of charge neutralization of the DNA phosphates by histone H1 in DNA bending and higher order folding of polynucleosomes).
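As a rough check on the geometry described above, the following Python sketch computes the radius of curvature implied by a 50-nm globule and the fraction of a full circle that a single 146-bp fragment would span at that radius. It is an illustration only: the rise of 0.34 nm per base pair is the usual textbook value for B-form DNA and is an assumption not stated in the text, as are the function names and the idealization of the DNA path as a circular arc on the outside of the globule.

```python
import math

# Assumed rise per base pair for B-form DNA (textbook value, not from the text).
RISE_NM_PER_BP = 0.34


def radius_of_curvature_nm(globule_diameter_nm):
    """Radius of curvature if the DNA lies on the outside of the globule."""
    return globule_diameter_nm / 2.0


def fraction_of_turn(n_bp, radius_nm, rise_nm_per_bp=RISE_NM_PER_BP):
    """Fraction of a full circle covered by n_bp of DNA bent to radius_nm."""
    contour_length_nm = n_bp * rise_nm_per_bp
    circumference_nm = 2.0 * math.pi * radius_nm
    return contour_length_nm / circumference_nm


radius = radius_of_curvature_nm(50.0)   # globule diameter quoted in the text
turn = fraction_of_turn(146, radius)    # 146-bp core particle DNA fragment
print(f"radius of curvature: {radius:.1f} nm")
print(f"fraction of a turn spanned by 146 bp: {turn:.2f}")
```

Under these assumptions a single 146-bp fragment spans roughly one-third of a turn at a 25-nm radius.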
2. COOPERATIVITY OF INTERACTION
Depending on ionic strength, DNA concentration, and histone/DNA ratio, the interaction of histone H1 with DNA will show varying degrees of cooperativity (48). Watanabe (64), using very dilute solutions, has demonstrated cooperative binding on single DNA molecules, with a cooperativity parameter (the ratio of the cooperative binding constant to the nucleation binding constant) varying from about 3 × 10² at 20 mM salt to about 10³ at 200 mM. At higher DNA concentrations, the behavior becomes more complex, with the formation of multistranded complexes beginning in a range of NaCl concentration between 20 and 50 mM. Under these conditions, the cooperativity becomes so extreme that the histone is distributed so as to produce free DNA and saturated multistrand complexes. Interestingly, and in contrast to the behavior of histone H1, at these high DNA concentrations histone H5 interacts with DNA cooperatively at all ionic strengths (52). Similarly, the globular domains of both H1 and H5 interact strongly cooperatively at all ionic strengths (42, 65). Whether these aspects of the binding behavior of the linker histones to pure DNA also hold true for the binding to chromatin is still unclear (48).
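To give a feel for the magnitude of the cooperativity parameters quoted above, the ratio can be converted into an equivalent free energy of interaction between neighboring bound histones through the standard relation ΔG = −RT ln ω. The sketch below is not a calculation from the cited study: the symbol ω, the function name, and the temperature of 298.15 K are assumptions made for illustration.

```python
import math

# Gas constant in kcal/(mol K).
R_KCAL_PER_MOL_K = 1.987e-3


def cooperativity_free_energy(omega, temperature_k=298.15):
    """Free energy (kcal/mol) equivalent to a cooperativity parameter omega.

    omega is the ratio of the cooperative binding constant to the
    nucleation binding constant; the temperature is an assumed value.
    """
    return -R_KCAL_PER_MOL_K * temperature_k * math.log(omega)


# Cooperativity parameters quoted in the text for two salt concentrations.
for salt_mM, omega in [(20, 3e2), (200, 1e3)]:
    dg = cooperativity_free_energy(omega)
    print(f"{salt_mM} mM salt: omega = {omega:.0e}, dG = {dg:.2f} kcal/mol")
```

On this scale the quoted change in cooperativity between 20 and 200 mM salt corresponds to less than 1 kcal/mol of additional interaction free energy.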
3. PREFERENTIAL BINDING TO CROSSOVERS IN DNA
As early as 1975 it was shown that histone H1 prefers binding to superhelical over linear or relaxed circular DNA (67, 68). Later, this preference was confirmed in direct competition experiments (69, 70). A simple demonstration of this preference is shown in Fig. 6. Here, a mixture of supercoiled, linear, and relaxed circular DNA has been titrated with H1. The data demonstrate that the supercoiled DNA is shifted rapidly in mobility, whereas the linear and the relaxed DNA forms are unaffected, even though they are present in excess over the supercoiled form. The H1 concentrations used are such that there are many more histone molecules than DNA writhes, yet all of the histone appears to bind to the superhelical forms. This is consistent with the notion that H1 binds to supercoil crossovers with high affinity, triggering the binding of additional H1 molecules to the same DNA molecules in a cooperative fashion. However, in all these experiments plasmid or viral DNA preparations were used that were incompletely characterized with respect to their topological state. In such preparations the superhelical density was not precisely known and it was unclear how much torsional deformation of the DNA accompanied supercoiling. More importantly, the high superhelical tension would be expected to create alternative non-B-DNA conformations such as cruciforms or Z-DNA at specific nucleotide sequences. If the histone binds
FIG. 6. Preference of histone H1 for supercoiled DNA forms. A mixture of supercoiled, linear, and relaxed DNA, obtained by treatment of supercoiled pBR322 (lane 1) with the single-strand-specific endonuclease P1, was titrated with increasing amounts of histone H1, as designated above the figure (lanes 2-7). M, Marker lane, containing BstEII-digested λ DNA. Note that although the linear and relaxed DNA bands show no change in electrophoretic mobility, the superhelical form is increasingly retarded as the amount of H1 present in the incubation mixture increases. Courtesy of Dr. M. Ivanchenko, Oregon State University, Corvallis.
preferentially to some such structures, as turns out to be the case (see below), the preference for supercoiled DNA per se might be illusory. This issue has recently been addressed by using supercoiled plasmid DNA partially prerelaxed with topoisomerase I, so that a population with a narrow distribution of topoisomers of low linking-number difference was obtained (71). None of these topoisomers would contain alternative non-B-DNA structures. The order of disappearance of individual topoisomers from electrophoretic gels as a result of H1 binding again indicated a clear preference for initial histone binding to molecules containing crossovers of double-helical DNA, followed by cooperative binding to these same molecules. The same effect was observed with the isolated globular domain of H5; this suggested that the preference for binding to crossovers is determined by the existence of two specific DNA-binding sites in the linker histone globular domains (31). Crossovers of two double-stranded DNA regions are structurally similar to four-way junction DNA (72). Four-way junction DNA is a biologically
relevant structure in that it is present during genetic recombination (Holliday junctions) and in cruciforms extruding from palindromic sequences in supercoiled DNA. A number of proteins that specifically recognize and bind to these structures have been identified (73), including high-mobility-group protein 1 (HMG1), a member of the other class of major linker DNA-binding proteins (74). Studies on the binding of histone H1 to synthetic four-way junctions have shown that the histone forms a defined complex with these structures even in the presence of a vast excess of linear nonspecific competitor DNA (75) (Fig. 7). The four-way junction also competes efficiently against linear DNA molecules that have the same sequence information as the four-way junction, and against “incomplete” junctions (Fig. 8). The difference in affinity between a four-way junction and linear DNA is so great that binding to the former can be regarded as “specific” and binding to the latter as “nonspecific.” As in the case of binding to crossovers of DNA in supercoiled plasmids, the globular domain by itself can bind strongly; however, whereas the intact histone forms a single complex, multiple copies of the globular domain can bind to
FIG. 7. Titration of binding of linker histones H1 and H5 to four-way junction DNA in the presence of competitor DNA. Indicated concentrations of H1 (A) or H5 (B) were incubated with 2.1 nM of four-way junction DNA in the presence of 50 µg/ml competitor (salmon testis) DNA and analyzed by mobility-shift assay. The concentration of polyacrylamide was 5%. Reproduced with permission from Ref. 76.
FIG. 8. Competition between four-way junction DNA, linear control duplexes, and incomplete junctions for binding of H1. Appropriate single-stranded oligonucleotides were annealed in 10 mM Tris-HCl (pH 7.45), 100 mM NaCl, 10 mM MgCl2, 1 mM EDTA by heating to 70°C for 3 minutes and cooling down to 0°C over a period of 4 hours. H1 (25 nM) was incubated with labeled four-way junction DNA (0.5 nM) and the indicated amounts of unlabeled four-way junction, unlabeled incomplete junctions, or control duplexes, having the same sequence information as the four-way junction molecule. The products of the interaction were analyzed by mobility-shift assay on polyacrylamide gels. Reproduced with permission from Ref. 75.
the same four-way junction (76). Finally, the affinity of H5 for a four-way junction is higher than that of H1.
Binding of intact H1 to four-way junctions is inhibited by cations, Mg2+ and spermidine being much more effective inhibitors than Na+ (76). This inhibition is not likely to be a general ion-competition effect, for Mg2+ shows much less inhibition of the nonspecific binding of H1 to linear DNA. Instead, the inhibition of binding to the four-way junction may be due to ion-dependent changes in the conformation of the junction. In the absence of specific cations the four-way junction exists in a square planar configuration; in the presence of sufficient concentrations of these ions, the junction folds into an X-shaped structure with two quasicontinuous, coaxially stacked helices (73) (Fig. 9). The transition between these two conformations results in a considerable change in the angles between the four arms of the junction. The results from such experiments suggest that the linker histones may prefer the square planar conformation in which the angle between arms is 90°. It is worth noting that 1.75 superhelical turns of the DNA around the octamer of histones would produce an angle of 90° between the DNA strands entering and exiting the nucleosome. It seems likely from the above results that the linker histones bind strongly to such structures, thereby fixing the entry-exit angle and contributing to the formation of the three-dimensional chromatin fiber (see Section II,B,2). Using planar four-way junctions as a model for crossovers of linker DNA
FIG. 9. Schematic of four-way junction folding (panels labeled Unfolded, Folded, and Stacked X-structure). In the absence of metal ions, the junction is maximally extended in a planar configuration. Binding of metal ions reduces the electrostatic repulsion of the phosphates in the DNA backbone to the point at which helix-helix stacking may occur. The parallel alignment of the quasicontinuous helices is unstable, again due to electrostatic repulsion along the length of the helices, resulting in a rotation into the X-structure. The angles between the arms in the unfolded state are around 90° and those in the stacked structure, 60° and 120°, respectively. (Reproduced with permission from Ref. 73.)
at the entry and exit to nucleosomes, competition experiments have been performed between linker histones H1 or H5 and HMG1 (77). The interest in such studies was generated by the fact that the two major groups of linker DNA-binding proteins (the linker histones and HMG1/2) have opposite roles in transcription: histone H1 and its variants seem to act as repressors (18, 19), whereas HMG1 and HMG2 are reported to act as general transcriptional activators (see Ref. 77, and references therein). In competition experiments in which the two types of proteins were added either simultaneously or successively to the incubation mixture, it was shown that HMG1 can compete efficiently with H1 for binding to four-way junctions (Fig. 10A). In contrast, histone H5 seemed refractory to displacement by HMG1 (Fig. 10B). The difference between histones H1 and H5 correlates well with known effects on transcription: whereas H1-containing chromatin can be transcriptionally activated, the presence of H5 seems irreversibly to preclude transcription in the silent genome of nucleated erythrocytes. These observations suggest that direct displacement of histone H1 by HMG1 on the nucleosome might be part of the mechanism of gene activation by HMG1 (see also Section I,C).
4. SEQUENCE-SPECIFIC PREFERENCE OF BINDING
In addition to showing a preference for certain kinds of DNA structures, linker histones may also prefer certain sequences for binding. These preferences fall into two classes. First, histone H1 exhibits a general preference for binding to (A + T)-rich DNA regions (48, 78). The compact β-turn structure
[Figure 10: mobility-shift gels, panels A (H1) and B (H5), lanes 1-9; bands labeled four-way junction and incomplete junction. Lane annotations give the concentrations of linker histone and HMG1 in nM and the HMG1/linker histone molar ratio, which ranges from 3.6 to 112 in A and from 3.7 to 118 in B.]
FIG. 10. Competition between histone H1 (A) or H5 (B) and HMG1 for binding to four-way junction DNA. The indicated concentrations of the linker histones and HMG1 were incubated with 3.8 nM labeled four-way junction DNA and 50 µg/ml of competitor (salmon testis) DNA. The binding was monitored by DNA mobility-shift analysis. Note that in no instance is a ternary complex of DNA, HMG1, and linker histone observed. Reproduced with permission from Ref. 77.
of the tetrapeptide SPKK, present in the histone tails (79), has been implicated in this preference (80). This is in apparent contradiction to earlier work, which identified the globular domain as the portion of the molecule responsible for this property (81). The issue seems to be further complicated by the fact that polylysine exhibits the same (A + T)-rich preference (82).
In addition to the general A + T preference, there is mounting evidence that there may exist, particularly in eukaryotic DNA, certain specific sequences of very high affinity for linker histones. A first indication of this can be found in the early filter-binding experiments of Renz (83). When histone H1 was incubated in a mixture of differentially radiolabeled calf lymphocyte and E. coli DNA, the histone bound preferentially to the eukaryotic DNA. Later, Diez-Caballero et al. (51) found that at 0.5 M salt, where most nonspecific H1/DNA interactions are very weak, eukaryotic DNA (but not prokaryotic DNA) exhibited a class of strong H1 binding sites. In recent years, more specific information concerning a few such sites has appeared (84-87). Such “H1-hypersites” are not necessarily (A + T)-rich. Indeed, in the composite binding site identified in the rat albumin gene by DNase-I footprinting (85), the two binding regions themselves are not particularly (A + T)-rich, in contrast to the segment between them, which contains more than 80% A + T.
An intriguing set of experiments, pointing to the existence of “sequence”-specific binding, has recently been performed in a study of the interaction of histone H1 with populations of restriction fragments of plasmids pBR322 and pUC19 (87a). Certain fragments exhibited unusually high affinity for the histone, forming large complexes at H1/DNA ratios at which the other fragments present in the same incubation mixture showed no apparent binding (Fig. 11).
The highly preferred fragments turned out to be intrinsically curved, as judged by their anomalously slow electrophoretic mobility in polyacrylamide gels, by computer modeling analysis, and by scanning-force microscopy. However, the presence of curvature alone was not sufficient for the preferential binding, because highly curved kinetoplast DNA fragments of similar length were not selectively bound. Using various restriction fragments centered around the highly preferred molecule, it was found that the high-affinity binding required the simultaneous presence of sequences on both sides of the region of curvature.
At present there is not enough evidence to attempt a description of a “consensus” site, or even to conclude that such sites exist. More study of strong H1 binding sites is needed. Of course, it could be that the “sequence”-specific preference identified in the above cases reflects only peculiar, albeit still unrecognized, sequence-dependent DNA structures.
We would like to stress, as we have previously (18), that the major linker
FIG. 11. Titration of DraI-BstNI digests from pBR322 (A) and pUC19 (B) with increasing amounts of histone H1 on agarose electrophoretic gels. The protein:DNA ratios (w/w) are designated below the lanes. The arrows denote the specific fragments from both plasmids that are preferentially bound to histone H1. Courtesy of Dr. J. Yaneva, Oregon State University, Corvallis.
histone-binding sites in chromatin must be the crossovers of double-stranded DNA at the entry and exit of the nucleosome: if sequence-specific sites exist, they may be involved in some kind of specific regulatory mechanisms, involving only specific genes (such, for instance, may be the case for the differential regulation of transcription in the somatic and oocyte types of 5-S RNA gene sequences; see below). In this respect, it is interesting to note that the eukaryotic hypersites identified to date are located in 5' flanking regions or at the beginning of the coding sequences of the respective genes, a location expected to be of importance in regulating transcription from these genes. Perhaps binding of the linker histones to such H1 hypersites will help fix a particular nucleosome or even a chain of nucleosomes over specific DNA sequences (see Section II,B,3).
A particularly clear-cut example of how sequence-preferring (albeit somewhat promiscuous) binding of the linker histones to DNA may affect transcription is presented by the two types of 5-S RNA gene families in Xenopus (88). The large oocyte gene family consists of about 20,000 copies per haploid genome organized in clusters scattered among most of the chromosomes. The smaller somatic gene family consists of 400 copies per haploid genome grouped in a single cluster. The two gene families are differentially
regulated during early Xenopus development: the oocyte genes become repressed at the midblastula transition so that the only 5-S genes transcribed in somatic cells are those belonging to the somatic gene family. This differential gene expression requires the presence of histone H1 both in vivo and in vitro (see Ref. 88, and references therein). It seems likely that the differential effect of histone H1 on the transcription of the two types of genes depends on the preferential binding of the histone to oocyte chromatin, which contains an (A + T)-rich spacer between the genes compared to a (G + C)-rich spacer between the somatic genes (89). However, this hypothesis has not, to our knowledge, been tested directly. Neither is it clear whether the effect depends on binding of H1 per se, or on H1-directed binding of other entities, possibly nucleosomes (see Section II,B,3).
C. Interactions with High-mobility-group Proteins
The only chromatin proteins whose interaction with linker histones has been studied are members of the HMG class. Isolated HMG1 binds relatively tightly to histone H1 (90, 91), but there has been concern that the interaction might be merely nonspecific, deriving solely from the electrostatic interaction between the negatively charged tail of HMG1 and the positively charged tails of H1 (92). However, the fact that oxidized and reduced HMG1, which differ only in the presence or absence of a disulfide bond, interact differently with H1 suggests that the interaction may be more specific and significant (93). Whether this type of interaction is physiologically relevant in chromatin remains to be seen: both H1 and HMG1 are known to bind to linker DNA, but whether they can do so simultaneously, or only by displacing each other, has not yet been determined rigorously. The isolation of mononucleosomes containing HMG1 and HMG2 but lacking histone H1 favors the replacement hypothesis (94). Also, in direct competition experiments for binding to four-way junction DNA the two proteins were never observed in ternary complexes involving both H1 and HMG1, strongly indicating that they occupy the same, or overlapping, sites on the junction, and, by inference, perhaps on the nucleosome (77). However, conflicting data have also been reported: the presence of HMG1 and HMG2 was found to be restricted to H1-containing mononucleosomal particles isolated from native chromatin (95). Cross-linking experiments with chromatin reconstituted with exogenous HMG1 and HMG2 demonstrate that at least some HMGs are sufficiently close to histone H1 to allow cross-linking to occur (96). As in all reconstitution experiments, it is difficult to relate these results to the in situ situation in view of uncertainties about the fidelity of reconstitution and of the fact that HMGs were added to bulk chromatin already containing endogenous HMGs.
In this context, it is relevant to mention experimental data concerning the effect of HMG1 on the structure of highly aggregated H1/DNA complexes. The nonhistone protein destroys the complex or prevents its formation (97). Electron-microscope observations demonstrate that HMG1 destroys the double DNA fibers and leads to the formation of beaded structures. Native HMG1 interacts with histone H1 in such a way as to modulate the ability of H1 to condense pure DNA in vitro (98). It is believed that the binding of H1 to HMG involves the portion of the nonhistone protein that does not participate in interaction with DNA (99). In this way, HMG1, binding to H1 and DNA with different domains of the molecule, could modulate the interaction of H1 with DNA, or, conversely, the interaction of H1 with HMG1 could change the affinity of these proteins for DNA.
The interaction of histone H1 with HMG14 and HMG17 is much less well studied. Chemical cross-linking in solution led to the formation of H1/HMG14 heterodimers, whereas no interaction between H1 and HMG17 was observed (100). The interaction of HMG14 with H1 was strongly affected by phosphorylation of the nonhistone protein (101). Recent hydroxyl radical footprinting experiments have addressed the location of HMG14 and HMG17 in nucleosome cores and in chromatosomes lacking linker histone (102). These proteins occupy DNA sites near the end of the chromatosome but distinct from those occupied by the linker histones; in the region of the dyad axis the binding sites of HMGs overlap those protected by the linker histones. The placement of HMG14 and HMG17 near the dyad suggests that interactions between these nonhistone proteins and histone H1 may affect the transcriptional potential of chromatin.
Neutron scattering studies aimed at elucidating the effect of HMG14 binding on chromatin structure showed that HMG14 binding results in a considerable reduction in the mass per unit length of the fibers, which probably reflects larger spacing between neighboring nucleosomes along the DNA (103). This general loosening of the higher order structure might be a necessary condition for the transcriptional activation attributed to HMG14 (see Ref. 103).
II. The Importance of Linker Histones in Chromatin Fiber Structure
The early electron-microscope observations of the salt-dependent conformational transitions of soluble chromatin fragments pointed to the importance of the linker histones in the formation and maintenance of the higher order structures (5, 6). The H1-depleted chromatin was also observed to condense with increasing ionic strength; however, no definite structures with a well-defined fiber direction could be obtained. The structure of the
low-ionic-strength extended fiber was also dependent on the presence of histone H1. In chromatin containing H1, the DNA entered and left the nucleosome on the same side, giving rise to the zig-zag appearance of the extended fiber, whereas in H1-depleted chromatin, the entry and exit points of the DNA were much more random, creating the extended “beads-on-a-string” conformation (6). Recent scanning-force microscopy studies of native and H1-depleted fibers beautifully support the conclusions based on electron microscopy (13, 14) (see also Sections II,B,1 and II,B,2).
A. Location of Linker Histones in the Nucleosome
1. THE POSITION OF THE LINKER HISTONES WITH RESPECT TO THE NUCLEOSOME
It has long been known that H1 binds to the linker DNA, and there is much evidence that this is at the point where the DNA double helix enters and exits from the core particle (1). Binding of the linker histones to linker DNA protects against nuclease digestion an additional 20 bp of linker DNA immediately contiguous to the 146 bp of DNA in the nucleosome core; the particle containing around 168 bp of DNA and one linker histone molecule has been called a chromatosome (104). That H1 lies close to the nucleosomal core has been directly demonstrated by protein/DNA cross-linking (105) and by protein/protein cross-linking (106-110). In some studies the major contacts identified were with histone H2A (e.g., 109), whereas other experiments demonstrated cross-linking to all core histones (107, 108). A number of these studies were carried out in chromatin, or even in intact nuclei; this indicates that the propinquity reported is not simply a feature of the isolated chromatosome. A recent study characterized the products of digestion of chromatin by the peptidase clostripain and showed that the N termini of both H4 molecules lie in close proximity to the globular domain of H1, each H4 terminus pointing toward it (111). However, acetylation of the tails of the core histones does not block the binding of histone H5 to reconstituted mononucleosomal cores (112).
Most researchers believe that the binding to the linker is symmetrical, involving 10 bp extending from each end of the core DNA (reviewed in Ref. 1). However, in one case, a nucleosome reconstituted on a 5-S RNA gene fragment from Xenopus borealis, an asymmetric protection of the ends has been reported (113). This observation is unique and may depend on particular features of the DNA sequence used or on the reconstitution procedure.
Reconstitution experiments using either intact H1 or isolated fragments thereof have suggested that the globular part of the histone is necessary and sufficient for the protection of the 168-bp chromatosomal DNA (114). The participation of the globular domain of the linker histones in binding near the ends of nucleosomal DNA has also been observed in protein/DNA cross-linking experiments that identified His-25 within the globular domain of H5 as a major site of cross-linking in isolated particles, extended chromatin, and nuclei (115). In the case of reconstituted X. borealis mononucleosomes, in which the protection of the linker was found to be asymmetrical (113), the globular domain of H5 was also found to associate with the core asymmetrically, cross-linking to a single site on one side of the dyad axis (116). Because of the small size of GH5 (2.9 nm) (117) and because of its location on this site far away from the dyad, it would not be capable of also interacting with the entry/exit DNA of this nucleosome. Again, these results may reflect some peculiar specific feature of the X. borealis 5-S sequence. Also, faithful reconstitution of the globular domain of H5 onto single nucleosomes may be a problem, despite the fact that the temporary pausing of micrococcal nuclease digestion, typical of the chromatosomal structure (1), has been observed in these experiments.
How can a single linker histone molecule interact with the nucleosome core, and both entering and exiting DNA? In an early suggestion, the globular domain was placed directly over the twofold axis of the nucleosome; its size would fill up the gap between the DNA strands entering and leaving the particle (114) (Fig. 12A). Such a placement implies three contacts with the chromatosomal DNA: one with the entering DNA, one with the exiting DNA, and one with the core DNA near the dyad axis. Indeed, DNase I is denied access to the DNA at the dyad in H1-containing dinucleosomes (118). The distance in the chromatosome between the center of mass of the linker histone and the center of mass of the histone octamer measured by neutron scattering is 5.5 nm (119). This estimation places the linker histones very
FIG. 12. A schematic representation of the possible locations of the globular domain of histone H1 (or H5) with respect to the core particle. The histone octamer is presented as a cylinder, the DNA as a tube, and the globular domain of histone H1 as a black ball. (A) The globular domain makes contacts with the DNA at three points: the DNA entering and exiting the nucleosome and the DNA at the dyad axis. (B) The globular domain is situated at a distance from the DNA at the dyad, making contacts only with the crossover of the entering and exiting DNA strands.
close to the central turn of the core particle DNA, implying an additional interaction with the core DNA. However, such a triple interaction would require three binding sites in the globular domain, and only two have been suggested on the basis of biochemical (e.g., 50, 52) and recent crystallographic data (31). In the light of these considerations it may be that the protection of the dyad site seen in the DNase-I digestion studies (118) simply reflects steric hindrance to the enzyme by the globular domain, which may not actually bind the DNA at this point. On the other hand, the results from neutron scattering may reflect the fact that the measurements were done on isolated chromatosomes, in which the histone may artifactually approach the dyad of the core particle more closely than it does in full-length nucleosomes or in chromatin fibers.
Because the issue of the exact location of the linker histones at the surface of the nucleosome in vivo cannot be considered solved, an alternative view remains tenable; this places the globular domain away from the dyad, exactly at the point where the entry and exit DNAs cross each other. If the DNA makes 1.75 turns around the histone octamer, then the entry and exit DNAs would cross each other at 90° at some distance (about 2 nm) from the surface of the DNA at the dyad (Fig. 12B). The strong preference of the linker histones and their isolated globular domains for crossovers of DNA (see Section I,B,3) might reflect an evolutionary design to bind to the crossover of the nucleosomal DNA. Whether the linker histone binds close to the nucleosome or at the crossover, it will be able to interact with both entering and exiting strands; the consequences of this are discussed in Sections II,B,1 and II,B,2.
In the above context, a recent study on the DNA sequence organization in the chromatosome (120) may be of interest.
Analysis of a large set of DNA sequences cloned from chicken erythrocyte chromatosomes reveals an asymmetry of di- and trinucleotide steps along the DNA at the chromatosome ends. In particular, two dinucleotides, ApG and GpG, seem to be conserved at one of the termini, while the other terminus is matched by the preferential occurrence of their complements, CpT and CpC. It is suggested that such an asymmetry at this location could be used to orient the asymmetric linker histone, targeting the binding of helix III of the globular domain of H1 or H5 (see Fig. 5B) to one end of the chromatosome. The degree of constraint imposed on the path of the superhelix would then depend on where the second, more diffuse DNA-binding region, situated on the opposite side of the globular domain, binds to a second site in the chromatosomal DNA (for more detailed discussion of this intriguing possibility, see Ref. 120). These results may help explain the finding of an asymmetric binding of the globular domain of H5 in the 5-S DNA-reconstituted chromatosome (116) as a consequence of particular features of the underlying DNA sequence.
2. DISPOSITION OF THE LINKER HISTONE TAILS
The way in which the globular domain of the linker histones binds to the chromatosome may be of structural consequence to the way the “unstructured” tails are located. Although the carboxy-terminal domain of the linker histones is known to exist as a random coil when the proteins are free in aqueous solution, secondary structure predictions and CD measurements under various conditions suggested that the C-terminal domain might assume a segmented α-helical conformation on binding to DNA (121). These rigid helical segments might track the phosphate backbone of the linker DNA and help determine the conformation of the linker between nucleosomes. Important factors for this role would be the length, stability, and charge density of the helical segments. Because the termini are, in addition, highly positively charged, their binding to the linker DNA is expected to at least partially neutralize its charge (122), which might facilitate bending of the linker. Alternatively, if the histone C-tail binds to only one side of the duplex DNA, this asymmetric charge neutralization should produce DNA bending (123, 124). In either case, closer approach between nucleosomes may be promoted by such bending (for a further discussion of these points see Section II,C,2). In this respect, it is significant that reconstitution onto linker histone-depleted chromatin of a peptide containing only the globular and C-terminal domains is sufficient to induce salt-dependent chromatin folding, whereas the globular domain in itself is not sufficient (122). The role of the N-terminal domain of the linker histone is less clear; it may act as an anchor for the rest of the molecule to be positioned properly in the fiber (122).
3. RELATIVE ORIENTATION OF ADJACENT LINKER HISTONES
Asymmetrical binding of the globular domain to the chromatosomal DNA would determine the mutual disposition of linker histone molecules sitting on neighboring linkers, i.e., whether the successive histone molecules are situated with respect to each other in a head-to-tail, head-to-head, or tail-to-tail orientation. Analysis of isolated H1/H1 dimers, obtained from chemical cross-linking in chromatin or nuclei, has been carried out by several groups of researchers, using enzymatic or chemical degradation. Contacts in all possible combinations (between two C termini, between two N termini, and between C and N termini) have been reported (125, 126). In another study, contacts mainly between C-fragments, and some binding between the N terminus of one molecule and the C terminus of another, have been observed, whereas contacts between two N termini have not been found (49). The C-terminal tails have also been identified as major sites of histone-histone contacts in purified H5/H5 dimers (127). In contrast, a predominantly polar, head-to-tail arrangement of histone H5 was suggested on the basis of similar analysis in extended chicken erythrocyte chromatin (128). This arrangement persists on compaction, which also brings C-terminal domains in closer juxtaposition than in the extended state, accounting for the increase in C-terminal to C-terminal cross-linking observed in high salt concentrations. The seemingly contradictory results obtained by various workers make it obvious that further studies are needed.
4. DO LINKER HISTONES FIX LINKER LENGTHS?
The fact that H1 and its cognates bind to the linker DNA has led to the suggestion that the species- and tissue-specific differences in linker lengths could be explained by differences in the structure of different linker histone variants. However, experiments designed to resolve this issue fail to show a direct relation between changes in the nature of the H1 complement and changes in linker lengths. Thus, comparison of linker lengths in immature and mature chicken erythroid cells and in liver with the composition of the linker histones in these sources revealed no meaningful correlations (129). Also, the increase in nucleosomal repeat length and the increase of H5 seen during erythrocyte maturation are not related in a proportional way, suggesting that H5 is not the major determinant of the corresponding repeat lengths (130). Similarly, the changes in linker length observed during development of rat brain neurons are not accompanied by significant changes in the H1 complement (131). The most striking evidence comes from in vivo studies using mouse/human inter- and intraspecies somatic-cell hybrids (132). All hybrids expressed the H1 complement of only one of the parent cells. Still, in some of them, the length of the repeat was inherited from the same parent, whereas in others this was inherited from the parent whose H1 was not expressed. This implies that some other nuclear factor or factors dictate linker length. Finally, in transfection experiments, the gene for histone H5 was expressed in cells that normally contain only H1 (130). Replacement of H1 by H5 did not lead to a change in the repeat length, but only increased the stability of the chromatin fiber. A potential caveat with respect to such experiments may be that the exogenous linker histone may not be properly incorporated in the chromatin of these cells, which have never encountered histone H5 in the course of evolution.
The cells used complete a round of DNA replication, and this would allow a window of opportunity for new nucleosome spacing to be imposed in the presence of H5. However, because linker histones are believed to be deposited onto replicating chromatin last (133, 134), it is still possible that they cannot change the already established nucleosomal density. If histone H5 affects linker length in conjunction with other nuclear factors, which coevolved with H5, these other factors may not be present in the types of cells used for transfection. These in vivo results do not always accord with in vitro studies. For example, histone H5 restores the native spacing of about 200 bp of nucleosomes in “randomized” chromatin in vitro (135, 136). The nucleosomal repeat length of nucleosomal arrays reconstituted by introducing naked DNA in cell-free extracts can change on addition of histone H1 (137); however, this could only be achieved by nonphysiologically high amounts of the linker histone. A cloned 6.2-kb chicken β-globin DNA fragment, assembled into chromatin with chicken core histones and histone H5 as the only cellular components, showed a regular 180-bp repeat, similar to the one observed in this region in erythroid cells where the gene is active; a specific region downstream from the gene was required for this nucleosomal arrangement (138). The same gene in chick oviduct, where it is inactive, has a 196-bp repeat (139), and histone H5 is not present in this tissue. Thus, although histone H5 was somehow involved in determining specific repeat lengths over the active β-globin gene, the actual length induced in the H5-containing chromatin was in fact smaller than in the H5-lacking chromatin. This is contrary to the general expectation that H5 should promote longer linker lengths, comparable to the 208 bp in chicken erythrocytes. It is evident that the possible participation of linker histones in determining linker lengths still remains to be understood.
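As a rough bookkeeping aid for the repeat lengths quoted above, the following sketch converts a nucleosomal repeat into a nominal linker length. It assumes that the linker is simply the repeat minus the DNA of the core particle, taken here as 146 bp; that value, and the names CORE_PARTICLE_BP and linker_length_bp, are illustrative assumptions rather than figures given in this section.

```python
# Nominal linker length = nucleosomal repeat - core particle DNA.
# CORE_PARTICLE_BP = 146 is an assumed, conventional value (not taken from this section).
CORE_PARTICLE_BP = 146


def linker_length_bp(repeat_bp, core_bp=CORE_PARTICLE_BP):
    """Return the nominal linker length (bp) for a given nucleosomal repeat (bp)."""
    if repeat_bp < core_bp:
        raise ValueError("repeat length cannot be shorter than the core particle DNA")
    return repeat_bp - core_bp


# Repeat lengths quoted in the text above.
repeats = {
    "beta-globin region, reconstituted with H5": 180,
    "beta-globin region, chick oviduct (no H5)": 196,
    "chicken erythrocyte chromatin": 208,
}

for source, repeat in repeats.items():
    print(f"{source}: repeat {repeat} bp, nominal linker {linker_length_bp(repeat)} bp")
```

On this assumption the H5-containing reconstitute has a nominal linker 28 bp shorter than that of chicken erythrocyte chromatin, which is the discrepancy noted above.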
B. Linker Histones and the Chromatin Fiber at Low Ionic Strength
1. STRUCTURE OF THE LOW-IONIC-STRENGTH FIBER
Valuable information concerning the role played by linker histones in determining chromatin fiber structure can be obtained from studies at ionic strengths below 10 mM. Under these conditions, the fiber is expanded, and it becomes possible to resolve individual nucleosomes by microscopic techniques and to study their disposition along the fiber. The pioneering work of Thoma et al. (6) gave to many the impression that the fiber at low ionic strength was a flattened zig-zag. However, it was pointed out (6) that this appearance could well be the consequence of flattening of some kind of extended helix on the grid surface. Indeed, the results of theoretical analysis and solution studies (11, 13, 15, 140-148) are more compatible with a three-dimensional, helixlike structure. The theoretical argument is straightforward: if the linker DNA is straight under low salt concentration conditions, and the entry/exit angle of DNA into and out of the nucleosome is fixed by the linker histones (see Section II,A,1), the conformation of the fiber will be dictated by these factors, plus the length of the linker. The length plays a crucial role under these circumstances, for it dictates the rotation of each nucleosome with respect to the preceding one (see Fig. 2A). If all linkers are of the same length, some kind of regular helix will necessarily be generated (e.g., 15, 149). However, we know that, in native chromatin, linker lengths are heterogeneous, probably even in a local sense (e.g., 150, 151; for further references see Ref. 1). Because of this, the structure that is generated will be an irregular, helixlike fiber (13-15). In modeling such fibers, we have assumed that the DNA wraps 1.75 turns about the octamer; this corresponds to an exit/entry angle of 90° (Fig. 2A). The structures predicted on these assumptions strongly resemble native conformations, as revealed by the relatively nondisturbing technique of scanning-force microscopy (see Fig. 2E). The fibers observed at low salt concentration by scanning-force microscopy are clearly not planar zig-zags, but are irregular, helixlike structures consistent with the results of solution studies. The diameters of the fibers average about 30 nm.
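The dependence of nucleosome rotation on linker length can be illustrated with a minimal sketch. It assumes that the relative rotation arises solely from the helical twist of a straight linker and that the DNA has about 10.5 bp per helical turn; both that value and the function name linker_rotation_deg are assumptions introduced here for illustration, not parameters taken from the modeling studies cited.

```python
# Rotation of one nucleosome relative to the preceding one, from linker twist alone.
# BP_PER_TURN = 10.5 is an assumed helical repeat for B-form DNA (not given in the text).
BP_PER_TURN = 10.5


def linker_rotation_deg(linker_bp, bp_per_turn=BP_PER_TURN):
    """Return the net rotation in degrees, in [0, 360), contributed by a straight linker."""
    turns = linker_bp / bp_per_turn
    # Only the fractional part of the number of helical turns changes the orientation.
    return (turns % 1.0) * 360.0


# Linker lengths spanning the 51-73 bp range used in the simulation of Fig. 13.
for bp in (51, 56, 62, 63, 67, 73):
    print(f"linker {bp} bp -> rotation {linker_rotation_deg(bp):6.1f} degrees")
```

Under these assumptions a change of a single base pair in linker length alters the rotation by roughly 34 degrees, which is why heterogeneous linkers would be expected to yield an irregular rather than a regular helix.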
2. STRUCTURAL EFFECTS OF LINKER HISTONE REMOVAL
The ability to observe structural details of chromatin fibers at low ionic strength allows critical tests of the effect of removal of linker histones. Figure 13 shows SFM images of chicken erythrocyte chromatin depleted of histones H1 and H5, again compared to computer-simulated images. In this case, to obtain simulations comparable to the observed structures, it was necessary to relax the restriction on the entry/exit angle. In fact, the number of turns of DNA about the histone core was allowed to vary from 1.0 to 2.0, each limit corresponding to an entry/exit angle of 180°. As can be seen in Fig. 13, the result of removal of linker histones is, as predicted, the production of an extended, “beads-on-a-string” structure. The helical zig-zag structure found in the presence of linker histones has been lost. Furthermore, the distribution of nucleosome center-to-center distances (Fig. 13D and E) now shows a broad tail toward longer lengths, which can only result if DNA has been “peeled” from the nucleosome core. The maximum center-to-center distance observed (about 50 nm) corresponds to addition of as much as 80 bp to the linker between two nucleosomes, equivalent to one whole turn of DNA, distributed between contributions from two adjacent nucleosome cores. That such peeling off may be facile follows from topological analysis (152). It turns out that the observed DNA winding pattern around the histone octamer, with 1.75 turns, is singled out from all other geometrically feasible winding patterns: it allows partial unwrapping of the DNA from the octamer without compensatory writhing or twisting of the unwrapped DNA. In other words, with 1.75 turns of DNA in the core, unwrapping of up to one turn may proceed with topological impunity.
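The figures quoted above can be checked with simple arithmetic. The sketch below is only a rough consistency check: it assumes a rise of 0.34 nm per base pair for B-form DNA and 146 bp of DNA in the 1.75 turns of the core particle (both assumed, conventional values not stated in this section), and it takes the mean native center-to-center distance of about 22 nm from the legend to Fig. 13.

```python
# Rough consistency check of the distances quoted in the text.
RISE_NM_PER_BP = 0.34   # assumed rise per base pair of B-form DNA
CORE_BP = 146           # assumed DNA content of the core particle
CORE_TURNS = 1.75       # superhelical turns in the core particle (from the text)
NATIVE_MEAN_NM = 22.0   # mean native center-to-center distance (legend to Fig. 13)
RELEASED_BP = 80        # DNA added to the linker on peeling (from the text)

# DNA per superhelical turn around the octamer.
bp_per_superhelical_turn = CORE_BP / CORE_TURNS

# Contour length gained when RELEASED_BP are peeled off and straightened.
extra_length_nm = RELEASED_BP * RISE_NM_PER_BP

# Center-to-center distance if the gain is simply added to the native mean.
extended_distance_nm = NATIVE_MEAN_NM + extra_length_nm

print(f"DNA per superhelical turn: {bp_per_superhelical_turn:.1f} bp")
print(f"extra contour length:      {extra_length_nm:.1f} nm")
print(f"extended distance:         {extended_distance_nm:.1f} nm")
```

On these assumptions, 80 bp is close to one superhelical turn and adds about 27 nm of contour length, bringing the 22-nm native mean to roughly 49 nm, in line with the observed maximum of about 50 nm.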
Some earlier biochemical and biophysical data support the idea that in the absence of H1 the DNA at the ends of the core particle is less tightly constrained by the histone octamer than in the presence of the histone. Removal of H1 from intact nuclei increases the susceptibility to micrococcal nuclease at the ends of the particle (153). A large increase in the negative linear dichroism of dinucleosomes on H1 removal has been observed and interpreted as resulting from unwinding of the DNA tails (154, 155). A similar interpretation was offered to explain circular dichroism and thermal denaturation data (156). Thus, it is clear that linker histones play a major role in determining chromatin fiber structure at low ionic strength.
3. IMPLICATIONS FOR CHROMATIN TRANSCRIPTION
In vivo, the transition into extended structures is expected to occur at sites of active transcription, to allow access by regulatory factors and enzymes to the underlying DNA template. This transition may involve, among other things, partial or complete removal of histone H1, or some weakening of its interaction with linker DNA (reviewed in Ref. 19). Such changes will have a double structural effect: allowing the relaxation of the condensed higher order structure and destabilizing the nucleosome itself, allowing considerable unwrapping of the DNA from around the histone octamer. These structural transitions should facilitate the passage of transcribing polymerase molecules. Another important but less well-studied aspect of the involvement of linker histones in regulation of transcription is the possible participation of these histones in nucleosome positioning. The occupancy by a nucleosome of specific DNA sequences containing cis-regulatory elements and elements involved in transcription initiation may have profound effects on transcription (157, 158). The positioning of nucleosomes is governed by multiple factors, specific DNA sequences, DNA curvature, and boundary effects being among the most important of them (159, 160). In a study of nucleosome positioning choices, linker histones did not override the positioning signals in the underlying DNA template but did change the relative distribution of nucleosome positions among possible 10-bp-spaced alternatives (161). Recent studies from the same group (162) demonstrated that linker histone binding suppresses the general mobility of nucleosomes over 10-bp DNA intervals, which is observed under conditions of relatively low ionic strength. Immobilization of nucleosome cores over specific sequences may be an important mechanism by which H1 could affect transcription outside the context of chromatin condensation. An important study of chromatin organization over the oocyte- and somatic-type 5-S RNA genes in vivo (163) suggests that H1 is instrumental in promoting formation of nucleosomes over the repressed oocyte-type genes. Conversely, removal of H1 promoted disruption of this specific nucleosomal arrangement, concomitant with promotion of transcription. That linker histones can order nucleosomes in a sequence-specific manner has been demonstrated in model studies of nucleosome positioning on plasmid pBR327 (164). The H5-mediated formation of positioned arrays of nucleosomes depends on the presence of a specific sequence in the plasmid; however, H5 did not bind to this signal sequence. Later, these studies were extended to include eukaryotic gene regions: H1-induced spreading of nucleosome alignment depended on specific DNA positioning signals in the cases of the chicken ovalbumin gene (165) and the chicken β-globin gene (138). The exact mechanism for the observed effect of linker histones on creation of positioned arrays of nucleosomes remains to be determined.
FIG. 13. Model of linker histone-depleted chromatin fiber at low salt concentration, its simulated SFM image, and an actual SFM image. (A) A computer-generated model of a linker histone-depleted chromatin fiber. The simulation is the same as in Fig. 2, except that the number of turns of DNA around the octamer is allowed to vary randomly between 1 and 2, and the length of the linker between 51 and 73 bp. (B) Simulated SFM image of the fiber in A, assumed to have been scanned and partially flattened by a parabolic tip with a radius of curvature of 10 nm. (C) Experimental SFM image of an H1/H5-depleted chicken erythrocyte chromatin fiber after fixation with glutaraldehyde and imaging on mica. Images are 400 nm × 400 nm in size. (D and E) Distribution of center-to-center distances of adjacent nucleosomes along the DNA path for fibers of the kind shown in B and C, respectively. About 700 measurements were made for each histogram. The simulated fibers have a slightly shorter mean internucleosome distance than do the actual ones, probably as a result of the simple projection method used to simulate the deposition of the fiber to the surface of the mica. The distributions show that linker histone removal leads to a release of the DNA from the histone cores and the formation of longer length linkers (the mean value for native chicken erythrocyte chromatin is about 22 nm). Reprinted with permission from Nature (Ref. 14). Copyright 1994 Macmillan Magazines Limited.
C. Linker Histones and the Condensed Chromatin Fiber
As briefly discussed in the Introduction, the mechanisms of chromatin condensation into the higher order structure and the details of that structure remain largely unknown despite numerous physical and microscopic studies. We believe that there is no substantial evidence for a regular helical structure of any kind, at least in significant regions of the chromatin fiber (detailed argumentation is given elsewhere) (10). A similar view is shared by C. Woodcock and R. Horowitz (personal communication) based on a careful analysis of published electron-microscope observations, and is supported by the most recent microscopic techniques (17). Because it seems that a major role of linker histones is in establishing and stabilizing this structure, it is important to examine critically the evidence concerning its conformation.
1. MAIN STRUCTURAL CHARACTERISTICS OF THE CONDENSED FIBER
What are the “firm facts” and what are the main controversies surrounding the issue of the higher order structure of chromatin?
a. Fiber Diameter. The average diameter of the fiber seems to be around 30 nm, although the reported values vary from around 25 nm to around 45 nm, depending on the source of chromatin, the method for its isolation and preparation of the sample, and the method of investigation. This average value has, in fact, given the condensed fiber the name “30-nm” fiber. Our recent SFM measurements have indicated that even the extended low-ionic-strength fibers have diameters of about 30-35 nm (13); similar values were estimated using physical methods (140, 141, 143, 144, 147). To avoid further confusion, we have recently proposed that the term 30-nm fiber no longer be used to denote the higher order structure formed in high salt concentration; this can be termed instead the condensed or compacted fiber (10). A major unresolved issue is whether or not the diameter of the fiber depends on the linker length: experimental results supporting both possibilities have been reported in the past. Recent measurements of fiber diameters in cryosections of different types of nuclei seem to shed some light on this issue: the values determined in situ were very similar, in striking contrast to those measured on chromatin isolated from these cell types, where a strong positive correlation between diameter and repeat length had been established (166). Thus, it seems possible that the reported dependence of the diameter on linker length may be just an artifact encountered in isolated chromatin fibers.
b. Mass per Unit Length. The density of the fiber is usually expressed as number of nucleosomes per defined fiber length. The reported values show a transition from one nucleosome per 10 nm at zero salt concentration to six to eight nucleosomes at 70-100 mM or at millimolar concentrations of Mg2+. Higher values of about twelve nucleosomes per 10 nm have also been reported, but it has been suggested that at least some of them may originate from aggregated fibers (1).
c. Orientation of the Nucleosomes. Several kinds of evidence (X-ray diffraction from oriented fibers, linear dichroism) have been cited to suggest that the flat faces of the nucleosomes are approximately parallel to the fiber axis, but may exhibit considerable variability. However, the evidence is difficult to evaluate, for several reasons. Insofar as X-ray diffraction is concerned, patterns from semioriented fibers do show some orientation of the 11-nm reflection (corresponding to nucleosome diameter) parallel to the fiber axis (e.g., 167). But in the absence of accompanying measurement of fiber orientation, it is not possible to evaluate these data quantitatively. Linear dichroism measurements (flow dichroism and electric dichroism) suffer from other complications. A major source of difficulty arises from the fact that the orientation of the linker DNA (which comprises about 25% of the whole) is unknown. In one instance an attempt has been made to correct for this, using photochemical dichroism to determine linker orientation (168); however, the results remain ambiguous. A more fundamental problem with dichroism studies for such systems can be seen by careful examination of the data. The maximum dichroism values observed are invariably low, with both positive and negative values being reported, but most results clustering around ΔA/A = -0.1. This is a very small value when compared, for example, to the maximum electric dichroism of naked DNA, which approximates -1.3. This has been interpreted as indicating that the average angle between the DNA base planes and the fiber axis is around 60°, close to the magic angle of 54.7° at which the dichroism passes through zero and changes sign. This would correspond to a tilt of about 30° of the long axis of the nucleosome away from the fiber axis, a result consistent with a number of models. However, to interpret the data in this way assumes that all of the nucleosomes make approximately the same angle with respect to the fiber axis. There is, in fact, an alternative explanation for a very low value for the dichroism: that the nucleosomes are nearly randomly oriented with respect to the fiber axis, so that even complete alignment of the fibers still results in nearly random alignment of DNA chromophores.
This possibility does not seem to have been considered; this means that we must be very cautious in using linear dichroism data as evidence for particular fiber models. In our opinion, there is no strong evidence to support any specific helical model for the condensed fiber; indeed, a largely irregular structure seems more consistent with the existing data (10).
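The ambiguity can be made concrete with a short numerical sketch. It assumes that the dichroism of a uniaxially oriented sample scales with the orientation factor (3 cos²θ - 1)/2, where θ is the angle between the absorbing transition moment and the orientation axis; proportionality constants are omitted, and the function names are illustrative only.

```python
import math


def orientation_factor(theta_deg):
    """Orientation factor (3*cos^2(theta) - 1)/2 for a single fixed angle, in degrees."""
    c = math.cos(math.radians(theta_deg))
    return 0.5 * (3.0 * c * c - 1.0)


# Angle at which the factor vanishes: cos^2(theta) = 1/3.
MAGIC_ANGLE_DEG = math.degrees(math.acos(1.0 / math.sqrt(3.0)))


def isotropic_average(n=100_000):
    """Average of the orientation factor over randomly oriented chromophores.

    For an isotropic distribution cos(theta) is uniform on [-1, 1], so the
    average is evaluated with a midpoint rule over cos(theta).
    """
    total = 0.0
    for i in range(n):
        c = -1.0 + (i + 0.5) * 2.0 / n
        total += 0.5 * (3.0 * c * c - 1.0)
    return total / n


print(f"magic angle:               {MAGIC_ANGLE_DEG:.1f} degrees")
print(f"factor at the magic angle: {orientation_factor(MAGIC_ANGLE_DEG):.3f}")
print(f"factor at 60 degrees:      {orientation_factor(60.0):.3f}")
print(f"isotropic average:         {isotropic_average():.3f}")
```

The factor vanishes both for a single fixed angle of 54.7° and for an isotropic distribution of angles, so a near-zero dichroism cannot by itself distinguish a uniform tilt close to the magic angle from nearly random orientation.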
2. LOCATION AND ORGANIZATION OF THE LINKER DNA
This aspect of the fiber structure has been a matter of considerable controversy. It is still not clear what the path of the linker DNA is. In principle, without invoking any specific helical regularity in the structure, there are three major ways in which the linker DNA might be organized: (1) coiled in some way inside a condensed structure with peripherally situated nucleosomes, (2) continuing the superhelical path of the DNA of the core particle, and (3) remaining straight and rigid, as in the low-ionic-strength extended fibers.
Reliable data on the location of the linker DNA remain scarce. The contribution of the linker DNA to the low-resolution X-ray scattering pattern is negligible; therefore, these patterns reflect mainly the intrinsic features of the core particles and their mutual arrangements in the fiber. Some information about the location and structure of the linker DNA has been acquired by biochemical methods (1). The results, based on analysis of the kinetics and the products of digestion with nucleases (e.g., 169, 170), were interpreted as showing that the linker DNA is organized in a manner very similar to the organization of the core DNA and thus follows, together with the latter, a continuously supercoiled path. Such a model has been proposed (171), mainly on the basis of electric dichroism data. Digestion of chromatin fibers with micrococcal nuclease under extremely mild conditions, using either membrane-immobilized or free enzyme, has permitted subtle effects of fiber structure on the digestion parameters to be revealed (12, 172). Although the linker DNA is readily accessible to nucleolytic attack in the extended low-salt conformations, it is almost completely protected against digestion in the condensed high-salt conformation; fibers of intermediate degrees of compaction are digested to intermediate degrees. These results suggest that access to the linkers is most probably limited by high compaction rather than by internalization to the center of the fibers. This interpretation is supported by the observation that cleavage of the linker DNA by a small molecule, methidium-propyl-EDTA-Fe(II), proceeded at similar rates for all types of conformations (12). Although recent studies of chromatin fiber structure at low ionic strength (13-15) reveal linker conformation in these circumstances, they say nothing about linker organization in the condensed fiber.
The linkers that appear to be straight in the extended fibers (13-15) may bend or coil on condensation or may remain straight. Investigators tend to think that condensation is brought about by bending of the linker DNA, which is known to become more flexible as the salt concentration is raised (173), thus allowing interactions among neighboring nucleosomes to take place. Evidence that linker DNA may bend or fold in isolated dinucleosomes as conditions for chromatin condensation are approached has been presented by Yao et al. (174, 175). However, examination of the sedimentation data of Butler and Thomas (176) leads to the opposite conclusion. No change in s20,w for dinucleosomes was observed in the salt range from 5 to 140 mM, although a 25% increase would be expected if the nucleosomes approached so close as to contact one another. Whatever the in vitro studies may show, whether such bending can occur in vivo in intact chromatin remains to be seen. It is also far from clear whether nucleosomes, at least those that are successive in the linear array of nucleosomes, actually interact in vivo. The early electron-microscope observations (177), showing the formation of arcs and helices from closely face-to-face packed isolated core particles, suggested that such interactions may be of significance in the process of condensation. Consequently, interaction between adjacent nucleosomes has been made an intrinsic feature of most models of higher order structure. In fact, we are not aware of any data pointing in this direction. Recent electron-microscope imaging of chromatin in sections of low-temperature-embedded starfish sperm nuclei (15, 16) seems to suggest that successive nucleosomes may actually not be in contact with each other, because the linker DNA between them is extended even under physiological conditions. If nucleosome-nucleosome interactions occur in such a structure, they must involve nucleosomes that are not adjacent on the linear fiber. An independent approach to the question of the linker DNA path in nuclei has been developed (178) using photo-induced thymine dimer formation as a structural probe. This method depends on the fact that the rate of thymine dimer formation is affected by the direction and degree of DNA bending. To obtain information about the structure of linkers, DNA from dinucleosomes isolated from irradiated nuclei was examined for distribution of thymine dimers. The results were interpreted to mean that the linker contained very little bending, at least for the subset of dinucleosomes studied. That the linker DNA may not continue the superhelix of the core particle was also inferred from photochemical dichroism studies (179). In summary, the disposition of the linker DNA within the condensed chromatin fiber remains a major puzzle that impedes our effort to understand the structure of the chromatin fiber.
3. LOCATION OF THE LINKER HISTONES IN THE CONDENSED FIBER
The location of the linker histones in the condensed fiber is also unresolved. Most studies have used immunochemical approaches, with conflicting results and interpretations (180). The general approach has been to determine whether and to what extent the accessibility of the histones, or certain domains thereof, to antibodies changes on salt-induced compaction of soluble fibers. Some authors interpret their data as indicating no change in the location of the linker histones on condensation of the fiber, whereas others assert that the histones are internalized. A paper that is often cited as supporting the internalization view is that of Dimitrov et al. (181), which takes an ingenious approach. Antibodies against the globular domain of histone H5 were attached to bulky ferritin molecules (the size of a compact dinucleosome) so as to create a probe too large to penetrate into a condensed fiber. The loss of immunological reactivity of the fiber on increasing the salt concentration was seen as an indication of internal location of the globular domain in the compacted structure. However, it should be noted that the immunological reaction, being low even in the fully extended fiber, was already lost at intermediate ionic strength, long before the fiber could attain its condensed state. Moreover, the same gradual decrease in the intensity of the reaction was evident with the free nonconjugated antibody. Thus, it may be that these results reflect the general steric hindrance to bulky probes that develops on fiber compaction, similar to the steric hindrance observed in the case of micrococcal nuclease (12, 172). Recently, immobilized proteases such as trypsin and chymotrypsin have been used as an alternative approach to this problem (182). The data obtained concerned the location of the N- and C-terminal portions of the linker histones. The tails of histone H1 remained accessible on fiber condensation; therefore, it was concluded that they did not change their location. On the other hand, the tails of H5 became significantly inaccessible in the condensed fiber. Why the two linker histones should differ in this way is not clear; perhaps the difference in their location in the fiber may be among the mechanisms by which they exert their differential effect on functions such as transcription and replication. The use of chymotrypsin as a probe of the globular domain turned out to be inappropriate, because phenylalanine, the site of preferential cleavage, was hidden even in the mononucleosomal particle. Alternative proteases, selectively cleaving peptide bonds in the globular domain, should be applied to further elucidate this issue. A potentially very powerful approach to locating the linker histones is neutron scattering, using deuterated protein. Recent studies (183) give an unambiguous answer for the average distance of H1 molecules from the fiber axis: 6-6.5 nm (Fig. 14).
This is smaller than the average distance of nucleosome centers from the axis (11.5 nm) and implies that the linker histones lie preferentially inside. However, such an average value does not mean that all linker histones are precisely arranged at this radius, as implied by the figure; some may be closer to the periphery, and some closer to the center, as might be expected in a less regular fiber. The result does show unequivocally that the linker histones are confined neither to the periphery nor to the very center of the fiber.
FIG. 14. A schematic drawing of a cross-sectional view of a condensed chromatin fiber based on results from neutron scattering analysis using deuterated H1 (hatched circles). The nucleosome is represented by a box of dimensions 110 × 57 Å. The inner face of the nucleosome falls at approximately the same radial location as the H1 center of mass, which may be interpreted as indicating that H1 binds to the face of the nucleosome that is presented to the interior of the filament (for a discussion of this view, see text). Reprinted with permission from Nature (Ref. 183). Copyright 1994 Macmillan Magazines Limited.
It has been known for years that application of bifunctional cross-linking agents to chromatin and nuclei results in the formation of H1 or H5 homopolymers. This observation has often been interpreted as indicating that the linker histones are located in the center of the compacted fiber. In fact, all it shows is that linker histone molecules are situated closely enough in the chromatin fiber to allow cross-linking to occur. This is clearly demonstrated by data (41) showing that similar cross-linked products are formed both at low ionic strength, when the fiber is in an extended conformation, and at high ionic strength, when the fiber is highly compacted. In either case, very few cross-linked products larger than hexamers are found. Similar experiments (184) revealed discrete H1 homopolymers, in this case integer multiples of 12 H1 molecules. Again, the structure that determines this cross-linking pattern exists not only in nuclei, but also in the extended nucleosomal chain. Thus, the organization of the nucleosomal chain may include structural features sufficient to determine the higher order structure of chromatin. The cross-linking data suggest that the main contacts between H1 molecules in the condensed fiber may be between molecules sitting on adjacent linkers rather than across successive turns of the fiber.
Finally, it is important to realize that the actual location of the linker histone molecules with respect to the nucleosome may differ in the extended and the condensed fibers. Using covalent protein/DNA cross-linking procedures, Mirzabekov and co-workers (170) have shown that H1 protects both ends of the chromatosomal DNA only in isolated nucleosomes and in unfolded chromatin. Within nuclei, however, the histone interacts with the linker DNA at just one side of the nucleosome, so that its globular domain interacts with the central portion of the linker, with additional contacts on the linker DNA of neighboring nucleosomes, or on more distant positions in the higher order structure. On decondensation of chromatin, H1 is redistributed in such a manner that its globular domain becomes bound to the linkers on both sides of the same nucleosome, and this leads to the appearance of a chromatosome. In accordance with this view, the interaction of H5 with DNA in nuclear chromatin of chicken erythrocytes differs from that in isolated mononucleosomal particles and in unfolded chromatin (115). Interestingly, a similar idea was put forward earlier by Krueger (185), who suggested that H1 binds to the linker DNA only at low ionic strength, whereas under physiological conditions it interacts with other sites determined by packaging of chromatin into higher order structures. If such a “switch” in linker histone interactions occurs, it would play a major role in condensation. If, for example, one or more of the contacts to the entry/exit DNA were lost, a relaxation of constraints on the entry/exit angle would occur, which might aid in the folding of the extended fiber into the condensed form.
Can chromatin fibers condense in the absence of linker histones? There is ample evidence that, in some sense, they do. Thoma et al. (6) describe the formation of “clumps” of nucleosomes when H1-stripped chromatin is subjected to 100 mM salt. Schwarz and Hansen (186) have shown that the sedimentation coefficient of reconstituted dodecameric oligonucleosomes (minus linker histones) increases markedly on increase of salt concentration. The distribution of sedimentation coefficients exhibits a limit at about 55 S, before aggregation begins (although more slowly sedimenting molecular species are also present). It is suggestive that 55 S is also the approximate value expected for a two-turn helical coil. However, it is very difficult to distinguish such a structure from a globular aggregate of twelve nucleosomes, which would have a similar sedimentation coefficient.
Proof of helix formation would require using much longer oligonucleosome chains. It is also possible that formation of a regular helical structure will be much easier if nucleosomes are regularly spaced, as they are in such constructs. At the moment, we can state that the high salt condensation of chromatin does not require H1, but there is much evidence that the formation of physiologically relevant condensed fiber does need these histones (5, 6).
III. What Do We Know? What Do We Need to Learn?
Our knowledge of the linker histones is remarkably complete in many ways, and yet surprisingly limited in others. We now know a great deal about
the proteins, including structures of at least their globular regions. We are beginning to understand quite a bit about how they interact with DNA in its various topological forms, and have some insight into the location of linker histone in the chromatosome. Although we know much more than we did a decade ago, two key questions still await resolution: (1) Precisely where are the linker histones located (with respect to nucleosomes) in the condensed fiber, and (2) Just what role do they play in that condensation? Both are difficult questions to approach, and both would be easier if we knew more about the structure of the condensed fiber. It seems likely that some of this structural information will be forthcoming from studies using new microscopic techniques, such as X-ray contact microscopy and scanning-force microscopy, both of which can be carried out in aqueous solution. These may also allow us to follow the condensation of the chromatin fiber in a “dynamic” way, as the ionic environment is continuously changed. Nevertheless, this kind of study will not in itself answer the above questions. A useful complementary approach will be to make use of reconstituted polynucleosome fibers, formed on tandem repeats of nucleosome “positioning” sequences. Here we may hope to obtain the regularity of structure that will allow high-resolution scattering studies, and the flexibility to adjust the nature and amount of the linker histones added. A study of such fibers using neutron scattering, especially if deuterated linker histones were employed, might yield important information. Cross-linking studies, using the Mirzabekov technique (170), might provide much more definitive results with such model systems than with heterogeneous native chromatin. It must always be kept in mind, however, that such artificial constructs can be misleading because of special properties of regular systems. Nevertheless, they seem to offer hope of clarity in a confused field.
Another aspect of linker histone location that has not been sufficiently explored has to do with the nature and significance of strong binding sites on DNA. How frequent are they, and what do they do? In this area, there is simply the need for more exploratory research. Most important of all, and clearly tied to the above questions, is the problem of how linker histones relate to transcription. We now know that linker histones are present in actively transcribed chromatin, albeit probably in reduced amounts. What features determine the content of linker histones in specific chromatin regions? How can this content be regulated in accordance with the needs of the cell for expression of specific genes? Do other modifications associated with active chromatin modify linker histone/DNA interactions so as to weaken higher order structure? Advances on these questions may yield a key to the more general problem of selective gene expression as the basis for differentiation and development.
ACKNOWLEDGMENTS
This work was supported in part by National Institutes of Health Grant GM50276 to K. V. H. and J. Z.
REFERENCES
1. K. van Holde, “Chromatin.” Springer-Verlag, Berlin and New York, 1988.
2. A. L. Olins and D. E. Olins, J. Cell Biol. 59, 252a (1973).
3. A. L. Olins and D. E. Olins, Science 183, 330 (1974).
4. C. L. F. Woodcock, J. Cell Biol. 59, 368a (1973).
5. J. T. Finch and A. Klug, PNAS 73, 1897 (1976).
6. F. Thoma, Th. Koller and A. Klug, J. Cell Biol. 83, 403 (1979).
7. G. De Murcia and T. Koller, Biol. Cell 40, 165 (1981).
8. R. Tsanev, G. Russev, I. Pashev and J. Zlatanova, “Replication and Transcription of Chromatin.” CRC Press, Boca Raton, FL, 1992.
9. L. A. Freeman and W. T. Garrard, Crit. Rev. Euk. Gene Expr. 2, 165 (1992).
10. K. van Holde and J. Zlatanova, JBC 270, 8373 (1995).
11. M. H. J. Koch, in “Protein-Nucleic Acid Interactions” (W. Saenger and V. Heineman, eds.), p. 163. MacMillan, London, 1989.
12. J. Zlatanova, S. H. Leuba, G. Yang, C. Bustamante and K. van Holde, PNAS 91, 5277 (1994).
13. S. H. Leuba, G. Yang, C. Robert, B. Samori, K. van Holde, J. Zlatanova and C. Bustamante, PNAS 91, 11621 (1994).
14. G. Yang, S. H. Leuba, C. Bustamante, J. Zlatanova and K. van Holde, Nature Struct. Biol. 1, 761 (1994).
15. C. L. Woodcock, S. A. Grigoryev, R. A. Horowitz and N. Whitaker, PNAS 90, 9021 (1993).
16. R. A. Horowitz, D. A. Agard, J. W. Sedat and C. L. Woodcock, J. Cell Biol. 125, 1 (1994).
17. Y. Kinjo, K. Shinohara, A. Ito, H. Nakano, M. Watanabe, Y. Horiike, Y. Kikuchi, M. C. Richardson and K. A. Tanaka, J. Microsc. 176, 63 (1994).
18. J. Zlatanova, Trends Biochem. Sci. 15, 273 (1990).
19. J. Zlatanova and K. van Holde, J. Cell Sci. 103, 889 (1992).
20. R. D. Cole, Int. J. Peptide Prot. Res. 39, 433 (1987).
21. J. M. Neelin, P. X. Callahan, D. C. Lamb and K. Murray, Can. J. Biochem. 42, 1743 (1964).
22. B. L. A. Miki and J. M. Neelin, Can. J. Biochem. 53, 1158 (1975).
23. S. M. Seyedin and W. S. Kistler, JBC 255, 5949 (1980).
24. K. M. Newrock, C. R. Alfageme, R. V. Nardi and L. H. Cohen, CSHSQB 42, 421 (1978).
25. R. C. Smith, E. Dworkin-Rastl and M. B. Dworkin, Genes Dev. 2, 1284 (1988).
26. S. Dimitrov, G. Almouzni, M. Dasso and A. P. Wolffe, Dev. Biol. 160, 214 (1993).
27. J. Zlatanova and D. Doenecke, FASEB J. 8, 1260 (1994).
28. P. G. Hartman, G. E. Chapman, T. Moss and E. M. Bradbury, EJB 77, 45 (1977).
29. G. M. Clore, A. M. Gronenborn, M. Nilges, D. K. Sukumaran and J. Zarbock, EMBO J. 6, 1833 (1987).
30. C. Cerf, G. Lippens, S. Muyldermans, A. Segers, V. Ramakrishnan, S. J. Wodak, K. Hallenga and L. Wyns, Bchem 32, 11345 (1993).
31. V. Ramakrishnan, J. T. Finch, V. Graziano, P. L. Lee and R. M. Sweet, Nature 362, 219 (1993).
32. K. L. Clark, E. D. Halay, E. Lai and S. K. Burley, Nature 364, 412 (1993).
33. M. Wu, C. D. Allis, R. Richman, R. G. Cook and M. A. Gorovsky, PNAS 83, 8674 (1986).
34. L. J. Hauser, M. L. Treat and D. E. Olins, NARes 21, 3586 (1993).
35. F. Azorin, C. Olivarez, A. Jordan, L. Perez-Grau, L. Cornudella and J. A. Subirana, Exp. Cell Res. 148, 331 (1983).
36. B. O. Glotov, L. G. Nikolaev, V. K. Dashkevich and S. F. Barbashov, BBA 824, 185 (1985).
37. M. J. Smerdon and I. Isenberg, Bchem 15, 4233 (1976).
38. E. Russo, V. Giancotti, C. Crane-Robinson and G. Geraci, Int. J. Bchem. 15, 487 (1983).
39. A. F. Protas, S. N. Kharpunov and G. D. Berdishev, Ukr. Biokhim. (Russian) 58, 22 (1986).
40. J. D. Maman, T. D. Yager and J. Allan, Bchem 33, 1300 (1994).
41. J. O. Thomas and A. J. A. Khabaza, EJB 112, 501 (1980).
42. P. H. Draves, P. T. Lowary and J. Widom, JMB 225, 1105 (1992).
43. M. A. Tarnowka, C. Baglioni and C. Basilico, Cell 15, 163 (1978).
44. S. Dalton, J. R. Coleman and J. R. E. Wells, MCBiol. 6, 601 (1986).
45. D. Doenecke, W. Albig, H. Bouterfa and B. Drabent, J. Cell. Biochem. 54, 423 (1994).
46. E. M. Bradbury, R. J. Inglis and H. R. Matthews, Nature 247, 257 (1974).
47. S. Y. Roth and C. D. Allis, Trends Biochem. Sci. 17, 93 (1992).
48. J. Zlatanova and J. Yaneva, DNA Cell Biol. 10, 239 (1991).
49. D. Ring and R. D. Cole, JBC 258, 15361 (1983).
50. D. J. Clark and J. O. Thomas, JMB 187, 569 (1986).
51. T. Diez-Caballero, F. X. Aviles and A. Albert, NARes 9, 1383 (1981).
52. D. J. Clark and J. O. Thomas, EJB 178, 225 (1988).
53. D. E. Olins and A. L. Olins, JMB 57, 437 (1971).
54. R. D. Cole, G. M. Lawson and M. W. Hsiang, CSHSQB 42, 253 (1977).
55. A. T. Rodriguez, L. Perez, F. Morán, F. Montero and P. Suau, Biophys. Chem. 39, 145 (1991).
56. G. S. Manning, Q. Rev. Biophys. 11, 178 (1978).
57. R. W. Wilson and V. A. Bloomfield, Bchem 18, 2192 (1979).
58. V. A. Bloomfield, Biopolymers 31, 1471 (1991).
59. A. Belmont and C. Nicolini, J. Theor. Biol. 90, 169 (1981).
60. J. A. Subirana, FEBS Lett. 302, 105 (1992).
61. F. Watanabe, BBRC 172, 1129 (1990).
62. D. J. Clark and T. Kimura, JMB 211, 883 (1990).
63. G. S. Manning, Biopolymers 18, 2929 (1979).
64. M. Watanabe, NARes 14, 3573 (1986).
65. J. O. Thomas, C. Rees and J. T. Finch, NARes 20, 187 (1992).
67. T. Vogel and M. F. Singer, PNAS 72, 2597 (1975).
68. T. Vogel and M. F. Singer, JBC 251, 2334 (1976).
69. C. Iovcheva and G. N. Dessev, Mol. Biol. Rep. 6, 21 (1980).
70. J. Yaneva, J. Zlatanova, E. Paneva, L. Srebreva and R. Tsanev, FEBS Lett. 263, 225 (1990).
71. D. Krylov, S. Leuba, K. van Holde and J. Zlatanova, PNAS 90, 5052 (1993).
72. D. M. J. Lilley, Nature 357, 282 (1992).
73. D. R. Duckett, A. I. H. Murchie, A. Bhattacharyya, R. M. Clegg, S. Diekmann, E. von Kitzing and D. M. J. Lilley, EJB 207, 285 (1992).
74. M. E. Bianchi, M. Beltrame and G. Paonessa, Science 243, 1056 (1989).
75. P. Varga-Weisz, K. van Holde and J. Zlatanova, JBC 268, 20699 (1993).
76. P. Varga-Weisz, J. Zlatanova, S. H. Leuba, G. P. Schroth and K. van Holde, PNAS 91, 3525 (1994).
77. P. Varga-Weisz, K. van Holde and J. Zlatanova, BBRC 203, 1904 (1994).
78. J. Zlatanova and J. Yaneva, Mol. Biol. Rep. 15, 53 (1991).
79. M. Suzuki, EMBO J. 8, 797 (1989).
80. M. E. A. Churchill and M. Suzuki, EMBO J. 8, 4189 (1989).
81. L. N. Marekov, G. Angelov and B. Beltchev, Biochimie 60, 1347 (1978).
82. M. Leng and G. Felsenfeld, PNAS 56, 1325 (1966).
83. M. Renz, PNAS 72, 733 (1975).
84. S. L. Berent and J. S. Sevall, Bchem 23, 2977 (1984).
85. J. S. Sevall, Bchem 27, 5038 (1988).
86. U. Pauli, J. F. Chiu, P. Ditullio, P. Kroeger, V. Shalhoub, T. Rowe, G. Stein and J. Stein, J. Cell. Physiol. 139, 320 (1989).
87. J. Yaneva and J. Zlatanova, DNA Cell Biol. 11, 91 (1992).
87a. J. Yaneva, G. P. Schroth, K. E. van Holde and J. Zlatanova, PNAS 92, 7060 (1995).
88. A. P. Wolffe, J. Cell Sci. 107, 2055 (1994).
89. A. Jerzmanowski and R. D. Cole, JBC 265, 10726 (1990).
90. M. J. Smerdon and I. Isenberg, Bchem 15, 4242 (1976).
91. S. S. Yu and T. G. Spring, BBA 492, 20 (1977).
92. P. D. Cary, K. V. Shooter, G. H. Goodwin, E. W. Johns, J. Y. Olayemi, P. G. Hartman and E. M. Bradbury, BJ 183, 657 (1979).
93. L. A. Kohlstaedt and R. D. Cole, Bchem 33, 570 (1994).
94. J. B. Jackson, J. M. Pollock, Jr. and R. L. Rill, Bchem 18, 3739 (1979).
95. S. C. Albright, J. M. Wiseman, R. A. Lange and W. T. Garrard, JBC 255, 3673 (1980).
96. J. Bernués and E. Querol, BBA 1008, 52 (1989).
97. K. Grade, C.-U. von Mickwitz, R. Messelwitz and R. Lindigkeit, Studia Biophys. 89, 1 (1982).
98. L. A. Kohlstaedt, E. C. Sung, A. Fujishige and R. D. Cole, JBC 262, 524 (1987).
99. M. Carballo, P. Puigdomènech and J. Palau, EMBO J. 2, 1759 (1983).
100. E. Espel, J. Bernués, E. Querol, P. Martinez, A. Barris and J. Lloberas, BBRC 117, 817 (1983).
101. J. Palvimo and P. H. Maenpa, BBA 952, 172 (1988).
102. P. J. Alfonso, M. P. Crippa, J. J. Hayes and M. Bustin, JMB 236, 189 (1994).
103. V. Graziano and V. Ramakrishnan, JMB 214, 897 (1990).
104. R. T. Simpson, Bchem 17, 5524 (1978).
105. A. V. Belyavsky, S. G. Bavykin, E. G. Goguadze and A. D. Mirzabekov, JMB 139, 519 (1980).
106. W. M. Bonner, NARes 5, 71 (1978).
107. B. O. Glotov, A. V. Itkes, L. G. Nikolaev and E. S. Severin, FEBS Lett. 91, 149 (1978).
108. D. Ring and R. D. Cole, JBC 254, 11688 (1979).
109. T. Boulikas, J. M. Wiseman and W. T. Garrard, PNAS 77, 127 (1980).
110. E. Espel, J. Bernués, J. A. Pérez-Pons and E. Querol, BBRC 132, 1031 (1985).
111. J.-L. Banères, L. Essalouh, I. Jariel-Encontre, D. Mesnier, S. Garrod and J. Parello, JMB 243, 48 (1994).
112. K. Ura, A. P. Wolffe and J. J. Hayes, JBC 269, 27171 (1994).
113. J. J. Hayes and A. P. Wolffe, PNAS 90, 6415 (1993).
114. J. Allan, P. J. Hartman, C. Crane-Robinson and F. X. Aviles, Nature 288, 675 (1980).
115. A. D. Mirzabekov, D. V. Pruss and K. K. Ebralidse, JMB 211, 479 (1989).
116. J. J. Hayes, D. Pruss and A. P. Wolffe, PNAS 91, 7817 (1994).
117. F. J. Aviles, G. E. Chapman, G. G. Kneale, C. Crane-Robinson and E. M. Bradbury, EJB 88, 363 (1978).
118. D. Z. Staynov and C. Crane-Robinson, EMBO J. 7, 3685 (1988).
119. S. Lambert, S. Muyldermans, J. Baldwin, J. Kilner, K. Ibel and L. Wijns, BBRC 179, 810 (1991).
120. S. Muyldermans and A. A. Travers, JMB 235, 855 (1994).
121. D. J. Clark, C. S. Hill, S. R. Martin and J. O. Thomas, EMBO J. 7, 69 (1988).
122. J. Allan, T. Mitchell, N. Harborne, L. Bohm and C. Crane-Robinson, JMB 187, 591 (1986).
123. G. S. Manning, K. K. Ebralidse, A. D. Mirzabekov and A. Rich, J. Biomol. Struct. Dynam. 6, 877 (1989).
124. J. K. Strauss and L. J. Maher, Science 266, 1829 (1994).
125. L. G. Nikolaev, B. O. Glotov, A. V. Itkes and E. S. Severin, FEBS Lett. 125, 20 (1981).
126. L. G. Nikolaev, B. O. Glotov, V. K. Dashkevich, S. F. Barbashov and E. S. Severin, FEBS Lett. 163, 66 (1983).
127. E. Kotthaus and W. H. Stratling, JBC 259, 13640 (1984).
128. A. C. Lennard and J. O. Thomas, EMBO J. 4, 3455 (1985).
129. M. L. Wilhelm, A. Mazen and F. X. Wilhelm, FEBS Lett. 79, 404 (1977).
130. J.-M. Sun, Z. Ali, R. Lurz and A. Ruiz-Carrillo, EMBO J. 9, 1651 (1990).
131. P. D. Greenwood, J. J. Heikkila and I. R. Brown, Neurochem. Res. 7, 525 (1982).
132. N. Hsiung and R. Kucherlapati, J. Cell Biol. 87, 227 (1980).
133. A. Worcel, S. Han and M. L. Wong, Cell 15, 969 (1978).
134. S. Bavykin, L. Srebreva, T. Banchev, R. Tsanev, J. Zlatanova and A. Mirzabekov, PNAS 90, 3918 (1993).
135. A. Stein and P. Kunzler, Nature 302, 549 (1983).
136. P. Kunzler and A. Stein, Bchem 22, 1783 (1983).
137. A. Rodriguez-Campos, A. Shimamura and A. Worcel, JMB 209, 135 (1989).
138. K. Liu, J. D. Lauderdale and A. Stein, MCBiol. 13, 7596 (1993).
139. M. Bellard, G. Dretzen, F. Bellard, P. Oudet and P. Chambon, EMBO J. 1, 223 (1982).
140. A. M. Campbell, R. I. Cotter and J. F. Pardon, NARes 5, 1571 (1978).
141. C. Marion, P. Bezot, C. Hesse-Bezot, B. Roux and J.-C. Bernengo, EJB 120, 169 (1981).
142. J. A. Subirana, S. Muñoz-Guerra, M. Radermacher and J. Frank, J. Biomol. Struct. Dynam. 1, 705 (1983).
143. C. Marion, J. Biomol. Struct. Dynam. 2, 303 (1984).
144. L. Perez-Grau, J. Bordas and M. H. J. Koch, NARes 12, 2987 (1984).
145. J. Bordas, L. Perez-Grau, M. H. J. Koch, M. C. Vega and C. Nave, Eur. Biophys. J. 13, 157 (1986).
146. J. Bordas, L. Perez-Grau, M. H. J. Koch, M. C. Vega and C. Nave, Eur. Biophys. J. 13, 175 (1986).
147. S. P. Williams, B. D. Athey, L. J. Muglia, R. S. Schappe, A. H. Gough and J. P. Langmore, Biophys. J. 49, 233 (1986).
148. S. E. Gerchman and V. Ramakrishnan, PNAS 84, 7802 (1987).
149. A. D. Gruzdev and G. P. Kishchenko, Biofizika 26, 249 (1981).
150. D. Z. Martin, R. D. Todd, D. Lang, P. N. Pei and W. T. Garrard, JBC 252, 8269 (1977).
151. F. Strauss and A. Prunell, NARes 10, 2275 (1982).
152. S. Strogatz, J. Theor. Biol. 103, 601 (1983).
153. M. J. Smerdon and M. W. Lieberman, JBC 256, 2480 (1981).
154. C. Houssier, I. Lasters, S. Muyldermans and L. Wyns, Int. J. Biol. Macromol. 3, 370 (1981).
155. C. Houssier, I. Lasters, S. Muyldermans and L. Wyns, NARes 9, 5763 (1981).
156. M. K. Cowman and G. D. Fasman, Bchem 19, 532 (1980).
157. G. Felsenfeld, Nature 355, 219 (1992).
158. D. S. Gilmour, A. R. Buchman and J. L. Workman, in “Transcription: Mechanisms and Regulation” (R. C. Conaway and J. W. Conaway, eds.), p. 515. Raven Press, New York, 1994.
159. F. Thoma, BBA 1130, 1 (1992).
160. Q. Lu, L. L. Wallrath and S. C. R. Elgin, J. Cell. Biochem. 55, 83 (1994).
161. G. Meersseman, S. Pennings and E. M. Bradbury, JMB 220, 89 (1991).
162. S. Pennings, G. Meersseman and E. M. Bradbury, PNAS 91, 10275 (1994).
163. C. C. Chipev and A. P. Wolffe, MCBiol. 12, 45 (1992).
164. S. Jeong, J. D. Lauderdale and A. Stein, JMB 222, 1131 (1991).
165. J. D. Lauderdale and A. Stein, NARes 20, 6589 (1992).
166. C. L. Woodcock, J. Cell Biol. 125, 11 (1994).
167. J. Widom and A. Klug, Cell 43, 207 (1985).
168. D. Sen, S. Mitra and D. M. Crothers, Bchem 25, 3441 (1986).
169. V. L. Karpov, S. G. Bavykin, O. V. Preobrazhenskaya, A. V. Belyavsky and A. D. Mirzabekov, NARes 10, 4321 (1982).
170. S. G. Bavykin, S. I. Usachenko, A. O. Zalensky and A. D. Mirzabekov, JMB 212, 495 (1990).
171. J. D. McGhee, J. M. Nickol, G. Felsenfeld and D. C. Rau, Cell 33, 831 (1983).
172. S. H. Leuba, J. Zlatanova and K. van Holde, JMB 235, 871 (1994).
173. P. J. Hagerman, Annu. Rev. Biophys. Biophys. Chem. 17, 265 (1988).
174. J. Yao, P. T. Lowary and J. Widom, PNAS 87, 7603 (1990).
175. J. Yao, P. T. Lowary and J. Widom, Bchem 30, 8408 (1991).
176. P. J. G. Butler and J. O. Thomas, JMB 140, 505 (1980).
177. J. Dubochet and M. Noll, Science 202, 280 (1978).
178. J. R. Pehrson, PNAS 86, 9149 (1989).
179. S. Mitra, D. Sen and D. M. Crothers, Nature 308, 247 (1984).
180. J. S. Zlatanova, MCBchem 92, 1 (1990).
181. S. I. Dimitrov, V. R. Russanova and I. G. Pashev, EMBO J. 6, 2387 (1987).
182. S. H. Leuba, J. Zlatanova and K. van Holde, JMB 229, 917 (1993).
183. V. Graziano, S. E. Gerchman, D. K. Schneider and V. Ramakrishnan, Nature 368, 351 (1994).
184. A. V. Itkes, B. O. Glotov, L. G. Nikolaev, S. R. Preem and E. S. Severin, NARes 8, 507 (1980).
185. R. C. Krueger, Mol. Biol. Rep. 11, 189 (1986).
186. P. M. Schwarz and J. C. Hansen, JBC 269, 16284 (1994).
187. B. J. Sugarman, J. B. Dodgson and J. D. Engel, JBC 258, 9005 (1983).
188. D. Doenecke and R. Tonjes, JMB 187, 461 (1986).
189. G. Briand, D. Kmiecik, P. Sautiere, D. Wouters, O. Borie-Loy, G. Biserte, A. Mazen and M. Champagne, FEBS Lett. 112, 147 (1980).
Development of Antisense and Antigene Oligonucleotide Analogs
PAUL S. MILLER
Department of Biochemistry, School of Hygiene and Public Health, The Johns Hopkins University, Baltimore, Maryland
I. Nuclease-resistant Antisense Oligonucleotide Analogs; Oligonucleoside Methylphosphonates
   A. Structure and Chemical Synthesis
   B. Hybridization with Complementary Oligonucleotides
   C. Psoralen-conjugated Oligonu
   D. Uptake by Cells in Culture
   E. Biological Activity
II. Antigene Oligonucleotide Analogs
   A. Targeting G·C Base-pairs in Duplex DNA with 8-Oxoadenine
   B. Targeting C·G and T·A Interruptions in Homopurine Tracts
   C. Triplex Formation by Oligonucleoside Methylphosphonates
III. Conclusion
References
In recent years, there has been considerable interest in exploring the possibility of using oligonucleotides to control gene expression. Besides the prospect of developing agents that can be used to study gene function in cells in culture, it is also clear that such reagents might have considerable therapeutic potential and could provide a means of rationally designing drugs of high selectivity. Two types of oligonucleotides, antisense oligonucleotides and antigene oligonucleotides, have been developed for this purpose. Antisense oligonucleotides are designed to interact with cytoplasmic messenger RNA or with precursor mRNA in the nucleus. The oligomers are complementary to and designed to bind to a functional region of the mRNA, such as the initiation codon region or a splice junction of precursor mRNA. As a consequence of binding, translation or splicing of the mRNA is prevented. There are a number of mechanisms by which this might occur, including prevention of assembly of ribosomes or degradation of the mRNA by nucleases.
Abbreviations: d-OMP, oligodeoxyribonucleoside methylphosphonate; mr-OMP, oligo-2'-O-methylribonucleoside methylphosphonate.
Progress in Nucleic Acid Research and Molecular Biology, Vol. 52. Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.
Although the precise mode of action of antisense oligonucleotides is often not well understood, it does appear to depend on the nature of the oligonucleotide and the site on the mRNA to which the oligonucleotide is targeted. Antigene oligonucleotides are designed to bind to double-stranded DNA via the formation of triple-stranded complexes (triplexes). Thus antigene oligonucleotides directly target the gene at the DNA level. In the current state of development, the targets for antigene oligonucleotides are homopurine tracts in double-stranded DNA. Such tracts, which consist of runs of As and Gs, are frequently found in the promoter regions of eukaryotic DNA. Thus, in principle, it should be possible to use antigene oligonucleotides to prevent binding of RNA polymerase and/or transcription factors to the gene promoter region and thereby inhibit gene transcription. Antisense oligonucleotides depend on Watson-Crick hydrogen-bonding interactions between the oligomer and its target, whereas antigene oligonucleotides make use of Hoogsteen or reversed-Hoogsteen hydrogen-bonding schemes. In either case, the binding interactions are well understood and thus, in principle at least, it should be possible to design an antisense or antigene oligonucleotide using only the sequence information of the targeted gene. With the advent of modern sequencing techniques, such information is often available or can readily be obtained. However, in practice, many other factors intervene in determining the effectiveness of a particular oligomer. These include the ability of the oligomer to be taken up by the cell, the stability of the oligomer in the intracellular environment, and the ability of the oligomer to reach and interact selectively with the chosen target. These requirements have led to the development of a wide variety of oligonucleotide analogs.
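The sequence-only design step described above can be illustrated with a short sketch. The Python fragment below is an illustration only, not a procedure taken from this report: the function names, the example sequences, and the default minimum tract length of 10 bases are all assumptions chosen for the example. It derives the Watson-Crick complement of a target region as a candidate antisense sequence and scans one DNA strand for runs of A and G as candidate antigene targets.

```python
import re

# Watson-Crick partners; U is accepted so that mRNA targets can be given directly.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def antisense_dna(target: str) -> str:
    """Return the DNA sequence (written 5'->3') complementary to a target
    RNA or DNA region that is itself written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(target.upper()))


def homopurine_tracts(strand: str, min_length: int = 10):
    """Return (start_index, tract) pairs for every run of A/G on the given
    strand that is at least min_length bases long.

    min_length is an illustrative threshold, not a value from the text.
    Tracts on the opposite strand appear here as pyrimidine runs, so the
    complementary strand would have to be scanned separately.
    """
    pattern = re.compile(r"[AG]{%d,}" % min_length)
    return [(m.start(), m.group()) for m in pattern.finditer(strand.upper())]
```

Calling `antisense_dna` on a hypothetical initiation-codon region gives the oligomer sequence to synthesize, and `homopurine_tracts` lists candidate triplex sites in a promoter sequence. The sketch deliberately ignores cellular uptake, intracellular stability, and target accessibility, which the text identifies as the factors that decide effectiveness in practice.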
Many excellent reviews and monographs have been written describing the chemistry and biological activity of various antisense and antigene oligonucleotides (1-17). This report focuses on two lines of research from my laboratory that are aimed at the development of antisense and antigene oligonucleotide analogs. The first line of research involves the development of nuclease-resistant antisense oligonucleotide analogs, the oligonucleoside methylphosphonates. The second line of research involves the development of nucleoside and oligonucleotide analogs that can be used to recognize sequences in double-stranded DNA.
I. Nuclease-resistant Antisense Oligonucleotide Analogs; Oligonucleoside Methylphosphonates
Although the pioneering work of Zamecnik and Stephenson (18) showed that oligodeoxyribonucleotides could be used as antisense agents in cell
cultures, it was clear that the phosphodiester linkages of these oligomers are susceptible to nuclease hydrolysis, thus rendering them unsuitable for therapeutic applications. Such oligonucleotides are rapidly degraded when injected into animals (19). For this reason, many workers have expended considerable effort to design and synthesize oligonucleotide analogs with nuclease-resistant backbones. Such analogs include oligonucleotide phosphorothioates (20) and phosphorodithioates (21-23), α-anomeric oligonucleotides (24, 25), and peptide nucleic-acid analogs (26). My efforts have focused on the oligonucleoside methylphosphonates.
A. Structure and Chemical Synthesis
As shown in Fig. 1, oligonucleoside methylphosphonates contain a modified 3'-5' internucleotide phosphodiester bond in which one of the negatively charged, nonbridging oxygens has been replaced with a methyl group. As a consequence, the methylphosphonate linkage is nonionic. In addition, the phosphorus atom is now chiral and each methylphosphonate linkage can exist in an Rp or Sp configuration. The methylphosphonate group is tetrahedral and, as shown by X-ray diffraction (27), the bond lengths and bond angles are very similar to those found in the natural phosphodiester linkage.
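Because each methylphosphonate linkage can be Rp or Sp, the number of possible diastereomers of an oligomer grows as a power of two with the number of such linkages. The sketch below simply works through that arithmetic; it assumes that the configuration at each linkage is independent of the others, and the function names are hypothetical, introduced only for this illustration.

```python
from itertools import product


def diastereomer_count(n_methylphosphonate_linkages: int) -> int:
    """Number of distinct Rp/Sp assignments for an oligomer, assuming each
    methylphosphonate linkage is an independent stereocenter."""
    if n_methylphosphonate_linkages < 0:
        raise ValueError("number of linkages cannot be negative")
    return 2 ** n_methylphosphonate_linkages


def configurations(n_methylphosphonate_linkages: int):
    """Iterate over every Rp/Sp assignment, listed linkage by linkage.
    Practical only for short oligomers, since the count doubles per linkage."""
    return product(("Rp", "Sp"), repeat=n_methylphosphonate_linkages)
```

For example, an 18-mer in which all 17 internucleotide linkages are methylphosphonates would have `diastereomer_count(17)` possible configurations, whereas a dimer with a single linkage has two.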
FIG. 1. Structures of an oligonucleoside methylphosphonate, 1, and a nucleoside methylphosphonamidite synthon, 2. In both structures, a: R = H; b: R = -OCH3.
Oligonucleoside methylphosphonates with two different types of sugar-phosphonate backbone have been prepared. The oligodeoxyribonucleoside methylphosphonates (d-OMP), 1a, contain 2'-deoxyribose sugars and are similar in structure to oligodeoxyribonucleotides (28, 29). The oligo-2'-O-methylribonucleoside methylphosphonates (mr-OMP), 1b, contain 2'-O-methylribose and are analogs of oligoribonucleotides (30). Oligoribonucleoside methylphosphonates containing 3'-5' methylphosphonate linkages are not stable. This instability is presumably due to intramolecular attack by the 2'-hydroxyl on the neutral methylphosphonate linkage with subsequent cleavage of the 3' O-P or 5' O-P bond (P. S. Miller, unpublished results). Interestingly, the synthesis and isolation of 2'-5'-linked oligoribonucleoside methylphosphonates have been achieved (31). Although d-OMPs and mr-OMPs are nonionic, they are quite soluble in water. The solubility depends on the base composition and to some extent on the sequence of the oligomer. Oligomers that contain high percentages of G tend to be less soluble than oligomers with less G. Nonetheless, solubilities up to 1.7 mM have been reported for 18-mers in water (32). Both d-OMPs and mr-OMPs are stable in neutral aqueous solutions for prolonged periods of time. We have stored oligomers in water or in 50% aqueous acetonitrile for months at 4°C with no apparent degradation. The methylphosphonate linkages of these oligomers are cleaved under basic conditions and in the presence of certain amines such as piperidine. Complete hydrolysis by piperidine results in mixtures of nucleosides and nucleoside 3' or 5' O-methylphosphonates. Treatment of d-OMPs with 1 N HCl at 37°C results in depurination. In contrast, the N-glycosyl bond of purines in mr-OMPs is completely resistant to hydrolysis by 1 N HCl, even on prolonged treatment (32). Unlike abasic sites in oligodeoxyribonucleotides, an abasic site in a d-OMP undergoes spontaneous cleavage, even at pH 7.
The sugar residue at the abasic site is completely removed, leaving two oligomers, one of which terminates in a 5' O-methylphosphonate group and the other in a 3' O-methylphosphonate group (33). This facile cleavage presumably results from intramolecular attack by the 4'-hydroxyl group at the abasic site on the adjacent methylphosphonate linkages.
Oligonucleoside methylphosphonates can be prepared by a variety of methods. Early work employed coupling of protected 5'-O-dimethoxytritylnucleoside-3'-O-methylphosphonates using dicyclohexylcarbodiimide or arenesulfonyl chlorides or tetrazolides as activating agents (28, 33). Later studies used protected nucleoside 3'-O-methylphosphonic chlorides or imidazolides as synthons for solid-phase synthesis on silica or polystyrene supports (34, 35). The advent of phosphoramidite chemistry resulted in the development of
ANTISENSE AND ANTIGENE OLIGONUCLEOTIDES
protected 5'-O-dimethoxytritylnucleoside 3'-O-N,N-bis-diisopropylaminomethylphosphonamidites (structure 2 in Fig. 1) (36-38). These synthons are commercially available. The base-protecting groups include benzoyl for A and isobutyryl for G and C. In the presence of a catalyst (tetrazole), these synthons allow rapid, high-yield syntheses to be carried out in automated DNA synthesizers on controlled-pore glass supports (39). It is most convenient to synthesize the oligomer in such a way that the 5'-terminal internucleotide bond is a phosphodiester linkage. This is readily accomplished by using a standard protected nucleoside 3'-O-β-cyanoethyl-N,N-bis-diisopropylaminophosphoramidite as the last synthon in the synthesis. The protected oligomers are cleaved from the support and the base-protecting groups are removed by a brief treatment with ammonium hydroxide followed by incubation with ethylenediamine for 6 hours at room temperature (40). Oligomers that contain a 5'-terminal internucleotide phosphodiester linkage are readily purified by ion-exchange chromatography on a weak anion-exchanger such as DEAE-cellulose. Short "failure" sequences are uncharged and are therefore not absorbed by the column. The singly charged, full-length oligomer is eluted from the column with dilute sodium phosphate buffer and can be further purified by C18 reversed-phase HPLC.
Methylphosphonate oligomers can be characterized by a variety of methods. Their molecular weights can be confirmed by electrospray mass spectrometry. Oligomers containing a 5'-terminal phosphodiester linkage can be phosphorylated using polynucleotide kinase and ATP. 5'-Phosphorylated oligomers migrate according to chain length on polyacrylamide gels, and the sequence of the oligomers can be characterized by chemical sequencing procedures similar to those used to characterize oligodeoxyribonucleotides (41).
B. Hybridization with Complementary Oligonucleotides
Oligonucleoside methylphosphonates form stable duplexes with complementary single-stranded nucleic acids. Initially, studies were carried out on the interactions between d-OMPs and synthetic complementary oligodeoxyribonucleotides. The melting temperatures in 0.1 M sodium chloride of duplexes formed by d-OMPs are similar to those formed by unmodified oligodeoxyribonucleotides (42). In the absence of salt, the melting temperature of the d-OMP·DNA duplex remains essentially unchanged, whereas that of the DNA·DNA duplex decreases dramatically. This difference in behavior is attributed to the lack of charge repulsion between the nonionic sugar-methylphosphonate backbone of the d-OMP and the negatively charged sugar-phosphodiester backbone of the target.
The stability of duplexes formed between oligonucleoside methylphosphonates and their complementary targets depends on the configuration of
the methylphosphonate linkage. This was originally shown to be the case for oligothymidylates that contain alternating methylphosphonate/phosphodiester linkages (43). Oligomers containing Rp methylphosphonate linkages form more stable duplexes than those containing Sp linkages. Studies with oligomers containing single methylphosphonate linkages show that the Sp linkage lowers the duplex melting temperature by 2-3°C. Similar results have been observed by others for oligomers having mixed sequences of bases (44-48).
The stabilities of duplexes formed between d-OMPs and RNA targets are lower than those of d-OMP·DNA duplexes of comparable sequence (42, 43). This may reflect the inherently lower stability observed for DNA·RNA vs DNA·DNA oligonucleotide duplexes (49). The observation that oligo-2'-O-methylribonucleotides form exceedingly stable duplexes with complementary RNA suggests that the corresponding methylphosphonate analogs should behave in a similar manner. This is indeed the case. For example, the melting temperature of an mr-OMP·RNA 12-mer duplex is at least 15°C higher than that of the corresponding d-OMP·RNA duplex (42).
As is the case for d-OMP·DNA duplexes, the configuration of the methylphosphonate linkage determines oligonucleoside methylphosphonate·RNA duplex stability. Oligo-2'-O-methylribonucleotides that contain a single Rp or Sp methylphosphonate linkage were prepared. This linkage was placed in the middle or at the 5' end of the dodecamer. Although the oligomer·RNA duplex containing the Rp linkage had the same melting temperature as the corresponding RNA·RNA duplex, the melting temperature of the duplex containing the Sp linkage was reduced by 2 to 4°C (42).
When oligonucleoside methylphosphonates are synthesized on controlled-pore glass supports, each methylphosphonate linkage consists of both Rp and Sp isomers, and consequently an oligomer containing n linkages is a mixture of 2^n diastereoisomers.
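The size of this diastereoisomeric mixture follows from the two possible configurations at each linkage. The short Python sketch below illustrates the counting; the function names are illustrative and do not come from the original work, and the 12-mer example simply assumes one 5'-terminal phosphodiester linkage as described in the synthesis section.

```python
from itertools import product


def count_diastereoisomers(n_linkages: int) -> int:
    """Each methylphosphonate linkage is Rp or Sp, so n linkages give 2**n isomers."""
    if n_linkages < 0:
        raise ValueError("number of linkages must be non-negative")
    return 2 ** n_linkages


def enumerate_configurations(n_linkages: int):
    """Return an iterator over every Rp/Sp assignment, one entry per linkage."""
    return product(("Rp", "Sp"), repeat=n_linkages)


if __name__ == "__main__":
    # Assumed example: a 12-mer has 11 internucleotide linkages; if the
    # 5'-terminal one is a phosphodiester, 10 are methylphosphonates.
    print(count_diastereoisomers(10))
    for config in enumerate_configurations(2):
        print(config)
```

Even a modest oligomer is therefore a population of many stereochemically distinct molecules, which may help explain the broadened melting transitions discussed in the text.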
As described above, the configuration of the methylphosphonate linkage plays a role in determining duplex stability; thus each diastereoisomer can be expected to have its own unique binding affinity for its complementary target. Because UV melting experiments are carried out under conditions in which the oligomer and the target are present in stoichiometric amounts, it is perhaps not surprising that the melting transitions observed for mr-OMP·RNA duplexes are somewhat broader than those observed for oligo-2'-O-methylribonucleotide·RNA duplexes.
The apparent isothermal binding constants of oligonucleoside methylphosphonates for their complementary targets can be determined by gel electrophoresis under nondenaturing conditions. Under ordinary conditions, the similarly charged oligomers comprising an RNA·RNA duplex can reequilibrate as the mixture migrates down the gel. However, dissociation of an oligonucleoside methylphosphonate·RNA duplex results in the formation of two oligomers of quite different electrophoretic mobility. The RNA strand
will migrate rapidly away from the essentially neutral methylphosphonate oligomer, and the two strands cannot reassociate. Thus only duplexes that have very long half-lives can be detected by this method. Long-lived mr-OMP·RNA duplexes have in fact been observed on denaturing polyacrylamide gels (42). A new technique called constant activity gel electrophoresis (CAGE) has been developed to circumvent this problem (50). In this approach, individual gel lanes are cast with increasing concentrations of the oligonucleoside methylphosphonate. The radioactively labeled target is then electrophoresed through this "field" of methylphosphonate oligomer, which, because it is electroneutral, does not migrate. The mobility of the target strand decreases as the concentration of the methylphosphonate oligomer increases, and the apparent dissociation constant can be determined from the methylphosphonate oligomer concentration that causes the target to migrate half the distance between the free and completely bound forms. Using this procedure, we have shown that mr-OMP·RNA duplexes have dissociation constants that are a fraction of those of the corresponding d-OMP·RNA duplexes at 22°C (P. S. Miller, unpublished results). The magnitude of the difference appears to depend upon the G + C content of the duplex and the sequence of the oligonucleoside methylphosphonate.
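The half-shift criterion used in CAGE can be sketched numerically. The Python example below is only an illustration under stated assumptions: the function name, the use of linear interpolation between adjacent lanes, and the sample migration distances are all hypothetical and are not taken from the cited work.

```python
def apparent_kd(concentrations, migrations, d_free, d_bound):
    """Estimate the apparent dissociation constant from CAGE-style data.

    concentrations: oligomer concentration cast into each lane
    migrations:     distance migrated by the labeled target in each lane
    d_free:         migration of the free target (no oligomer in the lane)
    d_bound:        migration of the completely bound target

    The estimate is the concentration at which the target migrates halfway
    between d_free and d_bound, found by linear interpolation between the
    two lanes that bracket that point (an assumption of this sketch).
    """
    half = (d_free + d_bound) / 2.0
    points = sorted(zip(concentrations, migrations))
    for (c_lo, m_lo), (c_hi, m_hi) in zip(points, points[1:]):
        # Mobility falls as oligomer concentration rises, so the bracket
        # has the larger migration at the lower concentration.
        if m_lo >= half >= m_hi:
            if m_lo == m_hi:
                return c_lo
            fraction = (m_lo - half) / (m_lo - m_hi)
            return c_lo + fraction * (c_hi - c_lo)
    raise ValueError("half-shift point is not bracketed by the data")


if __name__ == "__main__":
    # Hypothetical lanes: concentrations in micromolar, migration in cm.
    conc = [0.1, 0.3, 1.0, 3.0, 10.0]
    dist = [9.5, 8.0, 6.0, 4.5, 4.1]
    print(apparent_kd(conc, dist, d_free=10.0, d_bound=4.0))
```

In practice a fitted binding isotherm would likely be preferable to two-point interpolation; the sketch only makes the half-distance definition concrete.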
C. Psoralen-conjugated Oligonucleoside Methylphosphonates
As demonstrated in the preceding section, association between antisense oligonucleoside methylphosphonates and RNA targets can be enhanced by using 2'-O-methylribonucleosides in the sugar-methylphosphonate backbone. It is also apparent that synthesis of oligomers with Rp methylphosphonate linkages should lead to enhanced binding, although present synthetic methodology does not permit stereoselective syntheses to be carried out on DNA synthesizers. Another strategy to enhance binding interaction involves formation of a covalent linkage between the oligomer and the target RNA. In this strategy, the oligomer first finds the correct complementary binding site on the target and, as a consequence of binding, positions an appended functional group such that it can form a covalent bond with some substituent of the target molecule.
We have found that 4,5',8-trimethylpsoralen, a well-known photoreactive cross-linker, is a suitable functional group for this process (51). The structure of a psoralen-conjugated oligodeoxyribonucleoside methylphosphonate is shown in structure 3 of Fig. 2. The psoralen is attached via a phosphoramidate linkage to the 5' end of the oligomer. Linkage of psoralen to internal positions in the oligomer has also been described (52). Target sequences are selected such that when the oligomer binds, the 3,4 double
bond of the pyrone ring of the psoralen group is positioned to react with the 5,6 double bond of a pyrimidine in the target sequence (as shown in 4 of Fig. 2). When irradiated with long-wavelength UV (365 nm), a [2 + 2] cycloaddition reaction occurs, forming a cyclobutane bridge between the pyrone ring and the pyrimidine target base. Because psoralen reacts most efficiently with thymine or uracil, cross-linking is essentially base specific and the psoralen acts in effect like an extra base in the oligomer.
Psoralen-conjugated d-OMPs are readily synthesized by a two-step procedure (53). The 5'-phosphorylated oligomer is first converted to an imidazole phosphoramidate derivative by reacting the oligomer with imidazole in the presence of a water-soluble carbodiimide. The imidazolide is then reacted with 4'-[(N-aminoethyl)amino]methyl-4,5',8-trimethylpsoralen to give 3. The derivatized oligomer, which can also be prepared in the 32P-labeled form, is purified either by gel electrophoresis or by C18 reversed-phase HPLC. Psoralen-conjugated d-OMPs have also been further derivatized with tetramethylrhodamine (54). Such derivatization provides a convenient fluorescent tag, which can be used in various biological experiments.
The interaction of psoralen-conjugated d-OMPs with DNA targets has been studied extensively in vitro (51, 55-57). In general, the cross-linking reaction is complete within 10 minutes; the reaction appears to plateau after this time. Up to 95% cross-linking has been observed in some systems. Quantitative cross-linking is not observed because the psoralen also undergoes photochemical degradation, which renders it inactive. The extent of cross-linking depends on the sequence and concentration of the oligomer and on the temperature of the reaction. Plots of cross-linking versus temperature mimic the UV melting curves of the duplexes formed between the oligomer and its target.
This suggests that interaction between the oligomer and the target is driven by Watson-Crick base-pairing interactions and not by interaction between the psoralen group, which is a weak intercalator, and the target. Consistent with this hypothesis is the observation that base-pair mismatches between the oligomer and the target result in little or no cross-linking.
Target structure also plays an important role in determining the extent of cross-linking by psoralen-conjugated d-OMPs (57). The extent of cross-linking between psoralen-conjugated d-OMPs and a linear target, that is, an oligo-DNA target with no secondary structure, was compared to that with a hairpin structure (57). The sequences of the oligomer binding sites were the same in each target. For a d-OMP targeted to a 12-nucleotide region located in a single-stranded loop of the hairpin, the extent of cross-linking, 76%, was similar to that of the linear target, 83%, over the temperature range 4-20°C. Cross-linking to the hairpin rapidly diminished above 20°C, whereas cross-linking to the linear target began to diminish above 35°C. Because the stem
region of the hairpin target must dissociate to some extent to allow binding by the d-OMP, cross-linking to the linear target would be expected to be more effective than cross-linking to the hairpin target. This notion received further support from the observation that a psoralen-conjugated d-OMP 16 nucleotides long, which was complementary to the loop region as well as to nucleotides in the stem, cross-linked to the same extent to both the linear and the hairpin target over the temperature range 4-55°C.
The site of cross-linking can be detected by treating the photoadduct with 1 M aqueous piperidine at 90°C. This treatment produces a strand break whose location can be analyzed by polyacrylamide gel electrophoresis (58). When a psoralen-conjugated d-OMP was reacted with a target that contained three contiguous thymidine residues 3' to the binding site of the d-OMP, various levels of cross-linking were observed to each of the thymidines. This result suggested that intervening bases can "loop out" to accommodate binding by the psoralen group.
Psoralen-conjugated d-OMPs also cross-link to oligo-RNA targets (57). In this case, the extent of cross-linking decreased more rapidly with increasing temperature than was the case for the oligo-DNA targets. This greater sensitivity to temperature is consistent with the reduced binding affinity of d-OMPs for single-stranded RNA versus single-stranded DNA targets. A rhodamine-tagged, psoralen-conjugated d-OMP cross-links with the same efficiency to the oligo-RNA target as does the psoralen-conjugated d-OMP (59). Thus the presence of the fluorescent tag does not interfere with either oligomer binding or photoadduct formation.
Cross-linking between psoralen-conjugated d-OMPs and cellular or viral mRNAs in vitro has been studied. Oligomers were targeted to various regions of rabbit α- or β-globin mRNA (60) or to the coding regions of vesicular stomatitis virus M protein or N protein mRNA (61).
These oligomers cross-link specifically with their targeted mRNA. Cross-linking depends on the sequence of the oligomer, the temperature at which the reaction is carried out, and the concentration of the oligomer. For example, the extent of cross-linking of a 16-mer complementary to nucleotides 387-402 in the coding region of vesicular stomatitis virus N protein mRNA varied from 67 to 36% over the temperature range 0-37°C at an oligomer concentration of 5 μM. In the case of rabbit globin-specific oligomers, the extent of cross-linking is greatest for oligomers whose binding sites were in known nuclease-sensitive regions of the mRNA. Thus it appears that mRNA secondary and tertiary structures play an important role in determining the efficiency of cross-linking.
Translation of mRNA in a cell-free rabbit reticulocyte system is specifically inhibited as a consequence of cross-linking to psoralen-conjugated d-OMPs. Thus an oligomer targeted to the coding region of α-globin mRNA
specifically inhibits α-globin synthesis 43%, whereas an oligomer targeted to the coding region of β-globin mRNA specifically inhibits β-globin synthesis 67% (60). The amount of inhibition observed is similar to the extent of cross-linking observed for the two oligomers to their respective mRNAs.
D. Uptake by Cells in Culture
The methylphosphonate linkages of oligonucleoside methylphosphonates are completely resistant to hydrolysis by exo- and endonucleases. Oligomers incubated with mammalian cells in culture in a serum-containing medium are recovered intact, even after prolonged periods of incubation. These results suggest that oligonucleoside methylphosphonates should have long half-lives in cells and could therefore be useful as antisense agents in cell culture and in animals.
Oligonucleoside methylphosphonates are taken up by mammalian cells in culture. This was first demonstrated by using d-OMPs containing a thymidine that was tritium labeled in the 5-methyl position of the thymine ring (62). When carried out on transformed Syrian hamster fibroblasts growing in monolayer, oligomer uptake as measured by radioactivity associated with the lysate appeared to be linear for approximately 1 hour and then continued to increase but at a reduced rate over the next 3 hours. The same rates of uptake were observed for oligomers 2, 5, and 9 nucleotides in length. Most of the radioactivity, 94%, was recovered from the cell lysate in the form of intact oligomer. Radioactivity was also recovered in the form of [3H]dTTP and as [3H]dpT and [3H]dT after digestion of cellular DNA with a combination of deoxyribonuclease and snake venom phosphodiesterase. This latter result suggests the oligomers could be degraded in the cells, possibly by a glycosylase activity, and the resulting thymine base converted to the nucleoside triphosphate, which was subsequently incorporated into DNA.
Psoralen-conjugated oligonucleoside methylphosphonates of the types shown in Fig. 2 are also taken up by mouse L cells (P. S. Miller, unpublished results). The kinetics of uptake seem similar to those of the unmodified oligomer.
In the case of oligomers that contain only deoxyribonucleosides (3a), intact oligomer as well as degradation products resulting from cleavage of the phosphodiester linkage are recovered from the cell lysate. When cells are incubated with 3b, only intact oligomer is recovered, even after 24 hours. This difference in oligomer stability reflects the increased resistance of the phosphodiester linkage of 3b to hydrolysis by endonucleases such as S1 nuclease. Careful examination of the uptake of 32P-labeled and rhodamine-tagged oligonucleoside methylphosphonates showed that the oligomers are taken up by a temperature-dependent mechanism (63). Experiments carried out
FIG. 2. Structure of an oligonucleoside methylphosphonate conjugated with 4'-[(N-aminoethyl)amino]methyl-4,5',8-trimethylpsoralen, 3 (3a, R = H; 3b, R = -OCH3). The interaction of a psoralen-conjugated oligomer with an RNA target, 4, is shown schematically at the bottom of the figure. The psoralen group is represented by the open rectangle.
in a Chinese hamster ovary tumor cell line showed that rhodamine-tagged d-OMPs are sequestered in perinuclear vacuoles consistent with endosomal localization. Similar experiments in mouse L929 cells showed similar intracellular distribution for both rhodamine-tagged d-OMPs and rhodamine-tagged, psoralen-conjugated d-OMPs (59). In these latter experiments, some fluorescence was also observed in the cytoplasm and nucleus of the cell, suggesting that the oligomer can escape from the endosomal compartment.
Oligonucleoside methylphosphonates microinjected directly into the cytoplasm migrate immediately to the nucleus (64). Similar behavior is observed with oligonucleotide phosphodiesters and phosphorothioates.
Uptake of normal oligodeoxyribonucleotides and oligodeoxyribonucleotide phosphorothioates appears to involve cell-surface receptors (65, 66). In contrast, uptake of oligonucleoside methylphosphonates is apparently not mediated by receptors. Thus, while uptake of oligodeoxyribonucleotides shows saturation at higher oligomer concentrations and can be inhibited by addition of nonlabeled oligonucleotides, saturation is not observed for uptake of oligonucleoside methylphosphonates and uptake is not inhibited by the presence of either exogenous oligodeoxyribonucleotides or oligonucleoside methylphosphonates (63).
The results of uptake experiments suggest that oligonucleoside methylphosphonates enter mammalian cells in culture primarily by nonreceptor-mediated endocytosis. It appears that leakage from the endosome must occur for the oligomer to gain access to the cytoplasm or nucleus of the cell. This transport mechanism may represent a "bottleneck" for the efficient biodistribution of the oligomers.
Although d-OMPs 2 to 20 nucleotides long are taken up by mammalian cells, only short oligomers, up to 4 nucleotides, are taken up by bacterial cells, such as E. coli (67). It appears that the bacterial cell wall acts as a sieve and excludes oligomers above a certain size. When uptake experiments were carried out on mutants of E. coli that lack an intact cell wall, 7-mers were found to be taken up readily. This size-exclusion phenomenon is likely to restrict the use of antisense oligonucleoside methylphosphonates, and most likely other oligonucleotide analogs, in these organisms.
The distribution and metabolism of a 3H-labeled 12-mer, dTp[3H]TCCTCCTGCGG, where the underline indicates the positions of methylphosphonate linkages, have been studied in mice (68).
The oligomer was administered by injection into the tail vein. Plasma levels of the oligomer declined rapidly in a biexponential manner with half-lives of 6 and 17 minutes corresponding to the distribution and elimination phases, respectively. Approximately 70% of the total radioactivity injected was found in the urine after 120 minutes. The remaining radioactivity was found in various organs and tissues of the animal, with the highest levels found in the kidney and very little oligomer found in the brain. Two forms of the oligomer were recovered from the plasma, urine, and various organs as assayed by C18 reversed-phase HPLC. Intact oligomer was observed along with d-[3H]TCCTCCTGCGG. The latter oligomer most likely was formed as a consequence of nuclease hydrolysis of the 5'-terminal phosphodiester. This linkage is known to be susceptible to cleavage by calf spleen phosphodiesterase, a 5'-exonuclease. Importantly, there was no evidence for cleavage of the methylphosphonate linkages of the oligomer in these studies.
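The biexponential decline in plasma levels can be written as the sum of two first-order terms, each with rate constant k = ln 2 / t(1/2). The Python sketch below uses the two half-lives quoted above (6 and 17 minutes); the relative weight of the two phases is not given in the text, so the `frac_distribution` parameter and its default of 0.5 are purely illustrative assumptions.

```python
import math


def rate_constant(half_life_min: float) -> float:
    """First-order rate constant (per minute) for a given half-life."""
    return math.log(2) / half_life_min


def plasma_fraction(t_min: float,
                    frac_distribution: float = 0.5,
                    t_half_dist: float = 6.0,
                    t_half_elim: float = 17.0) -> float:
    """Fraction of the initial plasma level remaining at time t_min.

    Model: A*exp(-k_dist*t) + B*exp(-k_elim*t) with A + B = 1.
    frac_distribution (A) is an assumed weighting, not a reported value.
    """
    a = frac_distribution
    b = 1.0 - a
    return (a * math.exp(-rate_constant(t_half_dist) * t_min)
            + b * math.exp(-rate_constant(t_half_elim) * t_min))


if __name__ == "__main__":
    for t in (0, 6, 17, 60, 120):
        print(t, round(plasma_fraction(t), 4))
```

Whatever weighting is assumed, both terms are small by 120 minutes, which is in line with the observation that most of the radioactivity had reached the urine by that time.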
E. Biological Activity
The biological activity of antisense oligonucleoside methylphosphonates in cell culture has been studied for oligomers targeted against a number of viral and cellular genes, including those of human immunodeficiency virus (HIV) (69-71), herpes simplex virus (HSV) (72-75), and the human ras oncogene (76).
Studies in the HSV system showed that d-OMPs 8 and 12 nucleotides in length, d-TpCCTCCTG and d-TpTCCTCCTGCGG, could inhibit virus replication in HSV-1-infected Vero cells when added to the culture medium. These oligomers are complementary to nucleotide sequences in the acceptor splice junction of HSV-1 immediate-early mRNAs 4 and 5, which are identical in both of these precursor mRNAs. Inhibition of virus replication depends on the oligomer concentration and reaches levels of greater than 90% at oligomer concentrations of 100 μM. When the two central bases of d-TpCCTCCTG were exchanged, the resulting "mutant," d-TpCCCTCTG, was no longer inhibitory (72). This suggested that the oligomer indeed interacted with its target. The octamer did not inhibit cellular DNA or protein synthesis.
The 12-mer, d-TpTCCTCCTGCGG, is complementary to six nucleotides in the intron and six nucleotides at the splice junction (73). This oligomer specifically inhibits replication of HSV-1, but not HSV-2, in virus-infected Vero cells. The sequences of the two viruses differ in this region of the pre-mRNA. The 12-mer gave 98% inhibition at a concentration of 100 μM. Other 12-mers complementary predominantly to nucleotides in the intron or the exon of the splice acceptor region showed little or no inhibitory effect on the replication of either HSV-1 or HSV-2. This result suggested that the secondary structure in the splice acceptor region may affect oligomer binding and thus the efficacy of the oligomer. The 12-mer was designed to inhibit mRNA splicing.
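The relationship between the octamer and its "mutant" control can be checked with a few lines of Python. This is a small sketch, not part of the original study: the helper names are hypothetical, and the sequences are written as plain base strings on the assumption that the "p" in d-TpCCTCCTG only marks the 5'-terminal phosphodiester linkage.

```python
# Watson-Crick partners for a DNA oligomer pairing with an RNA target.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def target_rna(oligo_5to3: str) -> str:
    """RNA sequence (5'->3') complementary to a DNA oligomer given 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(oligo_5to3))


def mismatches(oligo_a: str, oligo_b: str) -> list:
    """0-based positions at which two equal-length oligomers differ."""
    if len(oligo_a) != len(oligo_b):
        raise ValueError("oligomers must have the same length")
    return [i for i, (x, y) in enumerate(zip(oligo_a, oligo_b)) if x != y]


if __name__ == "__main__":
    octamer = "TCCTCCTG"   # d-TpCCTCCTG
    mutant = "TCCCTCTG"    # d-TpCCCTCTG, central two bases exchanged
    print(target_rna(octamer))
    print(mismatches(octamer, mutant))
```

The two sequences have the same base composition and differ only at the two central positions, so the loss of activity is attributable to sequence rather than composition.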
Examination of RNA extracted from virus-infected cells showed the presence of increased levels of unspliced immediate-early mRNA 4 in those cells treated with d-TpTCCTCCTGCGG. Thus it appeared that the oligomer had successfully interacted with its intended target in the cells.
The psoralen conjugate of d-TpCCTCCTGCGG was also tested for its ability to inhibit HSV-1 replication. In these experiments, virus-infected cells were treated with 5 μM oligomer at the time of infection. The cells were then irradiated for 10 minutes with 365-nm light at various times postinfection, and the virus titers were measured after 24 hours. Greater than 90% inhibition of replication was observed when HSV-1-infected cells were irradiated 1 to 3 hours postinfection. Irradiation after 3 hours resulted
in a decrease in inhibition. Thus, for example, when the cells were irradiated at 6 hours, approximately 60% inhibition was observed, and when cells were irradiated at 12 hours, only 20% inhibition was observed. Irradiation itself gave only a slight (5%) inhibition of virus replication.
The results of these experiments are consistent with the proposed mechanism of action of the oligomer. Thus, the effect of the oligomer should be greatest shortly after infection when the immediate-early genes are expressed. Inhibition at this time results in a shutoff of subsequent early and late gene expression, and thus of virus replication. Irradiation at times subsequent to immediate-early gene expression is less effective because the proteins required for activation of early and late genes have already been synthesized.
These experiments also demonstrate two important features of psoralen-conjugated d-OMPs. First, these oligomers can be "triggered" to inactivate their target in a controlled manner. This feature may prove useful in studying temporal expression of genes in cells. Second, the oligomers function at significantly lower concentrations than do the underivatized oligomers. This enhanced activity is most likely a consequence of the ability of the oligomer to form covalent adducts with its target.
The ability of antisense oligonucleoside methylphosphonates to inhibit mRNA expression should be a direct function of their ability to bind to their target. This idea was tested by examining the ability of the 2'-O-methylribonucleoside version of the 12-mer to inhibit HSV-1 replication in cell culture (P. S. Miller, unpublished). The apparent dissociation constant of mr-UpUCCUCCUGCGG is approximately a fifth of that of d-TpTCCTCCTGCGG as determined by the CAGE technique described in Section I,B. A series of dose-response experiments showed that the IC50 value, the oligomer concentration at which virus replication is inhibited 50%, is 26 μM for d-TpTCCTCCTGCGG.
In contrast, the IC50 value for mr-UpUCCUCCUGCGG was only 6 μM. This approximate decrease to 22% in the IC50 is consistent with the lower dissociation constant of the 2'-O-methylribo oligomer versus the 2'-deoxyribo oligomer.
Oligodeoxyribonucleoside methylphosphonates have also been targeted to other HSV-1 immediate-early mRNAs (74, 75). For example, a d-OMP, d-GpCGGGGCTCCAT, was prepared (75). This oligomer is complementary to the initiation codon and the following three codons of HSV-1 immediate-early mRNA 1. The oligomer inhibited virus replication with an IC50 value of approximately 17 μM. "Mutated" versions of this oligomer, in which two or four of the bases in the middle portion of the sequence were rearranged, were not inhibitory. When this oligomer was combined with d-TpTCCTCCTGCGG, the d-OMP complementary to the immediate-early mRNA 4 and 5 splice junction, a synergistic inhibitory effect was observed. Thus the IC50
value of each oligomer in the combination was approximately 4 μM, a reduction to a fraction of that of the individual oligomers.
In order to be effective antisense agents, oligonucleoside methylphosphonates must be able to interact selectively with their targets and inhibit target expression. The high degree of specificity possible was demonstrated in the ras oncogene system (76). In these experiments, two 11-mer d-OMPs, d-TpCCTCCTGGCC (ras-T) and d-TpCCTCCAGGCC (ras-A), were synthesized. The oligomers are complementary to a sequence including the codon for amino acid 61 of normal human c-Ha-ras mRNA, ras-T, or to the same region of a human lung carcinoma c-Ha-ras mRNA, which contains a single mutation in the middle base of codon 61, ras-A. Two cell lines that contain multiple copies of the normal gene or the gene with the point mutation were used to test the selectivity of the two oligomers. Each cell line produces a ras-p21 protein and these proteins can be distinguished by their electrophoretic mobilities on polyacrylamide gels.
Mixed cultures of the two cell lines were treated with various concentrations of either ras-T or ras-A and the relative amounts of the p21 proteins were analyzed by gel electrophoresis after immunoprecipitation. A dose-dependent decrease of "normal" p21 synthesis was observed for cells treated with ras-T. The extent of inhibition was 89% in the presence of 150 μM ras-T. No inhibition of "normal" p21 was observed when the cells were treated with ras-A. Conversely, increasing concentrations of ras-A resulted in decreasing amounts of "mutated" p21 and little or no inhibition of "normal" p21 was observed in the presence of this oligomer. Thus, 150 μM ras-A inhibited "mutated" p21 synthesis 97% whereas "normal" p21 synthesis was reduced only 21%.
The experiments were also carried out with the psoralen-conjugated versions of ras-T and ras-A. Again, selective inhibition of p21 synthesis was observed.
However, the concentration of oligomer required to give levels of inhibition comparable to those observed with the underivatized oligomers was reduced by approximately a factor of 10. Thus 15 μM ras-T inhibited "normal" p21 synthesis 96% whereas 15 μM ras-A inhibited "mutated" p21 synthesis almost 100%.
These results suggest that under certain conditions, antisense oligonucleoside methylphosphonates can distinguish point mutations in mRNA. It appears that this level of discrimination is maintained by the psoralen-conjugated d-OMPs as well. This latter observation is consistent with the previously observed ability of psoralen-conjugated d-OMPs to discriminate mispaired bases in in vitro cross-linking experiments and again suggests that binding is a function of the oligomer portion of the psoralen-d-OMP conjugate.
Antisense oligodeoxyribonucleoside methylphosphonates and their
psoralen conjugates also have activity against HSV infection in animals (74). A mouse ear-pinna model system was developed to assess the effects of d-TpTCCTCCTGCGG on HSV-1 replication. The ear pinna was injected with a solution containing HSV-1 and 100 to 500 μM oligomer. The oligomer was then applied topically as a suspension in 50% aqueous polyethylene glycol on subsequent days postinfection. Reductions in virus titer were observed 1 to 5 days postinfection. The psoralen conjugate of d-TpTCCTCCTGCGG inhibited HSV-1 replication 86-91% after irradiation with 365-nm light, but at a tenth of the concentration required for the underivatized oligomer. This oligomer specifically inhibited HSV-1 replication. When mice infected with HSV-2 were similarly treated, only a 27% inhibition of virus titer was observed.
These experiments suggest that antisense oligonucleoside methylphosphonates and their psoralen conjugates have therapeutic potential, at least in situations where topical application is possible. More extensive studies of the pharmacokinetics of the molecules in animals are obviously required before the full potential of the oligomers can be realized.
II. Antigene Oligonucleotide Analogs
Antigene oligonucleotides are designed to interact with double-stranded DNA (dsDNA) through the formation of triple-stranded complexes. Although formation of triple-stranded complexes at the polynucleotide level had been recognized for some time (79), the demonstration that shorter oligonucleotides could also participate in such interactions sparked interest in using oligonucleotides to target genes specifically (78). Antigene oligonucleotides have been targeted to homopurine sequences in dsDNA. The oligomers can interact with these homopurine sequences in essentially two different ways. Third-strand oligopyrimidines can bind in the major groove and contact the purine strand through the formation of Hoogsteen-type hydrogen bonds. This is the so-called pyrimidine-purine-pyrimidine, Y.(R.Y), motif. The third strand is written first and its association with the duplex is indicated by a centered dot. Two types of base triads, T.(A.T) and C+.(G.C), are formed. Their structures are shown in the top half of Fig. 3. The T.(A.T) triad involves formation of two hydrogen bonds between the T of the third strand and the A of the target. Formation of the C+.(G.C) triad also involves two hydrogen bonds between a protonated form of C in the third strand and a G of the duplex. Because protonation is required, the stability of these Y.(R.Y) triplexes decreases as the pH increases. The polarity of the third strand is parallel to that of the purine target strand of the duplex, and the bases in the triads are isomorphous.
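The pairing rules of the Y.(R.Y) motif described above can be written down as a small sketch. The function names and the example target sequence are illustrative assumptions, not part of the original work; the code only encodes the stated rules (T reads A, protonated C reads G, third strand parallel to the purine strand) and says nothing about triplex stability.

```python
def design_pyrimidine_third_strand(purine_target: str) -> str:
    """Return a third strand for the Y.(R.Y) motif, written 5'->3'.

    purine_target is the homopurine strand of the duplex, written 5'->3'.
    The third strand runs parallel to the purine strand, so the target is
    read left to right without reversal: T reads A, and C (protonated in
    the triad) reads G.
    """
    pairing = {"A": "T", "G": "C"}
    target = purine_target.upper()
    if not target:
        raise ValueError("target sequence is empty")
    unexpected = sorted(set(target) - set(pairing))
    if unexpected:
        raise ValueError(f"target is not homopurine; found {unexpected}")
    return "".join(pairing[base] for base in target)


def cytosine_fraction(third_strand: str) -> float:
    """Fraction of C residues, a rough flag for likely pH sensitivity."""
    if not third_strand:
        raise ValueError("third strand is empty")
    return third_strand.upper().count("C") / len(third_strand)


# Hypothetical 15-nucleotide homopurine target, chosen for illustration only.
third = design_pyrimidine_third_strand("GAAGAAAAAAGAAAA")
print(third, cytosine_fraction(third))
```

A high cytosine fraction, or runs of contiguous Cs, would flag a third strand whose binding is expected to depend strongly on pH, since each C must be protonated to form its triad.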
ANTISENSE AND ANTIGENE OLIGONUCLEOTIDES
FIG. 3. Hydrogen bonding interactions involved in base triad formation.
The purine-purine-pyrimidine, R.(R.Y), motif involves formation of A.(A.T) or T.(A.T) and G.(G.C) triads. The structures of the latter two triads, which have been confirmed by NMR studies (79), are shown at the bottom half of Fig. 3. Here, reversed-Hoogsteen hydrogen bonds are formed between the bases of the third strand and the purine target, and the polarity of the third strand is oriented antiparallel to that of the purine target tract. Bases comprising the Y.(R.Y) triads are isomorphous, whereas bases comprising R.(R.Y) triads are not. Most studies involving triplex-forming oligonucleotides have centered around understanding the structure and physical chemistry of triple-stranded complexes (80-87) and exploring various strategies to extend the scope of triple-strand recognition (88-93). There are encouraging results
suggesting that triplex-forming oligonucleotides have biological activity in cell culture (94-97). Our efforts in this area have focused on developing oligonucleotide analogs that can be used in the Y.(R.Y) motif. As described in the following sections, we have developed a base analog, 8-oxoadenine (8-oxo-A), capable of interacting with G.C base-pairs in a pH-independent manner. We have also designed and synthesized base analogs that can interact with pyrimidine “interruptions” in the homopurine target site. The ability of oligodeoxyribopyrimidine methylphosphonates to participate in triplex formation has also been explored.
A. Targeting GC Base-pairs in Duplex DNA with 8-Oxoadenine
As described above, G.C base-pairs can interact with C in third-strand oligopyrimidines provided that the pH allows protonation of the C. For oligopyrimidines that contain only a few unclustered Cs, stable triplex formation can be observed even at pH 7. However, oligodeoxyribopyrimidines that contain a high percentage of Cs and/or blocks of contiguous Cs usually form stable triplexes only at pH 6.5 and below. This pH dependence makes the use of such antigene oligomers problematic in biological studies. A number of solutions have been developed to overcome this problem. One of the simplest is to substitute 5-methyldeoxycytidine (structure 5, Fig.
FIG. 4. Triplex containing 5-methyl-2'-deoxycytidine, C. [Structure 5: 5-methyl-2'-deoxycytidine. Triplex 6: strand A, d-CTTCTTTTTTCTTTT; strand B, d-GAAGAAAAAAGAAAA; strand C, CTTCTTTTTTCTTTT-d.]
4) for deoxycytidine in the oligodeoxyribopyrimidine (98). 5-Methyldeoxycytidine is protonated at a higher pH than is deoxycytidine when incorporated into an oligodeoxyribopyrimidine that is undergoing triplex formation. The reason for this increased apparent pK is not clear. Nevertheless, for oligomers that contain few 5-methyldeoxycytidine residues, relatively stable triplex formation can be observed even at pH 8.0. Thus, for example, the melting temperature of 6A in triplex 6 is 18°C at pH 8.0 (99). Although oligomers containing low percentages of 5-methyldeoxycytidine can form stable triplexes at physiological pH, the stability decreases significantly for oligomers containing multiple, contiguous 5-methyldeoxycytidines. To circumvent this problem, base analogs have been prepared that have two available hydrogen-bond donor sites and that therefore do not require protonation in order to interact with G. For example, the C nucleoside, 2'-O-methylpseudoisocytidine (structure 7, Fig. 5), can be
UAAG > UGAU > UAAA UAAC. The other six signals are used either rarely or not at all (34). Several studies of eukaryotic stop codon contexts suggest that purines are the most common 3' nucleotide following the stop codon (127-130). The sequences around the stop codons of over 5200 mammalian genes have now been compiled as a database representing six species: human, mouse, rat, cow, pig, and rabbit. In addition to the obvious bias in the fourth base, there was also nonrandomness in the fifth base, and indeed in the first eight
INFIDELITIES OF TRANSLATIONAL STOP SIGNALS
positions following the stop codon as well as in the three positions preceding the codon (36). This nonrandomness in the fifth base and beyond was not seen in the analysis of the E. coli genes. In this case, of the bases following the stop codon, the fourth base alone showed striking bias, although there was also some nonrandomness in the preceding codons (126). The occurrence of the four-base signals in the mammalian genes reflects the frequency of these sequences in the noncoding regions. However, G in the fourth position is more abundant, and U is less abundant, than expected (P < 0.001). The frequencies of the four-base signals in a subset of genes such as globins, immunoglobulins, histones, actins, and albumins have been examined and found to be quite different from the complete dataset. These are putative highly expressed genes, but the definition of such a set of highly expressed genes in the mammalian genome is more arbitrary than for those defined in E. coli. Despite this limitation, certain signals (UAAG and UAGG) are overrepresented, whereas other signals (UGAC, UGAU, UAGC, and UAGU) are not used at all. Individual analyses with each particular mammalian species gave essentially the same information as that derived from combining the species (36).
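The kind of tally behind these context statistics can be sketched as follows. This is a minimal illustration, not the method of the cited analyses: the record format (a sequence plus the 0-based position of its stop codon), the function names, and the four toy sequences are all assumptions made for the example.

```python
from collections import Counter

STOP_CODONS = {"UAA", "UAG", "UGA"}


def four_base_signal(sequence: str, stop_start: int) -> str:
    """Return the stop codon plus the base that follows it, as RNA."""
    rna = sequence.upper().replace("T", "U")
    signal = rna[stop_start:stop_start + 4]
    if len(signal) < 4:
        raise ValueError("no base available after the stop codon")
    if signal[:3] not in STOP_CODONS:
        raise ValueError(f"{signal[:3]} is not a stop codon")
    return signal


def signal_frequencies(records):
    """Percentage of records that use each four-base stop signal."""
    counts = Counter(four_base_signal(seq, start) for seq, start in records)
    total = sum(counts.values())
    return {signal: 100.0 * n / total for signal, n in counts.items()}


# Toy records: (sequence, index of the first base of the stop codon).
records = [
    ("AUGGCUUAAGCC", 6),
    ("ATGAAATAAGTT", 6),  # DNA input is converted to RNA
    ("AUGCCCUGAUAA", 6),
    ("AUGUUUUAAUGG", 6),
]
print(signal_frequencies(records))
```

Comparing such percentages for a subset of genes against the whole dataset, or against the frequency of the same tetranucleotides in noncoding regions, is what reveals a bias.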
C. Does the Fourth Base Affect Translational Termination?
These analyses suggested that a particular subset of four-base termination signals might be used in special circumstances at the end of highly expressed genes. However, different subsets are favored in mammals and E. coli. What do these context biases mean? The critical question is whether there is a hierarchy of termination signals including the fourth base, and perhaps the fifth and beyond, that influence how the stop codon functions in a cell. Statistical analyses of mammalian genes show a biased context for initiation codons (131) and this context profoundly influences the initiation step (132). In contrast, codon biases for the other sense codons are determined by the regional (G + C)-content of the genome and probably do not have functional significance (133). It has been tested experimentally whether the termination codon context bias had any functional significance for polypeptide chain termination. For the prokaryotic system, the strengths of each of the twelve possible four-base stop signals were tested in an in vivo assay, using the direct competition of termination with frameshifting at the RF-2 frameshifting sequence motif. For frameshifting to occur, the elements 5' to the stop codon promoting the event must compete with translational termination at the stop codon. A change in either the concentration of RF-2, which recognizes the natural UGA(C), or the specific activity of the factor can change the efficiency of termination over a 0-100% range (101). With the natural context, the two
WARREN P. TATE ET AL.
events compete almost equally, with termination efficiencies of 50-70% (100, 101). If the stop signal is altered in this system, the degree of frameshifting should be influenced by the efficiency of the sequence to signal stop. An immunological assay has been used to measure the ratio of the termination and frameshift products, synthesized in vivo from constructs varying only in their termination codon or fourth base. The termination efficiency varied widely within the UAAN and the UGAN sets. The order of efficiency correlated with how frequently the signals were used in natural contexts (derived from the statistical analysis) and followed the hierarchy of fourth base U > G > A > C (Fig. 12A). However, from the statistical analysis alone the other set of signals, UAGN, might be predicted to be poor, but this was not the case, because UAGN signals were as efficient as the other two sets (35). TAGN occurs with very low frequency throughout the whole genome, and the reason for this could be that UAGN stop signals might be mistakenly altered to a sense codon by the mechanism for vsp- or vsr-initiated DNA mismatch repair that operates at CTAG or TAGG sequences (134, 135). The vsr gene product is a DNA mismatch endonuclease.
FIG. 12. Influence of the fourth base in tetranucleotide stop signals on the efficiency of translational termination. (A) In E. coli the relative strength of each of the four UGAN signals was determined as a competitor of +1 frameshifting at the RF-2 frameshift site (35). (B) In the mammalian system, the strength of each of the UGAN signals was determined in 5'-deiodinase mRNA as a competitor to selenocysteine incorporation at the site. The rat mRNA was transfected into human embryonic kidney cells for this experiment (36).
The rates of RF selection were calculated at each of the twelve termination signals (adapted from Ref. 136). The RF selection rate varied over a 50-fold range, with that at UAAU the fastest and at UGAC the slowest (35). The influence of the fourth base has also been tested with mammalian termination signals in two ways. First, the recoding event at the internal UGA codon of the 5'-DI mRNA was used so that the strength of the termination signal could be matched against incorporation of selenocysteine at the codon in vivo. The base following this UGA was changed from the natural C to each of the other three bases. It was found that the ratio of termination to selenocysteine incorporation at this codon varied from 1:3 (C or U in the fourth position) to 3:1 (A or G in the fourth position) (see Fig. 12B). These UGAN sequences had the same order of termination efficiency (A, G > C, U) as was found in an in vitro termination assay that measured the release of a model peptide from the ribosome by the eRF. In addition, the fourth base in each of the UAAN and UAGN series also affected the efficiency of the release assay in the same order, varying over a 70-fold range for the UAAN series, and an 8-fold range for the UGAN and UAGN series (36). Again, as in the prokaryotic case, there was a strong correlation between the “strength” of the signals and how frequently they occur at natural termination sites, but with eukaryotes it was the purines, A and G, rather than the pyrimidine, U, that boosted the efficiency of the signal. UGAU was determined to be a weak signal from in vitro termination studies (36), and a fourth base U is found at the frameshift site in the mammalian ornithine decarboxylase antizyme mRNA. Alteration of this fourth base showed that purines in the fourth position make a stronger competing stop signal for the undefined frameshifting mechanism (102). The data from this example provide modest support for this model of termination signal strength.
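The product ratios quoted above can be converted into termination efficiencies with a short calculation. This is only a worked-arithmetic sketch: the function name is hypothetical, and it assumes that termination and the competing recoding event are the only two outcomes at the site.

```python
from fractions import Fraction


def termination_efficiency(termination_parts: int, recoding_parts: int) -> Fraction:
    """Fraction of ribosomes that terminate, given a products ratio.

    A ratio of termination to recoding of t:r corresponds to an
    efficiency of t / (t + r), assuming no third outcome.
    """
    if termination_parts < 0 or recoding_parts < 0:
        raise ValueError("ratio parts must be non-negative")
    total = termination_parts + recoding_parts
    if total == 0:
        raise ValueError("ratio parts cannot both be zero")
    return Fraction(termination_parts, total)


# Ratios reported for the UGAN signals in the 5'-DI mRNA.
weak = termination_efficiency(1, 3)    # C or U in the fourth position
strong = termination_efficiency(3, 1)  # A or G in the fourth position
print(float(weak), float(strong))
```

On this reading, changing the fourth base from a pyrimidine to a purine moves termination from about a quarter of events to about three quarters, a threefold change in efficiency at the same codon.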
However, it should be realized that there are several stimulating elements for frameshifting at this site, and although the stop signal is essential, its relative strength may be a less dominant influence than that of the stem-loop structure that follows. Suppression studies indicate that the measurement of termination rate in vitro, and its support from the statistical analysis, predicts very well the outcome of competition between termination and other events at some sites, but not at others. A natural stop signal, UGAC (the signal also found at the selenocysteine incorporation site of the 5'-DI mRNA) permits 10% readthrough in reticulocyte lysates when there are no apparent competing events (29). Suppression of UAGN in human cells by a mutated tRNA showed UAGC is the signal most readily suppressed. However, UAGU is poorly suppressed (137), which is not what would be predicted with the statistical analysis and the in vitro analysis of stop-signal strength.
Translational termination in yeast has been reviewed recently (60). Termination is most efficient at internal UAA stop codons followed by a purine, when suppression by noncognate tRNAs is excluded (138, 139). These observations correlate with the strong use of UAAR (R = purine) signals in highly expressed genes in S. cerevisiae (127), and suggest that yeast and mammals may have a similar hierarchy of termination signal strengths. Significantly, the proteins from yeast and mammals with properties of the decoding molecule eRF1 have strong homology (61), and the equivalent protein from X. laevis functions in yeast (140). Clearly, the fourth base modulates the efficiency of the termination signal quite markedly throughout the prokaryotic and eukaryotic kingdoms, and consideration must be given now as to how one defines the signal for termination. Should the fourth base be regarded as a strong context influence or part of the signal itself? Two distinct models are possible for the effect of the fourth base on termination efficiency. Either this base may influence the conformation of the stop codon and thereby influence its recognition by the RF, or it may be recognized directly by the factor along with the codon. If direct recognition can be shown, there seems to be a good case for including this base as part of the termination signal rather than thinking of the signal as only the triplet termination codon.
D. Recognition of the Fourth Base by the Release Factor?
When thioU was used to cross-link from stop signals in small ribosome-bound mRNAs to the bacterial RF, the yield of the RF.mRNA complex depended upon the identity of the fourth base for the UGAN series of stop signals tested (92). This suggests that the fourth base of the signal affects the interaction between the factor and the stop codon, with purines at this position promoting more cross-linking than pyrimidines. Because there is no evidence that the complex is stabilized by fourth base purines, a more likely alternative explanation may be that the conformation of the thioU in the first position is altered by the fourth base purine so as to improve cross-linking. This may imply that the stop signal is in a stacked conformation during decoding, the effect being similar to that seen with a +4 purine on sense codons (141). A thioU in the fourth base position as well as the first did not increase cross-linking to RF. This may be because of poor orientation, or the residue may have a closer orientation to rRNA than to the RF. Cross-linking to rRNA was increased when thioU was in both the first and fourth positions of the signal (92). The cross-linking of stop signals to the 1400 region of the rRNA and the RF is consistent with either of two orientations for the bases of the stop signal. They could be oriented toward the RF, which then makes direct interactions with them. For example, the common keto hydrogen-bond acceptor and the imino hydrogen-bond donor groups of the fourth base U or G (see Fig. 2) might be hydrogen bonded to an amino acid in the RF structure. Equally possible, the backbone of the mRNA could make interactions with the RF leaving the bases to pair with the rRNA. The fact that the substitution of dU for U in the first position of a termination codon prevents in vitro termination supports some kind of backbone interaction of the mRNA with one of the other components of the termination complex (64). The important primary structural determinants in the stop signal are consistent with either RNA or protein recognition of stop codons, as either could form hydrogen bonds to the key positions in the signal. The secondary structure of the stop signal in the mRNA is unknown, but RNA is conformationally flexible. If it is modeled as a single-stranded A-helix, the cross-linking moiety of the thioU-containing signals would be located immediately over the common N7 of the second base A or G in UAAN and UGAN, the signals recognized by RF-2 (Fig. 13). Hydrogen bonding may occur from this N7 to the RF and this could explain why cross-linking from the thioU to the RF is possible, because any part of the RF molecule contacting N7 would be close to the activated ring. The challenge now is to identify the sites on RF-2 that make contact with the stop signal.
FIG. 13. The UGAG stop signal modeled as an A helix. The large arrow indicates the proximity of the photoactivated thioU to any part of RF-2 interacting with the N-7 atom of the adjacent G. Small arrows indicate potential stacking interactions. Coordinates were generated by the program MC-SYM (153).
VII. Physiological Advantages of Stop Signals Decoded with Varying Efficiencies
A. The Inefficient Stop Signal
1. GENE EXPRESSION AND SHIFTS IN PHYSIOLOGICAL CONDITIONS
If a termination signal can be multifunctional and perhaps influenced by cell physiology, then an extra layer of cellular regulation is possible (142). Examples of this are the internal TGAC in the RF-2 gene and in the formate-dehydrogenase gene of E. coli. In the first case, the synthesis of RF-2 can be regulated according to the cellular concentration of the factor, because RF-2 functions at the site of the internal UGA codon in the mRNA to release a short nonfunctional premature termination product. Premature termination is in competition with translational frameshifting to avoid the stop codon and allow complete synthesis of the protein. Hence, there is an autoregulatory circuit that operates according to the physiological requirements of the cell (98, 101). In the second example, formate-dehydrogenase expression has captured a niche to utilize the micronutrient selenium in an oxidation/reduction reaction with the selenium atom as part of the active center of the enzyme. However, the synthesis of three isoenzymes of formate dehydrogenase in E.
coli depends on the availability of selenium and the physiological state of the cell. The three isoenzymes are not synthesized simultaneously. In fact, there is a low carbon flux into selenocysteine biosynthesis (143).
2. SELENOPROTEIN SYNTHESIS IN MAMMALS AND SELENIUM AVAILABILITY
Mammals have developed a hierarchy of selenium distribution between and within different tissues (144, 145). The distribution is most likely controlled at the translational level. In tissue culture, 5'-DI is synthesized preferentially over glutathione peroxidase (GP) because it presumably competes better for the available selenium. 5'-DI is synthesized when selenium concentrations are below 2 nM and saturates at 10 nM, whereas GP is not synthesized below 2 nM and saturates at approximately 1 µM. It is interesting that 5'-DI has a poor stop signal, UGAC, and GP has a strong stop signal, UGAG, at the site of selenocysteine incorporation. The intriguing possibility arises that selenium utilization is controlled by the relative efficiency of the stop signal in the selenoprotein mRNAs, and the unfaithful stop codon may be playing a highly significant role in important physiological processes.
B. The Highly Efficient Stop Signals
1. GENE EXPRESSION AND GROWTH RATE IN BACTERIA
Protein synthesis becomes a more dominant activity of the cells as the growth rate of bacteria increases (146-148). Maximal growth rate is achieved by increasing the efficiency of translation and by increasing the concentrations of the components of the translational apparatus. The number of the RFs per cell increases with growth rate. However, as cell volume changes, this increase is sufficient only to maintain cellular concentration, and does not account for an increase in the rate of any steps of translational termination (148). The predominant genes that are expressed at maximal growth rates are the “highly expressed” genes. Such genes almost exclusively use two four-base signals for translational stop, UAAU and UAAG (Table III). These signals are decoded at a rate many-fold higher than the average (35). Hence, the fourth base can influence the translational rate in a way that is significant for the organism, with the more efficient stop signals potentially providing an increase in the termination rate. This phase of translation would not then limit the increase in protein synthesis that is necessary for an increase in growth rate of the organism.
2. HIGHLY EXPRESSED MAMMALIAN GENES
Genes such as those for globins, histones, actins, immunoglobulins, and albumins, which can be regarded as the equivalent group to the highly
TABLE III
THE RELATIVE FREQUENCY OF OCCURRENCE OF TETRANUCLEOTIDE STOP SIGNALS IN E. coli AND MAMMALIAN GENES (a)
          Highly expressed (%)      Total (%)
Codon     E. coli    Mammals        E. coli    Mammals
UAAA        5.3       22.0           13.7       10.5
UAAG       21.3       20.7           12.8        7.9
UAAU       52.0        3.7           25.5        5.9
UAAC        4.5        4.9            9.8        5.2
UAGA        0          8.5            1.6        6.7
UAGG        0.4       12.2            1.4        6.5
UAGU        1.2        0              2.4        3.5
UAGC        0          0              2.4        5.2
UGAA        2.5        9.8            7.4       13.8
UGAG        1.2       18.3            5.9       19.4
UGAU       10.7        0             12.6        6.9
UGAC        0.8        0              4.5        8.6
(a) A total of 2455 E. coli genes were analyzed; the highly expressed subset numbered 250. Out of a total of 5208 mammalian genes analyzed, the highly expressed subset numbered 82. The signals that are used at a higher frequency in highly expressed genes are shown in bold type (35, 65).
expressed genes of bacteria, use a subset of efficiently decoded termination signals, for example, UAAA and UAAG. Signals of low efficiency are avoided, such as UGAC and UGAU (Table III). This may be significant physiologically, not only in terms of rates of synthesis, but also to avoid recoding of weak stop signals by noncognate tRNAs (36). In this regard, it is interesting that premature termination mutations in a conserved region of the gene for the transmembrane conductance regulator involved in cystic fibrosis can be less severe. This may relate to relatively high levels of noncognate tRNA recoding, and therefore readthrough at some loci. This occurs in the homologous gene in yeast, Ste6p, when a stop codon is put in a context whereby C is the fourth base of the signal (139).
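The comparison drawn from Table III can be reproduced with a few lines of code. The sketch below assumes that the four data columns of the table are, in order, highly expressed E. coli, highly expressed mammals, total E. coli, and total mammals (each column then sums to about 100%); the helper names `enriched` and `absent` are hypothetical.

```python
SIGNALS = ["UAAA", "UAAG", "UAAU", "UAAC", "UAGA", "UAGG",
           "UAGU", "UAGC", "UGAA", "UGAG", "UGAU", "UGAC"]

# Percentages from Table III, under the assumed column order.
HIGHLY_EXPRESSED = {
    "E. coli": [5.3, 21.3, 52, 4.5, 0, 0.4, 1.2, 0, 2.5, 1.2, 10.7, 0.8],
    "Mammals": [22, 20.7, 3.7, 4.9, 8.5, 12.2, 0, 0, 9.8, 18.3, 0, 0],
}
TOTAL = {
    "E. coli": [13.7, 12.8, 25.5, 9.8, 1.6, 1.4, 2.4, 2.4, 7.4, 5.9, 12.6, 4.5],
    "Mammals": [10.5, 7.9, 5.9, 5.2, 6.7, 6.5, 3.5, 5.2, 13.8, 19.4, 6.9, 8.6],
}


def enriched(organism: str) -> list:
    """Signals used more often in highly expressed genes than overall."""
    pairs = zip(SIGNALS, HIGHLY_EXPRESSED[organism], TOTAL[organism])
    return [signal for signal, high, total in pairs if high > total]


def absent(organism: str) -> list:
    """Signals not used at all in the highly expressed subset."""
    pairs = zip(SIGNALS, HIGHLY_EXPRESSED[organism])
    return [signal for signal, high in pairs if high == 0]


for organism in ("E. coli", "Mammals"):
    print(organism, "enriched:", enriched(organism), "absent:", absent(organism))
```

With these numbers the enriched set for E. coli is UAAG and UAAU, and the absent set for mammals is UAGU, UAGC, UGAU, and UGAC, in line with the statements in the text. The highly expressed subsets are small (250 and 82 genes), so a zero in those columns is a statement about the sample rather than proof that a signal is never used.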
C. Is Translational Infidelity an Early or Late Event?
1. RECODING “EARLY”
Originally, translational stopping may have been a default mechanism when an unspecified or true “nonsense” codon entered the decoding site of the protoribosome. At this stage of evolution, the process would have been mediated independent of any protein factor. The product may have either
fallen off the ribosome, or an alternative event may have occurred during the prolonged pause at the nonsense codon. Indeed, this situation has been simulated by creating an artificial codon for a nonstandard amino acid in an mRNA for which there is no natural decoding tRNA (66). In this case, frameshifting events additional to termination are common in the absence of the decoding tRNA. This suggests that it would be possible for recoding events to arise at these true “nonsense” codons; that is, they could have been early events before a defined translational termination mechanism evolved. In most cases, the nonsense codon would have been captured subsequently for a specific translational termination mechanism. However, the more complex recoding events of frameshifting and translational hopping could also have been fine-tuned by the acquisition of a stop codon at the site. Pertinent to this argument is the status of selenocysteine as an ancient or as a relatively recent addition to the complement of amino acids found in proteins. Although this is somewhat contentious, it has been argued that a late addition of selenocysteine is more difficult to explain because all three lineages of eukaryotes, archaebacteria, and eubacteria have selenoproteins, which suggests that selenocysteine was present before these lineages separated (111). This implies that selenocysteine was one of the earliest amino acids and was encoded by UGA. Why then are there now only a few examples where UGA encodes selenocysteine? It has been suggested that the high susceptibility of this amino acid to oxidation (since the introduction of oxygen into the earth’s atmosphere) and its extreme sensitivity to metal ions have meant that the lower catalytic efficiency of cysteine became more acceptable because of its greater stability (111). Selenocysteine may have been retained only in special environments (19).
UGA would then have acquired the function of translational stopping as a “sense codon takeover.” In a few cases, the original function of UGA as a selenocysteine codon might have been maintained by mechanisms that gave a clear competitive advantage to a sense codon function for UGA. Indeed, there appears to be a reverse precedent for this in the mammalian mitochondria, where UGA is now used as a sense codon for tryptophan and there is no RF that recognizes the codon as stop. The prokaryote-like RF-2 seems to have been lost whereas the prokaryote-like RF-1 (recognizing UAA and UAG) has been retained. Because it is believed that mitochondria evolved from the subdivision of purple bacteria (149), it is assumed that at some point both the RF-2 and tRNA(Trp) would have coexisted, with the cognate tRNA having a significant selective advantage over RF-2 for decoding UGA (150).
2. RECODING “LATE”
If the coding sequence of an mRNA is translated with high fidelity, a small loss of accuracy in reading the stop signal may not compromise the cell.
Aberrant proteins could be degraded without obvious penalty, providing they are not inhibitory for any cellular process. The fact that many E. coli species harbor UAG suppressors without compromising growth, and that UGA is relatively leaky as a stop codon in this organism (151, 152), supports this premise. Sites of translational stop signals where there was a more pronounced error frequency might have been good targets for “recoding takeover,” with the use of the codon for an alternative purpose such as selenocysteine incorporation, readthrough with a noncognate tRNA, or a frameshifting event. Although there may be examples of the inaccurate decoding of translational stop signals that have no physiological significance in cells today, there are clearly some situations where the infidelity has been captured for an event of physiological importance.
VIII. Conclusion
The last decade of research has elevated the oft-forgotten translational stop codon into the realm of significant cell physiology. From a period in which “test-tube” infidelities in the decoding of the stop codon were only of esoteric interest, we have moved to an era in which the discovery of recoding events has revealed a new dimension in cellular regulation. New subtleties in the way the stop signal contributes to this are likely to emerge. There is a compelling need now to match these discoveries with a fundamental understanding of the normal mechanism of how the stop signal is decoded by the release factor on the ribosome, and why this decoding fails in the small number of cases where the signal has a dual function.
ACKNOWLEDGMENTS
Thanks to Dr. Chris Brown for helpful suggestions for the manuscript. The authors are supported by an International Scholar award of the Howard Hughes Medical Institute to W.P.T., a Human Frontier Science Program grant (awarded to Yoshikazu Nakamura and W.P.T.), and grants from The Health Research Council of New Zealand and the NZ Lotteries Board.
REFERENCES
1. A. Garen, Science 160, 149 (1968).
2. A. S. Sarabhai, A. O. W. Stretton, S. Brenner and A. Bolle, Nature 201, 13 (1964).
3. M. G. Weigert, E. Gallucci, E. Lanka and A. Garen, CSHSQB 31, 145 (1966).
4. A. Garen, S. Garen and R. C. Wilhelm, JMB 14, 167 (1965).
5. E. Gallucci and A. Garen, JMB 15, 193 (1966).
6. M. G. Weigert and A. Garen, Nature 206, 992 (1965).
7. M. G. Weigert, E. Lanka and A. Garen, JMB 23, 391 (1967).
8. S. Brenner, A. O. W. Stretton and S. Kaplan, Nature 206, 994 (1965).
9. S. Brenner, L. Barnett, E. R. Katz and F. H. C. Crick, Nature 213, 449 (1967).
10. W. Fiers, R. Contreras, F. Duerinck, G. Haegeman, D. Iserentant, J. Merregaert, W. Min Jou, F. Molemans, A. Raeymaekers, A. Van den Berghe, G. Volckaert and M. Ysebaert, Nature 260, 500 (1976).
11. R. E. Marshall, C. T. Caskey and M. Nirenberg, Science 155, 820 (1967).
12. F. H. C. Crick, JMB 38, 367 (1968).
13. B. G. Barrell, A. T. Bankier and J. Drouin, Nature 282, 189 (1979).
14. T. H. Jukes and S. Osawa, Experientia 46, 1117 (1990).
15. J. E. Heckman, J. Sarnoff, B. Alzner-DeWeerd, S. Yin and U. L. RajBhandary, PNAS 77, 3159 (1980).
16. S. Osawa, T. H. Jukes, K. Watanabe and A. Muto, Microbiol. Rev. 56, 229 (1992).
17. N. Lehman and T. H. Jukes, J. Theor. Biol. 135, 203 (1988).
18. T. Oba, Y. Andachi, A. Muto and S. Osawa, PNAS 88, 921 (1991).
19. A. Bock, K. Forchhammer, J. Heider, W. Leinfelder, G. Sawers, B. Veprek and F. Zinoni, Mol. Microbiol. 5, 515 (1991).
20. S. Osawa, A. Muto, T. H. Jukes and T. Ohama, Proc. R. Soc. Lond. B 241, 19 (1990).
21. R. H. Buckingham, Experientia 46, 1126 (1990).
22. J. F. Atkins, R. B. Weiss and R. F. Gesteland, Cell 62, 413 (1990).
23. M. M. Fluck, W. Salser and R. H. Epstein, MGG 151, 137 (1977).
24. W. Salser, MGG 105, 125 (1969).
25. W. Salser, M. Fluck and R. Epstein, CSHSQB 34, 513 (1969).
26. R. H. Buckingham, E. J. Murgola, P. Sorensen, F. T. Pagel, K. A. Hijazi, B. H. Mims, N. Figueroa, D. Brechemier-Baey and E. Coppin-Raynal, in “The Ribosome: Structure, Function and Evolution” (W. E. Hill, A. E. Dahlberg, R. A. Garrett, P. B. Moore, D. Schlessinger and J. R. Warner, eds.), p. 541. American Society for Microbiology, Washington, D.C., 1990.
27. R. H. Buckingham, P. Sorensen, F. T. Pagel, K. A. Hijazi, B. H. Mims, D. Brechemier-Baey and E. J. Murgola, BBA 1050, 259 (1990).
28. A. M. Weiner and K. Weber, JMB 80, 837 (1973).
29. G. Li and C. M. Rice, J. Virol. 67, 5062 (1993).
30. J. F. Atkins and R. F. Gesteland, EJB 137, 509 (1983).
31. J. M. Skuzeski, L. M. Nichols, R. F. Gesteland and J. F. Atkins, JMB 218, 365 (1991).
32. G. Eggertsson and D. Soll, Microbiol. Rev. 52, 354 (1988).
33. A. L. Beaudet and C. T. Caskey, PNAS 68, 619 (1971).
34. W. P. Tate and C. M. Brown, Bchem 31, 2443 (1992).
35. E. S. Poole, C. M. Brown and W. P. Tate, EMBO J. 14, 151 (1995).
36. K. K. McCaughan, C. M. Brown, M. E. Dalphin, M. J. Berry and W. P. Tate, PNAS 92, 5431 (1995).
37. E. J. Murgola, K. A. Hijazi, H. U. Goringer and A. E. Dahlberg, PNAS 85, 4162 (1988).
38. C. D. Prescott and H. C. Kornau, NARes 20, 1567 (1992).
39. H. Moine and A. E. Dahlberg, JMB 243, 402 (1994).
40. C. D. Prescott and H. U. Goringer, NARes 18, 5381 (1990).
41. M. O'Connor and A. E. Dahlberg, PNAS 90, 9214 (1993).
42. Z. Shen and T. D. Fox, NARes 17, 4535 (1989).
43. R. Rosset and L. Gorini, JMB 39, 95 (1969).
44. W. Piepersberg, A. Bock and H. G. Wittmann, MGG 140, 91 (1975).
45. L. Gorini, in “Ribosomes” (M. Nomura, A. Tissieres and P. Lengyel, eds.), p. 791. CSHLab, Cold Spring Harbor, New York, 1974.
46. L. R. Topisirovic, M. Villarroel, M. De Wilde, A. Herzog, T. Cabezon and A. Bollen, MGG 151, 89 (1977).
47. L. A. Kirsebom and L. A. Isaksson, PNAS 82, 717 (1985).
48. M. C. Ganoza, CSHSQB 31, 273 (1966).
49. M. R. Capecchi, PNAS 58, 1144 (1967).
50. C. T. Caskey, R. Tompkins, E. Scolnick, T. Caryk and M. Nirenberg, Science 162, 135 (1968).
51. E. Scolnick, R. Tompkins, T. Caskey and M. Nirenberg, PNAS 61, 768 (1968).
52. G. Milman, J. Goldstein, E. Scolnick and T. Caskey, PNAS 63, 183 (1969).
53. M. R. Capecchi and H. A. Klein, CSHSQB 34, 469 (1969).
54. J. Goldstein, G. Milman, E. Scolnick and T. Caskey, PNAS 65, 430 (1970).
55. D. S. Konecki, K. C. Aune, W. P. Tate and C. T. Caskey, JBC 252, 4514 (1977).
56. S. M. Ryden and L. A. Isaksson, MGG 193, 38 (1984).
57. C. C. Lee, Y. Kohara, K. Akiyama, C. L. Smith, W. J. Craigen and C. T. Caskey, J. Bact. 170, 4537 (1988).
58. K. Kawakami, Y. H. Jonsson, G. R. Björk, H. Ikeda and Y. Nakamura, PNAS 85, 5620 (1988).
59. O. Mikuni, K. Ito, J. Moffat, K. Matsumura, K. McCaughan, T. Nobukuni, W. Tate and Y. Nakamura, PNAS 91, 5798 (1994).
59a. G. Grentzmann, D. Brechemier-Baey, V. Heurgue, L. Mora and R. Buckingham, PNAS 91, 5848 (1994).
60. I. Stansfield and M. F. Tuite, Curr. Genet. 25, 385 (1994).
61. L. Frolova, X. Le Goff, H. H. Rasmussen, S. Cheperegin, G. Drugeon, M. Kress, I. Arman, A-L. Haenni, J. E. Celis, M. Philippe, J. Justesen and L. Kisselev, Nature 372, 701 (1994).
62. J. Smrt, W. Kemper, T. Caskey and M. Nirenberg, JBC 245, 2753 (1970).
63. W. P. Tate, B. Greuer and R. Brimacombe, NARes 18, 6537 (1990).
64. R. D. Ricker and A. Kaji, NARes 19, 6573 (1991).
65. C. M. Brown, P. A. Stockwell, M. E. Dalphin and W. P. Tate, NARes 22, 3620 (1994).
66. J. D. Bain, C. Switzer, A. R. Chamberlin and S. A. Benner, Nature 356, 537 (1992).
67. W. Saenger, “Principles of Nucleic Acid Structure.” Springer-Verlag, New York, 1984.
68. E. J. Murgola, A. E. Dahlberg, K. A. Hijazi and A. A. Tiedeman, in “The Ribosome: Structure, Function and Evolution” (W. E. Hill, A. E. Dahlberg, R. A. Garrett, P. B. Moore, D. Schlessinger and J. R. Warner, eds.), p. 402. American Society for Microbiology, Washington, D.C., 1990.
69. C. D. Prescott, B. Kleuvers and H. U. Goringer, Biochimie 73, 1121 (1991).
70. C. Prescott, L. Krabben and K. Nierhaus, NARes 19, 5281 (1991).
71. H. A. Raue, W. Musters, C. A. Rutgers, J. Van’t Riet and R. J. Planta, in “The Ribosome: Structure, Function and Evolution” (W. E. Hill, A. E. Dahlberg, R. A. Garrett, P. B. Moore, D. Schlessinger and J. R. Warner, eds.), p. 217. American Society for Microbiology, Washington, D.C., 1990.
72. P. B. Moore, in “The RNA World” (R. F. Gesteland and J. F. Atkins, eds.), p. 119. CSHLab, Cold Spring Harbor, New York, 1993.
73. J. Shine and L. Dalgarno, PNAS 71, 1342 (1974).
74. P. Purohit and S. Stern, Nature 370, 659 (1994).
75. H. F. Noller, V. Hoffarth and L. Zimniak, Science 256, 1416 (1992).
76. W. P. Tate, C. D. Ward, C. N. A. Trotman, R. Lührmann and G. Stoffler, Biochem. Int. 7, 529 (1983).
INFIDELITIES OF TRANSLATIONAL STOP SIGNALS
333
77. C. T. Caskey, L. Bosch and D. S. Konecki, J B C 252, 4435 (1977). 78. J. M. Neefs, Y. Vandepeer, P. Derijk, A. Goris and R. Dewachter, NARes 19, 1987 (1991). 79. G. J. Olsen, R. Overbeek, N. Larsen, T. L. Marsh, M. J. McCaughey, M . A. Maciukenas, W. M. Kuan, T J. Macke, Y. Q. Xing and C. R. Woese, NARes 20, 2199 (1992). 80. R. Brimacombe, Bchein 27, 4207 (1988). 81. H . U . Goringer, K. A. Hijazi, E . J. Murgola and A. E. Dahlberg, PNAS 88, 6603 (1991). 82. G. J. Olsen, N. Larsen and C. R. Woese, NARes 19, 2017 (1991). 83. S. Stern, T. Powers, L.-M. Changchien and H. F. N o h , Science 244, 783 (1989). 84. P. R. Cunningham, K. Nurse, C . J. Weitzmann, D. Negre and J. Ofengand, Bchern 31, 7629 (1992). 85. P. R. Cunningham, K. Nurse, C. J. Weitzniann and J. Ofengand, Bchern 32, 7172 (1993). 86. H. F. Noller, ARB 60, 191 (1991). 87. B. Kastner, C. N. A. Trotman and W. P. Tate, J M B 212, 241 (1990). 88. J. Rinke-Appel, N . Jiinke, R. Brimacomhe, S. Dokudovskaya, 0. Dontsova and A. Bogdanov, NARes 21, 2853 (1993). 89. C. M. Brown, K. K. McCaughan and W. P. Fate, NARes 21, 2109 (1993). 90. J. E. G. McCarthy and R. Brimacombe, Trends Genet. 10, 402 (1994). 91. 0. Dontsova, S. Dokudovskaya, A. Kopylov, A. Bogdanov, J. Rinke-Appel, N . Junke and R. Brimacombe, EMBO J . 11, 3105 (1992). 92. C. M. Brown and W. P. Tate, JBC 269, 33164 (1994). 93. W. P. Tate, A. L. Beaudet and C. T. Caskey, PNAS 70, 2350 (1973). 94. D. L. Hatfield, D. W. E. Smith, B. J. Lee, P. J. Worland and S. Oroszlan, Crit. Aeu. Biochem. Mol. Biol. 25, 71 (1990). 95. W. P. Tate, F. M. Adamski, C. M . Brown, M. E. Dalphin, J. P. Gray, J. A. Horsfield, K. K. McCaughan, J. G. MoKat, R. J. Powell, K. M. Timms and C. N. A. Trotman, in “The Translational Apparatus: Structure, Function, Regulation, Evolution” (K. H. Nierhaus, F. Franceschi, A. R. Subramanian, V. A. Erdmann and 8. Wittmann-Liebold, eds.), p. 253. Plenum, New York and London, 1993. 96. J. G. Moffat and W. P. Tate, ]BC 269, 18899 (1994). 97. H. 
J. Pel, M. Rep and L. A. Grivell, NARes 20, 4423 (1992). 98. W. J. Craigen, R. G. Cook, W. P. Fate and C. T. Caskey, PNAS 82, 3616 (1985). 99. R. B. Weiss, D. M. Dunn, A. E . Dahlberg, J. F. Atkins and R. F. Gesteland, E M B O J . 7, 1503 (1988). 100. W. J. Craigen and C. T. Caskey, Natui-e 322, 273 (1986). 101. B. C. DonIy, C. D. Edgar, F. M . Adamski and W. P. Ete, NARes 18, 6517 (1990). 102. S. Matsufuji, T. Matsufuji, Y. Miyazaki, Y. Murakanii, J. F. Atkins, R. F. Gesteland and S. Hayashi, Cell 80, 51 (1995). 103. E. Rom and C. Kahana, PNAS 91, 3959 (1994). 104. R. B. Weiss, D. M. Dunn, J. F. Atkins and R. F. Gesteland, C S H S Q B 52, 687 (1987). 105. T. Jacks, H. D. Madhani, F. R . Masiarz and H. E. Varmus, Cell 55, 447 (1988). 106. R. 8. Weiss, D. M. Dunn, M. Shuh, J. F. Atkins and R. F. Gesteland, New Biologist 1, 159 (1989). 107. J. A. Horsfield, D. N. Wilson, S. A. Mannering, F. M. Adamski and W. P. Tate, NARes 23, 1487 (1995). 108. W. M. Huang, S:Z. Ao, S. Casjens, R. Orlandi, R. Zeikus, R. Weiss, D. Winge and M. Fang, Science 239, 1005 (1988). 109. K. L. Herbst, L. M. Nichols, R. F. Gesteland and R. B. Weiss, PNAS 91, 12525 (1994). 110. S . C. Wong and A. T. Abdelal, J. Bnct. 172, 630 (1990). 111. A. Bock, K. Forchharnmer, J. Heider and C. Baron, Trends Biochern. Sci. 16, 463 (1991). 112. M. J. Berry, L. Banu, J. W. Harney and P. R. Larsen, E M B O ] . 12, 3315 (1993).
334
WARREN P. TATE ET AL.
113. M. J. Berry and P. R. Larsen, Biochem.’ Soc. Trans. 21, 827 (1993). 114. M. J. Berry, L. Banu, Y. Chen, S. J. Mandel, J. D. Kieffer, J. W. HarneyandP. R. Larsen, Nature 353, 273 (1991). 115. K. E. Hill, R. S. Lloyd and R. F. Burk, PNAS 90, 537 (1993). 116. P. J. Farabaugh, Cell 74, 591 (1993). 117. H. Hofstetter, H.-J. Monstein and C. Weissman, BBA 374, 238 (1974). 118. M. Ishikawa, T. Meshi, F. Motoyoshi, N. Tdkamatsu and Y. Okada, NARes 14,8291 (1986). 119. R. C. Nutter, K. Scheets, L. C. Panganiban and S. A. Lonimel, NARes 17, 3163 (1989). 120. E. G. Strauss, C. M. Rice and J. H. Strauss, PNAS 80, 5271 (1983). 121. K. Takkinen, NARes 14, 5667 (1986). 122. N . M. Wills, R. F. Gesteland and J. F. Atkins, PNAS 88, 6991 (1991). 123. 8 . Marshall and S. B. Levy, Nature 286, 524 (1980). 124. J. Sambrook, E. F. Fritsch and T. Maniatis, “Molecular Cloning: A Laboratory Manual,” 2nd Ed. CSHLab, Cold Spring Harbor, New York, 1989. 125. P. M. Sharp and M. Bulmer, Gene 63, 141 (1988). 126. C. M. Brown, P. A. Stockwell, C. N. A. Trotman and W. P. Tate, NARes 18, 2079 (1990). 127. C. M. Brown, P. A. Stockwell, C. N. A. Trotinan and W. P. Tate, NARes 18,6339 (1990). 128. J. Kohli and H. Grosjean, MGG 182, 430 (1981). 129. D. R. Cavener and S. C. Ray, NARes 19, 3185 (1991). 130. P. M. Sharp, C. J. Burgess, E. Cowe, A. T. Lloyd and K. J. Mitchell, in “Transfer RNA in Protein Synthesis” (D. L. Hatfield, B. J. Lee and R. M. Pirtle, eds.), p. 397. CRC Press, Boca Raton, FL, 1992. 131. M. Kozak, NARes 15, 8125 (1987). 132. M. Kozak, J . Cell. B i d . 115, 887 (1992). 133. P. M. Sharp, M . Stenico, J. F. Peden and A. T. Lloyd, Biochem. SOC. Trans. 21, 835 (1993). 134. M. McClelland and A. S. Bhagwat, Nature 355, 595 (1992). 135. G. Gutierrez, J. Casadesus, J. L. Oliver and A. Marin, J. Mol. E d . 39, 340 (1994). 136. W. T. Pedersen and J. F. Curran, JMB 219, 231 (1991). 137. R. Martin, M. K. Phillips-Jones, F. J. Watson and L. S. Hill, Biochem. Soc. Trans. 21,846 (1993). 138. J. 
B. Kopczynski, A. C. Raffand J. J. Bonner, MGG 234, 369 (1992). 139. K. Fearon, V. McClendon, B. Bonetti and D. M. Bedwell, JBC 269, 17802 (1994). 140. J. P. Tassan, K. Le Guellec, M. Kress, M. Faure, J. Camonis, M. Jacquet and M. Philippe, MCBiol 13, 2815 (1993). 141. M. Yarus and J. Curran, in “Transfer RNA in Protein Synthesis” (D. L. Hatfield, B. J. Lee and R. M. Pirtle, eds.), p. 319. CRC Press, Boca Raton, FL, 1992. 142. H. Engelberg-Knlka and R. Schoulaker-Schwarz, Trends Biochem. Sci. 13, 419 (1988). 143. C. Baron and A. Biick, JBC 266, 20375 (1991). 144. D. Behne, H. Hilniert, S. Scheid, H. Gessner and W. Elger, BBA 966, 12 (1988). 145. D. Behne, S . Scheid, A. Kyriakopoulos and H. Hilmert, BBA 1033, 219 (1990). 146. M. Ehrenberg and C. G. Kurland, Q. Reu. Biophys. 17, 45 (1984). 147. H. Bremer and P. P. Dennis, in “Escherichia coli and Salmonella typhimurium” (F. C. Neidart, ed.), p. 1527. American Society for Microbiology, Washington, D.C., 1987. 148. F. M. Adamski, K. K. McCaughan, F. J~rgensen,C. 6 . Kurland and W. P. Tate, J M B 238, 302 (1994). 149. D. Yang, Y. Oyaizu, H. Oyaizu, G. J. Olsen and C. R. Woese, PNAS 82, 4443 (1985). 150. C. C. Lee, K. M . Tinims, C. N. A. Trotman and W. P. Tate, JBC 262, 3548 (1987). 151. D. Hirsh and L. Gold, JMB 58, 459 (1971). 152. P. Model, R. E. Webster and N. D. Zinder, JMB 43, 177 (1969).
INFIDELITIES OF TRANSLATIONAL STOP SIGNALS
335
153. F. Major, M. Turcotte, D. Gautheret, 6. Lapalme, E. Fillion and R. Cedergren, Science 253, 1255 (1991). 154. T. Caskey, E . Scolnick, R. Tompkins, J. Goldstein and G. Milman, CSHSQB 34, 479 (1969). 155. H . J. Pel, C . Maat, M. Rep and L. A. Grivell, NARes 20, 6339 (1992). 156. J. L. Goldstein, A. L. Beaudet and C. T. Caskey, PNAS 67, 99 (1970). 157. M. F. Tuite and I. Stansfield, Nature 372, 614 (1994).
Structure of Replicating Chromatin

CLAUDIA GRUSS AND ROLF KNIPPERS

Fakultät für Biologie
Universität Konstanz
D-78434 Konstanz, Germany

I. Structure of Chromatin: An Overview
II. Methodology
III. Assembly in Vitro: Reconstitution of Chromatin
IV. Experimental Systems for the Study of Replicative Chromatin Assembly
V. Replicating Chromatin: Basic Questions
VI. Nucleosome-free Origins
VII. Chromatin Structure on Replicated DNA Strands
VIII. Histone H1 and the Folding of Replicating Chromatin
IX. Replication-dependent Histone Modifications
X. Conclusions
References
The large genome of a eukaryotic cell is organized in the nucleus as a complex nucleoprotein structure, chromatin. The eukaryotic genome is thus condensed by several orders of magnitude relative to B-form DNA, but in spite of this, chromatin must remain available as a substrate for transcription and replication. The structure of transcribed chromatin has attracted considerable attention during the past few years and has been the subject of several recent reviews (1-8). However, the most dramatic changes in chromatin structure occur during genome replication, when advancing replication forks invade the parental chromatin and when new chromatin is assembled on emerging progeny DNA strands. In this essay, we summarize present knowledge describing structural changes in chromatin as replication-dependent events. We do not discuss the process of semiconservative DNA replication itself, and we do not describe the function of replication enzymes or other proteins involved in eukaryotic DNA replication. The replication of eukaryotic DNA is well described in several recent reviews (9-12). To provide an appropriate basis for the arguments elaborated below, we begin with a brief overview describing some structural features of chromatin relevant to the topic of this essay (for details, see Refs. 13 and 14).
I. Structure of Chromatin: An Overview

In higher eukaryotes, virtually all of the genome is packed into nucleosomes. A nucleosome is composed of a histone octamer, one molecule of histone H1, and 200 (±15) base-pairs (bp) of DNA. The primary chromatin subunit is the nucleosome core particle, which consists of a histone octamer and a 146-bp segment of DNA wrapped around its surface. The nucleosome cores are distributed along the DNA, separated from one another by linker DNA segments.
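As a rough illustration of the numbers quoted above, the average linker length follows from subtracting the 146-bp core from the nucleosome repeat. The sketch below assumes a repeat of exactly 200 bp and, purely for illustration, a hypothetical genome of 3 × 10^9 bp; the function names and the genome size are assumptions and do not come from the text.

```python
CORE_BP = 146      # DNA wrapped around the histone octamer (from the text)
REPEAT_BP = 200    # nominal nucleosome repeat length (from the text)


def linker_length(repeat_bp: int = REPEAT_BP, core_bp: int = CORE_BP) -> int:
    """Average linker DNA per nucleosome, in base pairs."""
    if repeat_bp < core_bp:
        raise ValueError("repeat length cannot be shorter than the core")
    return repeat_bp - core_bp


def nucleosome_count(genome_bp: int, repeat_bp: int = REPEAT_BP) -> int:
    """Number of whole nucleosome repeats that fit on genome_bp of DNA."""
    return genome_bp // repeat_bp


if __name__ == "__main__":
    # Hypothetical genome size, chosen only to show the order of magnitude.
    print(linker_length())                  # 54
    print(nucleosome_count(3_000_000_000))  # 15000000
```

With these assumptions, each nucleosome carries about 54 bp of linker DNA, and a genome of this size would hold on the order of 10^7 nucleosomes.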
A. Histones

Histones have at least two structural domains, a central structured globular domain and an amino-terminal flexible basic extension or arm (Fig. 1). The C-terminal domains involved in histone-histone interactions inside nucleosomal DNA are predominantly α-helical, with a long central helix bordered on each side by a loop segment and a shorter helix (15). All known sites of reversible modifications, such as acetylations and phosphorylations, are located in the flexible basic domains (Fig. 2) (16). Significantly, histone H3 and histone H4 can exist as stable tetramers, and histones H2A and H2B as stable dimers, in solution. Treatment of native chromatin with increasing salt concentrations causes the initial release of H2A/H2B dimers and the subsequent release of H3/H4 tetramers (17).

FIG. 1. Structure of histones. This diagram emphasizes the general architecture of histones with their central globular domain and their flexible basic arms. N, Amino terminus; K, lysine; R, arginine; O, other amino-acid residues. The bracketed numbers refer to the total number of amino acids in mammalian histones. Histones H2A and H2B form dimers, and histones H3 and H4 form tetramers in solution.

FIG. 2. Properties of mammalian histones. We summarize the total size (in number of amino-acid residues), the lengths of the flexible arms (see Fig. 1), and the sites of reversible amino-acid modifications. Ac, Acetylation; P, phosphorylation; K, lysine; R, arginine; T, threonine; S, serine. The numbers refer to the positions of the respective amino-acid residues in the polypeptide chain. For example, Ac-K9 indicates that the side chain of the lysine residue at position 9 in the amino-acid sequence of histone H3 can be modified by acetylation.

Histone   Modifications
H3        Ac-K9; P-S10; Ac-K14; Ac-K18; Ac-K23; Ac-K27
H4        P-S1; Ac-K5; Ac-K8; Ac-K12; Ac-K16; Ac-K20
H2A       P-S1; Ac-K5; Ac-K9
H1        P-S145; P-S173; P-S180; mitosis: P-S16; P-T17; P-T136
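The modification labels used in Fig. 2 follow a simple pattern: a modification type (Ac or P), a one-letter residue code, and a position in the polypeptide chain. A minimal sketch of how such labels could be decoded is given below; the names `Modification` and `parse_modification` are hypothetical and serve only to illustrate the notation.

```python
import re
from typing import NamedTuple

# Abbreviations as defined in the legend to Fig. 2.
KINDS = {"Ac": "acetylation", "P": "phosphorylation"}
RESIDUES = {"K": "lysine", "R": "arginine", "T": "threonine", "S": "serine"}

# A label is <type>-<residue letter><position>, for example "Ac-K9".
_LABEL = re.compile(r"^(Ac|P)-([KRTS])(\d+)$")


class Modification(NamedTuple):
    kind: str       # "acetylation" or "phosphorylation"
    residue: str    # full residue name
    position: int   # position in the polypeptide chain


def parse_modification(label: str) -> Modification:
    """Decode a label such as 'Ac-K9' into its three components."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized modification label: {label!r}")
    kind, residue, position = match.groups()
    return Modification(KINDS[kind], RESIDUES[residue], int(position))


if __name__ == "__main__":
    print(parse_modification("Ac-K9"))
    print(parse_modification("P-S10"))
```

Under this reading, "Ac-K9" decodes to acetylation of the lysine at position 9, which matches the example given in the legend.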
B. Core Particles

A nucleosome core particle has the form of a wedge-shaped disk, about 11 nm in diameter and 5.5 nm high. DNA forms 1.75 left-handed turns around the outside of the histone octamer with a pitch of roughly 3 nm (18) (Fig. 3). Hydroxyl-radical cleavage and other methods show that the DNA on nucleosome cores is slightly overwound by 0.25-0.35 bp/turn, resulting in an average helical periodicity of about 10.2 bp/turn compared to about 10.5 bp/turn for B-form DNA in solution (19, 20). Structural analyses also indicate that the H3/H4 tetramer is in the center of the nucleosome core, flanked on each side by an H2A/H2B dimer (18).
FIG. 3. Nucleosome core particle. Histone H1 seals two turns of DNA (168 bp) around the histone octamer. This model describes the octamer as a short central cylinder, formed by the histone H3/H4 tetramer, laterally covered by histone H2A/H2B dimers. Labels in the figure: nucleosome; histone octamer; H3/H4 tetramer; H2A/H2B dimer; 3.0 nm pitch.
The stability of the core particle is determined by interactions between the inner globular histone domains because the flexible histone arms can be removed by mild protease treatment without much effect on the general stability of the core particle or the helical periodicity of nucleosomal DNA (20). However, the amino-terminal histone arms also appear to bind to specific sections of the DNA. For example, the amino-terminal domains of histones H3 bind to DNA sections at the entry and exit from the core particle, whereas the amino-terminal domains of histones H4 interact with the internal 90 bp of nucleosomal DNA (21).
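The figures quoted in this section for the core particle (146 bp, 1.75 superhelical turns, and about 10.2 versus 10.5 bp/turn) can be combined in a short calculation. The sketch below is only an arithmetic illustration; the axial rise of 0.34 nm per base pair is the standard B-form value and is an assumption not stated in the text, as are the function and key names.

```python
CORE_BP = 146              # DNA in the core particle (from the text)
SUPERHELICAL_TURNS = 1.75  # left-handed turns around the octamer (from the text)
BP_PER_TURN_CORE = 10.2    # helical periodicity on the core (from the text)
BP_PER_TURN_FREE = 10.5    # helical periodicity of B-form DNA in solution
RISE_NM_PER_BP = 0.34      # assumption: standard B-form axial rise


def core_particle_summary() -> dict:
    """Derive simple geometric quantities from the values quoted in the text."""
    turns_core = CORE_BP / BP_PER_TURN_CORE
    turns_free = CORE_BP / BP_PER_TURN_FREE
    return {
        # DNA length consumed by one superhelical turn around the octamer
        "bp_per_superhelical_turn": CORE_BP / SUPERHELICAL_TURNS,
        # double-helical turns in 146 bp, on the core and free in solution
        "helical_turns_on_core": turns_core,
        "helical_turns_in_solution": turns_free,
        # extra helical turns caused by overwinding on the core
        "extra_helical_turns": turns_core - turns_free,
        # contour length of the core DNA if it were stretched out
        "contour_length_nm": CORE_BP * RISE_NM_PER_BP,
    }


if __name__ == "__main__":
    for name, value in core_particle_summary().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, one superhelical turn takes up roughly 83 bp, the overwinding adds about 0.4 helical turns over the 146-bp core, and nearly 50 nm of DNA is wrapped onto a particle 11 nm across.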
C. Histone H1 and Higher Order Chromatin Structures
Chromatin is a dynamic structure. Experiments in vitro show an extended 10-nm filament at low ionic strength (