Progress in Nucleic Acid Research and Molecular Biology, Volume 69


Author: Kivie Moldave


Some Articles Planned for Future Volumes

The RNA World of Plant Mitochondria STEFAN BINDER, MICHAELA HOFMANN, JOSEF KUHN, AND KLAUS DASCHNER

Multiple Controlling Mechanisms of FGF-I Gene Expression through Multiple Tissue-Specific Promoters ING-MING CHIU, KATHY TOUHLISKY, AND CHRIS BARAN

CTD Phosphatase, Role in RNA Polymerase II Cycling, and Regulation of Transcript Elongation MICHAEL E. DAHMUS, NICK MARSHALL, AND PATRICK LIN

HIV-1 Nucleoprotein: Retroviral/Retrotransposons Nucleoproteins JEAN-LUC DARLIX

Biochemistry of Methanogenesis: Pathways, Genes, and Evolutionary Aspects UWE DEPPENMEIER

Manipulation of Aminoacylation Properties of tRNAs by Structure-Based and Combinational in Vitro Approaches RICHARD GIEGE AND JOERN PUETZ

Shunting and Reinitiation: Viral Strategies to Control Initiation of Translation THOMAS HOHN

Regulation of Yeast Glycolytic Gene Expression MICHAEL J. HOLLAND AND KATE WILLETT

Functions of Alphavirus Nonstructural Proteins in RNA Replication LEEVI KAARIAINEN AND TERO AHOLA

DNA-Protein Interactions Involved in the Initiation and Termination of Plasmid Rolling Circle Replication SALEEM A. KAHN, T.-L. CHANG, M. G. KRAMER, AND M. ESPINOSA

Nonribosomal Synthesis of Chromopeptides and Related Compounds from Streptomycetes and Fungi ULLRICH KELLER AND FLORIAN SCHAUWECKER

FGF3: A Gene with a Finely Tuned Spatiotemporal Pattern of Expression during Development CHRISTIAN LAVIALLE


Specificity and Diversity in DNA Recognition by E. coli Cyclic AMP Receptor Protein JAMES C. LEE

Functional Significance and Mechanism of eIF5-Promoted Hydrolysis in Eukaryotic Translation Initiation UMADAS MAITRA AND SUPRATIK DAS

DNA Polymerase III Holoenzyme: A Prototypical Replicative Complex CHARLES MCHENRY

Molecular Basis of Fidelity of DNA Synthesis and Nucleotide Specificity of Retroviral Reverse Transcriptase LUIS MENENDEZ-ARIAS

Catalytic Properties of the Translation Factors Necessary for mRNA Activation and Binding to 40S Subunits WILLIAM C. MERRICK

Initiation of Eukaryotic DNA Replication and Mechanisms HEINZ-PETER NASHEUER, KLAUS WEISSHART, AND FRANK GROSSE

Mechanisms of EF-Tu, a Pioneer GTPase ANDREA PARMEGGIANI AND IVO M. KRAB

Protein Kinase CK2-Linked Gene Expression Control WALTER PYERIN AND KARIN ACKERMANN

A Growing Family of Guanine Nucleotide Exchange Factors Is Responsible for the Activation of Ras Family GTPases LAWRENCE A. QUILLIAM

Steroid Hormone Regulation of mRNA Stability DAVID J. SHAPIRO AND ROBIN E. DODSON

HIV Transcriptional Regulation in the Context of Chromatin ERIC VERDIN

A Tale of Two Helicases: Roles of Phage and Animal Virus Helicases in DNA Replication and Recombination SANDRA K. WELLER AND BORIANA MARINTCHEVA

Modulation of RNA Function by Oligonucleotides Recognizing RNA Structure

J. J. TOULMÉ,*,†,1 C. DI PRIMO,*,† AND S. MOREAU*

*INSERM U 386, IFR Pathologies Infectieuses, Université Victor Segalen, 146 rue Léo-Saignat, 33076 Bordeaux cédex, France
†Institut Européen de Chimie et Biologie, CNRS FRE 2247, Avenue Pey Berland, B.P. 108, 33402 Talence cédex, France

I. RNA Structures: Targets of Interest
   A. Functional RNA Structures
   B. RNA Ligands
II. Antisense Oligonucleotides
   A. Principles, Mechanisms, and Limitations
   B. The Mini-Exon Sequence of Leishmania as a Target for Antisense Oligonucleotides
   C. Invading RNA Structures with Conventional RNA or DNA Oligonucleotides
   D. Chemically Modified Invader Oligonucleotides
   E. Affinity or Specificity: A Dilemma?
   F. Reactive Oligonucleotides
III. Triple-Stranded Complexes
   A. The Alphabet and the Chemistry
   B. Modified Nucleotides and Intercalating Agents for Triple Helices
   C. Clamp and Circular Oligonucleotides
   D. Double Hairpin
IV. Oligonucleotides Identified through Combinatorial Approaches
   A. SELEX and Aptamers
   B. Aptamers to the TAR RNA of HIV-1
   C. Loop-Loop (Kissing) Complexes
V. Conclusion
References

1Corresponding author. Tel: 33 (0)5 57 57 10 14; fax: 33 (0)5 57 57 10 15; E-mail: jean-jacques. [email protected]. Progress in Nucleic Acid Research and Molecular Biology, Vol. 69


Copyright © 2001 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/01 $35.00

Numerous RNA structures are responsible for regulatory processes either because they constitute a signal, like the hairpins or pseudoknots involved in ribosomal frameshifting, or because they are binding sites for proteins such as the trans-activating responsive RNA element of the human immunodeficiency virus whose binding to the viral protein Tat and cellular proteins allows full-length transcription of the retroviral genome. Selective ligands able to bind with high affinity to such RNA motifs may serve as tools for dissecting the molecular mechanisms in which they are involved. Such ligands might also constitute prototypes of therapeutic agents when RNA structures play a role in the expression of dysfunctional genes or in the multiplication of pathogens. Different classes of ligands (aminoglycosides, intercalating agents, peptides) are of interest to this aim. However, oligonucleotides deserve particular consideration. They have been extensively used in the frame of the antisense strategy. The apparent simplicity of this rational approach is, at first sight, very attractive. Indeed, numerous successful studies have been published describing the efficient inhibition of translation, splicing, or reverse transcription in cell-free systems, in cultured cells, or in vivo by oligomers complementary to an RNA region. However, RNA structures restrict the access of the antisense sequence to the target site: Competition from the intramolecular association of RNA regions weakens or even abolishes the antisense effect. Various possibilities have been developed to circumvent this limitation. These include both rational and combinatorial strategies. High-affinity oligomers were designed to invade the RNA structure. Alternatively, triplex-forming oligonucleotides (TFOs) and aptamers may recognize the folded RNA motif. Whereas the use of TFOs is rather limited owing to the strong sequence constraints for triple-helix formation, in vitro selection offers a way to explore vast oligoribo- or oligodeoxyribonucleotide libraries to identify strong, selective oligonucleotide binders. The candidates (aptamers) selected against the TAR RNA element of HIV-1, which form stable loop-loop (kissing) complexes with the target, provide interesting examples of oligonucleotides recognizing a functional RNA structure through an important contribution of tertiary interactions. © 2001 Academic Press.

I. RNA Structures: Targets of Interest

RNA has long been considered as a family of molecules (transfer RNA, messenger RNA, ribosomal RNA) playing a minor role in the transmission of the genetic information. Although a few antibiotics are RNA binders, most available drugs are targeted to proteins whose complex three-dimensional structure is supposed to offer a better chance to identify a selective ligand. Alternatively, DNA as the source of the information has also been of interest for drug design insofar as a gene can be demonstrated to be responsible for a given pathology: The uniqueness of the target guarantees that it can be saturated at a reasonable ligand concentration. However, multiple posttranscriptional steps (editing, splicing, chemical modification of either the bases or the ribose, polyadenylation, translatability, decay) offer room for regulatory events at the RNA level. Many regulation processes actually take place in living cells, either in the nucleus or in the cytoplasm. Moreover, the genetic material of numerous viruses, including such major pathogens as the human immunodeficiency virus and the hepatitis C virus, consists of RNA, making RNA a valid target for therapeutic agents.

A. Functional RNA Structures

Interest in RNA has increased since the discovery of catalytic RNA (1). The number of RNA motifs that play a role in gene expression is growing, both in prokaryotes and eukaryotes: RNA structure can exert a regulatory role per se or can be involved in regulation processes through interactions with RNA, DNA, or proteins. The iron-responsive element (IRE), a hairpin structure present in mRNAs coding for key proteins in the iron metabolism of vertebrate cells, is a good example. IREs were first identified in the 5' untranslated region of ferritin mRNAs (2). They are involved in the inhibition of mRNA translation in iron-deprived cells. Five IREs were also found in the 3' untranslated region of the transferrin receptor mRNA, in areas known for mediating differential stability of the message in response to iron level (3). The IRE hairpin is highly conserved through evolution and comprises a six-nucleotide apical loop (5'-CAGUGU) and a 5-bp upper stem followed by a bulged cytosine (3). Both translation inhibition and mRNA degradation are dependent on the binding of iron-regulatory proteins (IRP-1 and IRP-2) to the IRE motif under conditions of iron deprivation (4). When iron is supplied to cells, IRPs are inactivated or degraded, thus allowing simultaneous ferritin synthesis and degradation of the transferrin receptor mRNA. Deletions in either the bulge or the loop region of the IRE severely reduce the affinity (Kd ~ 30 pM) of IRP for IRE (5).

Numerous interesting examples of functional RNAs are also found in viruses. The packaging of viral RNA and the dimerization of retroviral genomes involve particular RNA motifs (6). In these organisms translation is exquisitely modulated by RNA structures. In retroviruses the gag and pol coding regions are out of frame, but nevertheless give rise to a Gag-Pol fusion protein, which is subsequently cleaved by the viral protease to generate the structural proteins and the enzymes required for viral development (7). A small number of ribosomes engaged in the Gag synthesis switch from the 0 frame (the frame of gag) to the -1 reading frame (the frame of pol) at a point prior to the termination codon corresponding to the end of the gag gene. The retroviral frameshifting signal consists of a heptanucleotide located several bases upstream of a hairpin or a pseudoknot whose stems and loops vary both in size and sequence, depending on the virus. Such a structure has been shown to be essential for frameshifting and might constitute a signal to slow down the elongating ribosome on the shifting heptanucleotide. Moreover, the shifting efficiency drives the ratio between the Gag and Pol products, which is crucial for the development of the virus.
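The switch from the 0 frame to the -1 frame described above can be illustrated with a short Python sketch. It is a toy model under stated assumptions: the RNA sequence and the position of the slip are hypothetical and are not taken from any retrovirus, and the function names are introduced here for illustration only. The ribosome is assumed to decode in frame 0 up to and including a chosen codon, then to step back by one nucleotide and continue in the new frame.

```python
# Standard genetic code, built from the conventional UCAG ordering of codons.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(rna, start=0):
    """Translate from `start` until a stop codon or the end of the sequence."""
    protein = []
    for i in range(start, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


def translate_with_minus1_shift(rna, last_frame0_codon):
    """Decode frame 0 up to and including the codon starting at
    `last_frame0_codon`, then slip back one base (-1 frame) and go on."""
    head = "".join(
        CODON_TABLE[rna[i:i + 3]] for i in range(0, last_frame0_codon + 3, 3)
    )
    # Next codon would start at last_frame0_codon + 3; slipping back by one
    # nucleotide makes it start at last_frame0_codon + 2 instead.
    tail = translate(rna, start=last_frame0_codon + 2)
    return head + tail


# Hypothetical toy message: frame 0 meets a stop codon right after the
# U UUU UUA heptanucleotide, whereas the -1 frame stays open.
TOY_RNA = "AUGGCUUUUUUAUAGGGAAUAA"
frame0_product = translate(TOY_RNA)
shifted_product = translate_with_minus1_shift(TOY_RNA, last_frame0_codon=9)
```

In this toy case the unshifted ribosome terminates early, while the shifted one produces a longer fusion product, which mirrors the Gag versus Gag-Pol outcome in a very simplified way.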

Some other viruses (e.g., picornaviruses, hepatitis C virus) make use of a cap-independent mechanism for translation initiation. In contrast to the vast majority of eukaryotic mRNAs for which the 40S ribosomal subunit scans the message from the cap to a proximal AUG, initiation in these viruses occurs at an initiation codon that might be located several hundreds of nucleotides down from the 5' end (8). This 5' region constitutes an internal ribosome entry site (IRES), predicted to form a complex structure that binds to several proteins whose role in internal initiation of translation is not yet clearly identified. Interestingly, IRES-dependent mRNA translation remains functional under stress conditions, when cap-dependent translation is severely impaired. These few examples demonstrate that functional RNA motifs might constitute targets for the artificial control of gene expression. High-affinity ligands that selectively recognize an RNA structure would be of interest both for genetic analysis and for therapeutic prospects.

B. RNA Ligands

Natural compounds specific for RNA were uncovered quite early. Among these, aminoglycoside antibiotics are known to interact primarily with the ribosomal RNA. The 16S RNA component of the bacterial ribosome was the first RNA to be identified as a target for a small molecule (9). The aminoglycoside binds to the prokaryotic A site of rRNA, which differs from the eukaryotic one by essentially a single base exchange (10). The structure of a 27-nucleotide A-site model RNA complexed with paromomycin has been determined by NMR spectroscopy, showing that the antibiotic binds in the RNA major groove in a pocket formed by noncanonical base pairs and a bulged nucleotide (11). Recently, it was found that these drugs are also able to interfere with different RNAs such as self-splicing group I introns, the hammerhead ribozyme, the hairpin ribozyme, the HDV ribozyme as well as the HIV TAR (trans-activating responsive) element (see Refs. 12 and 13 for reviews). RNA aptamers that bind to aminoglycosides with nanomolar-range affinity have also been selected in vitro (14, 15). The structural and physicochemical parameters characterizing these interactions are dominated by the positively charged amino groups of the sugar backbone of aminoglycosides (16).

Nonglycosidic natural compounds are also able to bind to RNA structural elements. Viomycin, a cyclic peptide antibiotic, specifically recognizes RNA pseudoknots, inducing RNA/RNA interactions that result in the inhibition of ribosomal subunit dissociation (17). Thiostrepton, a large thiazole-containing peptide, binds to the GTPase center of the bacterial 23S rRNA, stabilizes a conformation of the RNA, and interferes with a conformational change within the ribosomal L11 protein, thus inhibiting the ribosomal translocation (18).

Synthetic compounds have been designed to target RNA structures. A library of aminoglycoside mimetics based on the neamine backbone (an aminoglycoside) was generated and screened for the interaction with mRNA sequences coding for oncogenic fusion proteins (19). Several derivatives showed nanomolar binding affinities for mRNA, demonstrating that high-affinity small molecules can be targeted to single-chain polyribonucleotides. Moreover, 1,3- and 1,2-aminol compounds were shown to bind to the A site of the prokaryotic 16S rRNA as efficiently as paromomycin (20). Combinatorial approaches have also been used to identify peptoids able to selectively bind to the TAR RNA element of HIV-1. A peptoid compound selected from a library of 3 × 10⁶ compounds specifically inhibits the interaction of TAR with the viral protein Tat, both in the test tube and in cultured cells (21). A further optimization of this compound led to a highly active Tat antagonist inhibiting trans-activation in a cell context with a micromolar IC50 (22). These few examples illustrate the potential of molecules targeted to RNA. Comprehensive views on the design of drugs targeting RNA have recently been published (15, 23).

Alternatively, oligonucleotides can be evaluated as RNA ligands. About 15 years ago, the direct readout of the primary RNA sequence by a complementary nucleic acid provided the rational basis of the so-called antisense strategy (24, 25). However, RNA structures restrict the access of antisense oligonucleotides to the target sequence (26) and therefore weaken or even abolish the antisense effect. Indeed, many antisense users screen a large number of oligonucleotides against a given mRNA in order to find a region available for hybridization (27, 28). Consequently, structured regions corresponding to functional motifs are left out, despite their biological interest. Over the last 10 years, we have developed various strategies for designing oligonucleotides able to take such RNA motifs into account. Beyond the classical antisense strategy, we will describe high-affinity oligomers able to unfold RNA structures. Oligonucleotides giving rise to local triple-stranded structures will also be presented. Finally, we will report recent results obtained in the frame of in vitro selection of either RNA or DNA sequences able to recognize functional RNA elements through tertiary interactions. These oligomers were mostly targeted to RNA structures playing a key role in the development of human pathogens: protozoan parasites (trypanosomes and leishmanias) and retroviruses (HIV-1 and HTLV-1). These ligands constitute tools to unravel the mechanisms in which the target RNA motif is involved and provide a basis for the conception of new therapeutic agents.
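The "direct readout" of an RNA sequence by a complementary oligonucleotide, and the practice of screening many oligonucleotides along one mRNA, can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the oligomers are written as unmodified DNA, and nothing here accounts for RNA structure, which is precisely the limitation discussed in the text.

```python
# Watson-Crick partner (as a DNA base) for each RNA base of the target.
DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna(rna_target):
    """Return the DNA oligomer (written 5'->3') that is complementary to
    an RNA target site (also written 5'->3')."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(rna_target.upper()))


def tile_antisense(rna, length):
    """Yield (start, oligomer) for every window of `length` along the RNA,
    i.e. the naive set of candidates one would screen against a message."""
    for start in range(len(rna) - length + 1):
        yield start, antisense_dna(rna[start:start + length])


# Example with the six-nucleotide IRE apical loop quoted earlier (5'-CAGUGU).
loop_oligo = antisense_dna("CAGUGU")
candidates = list(tile_antisense("CAGUGU", 4))
```

A real screen would add filters (length, composition, predicted target accessibility); the sketch only makes the complementarity rule explicit.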

II. Antisense Oligonucleotides

A. Principles, Mechanisms, and Limitations

The antisense concept, first described by Grineva and co-workers as early as 1967 (29), received experimental support at the end of the 1970s: Synthetic oligonucleotides were demonstrated to block RNA function both in cell-free assays and in cultured cells (24, 25, 30; see Ref. 31 for a review). Interestingly, it was demonstrated a few years later that prokaryotes make use of natural antisense RNA to modulate gene expression (32, 33). For instance, the replication of the plasmid ColE1 is negatively controlled by the short untranslated species RNA I which, upon binding to the complementary neosynthesized RNA II, prevents its hybridization to the template and hence the production of a primer by ribonuclease H (34). The antisense strategy has been widely used for artificially regulating the expression of a predetermined gene, either by antisense RNA produced in situ or by synthetic oligonucleotides delivered to the cell. Numerous genes have been turned off, in vitro and in vivo, allowing the elucidation of gene function, the inhibition of dysfunctional genes, or the control of pathogenic organisms (35-37).

Both the specificity and the efficiency of antisense oligonucleotides have been demonstrated on model systems (25, 38, 39). In particular, studies performed with mRNA microinjected into Xenopus oocytes have provided a clear-cut demonstration of antisense effects (40, 41). This work led to the identification of an unanticipated mechanism of inhibition. Initially, antisense oligonucleotides were envisioned as blockers for the reading of the message by the macromolecular complex responsible for translation of the mRNA. Inhibition of protein synthesis resulted from the competition between the oligonucleotide/RNA hybrid and the ribosome. However, the observation that the RNA was cleaved in the presence of the antisense sequence (42, 43) suggested that the oligonucleotide might induce the degradation of the target RNA by RNase H, a ribonuclease specific for RNA/DNA duplexes. It has been demonstrated that RNase H activity actually mediated antisense effects in wheat germ extracts (41, 44) and to a much lesser extent in rabbit reticulocyte lysate, which might contain a low amount of a different type of RNase H (45). In higher eukaryotes, including human beings, two different classes of RNase H have been characterized, mainly on the basis of their biochemical properties (46-48). In addition, it was recently shown that these two classes display different cleavage patterns of RNA/DNA hybrids (49). RNase H is also likely to mediate most of the antisense effects observed in cultured cells. Fragments of p53, c-myc, and bcr-abl mRNAs were detected in myelogenous leukemia KY01 cells in the presence of antisense oligonucleotides (50). The fragments corresponded to the selective cleavage of the target RNA at the expected binding site with no crossover effect between oligonucleotides. However, the cleavage was observed exclusively with permeabilized cells and no effect was detected upon incubation with intact cells.
The involvement of RNase H in antisense inhibition was indirectly suggested using oligonucleotide analogs that do not elicit RNase H. It was recently reported that a PNA tridecamer (Fig. 1) targeted to the coding region of the Ha-Ras mRNA prevented the elongation of the polypeptide chain through an RNase H-independent mechanism (51); very generally, however, antisense oligomers targeted downstream of the AUG initiator codon do not arrest translation unless they are covalently linked to the RNA (52, 53) or they mediate the degradation of the target RNA. The fact that synthetic oligonucleotides directed against the 3' untranslated region of ICAM-1 mRNA display antisense effect in cultured A549 cells only when provided as analogs that direct RNase H cleavage strongly supports the involvement of this activity in the inhibitory process (54). RNase H was also shown to mediate the inhibition of reverse transcription (55-57).

FIG. 1. Structure of modified nucleosides and internucleotide linkages. Selective binding complementary (SBC) bases are also shown (bottom right). [Structures shown: 2'-O-methyl; 2'-O-methoxyethyl; Locked Nucleic Acid (LNA); N3'→P5' phosphoramidate; morpholino phosphorodiamidate; Peptidic Nucleic Acid (PNA); 2'-aminoethoxy-thymidine; selective binding complementary bases 2-thiothymine and 2-aminoadenine.]

Several limitations drastically reduce the efficiency of antisense oligonucleotides. The degradation of the natural oligomers by nucleases has been circumvented. Numerous nuclease-resistant derivatives have been prepared (58-61). Phosphorothioates are by far the most popular oligonucleotide analogues and, at present, constitute most of the ones engaged in clinical trials (36). Fomivirsen, the only marketed oligonucleotide approved by the U.S. Food and Drug Administration as treatment for cytomegalovirus retinitis, is a phosphorothioate (62). Although part of the clinical activity of this oligomer is likely due to nonantisense effects, this demonstrates that synthetic oligonucleotides represent promising drugs. Modifications of the ribose (2'-O-analogs, locked nucleic acids), the phosphodiester unit (morpholinodiamidates, N3'→P5' phosphoramidates), and the backbone (peptide nucleic acids) led to promising nuclease-resistant compounds (Fig. 1). Some of these, namely those displaying an increased affinity for the target, are presented in Section II,D. Limited uptake by intact cells is also a severe restriction to the use of antisense oligonucleotides. Various possibilities, including conjugation to either polycations or lipophilic compounds and delivery by vehicles such as liposomes, have been used successfully (for reviews see Ref. 63). The availability of the target sequence, which is generally difficult to predict, also presents a significant hurdle. RNAs are single-chain nucleic acids but cannot be considered uniform single-stranded polynucleotides. A large part of these molecules is involved in secondary and tertiary folds and hence unavailable for hybridization with the antisense sequence (26). Despite these problems, successful studies have been reported in many fields, including developmental biology, neurobiology, oncology, and infectious diseases (63, 64). In the next section we will describe the use of antisense oligonucleotides against a protozoan parasite, Leishmania amazonensis, which is responsible for severe diseases in human beings.
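The point that much of an RNA is folded and therefore unavailable for hybridization can be made concrete with a crude accessibility scan. The sketch below assumes that a secondary structure is already available in dot-bracket notation ("." for an unpaired base, "(" and ")" for paired bases) and simply ranks target windows by their fraction of unpaired positions. The structure string and function names are hypothetical, and an unpaired base in a constrained loop is not necessarily accessible, so this is a first filter rather than a predictor.

```python
def unpaired_fraction(structure, start, length):
    """Fraction of positions marked '.' in structure[start:start + length]."""
    window = structure[start:start + length]
    return window.count(".") / length


def most_accessible_window(structure, length):
    """Return (start, unpaired fraction) of the first window of `length`
    nucleotides with the highest share of unpaired bases."""
    if length <= 0 or length > len(structure):
        raise ValueError("window length must be between 1 and len(structure)")
    starts = range(len(structure) - length + 1)
    best = max(starts, key=lambda s: unpaired_fraction(structure, s, length))
    return best, unpaired_fraction(structure, best, length)


# Hypothetical 24-nt fold: a 4-bp hairpin with a 4-nt loop and an 8-nt 3' tail.
TOY_STRUCTURE = "....((((....))))........"
best_start, best_fraction = most_accessible_window(TOY_STRUCTURE, 8)
stem_fraction = unpaired_fraction(TOY_STRUCTURE, 4, 8)
```

Here the single-stranded tail is ranked first, while a window centered on the hairpin scores poorly because half of its positions are base-paired.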

B. The Mini-Exon Sequence of Leishmania as a Target for Antisense Oligonucleotides

In trypanosomatids the expression of genes involves a trans-splicing step between two pre-RNA species (65, 66). The shortest one, bearing a cap structure at its 5' end, is used for the maturation of every transcript. As a consequence a 39-nucleotide sequence, termed a mini-exon or spliced leader, is present at the 5' end of all mRNAs in these organisms. The mini-exon sequence has been extensively used as a target for antisense oligonucleotides both in trypanosomes and in Leishmania (see Refs. 67 and 68 for recent reviews). A 9-mer covalently linked to an acridine derivative complementary to the mini-exon sequence of T. brucei exhibited trypanocidal properties in culture (69). Similarly, a phosphorothioate 16-mer was able to selectively kill the parasite Leishmania amazonensis in infected murine macrophages after a 48-h inhibition: The number of infected cells was reduced by about 40% at 20 µM oligonucleotide (70). This effect was selective and depended on the oligonucleotide sequence: Mismatched or noncomplementary oligomers with the same chemistry displayed only a weak parasiticidal activity, if any. 2'-O-Methylribonucleotides and morpholinophosphorodiamidate derivatives were also active, indicating that the process was not mediated by RNase H (C. Bourget et al., unpublished). Moreover, a phosphorothioate 16-mer complementary to the intron part of the L. amazonensis mini-exon precursor was as active as sequences complementary to the exon part, suggesting that the activity might at least partly originate in the inhibition of trans-splicing (70). The conjugation of the anti-mini-exon phosphorothioate oligomer to a palmitate chain (71) or its entrapment into multilamellar vesicles (Delord et al., unpublished results) led to a significant improvement of the efficiency, likely related to an increased uptake, compared to the parent 16-mer.

A detailed investigation of the hybridization properties of a series of antisense oligomers to mini-exon RNA sequences from various trypanosomatids, performed by thermal elution of filter-bound complexes, revealed abnormal behavior of oligonucleotides complementary to the L. amazonensis sequence: The low stability of oligonucleotide-L. amazonensis mini-exon complexes suggested restricted access to this RNA fragment (72). This was further confirmed by the weak photo-crosslinking efficiency of a psoralen-conjugated anti-mini-exon 12-mer to the full-length mini-exon RNA compared to that observed with the 5' half of this RNA (73). Such a result can be explained by a competition between the intermolecular association (the formation of the oligonucleotide/RNA hybrid) and the intramolecular pairing of complementary regions (secondary structure) no longer present in the shortened RNA. The secondary structure of the mini-exon RNA from L. amazonensis deduced from enzymatic and chemical footprinting studies (74) confirmed that a large part of this region was engaged in double-stranded association, making it poorly available for hybridization. The limited size of the potential target (39 nucleotides), together with the functional interest of this region, prompted us to develop strategies taking into account nonlinear RNA targets for the design of antisense oligonucleotides. As pointed out earlier, this has potential for broad application. The first solution to be considered is the use of brute force: The oligomer is designed to invade the structured RNA, i.e., to shift the equilibrium from the folded to the unfolded form.
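The competition between intramolecular pairing and hybrid formation described above can be written down in a minimal two-state form. The derivation below is a sketch under assumptions that are not stated in the text: the target exists only as a folded form F or an unfolded form U, the oligonucleotide O binds U exclusively, and all symbols (K_fold, K_d,ss, K_d,app) are introduced here for illustration.

```latex
% Assumed two-state model: F <-> U, and O + U <-> OU (O does not bind F).
\begin{align*}
K_{\mathrm{fold}} &= \frac{[F]}{[U]}, \qquad
K_{d,\mathrm{ss}} = \frac{[O][U]}{[OU]} \\[4pt]
K_{d,\mathrm{app}} &= \frac{[O]\,\bigl([U]+[F]\bigr)}{[OU]}
  = K_{d,\mathrm{ss}}\,\bigl(1 + K_{\mathrm{fold}}\bigr) \\[4pt]
\Delta G_{\mathrm{app}} &= RT \ln K_{d,\mathrm{app}}
  = \Delta G_{\mathrm{ss}} + RT \ln\bigl(1 + K_{\mathrm{fold}}\bigr)
  \approx \Delta G_{\mathrm{ss}} - \Delta G_{\mathrm{fold}}
  \quad (K_{\mathrm{fold}} \gg 1),
\end{align*}
% with \Delta G_{\mathrm{ss}} = RT \ln K_{d,\mathrm{ss}} and
% \Delta G_{\mathrm{fold}} = -RT \ln K_{\mathrm{fold}} < 0 for a stable fold.
```

In this simplified picture a stable fold raises the apparent dissociation constant by the factor (1 + K_fold), so "invading" the structure amounts to paying the folding free energy out of the hybridization free energy.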

C. Invading RNA Structures with Conventional RNA or DNA Oligonucleotides In the antisense strategy one is tempted to maximize base pairing of the oligonucleotide with the target. But, as pointed out by Herschlag (75), "more isn't always better." If the energetic cost to open the structure cannot be experimentally determined, indications on how thermodynamically favorable the target is for antisense strategy may be obtained by predicting its stability (76, 77). The nearest-neighbor model (78), which assumes that the stability of a given

10

j.j. TOULMI~ET AL.

base pair depends on the stability of its adjacent base pair, can be used to predict the stability of folded RNA. A more complex task is to predict the stability of the new duplex formed between the antisense oligonucleotide and the target RNA, as at least three components will contribute to the binding free energy of the new complex: new base pairing and stacking interactions as well as a penalty to disrupt the folded target and, eventually, the folded antisense sequence. 1. THERMODYNAMICASPECTS Hairpins are among the most frequently encountered motifs in structured RNAs. Hybridization to all unpaired bases of a hairpin loop seems attractive; however, the resulting duplex could not follow the circular path of the target loop. Furthermore, the energetic cost of entirely disrupting the secondary structure of the hairpin to provide a single-stranded linear site for the antisense candidate may be thermodynamically prohibitive. Pseudoknots and pseudo-halfknots provide a way of binding unpaired bases in hairpin loops (79, 80). A pseudo-halfknot is formed when the antisense ligand hybridizes asymmetrically to the loop of a hairpin (Fig. 2). To evaluate pseudo-halfknotting, Ecker et al. (81, 82) targeted the TAR RNA of HIV-1 with oligonucleotides. Pseudo-halfknots (Fig. 2), characterized by enzymatic and chemical probes, were obtained with two 12-mers and two 17-mers designed to bind to either the 31 or the 5 ~ side of the loop created by disrupting the four base pairs above the bulge of the TAR RNA. One 17-mer designed to bind all the way around this loop did not form a complete pseudo-halfknot structure. This compound showed lower affinity for the target (Kd ~ 500 nM) than the 12-mers (Kd 67 nM) and the other 17-mers (Kd ~ 20 nM). Discrepancies between experimentally determined and calculated binding free energies emphasize the limits of the nearest-neighbor model to predict the stability of a complex with antisense candidates targeted against folded RNAs. Lima et al. 
targeted the 47-mer stem-loop transcript corresponding to residues + 18-64 of the mutant Ha-ras mRNA (83, 84). Antisense 10-mer RNAs were designed to bind to various regions of the hairpin: the stem region, both the stem and the 16-nucleotide loop, or the loop only. Oligonucleotides targeted against the stern region, including one also partially complementary to the loop, showed 105-106-fold lower affinity for the folded (Kd --~ 10 -5 M) than for the complementary single-stranded target (Kd ~ 10 -1° M), consistent with the unfavorable thermodynamic cost to disrupt all the base pairs of the stem. The equilibrium dissociation constant for the complexes with the oligonucleotides targeted to the loop decreased on going from the 31 side to the 5f side of the loop. Binding to the 5 f side of the hairpin loop was as strong as that to the corresponding single-stranded oligonucleotide (Kd ~ 10 -1° M). Because no base pairs have to be opened to hybridize this antisense candidate, the loop structure

MODULATION OF RNA FUNCTION

FIG. 2. Complexes formed by an oligonucleotide and a target hairpin (a): pseudo-halfknot (b), triple-stranded stem (c), double-hairpin complex (d), kissing complex (e). For comparison with the loop-loop (kissing) complex, the structure of a pseudoknot is also shown (f). The RNA and the oligonucleotides are indicated by bold and faint lines, respectively. Lines and stars indicate Watson-Crick and Hoogsteen hydrogen bonding, respectively.

should contribute to the binding constants. A model of the Ha-ras stem-loop fragment suggests that binding an oligonucleotide to the 5' side of the loop could occur without any major changes of the overall tertiary structure of the hairpin. In contrast, hybridization to other sites would require significant structural changes, and hence is thermodynamically unfavorable. Stronger binding of antisense oligonucleotides to the 5' side of a hairpin loop was also observed with tRNAs (85, 86).

Direct evidence of the influence of the structure of the target RNA on the hybridization of antisense compounds, in a cellular context, was given by Vickers et al. (87). The 20-nucleotide target was cloned in front of a luciferase reporter gene in such a way as to be (1) entirely in a hairpin stem, (2) partially on the 5' side or the 3' side of the stem, (3) in a region with no predicted secondary structure, or (4) in a shorter hairpin. Luciferase expression in cells was inhibited by a complementary oligonucleotide targeted against the least structured construct. In contrast, the antisense oligonucleotide had almost no effect when the target site was entirely sequestered in the hairpin stem. In this example the predicted thermodynamics correlated well with the luciferase assay.
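The dissociation constants quoted above for the Ha-ras hairpin can be turned into a free-energy penalty with the standard relation ΔG° = RT ln Kd. The following is a minimal sketch rather than a calculation taken from the cited work: the function names are illustrative, and the temperature of 37°C is an assumption, since the experimental temperature is not stated here.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def delta_g_from_kd(kd_molar, temp_k=310.15):
    """Standard binding free energy (kcal/mol) from Kd, 1 M standard state.

    temp_k defaults to 37 degrees C, which is an assumption of this sketch.
    """
    return R_KCAL * temp_k * math.log(kd_molar)

def structure_penalty(kd_structured, kd_linear, temp_k=310.15):
    """Free-energy cost (kcal/mol) of binding the folded rather than the linear target."""
    return delta_g_from_kd(kd_structured, temp_k) - delta_g_from_kd(kd_linear, temp_k)

# Order-of-magnitude Kd values quoted in the text for stem-targeted 10-mers
ddg = structure_penalty(1e-5, 1e-10)
print(f"penalty for opening the stem: about {ddg:.1f} kcal/mol")
```

Under these assumptions a 10⁵-fold loss of affinity corresponds to roughly 7 kcal/mol, which gives a sense of the energetic cost attributed to disrupting the stem.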


J. J. TOULMÉ ET AL.

Interestingly, the observed effects were not sequence context-dependent. If the target sites were moved from the 5'-UTR to the 3'-UTR of the mRNA, luciferase expression was again more efficiently inhibited with the unstructured construct.

2. KINETIC ASPECTS

Cell-free studies are a first step in identifying antisense candidates targeted against structured RNAs; however, the measured binding free energies may not reflect the situation in a cellular context, where the concentrations may be far from those at equilibrium. Binding kinetics may also affect the antisense efficiency. Association rates for unstructured RNAs range from 10⁶ to 10⁸ M⁻¹ s⁻¹ (88-90). These rates are sequence- and length-independent. In contrast, rates for helix disruption decrease when length increases. Lima et al. (83) reported effects of structure on oligonucleotide/RNA hybridization kinetics for the Ha-ras message. Extremely slow binding (13 M⁻¹ s⁻¹) compared with that of the single-stranded complementary sequence (10⁷ M⁻¹ s⁻¹) was obtained with the oligonucleotide that was partially complementary both to the stem and the loop on the 5' side. In contrast, rates almost equal to those obtained for single-stranded regions were observed with oligonucleotides hybridizing to the 5' side or to the middle of the loop. On the other hand, dissociation rate constants were similar for the hairpin and the single-stranded regions. In fact, the measured equilibrium binding constants correlated well with the association rates. Further evidence that the structure of RNAs may influence binding kinetics was reported by Fedor and Uhlenbeck (91), who studied bimolecular "hammerhead" RNA self-cleaving domains: a large one containing most of the catalytically essential nucleotides and a small one containing the cleavage site. These authors examined the activity of four hammerhead ribozymes that differed only in the intermolecular helices generated by binding to the substrate.
Cleavage rates that may vary more than 70-fold under similar conditions did not result from alternative conformations of the large catalytic RNA molecule; rather, they were likely due to secondary structures that the substrate RNA may form. Do fast binding kinetics measured in vitro correlate with efficient antisense inhibition in vivo? Several studies showed a direct relationship between antisense efficiency and structure of the target RNAs. Hjalt and Wagner (92) analyzed the effect of loop size on the efficiency of antisense RNA control using CopA and CopT, two complementary RNAs involved in the control of plasmid R1 replication. The association rates show a maximum for loop sizes between 5 and 7 nucleotides. Smaller and larger loops bind more slowly. These in vitro experiments correlate reasonably well with the relative plasmid copy numbers. Rittner et al. (93) also showed that in vitro binding kinetics agreed with in vivo antisense activity for artificial antisense RNAs directed against HIV-1 RNA. Antisense sequences obtained by shortening a 150-mer antisense RNA



were directed against exons coding for Tat and Rev proteins. No simple relationship was observed between the antisense length and parameters such as the total free energies, the free energy per length, or even the secondary structures. Antisense oligonucleotides displayed association rates in the range of 10²-10⁴ M⁻¹ s⁻¹ to 50 M⁻¹ s⁻¹, likely resulting from the effect of RNA structures on hybridization kinetics. All these studies emphasize the need for strategies that may help antisense oligomers overcome structures.

Fully sequence-randomized oligonucleotide libraries were used to identify antisense compounds targeted against the hepatitis C virus (HCV) RNA genome (28) and the Ha-ras mRNA hairpin fragment (94). Such combinatorial screening may help to identify good antisense oligonucleotides when the target RNA is large. The benefit of this approach compared with the simple examination of the target followed by the design of a few complementary ligands is not clear when the target is short and its structure is sufficiently well established. As a matter of fact, the best antisense sequence found against the Ha-ras mRNA hairpin had been identified earlier by a rational approach. Patzel and Sczakiel (95) used kinetic in vitro selection to identify antisense RNA sequences against the chloramphenicol acetyltransferase (cat) RNA. The enrichment for antisense species after each selection round was evaluated by measuring association rate constants rather than equilibrium constants. Compared to the SELEX method, the structural diversity within the selected pool is reduced, as the fastest annealing antisense species are the least structured ones. Seen from the target side, the kinetic selection actually restricts targeting to accessible single-stranded RNA regions. The good correlation between fast in vitro hybridization and inhibition of cat gene expression in HeLa cells strongly argues in favor of a kinetic control of the antisense effect in cells.
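The correlation between equilibrium constants and association rates reported for the Ha-ras hairpin follows from the relation Kd = k_off / k_on: when dissociation rates are similar, differences in affinity are carried entirely by the association step. The sketch below illustrates this with the two association rate constants quoted in the kinetic discussion; the dissociation rate constant is a hypothetical placeholder, since no value is given in the text, and it cancels out of the ratio.

```python
def kd_from_rates(k_on, k_off):
    """Equilibrium dissociation constant (M) from k_on (M^-1 s^-1) and k_off (s^-1)."""
    return k_off / k_on

K_OFF = 1e-3  # s^-1, hypothetical value assumed equal for both targets

kd_hairpin = kd_from_rates(13.0, K_OFF)  # stem/loop-targeted oligonucleotide
kd_linear = kd_from_rates(1e7, K_OFF)    # single-stranded complement

fold = kd_hairpin / kd_linear
print(f"affinity loss predicted from kinetics alone: {fold:.2e}-fold")
```

With equal dissociation rates, the ratio of the two association rate constants alone predicts an affinity loss between 10⁵- and 10⁶-fold, the same order of magnitude as the equilibrium measurements.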
A systematic use of computer-assisted methods offers one more way to identify favorable target sites within a highly structured RNA. One example of such an approach, used to assess the accessibility of DNA methyltransferase mRNA to antisense oligonucleotides, was reported by Scherr et al. (96). An algorithm was used to predict secondary structures of a set of windows which are moved along the target by steps of a given length. Positions at which favorable sites occurred, such as large loops, bulges, joints, and free ends, are retained for antisense targeting if present in overlapping sequence stretches and at the lowest energy. Several antisense oligonucleotides identified by this approach showed a significant increase of their efficacy in cells compared with the most effective published antisense. Lehmann et al. (97) used another algorithm to design intracellularly expressed antisense RNAs against HIV-1 gag RNA. Predicted favorable sequences 300 nt long showed antisense effects up to 60-fold stronger than predicted unfavorable ones. Surprisingly, however, this method did not improve the design of shorter oligonucleotides (100 nt long). Chemical modifications of antisense oligonucleotides may also help improve their efficacy.
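The window-scanning idea can be sketched as follows. This is not the published algorithm: the structure predictor is passed in as a parameter (`fold`, any function returning a dot-style string in which "." marks an unpaired base), the `toy_fold` function is a hypothetical stand-in used only to make the example run, and the energy criterion mentioned above is left out.

```python
from collections import Counter

def sliding_windows(seq, window, step):
    """Yield (start, subsequence) for windows moved along seq in fixed steps."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]

def unpaired_runs(structure, min_len):
    """Half-open (start, end) runs of '.' (unpaired) at least min_len long."""
    runs, start = [], None
    for i, ch in enumerate(structure + "|"):  # sentinel closes a trailing run
        if ch == "." and start is None:
            start = i
        elif ch != "." and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs

def recurrent_open_sites(seq, window, step, fold, min_len=4, min_windows=2):
    """Positions that are unpaired in at least min_windows overlapping windows."""
    hits = Counter()
    for start, sub in sliding_windows(seq, window, step):
        for a, b in unpaired_runs(fold(sub), min_len):
            hits.update(range(start + a, start + b))
    return sorted(pos for pos, n in hits.items() if n >= min_windows)

def toy_fold(sub):
    """Hypothetical stand-in for a real predictor: treats every A as unpaired."""
    return "".join("." if base == "A" else "|" for base in sub)

sites = recurrent_open_sites("GGGGAAAAAAGGGG", window=10, step=2, fold=toy_fold)
print(sites)
```

Requiring a site to recur in several overlapping windows is one simple way to express the condition that favorable motifs be present in overlapping sequence stretches; a real application would replace `toy_fold` with an actual secondary-structure prediction and rank sites by predicted energy.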


D. Chemically Modified Invader Oligonucleotides

We focus here on modifications enabling the oligomer to show increased affinity. The disruption of the secondary structure of RNA targets can be achieved by increasing the length of the oligomer or by improving the energetic contribution of each base pair resulting from the hybridization of the modified antisense. 2'-O-Alkylribonucleotides were easily obtained from phosphoramidite chemistry (98). Thermal denaturation experiments on various duplexes showed that 2'-O-methyl-RNA/RNA hybrids are more stable than RNA/RNA and DNA/RNA duplexes (99). The increase in melting temperature (ΔTm) reaches 0.5°C per modified residue for an RNA target strand (100). Attachment of longer aliphatic chains to the 2' oxygen did not improve duplex stability (101). Surprisingly, the 2'-O-methoxyethyl (MOE) substituent (Fig. 1) significantly enhanced RNA binding affinity (ΔTm = +1°C on average per substitution) (101, 102). The higher affinity was accompanied by a significantly enhanced protection against nuclease degradation. The stabilizing effect of 2'-O-alkyl derivatives is related to the structure of the hybrid. The substitution of heteroatoms at the 2' position promotes an RNA-like C3'-endo sugar conformation leading to A-form duplexes. This is consistent with the higher stability of RNA duplexes relative to DNA/RNA hybrids (103). A crystal structure of a fully modified MOE/RNA duplex revealed that the sugars of the MOE strand are indeed locked in the C3'-endo conformation (104). Thus, the 2'-O-alkyl-modified strand is conformationally preorganized for an A-form duplex. Locked nucleic acid (LNA) (Fig. 1), a recently introduced oligonucleotide analog, has a bridge between the 2' and 4' positions of the ribose ring. Increases in Tm of +3 to +8°C per modification toward DNA and RNA strands, respectively, were observed (105).
Thanks to its bicyclic structure, the furanose ring of the LNA monomer is fully locked in a 3'-endo conformation, structurally mimicking the standard RNA strand. A recent development suggests that these oligomers have high in vivo efficiency and very low toxicity (106). However, this new phosphodiester backbone has not yet been evaluated for strand invasion in RNA structures. A different approach was used by Gryaznov and co-workers, who proposed the N3'→P5' phosphoramidate backbone (3'-NP) (Fig. 1) (107). These oligodeoxynucleotide analogs containing 3'-amino instead of 3'-hydroxyl nucleosides form very stable duplexes with both DNA and RNA complementary strands. Increases in melting temperatures relative to the standard phosphodiester reach 2.9-3.5°C per modification. A further improvement of duplex stability was obtained with oligoribonucleotides and 2'-fluoro N3'→P5' phosphoramidates (108). A high-resolution structure of a fully modified 3'-NP duplex revealed an A-RNA conformation (109). The crystal structure also showed hydration of the phosphoramidate DNA relative to unmodified DNA owing to the amino group.
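The per-residue melting-temperature increments quoted in this section can be compared with a rough back-of-the-envelope estimate. The sketch below simply multiplies the quoted increment by the number of modified residues; strict additivity is an assumption of this example and is not claimed in the text, and the reference duplex differs from one chemistry to another, so the numbers are only indicative.

```python
# Delta-Tm per modified residue (deg C), as (low, high) ranges quoted in the text
DELTA_TM_PER_MOD = {
    "2'-O-methyl": (0.5, 0.5),
    "2'-O-MOE": (1.0, 1.0),
    "LNA": (3.0, 8.0),
    "N3'->P5' phosphoramidate": (2.9, 3.5),
}

def estimated_tm_gain(chemistry, n_modified):
    """Range of total Tm gain (deg C), assuming additive per-residue increments."""
    low, high = DELTA_TM_PER_MOD[chemistry]
    return n_modified * low, n_modified * high

for name in DELTA_TM_PER_MOD:
    low, high = estimated_tm_gain(name, 4)
    print(f"{name}: +{low:.1f} to +{high:.1f} deg C for 4 modified residues")
```

Even as a crude estimate, this makes visible why a few LNA or 3'-NP residues may achieve what would otherwise require a much longer 2'-O-alkyl oligomer.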



These compounds represent promising therapeutic agents and have been shown to selectively inhibit in vitro translation (59, 110) and reverse transcription (111). Morpholino oligonucleotides (Fig. 1) are characterized by a morpholino heterocycle replacing the sugar moiety and a nonionic phosphorodiamidate linkage. These analogs exhibit preferential binding for RNA sequences compared to DNA or phosphorothioate oligomers of identical sequence (61). These properties might be related to a preorganized helical backbone, as suggested by a good stacking of adenosines in a dimeric unit (112), and to the uncharged interunit linkage. These oligomers exhibited quite good antisense properties in cell assays (113) and are able to form a stable triple helix in physiological ionic conditions (114). Peptide nucleic acids (PNA) were introduced by Nielsen et al. in 1991 (115). The entire sugar-phosphate backbone is replaced by an N-(2-aminoethyl)glycine polyamide structure (Fig. 1). These uncharged oligomers obey the Watson-Crick base pairing rules. They bind very efficiently to complementary DNA, and even better to complementary RNA (116). In contrast to DNA, they can bind in either the parallel or the antiparallel orientation (the PNA C terminus corresponds to the 3' end and the N terminus to the 5' end of unmodified oligonucleotides) with a preferred antiparallel orientation. Antiparallel PNA/RNA duplexes exhibit an increase in melting temperature of about 1.5°C per base relative to DNA/RNA hybrids. PNA/RNA duplexes are on average 0.2-0.5°C per base more stable than the corresponding PNA/DNA duplexes (117). Detailed structural information for PNA/RNA duplexes was obtained from ¹H-NMR spectroscopy (118). The heteroduplex is an antiparallel right-handed double helix similar to the A form of RNA duplexes. A unique feature of PNA homopyrimidine strands is their binding to double-stranded DNA through strand displacement, leading to (PNA)₂/DNA triple helices (115, 119).
Despite the poor uptake of PNA sequences by live cells, multiple antisense effects have been described for these derivatives (60). Selective-binding complementary oligonucleotides (SBC ODNs) were conceived to allow strand invasion of either double-stranded DNA or RNA (120). They are defined as self-complementary oligonucleotides (or pairs of complementary ODNs) that do not interact with each other and so exist as single-stranded molecules. However, they form stable hybrids with complementary unmodified strands of nucleic acids owing to the formation of 2-thiothymine-2-aminoadenine base pairs (Fig. 1). It has been shown that an SBC oligonucleotide was able to invade the hairpin of the L. amazonensis mini-exon RNA (74). SBC oligomers form more stable hybrids with RNA than do normal base oligonucleotides with either phosphodiester or phosphorothioate backbones. Moreover, the resulting heteroduplexes are substrates for the E. coli RNase H. The in vitro translation of L. amazonensis mRNA was selectively and more efficiently inhibited by an SBC 16-mer complementary to the mini-exon sequence than by


the corresponding oligonucleotide with normal bases. The unique properties of these modified base pairs merit more extensive studies on modified backbones allowing the use of shorter oligomers.

E. Affinity or Specificity: A Dilemma?

One of the major interests of the antisense strategy is the expected specificity of the effect resulting from the hybridization of the antisense oligomer with the complementary RNA site. The potential applications of this approach, both in molecular genetics and in therapeutics, rely on the assumption that the inhibition of gene expression will be restricted to the target RNA and that other RNAs in the cell will remain unaffected. This is primarily related to the length and base composition of both the antisense and target sequences. Assuming a statistical distribution of nucleotides in the genome, the minimal length of an antisense oligomer for ensuring a unique target in a human cellular RNA pool ranges from 11 to 15 bases for sequences containing exclusively Gs and Cs or As and Ts, respectively (37). This is indeed a very rough approximation, and using such oligonucleotides cannot guarantee specific antisense effects. Multiple parameters are responsible for adverse properties. These might result from undesirable binding to various proteins and receptors, some of which are now identified and may eventually contribute to therapeutic benefits (121). These effects will not be discussed here. Oligonucleotides will also bind to nontarget RNA sites, leading to mismatched hybrids with a reduced affinity compared to the actual linear complementary region. The selectivity, which is related to the ratio of saturated target sites to nontarget sites, will depend on the binding constants to target and nontarget sequences and on the concentration of the interacting species (i.e., on the oligomer concentration) as, very generally, the antisense species is in vast excess over the RNAs. The discrimination will depend on the ΔΔG for the formation of the matched compared to the mismatched duplex, i.e., on the nature, the number, and the location of the mismatch(es) (84).
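The statistical argument behind a minimal unique length can be illustrated with a simple estimate: in a pool of N nucleotides in which the relevant bases occur with frequency f, a given n-mer is expected to occur about N × fⁿ times by chance. The sketch below finds the smallest n for which this expectation falls below one. It does not reproduce the calculation of ref. 37; the pool sizes and base frequencies used are hypothetical, and the function names are illustrative.

```python
def expected_random_sites(n, pool_nt, base_freq=0.25):
    """Expected chance occurrences of one n-mer in pool_nt nucleotides."""
    return pool_nt * base_freq ** n

def min_unique_length(pool_nt, base_freq=0.25):
    """Smallest n for which fewer than one chance occurrence is expected."""
    n = 1
    while expected_random_sites(n, pool_nt, base_freq) >= 1.0:
        n += 1
    return n

# Hypothetical pool of 1e8 nucleotides, with rarer or more frequent bases
for freq in (0.20, 0.25, 0.30):
    print(freq, min_unique_length(1e8, freq))
```

Sequences made of rarer bases reach uniqueness at a shorter length than sequences made of more frequent ones, which is the qualitative origin of the composition dependence noted above.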
The consequences of illegitimate binding will greatly depend on the ability of the RNA-oligonucleotide duplex to elicit RNase H activity. Indeed, RNase H can cleave RNA-DNA hybrids as short as five nucleotides; consequently, a mismatched RNA-antisense oligonucleotide complex that is stable under the experimental conditions will constitute a substrate for RNase H (122). The degradation of irrelevant RNAs will, of course, be a source of nonspecific effects (123). The longer the oligonucleotide sequence, the more likely the cleavage of nontarget sites. It is critical to adjust the affinity (the length) of the antisense sequence to achieve maximal discrimination between target and nontarget sites. This is beautifully illustrated by the work of Freier and co-workers (124), who targeted a point-mutated region of Ha-Ras mRNA involved in the activation of the protooncogene. They demonstrated an antisense effect of a 17-mer



phosphorothioate oligonucleotide which reduced the translation, in cultured HeLa cells, of the mutant Ha-Ras mRNA by 50% at 50 nM and showed a much lower effect on the wild-type message. In contrast, under the same conditions a 15-mer had no effect on the translation of either mRNA whereas a 19-mer inhibited to a similar extent the synthesis of both the wild-type and the mutant Ha-Ras protein (124). Increased specificity was observed for chimeric oligonucleotides comprising a stretch of phosphodiester nucleotides flanked on each side by modified sequences that do not elicit RNase H. A 15-mer phosphodiester (ODN1) fully complementary to the rabbit α-globin mRNA and partially complementary to several regions of the rabbit β-globin mRNA induced the cleavage, and consequently the translation inhibition, in wheat germ extract of both messages (125). A chimeric methylphosphonate-phosphodiester-methylphosphonate oligomer (ODN2) with five unmodified nucleotides in the central region exhibited improved selectivity owing to a reduced ability of RNase H to cleave β-globin mRNA-oligonucleotide mismatched duplexes. In contrast, a similar molecule (ODN3) with a window of seven phosphodiesters was a nonselective inhibitor of both α- and β-globin mRNAs, like the parent unmodified ODN1 (125). Stereorandom methylphosphonates exhibit a lower affinity for RNA than phosphodiester oligonucleotides, in contrast to 2'-O-methyl RNA analogs, which are known to bind with a high affinity to RNA (99). Substituting the methylphosphonate flanks of the chimeric antisense ODN2 by 2'-O-methyl residues, which do not allow RNase H cleavage either, led to the loss of the discriminative effect on translation inhibition of globin messages as a result of increased affinity (126); indeed, Tms observed for ODN1, ODN2, and the 2'-O-methyl analog of ODN2 bound to the RNA complementary sequence were 59.5, 44.3, and 72°C, respectively.
Significant differences were also observed for complexes formed by these oligomers and a mismatched RNA fragment corresponding to a nontarget site on the β-globin mRNA. Although the 2'-O-methyl oligomer was far more efficient than the methylphosphonate one, it was of limited interest owing to its inability to discriminate between target and nontarget sites (126). It would be of interest to shorten the 2'-O-methyl antisense sequence, keeping the central region constant, in order to reduce the affinity to a value comparable to that of the methylphosphonate ODN2. One might predict that selectivity would be restored; however, shortening the antisense sequence will increase the probability of finding a fully complementary site on a nontarget mRNA. Indeed, in the above case the two mismatches with the β-globin mRNA were located in the third and last positions from the 5' end. Similar improvement of specificity was obtained with methylphosphonate/phosphodiester antisense chimeras targeted to c-myc or p53 mRNAs (127-129). The above discussion is also relevant when RNA structures are targeted. Increasing the affinity of an antisense oligonucleotide for such regions, either by


lengthening the sequence or by using high-affinity oligomers to compensate for the energetic cost of disrupting the RNA structure, will simultaneously increase the number and/or the stability of mismatched complexes, thereby enhancing nonspecific effects.
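The trade-off discussed in this section can be made explicit with a simple occupancy model. Assuming the oligomer is in large excess over the RNA, the fraction of sites bound is [O] / ([O] + Kd), and selectivity can be taken as the ratio of target to nontarget occupancy. The sketch below uses hypothetical dissociation constants (a matched site at 1 nM and a mismatched site at 100 nM); none of these values comes from the studies cited.

```python
def fraction_bound(oligo_conc, kd):
    """Fraction of RNA sites occupied when the oligomer is in large excess."""
    return oligo_conc / (oligo_conc + kd)

def selectivity(oligo_conc, kd_target, kd_nontarget):
    """Ratio of target occupancy to nontarget occupancy."""
    return fraction_bound(oligo_conc, kd_target) / fraction_bound(oligo_conc, kd_nontarget)

KD_TARGET = 1e-9     # M, hypothetical matched site
KD_NONTARGET = 1e-7  # M, hypothetical mismatched site

for conc in (1e-9, 1e-8, 1e-7, 1e-6):
    s = selectivity(conc, KD_TARGET, KD_NONTARGET)
    print(f"[oligo] = {conc:.0e} M  selectivity = {s:.1f}")
```

In this model the selectivity falls toward 1 as the oligomer concentration rises above both dissociation constants, and the same happens if both constants are lowered together by a higher-affinity chemistry, which is the dilemma stated in the section title.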

F. Reactive Oligonucleotides

An extreme case of high-affinity oligomers is presented by derivatives giving rise to a covalent link between the antisense sequence and the RNA of interest. Different reagents have been tethered to oligonucleotides: alkylating agents, photosensitizers, metal complexes, and so on. Knorre and co-workers developed alkylating oligonucleotides carrying a 2-chloroethylamino group (130). Such a reactive oligomer complementary to rabbit β-globin mRNA was shown to selectively prevent in vitro translation in cell-free extracts. An alkylating 16-mer complementary to the mini-exon sequence present at the 5' end of every message of Trypanosoma brucei prevented the growth of the parasite in vitro (131). However, the effect was nonspecific and likely resulted from the reaction of the alkylating group on a nontarget site. Photosensitizers are of particular interest as the reaction can be triggered externally upon irradiation. Psoralens, which also display intercalating properties leading to a significant stabilization of the RNA-oligonucleotide hybrid, have been widely used (52, 132). Oligonucleotides complementary to the mini-exon sequence of Leishmania amazonensis and covalently linked to either 5-methoxy or trimethyl psoralen selectively prevented the in vitro translation of mRNA in rabbit reticulocyte lysate upon UV irradiation (73). No effect was observed with the parent nonconjugated oligomers. No benefit of the presence of the photosensitizer was reported when translation was carried out in wheat germ extract. This differential behavior was ascribed to the contribution of RNase H, which is low (if any) in reticulocyte lysate and high in wheat germ extract. This underscores the potential of reactive antisense oligonucleotides, which constitute an alternative for permanently inactivating the target RNA by oligomers that do not elicit RNase H.
The development of herpes virus in cultured cells was specifically prevented by a methylphosphonate dodecamer carrying a trimethyl psoralen derivative (133). The oligomer was targeted to the acceptor splice site of the immediate early mRNA 4, indicating that the use of antisense oligonucleotides could be extended to any step of RNA expression and that nuclear targets can be considered (134, 135). Oligonucleotides have also been derivatized with metal complexes to generate reactive antisense sequences. These include EDTA-iron, phenanthroline-copper, and porphyrin-iron chelates (136-138). Such conjugates have been used to cleave single-stranded nucleic acids and to map RNA structures. Platinum derivatives conjugated to oligonucleotides in order to crosslink the target strand were first used to arrest DNA synthesis, either by E. coli DNA



polymerase I or by AMV reverse transcriptase (139). Recently, Leng and co-workers described a very clever method to generate selective oligonucleotide-RNA crosslinks through the rearrangement of an intramolecular trans-diamminedichloroplatinum (trans-Pt) diadduct borne by two G residues of the antisense oligomer (140). The crosslinking reaction, which is rather slow in the case of DNA-DNA duplexes, is complete in a few minutes when, first, a 2'-O-methyl oligoribonucleotide is used and, second, an extra mismatched nucleotide is added on the oligomer side at the level of the adduct. Such platinated oligonucleotides have been shown to inhibit cDNA synthesis by AMV reverse transcriptase and to prevent in vitro translation of Ha-Ras mRNA (141). This strategy has also been used to perturb the HIV-1 gag-pol frameshifting signal. A 2'-O-methyl 19-mer modified with trans-Pt was targeted to the slippery region upstream of the hairpin involved in ribosome frameshift. A strong and specific inhibition of in vitro translation of the luciferase gene, placed under the control of the HIV-1 gag-pol frameshifting signal, was observed when the mRNA was preincubated with the platinated oligomer (53). The inhibition of luciferase was lower but still observed when the oligomer was added directly to the translation mixture (without preincubation), indicating that the crosslinking reaction could compete with the translating ribosomes. This effect was selective, as the platinated oligomer did not significantly reduce the translation of a luciferase mRNA devoid of the target site. Moreover, the nonplatinated 2'-O-methyl oligomer had no effect on luciferase synthesis. This platinated oligomer was also active in cultured cells: transfecting both the mRNA and the oligonucleotide in NIH 3T3 or Vero cells resulted in a reduction of the luciferase activity.
Interestingly, inhibition was still observed when the oligomer was transfected 1 h prior to the mRNA, indicating that the oligomer-mRNA crosslink did not need to be preformed and that the platinum adduct rearrangement could take place in the cell (53). Indeed, inhibition of luciferase activity was also detected when the platinated oligomer was cotransfected with the DNA construct encoding the luciferase gene under the HIV-1 gag-pol frameshifting signal. This effect was specific, as no effect was observed when a similar experiment was performed with a DNA plasmid containing the luciferase gene downstream of the HTLV-1 gag-pro slippery sequence.

III. Triple-Stranded Complexes

A. The Alphabet and the Chemistry

Triple-helical nucleic acid structures were first identified in 1957 (142). Felsenfeld described the formation of a specific complex between two polyU strands and one polyA strand in the presence of divalent cations. It resulted


from specific Hoogsteen or reversed Hoogsteen-type hydrogen-bonding interactions (143) between bases in a homopurine strand of a Watson-Crick duplex and an additional oligonucleotide third strand (designated here as the Hoogsteen strand). Triple-helix formation obeys structural features that do not accommodate every double-stranded sequence. At least two structural motifs have been characterized, which differ in the sequence composition of the third strand. In the "pyrimidine motif" or PyPu*Py triplexes (the Hoogsteen strand is written in the last position and the asterisk indicates Hoogsteen interactions), the pyrimidine third strand is aligned parallel to the purine strand of the Watson-Crick duplex. Sequence specificity derives from thymine recognition of adenine-thymine base pairs (TA*T triplet) and protonated cytosine (C+) recognition of guanine-cytosine base pairs (CG*C+ triplet). In the "purine motif" or PyPu*Pu triplexes, a purine Hoogsteen strand is antiparallel to the purine Watson-Crick strand, leading to CG*G and TA*A (or possibly TA*T) base triple combinations. The interaction results from specific reverse Hoogsteen hydrogen bonding (see Refs. 144 and 145 for reviews). Triplex formation requires the occurrence of homopurine stretches in double-stranded regions; in addition, the pyrimidine motif requires the protonation of Cs in the Hoogsteen strand, leading to a pH dependence of such complexes. The thermodynamic stability of triple-stranded complexes has been extensively studied both for the PyPu*Py and the PyPu*Pu families. The data obtained from the analysis of UV melting curves (146, 147), affinity cleavage titrations (148), or calorimetric measurements (149) clearly show that the free-energy change associated with Hoogsteen strand binding is lower than that associated with Watson-Crick base pair formation.
Thus, the calorimetrically determined enthalpy changes (expressed per mole of triplet) for three short intermolecular triplexes of parallel or antiparallel families range from 4.5 to 2.4 kcal/mol. The corresponding enthalpy of formation of the targeted duplex was 7.1 kcal/mol (per base pair) (150). The same order of stability was observed in terms of free energy of formation. The above-mentioned data were obtained on DNA homopolymeric species. The nature of the oligonucleotide backbone is an important determinant of triple-helix stability. Crothers and co-workers used hairpin duplexes with 5' Pu and 3' Py moieties synthesized with various backbone combinations (151): all DNA (DD), all RNA (RR), and a DNA Pu strand linked to an RNA Py strand (RD) or the reverse situation (DR). The addition of an all-DNA (D) or all-RNA (R) Py third strand allowed eight possible combinations: DD*D, DD*R, RR*D, RR*R, DR*D, DR*R, RD*D, and RD*R. Two complexes were not experimentally observed: those combining a Pu RNA strand and a DNA third strand (RR*D, DR*D). The most stable combinations corresponded to an RNA Py Hoogsteen



strand for reading a double-stranded target: an RNA third strand gives rise to RR*R, which is the most stable triple-stranded combination (151, 152). However, 2'-O-methyl third strands failed to form triplexes as stable as those formed by RNA strands when targeted at duplex RNA (153). The same combinations of triple-helical complexes were studied by Han and Dervan (154) using affinity cleavage titrations. The target was a 35-bp duplex containing an 18-bp site for triple-helix formation. Similar conclusions were reached; in particular, the RR*D and DR*D combinations were not observed. A pyrimidine RNA stretch can give rise to an RD*D triplex when complexed with DNA Watson-Crick and Hoogsteen strands. Interestingly, a Py RNA strand allows the formation of stable triplexes with either hairpin or circular homopurine oligonucleotides leading to PyPu*Pu complexes (155, 156). No comprehensive study of such complexes has yet been reported, but highly stable triple-stranded structures of this type have been identified (157). The above results underscore an additional limitation for targeting a local double-stranded RNA stretch through triple-helix formation: a synthetic oligodeoxypyrimidine antisense would give rise to a low-stability triplex with a double-stranded RNA region. Nevertheless, a synthetic oligopyrimidine designed to form a triplex with the 12-bp stem of the RNA hairpin responsible for the gag-pol frameshift signal on the HIV-1 mRNA (which contains a pyrimidine-rich 5' strand and consequently a purine-rich 3' strand) was shown to elicit complex formation by electrophoretic mobility shift assay (158). This complex was stable at neutral pH but blocked neither the reverse transcription nor the cell-free translation of an RNA template in which the gag-pol frameshifting hairpin was inserted. Ribosomal frameshifting on the gag-pro message of the human T-lymphotropic virus I (HTLV-I) is dictated by a hairpin structure with a 10-bp stem.
The stem sequence (10 purines on the 5' strand and 10 pyrimidines on the 3' strand) is appropriate for triple-helix formation. Whereas the "complementary" decapyrimidine 5'-C₇TC₂ does not bind, the decapurine 5'-G₂AG₇ bandshifted the target RNA hairpin on a nondenaturing polyacrylamide gel (R. Le Tinévez et al., unpublished). The binding was sequence-dependent, as neither the oligomer in which T was substituted for A nor the one with the inverted polarity bound. The G₂AG₇ oligomer reduced the in vitro translation efficiency of a luciferase gene placed downstream of the HTLV-I gag-pro frameshifting hairpin, either in the 0 or in the -1 frame. The oligomers that did not bind to the target RNA in the bandshift assay had no effect on in vitro translation. Moreover, G₂AG₇ had no effect on the luciferase synthesis from a control mRNA without the hairpin structure. Therefore, the observed inhibition was due to the selective binding of the oligomer to the hairpin stem, likely through the formation of a triple-stranded Py(RNA)Pu(RNA)*Pu(DNA) structure.
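The recognition rules of the two motifs described in this section lend themselves to a compact statement in code. The sketch below writes the third strand, 5' to 3', for a homopurine Watson-Crick strand given 5' to 3': parallel with T reading A and C reading G for the pyrimidine motif, antiparallel with A reading A and G reading G for the purine motif. The function name and the example sequence are hypothetical, the TA*T alternative of the purine motif is ignored, and no claim is made about the stability of the resulting complex.

```python
def third_strand(purine_strand, motif):
    """Third-strand sequence (5'->3') for a homopurine strand written 5'->3'."""
    purine_strand = purine_strand.upper()
    if set(purine_strand) - {"A", "G"}:
        raise ValueError("target strand must be homopurine (A and G only)")
    if motif == "pyrimidine":
        # Parallel to the purine strand: T reads A (TA*T), C+ reads G (CG*C+)
        return "".join("T" if base == "A" else "C" for base in purine_strand)
    if motif == "purine":
        # Antiparallel: A reads A (TA*A), G reads G (CG*G); reverse to write 5'->3'
        return purine_strand[::-1]
    raise ValueError("motif must be 'pyrimidine' or 'purine'")

target = "GGAGA"  # hypothetical homopurine stretch
print(third_strand(target, "pyrimidine"))
print(third_strand(target, "purine"))
```

The polarity reversal in the purine motif is the reason an oligomer of inverted polarity is a natural negative control in binding assays of this kind.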


J. J. TOULMÉ ET AL.

B. Modified Nucleotides and Intercalating Agents for Triple Helices

Well-known intercalating agents of double-stranded DNA have been checked for interaction with triple helices. Ethidium bromide was shown to intercalate in a triplex (159). More recently, a few other drugs, namely acridine (160), naphthylquinoline (161), and amidoanthraquinone derivatives (162), were shown to significantly stabilize triple helices. Thus, an acridine derivative covalently linked to the 5' end of a homopyrimidine oligodeoxynucleotide was demonstrated to intercalate at a triplex-duplex junction and to stabilize the complex (163). Benzo[e]pyridoindole was the first drug capable of preferential binding to and stabilization of a triplex rather than a duplex (164). Very efficient stabilization of a triple-stranded structure can be achieved by conjugating this intercalating agent to a pyrimidine third strand, at either a terminal or an internal position (165). More interestingly, the DNA intercalators berenil and DAPI induce the formation of DR*D, a triple helix that would not form otherwise (166).

Nucleic acid bases were also modified or designed in order to stabilize triple helices or to eliminate the pH dependence of PyPu*Py triplexes. Thus, 5-methylcytosine expands the pH range compatible with triplex formation by one pH unit up to near neutrality (167). A pyrimidine third strand where Ts were substituted by 5-(1-propynyl)-2'-deoxyuridine (pdU) and Cs by 5-methyl-2'-deoxycytidine showed improved binding characteristics at neutral pH and low magnesium concentration. An increase in Tm of 30°C was observed for the pdU- versus the T-containing oligomer when complexed with the cognate duplex (168). It has been suggested that pdU may stabilize triplex formation through increased stacking interactions. A similar rationale led to the synthesis of extended aromatic heterocyclic bases capable of improving stacking interactions within the third strand. 
Although a specificity for triplex formation was observed upon substitution of T by quinazoline-2,4-dione- or benzo[g]quinazoline-2,4-dione-containing oligomers, no stabilization of triple-stranded complexes was observed (169, 170). Benzoquinazoline derivatives display strong fluorescence emission, which was recently used to probe nucleic acid and protein interactions (171). A 2'-O-methyl sequence binds with a higher affinity to single-stranded RNA than a DNA sequence, but failed to improve triplex stability (see Section III,A). However, the C5 methylation of the pyrimidine bases on an RNA backbone adds significant stability to such complexes (172). An interesting contribution to triplex stabilization was brought about by 2'-aminoethoxy substitution (Fig. 1); in this case, Tm increased by 3.5°C per modified residue when introduced in a Py third strand (173), owing to interstrand contacts between the 2'-amino group and the phosphate backbone of the purine strand.

In conclusion, only limited data are available on triplex-forming oligonucleotides targeted at an RNA double strand. RNA oligomers seem to be the most efficient ligands, but a main limitation is always linked to the availability of a homopurine stretch on the targeted structure.
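The homopurine-stretch requirement noted above can be checked on any candidate target with a simple sequence scan. The following is a minimal sketch rather than a method taken from the cited studies: the function name `homopurine_runs`, the default minimum length of 10, and the example sequence are hypothetical choices made for illustration only.

```python
import re

def homopurine_runs(seq, min_len=10):
    """Return (start, end, run) for each stretch of at least `min_len`
    consecutive purines (A or G) in `seq`. Indices are 0-based, end-exclusive."""
    pattern = re.compile(r"[AG]{%d,}" % min_len)
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq.upper())]

# Hypothetical RNA strand (not taken from the text) with one 12-nt purine run.
example = "CUCUAGGAGGGAAGGACUUC"
for start, end, run in homopurine_runs(example, min_len=10):
    print(start, end, run)
```

A scan of this kind only locates candidate sites; it says nothing about whether the stretch is accessible or whether the resulting triplex would be stable.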

C. Clamp and Circular Oligonucleotides

Linear or circular oligonucleotides have been used to bind single-stranded targets. A purine stretch can interact twice with a sequence carrying both a Watson-Crick and a Hoogsteen domain, thus generating a bimolecular triple helix. This generally results in a fully cooperative association of the Watson-Crick and Hoogsteen domains to the target in an entropically favorable process. The simplest way to link two such domains is to connect them with nonpairing nucleotides (or nonnucleotide linkers), leading to "clamp" ligands (174). A further improvement can be achieved by adding an extra linkage between the two domains, leading to circular oligonucleotides. Strategies have been designed to target homopurine or homopyrimidine strands in either DNA or RNA chemistry. Thus, Watson-Crick or Hoogsteen hydrogen bonding can be used to read the target strand. Typical strategies are presented in Fig. 3. Such cyclic ligands greatly improved the binding properties for single-stranded DNA targets. They bind with a Tm nearly 20°C higher than that of the Watson-Crick complement (175), with a free energy nearly 7 kcal/mol more favorable. Interestingly, such a ligand also exhibits a higher sequence selectivity (176). As previously mentioned, however, for linear or hairpin targets a purine RNA stretch does not give rise to a triple-stranded structure when targeted by a circular DNA (155). Circular

FIG. 3. Schematic representation of triplex-forming circular oligonucleotides. The bold lines stand for the purine strand. Circles and stars indicate Watson-Crick and Hoogsteen hydrogen bonding, respectively.


RNA oligonucleotides also resulted in triplex formation (153). The targeting of a pyrimidine RNA strand by a circular homopurine oligomer was mentioned in a preceding section. Bannwarth (177), using "capped duplex DNA" (aDNA), demonstrated that a pyrimidine RNA strand is more tightly bound than the corresponding homopyrimidine DNA sequence. These aDNA molecules consist of Watson-Crick paired strands linked at their ends by two hexaethylene glycol spacers. They are able to engage in Hoogsteen interactions with the single-stranded Py target, allowing the formation of PyPu*Py triplexes. In a similar approach, Kandimalla used a hairpin to target a pyrimidine strand by the formation of Watson-Crick hydrogen bonds (178). The hairpin consists of two parallel homopurine-homopyrimidine strands attached through 3'-3' or 5'-5' linkages, which are engaged in Hoogsteen base pairing. Both DNA and RNA targets were able to form stable PyPu*Py triplexes.
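To give a sense of scale for the free energy gain of nearly 7 kcal/mol quoted above for cyclic ligands (175), the standard relation between a free energy difference and a ratio of equilibrium constants, K1/K2 = exp(ΔΔG/RT), can be evaluated numerically. This is a minimal sketch under stated assumptions: the temperature of 25°C is assumed here because none is given in the text, and the function name `affinity_ratio` is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def affinity_ratio(ddg_kcal_per_mol, temp_kelvin=298.15):
    """Ratio of association constants corresponding to a free energy
    advantage of `ddg_kcal_per_mol` (positive = more favorable binding).
    The default temperature (25 degrees C) is an assumption, not a value from the text."""
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temp_kelvin))

# Order-of-magnitude estimate for a 7 kcal/mol advantage at the assumed temperature.
print(f"{affinity_ratio(7.0):.2e}")
```

Under these assumptions, a 7 kcal/mol advantage corresponds to an association constant roughly five orders of magnitude larger; the exact figure depends on the temperature at which the measurements were made.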

D. Double Hairpin

One particular application of triple-helix formation can be considered when a hairpin structure is targeted by an oligonucleotide able to interact simultaneously with a single-stranded part at the bottom of the stem and with the double-stranded stem itself (see Ref. 179 for a review). This approach is restricted to particular hairpins owing to the strong sequence constraints for triple-helix formation (180, 181). The concept was validated using a model DNA hairpin with a 5' strand comprising 16 purines, 10 of which are engaged in the 10-bp stem of the hairpin. A 26-mer DNA oligopyrimidine was designed to bind through Watson-Crick base pairing to the single-stranded purine region, then fold back to form a PyPu*Py triplex with both this short duplex and the stem (Fig. 2d). Hybridization of this oligomer to its target hairpin was actually demonstrated by electrophoretic mobility shift assay and UV absorption-monitored melting experiments, at pH 6.0 and in the presence of 10 mM Mg2+ (i.e., under conditions that promote triple-helix formation) but not at pH 7.3, indicating that the complex likely involved CG*C+ triplets (182). This was confirmed by the increased stability of complexes formed by a 5-methylcytosine-containing 26-mer, a modification known to stabilize triple helices (167). Footprinting experiments actually confirmed what was described as double-hairpin complexes (180). Such complexes prevented the cleavage of an RsaI site next to the end of the triplex (182). The inhibition of the restriction enzyme depended on the formation of the triple-stranded structure, as no effect of the oligomer was detected on a DNA hairpin with a mutated stem incapable of supporting third-strand binding. Moreover, when the binding sequences corresponding to the Watson-Crick and the Hoogsteen parts were supplied individually, no inhibition was seen. 
Finally, it was demonstrated that the formation of the double-hairpin complex was accompanied by a structural reorganization of the target hairpin (183). Conceivably, then, some antisense effects could similarly result from oligonucleotide-induced structural changes.



The binding of an oligodeoxynucleotide to an RNA hairpin structure according to the above strategy would lead to the formation of DR*D and RR*D triple-stranded portions, which (as discussed in Section III,A) are much less stable than DD*D helices (151, 152, 154). Nevertheless, the 2'-O-methyl analog of the oligopyrimidine used in the aforementioned study (180) gave rise to a bandshift on a nondenaturing polyacrylamide gel. Moreover, this oligomer induced the selective inhibition of in vitro translation of a luciferase gene inserted downstream of the target hairpin, but whether the complex adopts a double-hairpin structure is not known (R. Le Tinévez et al., unpublished results).

The mini-exon sequence of the protozoan parasite Leishmania amazonensis folds into a hairpin structure (see Section II,B). The formation of a double-hairpin complex on this RNA target (in addition to the formation of DR*D and DD*R triple-stranded structures predicted to show low stability) requires that mixed purine/pyrimidine sequences be accommodated, further reducing stability. Nevertheless, a 29-mer oligodeoxynucleotide was tailored to allow double-hairpin formation. Its sequence was chosen to generate the least disturbing "mismatched" triplets AU*G and GC*T (184, 185). This oligomer could form 10 Watson-Crick pairs and 15 triplets, the two parts being connected by four thymine residues. Under conditions promoting triple-helix formation (pH 6.0, 10 mM Mg2+), the 29-mer did bind to either the DNA or the RNA version of the mini-exon sequence (186). Chemical and enzymatic footprints of both complexes were in agreement with a double-hairpin complex at acidic pH, whereas at neutral pH the interaction was restricted to the formation of a 10-bp duplex between the 5' end of the oligonucleotide and the single-stranded part of the mini-exon sequence. This suggests that the double-hairpin complex was stabilized by protonated cytosine and therefore involved a local triple-stranded structure (186). 
A similar approach was recently used to target the polypyrimidine tract in the internal ribosome entry site of the hepatitis C virus. Oligonucleotides able to generate double-hairpin complexes through the formation of PyPu*Py triple helices did actually bind to the target RNA (K. Aupeix-Scheidler, unpublished) at acidic pH. However, neither the anti-mini-exon nor the anti-IRES oligonucleotides reduced in vitro translation of RNAs comprising the target sequences, likely owing to the reduced affinity of these oligomers at neutral pH.
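The pH dependence discussed in this section follows from the need to protonate third-strand cytosines in CG*C+ triplets. The Henderson-Hasselbalch relation gives the protonated fraction as 1/(1 + 10^(pH - pKa)). The sketch below is purely illustrative: the pKa value of 5.5 is a placeholder chosen for the example, not a measured value from the cited work, and the function name `fraction_protonated` is hypothetical.

```python
def fraction_protonated(pH, pKa):
    """Fraction of a titratable base in its protonated form at a given pH
    (Henderson-Hasselbalch relation for a single independent site)."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))

ASSUMED_PKA = 5.5  # placeholder value for illustration only

# The two pH values used in the experiments described above.
for pH in (6.0, 7.3):
    print(pH, round(fraction_protonated(pH, ASSUMED_PKA), 3))
```

Whatever the true pKa, the ratio between the two protonated fractions is set mainly by the 1.3-unit pH difference. This is consistent with complexes being detected at pH 6.0 but not at pH 7.3. A real triplex contains several cytosines whose protonation is coupled to binding, which this single-site model ignores.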

IV. Oligonucleotides Identified through Combinatorial Approaches

High-affinity and selective ligands of RNA structures can be extracted through combinatorial strategies, which offer the potential to screen large numbers of compounds and to identify individual molecules with the desired properties. As an extension of their work on tethered oligonucleotide probes (TOPs), Schepartz and co-workers (187) devised a single-step selection method to identify ligands specific to the Rev response element (RRE) of HIV-1, an RNA structure constituting a binding site for the Rev protein involved in the regulation of splicing and transport of mRNA from the nucleus. A TOP molecule recognizes two noncontiguous sequences through the binding of two short oligonucleotides linked to each other by a nonnucleotide tether. The TOP library consisted of 4096 different molecules, i.e., a randomized 6-mer linked to an 8-mer complementary to a single-stranded region of RRE. The identification of high-affinity TOP molecules interacting simultaneously through the 8-mer and the 6-mer moieties was made by an RNase H assay (187).

Oligomers able to recognize an RNA structure can be identified through a scanning method allowing the determination of the best residue at every position of the sequence by iterative synthesis and selection. Four sublibraries of oligomers of defined length were synthesized; in each library, a known residue (A, T, G, or C) was introduced at a fixed position while all others were randomized. The best sublibrary was determined by affinity measurement and the corresponding nucleotide was introduced at this fixed position in the next four sublibraries, which were synthesized according to the same principle for the identification of a second residue. N steps allowed the determination of the oligomer sequence able to bind to the RNA structure. Using this method, Ecker et al. (188) identified a 2'-O-methyl ribooligonucleotide able to bind to an RNA hairpin from the activated Ha-ras mRNA.
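The two library strategies above can be illustrated numerically. A randomized 6-mer over four bases gives 4^6 = 4096 molecules, which matches the size of the TOP library. The scanning method needs only 4 sublibraries per position, that is 4N in total for an N-mer. The sketch below assumes a toy setup: the `score` callback stands in for the affinity measurement of a sublibrary, and the hidden "best" sequence and the additive scoring rule are hypothetical, chosen only to make the example run.

```python
ALPHABET = "ATGC"

def library_size(n_random):
    """Number of distinct sequences in a fully randomized n-mer."""
    return len(ALPHABET) ** n_random

def positional_scan(length, score):
    """Fix the best residue one position at a time.
    `score(pattern)` is a stand-in for the measured affinity of a sublibrary;
    'N' marks positions that are still randomized."""
    pattern = ["N"] * length
    for pos in range(length):
        candidates = [pattern[:pos] + [base] + pattern[pos + 1:] for base in ALPHABET]
        pattern = max(candidates, key=lambda p: score("".join(p)))
    return "".join(pattern)

# Toy scoring: one point per fixed residue matching a hidden sequence (hypothetical).
HIDDEN = "GATTACA"
toy_score = lambda p: sum(a == b for a, b in zip(p, HIDDEN))

print(library_size(6))                           # size of a randomized 6-mer library
print(positional_scan(len(HIDDEN), toy_score))   # sequence recovered in 7 steps
print(4 * len(HIDDEN), library_size(len(HIDDEN)))  # sublibraries made vs. full library
```

The toy score is additive by construction, so the position-by-position search is guaranteed to succeed. With real affinity data the contributions of neighboring residues need not be independent, and the outcome can depend on the order in which positions are fixed.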

A. SELEX and Aptamers

The methods described above are restricted to the screening of rather small libraries and short sequences. The procedure known as SELEX (Systematic Evolution of Ligands by Exponential enrichment) allows the identification of DNA or RNA sequences up to several hundred nucleotides in length in libraries containing 10^14-10^15 different sequences (189, 190). The methodology is conceptually straightforward. A large library of random DNA sequences generated by chemical synthesis is converted, if required, into an RNA library. It is then subjected to selection with respect to a given criterion, resulting in the enrichment of the molecules of interest. Each candidate in the library contains fixed sequences flanking the random region. As only a very limited fraction of the starting pool displays the expected properties, the candidates selected at the first step are amplified by PCR and subjected to further rounds. Successive selection and amplification cycles result in an exponential increase in the number of oligomers of interest. The winners, the so-called aptamers, are cloned, sequenced, and analyzed individually.

In vitro selection can be applied to many different targets: small ligands such as amino acids or nucleotides, proteins (191), or even intact viruses (192) and live cells (193). Aptamers displaying dissociation constants in the nanomolar range have frequently been identified. These ligands show a very high selectivity, which can be further improved if a counterselection step is introduced in the process. For instance, an RNA aptamer able to bind to theophylline with a Kd of about 0.1 µM has a 10,000-fold lower affinity for caffeine, a purine derivative that differs from theophylline by a methyl residue on the N(7) position (194). High selectivity is achieved by elementary interactions between sites of the target and groups of the aptamer molecule, which are nicely positioned by three-dimensional scaffolds such as hairpins, pseudoknots, or G tetrads. Aptamers are very promising tools, rivaling antibodies for diagnostic purposes (195, 196) and having potential therapeutic applications (197-199).

Very few studies have been devoted to in vitro selection of RNA or DNA sequences against nucleic acids, except for artificial ribozymes (see Refs. 200, 201 for recent reviews on ribozymes). In vitro selection provides a way to investigate nucleic acid/nucleic acid interactions beyond Watson-Crick base pairing. In an effort to extend the repertoire of sequences able to recognize duplex DNA, Schultz and co-workers (202) used in vitro selection to identify RNA sequences able to bind a 16-bp homopurine-homopyrimidine DNA sequence. Starting from a library of 10^10-10^12 different sequences with a 50-nucleotide random region, they analyzed, after five cycles of affinity chromatography performed at pH 5.5, 17 aptamers that specifically bound the target DNA. Fourteen of these clones contained stretches of pyrimidines able to form the canonical TA*T and CG*C+ triplets as well as a few nonstandard triplets at various positions. Two other selection experiments, carried out independently with an RNA library (203) and with a DNA library at neutral pH (204), also led to the isolation of sequences able to bind duplex DNA through known Py and Pu alphabets. 
Although these studies did not extend our knowledge of the formation of triple-stranded structures, they demonstrate that in vitro selection can indeed be applied to the identification of nucleic acid ligands. Mishra and co-workers (205) used in vitro selection to identify oligodeoxynucleotides able to bind to the DNA hairpin used for double-hairpin complexes (see Section III,D). The candidates were composed of a fixed motif, which served as an anchor complementary to the single-stranded part at the bottom of the stem, and a random region 16 nucleotides long. After four rounds, three sequences able to bind to the target DNA hairpin were characterized (205). Footprinting studies indicated that the aptamer-hairpin complex looked like a double-hairpin complex, even though the region corresponding to the third strand of the complex did not obey the rules for triplex formation, either via Hoogsteen or reverse Hoogsteen hydrogen bonding. In addition, the complex was stable at neutral pH and prevented the cleavage of the target stem by a restriction endonuclease at pH 7.5, in contrast to the rationally designed oligopyrimidine giving rise to a regular local triple-stranded structure (206). A 2'-O-methyl version of one of the selected aptamers was able to bind to the RNA homolog of the target hairpin. Moreover, this 2'-O-methyl-derived aptamer specifically prevented the in vitro translation of an mRNA into which the target hairpin was inserted upstream of the initiator AUG of a reporter gene (207). A similar study performed against a DNA hairpin corresponding to the template for in vitro transcription of the TAR element of HIV-1 generated a class of ligands sharing a consensus sequence complementary to a weakly structured region (208). In this case, however, the binding was chemistry-dependent: the phosphorothioate and the 2'-O-methyl ribooligomers derived from a selected sequence exhibited a much weaker affinity than the parent molecule.
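The exponential enrichment that gives SELEX its name can be illustrated with a deliberately simple two-population model. In each round, binders and nonbinders are retained with different probabilities, and PCR amplification then restores the pool size without changing its composition. All numerical values below (starting fraction, retention probabilities, number of rounds) are hypothetical, and the function name `selex_fractions` is an assumption made for this sketch; real pools contain a continuum of affinities.

```python
def selex_fractions(f0, retain_binder, retain_background, rounds):
    """Fraction of binders in the pool after each selection/amplification round.
    f0: initial fraction of binders; retain_*: probability that a binder or a
    nonbinder survives the partition step. Amplification is assumed unbiased."""
    fractions = [f0]
    f = f0
    for _ in range(rounds):
        kept_binders = f * retain_binder
        kept_background = (1.0 - f) * retain_background
        f = kept_binders / (kept_binders + kept_background)
        fractions.append(f)
    return fractions

# Hypothetical numbers: 1 binder in 10^10, 100-fold enrichment per round.
for rnd, frac in enumerate(selex_fractions(1e-10, 0.5, 0.005, 8)):
    print(rnd, f"{frac:.3e}")
```

In this model the binder-to-nonbinder odds are multiplied by the same factor at every round, so a rare species can dominate the pool after a handful of cycles. This is in line with the four to fifteen rounds reported for the selections described in this chapter.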

B. Aptamers to the TAR RNA of HIV-1

Using SELEX against a structured RNA region might allow (1) detection of single-stranded areas, (2) discovery of invading oligonucleotides with optimal binding properties, and (3) identification of sequences able to engage in non-Watson-Crick contacts, eventually leading to tertiary interactions. Conditions could be adjusted to mimic physiological ones; in particular, the concentration of divalent cations, which is crucial for RNA-RNA interactions, could be modulated. Selection pressure (temperature, concentration, ionic conditions, competitor) could be chosen to achieve high affinity and selectivity.

In vitro selection was devised to generate DNA and RNA ligands against the trans-activation responsive region (TAR) of HIV-1, a regulatory RNA element located downstream of the initiation site for transcription. The 59-nucleotide TAR element folds into a very stable hairpin with a three-nucleotide U-rich bulge near the apex of the TAR stem (209). This bulge region is critical for binding Tat, a viral protein crucial for the transcription of the viral genome. In the absence of Tat, initiation of RNA synthesis is efficient, but RNA polymerase disengages from the template rapidly and transcription aborts. Moreover, Tat is able to form a ternary complex with cyclin T1, a component of the Tat-associated kinase; the formation of this ternary complex requires a functional apical loop sequence of TAR in addition to the three-nucleotide bulge (209, 210). Therefore, ligands interacting strongly with the upper part of the TAR hairpin would disrupt this ternary complex, leading to abortive RNA synthesis. Indeed, a number of anti-TAR molecules have been designed, including peptoids (21), groove binders (211), and antisense oligonucleotides (111).

A DNA library comprising about 10^12 different sequences was screened for ability to bind to the TAR RNA element. 
Candidates with a 30-nucleotide random region were incubated at low temperature (4°C) in the presence of 10 mM Mg2+. Sequences displaying affinity for 3'-biotinylated TAR RNA were captured by magnetic streptavidin beads. After 15 selection rounds, the analysis of more than 70 clones revealed a class of sequences containing the octamer 5'ACTCCCAT (212). These sequences could be folded in imperfect hairpins, with the consensus octamer located in the apical loop. The best aptamer (D04) was able to bind to TAR RNA with a Kd value of about 20 nM under selection conditions. The central part of the octamer motif is complementary to the TAR loop, suggesting loop-loop interactions (Fig. 4). Footprinting studies and analysis of several mutants, either of TAR or of the aptamer, confirmed that the formation of the TAR RNA-D04 complex is driven by the TAR loop and the upper part of the aptamer. The top of the aptamer stem plays an important role in the binding process. The base pairs next to the loop are exclusively weak A-T or G-T pairs; bulges and internal loops are frequently observed (212). An attempt to restore a perfect double-stranded stem by deleting or inserting a single base in the upper part of the D04 stem led to a 25-40-fold increase in Kd, suggesting that a weak double-stranded structure in the vicinity of the binding loop was required for increased adaptability of the DNA aptamer.

The formation of the D04-TAR complex was tightly controlled by ionic interactions. Decreasing the magnesium concentration from 10 mM (corresponding to the selection conditions) to 3 mM prevented the binding of the aptamer D04 to the TAR RNA. Indeed, electrostatic repulsion between phosphate groups has been shown to be critical for RNA-RNA loop-loop (kissing) complexes (213, 214). Selection of DNA aptamers against TAR at a lower magnesium concentration led to sequences with an extended consensus that were still able to give rise to kissing complexes at 3 mM Mg2+ (D. Sekkai et al., unpublished). The selected sequence on each side of the aptamer loop likely allowed a particular conformation that minimized the phosphate-phosphate repulsion. RNA aptamers were also selected against the TAR RNA motif. 
Screening a pool of about 10^11 oligoribonucleotides, 98 nt long and containing a random stretch of 60 nt, led to the identification of hairpin aptamers with an octameric consensus 5'GUCCCAGA allowing loop-loop interactions with the TAR RNA (215). In contrast to the DNA aptamer described above, the aptamer and TAR loops are fully complementary and can give rise to six base pairs (Fig. 4). Moreover, whereas two DNA residues and one RNA residue connected the stems and the loop-loop helix of the D04-TAR complex, a single phosphodiester ribose unit allowed the crossing of the helix grooves in the RNA aptamer-TAR complex. In addition, whereas the top part of the DNA aptamer showed a flexible structure, RNA aptamers generally displayed a G-C-rich double-stranded upper stem. This likely reflects structural differences between RNA-RNA and RNA-DNA helices and, more particularly, the width of the major groove of B- and A-type helices. Indeed, neither the RNA version of D04 nor the DNA version of R06, the best RNA aptamer, was a good ligand of the TAR RNA (212, 215). In contrast, both the 2'-O-methyl and the N3'→P5' phosphoramidate derivatives of the RNA aptamer were able to bind to the TAR RNA with an affinity similar to that of the parent molecule (F. Darfeuille, C. Di Primo, Gryaznov, and J. J. Toulmé, in preparation). Both analogs are known to form A-type double helices.


FIG. 4. Hairpins with the potential to generate kissing aptamers. The anti-TAR RNA and DNA aptamers (R06₂₄ and D04₂₀) and the rationally designed TAR* hairpin are indicated at the top. MiniTAR, a truncated version of TAR, has been used for kissing complex studies (see text). RNA Ii and RNA IIi, corresponding to the inverted loops of the sense-antisense RNA complex involved in the regulation of the ColE1 plasmid copy number, as well as the DIS elements of the Lai and Mal strains of HIV-1, are presented in the lower part.



Therefore, structural (chemical) constraints drive the optimal recognition of the target and the formation of the complex. It should be pointed out that the 8-mer consensus oligoribonucleotide alone binds to TAR RNA with a binding constant at least 2 orders of magnitude lower than that of the selected aptamer, even though both this oligomer and the RNA aptamer can form 6 base pairs with the RNA target. This underscores the role of secondary interactions for aptamer-RNA complexes and demonstrates the utility of in vitro selection for identifying high-affinity ligands of RNA structures. Interestingly, in vitro selection of RNA sequences against the yeast phenylalanine tRNA provided hairpin aptamers with a loop perfectly matching the entire 7-nt anticodon loop of tRNA (216). In addition, the anti-tRNA aptamer loop was closed by a G-A pair.

Kissing aptamers are not the only answers provided by in vitro selection for selective recognition of hairpin structures. In the course of in vitro selection of anti-TAR DNA oligomers at low magnesium concentration, candidates displaying two noncontiguous short antisense sequences were obtained (D. Sekkai et al., unpublished). These aptamers, which have not yet been characterized, bind selectively to TAR with an affinity rivaling that of kissing aptamers.
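The practical meaning of the Kd values quoted in this section can be made concrete with the simple 1:1 binding isotherm, in which the fraction of target bound equals [L]/([L] + Kd). The sketch below assumes the ligand is in large excess over a trace amount of target, so that the free and total ligand concentrations are equal. The function name `fraction_bound` and the choice of a 20 nM ligand concentration are illustrative assumptions.

```python
def fraction_bound(ligand_conc, kd):
    """Fraction of target in complex for 1:1 binding, with ligand in excess.
    Both arguments must be in the same concentration units."""
    return ligand_conc / (ligand_conc + kd)

KD_D04 = 20e-9   # about 20 nM, as reported for D04 under selection conditions
LIGAND = 20e-9   # illustrative aptamer concentration (assumption)

print(fraction_bound(LIGAND, KD_D04))        # parent aptamer
print(fraction_bound(LIGAND, 25 * KD_D04))   # 25-fold increase in Kd
print(fraction_bound(LIGAND, 40 * KD_D04))   # 40-fold increase in Kd
print(fraction_bound(LIGAND, 100 * KD_D04))  # 2 orders of magnitude weaker
```

At a ligand concentration equal to the parent Kd, half of the target is bound, whereas the stem variants with a 25-40-fold higher Kd would occupy only a few percent of it under the same conditions.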

C. Loop-Loop (Kissing) Complexes

The functional interest of these complexes strongly encouraged investigations to determine their three-dimensional structure as well as to analyze the thermodynamic and kinetic properties of loop-loop interactions. This may contribute to our understanding of the structure/function relationship.

1. BIOLOGICAL FUNCTION OF RNA-RNA KISSING COMPLEXES

Loop-loop interactions were demonstrated to mediate several processes both in prokaryotes and in eukaryotes. The initial step in the regulation of the copy number of plasmids ColE1 and R1 is the formation of a transient RNA-RNA kissing complex (217, 218). The dimerization of retroviral genomes involves the intermolecular interaction of two palindromic RNA loops (219, 220). A cis-acting RNA element involved in the initiation of the (-) strand synthesis of enteroviral genomes acquires a functional higher order structure through kissing interactions (221).

Natural antisense RNAs exert regulatory functions in prokaryotes by binding to their complementary sequences (222). The regulation of the replication of the E. coli plasmid ColE1 involves the interaction of two complementary RNAs. RNA II is the precursor of the primer RNA, which forms a persistent hybrid with the template DNA; this hybrid is recognized by RNase H, yielding a truncated RNA piece that primes DNA synthesis by DNA polymerase I. The formation of the RNA II-DNA hybrid is negatively controlled by an antisense RNA, RNA I, which is complementary to RNA II (34). Both RNA I and RNA II form three stem-loop structures. The interaction of RNA I and RNA II begins at complementary loops and then zips up, through a series of steps, to generate a double-stranded RNA along the entire length of RNA I (223). In addition, the replication of the plasmid is also controlled by a 63-amino acid plasmid-encoded protein, Rop (or Rom), which stabilizes the three hairpin pairs formed between RNA I and RNA II (224, 225). The Rop protein dimer does not show sequence specificity, but rather recognizes RNA structure (226).

Initiation frequency of plasmid R1 replication is also controlled by an antisense RNA, CopA, which inhibits RepA protein synthesis by binding to CopT, the leader region of the repA mRNA (227). Both CopA and CopT show extensive imperfect stem-loop structures. The initial rate-limiting step involves the formation of a transient complex between the complementary loops of CopA and CopT. This kissing complex is rapidly converted into an extended complex, which requires partial melting of the upper stem regions, most likely facilitated by the presence of bulged residues (228). Helix propagation then results in a four-way junction structure (229).

Retroviral genomes are composed of two identical RNA molecules noncovalently assembled by their 5' region. Dimerization is thought to play an essential role in the encapsidation of the retroviral RNA and reverse transcription. The dimerization initiation site (DIS) adopts a stem-loop structure (219, 220, 230) capable of dimerizing in the absence of any protein, although the retrovirally encoded nucleocapsid protein helps in maturing the dimer structure (231). Two variants of the DIS stem-loop are found in HIV-1 natural isolates, represented by the Mal and Lai strains (Fig. 4). Both loops, which contain an autocomplementary sequence (5'GUGCAC and 5'GCGCGC, respectively), are closed by a noncanonical A-A pair (232). The double-stranded stem is interrupted by a purine bulge that seems to regulate the two-step dimerization process, leading to the formation of an extended duplex (231). 
Mutations in the kissing loop of HIV-1 RNA that affect the dimerization reaction reduce viral infectivity and viral RNA packaging (233). As a key driving step for viral infection, the DIS constitutes a target of interest for controlling HIV development. Antisense and sense oligonucleotides have been used to interfere with the dimerization process, contributing to our understanding of the natural mechanism while allowing the development of new antiviral agents (220, 234, 235). Several ribo- or deoxyribooligomers efficiently inhibited dimerization. However, major differences were observed between the Mal and Lai strains, depending on the DIS loop sequence, the oligonucleotide chemistry, and the temperature (235).

2. STRUCTURE OF KISSING COMPLEXES

The solution structure of a loop-loop complex derived from the kissing complex formed between RNA I and RNA II from the ColE1 plasmid was determined by NMR spectroscopy (214). These model hairpins were modified at the end of the stems to avoid the formation of the extended RNA duplex that follows stem-loop interaction; moreover, the loop sequences were inverted 5' to 3', as the resulting stem-loop complex RNA Ii-RNA IIi (Fig. 4) is 350-fold more stable than the wild type (225). The structure shows that all seven bases of the complementary loops are involved in base pairing. The loop-loop helix is bent toward the major groove, allowing a quasi-continuous stacking from one stem helix to the other through the loop-loop helix: the loop residues are 3'-stacked on their respective stem helices. This bend minimizes the distance for the G14 and the A13 phosphodiester bonds of RNA Ii and RNA IIi, respectively, to bridge across the major groove. Other features include purine stacking interactions at the helix junction. The purine residue at the 3' end of the loop-loop helix of one RNA stacks on a purine residue on the 5' side of the other RNA. Strong cross-strand stacking interactions between guanine bases in the stem helices adjacent to the loops are also observed. Gregorian and Crothers have shown that these stacking interactions are crucial for the stability of the RNA Ii-RNA IIi kissing complex (236). Although the loop-loop complex resembles an A-form RNA, the authors hypothesized that structural differences, such as the phosphate clusters flanking the major groove of the loop-loop helix and the collapse of this major groove, may explain why the Rop protein binds only to the loop-loop complex and not to the corresponding extended duplex. These phosphate clusters were also proposed to provide a binding site for magnesium ions required for stable loop-loop interactions (223, 236).

The DIS three-dimensional structures of two HIV-1 isolates, HIV-1 Mal and HIV-1 Lai (Fig. 4), have been investigated by NMR spectroscopy. Dardel et al. (237) presented preliminary data for a kissing complex formed between two 19-nt hairpins derived from the DIS of HIV-1 Mal. 
As observed in the RNA Ii-RNA IIi kissing complex, there is a continuous stacking from one stem, through the loop-loop helix formed by base pairing of the palindromic hexanucleotide sequence, to the other stem helix. The loop palindromes, 5'GUGCAC3', are fully base-paired. Although in these experiments (performed at millimolar concentrations) the loop-loop interaction is not affected by the addition of magnesium ions, dimerization at lower concentrations actually requires these divalent cations (235). This result is consistent with experiments using chemical probes and molecular dynamics simulations (238), which suggest that one magnesium ion could bind at the center of the pocket in the sharp turn that each loop makes, thereby preventing electrostatic repulsion between the phosphates. The NMR data give direct evidence that the conserved and crucial A residues that close the loop face each other and indeed form a noncanonical AA pair, as proposed earlier (232) and recently confirmed (238). Further features include pairs of intermolecular base triples at the loop-loop interface between two residues on one strand and one on the other. The G7 residue flanking the self-complementary sequence stacks on the C11 residue within the hexanucleotide sequence and on the G10 residue within the same sequence on the other strand.


J. J. TOULMÉ ET AL.

Mujeeb et al. (239) solved the three-dimensional structure of the DIS of HIV-1 Lai. The hairpin was modified in the stem to favor the formation of the kissing complex over the extended duplex. The NMR spectra were collected in the absence of magnesium ions. As observed with HIV-1 Mal, the loop palindromes, 5'GCGCGC3', are fully base-paired. Each stem forms a canonical A-type helix. Surprisingly, the first GC pair in the stem next to the loop is not observed in any structure. This result contrasts with those described above for RNA Ii-RNA IIi, where all stem residues are base-paired. One other major difference is that there is no continuous stacking from one stem through the loop-loop helix to the other stem. Instead, the kissing complex is marked by abrupt discontinuities of stacking, by several unpaired junctional bases, and by an underwound, strained loop helix. The noncanonical AA pair observed in HIV-1 Mal is not seen either. One possible explanation is that the structure was solved in the absence of divalent cations. Even if dimerization of HIV-1 Lai DIS does not require magnesium ions, such cations may actually favor a more structured hairpin with all stem residues engaged in Watson-Crick interactions. The overall conformation of the kissing complex might then be closer to the one reported for the other RNA-RNA loop-loop duplexes. Alternatively, the GC-rich content of the loop might induce some peculiar conformation that can be accommodated only at the expense of the first pair in the stem. Base triples are also observed in this DIS. The A8 residue of the loop on one strand crosses the interface to stack between the A9 residue of the loop and the C17 residue next to the loop on the other strand. These base triples were shown to be crucial for dimerization (230). Other base triples were proposed (238) but were not present in this NMR model.
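The palindromic character of the two DIS loop hexanucleotides quoted above can be checked directly: a self-complementary RNA sequence is identical to its own reverse complement, which is what allows two copies of the same loop to pair with each other. The short Python sketch below is an illustration only. The function names are hypothetical, and only canonical Watson-Crick pairs are considered (G-U wobble pairs and the noncanonical AA pair are deliberately ignored).

```python
# Watson-Crick complements for RNA (assumption: canonical pairs only).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def is_self_complementary(rna: str) -> bool:
    """True if the sequence can pair with a second copy of itself."""
    return reverse_complement(rna) == rna.upper()


# Loop hexanucleotides quoted in the text for the two HIV-1 isolates.
DIS_LOOPS = {"HIV-1 Mal": "GUGCAC", "HIV-1 Lai": "GCGCGC"}

for isolate, loop in DIS_LOOPS.items():
    print(isolate, loop, is_self_complementary(loop))
```

Both hexanucleotides pass this test, whereas a single substitution such as GUGCAA does not, in line with the sensitivity of dimerization to the loop sequence.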
The complex formed between TAR and the rationally designed TAR* RNA is another example of loop-loop interaction between two hairpin RNAs whose structure has been solved (213). TAR* is a hairpin with a six-nucleotide loop complementary to the TAR loop (Fig. 4). Although the loop-loop helix and the stem helices are one and two base pairs shorter, respectively, than those of the RNA Ii-RNA IIi kissing complex, the two kissing complexes share most of their structural features. The loop-loop helix is bent toward the major groove, which is compressed, and three pairs of phosphates are brought into close proximity, providing a potential binding site for magnesium ions. The TAR-TAR* complex is also recognized by the Rop protein. Comolli et al. (240) used NMR spectroscopy to map the interface between Rop and the TAR-TAR* kissing complex. A model was proposed where the interaction would occur between the concave face of the protein formed by two helices and the minor groove of the loop-loop helix, in agreement with previous mutagenesis (226) and ribonuclease cleavage (225) data for the ColE1 kissing complex. This study strongly suggests that local distortions in the loop-loop helix are the key factor for its specific binding to the Rop protein over extended duplexes.



The structural studies published thus far concern kissing complexes formed between two RNA hairpins. Recently, however, Collin et al. (241) presented the first structural analysis of an RNA-DNA kissing complex between an 18-nt RNA hairpin derived from the trans-activation responsive element of HIV and a 20-nt DNA derived from D04z0 (Fig. 4), an aptamer selected against TAR (212) (see Section IV,B). This high-affinity DNA hairpin presents in its loop a 5-nt sequence, 5'TCCCA3', complementary to the TAR loop sequence. This hybrid complex displays a quasi-continuous stacking from one stem to the other through the loop-loop helix, as observed for the TAR-TAR* and RNA Ii-RNA IIi complexes as well as the DIS of HIV-1 Mal. The differences lie in the base pairs of the loop-loop helix and the linkers that connect it with the stem helices. Although the TAR-TAR* and RNA Ii-RNA IIi loops are fully complementary, forming six and seven base pairs, respectively, the TAR and DNA aptamer loops are only partially complementary, as also observed in the DIS kissing complexes. Only five bases of the aptamer loop are complementary to the 6-nt TAR loop. In the TAR-TAR* and RNA Ii-RNA IIi complexes phosphodiester bonds bridge across the major groove to connect the central loop-loop helix and the stems, whereas in the RNA-DNA complex the linkers consist of one RNA and two DNA residues. Bases used as linkers were also described in the DIS of HIV-1. The absence of recognition of the RNA-DNA hybrid by the Rop protein (C. Cazenave, unpublished results), which is specific for RNA-RNA kissing complexes, clearly indicates that the conformation of the TAR-DNA aptamer complex is different. Likewise, the RNA-DNA loop-loop helix is not recognized by E. coli RNase H (212), for which RNA-DNA duplexes constitute a substrate (242). Distortions of the helix and/or steric hindrance at the junctions between the stems and the loop-loop helix might prevent the enzyme from binding or cleaving.
Alternatively, the RNA-DNA central helix might not be long enough for efficient cleavage by RNase H (243). As in the DIS Mal complex of HIV-1, in which the loops are closed by a noncanonical AA pair, a TT pair closes the loop of the DNA aptamer. Deletion of one of these Ts, or its replacement by an A residue, drastically decreased the stability of the complex (212).

3. THERMODYNAMIC AND KINETIC ANALYSIS OF KISSING COMPLEXES

The structural studies just presented clearly indicate that at least three structural determinants are crucial for stable loop-loop interactions: loop complementarity, the bases that close the loop, and the hairpin stem. Eguchi and Tomizawa (225) documented the kinetics of loop-loop interactions between the two RNA transcripts, RNA I and RNA II, of ColE1 plasmid with various sequences in the loop and the stem. They showed that the stability of the kissing complex is highly dependent on the loop sequence, even if the base composition is the same. Inverting the loop sequence 5' to 3' of both RNAs compared to the wild-type sequence increased the association and decreased the dissociation rate



constants. As a result, the affinity increased 350-fold. This clearly shows that the stability is determined not only by the Watson-Crick interactions between the complementary loops but also by other interactions such as the stacking interactions revealed by NMR studies. Similarly, they showed that the base composition of the stem is crucial for stable interactions, also consistent with the stacking interactions observed between the stems and the loop-loop helix. Gregorian and Crothers (236) used thermal denaturation monitored by UV-absorption spectroscopy to analyze the stability of loop-loop complexes mutated in the pair that closes the loop and at the beginning of the stem next to the loop-loop helix. They confirmed that stable loop-loop complexes require more than just loop complementarity. Replacement of the first G-C pair next to the loop in the wild-type kissing complex by a C-G pair decreased Tm by 20°C; but if this pair was changed to another purine-pyrimidine pair, A-U, Tm decreased by only 9°C. These determinants also affected the stability of a kissing complex formed between the imperfect TAR RNA and the RNA hairpin aptamer R06 (215). Mutations that decreased loop complementarity decreased the stability of the complex. Mutations of the aptamer loop-closing pair induced similar effects. Removal of the aptamer stem drastically decreased its affinity for TAR RNA, demonstrating that high affinity was obtained only when the interacting region was presented in the context of a hairpin structure. This suggests that the stem organizes the aptamer loop for appropriate interaction with TAR RNA, consistent with the observed quasi-continuous stacking from one stem through the loop-loop helix to the other stem. More recently, a thermodynamic and kinetic study was reported on the role that the aptamer loop-closing GA pair may play in stable loop-loop interactions at a physiological concentration of magnesium ions (3 mM) (244).
This role was investigated by systematically mutating the loop-closing pair of the aptamer. The effects of these mutations on the stability of the TAR RNA-aptamer complexes were analyzed by thermal denaturation monitored by UV-absorption spectroscopy. We also addressed the question of whether the differential behavior between the rationally designed TAR* RNA and the aptamer originates in the number of associated magnesium ions known to be required for stable loop-loop interactions. Tm values for complexes between mutated aptamers and mini-TAR (Fig. 4) show that, regardless of the mutation, the complexes were less stable than the one formed with the wild-type aptamer (Fig. 5). Moreover, a stable aptamer stem did not lead to a stable complex with the targeted RNA (see GC aptamer variant). The complex formed with TAR* was clearly less stable than the one formed with the aptamer identified by in vitro selection (ΔTm = -17°C). The stabilizing role of the loop-closing GA pair was further emphasized with two



[Figure 5 data. Rows are listed in the order in which they appear in the figure; the drawings of the loop-closing pair variants that label each row are not reproduced.]

kon (10^4 M^-1 s^-1)   koff (10^-3 s^-1)   Kd (nM)     Tm (°C)
6.3 ± 0.6              1.1 ± 0.1           17 ± 3      47.3 ± 0.3
6.8 ± 0.9              1.5 ± 0.05          22 ± 2      42.8 ± 0.4
7.9 ± 0.4              4.1 ± 0.1           52 ± 2      37.0 ± 0.0
8.5 ± 1.4              9.1 ± 0.2           107 ± 14    32.9 ± 0.1
6.3 ± 0.3              13 ± 7              2~ ± 1      29.9 ± 0.6
ND                     ND                  ND          21.0 ± 0.0
ND                     ND                  ND          16.8 ± 1.1

ND, not determined.

FIG. 5. Effect of the pair closing the RNA aptamer loop on the rate constants, equilibrium constants, and Tm values for aptamer R06-TAR complexes. Experiments were carried out as described in Ref. 244. Briefly, kinetic measurements were performed in a 20 mM HEPES buffer, pH 7.3, containing 140 mM potassium acetate, 20 mM sodium acetate, and 3 mM magnesium acetate, at 23°C. The same buffer was used for melting experiments except that cacodylate was substituted for HEPES. In this latter case, the oligonucleotide concentration was adjusted to 1 µM (each partner).

UA aptamer variants, equivalent to TAR* in terms of loop complementarity and closing pair but with stems formed of either eight or four base pairs. Tm values for the resulting complexes were about 30°C regardless of the intrinsic stability of the aptamer (244). Analysis of the rate constants determined by surface plasmon resonance demonstrated that the equilibrium binding constant was controlled by the off rate and that the GA pair is crucial to prevent fast dissociation of the TAR RNA-aptamer complex (Fig. 5). The number of magnesium ions taken up when TAR binds to the aptamer and to TAR* was shown to be 1.7 and 1.4, respectively, consistent with the number proposed for other kissing complexes (213, 236). This suggests that the phosphate cluster observed in these complexes, which may provide a binding site for Mg2+, may exist for the TAR RNA-aptamer complex, too. This similar number also indicates that the lower stability of the rationally designed ligand TAR* (Kd = 93 nM) relative to the aptamer (Kd = 2 nM) originates in the closing GA pair rather than in a difference in the number of magnesium ions that bind. This difference in stability is illustrated in the binding kinetics. At 3 mM Mg2+, the rate constants for TAR binding to TAR* could not be determined. The magnesium concentration had to be increased to 10 mM to allow the measurements. Nair et al. (245) successfully increased the stability of the TAR-TAR* complex 8-fold with a TAR RNA containing a 2-thiouridine modification in its loop.
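The statement that the equilibrium constant is controlled by the off rate can be illustrated with the relation Kd = koff/kon applied to the first four rows of Fig. 5. The Python sketch below is a hedged illustration, not the analysis performed in Ref. 244. It assumes that kon is tabulated in units of 10^4 M^-1 s^-1 and koff in units of 10^-3 s^-1, which is the reading that reproduces the reported Kd values in nM. It also converts a ratio of affinities into a free-energy difference with ΔΔG = RT ln(ratio), taking 23°C (the temperature quoted for the kinetic measurements) as an assumed temperature; the function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def kd_nM(k_on_1e4: float, k_off_1e3: float) -> float:
    """Kd in nM from k_on (units of 1e4 M^-1 s^-1) and k_off (units of 1e-3 s^-1)."""
    k_on = k_on_1e4 * 1e4       # M^-1 s^-1
    k_off = k_off_1e3 * 1e-3    # s^-1
    return k_off / k_on * 1e9   # M converted to nM


def ddg_kcal(fold: float, temp_c: float = 23.0) -> float:
    """Free-energy difference (kcal/mol) for a fold change in affinity.

    The default temperature of 23 degrees C is an assumption.
    """
    return R_KCAL * (temp_c + 273.15) * math.log(fold)


# (k_on, k_off, reported Kd in nM) for the first four rows of Fig. 5.
FIG5 = [(6.3, 1.1, 17.0), (6.8, 1.5, 22.0), (7.9, 4.1, 52.0), (8.5, 9.1, 107.0)]

for k_on, k_off, reported in FIG5:
    print(f"kon={k_on}, koff={k_off}: Kd={kd_nM(k_on, k_off):.1f} nM (reported {reported:.0f})")

# Fold changes quoted in the text: 350-fold for the inverted ColE1 loops,
# and 93 nM versus 2 nM for TAR* versus the aptamer.
print(f"350-fold   -> {ddg_kcal(350):.2f} kcal/mol")
print(f"93/2-fold  -> {ddg_kcal(93 / 2):.2f} kcal/mol")
```

Across these four rows kon varies by less than 1.4-fold whereas koff varies by about 8-fold, so the change in Kd tracks the dissociation rate, as stated above.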



V. Conclusion

Functional RNA structures, i.e., motifs that play a key role in various biological processes, constitute valid targets for the development of genetic tools or therapeutic agents. Many different classes of molecules have been investigated with respect to their binding properties to folded RNA regions. Combinatorial strategies shed some light on the key parameters driving the interactions between small molecules and RNA structures (15, 198). Recently, the structures of three aptamer-antibiotic complexes have been solved by NMR (246, 247). These structural investigations demonstrated that the RNA sequences selected against tobramycin and neomycin B adopt common architectural features, the antibiotic molecule being embedded in the folded RNA, which engages in multiple hydrogen bonds and electrostatic interactions with the amino groups of the aminoglycosides. Such information might help in designing new antibiotics with increased selectivity. High-affinity complexes between RNA structures and small molecules might also provide a means for the selective control of gene expression, as demonstrated by Werstuck and Green (248). An RNA aptamer recognizing a small molecule was inserted in the 5' untranslated region of an mRNA. The addition of the small ligand to cultured Chinese hamster ovary cells transfected with the aptamer construct selectively prevented the expression of the message. Oligonucleotides are a promising class of RNA ligands. Multiple reports have described the potential of antisense sequences for selectively modulating the expression of a target gene, either in vitro or in vivo. This constitutes the premise of derived strategies adapted to the specific recognition of RNA structures. The use of high-affinity derivatives allows the invasion of folded RNA regions.
However, this does not solve the problem generated by the self-organization of oligonucleotides targeted to RNA structures, as partial complementarity between two regions of the antisense sequence will lead to oligomers inappropriate for binding to their target RNA site. Selective binding complementary (SBC) bases (Fig. 1) represent an interesting solution to this problem (74, 120). This approach, currently restricted to targeting AT-rich regions, requires the design of G and C analogs to allow any secondary structure to be invaded. In addition, to take full advantage of this strategy it would be worthwhile to introduce SBC bases and a modified backbone simultaneously, giving rise to high-affinity oligonucleotides. Nevertheless, one should keep in mind that increasing the affinity of an antisense oligomer for its target in order to compensate for the energetic cost of disrupting the structure in which the target is engaged will inevitably increase the probability of nonspecific effects. Recent in vitro selections performed with randomly synthesized RNA or DNA libraries have demonstrated the capacity of this strategy to uncover sequences able to "read" double-stranded or folded nucleic acids. In particular, experiments carried out in three different laboratories, including ours, have

MODULATION OF RNA FUNCTION


identified aptamers targeted to stem-loop structures which have been demonstrated to form (212, 215, 216) or which could potentially give rise to (249) kissing complexes. RNA kissing aptamers were, in both cases (215, 216), characterized by a G-A pair closing the apical loop. This noncanonical pair was demonstrated to play a key role in complex stability (244). This work demonstrates the potential of in vitro selection for understanding RNA-RNA interactions. Knowledge of RNA tertiary structure will undoubtedly increase with the number of such studies. This might in turn lead to the rational optimization of nonlinear antisense sequences. It should be pointed out that increased specificity is expected to arise from the recognition of folded structures by aptamers compared to the binding of primary sequences by antisense oligonucleotides. Indeed, the results obtained with the kissing aptamer demonstrate the key role played by the context of the Watson-Crick base-pairing region. The repertoire of aptamers targeted to an RNA structure is not restricted to the formation of loop-loop complexes. In vitro selection recently carried out in our laboratory against the TAR RNA element of HIV-1 or against the structured region at the 3' end of the hepatitis C virus RNA has generated other classes of aptamers able to bind their respective targets with nanomolar affinity (D. Sekkai et al., unpublished results). However, the potential of these selected sequences is not yet known. Finally, it is likely that improved ligands will be generated by the synthesis of conjugates associating an oligonucleotide moiety (a short antisense sequence targeted to an open region or an aptamer) with a tail exhibiting different binding properties (e.g., a basic peptide or a ligand bearing various potential binding elements, possibly generated through combinatorial approaches). Such dual molecules might exhibit improved selectivity and higher affinity.

ACKNOWLEDGMENTS

We thank our colleagues L. Aldaz-Carroll, K. Aupeix, S. Chabas, C. Boiziau, C. Cazenave, F. Darfeuille, E. Dausse, R. Le Tinévez, D. Sekkai, and L. Yurchenko for sharing unpublished results. The work described in this review, performed in J. J. Toulmé's laboratory, was supported over the last few years by the Conseil Régional d'Aquitaine, the Agence Nationale de Recherches sur le SIDA, the Association pour la Recherche sur le Cancer, the Pôle Médicament Aquitaine, the Direction de la Recherche et de la Technologie, and the European Union (Biotechnology Programme).

REFERENCES

1. T. R. Cech and B. L. Bass, Annu. Rev. Biochem. 55, 599-629 (1986).
2. O. Melefors and M. W. Hentze, BioEssays 15, 85-90 (1993).
3. E. C. Theil, Biochem. J. 304, 1-11 (1994).



4. M. W. Hentze and L. C. Kuhn, Proc. Natl. Acad. Sci. U.S.A. 93, 8175-8182 (1996).
5. P. Ponka, Kidney Int. Suppl. 69, S2-11 (1999).
6. J. C. Paillart, R. Marquet, E. Skripkin, C. Ehresmann, and B. Ehresmann, Biochimie 78, 639-653 (1996).
7. J. G. Levin, D. L. Hatfield, S. Oroszlan, and A. Rein, in "Reverse Transcriptase" (A. M. Skalka and S. P. Goff, eds.), pp. 5-31. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 1993.
8. R. J. Jackson and A. Kaminski, RNA 1, 985-1000 (1995).
9. D. Moazed and H. F. Noller, Nature (London) 327, 389-394 (1987).
10. Y. Van de Peer, E. Robbrecht, S. de Hoog, A. Caers, P. De Rijk, and R. De Wachter, Nucleic Acids Res. 27, 179-183 (1999).
11. D. Fourmy, M. I. Recht, and J. D. Puglisi, J. Mol. Biol. 277, 347-362 (1998).
12. M. G. Wallis and R. Schroeder, Prog. Biophys. Mol. Biol. 67, 141-54 (1997).
13. F. Walter, Q. Vicens, and E. Westhof, Curr. Opin. Chem. Biol. 3, 694-704 (1999).
14. T. Hermann and E. Westhof, Curr. Opin. Biotechnol. 9, 66-73 (1998).
15. T. Hermann and D. J. Patel, Science 287, 820-825 (2000).
16. T. Hermann and E. Westhof, J. Mol. Biol. 276, 903-912 (1998).
17. H. Wank and R. Schroeder, J. Mol. Biol. 258, 53-61 (1996).
18. L. B. Blyn, L. M. Risen, R. H. Griffey, and D. E. Draper, Nucleic Acids Res. 28, 1778-1784 (2000).
19. S. J. Sucheck, W. A. Greenberg, T. J. Tolbert, and C. H. Wong, Angew. Chem. Int. Ed. 39, 1080 (2000).
20. J. B. H. Tok and R. R. Rando, J. Amer. Chem. Soc. 120, 8279-8280 (1998).
21. F. Hamy, E. R. Felder, G. Heizmann, J. Lazdins, F. Aboul-ela, G. Varani, J. Karn, and T. Klimkait, Proc. Natl. Acad. Sci. U.S.A. 94, 3548-3553 (1997).
22. T. Klimkait, E. R. Felder, G. Albrecht, and F. Hamy, Biotechnol. Bioeng. 61, 155-168 (1998).
23. W. D. Wilson and K. Li, Curr. Med. Chem. 7, 73-98 (2000).
24. P. S. Miller, L. T. Braiterman, and P. O. P. Ts'o, Biochemistry 16, 1988-1996 (1977).
25. P. C. Zamecnik and M. L. Stephenson, Proc. Natl. Acad. Sci. U.S.A. 75, 280-284 (1978).
26. K. U. Mir and E. M. Southern, Nat. Biotechnol. 17, 788-792 (1999).
27. N. Milner, K. U. Mir, and E. M. Southern, Nat. Biotechnol. 15, 537-541 (1997).
28. W. F. Lima, V. Brown-Driver, M. Fox, R. Hanecak, and T. W. Bruice, J. Biol. Chem. 272, 626-638 (1997).
29. A. M. Belikova, V. F. Zarytova, and N. I. Grineva, Tetrahedron Lett. 37, 3557-3562 (1967).
30. J. C. Barrett, P. S. Miller, and P. O. P. Ts'o, Biochemistry 13, 4897-4906 (1974).
31. J. J. Toulmé, in "Antisense RNA and DNA" (J. A. H. Murray, ed.), pp. 175-194. Wiley, New York, 1992.
32. T. Mizuno, M. Y. Chou, and M. Inouye, Proc. Natl. Acad. Sci. U.S.A. 81, 1966-1970 (1984).
33. R. W. Simons and N. Kleckner, Cell (Cambridge, Mass.) 34, 683-691 (1983).
34. J. I. Tomizawa and T. Itoh, Proc. Natl. Acad. Sci. U.S.A. 78, 6096-6100 (1981).
35. S. T. Crooke and B. Lebleu, "Antisense Research and Applications," CRC Press, Boca Raton, FL, 1993.
36. S. Agrawal and E. R. Kandimalla, Mol. Med. Today 6, 72-81 (2000).
37. C. Hélène and J. J. Toulmé, Biochim. Biophys. Acta 1049, 99-125 (1990).
38. J. J. Toulmé, H. M. Krisch, N. Loreau, N. T. Thuong, and C. Hélène, Proc. Natl. Acad. Sci. U.S.A. 83, 1227-1231 (1986).
39. M. Lemaitre, B. Bayard, and B. Lebleu, Proc. Natl. Acad. Sci. U.S.A. 84, 648-652 (1987).
40. E. S. Kawasaki, Nucleic Acids Res. 13, 4991-5004 (1985).
41. C. Cazenave, N. Loreau, N. T. Thuong, J. J. Toulmé, and C. Hélène, Nucleic Acids Res. 15, 4717-4736 (1987).



42. M. T. Haeuptle, R. Frank, and B. Dobberstein, Nucleic Acids Res. 14, 1427-1445 (1986).
43. J. Minshull and T. Hunt, Nucleic Acids Res. 14, 6433-6451 (1986).
44. R. Y. Walder and J. A. Walder, Proc. Natl. Acad. Sci. U.S.A. 85, 5011-5015 (1988).
45. C. Cazenave, P. Frank, and W. Büsen, Biochimie 75, 113-122 (1993).
46. P. S. Eder and J. A. Walder, J. Biol. Chem. 266, 6472-6479 (1991).
47. P. Frank, S. Albert, C. Cazenave, and J. J. Toulmé, Nucleic Acids Res. 22, 5247-5254 (1994).
48. J. J. Toulmé, P. Frank, and R. J. Crouch, in "Ribonucleases H" (R. J. Crouch and J. J. Toulmé, eds.), pp. 147-162. John Libbey, Paris, 1998.
49. F. Pileur, J. J. Toulmé, and C. Cazenave, Nucleic Acids Res. 18, 3674-3683 (2000).
50. R. V. Giles, D. G. Spiller, and D. M. Tidd, Antisense Res. Dev. 5, 23-31 (1995).
51. N. Dias, S. Dheur, P. E. Nielsen, S. Gryaznov, A. Van Aerschot, P. Herdewijn, C. Hélène, and T. E. Saison-Behmoaras, J. Mol. Biol. 294, 403-416 (1999).
52. J. M. Kean, A. Murakami, K. R. Blake, C. D. Cushman, and P. S. Miller, Biochemistry 27, 9113-9121 (1988).
53. K. Aupeix-Scheidler, S. Chabas, L. Bidou, J. P. Rousset, M. Leng, and J. J. Toulmé, Nucleic Acids Res. 28, 438-445 (2000).
54. M. Y. Chiang, H. Chan, M. A. Zounes, S. M. Freier, W. F. Lima, and C. F. Bennett, J. Biol. Chem. 266, 18162-18171 (1991).
55. C. Boiziau, N. T. Thuong, and J. J. Toulmé, Proc. Natl. Acad. Sci. U.S.A. 89, 768-772 (1992).
56. B. Bordier, M. Perala-Heape, G. Degols, B. Lebleu, S. Litvak, L. Sarih-Cottin, and C. Hélène, Proc. Natl. Acad. Sci. U.S.A. 92, 9383-9387 (1995).
57. C. Boiziau, L. Tarrago-Litvak, N. D. Sinha, S. Moreau, S. Litvak, and J. J. Toulmé, Antisense Nucleic Acid Drug Dev. 6, 103-109 (1996).
58. C. Hélène and J. J. Toulmé, in "Oligodeoxynucleotides: Antisense Inhibitors of Gene Expression" (J. S. Cohen, ed.), pp. 137-172. Macmillan Press, London, 1989.
59. S. M. Gryaznov, Biochim. Biophys. Acta 1489, 131-140 (1999).
60. H. J. Larsen, T. Bentin, and P. E. Nielsen, Biochim. Biophys. Acta 1489, 159-166 (1999).
61. J. Summerton and D. Weller, Antisense Nucleic Acid Drug Dev. 7, 187-195 (1997).
62. G. B. Mulamba, A. Hu, R. F. Azad, K. P. Anderson, and D. M. Coen, Antimicrob. Agents Chemother. 42, 971-973 (1998).
63. P. Couvreur and C. Malvy, "Pharmaceutical Aspects of Oligonucleotides," Taylor & Francis, London, 2000.
64. B. Weiss, "Antisense Oligodeoxynucleotides and Antisense Agents. Novel Pharmacological and Therapeutic Agents," CRC Press, Boca Raton, FL, 1997.
65. P. Borst, Annu. Rev. Biochem. 55, 701-732 (1986).
66. N. Agabian, Cell (Cambridge, Mass.) 61, 1157-1160 (1990).
67. J. J. Toulmé, C. Bourget, D. Compagno, and L. Yurchenko, Parasitology 114, S45-S59 (1997).
68. J. J. Toulmé, in "Pharmaceutical Aspects of Oligonucleotides" (P. Couvreur and C. Malvy, eds.), pp. 286-308. Taylor & Francis, London, 2000.
69. P. Verspieren, A. W. C. A. Cornelissen, N. T. Thuong, C. Hélène, and J. J. Toulmé, Gene 61, 307-315 (1987).
70. C. Ramazeilles, R. K. Mishra, S. Moreau, E. Pascolo, and J. J. Toulmé, Proc. Natl. Acad. Sci. U.S.A. 91, 7859-7863 (1994).
71. R. K. Mishra, C. Moreau, C. Ramazeilles, S. Moreau, J. Bonnet, and J. J. Toulmé, Biochim. Biophys. Acta 19.64, 229-237 (1995).
72. P. Verspieren, N. Loreau, N. T. Thuong, D. Shire, and J. J. Toulmé, Nucleic Acids Res. 18, 4711-4717 (1990).
73. E. Pascolo, D. Hudrisier, B. Sproat, N. T. Thuong, and J. J. Toulmé, Biochim. Biophys. Acta 1219, 98-106 (1994).



74. D. Compagno, J. N. Lampe, C. Bourget, I. V. Kutyavin, L. Yurchenko, E. A. Lukhtanov, V. V. Gorn, H. B. Gamper, and J. J. Toulmé, J. Biol. Chem. 274, 8191-8198 (1999).
75. D. Herschlag, Proc. Natl. Acad. Sci. U.S.A. 88, 6921-6925 (1991).
76. M. J. Serra and D. H. Turner, Methods Enzymol. 259, 242-261 (1995).
77. I. Tinoco and C. Bustamante, J. Mol. Biol. 293, 271-281 (1999).
78. I. Tinoco, Jr., P. N. Borer, B. Dengler, M. D. Levin, O. C. Uhlenbeck, D. M. Crothers, and J. Bralla, Nat. New Biol. 246, 40-41 (1973).
79. J. D. Puglisi, J. R. Wyatt, and I. Tinoco, J. Mol. Biol. 214, 437-453 (1990).
80. J. R. Wyatt, J. D. Puglisi, and I. Tinoco, J. Mol. Biol. 214, 455-470 (1990).
81. D. J. Ecker, in "Antisense Research and Applications" (B. Lebleu and S. T. Crooke, eds.), pp. 387-399. CRC Press, Boca Raton, FL, 1993.
82. D. J. Ecker, T. A. Vickers, T. W. Bruice, S. M. Freier, R. D. Jenison, M. Manoharan, and M. Zounes, Science 257, 958-961 (1992).
83. W. F. Lima, B. P. Monia, D. J. Ecker, and S. M. Freier, Biochemistry 31, 12055-12061 (1992).
84. S. Freier, in "Antisense Research and Applications" (B. Lebleu and S. T. Crooke, eds.), pp. 67-82. CRC Press, Boca Raton, FL, 1993.
85. S. M. Freier and I. Tinoco, Jr., Biochemistry 14, 3310-3314 (1975).
86. O. C. Uhlenbeck, J. Mol. Biol. 65, 25-41 (1972).
87. T. A. Vickers, J. R. Wyatt, and S. M. Freier, Nucleic Acids Res. 28, 1340-1347 (2000).
88. D. Porschke and M. Eigen, J. Mol. Biol. 62, 361-381 (1971).
89. M. E. Craig, D. M. Crothers, and P. Doty, J. Mol. Biol. 62, 383-401 (1971).
90. C. Cantor and P. Schimmel (P. C. Vapnek, ed.), W. H. Freeman, San Francisco, 1980.
91. M. J. Fedor and O. C. Uhlenbeck, Proc. Natl. Acad. Sci. U.S.A. 87, 1668-1672 (1990).
92. T. Hjalt and E. G. Wagner, Nucleic Acids Res. 20, 6723-6732 (1992).
93. K. Rittner, C. Burmester, and G. Sczakiel, Nucleic Acids Res. 21, 1381-1387 (1993).
94. T. W. Bruice and W. F. Lima, Biochemistry 36, 5004-5019 (1997).
95. V. Patzel and G. Sczakiel, Nucleic Acids Res. 28, 2462-2466 (2000).
96. M. Scherr, J. J. Rossi, G. Sczakiel, and V. Patzel, Nucleic Acids Res. 28, 2455-2461 (2000).
97. M. J. Lehmann, V. Patzel, and G. Sczakiel, Nucleic Acids Res. 28, 2597-2604 (2000).
98. B. S. Sproat, A. I. Lamond, B. Beijer, P. Neuner, and U. Ryder, Nucleic Acids Res. 17, 3373-3386 (1989).
99. A. I. Lamond and B. S. Sproat, FEBS Lett. 325, 123-127 (1993).
100. E. A. Lesnik, C. J. Guinosso, A. M. Kawasaki, H. Sasmor, M. Zounes, L. L. Cummins, D. J. Ecker, P. D. Cook, and S. M. Freier, Biochemistry 32, 7832-7838 (1993).
101. S. M. Freier and K. Altmann, Nucleic Acids Res. 25, 4429-4443 (1997).
102. V. Tereshko, S. Portmann, E. C. Tay, P. Martin, F. Natt, K. H. Altmann, and M. Egli, Biochemistry 37, 10626-10634 (1998).
103. E. A. Lesnik and S. M. Freier, Biochemistry 34, 10807-10815 (1995).
104. M. Teplova, G. Minasov, V. Tereshko, G. B. Inamati, P. D. Cook, M. Manoharan, and M. Egli, Nature Struct. Biol. 6, 535-539 (1999).
105. A. A. Koshkin, S. K. Singh, P. Nielsen, V. K. Rajwanshi, R. Kumar, M. Meldgaard, C. E. Olsen, and J. Wengel, Tetrahedron 54, 3607-3630 (1998).
106. C. Wahlestedt, P. Salmi, L. Good, J. Kela, T. Johnsson, T. Hokfelt, C. Broberger, F. Porreca, J. Lai, K. Ren, M. Ossipov, A. Koshkin, N. Jakobsen, J. Skouv, H. Oerum, M. H. Jacobsen, and J. Wengel, Proc. Natl. Acad. Sci. U.S.A. 97, 5633-5638 (2000).
107. S. Gryaznov and J. K. Chen, J. Amer. Chem. Soc. 116, 3143-3144 (1994).
108. T. J. Matray and S. M. Gryaznov, Nucleic Acids Res. 27, 3976-3985 (1999).
109. V. Tereshko, S. Gryaznov, and M. Egli, J. Amer. Chem. Soc. 120, 269-283 (1998).
110. T. Skorski, D. Perrotti, M. Nieborowska-Skorska, S. Gryaznov, and B. Calabretta, Proc. Natl. Acad. Sci. U.S.A. 94, 3966-3971 (1997).



111. F. Boulmé, F. Freund, S. Moreau, P. Nielsen, S. Gryaznov, J. J. Toulmé, and S. Litvak, Nucleic Acids Res. 26, 5492-5500 (1998).
112. H. Kang, P. J. Chou, W. C. Johnson, Jr., D. Weller, S. B. Huang, and J. E. Summerton, Biopolymers 32, 1351-1363 (1992).
113. G. Schmajuk, H. Sierakowska, and R. Kole, J. Biol. Chem. 274, 21783-21789 (1999).
114. L. Lacroix, P. B. Arimondo, M. Takasugi, C. Hélène, and J. L. Mergny, Biochem. Biophys. Res. Commun. 270, 363-369 (2000).
115. P. E. Nielsen, M. E. Egholm, and R. H. Berg, Science 254, 1497-1500 (1991).
116. M. Egholm, O. Buchardt, L. Christensen, C. Behrens, S. M. Freier, D. A. Driver, R. H. Berg, S. K. Kim, B. Norden, and P. E. Nielsen, Nature (London) 365, 566-568 (1993).
117. K. K. Jensen, H. Orum, P. E. Nielsen, and B. Norden, Biochemistry 36, 5072-5077 (1997).
118. S. C. Brown, S. A. Thomson, J. M. Veal, and D. G. Davis, Science 265, 777-779 (1994).
119. D. Y. Cherny, B. P. Belotserkovskii, M. D. Frank-Kamenetskii, M. Egholm, O. Buchardt, R. H. Berg, and P. E. Nielsen, Proc. Natl. Acad. Sci. U.S.A. 90, 1667-1670 (1993).
120. I. V. Kutyavin, R. L. Rhinehart, E. A. Lukhtanov, V. V. Gorn, R. B. Meyer, and H. B. Gamper, Biochemistry 35, 11170-11176 (1996).
121. C. A. Stein and Y. C. Cheng, Science 261, 1004-1012 (1993).
122. J. J. Toulmé and D. Tidd, in "Ribonucleases H" (R. J. Crouch and J. J. Toulmé, eds.), pp. 225-250. John Libbey, Paris, 1998.
123. C. A. Stein, Pharmacol. Ther. 85, 231-236 (2000).
124. B. P. Monia, J. F. Johnston, D. J. Ecker, M. A. Zounes, W. F. Lima, and S. M. Freier, J. Biol. Chem. 267, 19954-19962 (1992).
125. B. Larrouy, C. Blonski, C. Boiziau, M. Stuer, S. Moreau, D. Shire, and J. J. Toulmé, Gene 121, 189-194 (1992).
126. B. Larrouy, C. Boiziau, B. Sproat, and J. J. Toulmé, Nucleic Acids Res. 23, 3434-3440 (1995).
127. R. V. Giles and D. M. Tidd, Nucleic Acids Res. 20, 763-770 (1992).
128. R. V. Giles, D. G. Spiller, and D. M. Tidd, Anti-Cancer Drug Des. 8, 33-51 (1993).
129. R. V. Giles, C. J. Ruddell, D. G. Spiller, J. A. Green, and D. M. Tidd, Nucleic Acids Res. 23, 954-961 (1995).
130. G. G. Karpova, D. G. Knorre, A. S. Ryte, and L. E. Stephanovich, FEBS Lett. 122, 21-24 (1980).
131. C. Boiziau, A. S. Boutorine, N. Loreau, P. Verspieren, N. T. Thuong, and J. J. Toulmé, Nucleosides Nucleotides 10, 239-244 (1991).
132. U. Pieles and U. Englisch, Nucleic Acids Res. 17, 285-299 (1989).
133. M. Kulka, C. C. Smith, L. Aurelian, R. Fishelevich, K. Meade, P. Miller, and P. O. P. Ts'o, Proc. Natl. Acad. Sci. U.S.A. 86, 6868-6872 (1989).
134. K. J. Friedman, J. Kole, J. A. Cohn, M. R. Knowles, L. M. Silverman, and R. Kole, J. Biol. Chem. 274, 36193-36199 (1999).
135. H. Sierakowska, L. Gorman, S. H. Kang, and R. Kole, in "Antisense Technology, Pt. A" (M. I. Phillips, ed.), pp. 506-521. Academic Press, San Diego, 2000.
136. G. B. Dreyer and P. B. Dervan, Proc. Natl. Acad. Sci. U.S.A. 82, 968-972 (1985).
137. C. H. B. Chen and D. S. Sigman, Proc. Natl. Acad. Sci. U.S.A. 83, 7147-7151 (1986).
138. T. Le Doan, L. Perrouault, C. Hélène, M. Chassignol, and N. T. Thuong, Biochemistry 25, 6736-6739 (1986).
139. B. C. F. Chu and L. E. Orgel, Nucleic Acids Res. 17, 4783-4798 (1989).
140. R. Dalbies, D. Payet, and M. Leng, Proc. Natl. Acad. Sci. U.S.A. 91, 8147-8151 (1994).
141. M. Boudvillain, M. Guerin, R. Dalbies, T. Saison-Behmoaras, and M. Leng, Biochemistry 36, 2925-2931 (1997).
142. G. Felsenfeld, D. R. Davies, and A. Rich, J. Amer. Chem. Soc. 79, 2023-2024 (1957).
143. K. Hoogsteen, Acta Cryst. 12, 822-823 (1959).

44

J.j. TOULMI~ ET AL.

144. N. T. Thuong and C. H61bne, Angew. Chem. Int. Ed. 32, 666-690 (1993). I. Radhakrishnan and D. J. Patel, Biochemistry 33, 11405-11416 (1994). D. S. Pilch, C. Levenson, and R. H. Shafer, Biochemistry 30, 6081-6087 (1991). D. S. Pilch, R. Brousseau, and R. Sharer, Nucleic Acids Res. 18, 5743-5750 (1990). G. C. Best and P. B. Dervan, J. Amer. Chem. Soc. 117, 1187-1193 (1995). P.V. Scaria, S. Will, C. Levenson, and R. H. Shafer, J. Biol. Chem. 270, 7295-7303 (1995). P.V. Scaria and R. H. Shafer, Biochemistry 35, 10985-10994 (1996). R.W. Roberts and D. M. Crothers, Science 258, 1463-1366 (1992). C. Escude, J. c. Francois, J. S. Sun, G. Ott, M. Sprinzl, T. Garestier, and C. H61~ne, Nucleic Acids Res. 21, 5547-5553 (1993). 153. S. Wang and E. T. Kool, Nucleic Acids Res. 23, 1157-1164 (1995). 154. H. Han and P. B. Dervan, Proc. Natl. Acad. Sci. U.S.A. 90, 3806-3810 (1993). 155. S. H. Wang and E. T. Kool, J. Amer. Chem. Soc. 116, 8857-8858 (1994). 156. T. Vo, S. H. Wang, and E. T. Kool, Nucleic Acids Res. 23, 2937-2944 (1995). 157. F. Svinarchuk, J. Paoletti, and C. Malvy,J. Biol. Chem. 270, 14068-14071 (1995). 158. K. Aupeix, R. Le Tin6vez, and J. J. Toulm6, FEBS Lett. 449, 169-174 (1999). 159. J. L. Mergny, D. Collier, M. Roug6e, T. Montenay-Garestier, and C. H61bne, Nucleic Acids Res. 19, 1521-1526 (1991). 160. W D. Wilson, S. Mizan, F. A. Tanious, S. Yao, and G. Zon, J. Mol. Recognit. 7, 89-98 (1994). 161. W. D. Wilson, F. A. Tanious, S. Mizan, S. Tan, A. S. Kiselyov, G. Zon, and L. Strekowski, 32, 10614-10621 (1993). 162. K. R. Fox, P. Polucci, T. C. Jenkins, and S. Neidle, Proc. Natl. Acad. Sci. U.S.A. 92, 7887-7891 (1995). 163. J. S. Sun, j. c. Francois, T. Montenay-Garestier, T. Saison-Behmoaras, V. Roig, N. T. Thuong, and C. H61bne, Proc. Natl. Acad. Sci. U.S.A. 86, 9198-9202 (1989). 164. J. L. Mergny, G. Duvalvalentin, C. H. Nguyen, L. Perrouanlt, B. Faucon, M. Rougee, T. Montenaygarestier, E. Bisagni, and C. H61bne, Science 256, 1681-1684 (1992). 165. G. C. 
Silver, J. S. Sun, C. H. Nguyen, A. S. Boutorine, E. Bisagni, and C. H61~ne,J. Amer. Chem. Soc. 119, 263-268 (1997). 166. D. S. Pilch and K. J. Breslauer, Proc. Natl. Acad. Sci. U.S.A. 91, 9332-9336 (1994). 167. L. E. Xodo, M. Giorgio, F. Quadrifoglio, G. A. Van der Marel, and J. van Boom, Nucleic ACIds Res. 19, 5625-5631 (1991). 168. L. Lacroix, J. Lacoste, J. F. Reddoch, J. L. Mergny, D. D. Levy, M. M. Seidman, M. D. Matteueci, and P. M. Glazer, Biochemistry 38, 1893-1901 (1999). 169. 17.Godde, J. J. Toulm6, and S. Morean, Biochemistry 37, 13765-13775 (1998). 170. J. Michel, G. Gueguen, J. Vercauteren, and S. Moreau, Tetrahedron 53, 8457-8478 (1997). 171. A. Arzumanov, F. Godde, S. Moreau, J. J. Toulm6, A. Weeds, and M. J. Gait, Helv. Chim. Acta 83, 1424-1436 (2000). 172. S. H. Wang and E. T. Kool, Biochemistry 34, 4125-4132 (1995). 173. B. Cuenoud, F. Casset, D. Husken, F. Natt, R. M. Wolf, K. H. Altmann, P. Martin, and H. E. Moser, Angew. Chem. Int. Ed. 37, 1288-1291 (1998). 174. C. Giovannangeli, T. Montenay-Garestier, M. Roug6e, M. Chassignol, N. T. Thuong, and C. H61~ne,J. Amer. Chem. Soc. 113, 7775-7777 (1991), 175. G. Prakash and E. T. Kool, J. Chem Soc. Chem. Commun. 1161-1163 (1991). 176. G. Prakash and E. T. Kool, J. Amer. Chem. Soc. 114, 3523-3527 (1992). 177. W. Bannwarth and P. Iaiza, Helv. Chim. Acta 81, 1739-1748 (1998). 178. E. R. Kandimalla, A. N. Manning, G. Venkataraman, V. Sasisekharan, and S. Agrawal, Nucleic Acids Res. 23, 4510-4517 (1995). 179. J. J. Toulm6, R. Le Tin6vez, and E. Brossalina, Biochimie 78, 663-673 (1996). 180. E. Brossalina and J. J. Toulm6,J. Amer. Chem. Soc. 115, 796-797 (1993). 145. 146. 147. 148. 149. 150. 151. 152.


45

181. J. C. Francois and C. H~l~ne, Biochemistry 34, 65-72 (1995). 182. E. Brossalina, E. Pascolo, and J. J. Toulm6, Nucleic Acids Res. 21, 5616-5622 (1993). 183. E. Brossalina, E. Demehenko, Y. Demchenko, V.Vlassov,and J. J. Totdm6, Nucleic Acids Res.

24, 3392-3398 (1996). 184. L. C. Griffin and P. B. Dervan, Science 245, 967-971 (1989). 185. J. L. Mergny, J. S. Sun, M. Roug6e, T. Montenay-Garestier, F. Bareelo, J. Chomilier, and C. H~l~ne, Biochemistry 30, 9791-9798 (1991). 186. E. Pascolo and J.-J. Toulm6,J. Biol. Chem. 271, 24187-24192 (1996). 187. S.T. Cload and A. Sehepartz, J. Amer. Chem. Soc. 116, 437-442 (1994). 188. D. J. Ecker, T. A. Vickers, R. Hanecak, V. Driver, and K. Anderson, Nucleic Acids Res. 21,

1853-1856 (1993). A. D. Ellington and J. W. Szostak, Nature (London) 346, 818-822 (1990). C. Tuerk and L. Gold, Science 249, 505-510 (1990). L. Gold, B. Polisky, O. Uhlenbeck, and M. Yarus, Annu. Rev. Biochem. 64, 763-797 (1995). W. Pan, R. B. Craven, Q. Qui, C. B. Wilson, J. w. Wills, S. Golovine, and J. F. Wang, Proc. Natl. Acad. Sci. U.S.A. 92, 11509-11513 (1995). 193. M. Homann and H. U. Goringer, Nucleic Acids Res. 27, 2006-2014 (1999). 194. R. D. Jenison, S. C. Gill, A. Pardi, and B. Polisky, Science 263, 1425-1429 (1994). 195. S. D. Jayasena, Clin. Chem. 45, 1628-1650 (1999). 196. E. N. Brody, M. C. Willis, J. D. Smith, S. Jayasena, D. Zichi, and L. Gold, Mol. Diagn. 4, 381-388 (1999). 197. A. D. Ellington and R. Conrad, Biotechnol. Annu. Rev. 1, 185-215 (1995). 198. M. Famulok and A. Jenne, Curt Opin. Chem. Biol. 2, 320-327 (1998). 199. J.-J. Toulm6, Curt. Opin. Mol. Therap., in press (2000). 200. M. Kurz and R. R. Breaker, C u ~ Top. Microbiol. Immunol. 243, 137-158 (1999). 201. D. S. Wilson and J. w. Szostak, Annu. Rev. Biochem. 68, 611-647 (1999). 202. D. Pei, H. D. Ulrich, and P. G. Schultz, Science 253, 1408-1411 (1991). 203. G. A. Soukup, A. D. Ellington, and L. J. Maher, J. Mol. Biol. 259, 216-228 (1996). 204. P. Hardenbol and M. W. van Dyke, eroc. Natl. Acad. Sci. U.S.A. 93, 2811-2816 (1996). 205. R. K. Mishra and J. J. Toulm~, C. R. Acad. Sci. Paris 317, 977-982 (1994). 206. R.K. Mishra, R. Le Tin6vez, and J. j. Toulm6, Proc. Natl. Acad. Sci. U.S.A. 93, 10679-10684 (1996). 207. R. Le Tin6vez, R. K. Mishra, and J. J. Toulm~, Nucleic ACIds Res. 26, 2273-2278 (1998). 208. C. Boiziau, E. Dausse, R. Mishra, F. Ducong~, and J. J. Toulm6, Antisense Nucleic Acid Drug. Dev. 7, 369-380 (1997). 209. J. Karn, J. Mol. Biol. 293, 235-254 (1999). 210. K. A. Jones, Genes Dev. 11, 2593-2599 (1997). 211. L. Dassonneville, F. Hamy, P. Colson, C. Houssier, and C. Bailly, Nucleic Acids Res. 25, 4487-4492 (1997). 212. C. Boiziau, E. Dansse, L. Yurchenko, and J. J. 
Toulm~, J. Biol. Chem. 274, 12730-12737 (1999). 213. K. Y. Chang and I. Tinoco, J. Mol. Biol. 269, 52-66 (1997). 214. A.J. Lee and D. M. Crothers, Structure 6, 993-1005 (1998). 215. F. Ducong6 and J. J. Toulm6, RNA 5, 1605-1614 (1999). 216. D. Scarabino, A. Crisari, S. Lorenzini, K. Williams, and G. P. Tocchini-Valentini, EMBOJ. 18, 4571-4578 (1999). 217. J. I. Tomizawa, Cell (Cambridge, Mass.) 47, 89-97 (1986). 218. C. Persson, E. G. H. Wagner, and K. NordstrSm, EMBOJ. 9, 3761-3775 (1990). 219. E. Skripkin, J. c. PaiUart, R. Marquet, B. Ehresmann, and C. Ehresmann, Proc. Natl. Acacl. Sci. U.S.A. 91, 4945-4949 (1994). 189. 190. 191. 192.

46

j.j. TOULMI~ ET AL.

220. D. Muriaux, P. M. Girard, B. Bonnet-Mathoniere, and J. Paoletti, J. Biol. Chem. 270, 8209-

8216 (1995). 221. J. Wang, J. M. Bakkers, J. M. Galama, H. J. Bruins Slot, E. V. Pilipenko, V. I. Agol, and W. J. Melchers, Nucleic Acids Res. 27, 485-490 (1999). 222. E. G. H. Wagner and S. Brantl, Trends Biochem. Sci. 23, 451-454 (1998). 223. J. Tomizawa, J. Mol. Biol. 212, 695-708 (1990). 224. Y. Eguchi and J. I. Tomizawa, Cell (Cambridge, Mass.) 60, 199-209 (1990). 225. Y. Egnchi and J. I. Tomizawa, J. Mol. Biol. 220, 831-842 (199l). 226. P. F. Predki, L. M. Nayak, M. B. Gottlieb, and L. Regan, Cell (Cambridge, Mass.) 80, 41-50

(1995). 227. E Blomberg, H. M. Engdahl, C. Malmgren, E Romby, and E. G. H. Wagner, Mol. Microbiol.

12, 49-60 (1994). 228. T. A. H. Hjalt and E. G. H. Wagner, Nucleic Acids Res. 23, 580-587 (1995). 229. F. A. Kolb, C. Malmgren, E. Westh, C. Ehresmann, B. Ehresmann, E. G. Wagner, and

P. Romby, RNA 6, 311-324 (2000). 230. J. L. Clever, M. L. Wong, andT. G. Parslow, J. Virol. 70, 5902-5908 (1996). 231. K. I. Takahashi, S. Baba, P. Chattopadhyay, Y. Koyanagi, N. Yamamoto, H. Takaku, and

G. Kawai, RNA 6, 96-102 (2000). 232. J. C. Paillart, E. Westhof, C. Ehresmann, B. Ehresmann, and R. Marquet, J. Mol. Biol. 270,

36-49 (1997). 233. M. Laughrea, L. Jette, j. Mak, L. Kleiman, C. Liang, and M. A. Wainberg, j. Virol. 71,

3397-3406 (1997). 234. E. Skripkin, J. C. Paillart, R. Marquet, M. Blumenfeld, B. Ehresmann, and C. Ehresmann, J. Biol. Chem. 271, 28812-28817 (1996). 235. j. s. Lodmell, J. c. Paillart, D. Mignot, B. Ehresmann, C. Ehresmann, and R. Marquet, Antisense Nucleic Acid Drug. Dec. 8, 517-529 (1998). 236. R. S. Gregorian and D. M. Crothers, J. Mol. Biol. 248, 968-984 (1995). 237. F. Dardel, R. Marquet, C. Ehresmann, B. Ehresmann, and S. Blanquet, Nucleic Acids Res.

26, 3567-3571 (1998). 238. F. Jossinet, j. c. Paillart, E. Westhof, T. Hennann, E. Skripkin, J. S. Lodmell, C. Ehresmann,

B. Ehresmann, and R. Marquet, RNA 5, 1222-1234 (1999). 239. A. Mujeeb, J. L. Clever, T. M. Billeci, T. L. James, and T. G. Parslow, Nature Struct. Biol. 5,

432-436 (1998). 240. L. R. Comolli, J. G. Pelton, and I. Tinoco, NucIeicAcids Res. 26, 4688-4695 (1998). 241. D. Colin, C. Heijenoort, C. Boiziau, J. Toulm6, and E. Guittet, Nucleic Acids Res. 28, 3386-

3391 (2000). 242. O. Y. Fedoroff, M. Salazar, and B. R. Reid, J. Mol. Biol. 233, 509-523 (1993). 243. H. H. Hogrefe, R. I. Hogrefe, R. Y. Walder, andJ. A. Walder, J. Biol. Chem. 265, 5561-5566

(1990). 244. F. Dueong6, C. Di Primo, and J. J. Toulm6, J. Biol. Chem. 275, 21287-21294 (2000). 245. T. M. Nair, D. G. Myszka, and D. R. Davis, Nucleic Acids Res. 28, 1935-1940 (2000). 246. L. Jiang, A. Majumdar, W. Hu, T. J. Jaishree, W. Xu, and D. J. Patel, Structure Fold, Des. 7, 817-827 (1999). 247. L. C. Jiang and D. J. Patel, Nature Struct. Biol. 5, 769-774 (1998). 248. G. Werstuck and M. R. Green, Science 282, 296-298 (1998). 249. J. B. Tok, J. Cho, and R. R. Rando, Nucleic Acids Res. 28, 2902-2910 (2000).

Regulation of the DNA Methylation Machinery and Its Role in Cellular Transformation

MOSHE SZYF¹ AND NANCY DETICH

Department of Pharmacology and Therapeutics
McGill University
3655 Sir William Osler Promenade
Montreal, Quebec
Canada H3G 1Y6

I. DNA Methylation Is a Covalent Epigenetic Modification of DNA
   A. The Reversibility of the Epigenome
   B. DNA Methylation Is a Component of the Epigenome That Is a Part of the Covalent Structure of DNA
II. DNA Methylation Represses Gene Expression by Two Different Mechanisms
   A. Ectopic Methylation Suppresses Expression
   B. DNA Methylation Can Suppress Genes by Interfering with the Binding of Transcription Factors
   C. Methylated DNA Interacts with Methylated DNA Binding Proteins (Mbds); Mbds Recruit Corepressors and Histone Deacetylases
III. Enzymatic DNA Methylation Machinery
   A. DNA Methyltransferases
   B. Demethylases
IV. Coordination of DNA Replication and DNA Methylation Machineries
   A. Regulatory Regions of Dnmt1 Interact with Nodal Mitogenic and Oncogenic Signaling Pathways
   B. Dnmt1 Expression Plays a Causal Role in Cellular Transformation
   C. Regulation of dnmt1 by the Retinoblastoma Tumor Suppressor Rb
   D. The APC-TCF Pathway
   E. Cell-Cycle Regulation of dnmt1
   F. Expression of Demethylase/Mbd2 and Mbd3 Is Regulated with the Cell Cycle
   G. DNA Methylation Occurs Concurrently with DNA Replication
   H. DNMT1 Is Targeted to the Replication Fork and Complexed with the Replication Fork Protein PCNA
   I. DNMT1 Regulates the Expression of Growth Suppressor Genes Directly by Protein-Protein Interactions
V. Inhibition of DNMT1 Inhibits DNA Replication in Human Lung Cancer Cells
VI. DNMT1 and Oncogenesis
   A. Global Hypomethylation and Regional Hypermethylation

¹To whom correspondence should be addressed. Telephone: 514-398-7107; fax: 514-398-6690; E-mail: [email protected].

Progress in Nucleic Acid Research and Molecular Biology, Vol. 69. Copyright © 2001 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/01 $35.00

   B. Overexpression of DNMT1 in Cancer Cells
   C. Hypothesis I: Overexpression of DNMT1 in Cancer Cells Causes Hypermethylation of Tumor Suppressor Genes
   D. Hypothesis II: Deregulation of Cell-Cycle Control of DNMT1 in Cancer Cells Results in Disruption of Cell Regulatory Circuits
   E. Why Are Tumor Suppressors Hypermethylated in Cancer Cells?
   F. Methylation of Tumor Suppressors Confers a Selective Advantage
   G. The Hypermethylator Phenotype
   H. The Possible Role of de Novo DNMTs
   I. The Possible Role of Demethylase
   J. The Accessibility of Demethylase to DNA Is Gated by Chromatin Structure
   K. Histone Deacetylation and Hypermethylation of Tumor Suppressor Genes
VII. Conclusions
References

DNA methylation, a covalent modification of the genome, is emerging as an important player in the regulation of gene expression. This review discusses the different components of the DNA methylation machinery responsible for replicating the DNA methylation pattern. Recent data have changed our basic understanding of the DNA methylation machinery. A number of DNA methyltransferases (DNMTs) have been identified and a demethylase has recently been reported. Because the DNA methylation pattern is critical for gene expression programs, the cell possesses a number of mechanisms to coordinate DNA replication and methylation. DNMT1 levels are regulated with the cell cycle and are induced upon entry into the S phase of the cell cycle. DNMT1 also regulates expression of cell-cycle proteins by its other regulatory functions and not through its DNA methylation activity. Once the mechanisms that coordinate DNMT1 and the cell cycle are disrupted, DNMT1 exerts an oncogenic activity. Tumor suppressor genes are frequently methylated in cancer but the mechanisms responsible are unclear. Overexpression of DNMT1 is probably not responsible for the aberrant methylation of tumor suppressor genes. Unraveling how the different components of the DNA methylation machinery interact to replicate the DNA methylation pattern, and how they are disrupted in cancer, is critical for understanding the molecular mechanisms of cancer. © 2001 Academic Press.

DNA methylation is a covalent modification of the genome that is emerging as a major component of the epigenome. DNA methylation therefore plays a major role in controlling and modeling gene expression programs. DNA methylation patterns can be approached from two points of view. One line of study focuses on understanding their function: how DNA methylation regulates gene expression. The second approach focuses on understanding how methylation patterns are formed and maintained. The fact that DNA methylation is a covalent component of the genome raises the additional question of whether the DNA methylation pattern is reversible like other biological modifications, or whether it is permanently fixed once the pattern is formed during development. It is clear that these two approaches to DNA methylation are linked. Understanding how the pattern of methylation is formed will shed light on its possible function, while comprehending how DNA methylation regulates gene expression might help elucidate some of the mechanisms involved in the generation, maintenance, and reversal of DNA methylation patterns.

This review does not discuss the important question of how DNA methylation patterns are formed during development or during X inactivation; nor does it discuss the way in which certain genes are parentally imprinted by methylation. We focus, instead, on understanding how the DNA methylation machinery is regulated, and how it is involved in fashioning and maintaining the DNA methylation pattern in somatic and cancer cells. In recent years this question has led us to focus on the role of the DNA methylation machinery in cancer, and on the therapeutic potential of some of the components of the DNA methylation machinery (1-3).

I. DNA Methylation Is a Covalent Epigenetic Modification of DNA

A. The Reversibility of the Epigenome

The epigenome is the layer of information that dictates which parts of the genome are expressed in a specific place and at a specific time. The genome is fixed and identical in all tissues of multicellular organisms. The epigenome, on the other hand, is modeled during development in order to enable the multiple gene expression programs found in a complex organism, and is continuously modified throughout life in response to physiological, environmental, and pathological cues. It is now clear that the epigenome consists of multiple layers that act in a combinatorial fashion to activate or suppress a genomic region. An active region of the DNA is distinguished by an open chromatin structure which allows access to enzymatic machineries that interact with the genome. Most of the components of the epigenome, such as histones, which form the core of the nucleosomes, and histone acetylases and deacetylases, which open or close the chromatin, are independent chemical entities that interact with the genome but are not part of it. The association of these molecules with the DNA is therefore reversible by its nature and can be continuously modified throughout life.


B. DNA Methylation Is a Component of the Epigenome That Is a Part of the Covalent Structure of DNA

The epigenome consists of an additional component that is part of the covalent structure of the genome: a coating of methyl groups. The hallmark of the vertebrate genome is that modification of cytosines by methylation occurs only when they are found 5' to guanosine in the sequence CpG, and that 80% of these CpGs are methylated (4). The pattern of methylation is unique in each tissue. A large set of data has established a correlation between the state of activity of genes, the chromatin structure, and DNA methylation (5-7). In general, regions of the genome that are active are less methylated than other regions (5). The same gene is methylated in a tissue where it is not expressed, whereas it is not methylated in tissues where it is expressed (4). Moreover, when two alleles of the same gene are differentially expressed in the same cell, as is the case in X inactivation (8-13) and parental imprinting (14-17), the inactive allele is believed to be marked by methylation of some CpGs. The covalent bond between the methyl group and the cytosine is stable, and therefore the DNA methylation reaction is believed to be irreversible. Thus, the DNA methylation pattern is considered a permanent component of the epigenome.
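The CpG rule stated in this section (only cytosines lying 5' to a guanine are candidates for methylation, and roughly 80% of them carry a methyl group) can be illustrated with a short sketch. The Python below is illustrative only: the function names `cpg_positions` and `methylated_fraction`, the example sequence, and the set of methylated positions are hypothetical and are not taken from the source.

```python
def cpg_positions(seq):
    """Return 0-based indices of every cytosine immediately 5' to a guanine (CpG)."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i] == "C" and s[i + 1] == "G"]


def methylated_fraction(seq, methylated):
    """Fraction of CpG cytosines in seq whose index is in the set `methylated`."""
    sites = cpg_positions(seq)
    if not sites:
        return 0.0
    return sum(1 for i in sites if i in methylated) / len(sites)


# Hypothetical data: a made-up sequence with five CpG sites, four of them methylated.
example_seq = "ACGTTCGGACGCCGATCGA"
example_methylated = {1, 5, 9, 12}
example_fraction = methylated_fraction(example_seq, example_methylated)
```

In this made-up example four of the five CpG cytosines are marked, which mirrors the genome-wide figure quoted above; a tissue-specific pattern would simply correspond to a different set of methylated positions over the same sequence.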

II. DNA Methylation Represses Gene Expression by Two Different Mechanisms

A. Ectopic Methylation Suppresses Expression

A long line of data has established that in vitro methylation can suppress genes when ectopically introduced into vertebrate cells (18-20). Although these data clearly demonstrated that ectopic methylation can suppress gene expression, it was not clear whether DNA methylation plays a similar role in regulating differential gene expression in vivo, since genes reported to be regulated by reversible methylation were not expressed ectopically or precociously in Dnmt1-deficient mouse embryos (21). However, a critical analysis of these results reveals that hypomethylation is insufficient for expression of genes in the wrong tissue context only if the required transcriptional machinery is not available. These data do not exclude the possibility that demethylation is required for expression of genes in vivo in the appropriate context. DNA methylation might be required to prevent expression of a gene under conditions where the necessary transcription factors are present, as is the case in parental imprinting and X inactivation and as might be the case in some tissue-specific genes. However, demethylation of a gene in a tissue that does not possess the transcription machinery required to express the gene will be of no consequence, as has been seen in Dnmt1-deficient mouse embryos (21). A recent experiment taking advantage of a signal that is required for demethylation of the aprt gene during development has shown that a transgenic aprt that is not demethylated in vivo is not expressed (22). This experiment demonstrates that demethylation is required for expression of genes in vivo.

B. DNA Methylation Can Suppress Genes by Interfering with the Binding of Transcription Factors

The first mechanism by which methylation inhibits gene expression is by interfering with the binding of transcription factors to recognition sites that include a CpG sequence. A number of examples of direct inhibition of binding of transcription factors to regulatory sequences by methylation are documented, including myc (23, 24), AP-2 (25), the cAMP response element CRE (26, 27), and the interaction of CTCF with the parentally imprinted Igf2 insulator (28-31). This mechanism provides a simple explanation for the inhibition of gene expression by methylation. However, a number of studies have established that many transcription factors either do not contain a CpG sequence in their recognition sequence or are not inhibited by methylation even when they do contain a CpG sequence. A classical example of a transcription factor that does not discriminate between methylated and nonmethylated CpGs in its consensus recognition sequence is the ubiquitous Sp1, which controls the expression of many genes that bear CG-rich islands within their promoter region (32, 33). It is clear that an additional mechanism must exist to explain the suppression of CpG island promoters by methylation.

C. Methylated DNA Interacts with Methylated DNA Binding Proteins (Mbds); Mbds Recruit Corepressors and Histone Deacetylases

A second mechanism currently attracting significant interest is an indirect one, whereby methylated cytosines are recognized by a family of methylated DNA-binding proteins (Mbds) such as Mecp2 (34, 35), Mbd1 (36), and Mbd2 (37). The Mbds associate with corepressors such as Sin3A, which recruit histone deacetylases to methylated genes (38-42). Histone deacetylases have been shown to be associated with repressor complexes and are believed to play an important role in inactivating or "closing" chromatin. A number of studies using either a coimmunoprecipitation approach or chromatographic fractionation show that different Mbds are found in known repressor multiprotein complexes such as the chromatin remodeling Mi2 complex (43) and NuRD complex (44). Some of the repression mediated by methylated DNA binding proteins is relieved by treating cells with the histone deacetylase inhibitor trichostatin A (TSA), demonstrating the critical role of histone deacetylation in mediating gene suppression by DNA methylation (38). However, a number of studies have shown that not all repression by Mbds can be removed by inhibiting histone deacetylation (45). Thus, there must be an additional mechanism through which Mbds inhibit the expression of methylated genes.

Different splice forms of Mbd1 (46) and Mbd2 (37) have been identified; however, the differential role of the various Mbds and their different splice forms in mediating gene suppression by methylation is not yet clear. One interesting point is that the interaction of some Mbds such as Mecp2 with methylated DNA is independent of the density of the methylated CpGs (35), whereas other Mbds such as Mbd1 (47) and Mecp1 [possibly containing Mbd2 as the methylated DNA recognition component (41)] require a high density of methylated CpGs for binding. It is possible that they are responsible for regulating different classes of methylated genes.

Another interesting point that has not yet been fully clarified is the profile of expression of the different Mbds. Whereas Mecp2 is not expressed at high levels in either embryonal cells or transformed cell lines (35), Mbd2 is expressed at high levels in breast cancer cells (48) and other cancer cell lines. The different Mbds might play specialized roles during development and cellular transformation and might interact with distinct classes of methylated promoters such as densely methylated CpG islands in tumor suppressor genes. It is well known, as will be discussed below, that tumor suppressors bearing CpG islands are methylated in many tumors (49), so it is tempting to speculate that specific Mbds that interact with densely methylated CpG islands are responsible for suppression of methylated tumor suppressors in tumor cells. The specific Mbds that might be responsible for silencing tumor suppressors are obviously important potential anticancer drug targets. In addition, our unpublished data suggest that the interaction of a specific Mbd with a promoter might be dependent on the combination of transcription factors interacting with the promoter. Studies using chromatin immunoprecipitation (ChIP) with antibodies against specific Mbds will help in delineating the specific groups of promoters that interact with the various Mbds.

Although little is known about the exact functions of the different Mbds and the type of promoters that they regulate, it stands to reason that the diversity of Mbds adds a level of specificity to the interpretation of DNA methylation signals. This diversity allows for clustering of the response to DNA hypermethylation. The group of promoters shut down by methylation depends on the assortment of Mbds expressed in the cell, as well as on the combination of transcription factors interacting with the different promoters. It is possible that this clustering of the response to methylation orchestrates specific programs of genes that are suppressed by DNA methylation and have a coordinated role in cellular life at different points in development, in response to different physiological signals, and in cancer. Another intriguing observation is that some Mbd1 isoforms (46) can repress promoters irrespective of their state of methylation. It is important to know whether this effect is coordinated with the other functions of Mbds, creating orchestrated programs of gene suppression that involve both methylated and unmethylated promoters.

Recent data have associated mutations in the methylated DNA binding protein Mecp2 with Rett syndrome, a progressive neurodevelopmental disorder that is one of the most common causes of mental retardation in females, with an incidence of 1 in 10,000-15,000 (50). This discovery focused attention on the critical role that Mbds might play in human physiology and pathology.

III. Enzymatic DNA Methylation Machinery

A. DNA Methyltransferases

DNA methylation is carried out by DNA methyltransferase enzymes, which catalyze the transfer of a methyl group from S-adenosylmethionine (AdoMet) onto the 5 position of the cytosine ring (51-54). The products of the reaction are S-adenosylhomocysteine (AdoCys) and methylated DNA. The first DNA methyltransferase to be cloned from vertebrates, DNMT1, has been extensively studied in the last decade (55-59) and is responsible for replicating the DNA methylation pattern, as demonstrated by mouse knockout experiments (60). It is believed that it can faithfully do so because it transfers a methyl group to the nascent strand only if a methyl group is present at the corresponding position on the parental strand (61).

Other candidate DNA methyltransferases, DNMT3a and b, have recently been identified. It appears that they play a role in specific methylation events such as methylation of centromeric satellite sequences by DNMT3b and possibly some de novo methylation during early development (62, 63). During development new patterns of methylation are generated de novo. They must be catalyzed by enzymes that are not guided by the methylation of the parental strand. Mutations in DNMT3b were recently associated with ICF syndrome, an immunodeficiency syndrome associated with hypomethylation of centromeric sequences and an increased frequency of chromosomal rearrangements (64, 65). It is still unknown which DNMTs are involved in the different de novo methylation events during development, and it is unclear whether the maintenance enzyme DNMT1 plays such a role. It is clear, however, that DNMT1 plays a critical role in replicating the DNA methylation pattern and is crucial for maintenance of the methylation pattern in somatic cells. In this review, we focus on the regulation of DNA methylation in somatic cells and its consequences for cellular transformation.
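The maintenance rule attributed to DNMT1 in this section (a methyl group is added to the nascent strand only where the parental strand already carries one) can be sketched as a simple copy step after replication. This is a toy model under stated assumptions: strands are reduced to sets of methylated CpG positions, the copy is treated as perfectly faithful, and the names `replicate` and `maintenance_methylate` as well as the example positions are hypothetical.

```python
def replicate(parental_marks):
    """Semiconservative replication: the nascent strand starts with no methyl groups."""
    return set(parental_marks), set()


def maintenance_methylate(parental_marks, nascent_marks, cpg_sites):
    """Methylate a nascent-strand CpG only if the parental strand is methylated there."""
    copied = {site for site in cpg_sites if site in parental_marks}
    return set(nascent_marks) | copied


cpg_sites = [1, 5, 9, 12, 16]   # hypothetical CpG positions
parental = {1, 5, 12}           # hypothetical methylated CpGs on the parental strand

parent_strand, nascent = replicate(parental)
nascent_after = maintenance_methylate(parent_strand, nascent, cpg_sites)
```

Under these assumptions the nascent strand ends up with exactly the parental pattern, and unmethylated CpGs (positions 9 and 16 here) stay unmethylated. A de novo activity, by contrast, would have to add marks without consulting the parental strand, which is why the text assigns it to enzymes that are not guided by parental methylation.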

B. Demethylases

It is clear that during the process of development a sequence of global and site-specific demethylation events is involved in shaping the DNA methylation pattern (66-71). It is also well established that global changes in DNA methylation occur during cellular differentiation (72-75). However, the nature of the enzymatic reaction that removes methyl groups from DNA remains controversial. The main theoretical problem with active demethylation is the strength of the covalent bond between the methyl residue and the cytosine ring (76). Because cleavage of this bond has been considered impossible, it was suggested that demethylation could be accomplished by either repair or replication in the absence of DNA methyltransferase activity. Critical sites might be masked during replication and thus protected from maintenance DNA methylation (6). In accordance with this hypothesis, recent experiments have demonstrated that high-affinity site-specific binding of an ectopically expressed bacterial protein to an engineered recognition element can result in site-specific demethylation (77). This is possibly accomplished by inhibiting access of the DNMT to the protein-bound site. It is not yet clear whether this artificial paradigm resembles authentic demethylation in vivo. Two different repair mechanisms have been proposed to cause site-specific active demethylation. The first mechanism involves a glycosylase that recognizes methylated CpGs and cleaves the bond between the methylated cytosine base and the deoxyribose. The resulting apyrimidinic site is removed and replaced with a nonmethylated cytosine nucleotide by a repair activity. Such a mechanism was suggested to play a role in chicken embryos (78-83) and during differentiation of erythroleukemia cells in culture (75). An alternative repair mechanism involves nucleotide excision of the methylated CpG dinucleotide and its replacement with a nonmethylated cytosine (84). I have previously suggested that a true demethylase that removes the methyl moiety from cytosine must exist, since global repair of the DNA to achieve hypomethylation might seriously threaten the integrity of the genome. Recent data have clearly demonstrated rapid demethylation of the paternal genome in the zygote only hours after fertilization, before the first round of DNA replication commences (85). This must be catalyzed by an active demethylase. Following the demonstration that P19 cells that ectopically express Ras also express high demethylating activity (86), we purified a bona fide demethylase from human lung cancer A549 cells (87). The demethylase enzyme from A549 cells catalyzes the hydrolytic cleavage of the bond between the methyl group and the cytosine ring, releasing the methyl group in the form of methanol. The demethylase activity is processive both in vitro (88) and in vivo (N. Cervoni et al., unpublished data). The processivity of the enzyme is probably critical for the rapid global hypomethylation observed during early development. It is also consistent with the regional differences observed in methylation of CG islands, suggesting that a whole region rather than one specific site is demethylated in active genes. We also demonstrated that the methylated DNA binding protein Mbd2 (40) possesses a demethylase activity (89). The assignment of a demethylase activity to a protein that was independently discovered as a methylated DNA repressor has triggered obvious controversy in the field (40). Our unpublished data suggest a bifunctional role for demethylase/Mbd2 as both a suppressor of methylated genes and a demethylase. It is possible that both functions reside in one protein to coordinate a program of gene expression that requires suppression of some methylated genes and activation of others. In summary, our data and the fact that demethylase activity is present in somatic cancer cells raise the possibility that DNA methylation is a reversible biological modification (87), and that the observed pattern of methylation is an equilibrium of methylation and demethylation activities. It is not yet clear whether the demethylase that we have identified plays a role in the generation of DNA methylation patterns during development. It is similarly unclear whether it plays a role in the inheritance of the DNA methylation pattern during replication. However, since we have demonstrated that a demethylase activity indeed exists in mammals, it stands to reason that a similar activity is involved in demethylation during development. The fact that a demethylase is found in somatic cells, and especially in cancer cells, has obvious implications for our understanding of how the DNA methylation pattern is inherited and maintained in normal cells as well as in cellular transformation (90). Since this review focuses on the maintenance of DNA methylation patterns and not on development, we will limit our discussion to the role that demethylase might play in replication of the DNA methylation pattern and in cellular transformation.
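The idea that the observed methylation level reflects an equilibrium of opposing activities can be made concrete with a toy two-state model. The sketch below is only an illustration under strong assumptions: every site is treated identically, methylation and demethylation are first-order processes, and the rate constants `k_meth` and `k_demeth` are hypothetical parameters that were not measured in the work reviewed here.

```python
def steady_state_fraction(k_meth, k_demeth):
    """Fraction of sites methylated when gain and loss balance.

    Assumes dm/dt = k_meth * (1 - m) - k_demeth * m, so that the
    steady state is m* = k_meth / (k_meth + k_demeth).
    """
    return k_meth / (k_meth + k_demeth)


def simulate(m0, k_meth, k_demeth, steps, dt=0.01):
    """Integrate the same equation with simple Euler steps."""
    m = m0
    for _ in range(steps):
        m += dt * (k_meth * (1.0 - m) - k_demeth * m)
    return m


# Starting fully unmethylated or fully methylated, the simulated
# fraction approaches the same equilibrium value.
from_unmethylated = simulate(0.0, k_meth=3.0, k_demeth=1.0, steps=5000)
from_methylated = simulate(1.0, k_meth=3.0, k_demeth=1.0, steps=5000)
```

In this simplified picture, a shift in either activity moves the equilibrium level rather than erasing the pattern, which is the sense in which methylation would be reversible.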

IV. Coordination of DNA Replication and DNA Methylation Machineries

During replication, both the genome and the epigenome have to be faithfully replicated. The faithful replication and accurate maintenance of the DNA methylation pattern are obviously critical for the integrity of genome function. We reasoned that vertebrate cells must have established multiple mechanisms to ensure that the replication of the epigenome and the genome is accurately coordinated (73, 91). We also reasoned that deregulation of the DNA methylation machinery must play a critical role in disease states involving changes in genes (1-3). The DNA methylation machinery is also an attractive pharmacological target because agents that interfere with its components might be utilized to correct an aberrant regulation of the DNA methylation enzymes or to force a cell to execute a gene expression program that is of therapeutic utility (1-3, 92). We therefore concentrated on elucidating the regulation of the DNA methylation machinery in somatic cells.

A. Regulatory Regions of Dnmt1 Interact with Nodal Mitogenic and Oncogenic Signaling Pathways

Our primary focus was on the regulation of Dnmt1, the enzyme that plays a critical role in replication of the DNA methylation pattern. We studied the transcriptional regulatory regions of murine and human Dnmt1 (93-95). The Dnmt1 gene is composed of a large number of exons (35) spread over 60 kb of genomic sequence (93). We have identified three downstream promoters, which are located 5' to the second, third, and fourth exons, and a strong upstream housekeeping promoter residing in a CpG island 5' to the first exon (93). The most striking observation in the regulatory domain of Dnmt1 is the presence of AP-1 recognition elements upstream of the proximal promoters (93, 94). AP-1 regulatory sequences are known to be activated by the nodal mitogenic and protooncogenic Ras-c-Jun signaling pathway (96-99). We have demonstrated that the AP-1 recognition region acts as an enhancer that is activated by c-Jun (93, 94). We also demonstrated that levels of cellular dnmt1 mRNA could be reduced by downregulating the Ras signaling pathway by ectopic expression of either human GAP or a dominant negative c-jun (100). Similarly, overexpression of Ha-ras in P19 cells results in increased dnmt1 transcription and mRNA levels (94). These data suggest that the expression of dnmt1 is coordinated with other cellular events that regulate DNA synthesis. This level of regulation might be required to ensure that DNA synthesis does not proceed in the absence of DNA methylation, since replication in the absence of adequate DNMT1 activity will result in a loss of the DNA methylation pattern (Fig. 1).

B. Dnmt1 Expression Plays a Causal Role in Cellular Transformation

The responsiveness of Dnmt1 to mitogenic and protooncogenic pathways can explain the increased levels of Dnmt1 observed in cancer cells (101, 102). The observation that the expression of Dnmt1 is linked with a nodal oncogenic pathway raises the question of whether it is a downstream effector of this pathway (1). To address this question, we expressed an antisense mRNA to dnmt1 in Y1 cells, an adenocarcinoma cell line that bears amplified copies of the Ha-ras gene and expresses high levels of Ha-ras and dnmt1 mRNA (103). Expression of dnmt1 antisense reverses tumor growth in culture and in vivo (103), and administration of dnmt1 antisense oligonucleotides inhibits Y1 tumor cell growth in vivo in a syngeneic mouse (104). Our hypothesis that dnmt1 is downstream of the AP-1 signaling pathway is supported by recent data showing that dnmt1 is one of the genes induced by forced expression of c-fos (a partner of c-Jun in the AP-1 complex) and that inhibition of dnmt1 by antisense expression reverses cellular transformation induced by c-fos (105). Expression of Dnmt1 is regulated by Ras in human T cell lines (106), and hypermethylation is correlated with activating mutations of Ras in some human colon tumors (107, 108), although it is unclear whether hypermethylation of tumor suppressor genes is indeed a consequence of increased DNMT1 activity.

[Figure 1 graphic not reproduced. Labels recoverable from the diagram: Rb, APC, β-catenin, TCF-1, Ras, AP-1, and mRNA destabilization above the gene map; DMAP1, TRD, FTR, repression, and catalytic below it; methylation, replication, and tumor suppressor genes at the bottom.]

FIG. 1. The DNMT1 gene is regulated by nodal cell cycle control signals and encodes a multifunctional protein with potential cell-cycle regulatory activities: a model. The exon-intron structure of the DNMT1 gene is shown. Horizontal arrows indicate transcription initiation sites (93). The organization of the exons is consistent with the concept that DNMT1 encodes a complex protein consisting of several distinct functional domains (56). The different regulatory pathways that are postulated to regulate the DNMT1 gene are indicated above the map of the gene. The different functional domains that are encoded by the gene are indicated below the map of the gene. The gene is transcriptionally upregulated by known oncogenic and mitogenic pathways such as the Ras-AP-1 pathway and possibly the APC-TCF1 pathway. The DNMT1 mRNA is possibly destabilized in arrested cells through a cis-acting signal in the 3' UTR of the gene. The different protein domains encoded by the gene can potentially regulate the cell cycle either by inhibiting the expression of tumor suppressor genes or by their interactions in the replication fork. DMAP1: DNMT1-associated protein 1, which was shown to link DNMT1 and HDAC2 (137); PCNA: the exons encoding the domain of DNMT1 that binds to the proliferating cell nuclear antigen PCNA; TRD: the exons encoding the target recognition domain of DNMT1 (F. D. Araujo et al., unpublished); FTR: the exons encoding the replication fork targeting region of DNMT1; repression: the exons encoding a domain of DNMT1 involved in binding Rb and suppressing expression of E2F-responsive elements and the p19/ARF tumor suppressor promoter; catalytic: exons encoding the DNA methyltransferase catalytic domain.

C. Regulation of dnmt1 by the Retinoblastoma Tumor Suppressor Rb

If dnmt1 is indeed critical for the progression of the cell cycle, it should also be regulated by other critical cellular regulatory pathways. In accordance with this hypothesis, we have recently shown that ectopic expression of SV40 T antigen induces dnmt1 expression (109). One of the important mechanisms through which T antigen exerts its effects is its ability to interact with and suppress the tumor suppressor Rb (109). A mutated T antigen that is incapable of interacting with Rb is also incapable of inducing Dnmt1, suggesting that Rb represses dnmt1 expression either directly or indirectly (109). Another mode of involvement of Rb in regulating dnmt1 has recently been described by us (A. D. Slack et al., unpublished data). In differentiating cells, Rb acts synergistically with the protooncogene c-Jun to activate dnmt1. This activation of dnmt1 is mediated through a noncanonical AP-1 recognition signal upstream of the third exon. Whereas c-Jun does not bind this site in the absence of Rb, the presence of both Rb and c-Jun results in formation of an AP-1 complex and strong synergistic activation of this gene. The cooperative activation of dnmt1 by Rb and c-Jun might play a role in regulating expression of dnmt1 during development. Whereas a number of studies have shown that dnmt1 is expressed during cellular differentiation (110) and that this expression is necessary (111), the role that DNMT1 might play during cellular differentiation is unknown. It stands to reason that the synergistic activation of dnmt1 by Rb and c-Jun might also play a role in mitogenesis and oncogenesis, whereby a protooncogene recruits a tumor suppressor to enhance its mitogenic and protooncogenic activity.

D. The APC-TCF Pathway

Another possible link between Dnmt1 and critical cellular control pathways is the APC-β-catenin-TCF pathway (112-114). The APC (adenomatous polyposis coli) gene is mutated in many cases of familial colon cancer. β-Catenin associates with nuclear TCFs, forming a bipartite transcription factor, and the cytoplasmic tumor-suppressor protein APC binds to β-catenin, causing its destruction. In APC-deficient colon carcinoma cells, β-catenin accumulates and is constitutively complexed with TCF, resulting in transcriptional activation of TCF protooncogenic target genes such as c-myc (115). Min mice bearing a mutation in the mouse homolog of the APC gene spontaneously develop adenomatous polyps in the colon (116). When Min mice are genetically crossed with heterozygous dnmt1 knockouts, they show a reduction in polyp formation (116), suggesting that dnmt1 is a downstream target of APC signaling. This experiment provides strong genetic evidence linking the APC mutation and dnmt1 in vivo. The sequence of the minimal promoter of human Dnmt1 contains a number of TCF recognition sites, suggesting that TCF might be activating Dnmt1 directly, thus providing a molecular explanation for the genetic link between Dnmt1 and APC (56). In summary, one possible mechanism of coordinating DNA replication and methylation is regulation of expression of Dnmt1 by nodal mitogenic regulatory circuits. We and others have also found that activation of Dnmt1 by these pathways is essential for the maintenance of cellular transformation. Thus, the study of how Dnmt1 is regulated has unraveled a molecular link between Dnmt1 and cellular transformation (2, 90) (Fig. 1).

E. Cell-Cycle Regulation of dnmt1

In addition to the transcriptional upregulation of dnmt1 by mitogenic signaling pathways, we have uncovered another mechanism of regulation of Dnmt1 that is posttranscriptional. This might be as critical for cellular growth and transformation as the transcriptional regulation discussed above. Dnmt1 mRNA is not present in arrested cells but is highly induced at the G1-S boundary (91). Expression remains high during the S phase and is then reduced (91). However, runoff transcription experiments have demonstrated that Dnmt1 is transcribed throughout the cell cycle (91), which suggests that the levels of dnmt1 mRNA are regulated with the cell cycle at the posttranscriptional level. We have recently identified a conserved 3' untranslated mRNA element that is responsible for this response. A 50-nt sequence in this region is 100% conserved between human and chicken Dnmt1. This 3' RNA element interacts with a 40-kDa protein that is present only in growth-arrested cells (N. Detich et al., unpublished data). Although the sequence of this protein is still unknown, it is possible that this protein plays a critical role in ensuring that dnmt1 is not expressed precociously at the wrong phase of the cell cycle. This protein might play a critical role in cell-cycle control and cellular transformation, as will be discussed below (Fig. 2).
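The reasoning that constant transcription combined with changing mRNA abundance implies posttranscriptional control can be illustrated with a simple steady-state calculation. The sketch assumes first-order mRNA decay, so that the steady-state level equals the synthesis rate divided by the decay rate; all numbers and names below are hypothetical and are not taken from the experiments cited.

```python
def steady_state_mrna(transcription_rate, decay_rate):
    """Steady-state mRNA level for constant synthesis and first-order decay."""
    return transcription_rate / decay_rate


# Hypothetical values: transcription is identical in both states,
# as the runoff experiments suggest; only mRNA stability differs.
transcription_rate = 10.0
decay_rate = {"arrested": 5.0, "S": 0.5}

levels = {phase: steady_state_mrna(transcription_rate, k)
          for phase, k in decay_rate.items()}
fold_change = levels["S"] / levels["arrested"]
```

With these illustrative values, a tenfold difference in mRNA stability alone yields a tenfold difference in mRNA abundance without any change in transcription.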

F. Expression of Demethylase/Mbd2 and Mbd3 Is Regulated with the Cell Cycle

An important question is whether the inheritance of the methylation pattern is exclusively carried out by DNMT1 or whether a demethylase activity participates in the replication fork, ensuring that the DNA methylation pattern is accurately replicated. The consensus is that since DNMT1 is highly specific to hemimethylated DNA, it will accurately copy the parental-strand methylation pattern; therefore, there is no theoretical need for any other methylation or demethylation activity. However, since the maintenance activity of DNMT1 has been demonstrated only in vitro (61), it is still unclear whether DNMT1 has any de novo methylation activity in vivo at the replication fork. A number of studies have indicated that DNA methylation patterns tend to drift and progress from heavily methylated regions, suggesting that DNMT1 might use signals other than the template methylation (91). Such drifting in DNA methylation has been identified in tumors and aging humans (117, 118). Drifting of the DNA methylation pattern might result in suppression of critical genes, and so must be controlled. It stands to reason that the cell has developed mechanisms to counteract this drifting DNA methylation by expressing a demethylase activity during DNA replication. We therefore propose that the inheritance of the methylation pattern is an outcome of an equilibrium of methylation and demethylation activities. Using coimmunoprecipitation assays, it was demonstrated that demethylase/Mbd2 and Mbd3 are indeed resident in the replication fork during late S and form a complex with DNMT1 (119). Our unpublished data demonstrate that both demethylase/Mbd2 and Mbd3 are regulated with the cell cycle and are expressed at early G1 and late S phases (N. Detich et al., unpublished) (Fig. 2). Thus, all components of the DNA methylation machinery are coordinated with the cell cycle and possibly play an active role in maintaining the integrity of the DNA methylation pattern during replication. One interesting question is what determines the specificity of demethylase in protecting DNA from hypermethylation during replication. Our recent unpublished data suggest that the accessibility of methylated DNA to demethylase is determined by the activity state of its chromatin (N. Cervoni et al., unpublished). According to this model, the unmethylated state of active genes is maintained, since demethylase can access only open chromatin regions. Thus, the presence of demethylase at the replication fork during replication ensures that active genes are protected from methylation. In summary, one mode of coordinating the S phase of the cell cycle with DNA methylation is by regulating dnmt1 mRNA and protein levels with the initiation of S phase. Since dnmt1 expression is induced upon entrance to S, DNA replication does not proceed in the absence of DNA methyltransferase, ensuring that the replicating DNA is properly methylated. The coordinate expression of other components of the DNA methylation machinery, such as demethylase/Mbd2 and the methylated DNA-binding protein Mbd3, suggests that they might act in concert with DNMT1 to accurately replicate the DNA methylation pattern (Fig. 2).

[Figure 2 graphic not reproduced. The plot has a vertical scale running from 0 to 120 and a horizontal axis labeled time (h) with marks from 0 to 48; a Dnmt1 curve and the S phase are labeled.]

FIG. 2. Different members of the DNA methylation machinery are coordinated with the cell cycle. A scheme of the expression profile of DNMT1, demethylase/Mbd2, and Mbd3 during the cell cycle. The scheme is based on Northern analysis of the indicated mRNAs from serum-starved mouse BALB/c cells that were stimulated to enter into the cell cycle with serum. The time points indicate time after addition of serum. The percentage of cells in S at the indicated time points was determined by a FACS analysis of propidium iodide stained nuclei (N. Detich et al., unpublished data).

G. DNA Methylation Occurs Concurrently with DNA Replication

Another important means of coordinating the replication of the DNA methylation pattern with DNA replication is the positioning of DNMT1 in the replication fork. Thus, DNMT1 methylates DNA simultaneously with its synthesis, ensuring that the DNA methylation pattern is copied accurately together with the genetic component. We have shown that nascent DNA is fully methylated immediately following its synthesis (120). The general abundance of methyl groups in newly synthesized DNA, as determined by a nearest-neighbor analysis, is similar to that of total DNA. Similarly, specific CpG sites close to replication initiation points are methylated in nascent DNA immediately following replication, as demonstrated by a bisulfite-mapping analysis (120). Interestingly, an origin of replication is positioned in the 5' regulatory region of the human Dnmt1 gene. It is unclear whether the positioning of an origin of replication in the transcriptional regulatory region of Dnmt1 is utilized to coordinate expression of Dnmt1 and initiation of replication (121).

H. DNMT1 Is Targeted to the Replication Fork and Complexed with the Replication Fork Protein PCNA

Concurrent methylation and replication is possible because DNMT1 is targeted to the replication fork by a specific domain in the protein (122). This domain binds the proliferating cell nuclear antigen PCNA (123). It is interesting that DNMT1 binds PCNA at the same position as p21, a tumor suppressor that inhibits DNA replication by forming a complex with PCNA (123). We have previously suggested that this competitive interaction of p21 with PCNA serves as a mechanism to ensure that DNA replication cannot proceed in the absence of DNMT1, as will be further discussed below (2).
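The competition between DNMT1 and p21 for one site on PCNA can be written, under textbook equilibrium assumptions, as a standard competitive-binding expression. This is an illustrative sketch only: it assumes a single shared site, binding at equilibrium, and free concentrations $[D]$ (DNMT1) and $[P]$ (p21) with dissociation constants $K_D$ and $K_P$; these symbols and any values for them are hypothetical and were not reported in the work reviewed here.

```latex
f_{\mathrm{DNMT1}} = \frac{[D]/K_D}{1 + [D]/K_D + [P]/K_P},
\qquad
f_{\mathrm{p21}} = \frac{[P]/K_P}{1 + [D]/K_D + [P]/K_P}
```

In this simplified picture, lowering $[D]$ raises the fraction of PCNA occupied by p21, which approaches $([P]/K_P)/(1 + [P]/K_P)$ as $[D]$ tends to zero, in line with the proposal that loss of DNMT1 frees PCNA to bind p21.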

I. DNMT1 Regulates the Expression of Growth Suppressor Genes Directly by Protein-Protein Interactions

One novel and surprising mechanism that coordinates DNMT1 with cell-cycle regulatory circuits is the regulation of tumor suppressors directly by DNMT1 by a mechanism that does not involve DNA methylation (124). The prevailing common wisdom has been that DNA methyltransferase can alter gene expression only by methylating certain regulatory sites in genes. The methylation of the DNA causes silencing by the mechanisms described above. However, our recent data suggest that DNMT1 can regulate gene expression by a mechanism that does not involve DNA methylation. Treatment of the human lung cancer cell line A549 with dnmt1 antisense oligonucleotides, hairpin oligonucleotides that inhibit DNMT1, or 5-azaCdR results in a rapid increase in the expression of the tumor suppressor p21 (124). Inhibition of Dnmt1 mRNA by Dnmt1 antisense induces an unmethylated p21 promoter-luciferase reporter construct, and since p21 is not methylated even in untreated cells, this induction of p21 does not result from inhibition of DNA methylation. A number of possible mechanisms can explain this unexpected observation that DNMT1 can regulate gene expression without DNA methylation (90). First, inhibition of DNMT1 and its removal from the replication fork frees PCNA and allows it to form a quaternary complex with p21, cdk, and cyclin (125), resulting in inhibition of Rb phosphorylation. An increase in the relative level of hypophosphorylated Rb can lead to multiple changes in gene expression, including activation of Sp1 (126-130). The p21 promoter is regulated by Sp1 (131-133). An alternative explanation is based on the recently discovered protein-protein interactions between DNMT1, Rb, and E2F (134). This complex suppresses genes that are regulated by E2F. Inhibition of DNMT1 will therefore result in induction of genes bearing E2F recognition sequences such as p21 (135). A third possibility is that DNMT1 suppresses the expression of p21 by its ability to interact with histone deacetylase 1 (136) and 2 (137). Histone deacetylase inhibitors were previously shown to be strong inducers of p21 (138). DNMT1 inhibitors might act by an identical mechanism. It is also possible that DNMT1 acts to suppress the expression of tumor suppressor genes by all three mechanisms.


In summary, a new level of coordination of DNMT1 levels with growth regulatory circuits has recently been unearthed. In addition to regulation of DNMT1 levels by growth regulatory circuits, DNMT1 can regulate growth regulatory circuits by altering expression of tumor suppressors such as p21 (124) or p19/ARF (134) by protein-protein interactions. The picture that emerges from functional analyses of the DNMT1 protein is that it is a multifunctional protein that bears an enzymatic DNA methyltransferase domain as well as cell-cycle regulatory functions (Fig. 1). The cell-cycle regulatory domains act in concert with other growth regulatory circuits to inhibit tumor suppressor gene expression and to compete p21 away from the replication fork, thus enabling initiation of DNA replication. Tumor suppressor genes bear regulatory sequence elements for transcription factors such as Sp1 and E2F1, similar to those of housekeeping genes that are needed for replication. Once the DNA synthetic phase is initiated, the cell must simultaneously suppress tumor suppressors and induce DNA synthetic proteins. E2F1-DNMT1 complexes might play a role in discriminating tumor suppressors from other housekeeping genes and in silencing them during S phase. Since DNMT1 is induced upon entrance to S (91), it is well positioned to suppress tumor suppressor gene expression upon initiation of the S phase of the cell cycle. However, it is not yet clear what elements in tumor suppressor genes target them for repression by the DNMT1 complex. We propose that the assembly of cell-cycle regulatory functions and DNA methyltransferase activities into one polypeptide has evolved to ensure that DNA replication will never proceed in the absence of DNA methylation, and that DNA methylation will not occur in the absence of DNA replication. The hypothesis that Dnmt1 evolved from an earlier replication-associated protein is supported by the discovery of a protein related to DNMT1 in Drosophila. Drosophila do not express DNA methyltransferases and do not have methylated DNA (139). This DNMT1-related protein interacts with PCNA similarly to DNMT1 but does not possess catalytic methyltransferase activity. The fact that DNMT1 has cell-cycle regulatory functions in addition to and independent of its DNA methylation activity highlights the significance of the cell-cycle regulation of Dnmt1 expression and its potential role in controlling the cell cycle.

V. Inhibition of DNMT1 Inhibits DNA Replication in Human Lung Cancer Cells

The close interrelationship between DNMT1 and DNA replication and growth regulatory circuits raises the possibility that DNMT1 is critical for replication, at least under some conditions. Using different inhibitors of DNMT1, we tested the hypothesis that inhibition of DNMT1 results in inhibition of DNA replication (140). We used three different inhibitors of DNMT1 to ascertain that the results obtained do not result from peculiar properties of a single inhibitor. Expression of Dnmt1 antisense mRNA, or treatment of the A549 human lung cancer cell line with either antisense oligonucleotides or modified hairpin inhibitors, results in inhibition of DNA synthesis activity and halts the progression of the cells through the cell cycle (140). DNMT1 inhibitors prevent the firing of origins of replication at three loci: the β-globin origin, the c-myc origin, and two origins of replication located in the Dnmt1 locus. Both methylated and unmethylated origins are inhibited, suggesting that loss of methylation of origin sequences is not responsible for this effect. The mechanism through which inhibition of DNMT1 affects replication is still unknown. It is possible that the presence of DNMT1 in the replication fork is required for initiation of DNA replication. Alternatively, it is possible that since DNMT1 forms protein-protein interactions that repress tumor suppressors, as discussed above (124, 137), inhibition of DNMT1 triggers induction of tumor suppressor genes. As discussed above, the induction of p21 by inhibition of DNMT1 does not involve DNA methylation and can be explained by the protein-protein interactions of DNMT1. Another alternative explanation is that inhibition of DNMT1 and its removal from the replication fork frees PCNA to interact with p21 and arrest DNA synthesis (90). As discussed above, DNMT1 and p21 interact competitively with the same site on PCNA (123).

VI. DNMT1 and Oncogenesis

A. Global Hypomethylation and Regional Hypermethylation

The positioning of DNMT1 at nodal points in critical growth regulatory circuits might provide a new explanation for the involvement of DNMT1 and DNA methylation in cellular transformation and cancer. An understanding of the mechanistic role of DNMT1 in cellular transformation has important implications for the utility of DNMT1 inhibitors in cancer and the proper design of DNMT1 inhibitors (2, 92). It has been a longstanding observation that aberrations in the DNA methylation pattern are a hallmark of many tumor cells (49, 141-148). However, the nature of these changes has been confusing since both global hypomethylation (149-161) and regional hypermethylation (49, 141, 143, 146, 147, 162-184) of tumor suppressors and other specific genes that are known to be silenced in cancer cells were observed.


B. Overexpression of DNMT1 in Cancer Cells

A separate line of evidence linking DNA methylation and cancer is the interrelationship between overexpression of DNMT1 and cellular transformation (1, 3, 90). First, high levels of Dnmt1 mRNA and DNMT1 activity were reported in tumor samples and in cancer cells (101, 102, 185), and Dnmt1 expression is regulated by nodal protooncogenic signaling pathways, as discussed above. Second, ectopic expression of Dnmt1 results in cellular transformation (105, 109, 186). Third, inhibition of Dnmt1 reverses tumorigenesis (103, 104, 116), as discussed above. The obvious question is whether there is a link between the induction of DNMT1 expression by oncogenic pathways, the requirement for high levels of DNMT1 expression in cellular transformation, and the hypermethylation of tumor suppressor genes.

C. Hypothesis I: Overexpression of DNMT1 in Cancer Cells Causes Hypermethylation of Tumor Suppressor Genes

An attractive and simple model that links these two lines of evidence is that overexpression of dnmt1 causes hypermethylation of tumor suppressor genes. Ample evidence demonstrates that hypermethylation of tumor suppressor genes is a common occurrence in tumors, as discussed above, and that inhibition of dnmt1 can lead to demethylation and activation of tumor suppressor genes (187-189). However, several problems cast doubt on this model. First, DNMT1 is a maintenance methyltransferase that is very efficient in accurately copying a methylation pattern but very weak in de novo methyltransferase activity. It is not yet clear whether an increase in DNMT1 activity can lead to a significant change in DNA methylation. Although ectopic expression of dnmt1 leads to methylation of tumor suppressor genes (190), this process is slow and cannot explain the relatively rapid transformation observed (105, 109). Second, the hypermethylation observed in tumor cells is specific to tumor suppressor genes and other genes that confer a selective advantage upon the cell. DNMT1, however, is a general methyltransferase that does not have a distinct sequence selectivity beyond the CpG dinucleotide sequence. An increase in a general methyltransferase activity should therefore lead to a global rise in CpG methylation. This is not what is observed in cancer cells, where the global level of methylation is reduced relative to normal cells, as discussed above. Third, recent studies have shown that the increase in DNMT1 observed in tumor cells corresponds to the relative increase in DNA synthesis. The methylation capacity, which is defined as the amount of DNMT1 per replicating DNA unit, is therefore not increased in cancer cells (191). Fourth, studies that correlate the level of DNMT1 activity and the state of methylation of tumor suppressor genes have failed to show a clear correlation (191).
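The third argument rests on simple arithmetic, which the following sketch makes explicit. The definition (amount of DNMT1 per replicating DNA unit) is taken from the text; the function name and all numerical values are hypothetical and chosen only to show how a higher absolute DNMT1 level can coexist with an unchanged methylation capacity.

```python
def methylation_capacity(dnmt1_amount, replicating_dna_units):
    """DNMT1 per replicating DNA unit, as defined in the text."""
    if replicating_dna_units <= 0:
        raise ValueError("replicating_dna_units must be positive")
    return dnmt1_amount / replicating_dna_units


# Hypothetical, arbitrary units: the tumor sample has four times more
# DNMT1 but also four times more replicating DNA.
normal_capacity = methylation_capacity(dnmt1_amount=10.0, replicating_dna_units=2.0)
tumor_capacity = methylation_capacity(dnmt1_amount=40.0, replicating_dna_units=8.0)
```

In this illustration both capacities are equal, so a rise in total DNMT1 by itself would not predict hypermethylation.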



D. Hypothesis II: Deregulation of Cell-Cycle Control of DNMT1 in Cancer Cells Results in Disruption of Cell Regulatory Circuits

We propose an alternative hypothesis whereby DNMT1 causes cell transformation by interfering with cell-cycle regulatory circuits; this effect is mediated by its protein-protein interactions and not by DNA methylation, as discussed above. We suggest that a critical property of DNMT1 is its coordination with other growth regulatory circuits in the cell, as discussed above and as supported by recent data (192). Recent immunochemical analysis comparing the expression of cell cycle markers and DNMT1 in colorectal cancers with normal cells suggests that the coordinated cell cycle regulation of DNMT1 is disrupted in colorectal cancer cells in vivo (193). Whereas in normal cells there is a concordance between expression of cell proliferation markers and DNMT1, this is disrupted in carcinoma cells. For example, DNMT1 is expressed in carcinoma cells that also express markers that distinguish arrested cells, such as p21 (193). We suggest that the main defect in DNMT1 expression in cancer cells is not a change in its absolute levels, but the loss of its coordinated cell-cycle regulation. Since DNMT1 expression regulates nodal cell cycle controls, aberrant expression of DNMT1 at inappropriate phases of the cell cycle overrides normal growth inhibitory signals. DNMT1 suppresses the expression of tumor suppressors by complexing with Rb and E2F (134). The increased abundance of DNMT1 also displaces p21 from PCNA and enables the formation of a DNMT1-PCNA complex in the replication fork. The combined effect of DNMT1 results in overriding arrest signals and abnormal entry into the cell cycle.

Under normal growth regulatory circuits, DNMT1 is expressed only at the G1-S boundary (91). The normal induction of DNMT1 at the G1-S boundary might play a role in altering the balance between the cellular tone of tumor suppressors, such as p21, and proteins required for DNA synthesis, such as PCNA. DNMT1 might also play an important role during the DNA synthetic phase by inhibiting the expression of tumor suppressors through its interactions with HDAC1 (136), HDAC2 (137), or Rb (134). Why does DNMT1 repress the expression of tumor suppressor genes? Tumor suppressors are regulated by the same transcriptional activators that regulate other housekeeping genes required for DNA synthesis. The promoters of tumor suppressors bear CpG islands that are also present in essential "DNA replication" housekeeping genes. These promoters are regulated by the ubiquitous transcription factors Sp1 and E2F1 (131, 133, 194-196). This presents a challenge to the cell, since it is required to simultaneously repress tumor suppressors and activate DNA synthetic genes that bear similar promoter elements. We suggest that DNMT1 selectively inactivates tumor suppressor promoters through its specific protein-protein interactions and contributes to the decrease in the cell arrest tone during the DNA synthetic phase. When normal growth stimulatory signals are extinguished, DNMT1 expression is downregulated, enabling a shift in the relative level of tumor suppressors in the cell; as a consequence, cell growth is arrested. Our model is consistent with the observation that while the total level of DNMT1 is increased in tumors, it matches a similar increase in DNA synthetic activity. We suggest that any oncogenic signal that increases DNMT1 in arrested cells leads to a commensurate increase in DNA synthesis, since DNMT1 stimulates entry into the cell cycle. Because all the effects of DNMT1 on the cell cycle are accomplished through its protein-protein interactions, it is clear why there is no general increase in global methylation in cancer cells.

E. Why Are Tumor Suppressors Hypermethylated in Cancer Cells?

One question remains to be addressed: What is the mechanism behind the hypermethylation of tumor suppressor genes in cancer cells? As described above, methylation of CpG islands in cancer cells presents itself in two main paradigms. The first and most common paradigm is that sporadic methylation occurs at different tumor suppressor genes in different cancers. The second paradigm is the methylator form, a subset of tumors that are methylated at multiple CpG island-bearing genes (197). The first paradigm of methylation of CpG islands in tumors could be explained by a selection model.

F. Methylation of Tumor Suppressors Confers a Selective Advantage

We suggest that the state of methylation of any gene is a steady-state equilibrium of methylation and demethylation. In normal CpG islands, demethylation is dominant, and our unpublished data suggest that the extent of demethylation is determined by the state of chromatin configuration. Whereas the steady state is faithfully maintained in general, there is a normal age-dependent drift of methylation from hypermethylated regions toward unmethylated regions (117, 198-201). Since hypermethylation of CpG islands contributes to their silencing, there is a significant advantage to a cell bearing a methylated tumor suppressor allele. Thus, selection will tend to fix the rare methylated alleles in the population. Perhaps the strongest evidence that tumor suppressor hypermethylation is driven by selection is the observation that in a heterozygous tumor bearing two alleles of p16, mutated and wild-type, only the wild-type allele is hypermethylated (178). Because the two alleles are of almost identical sequence and position in the genome, it is hard to understand how the DNA methylation machinery would selectively identify the wild-type allele. A selection mechanism, on the other hand, could easily explain the data. Since there is no selective advantage whatsoever in methylating a mutant allele of a tumor suppressor that is transcribed but does not produce an active protein, such a methylation event is not going to be selected. Methylation of a wild-type allele of a tumor suppressor, however, confers a strong selective advantage and is therefore fixed.
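The selection argument can be illustrated with a toy population model. This is a minimal sketch under stated assumptions, not a model from the chapter: the function name, the per-generation epimutation rate, and the size of the growth advantage are all hypothetical values chosen only to show the qualitative behavior. Each generation, a small fraction of unmethylated alleles becomes methylated by chance, and cells carrying a methylated allele then grow with a relative advantage (zero for a mutant allele, whose silencing changes nothing).

```python
def methylated_fraction(generations, epimutation_rate, selective_advantage, start=0.0):
    """Fraction of cells carrying a methylated allele after some generations.

    All parameters are hypothetical illustration values, not measured rates.
    """
    p = start
    for _ in range(generations):
        # Rare, random de novo methylation of still-unmethylated alleles.
        p = p + epimutation_rate * (1.0 - p)
        # Selection: methylated cells have relative fitness 1 + advantage.
        fit = p * (1.0 + selective_advantage)
        p = fit / (fit + (1.0 - p))
    return p


# Wild-type allele: silencing it removes a functional tumor suppressor (advantage > 0).
wild_type = methylated_fraction(200, 1e-6, 0.10)
# Mutant allele: already inactive, so methylating it confers no advantage.
mutant = methylated_fraction(200, 1e-6, 0.0)
print(wild_type, mutant)
```

With these assumed numbers, the methylated wild-type allele approaches fixation while the methylated mutant allele remains rare, which is the asymmetry the p16 observation (178) is taken to reflect.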

G. The Hypermethylator Phenotype

The hypermethylator phenotype, which is characterized by concurrent methylation of multiple CpG islands in a subset of tumors, is consistent with the presence of a common factor that regulates the methylation pattern of all or many CpG islands and is differentially expressed in these tumors. The factor might be a protein that protects CpG islands from aberrant methylation and is downregulated in cells expressing the hypermethylator phenotype. Alternatively, these cells might have upregulated either a factor that targets CpG islands for methylation or a DNMT that is selective for CpG islands.

H. The Possible Role of de Novo DNMTs

The factors that are responsible for the hypermethylator phenotype are still unknown. Our results discussed above suggest that an active demethylase is expressed in cancer cells (87). It might play a critical role in maintaining hypomethylated CpG islands. To fully understand how the pattern of methylation is altered in the hypermethylator phenotype, we have to know the relative contribution of the demethylase(s), DNMT1, and DNMT3a and b to the final methylation pattern of CpG islands. The critical question is what determines the accessibility of the different members of the methylation machinery to CpG islands and what decides their target selectivity. As far as DNMT1 is concerned, the common model is that substrate recognition and methyl transfer are directed by the pattern of methylation of the template (6). Therefore, there is no apparent need for additional factors to explain its selectivity. However, the fact that the hypermethylator phenotype occurs might imply that either DNMT1 has some de novo activity that is independent of the template methylation pattern or that other de novo DNMTs are involved. This must be the case since CpG islands are unmethylated in the normal counterparts of tumor cells. However, a recent paper provides further evidence that methylation of tumor suppressor genes is not caused by overexpression of DNMT1 (202). In this paper, human colorectal carcinoma cells that lacked DNMT1 owing to disruption of the dnmt1 gene through homologous recombination still maintained the methylation of the tumor suppressor p16 (202). Alternatively, other de novo DNMTs that are overexpressed in a subset of cancer cells might be responsible for the hypermethylator phenotype. Currently, there is no evidence that any of the known de novo DNMTs are specifically overexpressed in tumor cells expressing the hypermethylator phenotype. Moreover, there is no correlation between the extent of CpG island hypermethylation and expression of any of the known DNMTs (191).


An additional problem with the model that overexpression of DNMTs is responsible for the hypermethylator phenotype is what determines the specificity of these putative de novo methylation activities. The fact that tumor suppressor CpG islands are selectively methylated in tumor cells while the genome is globally hypomethylated suggests that either these de novo DNMTs are selective for CpG islands or that another factor is responsible for the selective methylation of CpG islands.

I. The Possible Role of Demethylase

With regard to the demethylase, it is unclear what determines its selectivity in vivo. The demethylase that we purified can demethylate CpGs found in any sequence context (87, 89). It is clear, however, that the demethylase accessibility to methylated CpG sites in vivo is limited since most of the genome remains methylated at any given time. An additional question is whether the demethylase is required for the primary inheritance of the replication pattern or whether it performs a repair function protecting active CpG islands from ectopic methylation.

J. The Accessibility of Demethylase to DNA Is Gated by Chromatin Structure

Cancer cells can recognize ectopically hypermethylated CpG islands and demethylate them (N. Cervoni et al., unpublished data). We have recently demonstrated that in vitro methylated CpG sequences are actively demethylated in human cancer cell lines once they are packaged into acetylated histones (N. Cervoni et al., unpublished data). Thus, the access of demethylase to methylated CpGs is determined by the state of acetylation of the histones. These data might provide a simple and attractive explanation for the excellent correlation between gene expression, chromatin structure, and DNA methylation. What role does demethylase activity play in cancer and in normal mitotic cells? The answer to this question depends on whether DNMT activity during replication is faithfully dictated by the state of methylation of the parental strand. If DNMT can faithfully replicate the methylation pattern, there is no need for demethylase during replication. It is clear, however, that the methylation machinery is prone to errors since slowly drifting de novo methylation events are well documented, as discussed above (117, 199, 203). One possible role for demethylase might be a repair function. Since demethylase can access only genes that are associated with acetylated histones, it will remove ectopic methylation from active genes. Thus, the demethylase might guard active genes from ectopic hypermethylation. An alternative, provocative, though unlikely, hypothesis is that the demethylase plays an active role in the maintenance of DNA methylation patterns. The inheritance of a DNA methylation pattern during replication might be an outcome of an equilibrium of DNA methylation and demethylation activities. The demethylase contributes to the inheritance of methylation patterns by removing methylation selectively from active genes. The demethylase is able to discriminate between active and inactive genes because of its inherent affinity for DNA that is associated with acetylated histones. The pattern of methylation, which is distinguished by the correlation of chromatin structure and DNA methylation, is therefore inherited. The hypothesis that demethylase plays a role in DNA replication is supported by the observation that demethylase/Mbd2 is localized to the replication fork (119).
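The idea of a methylation pattern as an equilibrium gated by histone acetylation can be sketched as a one-variable rate model. This is an illustrative assumption, not the authors' formulation: the methylated fraction m of a CpG island gains methylation at rate k_meth on unmethylated sites and loses it at rate k_demeth scaled by an acetylation level between 0 and 1, so that dm/dt = k_meth(1 - m) - k_demeth * acetylation * m. The function names and rate values are hypothetical.

```python
def steady_state_methylation(k_meth, k_demeth, acetylation):
    """Closed-form steady state of dm/dt = k_meth*(1-m) - k_demeth*acetylation*m.

    Assumes k_meth > 0; all rates are hypothetical illustration values.
    """
    removal = k_demeth * acetylation  # demethylase only reaches acetylated chromatin
    return k_meth / (k_meth + removal)


def simulate(k_meth, k_demeth, acetylation, m0=0.0, dt=0.1, steps=20_000):
    """Forward-Euler integration of the same rate equation, starting from m0."""
    m = m0
    for _ in range(steps):
        m += dt * (k_meth * (1.0 - m) - k_demeth * acetylation * m)
    return m


# Active gene: acetylated chromatin, demethylase has full access.
active = steady_state_methylation(k_meth=0.01, k_demeth=1.0, acetylation=1.0)
# Silenced gene: deacetylated chromatin, slow ectopic methylation is never removed.
silenced = steady_state_methylation(k_meth=0.01, k_demeth=1.0, acetylation=0.0)
print(active, silenced)
```

In this sketch the same slow de novo methylation rate leaves an acetylated island almost unmethylated but drives a deacetylated island to full methylation, which matches the repair role proposed for the demethylase.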

K. Histone Deacetylation and Hypermethylation of Tumor Suppressor Genes

According to the model presented above, histone acetylation is the primary determinant of the methylation pattern of tumor suppressor genes. If a common factor that is critical for maintaining an active chromatin structure around multiple CpG islands is missing or downregulated in cancer cells exhibiting the hypermethylator phenotype, it will result in ectopic de novo methylation of multiple CpG islands. Histone deacetylation is a primary mechanism of regulation of tumor suppressors in cancer cells and during the cell cycle (138, 204, 205). It stands to reason that the activity of tumor suppressors depends on the presence of factors that target histone acetyltransferases and histone deacetylases. It is interesting to note that there is a documented example of truncation mutations in a histone acetyltransferase, EP300, in a number of tumors and cancer cells (206). According to this model, the correlation between gene expression and the state of methylation of some or most genes might be a consequence of the fact that demethylase can only access active genes. There is evidence supporting the hypothesis that methylation follows repression of gene expression. One example is the methylation of the Hprt gene on the inactive X chromosome, which occurs after chromosome inactivation (207). Another interesting example is the switching of γ- to β-globin gene expression during development. Somatic cell hybrids made by fusing mouse erythroleukemia and human fetal erythroid cells initially express human γ-globin but switch with time in culture to adult globin gene production. In hybrids before the switch, the γ-genes are unmethylated (208). After completion of the switch, the hybrids contain methylated γ-globin genes. However, during the time that the γ-to-β switch is occurring, hybrids are found that no longer express γ-globin, yet still possess unmethylated γ-globin genes (208). This suggests that methylation of γ-globin genes occurs after silencing of the gene, which is most probably associated with formation of inaccessible chromatin (209). Similarly, we have previously shown that silencing of the 21-hydroxylase gene in adrenal carcinoma cells precedes the hypermethylation of the gene (210).


Histone deacetylation might also be involved in sporadic CpG hypermethylation. The steady-state chromatin structure around genes is the result of a dynamic equilibrium of histone acetylation and deacetylation. Whereas the state of the chromatin structure is accurately inherited during replication, it is possible that because of the selective advantage of the silenced tumor suppressor, the rare event of inactive deacetylated chromatin is selected. The inactive allele is subject to ectopic methylation which is not repaired by the demethylase, resulting in de novo methylation of the gene. DNA methylation fixes the inactive state of the gene by a covalent modification. If our model is correct, one question remains to be addressed: If the state of activity of a gene is the primary determinant of its DNA methylation pattern, why is DNA methylation needed to regulate gene expression? One possible answer is that DNA methylation might serve as a significantly more stable mark of gene activity than histone deacetylation. Our data suggest that DNA methylation is also a reversible process; nevertheless, it is a stable modification and is probably reversed only when stringent conditions are met.

VII. Conclusions

The DNA methylation machinery is responsible for maintaining the DNA methylation pattern and coordinating the inheritance of the DNA methylation pattern with the replication of DNA. We propose that multiple mechanisms ensure the coordinate regulation of DNMT1, an enzyme that plays a critical role in the replication of the DNA methylation pattern and DNA replication. The prevailing wisdom in the field is that DNMT1 plays a role in gene regulation through its known biochemical activity, DNA methylation. Surprisingly, however, recent data suggest that DNMT1 can have important effects on gene regulation and DNA replication by a mechanism that does not involve DNA methylation. DNMT1 can exert this effect because it is a multifunctional protein that can form protein-protein interactions with histone deacetylases, transcription regulators, and the replication fork. One of the main consequences of this activity is the repression of tumor suppressors. We propose that gene regulatory functions and DNA methyltransferase activity are combined in one polypeptide to ensure that cell-cycle progression and the level of DNMT1 are tightly linked (Fig. 1). We further suggest that the oncogenic effect of ectopic expression of DNMT1 is a consequence of its regulatory functions. The questions of which proteins are responsible for coordinating expression of DNMT1 with the cell cycle and what determines which genes are regulated by DNMT1 remain to be answered.

One of the remaining mysteries is why tumor suppressor genes are methylated in many cancers. We suggest that the hypermethylation of tumor suppressors in cancer cells is not a result of ectopic DNMT1 expression; therefore, we need to address the mechanisms that are responsible for this aberrant methylation in cancers. The answer to this question will depend on a full understanding of all the components of the DNA methylation machinery that are required for replicating the DNA methylation pattern in somatic cells. One exciting possibility is that a demethylase plays a role in ensuring that active genes maintain their CpG islands free of methylation during replication. The existence of a demethylase changes our understanding of how DNA methylation patterns are inherited and suggests that DNA methylation is a plastic and reversible signal. It leads us to look at the steady-state DNA methylation pattern as an equilibrium of DNA methylation and demethylation reactions. Understanding the role that demethylase plays during DNA replication, and the factors that are responsible for regulating the activity of demethylase on newly synthesized DNA, might shed some light on the aberrant de novo methylation in cancer cells. The fact that de novo methylation happens at all suggests a tone of de novo methylation in somatic cells. What, then, is the extent of de novo methylation catalyzed by either of the known DNMTs during replication, and do these DNMTs exhibit any specificity? Once this question is answered, the role of demethylase in protecting genes from de novo methylation might be understood.

Although many questions remain to be answered, recent data have changed our fundamental understanding of the DNA methylation machinery. Many lines of evidence suggest that different components of the DNA methylation machinery might play a role in human disease. Understanding how these different components orchestrate the inheritance and regulated alteration of the DNA methylation pattern is therefore an important challenge.

ACKNOWLEDGMENTS

The work from my laboratory that is discussed in this review is supported by grants from the NCIC and the CIHR.

REFERENCES

1. M. Szyf, Trends Pharmacol. Sci. 15, 233 (1994).
2. M. Szyf, Cancer Metastasis Rev. 17, 219 (1998).
3. M. Szyf, Pharmacol. Ther. 70, 1 (1996).
4. A. Razin and M. Szyf, Biochim. Biophys. Acta 782, 331 (1984).
5. A. Razin and H. Cedar, Proc. Natl. Acad. Sci. U.S.A. 74, 2725 (1977).
6. A. Razin and A. D. Riggs, Science 210, 604 (1980).
7. A. Razin, EMBO J. 17, 4905 (1998).
8. A. D. Riggs, Cytogenet. Cell Genet. 14, 9 (1975).


9. J. Singer-Sam, M. Grant, J. M. LeBon, K. Okuyama, V. Chapman, M. Monk, and A. D. Riggs, Mol. Cell. Biol. 10, 4987 (1990).
10. M. Grant, M. Zuccotti, and M. Monk, Nat. Genet. 2, 161 (1992).
11. J. Singer-Sam and A. D. Riggs, Exs 64, 358 (1993).
12. L. Carrel, C. M. Clemson, J. M. Dunn, A. P. Miller, P. A. Hunt, J. B. Lawrence, and H. F. Willard, Hum. Mol. Genet. 5, 391 (1996).
13. T. Sado, M. H. Fenner, S. S. Tan, P. Tam, T. Shioda, and E. Li, Dev. Biol. 225, 294 (2000).
14. C. Sapienza, Sci. Am. 263, 52 (1990).
15. M. A. Surani, N. D. Allen, S. C. Barton, R. Fundele, S. K. Howlett, M. L. Norris, and W. Reik, Phil. Trans. R. Soc. Lond. B, Biol. Sci. 326, 313 (1990).
16. C. Polychronakos, Adv. Exp. Med. Biol. 343, 189 (1993).
17. S. M. Tilghman, M. S. Bartolomei, A. L. Webber, M. E. Brunkow, J. Saam, P. A. Leighton, K. Pfeifer, and S. Zemel, Cold Spring Harb. Symp. Quant. Biol. 58, 287 (1993).
18. L. Vardimon, A. Kressmann, H. Cedar, M. Maechler, and W. Doerfler, Proc. Natl. Acad. Sci. U.S.A. 79, 1073 (1982).
19. R. Stein, Y. Gruenbaum, Y. Pollack, A. Razin, and H. Cedar, Proc. Natl. Acad. Sci. U.S.A. 79, 61 (1982).
20. H. Cedar, R. Stein, Y. Gruenbaum, T. Naveh-Many, N. Sciaky-Gallili, and A. Razin, Cold Spring Harb. Symp. Quant. Biol. 47, 605 (1983).
21. C. P. Walsh and T. H. Bestor, Genes Dev. 13, 26 (1999).
22. Z. Siegfried, S. Eden, M. Mendelsohn, X. Feng, B. Z. Tsuberi, and H. Cedar, Nat. Genet. 22, 203 (1999).
23. G. C. Prendergast, D. Lawe, and E. B. Ziff, Cell (Cambridge, Mass.) 65, 395 (1991).
24. G. C. Prendergast and E. B. Ziff, Science 251, 186 (1991).
25. M. Comb and H. M. Goodman, Nucleic Acids Res. 18, 3975 (1990).
26. N. M. Inamdar, K. C. Ehrlich, and M. Ehrlich, Plant Mol. Biol. 17, 111 (1991).
27. U. Moens, N. Subramaniam, B. Johansen, and J. Aarbakke, Biochim. Biophys. Acta 1173, 63 (1993).
28. A. P. Wolffe, Curr. Biol. 10, R463 (2000).
29. A. T. Hark, C. J. Schoenherr, D. J. Katz, R. S. Ingram, J. M. Levorse, and S. M. Tilghman, Nature (London) 405, 486 (2000).
30. A. C. Bell and G. Felsenfeld, Nature (London) 405, 482 (2000).
31. C. Kanduri, V. Pant, D. Loukinov, E. Pugacheva, C. F. Qi, A. Wolffe, R. Ohlsson, and V. V. Lobanenkov, Curr. Biol. 10, 853 (2000).
32. M. Holler, G. Westin, J. Jiricny, and W. Schaffner, Genes Dev. 2, 1127 (1988).
33. M. A. Harrington, P. A. Jones, M. Imagawa, and M. Karin, Proc. Natl. Acad. Sci. U.S.A. 85, 2066 (1988).
34. R. R. Meehan, J. D. Lewis, and A. P. Bird, Nucleic Acids Res. 20, 5085 (1992).
35. J. D. Lewis, R. R. Meehan, W. J. Henzel, I. Maurer-Fogy, P. Jeppesen, F. Klein, and A. Bird, Cell (Cambridge, Mass.) 69, 905 (1992).
36. S. H. Cross, R. R. Meehan, X. Nan, and A. Bird, Nat. Genet. 16, 256 (1997).
37. B. Hendrich and A. Bird, Mol. Cell. Biol. 18, 6538 (1998).
38. X. Nan, H. H. Ng, C. A. Johnson, C. D. Laherty, B. M. Turner, R. N. Eisenman, and A. Bird, Nature (London) 393, 386 (1998).
39. P. L. Jones, G. J. Veenstra, P. A. Wade, D. Vermaak, S. U. Kass, N. Landsberger, J. Strouboulis, and A. P. Wolffe, Nat. Genet. 19, 187 (1998).
40. H. H. Ng, Y. Zhang, B. Hendrich, C. A. Johnson, B. M. Turner, H. Erdjument-Bromage, P. Tempst, D. Reinberg, and A. Bird, Nat. Genet. 23, 58 (1999).
41. H. H. Ng, P. Jeppesen, and A. Bird, Mol. Cell. Biol. 20, 1394 (2000).
42. J. Boeke, O. Ammerpohl, S. Kegel, U. Moehren, and R. Renkawitz, J. Biol. Chem. (2000).


MOSHE SZYF AND NANCY DETICH

43. P. A. Wade, A. Gegonne, P. L. Jones, E. Ballestar, F. Aubry, and A. P. Wolffe, Nat. Genet. 23, 62 (1999).
44. Y. Zhang, H. H. Ng, H. Erdjument-Bromage, P. Tempst, A. Bird, and D. Reinberg, Genes Dev. 13, 1924 (1999).
45. F. Yu, J. Thiesen, and W. H. Stratling, Nucleic Acids Res. 28, 2201 (2000).
46. N. Fujita, N. Shimotake, I. Ohki, T. Chiba, H. Saya, M. Shirakawa, and M. Nakao, Mol. Cell. Biol. 20, 5107 (2000).
47. N. Fujita, S. Takebayashi, K. Okumura, S. Kudo, T. Chiba, H. Saya, and M. Nakao, Mol. Cell. Biol. 19, 6415 (1999).
48. A. Vilain, N. Vogt, B. Dutrillaux, and B. Malfoy, FEBS Lett. 460, 231 (1999).
49. S. B. Baylin, AIDS Res. Hum. Retroviruses 8, 811 (1992).
50. R. E. Amir, I. B. Van den Veyver, M. Wan, C. Q. Tran, U. Francke, and H. Y. Zoghbi, Nat. Genet. 23, 185 (1999).
51. R. L. Adams, J. Turnbull, E. J. Smillie, and R. H. Burdon, in "Post-Synthetic Modification of Macromolecules" (F. Antoni and A. Farago, eds.) North-Holland, Amsterdam, 1975.
52. J. C. Wu and D. V. Santi, Prog. Clin. Biol. Res. 198, 119 (1985).
53. J. C. Wu and D. V. Santi, J. Biol. Chem. 262, 4778 (1987).
54. S. Kumar, X. Cheng, S. Klimasauskas, S. Mi, J. Posfai, R. J. Roberts, and G. G. Wilson, Nucleic Acids Res. 22, 1 (1994).
55. T. H. Bestor, Gene 74, 9 (1988).
56. S. Ramchandani, P. Bigey, and M. Szyf, Biol. Chem. 379, 535 (1998).
57. A. Bacolla, S. Pradhan, R. J. Roberts, and R. D. Wells, J. Biol. Chem. 274, 33011 (1999).
58. S. Pradhan, A. Bacolla, R. D. Wells, and R. J. Roberts, J. Biol. Chem. 274, 33002 (1999).
59. J. B. Margot, A. M. Aguirre-Arteta, B. V. Di Giacco, S. Pradhan, R. J. Roberts, M. C. Cardoso, and H. Leonhardt, J. Mol. Biol. 297, 293 (2000).
60. E. Li, T. H. Bestor, and R. Jaenisch, Cell (Cambridge, Mass.) 69, 915 (1992).
61. Y. Gruenbaum, H. Cedar, and A. Razin, Nature (London) 295, 620 (1982).
62. M. Okano, D. W. Bell, D. A. Haber, and E. Li, Cell (Cambridge, Mass.) 99, 247 (1999).
63. M. Okano, S. Xie, and E. Li, Nat. Genet. 19, 219 (1998).
64. R. S. Hansen, C. Wijmenga, P. Luo, A. M. Stanek, T. K. Canfield, C. M. Weemaes, and S. M. Gartler, Proc. Natl. Acad. Sci. U.S.A. 96, 14412 (1999).
65. G. L. Xu, T. H. Bestor, D. Bourc'his, C. L. Hsieh, N. Tommerup, M. Bugge, M. Hulten, X. Qu, J. J. Russo, and E. Viegas-Pequignot, Nature (London) 402, 187 (1999).
66. T. Kafri, M. Ariel, M. Brandeis, R. Shemer, L. Urven, J. McCarrey, H. Cedar, and A. Razin, Genes Dev. 6, 705 (1992).
67. T. Kafri, X. Gao, and A. Razin, Proc. Natl. Acad. Sci. U.S.A. 90, 10558 (1993).
68. M. Brandeis, T. Kafri, M. Ariel, J. R. Chaillet, J. McCarrey, A. Razin, and H. Cedar, EMBO J. 12, 3669 (1993).
69. A. Razin and H. Cedar, Exs 64, 343 (1993).
70. A. Razin and T. Kafri, Prog. Nucleic Acid Res. Mol. Biol. 48, 53 (1994).
71. A. Razin and R. Shemer, Hum. Mol. Genet. 4, 1751 (1995).
72. A. Razin, C. Webb, M. Szyf, J. Yisraeli, A. Rosenthal, T. Naveh-Many, N. Sciaky-Gallili, and H. Cedar, Proc. Natl. Acad. Sci. U.S.A. 81, 2275 (1984).
73. M. Szyf, L. Eliasson, V. Mann, G. Klein, and A. Razin, Proc. Natl. Acad. Sci. U.S.A. 82, 8090 (1985).
74. A. Razin, E. Feldmesser, T. Kafri, and M. Szyf, Prog. Clin. Biol. Res. 198, 239 (1985).
75. A. Razin, M. Szyf, T. Kafri, M. Roll, H. Giloh, S. Scarpa, D. Carotti, and G. L. Cantoni, Proc. Natl. Acad. Sci. U.S.A. 83, 2827 (1986).
76. H. Cedar and G. L. Verdine, Nature (London) 397, 568 (1999).
77. I. G. Lin, T. J. Tomzynski, Q. Ou, and C. L. Hsieh, Mol. Cell. Biol. 20, 2343 (2000).

78. J. P. Jost, Proc. Natl. Acad. Sci. U.S.A. 90, 4684 (1993).
79. J. P. Jost, M. Siegmann, L. Sun, and R. Leung, J. Biol. Chem. 270, 9734 (1995).
80. J. P. Jost and Y. C. Jost, Gene 157, 265 (1995).
81. J. P. Jost, M. Fremont, M. Siegmann, and J. Hofsteenge, Nucleic Acids Res. 25, 4545 (1997).
82. J. P. Jost, M. Siegmann, S. Thiry, Y. C. Jost, D. Benjamin, and S. Schwarz, FEBS Lett. 449, 251 (1999).
83. S. Schwarz, C. Bourgeois, F. Soussaline, C. Homsy, A. Podesta, and J. P. Jost, Eur. J. Cell. Biol. 79, 488 (2000).
84. A. Weiss, I. Keshet, A. Razin, and H. Cedar, Cell (Cambridge, Mass.) 86, 709 (1996).
85. J. Oswald, S. Engemann, N. Lane, W. Mayer, A. Olek, R. Fundele, W. Dean, W. Reik, and J. Walter, Curr. Biol. 10, 475 (2000).
86. M. Szyf, J. Theberge, and V. Bozovic, J. Biol. Chem. 270, 12690 (1995).
87. S. Ramchandani, S. K. Bhattacharya, N. Cervoni, and M. Szyf, Proc. Natl. Acad. Sci. U.S.A. 96, 6107 (1999).
88. N. Cervoni, S. Bhattacharya, and M. Szyf, J. Biol. Chem. 274, 8363 (1999).
89. S. K. Bhattacharya, S. Ramchandani, N. Cervoni, and M. Szyf, Nature (London) 397, 579 (1999).
90. M. Szyf, D. J. Knox, S. Milutinovic, A. D. Slack, and F. D. Araujo, Ann. N.Y. Acad. Sci. 910, 156 (2000).
91. M. Szyf, V. Bozovic, and G. Tanigawa, J. Biol. Chem. 266, 10027 (1991).
92. M. Szyf, Curr. Drug Targets 1, 101 (2000).
93. P. Bigey, S. Ramchandani, J. Theberge, F. D. Araujo, and M. Szyf, Gene 242, 407 (2000).
94. J. Rouleau, A. R. MacLeod, and M. Szyf, J. Biol. Chem. 270, 1595 (1995).
95. J. Rouleau, G. Tanigawa, and M. Szyf, J. Biol. Chem. 267, 7368 (1992).
96. P. Angel and M. Karin, Biochim. Biophys. Acta 1072, 129 (1991).
97. T. Deng and M. Karin, Nature (London) 371, 171 (1994).
98. M. Karin, J. Biol. Chem. 270, 16483 (1995).
99. Y. S. Li, J. Y. Shyy, S. Li, J. Lee, B. Su, M. Karin, and S. Chien, Mol. Cell. Biol. 16, 5947 (1996).
100. A. R. MacLeod, J. Rouleau, and M. Szyf, J. Biol. Chem. 270, 11327 (1995).
101. T. L. Kautiainen and P. A. Jones, J. Biol. Chem. 261, 1594 (1986).
102. J. P. Issa, P. M. Vertino, J. Wu, S. Sazawal, P. Celano, B. D. Nelkin, S. R. Hamilton, and S. B. Baylin, J. Natl. Cancer Inst. 85, 1235 (1993).
103. A. R. MacLeod and M. Szyf, J. Biol. Chem. 270, 8037 (1995).
104. S. Ramchandani, A. R. MacLeod, M. Pinard, E. von Hofe, and M. Szyf, Proc. Natl. Acad. Sci. U.S.A. 94, 684 (1997).
105. A. V. Bakin and T. Curran, Science 283, 387 (1999).
106. J. Yang, C. Deng, N. Hemati, S. M. Hanash, and B. C. Richardson, J. Immunol. 159, 1303 (1997).
107. R. J. Guan, Y. Fu, P. R. Holt, and A. B. Pardee, Gastroenterology 116, 1063 (1999).
108. M. Toyota, M. Ohe-Toyota, N. Ahuja, and J. P. Issa, Proc. Natl. Acad. Sci. U.S.A. 97, 710 (2000).
109. A. Slack, N. Cervoni, M. Pinard, and M. Szyf, J. Biol. Chem. 274, 10105 (1999).
110. J. Deng and M. Szyf, Brain Res. Mol. Brain Res. 71, 23 (1999).
111. S. P. Persengiev and D. L. Kilpatrick, Neuroreport 8, 227 (1996).
112. V. Korinek, N. Barker, P. J. Morin, D. van Wichen, R. de Weger, K. W. Kinzler, B. Vogelstein, and H. Clevers, Science 275, 1784 (1997).
113. B. Rubinfeld, P. Robbins, M. El-Gamil, I. Albert, E. Porfiri, and P. Polakis, Science 275, 1790 (1997).
114. H. Clevers and M. van de Wetering, Trends Genet. 13, 485 (1997).



115. T. C. He, A. B. Sparks, C. Rago, H. Hermeking, L. Zawel, L. T. da Costa, P. J. Morin, B. Vogelstein, and K. W. Kinzler, Science 281, 1509 (1998).
116. P. W. Laird, L. Jackson-Grusby, A. Fazeli, S. L. Dickinson, W. E. Jung, E. Li, R. A. Weinberg, and R. Jaenisch, Cell (Cambridge, Mass.) 81, 197 (1995).
117. N. Ahuja, Q. Li, A. L. Mohan, S. B. Baylin, and J. P. Issa, Cancer Res. 58, 5489 (1998).
118. C. Salem, G. Liang, Y. C. Tsai, J. Coulter, M. A. Knowles, A. C. Feng, S. Groshen, P. W. Nichols, and P. A. Jones, Cancer Res. 60, 2473 (2000).
119. K. I. Tatematsu, T. Yamazaki, and F. Ishikawa, Genes Cells 5, 677 (2000).
120. F. D. Araujo, J. D. Knox, M. Szyf, G. B. Price, and M. Zannis-Hadjopoulos, Mol. Cell. Biol. 18, 3475 (1998).
121. F. D. Araujo, J. D. Knox, S. Ramchandani, R. Pelletier, P. Bigey, G. Price, M. Szyf, and M. Zannis-Hadjopoulos, J. Biol. Chem. 274, 9335 (1999).
122. H. Leonhardt, A. W. Page, H. U. Weier, and T. H. Bestor, Cell (Cambridge, Mass.) 71, 865 (1992).
123. L. S. Chuang, H. I. Ian, T. W. Koh, H. H. Ng, G. Xu, and B. F. Li, Science 277, 1996 (1997).
124. S. Milutinovic, J. D. Knox, and M. Szyf, J. Biol. Chem. 275, 6353 (2000).
125. H. Zhang, Y. Xiong, and D. Beach, Mol. Biol. Cell 4, 897 (1993).
126. S. J. Kim, U. S. Onwuta, Y. I. Lee, R. Li, M. R. Botchan, and P. D. Robbins, Mol. Cell. Biol. 12, 2455 (1992).
127. L. I. Chen, T. Nishinaka, K. Kwan, I. Kitabayashi, K. Yokoyama, Y. H. Fu, S. Grunwald, and R. Chiu, Mol. Cell. Biol. 14, 4380 (1994).
128. A. J. Udvadia, D. J. Templeton, and J. M. Horowitz, Proc. Natl. Acad. Sci. U.S.A. 92, 3953 (1995).
129. V. Noe, C. Alemany, L. A. Chasin, and C. J. Ciudad, Oncogene 16, 1931 (1998).
130. F. Sohm, C. Gaiddon, M. Antoine, A. L. Boutillier, and J. P. Loeffler, Oncogene 18, 2762 (1999).
131. Y. Sowa, T. Orita, S. Hiranabe-Minamikawa, K. Nakano, T. Mizuno, H. Nomura, and T. Sakai, Ann. N.Y. Acad. Sci. 886, 195 (1999).
132. Y. Sowa, T. Orita, S. Minamikawa-Hiranabe, T. Mizuno, H. Nomura, and T. Sakai, Cancer Res. 59, 4266 (1999).
133. A. Pagliuca, P. Gallo, and L. Lania, J. Cell. Biochem. 76, 360 (2000).
134. K. D. Robertson, S. Ait-Si-Ali, T. Yokochi, P. A. Wade, P. L. Jones, and A. P. Wolffe, Nat. Genet. 25, 338 (2000).
135. H. Hiyama, A. Iavarone, and S. A. Reeves, Oncogene 16, 1513 (1998).
136. F. Fuks, W. A. Burgers, A. Brehm, L. Hughes-Davies, and T. Kouzarides, Nat. Genet. 24, 88 (2000).
137. M. R. Rountree, K. E. Bachman, and S. B. Baylin, Nat. Genet. 25, 269 (2000).
138. L. C. Sambucetti, D. D. Fischer, S. Zabludoff, P. O. Kwon, H. Chamberlin, N. Trogani, H. Xu, and D. Cohen, J. Biol. Chem. 274, 34940 (1999).
139. M. S. Hung, N. Karthikeyan, B. Huang, H. C. Koo, J. Kigar, and C. J. Shen, Proc. Natl. Acad. Sci. U.S.A. 96, 11940 (1999).
140. J. D. Knox, F. D. Araujo, P. Bigey, A. D. Slack, G. B. Price, M. Zannis-Hadjopoulos, and M. Szyf, J. Biol. Chem. 275, 17986 (2000).
141. S. B. Baylin, J. W. Hoppener, A. de Bustros, P. H. Steenberg, C. J. Lips, and B. D. Nelkin, Cancer Res. 46, 2917 (1986).
142. S. B. Baylin, E. R. Fearon, B. Vogelstein, A. de Bustros, S. J. Sharkis, P. J. Burke, S. P. Staal, and B. D. Nelkin, Blood 70, 412 (1987).
143. A. L. Silverman, J. G. Park, S. R. Hamilton, A. F. Gazdar, G. D. Luk, and S. B. Baylin, Cancer Res. 49, 3468 (1989).


144. S. B. Baylin, M. Makos, J. J. Wu, R. W. Yen, A. de Bustros, P. Vertino, and B. D. Nelkin, Cancer Cells 3, 383 (1991).
145. B. D. Nelkin, D. Przepiorka, P. J. Burke, E. D. Thomas, and S. B. Baylin, Blood 77, 2431 (1991).
146. M. Makos, B. D. Nelkin, M. I. Lerman, F. Latif, B. Zbar, and S. B. Baylin, Proc. Natl. Acad. Sci. U.S.A. 89, 1929 (1992).
147. M. Makos, B. D. Nelkin, R. E. Reiter, J. B. Gnarra, J. Brooks, W. Isaacs, M. Linehan, and S. B. Baylin, Cancer Res. 53, 2719 (1993).
148. M. Makos, B. D. Nelkin, V. R. Chazin, W. K. Cavenee, G. M. Brodeur, and S. B. Baylin, Cancer Res. 53, 2715 (1993).
149. L. J. Lu, E. Randerath, and K. Randerath, Cancer Lett. 19, 231 (1983).
150. A. P. Feinberg and B. Vogelstein, Biochem. Biophys. Res. Commun. 111, 47 (1983).
151. A. P. Feinberg and B. Vogelstein, Nature (London) 301, 89 (1983).
152. M. S. Cheah, C. D. Wallace, and R. M. Hoffman, J. Natl. Cancer Inst. 73, 1057 (1984).
153. R. A. Morgan and R. C. Huang, Cancer Res. 44, 5234 (1984).
154. M. T. Bedford and P. D. van Helden, Cancer Res. 47, 5274 (1987).
155. A. P. Feinberg and B. Vogelstein, Semin. Surg. Oncol. 3, 149 (1987).
156. R. Tran, S. V. Kashmiri, J. Kantor, J. W. Greiner, S. Pestka, J. E. Shively, and J. Schlom, Cancer Res. 48, 5674 (1988).
157. A. P. Feinberg, C. W. Gehrke, K. C. Kuo, and M. Ehrlich, Cancer Res. 48, 1159 (1988).
158. B. Jurgens, B. J. Schmitz-Drager, and W. A. Schulz, Cancer Res. 56, 5698 (1996).
159. M. Cravo, R. Pinto, P. Fidalgo, P. Chaves, L. Gloria, C. Nobre-Leitao, and F. Costa Mira, Gut 39, 434 (1996).
160. T. Tsujiuchi, M. Tsutsumi, Y. Sasaki, M. Takahama, and Y. Konishi, Jpn. J. Cancer Res. 90, 909 (1999).
161. J. Soares, A. E. Pinto, C. V. Cunha, S. Andre, I. Barao, J. M. Sousa, and M. Cravo, Cancer 85, 112 (1999).
162. N. Ahuja, A. L. Mohan, Q. Li, J. M. Stolker, J. G. Herman, S. R. Hamilton, S. B. Baylin, and J. P. Issa, Cancer Res. 57, 3370 (1997).
163. S. A. Belinsky, K. J. Nikula, W. A. Palmisano, R. Michels, G. Saccomanno, E. Gabrielson, S. B. Baylin, and J. G. Herman, Proc. Natl. Acad. Sci. U.S.A. 95, 11891 (1998).
164. A. de Bustros, B. D. Nelkin, A. Silverman, G. Ehrlich, B. Poiesz, and S. B. Baylin, Proc. Natl. Acad. Sci. U.S.A. 85, 5693 (1988).
165. M. Esteller, M. Toyota, M. Sanchez-Cespedes, G. Capella, M. A. Peinado, D. N. Watkins, J. P. Issa, D. Sidransky, S. B. Baylin, and J. G. Herman, Cancer Res. 60, 2368 (2000).
166. H. Fujii, M. A. Biel, W. Zhou, S. A. Weitzman, S. B. Baylin, and E. Gabrielson, Oncogene 16, 2159 (1998).
167. J. R. Graff, V. E. Greenberg, J. G. Herman, W. H. Westra, E. R. Boghaert, K. B. Ain, M. Saji, M. A. Zeiger, S. G. Zimmer, and S. B. Baylin, Cancer Res. 58, 2063 (1998).
168. J. R. Graff, J. G. Herman, R. G. Lapidus, H. Chopra, R. Xu, D. F. Jarrard, W. B. Isaacs, P. M. Pitha, N. E. Davidson, and S. B. Baylin, Cancer Res. 55, 5195 (1995).
169. J. G. Herman and S. B. Baylin, Curr. Top. Microbiol. Immunol. 249, 35 (2000).
170. J. G. Herman, C. I. Civin, J. P. Issa, M. I. Collector, S. J. Sharkis, and S. B. Baylin, Cancer Res. 57, 837 (1997).
171. J. G. Herman, J. Jen, A. Merlo, and S. B. Baylin, Cancer Res. 56, 722 (1996).
172. J. G. Herman, et al., Proc. Natl. Acad. Sci. U.S.A. 91, 9700 (1994).
173. J. G. Herman, A. Umar, K. Polyak, J. R. Graff, N. Ahuja, J. P. Issa, S. Markowitz, J. K. Willson, S. R. Hamilton, K. W. Kinzler, M. F. Kane, R. D. Kolodner, B. Vogelstein, T. A. Kunkel, and S. B. Baylin, Proc. Natl. Acad. Sci. U.S.A. 95, 6870 (1998).

78


174. C.J. Hsieh, B. Klump, K. Holzmann, F. Borehard, M. Gregor, and R. Porsehen, Cancer Res. 58, 3942 (1998). 175. J. P. Issa, S. B. Bay[in, and S. A. Be[insky, CancerRes. 56, 3655 (1996). 176. M. Malumbres, I. Perez de Castro, J. Santos, B. Melendez, R. Mangues, M. Serrano, A. Pellicer, and J. Femandez-Piqueras, Oncogene 14, 1361 (1997). I77. R. A. Morton, Jr., j. j. Watkins, G. S. Bova, M. M. Wales, S. B. Baylin, and W. B. Isaacs,J. Urol. 156, 512 (1996). 178. S. K. Myohanen, S. B. Bay[in, andJ. G. Herman, CancerRes. 58, 591 (1998). 179. M. Pieretti, D. E. Powell, H. H. GaUion, P. S. Conway, E. A. Case, and M. S. Turker, Hum. Pathol. 26, 398 (1995). 180. R. Piva, V. L. Kumar, S. Hanau, A. P. Rimondi, S. Pansini, G. Mol[ica, and L. del Senno, J. Steroid. Biochem. 32, 1 (1989). 181. R. Piva, A. P. Rimondi, S. Hanau, I. Maestri, A. Alvisi, V. L. Kumar, and L. del Senno, Br J. Cancer 61, 270 (1990). 182. S. Ribieras, X. G. Song-Wang, V. Martin, P. Lointier, L. Frappart, and R. Dante, J. Cell. Biochem. 56, 86 (1994). 183. W. M. Rideout 3rd, P. Eversole-Cire, C. H. Spruck 3rd, C. M. Hustad, G. A. Coetzee, 17. A. Gonzales, and P. A. Jones, Mol. Cell. Biol. 14, 6143 (1994). 184. E. P. Xing, Y. Nie, Y. Song, G. Y. Yang, Y. C. Cai, L. D. Wang, and C. S. Yang, Clin. CancerRes. 5, 2704 (1999). 185. S. A. Belinsky, K. J. Nikula, S. B. Bay[in, and J. P. Issa, eroc. Natl. Acad. Sci. U.S.A. 93, 4045 (1996). 186. J. Wu, J. P. Issa, J. Herman, D. E. Bassett, Jr., B. D. Nelkin, and S. B. Baylin, Proc. Natl. Acad. Sci. U.S.A. 90, 8891 (1993). 187. M. Foumel, P. Sapieha, N. Beau[ieu, J. M. Besterman, and A. R. MacLeod, J. Biol. Chem. 274, 24250 (1999). 188. M. L. Gonzalgo, T. Hayashida, C. M. Bender, M. M. Pan, Y. C. Tsai, F. A. Gonzales, H. D. Nguyen, T. T. Nguyen, and P. A. Jones, CancerRes. 58, 1245 (1998). 189. A. Merlo, J. G. Herman, L. Mao, D. J. Lee, E. Gabrielson, P. C. Burger, S. B. Bay[in, and D. Sidransky, Nat. Med. 1, 686 (1995). 190. P. M. Vertino, R. W. Yen, J. 
Gao, and S. B. Bay[in, Mol. Cell. Biol. 16, 4555 (1996). 191. C. A. Eads, K. D. Danenberg, K. Kawakami, L. B. Saltz, P. V. Danenberg, and P. W. Laird, Cancer Res. 59, 2302 (1999). 192. K. D. Robertson, K. Keyomarsi, F. A. Gonzales, M. Velicescu, and P. A. Jones, Nucleic Acids Res. 28, 2108 (2000). 193. A. M. De Matzo, V. L. Marchi, E. S. Yang, R. Veeraswamy, X. Lin, andW. G. Nelson, Cancer Res. 59, 3855 (1999). 194. L. Kivinen, K. Tsubari, T. Haapajarvi, M. B. Datto, X. F. Wang, and M. Laiho, Oncogene 18, 6252 (1999). 195. H. Xiao, T. Hasegawa, and K. Isobe, J. Biol. Chem. 275, 1371 (2000). 196. A. L. Gartel, E. Goufman, S. G. Tevosian, H. Shih, A. S. Yee, and A. L. Tyner, Oncogene 17, 3463 (1998). 197. M. Toyota, N. Ahuja, M. Ohe-Toyota, J. G. Herman, S. B. Bay[in, and J. E Issa, Proc. Natl. Acad. Sci. U.S.A. 96, 8681 (1999). 198. J. P. Issa, Y. L. Ottaviano, P. Celano, S. R. Hamilton, N. E. Davidson, and S. B. Bay[in, Nat. Genet. 7, 536 (1994). 199. J. P. Issa, Crit. Rev. Oncol. Hematol. 32, 31 (1999). 200. M. Toyota and J. P. Issa, Semin. Cancer Biol. 9, 349 (1999). 201. J. P. Issa, Curt. Top. Microbiol. Immunol. $49, 101 (2000).

REGULATION OF DNMT1

79

202. I. Rhee, K. W. Jair, R. W. Yen, C. Lengauer, J. G. Herman, K. W. Kinzler, B. Vogelstein, S. B. Baylin, and K. E. Schuebel, Nature (London) 404, 1003 (2000). 203. S. B. Baylin, J. G. Herman, J. R. Graft, P. M. Vertino, and J. P. Issa, Adv. Cancer Res. 72, 141

(1998). 204. V. M. Richon, T. W. Sandhoff, R. A. Rifldnd, and P. A. Marks, Proc. Natl. Acad. Sci. U.S.A. 97,

10014 (2000). 205. W. Wharton, J. Save[[,w. D. Cress, E. Seto, and W. J. Pledger,]. Biol. Chem. (2000). 206. S.A. Gayther, S. J. Batley, L. Linger, A. Bannister, K. Thorpe, S. F. Chin, Y. Daigo, P. Russell, A. Willson, H. M. Sowter, J. D. Delhanty, B. A. Ponder, T. Kouzarides, and C. Caldas, Nat. Genet. 24, 300 (2000). 207. L. F. Lock, N. Takagi, and G. R. Martin, Cell (Cambridge, Mass.) 48, 39 (1987). 208. T. Enver, J. W. Zhang, T. Papayannopoulou, and G. Stamatoyannopoulos, Genes Dev. 2, 698

(1988). 209. M. Groudine, R. Eisenman, R. Gelinas, and H. Weintraub, Prog. Clin. Biol. Res. 134, 159

(1983). 210. M. Szyf, D. 8. Milstone, B. P. Schimmer, K. L. Parker, and J. G. Seidman, Mol. Endocrinol. 4,

1144 (1990).

Lysosomal Multienzyme Complex: Biochemistry, Genetics, and Molecular Pathophysiology

ALEXEY V. PSHEZHETSKY AND MILA ASHMARINA

Service de Génétique Médicale, Hôpital Sainte-Justine and Département de Pédiatrie, Faculté de Médecine, Université de Montréal, Montréal (Qc), H3T 1C5, Canada

I. Introduction
II. Discovery of Lysosomal Multienzyme Complex
III. Components of the Complex
   A. Lysosomal Carboxypeptidase A (Cathepsin A)
   B. β-Galactosidase
   C. Sialidase (Neuraminidase)
   D. N-Acetylgalactosamine-6-sulfate Sulfatase
IV. Composition, Stoichiometry, and Structure of Lysosomal Multienzyme Complex
V. Plasma Membrane Complex of Elastin-Binding Protein, Cathepsin A, and Sialidase
VI. Molecular Pathology of Lysosomal Multienzyme Complex
   A. β-Galactosidosis: GM1-Gangliosidosis and Morquio B Disease
   B. Galactosialidosis
   C. Sialidosis
VII. Future Perspectives
References

Lysosomal enzymes sialidase (α-neuraminidase), β-galactosidase, and N-acetylaminogalacto-6-sulfate sulfatase are involved in the catabolism of glycolipids, glycoproteins, and oligosaccharides. Their functional activity in the cell depends on their association in a multienzyme complex with lysosomal carboxypeptidase, cathepsin A. We review the data suggesting that the integrity of the complex plays a crucial role at different stages of biogenesis of lysosomal enzymes, including intracellular sorting and proteolytic processing of their precursors. The complex plays a protective role for all components, extending their half-life in the lysosome from several hours to several days; and for sialidase, the association with cathepsin A is also necessary for the expression of enzymatic activity. The disintegration of the complex due to genetic mutations in its components results in their functional deficiency and causes severe metabolic disorders: sialidosis (mutations in sialidase), GM1-gangliosidosis and Morquio disease type B (mutations in β-galactosidase), galactosialidosis (mutations in cathepsin A), and Morquio disease type A (mutations in N-acetylaminogalacto-6-sulfate sulfatase). The genetic, biochemical, and direct structural studies described here clarify the molecular pathogenic mechanisms of these disorders and suggest new diagnostic tools. © 2001 Academic Press.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 69. Copyright © 2001 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/01 $35.00

I. Introduction

Interactions between enzymes or between enzymes and other proteins are now recognized as one of the most significant factors affecting metabolic regulation in the cellular microenvironment. Enzyme-enzyme interactions have proved to be a key factor for most, if not all, cellular functions, including DNA replication, protein synthesis, cellular signaling, and endo- and exocytosis. They involve proteins of all cellular compartments, starting from the proteasome, which is a multicatalytic proteinase complex, and ending with "soluble" enzymes of the cytosol. For lysosomal enzymes, the concept suggesting their association into supramolecular structures to form a lysosomal matrix is almost as old as the discovery of the lysosome itself (1). So far, this has proved correct for four of about 100 known lysosomal enzymes: three glycosidases (β-galactosidase, α-neuraminidase, and N-acetylaminogalacto-6-sulfate sulfatase) and one protease (lysosomal carboxypeptidase A, also called cathepsin A), which form a high-molecular-weight lysosomal multienzyme complex (LMC). The studies of the LMC and its individual components, which we review here, have revealed fundamental mechanisms of enzyme function in the lysosome and provided the molecular pathophysiological basis for several inherited human diseases.

This review covers the structure, biochemistry, genetics, and biogenesis of LMC and its individual components, as well as the molecular mechanisms of metabolic diseases resulting from their inherited deficiencies. We start with a historical perspective of the discovery of LMC, which illustrates the success of the combined efforts of clinical investigation, biochemistry, and cellular and molecular biology.

II. Discovery of Lysosomal Multienzyme Complex

Lysosomes are cytoplasmic organelles harboring more than 100 hydrolytic enzymes, all of which have an acidic pH optimum and are involved in the degradation of essentially all types of biological macromolecules. Both extracellular materials that are taken into the cell by endocytosis and intracellular structures that undergo autophagy are degraded within lysosomes to their elementary constituents. The biogenesis of lysosomes is a complex process, which requires that specific soluble and membrane proteins synthesized in the endoplasmic reticulum be segregated from proteins with other subcellular destinations and transferred to developing or mature lysosomes. Any failure in the biogenesis, lysosomal targeting, or function of one or more lysosomal enzymes can result in metabolic diseases called lysosomal storage diseases because of the massive accumulation of undegraded substrates of the deficient enzymes in the lysosomes of the affected tissues.

GM1-gangliosidosis, caused by lysosomal β-galactosidase (GAL) deficiency, was one of the first lysosomal storage diseases with an identified biochemical mechanism (2). This discovery facilitated rapid diagnosis of many GM1-gangliosidosis patients. Among those patients, two happened to have GAL deficiency but showed an abnormal clinical phenotype with normal intelligence and late development of psychomotor deterioration (3, 4). Somatic cell hybridization studies performed by Galjaard et al. (5) showed that these variants are caused by defects in a gene different from that of GAL. Wenger et al. (6), investigating similar cases, found a combined deficiency of two lysosomal enzymes, GAL and sialidase (α-neuraminidase, SIAL). All reported patients belonged to the same complementation group, which allowed the condition to be designated as a distinct disorder named galactosialidosis (GS) (7). For several years, the primary molecular defect in GS was thought to be identical to that in single SIAL deficiency, sialidosis (8), until Hoogeveen et al. (9, 10) showed that hybridization and even coculturing of fibroblasts from GS and sialidosis patients resulted in partial correction of SIAL and GAL activity.
These studies suggested the existence of a protein "corrective factor" secreted by normal, sialidosis, or GM1-gangliosidosis cells, but absent in the cells of GS patients. Further studies showed a 10-fold enhanced cellular degradation of GAL in GS fibroblasts (11) that could be prevented either by the addition to the cell medium of a fraction containing the corrective factor or by the inhibition of lysosomal proteases (11a, 12). These results led to the hypothesis that the corrective factor protects GAL against rapid degradation in the lysosome.

Meanwhile, several groups showed that GAL purified from mammalian tissues and cells exists in multiple oligomeric forms, including 70-80-kDa monomers, 110-170-kDa dimers, 250-kDa tetramers, and 600-700-kDa multimers (13-17). The ratio between different oligomeric forms was dependent on the methods of enzyme purification; and under conditions that mimicked those in the lysosome (acidic pH, high concentration of protein), GAL existed exclusively as high-molecular-weight aggregates (18). All oligomeric forms contained 67-70-kDa polypeptides of GAL, while 600-700-kDa aggregates additionally contained polypeptides of 32 and 20 kDa (12, 19-21). Moreover, the presence of a 32-kDa polypeptide was essential for the formation of 600-700-kDa aggregates of GAL (19). Finally, d'Azzo et al. (12) showed that the 32-kDa protein and its 54-kDa precursor are genetically absent in the cells of all GS patients and that administration of the 54-kDa precursor to these cells restores the normal level of GAL protein, proving the identity of the 32-kDa protein (called "protective protein") and the corrective factor missing in GS cells. Further studies showed that protective protein is lysosomal carboxypeptidase, cathepsin A (reviewed in Section III,A,1).

Simultaneously, Verheijen et al. (22, 23) demonstrated that SIAL from mammalian tissues can be copurified with GAL on affinity chromatography, gel filtration, and density gradient centrifugation. Both activities were coprecipitated with antibodies against GAL (22) or against protective protein (23), suggesting that GAL, SIAL, and protective protein are in the same complex. SIAL activity disappeared after the dissociation of the complex in vitro and could be restored in the presence of protective protein (24). Although several works claimed that SIAL activation involved its proteolytic processing (25, 26), the majority of the data suggested that the activation of SIAL in the complex resulted from its conformational change. These experiments established the concept that a multienzyme complex is required for functional activity of SIAL and for the protection of GAL in the lysosomal microenvironment.

The discovery of the last component of LMC resulted from attempts to explain the unusual composition of storage products in the patients affected with a clinical variant of GAL deficiency, Morquio disorder type B. These patients accumulated keratan sulfate, which is a substrate of N-acetylgalactosamine-6-sulfate sulfatase (GALNS). It was shown that GALNS is a part of LMC and that the accumulation of keratan sulfate may result from GALNS functional deficiency after the disruption of LMC in the cells of Morquio B patients (27).
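The 10-fold enhanced degradation of GAL in GS fibroblasts described in this section can be related to enzyme half-life with a short calculation. The sketch below assumes simple first-order decay, which the text does not state, and uses a purely hypothetical half-life of 240 h for the protected enzyme; only the 10-fold factor comes from the text.

```python
import math


def decay_rate_constant(half_life_h: float) -> float:
    """First-order rate constant k (per hour) for a given half-life in hours."""
    return math.log(2) / half_life_h


def fraction_remaining(t_h: float, half_life_h: float) -> float:
    """Fraction of the initial enzyme pool left after t_h hours of first-order decay."""
    return 0.5 ** (t_h / half_life_h)


# Hypothetical illustrative value, not a measurement from the text.
HALF_LIFE_PROTECTED_H = 240.0
# Under first-order kinetics, a 10-fold higher degradation rate
# corresponds to a 10-fold shorter half-life.
HALF_LIFE_UNPROTECTED_H = HALF_LIFE_PROTECTED_H / 10.0

rate_ratio = (decay_rate_constant(HALF_LIFE_UNPROTECTED_H)
              / decay_rate_constant(HALF_LIFE_PROTECTED_H))

print(f"degradation rate ratio (unprotected / protected): {rate_ratio:.1f}")
for t in (24.0, 48.0, 96.0):
    protected = fraction_remaining(t, HALF_LIFE_PROTECTED_H)
    unprotected = fraction_remaining(t, HALF_LIFE_UNPROTECTED_H)
    print(f"t = {t:5.1f} h  protected: {protected:.3f}  unprotected: {unprotected:.3f}")
```

With these assumed numbers, the unprotected pool is largely gone after a few days while most of the protected pool remains, which is the qualitative behavior the corrective factor hypothesis describes.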

III. Components of the Complex

A. Lysosomal Carboxypeptidase A (Cathepsin A)

1. NAME AND DISTINGUISHING FEATURES

The nomenclature name of protective protein is lysosomal carboxypeptidase A (EC 3.4.16.5) (28). The enzyme was originally described under the name of cathepsin I, and later cathepsin A (CathA), as an enzyme responsible for the hydrolysis of Z-Glu-Tyr in beef kidney and spleen extracts at acidic pH (29, 30). Later, CathA was shown to be a carboxypeptidase (31, 32). Purified preparations of CathA were obtained from different mammalian tissues, and their biochemical properties were studied (33-39); however, the enzyme had not been cloned or sequenced. Independently, CathA was discovered as an LMC component and dubbed protective protein because of its ability to protect GAL and SIAL against rapid intralysosomal degradation (reviewed in Section II). The cloning and sequencing of both human and mouse protective proteins demonstrated that they have high amino acid sequence homology with yeast serine carboxypeptidases (40, 41). Indeed, Tranchemontagne et al. (42) demonstrated the carboxypeptidase activity of protective protein and named it lysosomal carboxypeptidase, or carboxypeptidase L. The enzyme was discovered for the third time as a deamidase from human platelets (43). The catalytic properties of deamidase resembled those of CathA, whereas the N-terminal amino acid sequence was identical to that of protective protein. Finally, the identity of CathA and protective protein was proved when protective protein was shown to be responsible for all activity against Z-Phe-Ala and Z-Glu-Tyr substrates in human placenta tissue (44, 45).

2. ACTIVITY AND SPECIFICITY

Like the other members of the serine carboxypeptidase family (46), CathA is a multifunctional enzyme that expresses deamidase and esterase activities at neutral pH (optimal pH 7.0) and carboxypeptidase activity at acidic pH (optimal pH 5.0-5.2) (43). The enzyme has a preference for substrates with hydrophobic amino acid residues at the P1′ position (47), and is therefore a C-type carboxypeptidase (48), but it also shows high affinity for positively charged amino acid residues at the P1′ position. The recent modeling of substrate binding in the S1′ subsite of CathA (49) provided a theoretical explanation for the observed substrate specificity. Esterase activity of CathA is assayed with Bz-Tyr-OEt (43), and carboxypeptidase activity with Z-Phe-Xaa dipeptides, followed by fluorimetric (44) or spectrophotometric (42) assay of the released amino acid. Z-Phe-Phe and Z-Phe-Leu are the most specific substrates (kcat/Km); the enzyme shows the highest activity (kcat) with Z-Phe-Ala, and FA-Phe-Leu can be used for the continuous spectrophotometric assay (47). A fluorescent substrate, 5-dimethylaminonaphthalene-1-sulfonyl-D-Tyr-Val-NH2, is useful for the HPLC-based determination of the deamidase activity of the enzyme (50). CathA is inhibited by PMSF, iodoacetamide, DFP, lactacystin, thiol reagents in high concentrations, and heavy metals such as Hg2+, Ag2+, and Cu2+ (42, 50-52). A specific inhibitor from potato has also been described (53). Pepstatin, leupeptin, and phosphoramidon have no effect on the enzyme (52).

3. STRUCTURE

CathA contains two protein chains of about 32 and 20 kDa, which are linked by disulfide bonds. Both chains contain N-linked oligosaccharides, but only the one attached to the 32-kDa chain is mannose-6-phosphorylated (54). The amino acid sequence of CathA has about 30% identity with the other serine carboxypeptidases: yeast carboxypeptidase Y and wheat carboxypeptidase II (40, 55). X-ray atomic coordinates of the wheat enzyme were used to model the CathA structure (55). Later, the X-ray structure of the CathA precursor, expressed in a baculovirus system, was determined at 2.2-2.4 Å resolution (56). The structure is similar to those of plant and yeast carboxypeptidases and shows that CathA belongs to the so-called α/β-hydrolase family (46). The protein contains a core domain and a cap domain. The core domain consists of a central 10-stranded β-sheet flanked by ten α-helices and two small β-strands on both sides. The cap domain contains three α-helices and a triple-stranded mixed β-sheet. The catalytic triad in the active site is formed by the Ser150, His429, and Asp372 residues. At acidic pH the enzyme forms 95-98-kDa homodimers, whose X-ray structure was also determined (56).

4. SYNTHESIS, PROCESSING, AND FUNCTION

The gene coding for human CathA spans 7.5 kb on chromosome 20 (20q13.1) and comprises 15 exons (57). The protein is synthesized as a 54-kDa single-chain precursor that contains a 28-amino acid N-terminal signal peptide (40). In the endoplasmic reticulum, after the cleavage of the signal peptide, the protein is folded and glycosylated at Asn117 and Asn305. In the late Golgi compartment the oligosaccharide chain at Asn117 acquires a mannose-6-phosphate recognition signal (54), which mediates the binding of CathA to mannose-6-phosphate receptors and transport to the endosomal/lysosomal compartment. In lysosomes the precursor is cleaved into 34-kDa and 20-kDa chains, which is followed by further C-terminal processing of the larger chain into the 32-kDa form. The last event is necessary for the activation of the enzyme (58).

CathA is widely distributed in mammalian tissues, with the highest expression in kidney, liver, lung (59), and placenta (45). Three oligomeric forms of CathA were detected: a 1.27-MDa complex with GAL, SIAL, and GALNS; a 680-kDa complex with GAL only; and a 98-kDa homodimer (27, 45).
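The processing steps quoted above can be checked with simple bookkeeping on the apparent masses. The following sketch uses only the kDa values given in the text; the variable and function names are illustrative. Apparent masses obtained by different methods are approximate and need not add up exactly, so the comparison with the reported 95-98-kDa homodimer is indicative only.

```python
# Apparent masses in kDa, as quoted in the text.
PRECURSOR_KDA = 54
LARGE_CHAIN_INITIAL_KDA = 34   # large chain immediately after cleavage
LARGE_CHAIN_MATURE_KDA = 32    # large chain after C-terminal trimming
SMALL_CHAIN_KDA = 20
REPORTED_HOMODIMER_KDA = (95, 98)


def two_chain_mass(large_kda: int, small_kda: int) -> int:
    """Nominal mass of the disulfide-linked two-chain form."""
    return large_kda + small_kda


cleaved_form = two_chain_mass(LARGE_CHAIN_INITIAL_KDA, SMALL_CHAIN_KDA)
mature_form = two_chain_mass(LARGE_CHAIN_MATURE_KDA, SMALL_CHAIN_KDA)
trimmed = LARGE_CHAIN_INITIAL_KDA - LARGE_CHAIN_MATURE_KDA
nominal_homodimer = 2 * mature_form

print(f"cleaved two-chain form: {cleaved_form} kDa (precursor: {PRECURSOR_KDA} kDa)")
print(f"mature two-chain form:  {mature_form} kDa ({trimmed} kDa trimmed)")
print(f"nominal homodimer:      {nominal_homodimer} kDa "
      f"(reported: {REPORTED_HOMODIMER_KDA[0]}-{REPORTED_HOMODIMER_KDA[1]} kDa)")
```

The 34-kDa and 20-kDa chains account for the 54-kDa precursor, and activation removes roughly 2 kDa from the larger chain.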
The association of CathA and GAL was reported to occur already in the endoplasmic reticulum (54); however, the complex is stable in vitro only at acidic pH (60). Association with CathA is essential for the stabilization of GAL and GALNS as well as for the activation of SIAL in the lysosome (12, 18, 24, 27). The potential mechanism and biological role of this phenomenon are reviewed in Sections II, IV, and VI. The discovery of carboxypeptidase activity of protective protein prompted multiple speculations about the partial responsibility of CathA for C-terminal processing of SIAL and GAL; however, site-directed mutagenesis studies have shown that catalytically inactive enzyme fully restores GAL and SIAL activity in GS cells (41).

Enzyme activity of CathA, which is not necessary for its protective function in the complex (41), was conserved throughout evolution, suggesting that CathA may have a dual function in vivo. This has inspired a search for its physiological peptide substrates. Several studies have shown that CathA from different organs and cells hydrolyzes in vitro some regulatory peptides, including substance P, angiotensin I, Met-enkephalin-Arg6-Phe7, and oxytocin (35, 39, 61, 62), suggesting that CathA may be involved in their metabolism. In particular, numerous works addressed the potential role of CathA in the hydrolysis of endothelin-1, a potent vasoconstrictive peptide, which also plays multiple roles in nonvascular tissues (reviewed in Ref. 63). CathA released from platelets and lymphocytes rapidly inactivates endothelin by cleaving its C-terminal amino acid residue, which is required for the expression of the peptide's biological activities (43, 64-66). Studies of cultured cells from GS patients deficient in CathA demonstrated that CathA is the only endothelin-converting enzyme in skin fibroblasts (67), and autopsy of brain tissues of three GS patients showed high endothelin-specific immunoreactivity. In contrast, we found normal levels of endothelin in the blood and urine of a 16-year-old female GS patient (68) who had a very low expression level of active CathA in skin fibroblasts. Our results, therefore, do not support the suggested role of CathA in endothelin degradation; however, we cannot eliminate the possibility that the residual CathA activity of our patient could be adequate to provide normal endothelin degradation. Others have reported that in vitro degradation of endothelin is catalyzed by neutral endopeptidase (69). Further experiments at the physiological level are required to understand whether CathA is indeed an endothelin-converting enzyme.

Zhou et al. (70) have developed a CathA-deficient mouse model that showed a combined deficiency of SIAL and CathA activities in tissues. CathA-deficient mice demonstrated a much more severe phenotype than that recently reported for SIAL-deficient (sialidosis) mice (71), suggesting that deficiency of CathA activity could contribute to the pathology of murine GS.
Unfortunately, GS mice develop progressive and diffuse edema, apparent ataxic movement, and tremor owing to secondary SIAL deficiency (70) and are therefore unsuitable for most physiological and behavioral studies. "Inverse immunoregulation" was used to specifically inhibit blood serum CathA in rats (72). The immunization did not affect blood pressure and heart rate regulation, which was not consistent with the involvement of CathA in the inactivation of endothelin or generation of angiotensin II. Instead, the immunized animals showed a long-term suppression of learning ability in active avoidance tests as well as of motor and exploratory activities in open-field tests, which suggested the involvement of CathA in the metabolism of peptides implicated in memory consolidation, such as corticotrophin and vasopressin(s).

B. β-Galactosidase

1. NAME AND DISTINGUISHING FEATURES

The correct name for the enzyme is GM1-ganglioside β-galactosidase, since there are two lysosomal enzymes able to cleave terminal β-linked galactose residues of glycoconjugates: GAL (EC 3.2.1.23), catalyzing the hydrolysis of GM1-ganglioside, and galactocerebrosidase (EC 3.2.1.46), catalyzing the hydrolysis of galactosylceramide, lactosylceramide, galactosylsphingosine, and monogalactosyl diglyceride. Galactocerebrosidase, whose inherited deficiency causes the lysosomal storage disease globoid-cell leukodystrophy, is not reviewed here. In addition, a β-galactosidase activity associated with cellular replicative senescence was detected in human smooth muscle cells, ovarian epithelial cells, fibroblasts, and keratinocytes (73, 74). The enzyme has a neutral pH optimum and is clearly distinct from lysosomal GAL and galactocerebrosidase.

2. ACTIVITY AND SPECIFICITY

GAL cleaves terminal β-1,4- or β-1,3-linked galactose residues from GM1-ganglioside, GA1 (asialo GM1)-ganglioside, lactosylceramide, asialofetuin, galactose-containing oligosaccharides, and keratan sulfate (Fig. 1). Artificial substrates include fluorogenic 4-methylumbelliferyl β-D-galactopyranoside and chromogenic p-nitrophenyl β-D-galactopyranoside. Several particularly sensitive techniques were developed for clinical diagnostics of the enzyme deficiency, including single-cell assays, microtiter plate assays, and HPLC (75-78). The pH optimum of GAL is 4.5-4.75. The enzyme is activated by chloride ions (79, 80) and is inhibited by mucopolysaccharides (81, 82). Several specific inhibitors of the enzyme, including N-bromoacetyl-β-galactosylamine (83), β-D-galactopyranosylmethyl-p-nitrotriazene (84), and 2,4-dinitrophenyl-2-fluoro-2-deoxy-β-D-galactopyranoside (85), were developed. Unlike hydrolysis of artificial substrates, the cleavage of gangliosides and lactosylceramide by GAL requires an activator protein, saposin B (SapB, also called saposin 1 or sulfatide activator protein) (86-88).
SapB, a 9-kDa glycoprotein derived together with two other activator proteins, saposin A and saposin C, from their common precursor prosaposin, binds the lipid part of the ganglioside, thus facilitating its solubilization and interaction with the enzyme. SapB also activates hydrolysis of sulfatides and trihexosylceramides by arylsulfatase A and α-galactosidase, respectively (reviewed in Ref. 89). In vitro, SapB can be replaced by a detergent, for example, taurocholate. SapB is not involved in the lysosomal hydrolysis of keratan sulfate or oligosaccharide chains of glycoproteins.

3. STRUCTURE

The tertiary structure of GAL has not been determined. However, the studies involving the suicide substrate of GAL, 2,4-dinitrophenyl-2-fluoro-2-deoxy-β-D-galactopyranoside, defined the catalytic mechanism and the active site residues of the enzyme (85, 90). The Glu268 residue, completely conserved among homologous galactosidases of mammalian, plant, and bacterial origin, was identified as the galactosylated nucleophile of the catalytic site (90). Mutation analysis in GM1-gangliosidosis patients (91) suggested another active site residue of GAL, Asp332. Asp332, which is also conserved between species, probably serves as the acid-base catalyst for the hydrolysis of the Glu268-galactosylated residue in the process of catalysis (92).

4. SYNTHESIS, PROCESSING, AND FUNCTION

The gene coding for human GAL is 62.5 kb long, contains 16 exons, and is localized on chromosome 3 (3p21.33) (93-95). The gene is transcribed in two splice variants (94, 96): a 2.4-kb mRNA coding for catalytically active GAL and a 2.0-kb mRNA coding for elastin/laminin-binding protein (97, 98), whose molecular properties and physiological role are reviewed in Section V. GAL is synthesized as a 677-amino acid precursor containing a 23-amino acid N-terminal signal peptide, which is cleaved on entry to the endoplasmic reticulum. The enzyme is heavily glycosylated: 7.5-9% of carbohydrate was estimated for the human liver enzyme (13, 14, 99), suggesting that three of seven putative Asn-glycosylation sites may be occupied by oligosaccharides. The glycosylated 84-kDa precursor of GAL is fully enzymatically active and has substrate specificity and pH optimum similar to those of the mature enzyme (85). The 84-kDa precursor is transported to the lysosomal/endosomal compartment and C-terminally processed into 64-kDa monomers. The maturation of the GAL precursor happens within ~2 h and requires its association with CathA, since in GS cells, in the absence of functional CathA, normal C-terminal maturation was not observed (100). Degradation of mature GAL involves its cleavage by cathepsin B or a similar protease to 18-kDa and 50-kDa polypeptides (101). Recently, van der Spoel (102) reported that a C-terminal 20-kDa polypeptide (amino acid residues 543-677) that is cleaved from the precursor stays tightly associated with the 64-kDa polypeptide and is required for the catalytic activity of GAL. This interesting finding has not yet been independently confirmed.
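The residue numbers quoted for the GAL precursor lend themselves to a quick arithmetic check. The sketch below counts residues from the figures in the text and converts them to rough polypeptide masses using an average of about 110 Da per amino acid residue, a common rule-of-thumb approximation that is an assumption here rather than a value from the text. Because GAL is glycosylated, the apparent masses reported for the intact forms (84, 64, and 20 kDa) are expected to exceed these polypeptide-only estimates.

```python
# Rule-of-thumb average residue mass; an assumption, not a value from the text.
AVG_RESIDUE_MASS_KDA = 0.110

# Figures quoted in the text.
PRECURSOR_LENGTH = 677
SIGNAL_PEPTIDE_LENGTH = 23
C_TERMINAL_FRAGMENT = (543, 677)  # first and last residue, inclusive


def residue_count(first: int, last: int) -> int:
    """Number of residues in an inclusive residue range."""
    return last - first + 1


def approx_polypeptide_mass_kda(n_residues: int) -> float:
    """Rough unglycosylated polypeptide mass for a chain of n_residues."""
    return n_residues * AVG_RESIDUE_MASS_KDA


residues_after_signal = PRECURSOR_LENGTH - SIGNAL_PEPTIDE_LENGTH
c_terminal_residues = residue_count(*C_TERMINAL_FRAGMENT)

print(f"precursor without signal peptide: {residues_after_signal} residues, "
      f"~{approx_polypeptide_mass_kda(residues_after_signal):.1f} kDa polypeptide")
print(f"C-terminal fragment 543-677: {c_terminal_residues} residues, "
      f"~{approx_polypeptide_mass_kda(c_terminal_residues):.1f} kDa polypeptide")
```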
GAL is present in a wide variety of human tissues and body fluids, being most abundant in systemic organs such as liver and kidney, as well as in skin and blood cells. Three major groups of in vitro substrates of GAL are sphingolipids, complex oligosaccharides of glycoproteins and keratan sulfate. Major evidence for GAL involvement in catabolism of these substrates was obtained from the analysis of the storage products in patients affected with the inherited deficiency of GAL and will be reviewed in Section VI,A. In white blood cells, GAL has an additional function (103, 104). Together with SIAL, it participates in the processing ofoligosaccharide chains of so-called "group-specific component," converting it to a potent activator of macrophages (see also Section III,C,4). This suggests that GAL is potentially involved in the development of immune response, which is especially interesting in the context of immunodeficiency observed in the mouse model with knockout GAL gene

(105).



C. Sialidase (Neuraminidase)

1. NAME AND DISTINGUISHING FEATURES

Sialidase (α-neuraminidase, EC 3.2.1.18) catalyzes the hydrolysis of terminal sialic acid residues of glycoconjugates. Sialidases have been well studied in viruses and bacteria, where they destroy the sialic acid-containing receptors at the surface of host cells (106-108) and mobilize bacterial nutrients (109). In mammals, three types of sialidases have been described: lysosomal, plasma membrane, and cytosolic, which have different enzymatic and biochemical properties (110-112) and are encoded by different genes (113-115). Cytosolic sialidase has a pH optimum of 6.0 and is active against α2→3-sialylated oligosaccharides, glycopeptides, and gangliosides (116). The exact biological role of this enzyme is not known, but it was suggested that it may cleave GM3-ganglioside associated with the cytoskeleton, leading to the alteration of cytoskeletal functions (117, 118). In accordance with this, the cytosolic sialidase activity of melanoma cells inversely correlates with their invasive and metastatic potential (119). Plasma membrane sialidase is an integral membrane protein with a pH optimum of 4.5, active mostly against gangliosides including GM1, GD1a, and other polysialogangliosides (112). It is probably involved in the modulation of the oligosaccharide chains of gangliosides on the cell surface in the course of transformation, differentiation, and formation of cell contacts (120, 121). Lysosomal SIAL has a pH optimum of 4.2 and is active against both glycolipids and glycopeptides, but not against glycoproteins such as fetuin or submaxillary mucin (112, 116, 122, 123). Despite significant efforts, the identification, cloning, and sequencing of SIAL were hampered for almost two decades by the low tissue content and instability of the enzyme.
These studies benefited from the rapid progress of the Human Genome Project, when three groups simultaneously cloned and sequenced human (124-126) and mouse (127-129) SIAL cDNA, either by homologous search in the expressed sequence tags database (dbEST; National Center for Biotechnology Information) or by direct sequencing of human chromosome 6.

2. ACTIVITY AND SPECIFICITY

Characterization of the storage products in urine and cultured fibroblasts from patients affected with sialidosis revealed that sialylated oligosaccharides are the major natural substrates for SIAL (130-133). The involvement of SIAL in the hydrolysis of sialylated gangliosides has been a matter of debate for a long time. The results of the analysis of the storage products in the autopsy materials from sialidosis and GS patients were controversial. For some patients, a severalfold increase of GM3- and GD3-gangliosides was reported in systemic organs (134) and brain (135), although in other similar cases (136) or in the knockout mouse model of GS (70), storage of gangliosides was not observed. The



cultured fibroblasts of sialidosis and GS patients treated with radioactively labeled GM1-ganglioside accumulated GM3-ganglioside (137), strongly suggesting SIAL involvement in the degradation of this glycolipid in vivo. Further studies (112, 122, 123) showed that the hydrolysis of gangliosides by SIAL depends on the presence of detergents, sodium cholate, or taurodeoxycholate (Triton X-100 activates plasma membrane sialidase but has no effect on SIAL), suggesting that in vivo this reaction requires activator proteins. That was confirmed by Fingerhut et al. (138), who showed that SIAL cleaved GM3-, Gma-, and GT1b-gangliosides in the presence of SapB. The complete hydrolysis of Gmb-ganglioside to lactosylceramide by a glycoprotein fraction from human placenta containing essentially all soluble lysosomal enzymes required the presence of two activators: SapB, for the reactions catalyzed by GAL and SIAL; and GM2-activator, for the reaction of GM2-to-GM3 conversion catalyzed by hexosaminidase A. Gangliosides GD1b, GM1, and GM2 were extremely poor substrates for SIAL (138). The last conclusion was, however, reconsidered when asialylated GM1- and GM2-gangliosides, GA1 and GA2, respectively, were found among major storage products in knockout mouse models of GM1-gangliosidosis (105) and Sandhoff disease (combined deficiency of hexosaminidase A and B) (139). Sandhoff mice resembled a human phenotype of profound neurodegenerative disease, while the mouse model of Tay-Sachs disease, depleted of hexosaminidase A only, remained asymptomatic to at least 1 year of age because GA2-ganglioside was further efficiently cleaved by hexosaminidase B (139). Therefore, the mouse model of Tay-Sachs disease has revealed a metabolic bypass based on the potent activity of SIAL toward GM2 (Fig. 1b). To determine whether increasing the level of SIAL would produce a similar effect in human Tay-Sachs cells, Igdoura et al.
(140) introduced a human SIAL cDNA into neuroglia cells derived from a Tay-Sachs fetus and demonstrated a dramatic reduction in the accumulation of GM2, proving the involvement of SIAL in the hydrolysis of this ganglioside.

3. STRUCTURE

So far, SIAL has not been crystallized. However, structural models of lysosomal SIAL were built using the atomic coordinates of homologous sialidases from Micromonospora viridifaciens (EUR), Salmonella typhimurium (SIL), and Vibrio cholerae (KIT) as templates (141, 142). Analysis of the deduced structure (Fig. 2; see color insert) indicated that human lysosomal SIAL shares the same fold as bacterial and viral sialidases, consisting of six four-stranded antiparallel β-sheets arranged as the blades of a propeller around a pseudo six-fold axis (143-145). Viral sialidases are tetramers of four identical β-propellers (143), whereas some bacterial sialidases contain additional domains built around the central canonical fold (146, 147). These additional domains, which are usually involved in carbohydrate recognition,



are not present in human lysosomal SIAL. Despite the low sequence identity (15% between the bacterial and viral sialidases and about 30% between different bacterial sialidases), the topology of the catalytic domain and the active site residues is strictly conserved in these enzymes. This architecture of the active site is also conserved in human SIAL. In particular, Arg78 is probably one of the residues responsible for binding the sialic acid carboxylate group. The other two Arg residues (Arg280 and Arg ) stabilize the carboxylic group in the active site. The conserved Asp135 probably binds the N-acetyl/N-glycolyl group of the substrate. The Glu394 residue, which stabilizes the position of Arg78 through a hydrogen bond, as well as Tyr370 and Glu264, which are connected by a hydrogen bond and may donate a proton in the process of the substrate hydrolysis (145-147), are also conserved. Asp103 is either a proton donor for the glycosidic bond or a stabilizer of the proton-donating water molecule. So-called "Asp-box" motifs, which are found in all bacterial and mammalian sialidases (Ser/Thr-X-Asp-X-Gly-X-X-Trp/Phe), are also conserved in human SIAL. These repeats are always located between the third and the fourth β-strand of each sheet (βD and βE, βH and βI, βN and βO, βS and βT). Asp-boxes always have a similar arrangement, with the aromatic residues packed into the hydrophobic core stabilizing the turn, whereas the hydrophilic Asp residues are solvent-exposed. Gaskel et al. (146) have reported that similar motifs are also present at topologically conserved positions in the eight-bladed β-propeller structure of bacterial methanol and methylamine dehydrogenases as well as in the seven-bladed fungal galactose and glyoxal oxidases, suggesting that these enzymes have evolved from the same four-bladed precursors through a gene duplication.

4. SYNTHESIS, PROCESSING, AND FUNCTION

The gene coding for SIAL was mapped to human chromosome 6 (6p21.3)

(124, 125, 148) and mouse chromosome 17 (149) inside the locus of the major histocompatibility complex. Both human and mouse genes contain five introns and six exons each. In the human gene, intron 1 (424 bp) starts after nucleotide 159 in the cDNA, intron 2 (547 bp) after nucleotide 352, intron 3 (564 bp) after nucleotide 615, intron 4 (174 bp) after nucleotide 798, and intron 5 (96 bp) after nucleotide 1021. The total length of the gene from the initiating to the stop codon is 3.051 kb (141). In all human tissues, only a single splice product was detected: a 1245-bp SIAL mRNA coding for a 415-amino acid protein precursor. After the cleavage of the 47-amino acid N-terminal signal peptide and glycosylation, SIAL becomes a 48.3-kDa mature active enzyme. However, the exact mechanism of sorting of the SIAL precursor still remains unclear. Comparing the intracellular distribution of human SIAL expressed in COS-1 cells transfected with SIAL cDNA or



cotransfected with SIAL and human CathA cDNA, Van der Spoel et al. (150) suggested that SIAL associates with the CathA precursor shortly after synthesis and that this complex is targeted to the lysosome using a mannose-6-phosphate receptor-dependent pathway, while in the absence of CathA, SIAL is partially secreted and partially segregates to the endosomal compartment. In contrast, numerous data demonstrated the existence in the lysosome of two SIAL pools, soluble and membrane-associated. Both forms are absent in cultured cells of sialidosis patients and are therefore encoded by a single gene (110, 111, 151, 152). In addition, the product of the same gene is also found on the surface of activated T lymphocytes (153, 154). Immunoelectron microscopy (155) has also demonstrated the presence of SIAL on the inner side of the lysosomal membrane, lysosomal lumen, plasma membranes, and intracellular (possibly endocytic) vesicles in both normal and GS cells, suggesting that intralysosomal targeting of SIAL may happen in the absence of CathA. Analysis of the deduced amino acid sequence of SIAL (125) revealed that a C-terminal tetrapeptide, 412YGTL415, has similarity to the internalization signal Tyr-X-X-Φ (hydrophobic residue) previously determined in cytoplasmic domains of several lysosomal membrane proteins, including glucocerebrosidase, LAMP-1, LAMP-2, LGP-85, and LDL, as well as endocytosed surface receptors, including transferrin, asialoglycoprotein, polymeric immunoglobulin, and cation-independent mannose-6-phosphate receptors (reviewed in Refs. 156-158). All these proteins are transported to the lysosome via clathrin-coated vesicles by a mechanism that involves the association of an internalization signal with a μ2 subunit of the HA-2 adaptor complex (157, 158). Recently (K. E. Lukong et al., submitted), we obtained direct evidence that the C-terminal tetrapeptide of SIAL, 412YGTL415, is the tyrosine-containing internalization signal.
We mutated the Y412 and L415 residues of SIAL and showed that these mutants are sorted to the plasma membrane, but are not further internalized. Our results suggested that lysosomal SIAL is an integral membrane protein containing a single transmembrane domain and a short cytoplasmic tail carrying the internalization signal. It is possible that in the lysosome the transmembrane domain can be cleaved similarly to that of acid phosphatase, resulting in the appearance of the soluble pool of the enzyme; however, experimental evidence of that has not been obtained. The role of SIAL in the intralysosomal catabolism of sialylated glycolipids and glycoproteins is well established. Multiple data, however, suggest that SIAL plays an additional role in cellular signaling. First, SIAL of T lymphocytes converts so-called vitamin D3-binding protein (also known as group-specific component or Gc protein) into a factor necessary for the inflammation-primed activation of macrophages (103, 104, 154). Second, SIAL of T cells is required for the production of cytokine IL-4, the potent regulator of many hemopoietic and nonhemopoietically derived cells and tissues. SIAL is involved both in early

FIG. 2. Ribbon drawing of the SIAL model. The side chains of the deduced active site residues are shown in blue.


FIG. 3. Schematic drawing showing binding of the GAL monomer to the CathA dimer in their 680-kDa complex. The GAL monomer is represented by a sphere with a radius of 28 Å. CathA peptides potentially involved in GAL binding are shown in red and blue, carbohydrate chains in pink, and active site residues in green.

FIG. 4. Schematic diagram of the CathA monomer. The catalytic triad residues are shown in blue. The mutations identified in GS patients localized in the core domain and potentially causing misfolding of the protein are shown in red. Mutations localized in the α-helical cap domain are shown in green. Reproduced from [212]. Copyright (1998) National Academy of Sciences, U.S.A.

FIG. 5. Schematic diagram of the SIAL model, showing the location of mutations identified in sialidosis patients. Mutations localized in the putative SIAL-CathA binding site are shown in red; mutations in the active site residues or those that may affect the positions of the active site residues, in blue; and mutations that do not cause obvious structural changes, in green.



production of IL-4 and in the IL-4 priming processing of conventional T cells to become active IL-4 producers (159, 160). The T cells derived from SM/J or B10.SM strains of mice deficient in lysosomal SIAL due to the point mutation in its gene (127, 129) failed to convert Gc and synthesize IL-4, whereas B cells of these mice were not able to produce IgG1 and IgE after the immunization with pertussis toxin (103, 153, 159). Our studies of the promoter of the human SIAL gene showed that SIAL expression is potently induced by the proinflammatory factors and is inhibited by curcumin and N-acetylcysteine, which have been shown to inhibit inflammatory responses in different tissues (V. Seyrantepe et al., in preparation). Most probably, the role of SIAL in immune response is connected with the desialylation of surface antigen-presenting molecules such as MHC class I, which is required to render T cells responsive to antigen-presenting cells (162). In addition, SIAL may regulate IL-4 production in CD4(+) T cells by reducing the content of GM3-ganglioside on the cell surface, which, in turn, modulates Ca2+ immobilization (160). Although further study of the mechanisms controlling SIAL expression is required for complete understanding of this process, the suggested mechanism of SIAL sorting, first to the plasma membrane and then to the lysosome, correlates well with its dual physiological role: intralysosomal catabolism of sialylated glycoconjugates and cellular signaling during the immune response.
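The intron positions quoted for the human SIAL gene in Section III,C,4 lend themselves to a small bookkeeping check. The sketch below derives the six exon lengths from the cDNA coordinates and adds the intron lengths to estimate the genomic span. It rests on an assumption that is not stated in the text: that cDNA numbering starts at the first nucleotide of the initiating codon and that the coding region ends at nucleotide 1245 (415 codons). All function and variable names are illustrative only.

```python
# Exon/intron bookkeeping for the human SIAL gene, using only numbers quoted in the text.
# ASSUMPTION (not from the source): cDNA position 1 is the first nucleotide of the
# initiating codon, and the coding region ends at nucleotide 1245 (stop codon excluded).

CODING_LENGTH = 1245  # bp; 415 codons x 3

# (last cDNA nucleotide before the intron, intron length in bp)
INTRONS = [(159, 424), (352, 547), (615, 564), (798, 174), (1021, 96)]


def exon_lengths(coding_length, introns):
    """Return exon lengths (bp) implied by the intron insertion points."""
    boundaries = [0] + [position for position, _ in introns] + [coding_length]
    return [end - start for start, end in zip(boundaries, boundaries[1:])]


def genomic_span(coding_length, introns):
    """Coding sequence plus all introns, in bp."""
    return coding_length + sum(length for _, length in introns)


if __name__ == "__main__":
    print("exon lengths (bp):", exon_lengths(CODING_LENGTH, INTRONS))
    print("genomic span (bp):", genomic_span(CODING_LENGTH, INTRONS))
```

Under these assumptions the six exons sum back to 1245 bp and the estimated span comes out within a few base pairs of the quoted 3.051 kb; the exact figure depends on whether the stop codon is counted, which the text does not specify.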

D. N-Acetylgalactosamine-6-sulfate Sulfatase

1. NAME AND DISTINGUISHING FEATURES

Analysis of the storage products in patients affected with Morquio disease suggested that they are deficient in an enzyme that cleaves both keratan sulfate and chondroitin sulfate. Further studies confirmed the existence of the enzyme, which was called N-acetylhexosamine or N-acetylgalactosamine-6-sulfatase (163, 164). The enzyme was later shown to be specific exclusively against 6-sulfated galactose and N-acetylgalactosamine (165). Currently it is called N-acetylgalactosamine-6-sulfatase or galactose-6-sulfatase (EC 3.1.6.4), abbreviated here as GALNS. The cDNA of human GALNS has been cloned and sequenced (166).

2. ACTIVITY AND SPECIFICITY

GALNS has dual enzymatic activity. It cleaves the sulfate ester bond from both galactose-6-sulfate and N-acetylgalactosamine-6-sulfate (Fig. 1a). The ratio between the catalytic activities of GALNS toward these substrates is ~1:140 (167). For both substrates, the pH optimum is between 3.8 and 4. Inorganic sulfate and phosphate ions are potent competitive inhibitors of GALNS with Ki of 35 and 200 μM, respectively (165, 167). The enzyme is also inhibited by Hg2+ and Cl- ions (168, 169).



The tritiated substrate 6-sulfo-N-acetylgalactosamine glucuronic acid 6-sulfo-N-acetyl[1-3H]galactosaminitol is usually used for the assay of GALNS activity and for the biochemical diagnostics of Morquio A disease (169). A fluorescent substrate, 4-methylumbelliferyl-β-D-galactopyranoside-6-sulfate, which requires the sequential action of GALNS and GAL for the enzymatic liberation of the fluorochrome, has also been described (170).

3. STRUCTURE

GALNS has not yet been crystallized, but since the enzyme is a member of a highly conserved family of sulfatases (166) and shares with them an extensive sequence homology, Sukegawa et al. (171) have constructed a model of human GALNS based on x-ray crystal structures of N-acetylgalactosamine-4-sulfatase and arylsulfatase A. The model was used to study the structural changes induced by 32 missense mutations previously identified in the GALNS gene in Morquio A patients. The mutations that produced the most severe clinical phenotype were shown to modify the active site or to destabilize the entire conformation of the enzyme by destroying the hydrophobic core, modifying its packing, or removing salt bridges. In contrast, mutations identified in Morquio A patients with a mild clinical phenotype were mostly located on the surface of the GALNS protein (171).

4. SYNTHESIS, PROCESSING, AND FUNCTION

The gene coding for GALNS maps to 16q24.3 (172, 173). It is about 50 kb long and contains 14 exons (174, 175). Single splicing results in a 1566-bp cDNA, which encodes a 522-amino acid polypeptide. The deduced amino acid sequence is composed of a 26-amino acid N-terminal signal peptide and a mature polypeptide of 496 amino acid residues, including two potential Asn-linked glycosylation sites. The 60-kDa enzyme precursor is transported to the lysosome using a mannose-6-phosphate receptor pathway and, upon arrival, is cleaved to 40- and 15-kDa polypeptides (167).
The purified enzyme forms a 120-kDa homodimer, but at least 50% of GALNS in tissues is found associated with LMC (27). GALNS activity and cross-reacting material were reduced in the fibroblasts of patients affected with GS, indicating that association with the complex may protect GALNS in the lysosome (27). The potential biological role of GALNS association with LMC is reviewed in Sections IV and VI,A. In the lysosome, GALNS is involved in the cleavage of the sulfate residue from N-acetylgalactosamine-6-sulfate at the nonreducing end of chondroitin-6-sulfate and from galactose-6-sulfate at the nonreducing end of keratan sulfate (163, 169). Both glycosaminoglycans are stored in Morquio type A patients with the inherited deficiency of GALNS, resulting in generalized skeletal dysplasia (reviewed in Ref. 176).
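The sequence and mass figures quoted for GALNS in this section can be cross-checked with simple arithmetic. The following sketch, offered only as an illustration, confirms that the cDNA length matches the polypeptide length, that signal peptide plus mature chain account for the whole precursor, and that the homodimer mass is twice the precursor mass. The constant and function names are hypothetical; the values are the nominal ones given in the text.

```python
# Consistency checks on the GALNS figures quoted in the text (nominal values).
CDNA_BP = 1566           # coding cDNA length, bp
PRECURSOR_AA = 522       # full-length polypeptide, residues
SIGNAL_AA = 26           # N-terminal signal peptide, residues
MATURE_AA = 496          # mature polypeptide, residues
PRECURSOR_KDA = 60       # enzyme precursor, kDa
FRAGMENTS_KDA = (40, 15) # lysosomal cleavage products, kDa
HOMODIMER_KDA = 120      # purified homodimer, kDa


def codons(cdna_bp):
    """Number of codons in a coding sequence; rejects lengths not divisible by 3."""
    if cdna_bp % 3 != 0:
        raise ValueError("coding length is not a multiple of 3")
    return cdna_bp // 3


def galns_checks():
    """Return a dict of named boolean consistency checks."""
    return {
        "cdna_matches_polypeptide": codons(CDNA_BP) == PRECURSOR_AA,
        "signal_plus_mature": SIGNAL_AA + MATURE_AA == PRECURSOR_AA,
        "dimer_is_two_precursors": 2 * PRECURSOR_KDA == HOMODIMER_KDA,
    }


if __name__ == "__main__":
    for name, ok in galns_checks().items():
        print(f"{name}: {ok}")
    print("sum of cleavage fragments (kDa):", sum(FRAGMENTS_KDA))
```

The two cleavage products sum to 55 kDa, slightly less than the 60-kDa precursor; the text does not account for the difference, so the sketch only reports it.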



IV. Composition, Stoichiometry, and Structure of Lysosomal Multienzyme Complex

The data on composition and stoichiometry of the complex as well as the structural and quantitative relationships between its components in the lysosome are controversial. Hoogeveen et al. (18) suggested that about 85% of GAL in human skin fibroblasts forms a complex with CathA, whereas later studies (44) reduced this number to 30%. D'Agrosa et al. (177) reported that only about 12% of GAL and 20-25% of CathA in cultured skin fibroblasts are associated together. Further studies, however, showed that these discrepancies resulted from the instability of LMC, which exists in equilibrium with its free components and readily dissociates at low protein concentrations and neutral pH. Two distinct forms of the complex were purified from human placenta by affinity chromatography and separated by gel filtration: a 1.27-MDa complex containing CathA, GAL, SIAL, and GALNS, and a 680-kDa complex containing CathA and GAL only (27, 45, 60, 178). The two forms are in dynamic equilibrium, but the average ratio between the 1.27-MDa complex and the 680-kDa complex is 1:10 (27, 45, 155). Similar data were reported for other tissues (23, 122, 123, 179, 180). At neutral pH, the 680-kDa GAL-CathA complex can be separated into its components, 100-kDa CathA dimers and 320-kDa GAL tetramers, and again reconstituted in vitro by mixing the components at pH 4.75 (60, 180). Reconstitution experiments showed that the CathA-GAL complex is made up of four GAL and eight CathA molecules (60, 181). Since the 680-kDa GAL-CathA complex represents the major oligomeric form of GAL in the lysosome (45), the understanding of its structural organization can help to reveal the molecular mechanism of GAL protection by CathA. Radiation inactivation analysis has demonstrated the presence in the complex of a 168-kDa structural subunit containing both GAL and CathA (178).
Chemical crosslinking of the complex confirmed the existence of this subunit and showed that it is composed of one CathA dimer and one GAL monomer. Computing the solvent accessibility of the amino acid residues of the CathA dimer suggested that a putative GAL-binding interface on CathA is a cavity formed by the association of two CathA monomers and large enough to accommodate the GAL molecule with a Stokes radius of ~28 Å (Fig. 3; see color insert). Synthetic peptides corresponding to the exposed loops of CathA bordering the cavity were demonstrated (1) to dissociate the purified GAL-CathA complex, (2) to hamper the reassociation of CathA with GAL, and (3) to bind to purified GAL. The determined location of GAL monomers in the complex, with 35% of their surface covered by CathA dimers, may explain the stabilizing effect of CathA on GAL in lysosomes (178). Moreover, since the predominant oligomeric structure of GAL in the absence of CathA is a homotetramer (180, 182), it is tempting to speculate that the 168-kDa CathA-GAL subunits are



arranged in the complex in such a way that the GAL tetramer becomes an inner core. In this case, GAL would become completely inaccessible to lysosomal proteases. Interestingly, Itoh et al. (183) observed an opposite stabilizing effect of GAL on CathA at neutral pH and suggested that this may be physiologically important for stabilizing CathA secreted by platelets in the blood plasma. The 1.27-MDa complex is probably formed by the association of SIAL and GALNS with the 680-kDa GAL-CathA "basic" complex. This complex could never be purified in considerable amount, so its exact composition and stoichiometry are not clear. SDS-PAGE analysis of the complex purified from human placenta by affinity chromatography on either a GAL-binding or a CathA-binding column, FPLC gel filtration, and chromatofocusing (27, 155) revealed multiple protein components. The presence of the following six fragments was confirmed by several studies and did not depend on the method of purification or tissue of origin: 64-kDa subunits of GAL, 48-kDa subunits of SIAL, a 45-kDa degradation product of GAL, a 40-kDa subunit of GALNS, and 32- and 20-kDa subunits of CathA. Other components identified on different occasions by N-terminal sequencing most probably were bound to LMC nonspecifically. These proteins included heavy chains and κ-chains of immunoglobulins as well as the so-called MAC-2 binding protein, which is secreted into the extracellular matrix and breast milk and shows immunostimulatory properties (24, 155, 184; A. V. Pshezhetsky et al., unpublished). Two more lysosomal glycosidases, β-glucuronidase (155) and α-N-acetylgalactosaminidase (185), were identified among the components of the 1.27-MDa complex, but it still is not clear if the association of these enzymes with LMC is functionally important.
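The stoichiometry proposed in this section can be sanity-checked against the reported masses. The sketch below is a rough mass balance only: it uses the nominal masses quoted in the text (100-kDa CathA dimer, 320-kDa GAL tetramer, 168-kDa structural subunit) and compares two ways of building the 680-kDa complex. Gel filtration and radiation inactivation give approximate masses, so small discrepancies are expected; the names used here are illustrative.

```python
# Rough mass balance for the GAL-CathA complex, using nominal masses (kDa) from the text.
CATHA_DIMER = 100      # CathA dimer released at neutral pH
GAL_TETRAMER = 320     # GAL tetramer released at neutral pH
SUBUNIT = 168          # structural subunit: one CathA dimer + one GAL monomer
BASIC_COMPLEX = 680    # observed GAL-CathA complex
LARGE_COMPLEX = 1270   # observed complex that also contains SIAL and GALNS


def mass_from_components(n_gal=4, n_catha=8):
    """Mass expected for n_gal GAL and n_catha CathA molecules."""
    return (n_gal / 4) * GAL_TETRAMER + (n_catha / 2) * CATHA_DIMER


def mass_from_subunits(n_subunits=4):
    """Mass expected if the complex is built from 168-kDa subunits."""
    return n_subunits * SUBUNIT


def relative_error(estimate, observed):
    """Signed fractional deviation of an estimate from the observed mass."""
    return (estimate - observed) / observed


if __name__ == "__main__":
    by_parts = mass_from_components()
    by_subunits = mass_from_subunits()
    print(f"4 GAL + 8 CathA: {by_parts:.0f} kDa "
          f"({relative_error(by_parts, BASIC_COMPLEX):+.1%} vs 680 kDa)")
    print(f"4 x 168-kDa subunits: {by_subunits:.0f} kDa "
          f"({relative_error(by_subunits, BASIC_COMPLEX):+.1%} vs 680 kDa)")
    print(f"mass not accounted for by the basic complex in the large form: "
          f"{LARGE_COMPLEX - BASIC_COMPLEX} kDa")
```

Both estimates land within about 6% of the observed 680 kDa, which is consistent with a complex of four GAL and eight CathA molecules arranged as four 168-kDa subunits. The remaining mass of the 1.27-MDa form is left unassigned here because, as noted above, its composition is not established.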

V. Plasma Membrane Complex of Elastin-Binding Protein, Cathepsin A, and Sialidase

Cloning of the GAL gene suggested the existence of an alternatively spliced mRNA, which lacks exons 3, 4, and 6 of GAL and contains a frameshift in exon 5 resulting in the appearance of a new amino acid sequence (94, 96). This mRNA produces a 546-amino acid protein (67 kDa after glycosylation) that does not have enzyme activity and is sorted to the plasma membrane and perinuclear area of the cytoplasm (96, 98). The biological function of the alternatively spliced form of GAL was revealed several years later, when it was shown to be identical to elastin-binding protein (EBP) (97, 98). EBP (reviewed in Ref. 186) is a major component of the nonintegrin cell surface receptor expressed in all elastin-producing cells, including fibroblasts, smooth muscle cells, chondroblasts, leukocytes, and several human tumor cells (melanoma, lung cancer, and breast cancer). It recognizes several nonidentical hydrophobic domains on elastin (repeated hexapeptide VGVAPG), laminin, and type IV collagen. The



binding depends on the conformation of the ligand and EBP and is inhibited by galactosugars, which therefore modulate cell-matrix interactions. The EBP receptor complex may be involved in the regulation of the migratory behavior of the cells during vascular thickening, tumor cell metastasis, or tissue infiltration by the leukocytes. In elastin-producing cells, EBP serves as a molecular chaperone of tropoelastin, facilitating its intracellular transport and extracellular assembly. The EBP complex is composed of three types of subunits, a 67-kDa EBP and two subunits of 55 kDa and 61 kDa. These two subunits (or one of them) are responsible for binding the whole complex to the plasma membrane. The 55-kDa subunit reacted on Western blot with antibodies raised against human CathA (186). The same antibodies colocalized with the EBP complex on the surface of aortic smooth muscle cells, suggesting that the 55-kDa subunit is identical to a 54-kDa CathA precursor. Being transported to the cell surface, the CathA precursor lacks processing to 30- and 20-kDa subunits, which occurs only in the lysosome; therefore, it does not have enzymatic activity (58), but can fulfil its protective function toward EBP. It is tempting to speculate that the third component of the EBP complex is identical to SIAL. Targeting of SIAL to the plasma membrane followed by its internalization into the endosomal compartment (see Section III,C,4) perfectly matches the intracellular sorting of EBP. After the synthesis, EBP binds tropoelastin and escorts it through secretory pathways to the plasma membrane. Some of the protein stays on the plasma membrane associated with elastic fibers, but the majority of EBP after dissociation from tropoelastin is internalized and sorted first to the early and then to the recycling endosome, where it associates with newly synthesized tropoelastin (186).
Since neither EBP itself nor CathA precursor is able to bind with the membrane, the sorting and functional activity of the EBP complex should depend on the presence of the 61-kDa subunit. Therefore, the observed failure of cells from sialidosis patients, who lack functional SIAL, to assemble elastic fibers (92) is consistent with the suggestion that SIAL is a component of EBP complex and serves as an anchor that attaches the complex to the endosomal or plasma membrane.

VI. Molecular Pathology of Lysosomal Multienzyme Complex

A. β-Galactosidosis: GM1-Gangliosidosis and Morquio B Disease

Inherited deficiency of GAL in humans results in two clinically and biochemically different diseases: GM1-gangliosidosis and Morquio syndrome type B. GM1-Gangliosidosis is a neurosomatic disease clinically close to other gangliosidoses



such as Sandhoff and Tay-Sachs syndromes. Morquio B disease is classified under genetic mucopolysaccharidoses (MPS) because clinically it is a milder form of MPS IV A (Morquio syndrome type A). Although GM1-gangliosidosis and Morquio B disease are currently combined under the common name β-galactosidosis (187), we review them separately because different molecular mechanisms underlying these disorders result in different spectra of biochemical defects and clinical features.

1. GM1-GANGLIOSIDOSIS

According to the age of onset and clinical manifestations, GM1-gangliosidosis is subdivided into three types: infantile (type 1), late infantile/juvenile (type 2), and adult/chronic (type 3). Infantile patients usually present between birth and the sixth month of age with a progressive neurologic deterioration, facial dysmorphism, hepatosplenomegaly, macular cherry-red spots, and generalized skeletal dysplasia. Age of onset in late infantile/juvenile patients is usually between 7 months and 3 years. Clinical features are similar to those of infantile GM1-gangliosidosis patients but are milder, especially for the patients with relatively late onset. Skeletal dysplasia may not be present, but has been reported for many patients. The adult/chronic form of GM1-gangliosidosis has been described mainly in patients of Japanese origin (188). Early development of these patients is normal. The disease manifests between 3 and 30 years of age, usually as progressive cerebral ataxia, action myoclonus, cherry-red spots, and corneal clouding. Dysmorphism may be present, but in some patients it is not obvious. The spectra of storage products are different for the type 1, type 2, and type 3 patients, but generally include all three types of GAL natural substrates: glycosphingolipids, oligosaccharides, and keratan sulfate. GM1-Ganglioside, which gave its name to the disease, is the main storage product (189).
The level of GM1-ganglioside stored in the brain does not differ significantly between infantile and late infantile/juvenile patients (190-192). In adult/chronic cases, GM1-ganglioside is detected only in specific parts of the brain (193). Asialo-GM1 (GA1)-ganglioside has also been detected in considerable amount. Heterogeneous galactose-containing oligosaccharides are stored in liver and excreted in urine of GM1-gangliosidosis patients of all clinical types. Analysis of their structure indicated that they are derived from complex Asn-linked oligosaccharide chains of glycoproteins, such as immunoglobulins or erythrocyte stromal glycoproteins (analyzed in Ref. 187). This finding indicates that GAL deficiency blocks catabolism of glycoproteins at a step subsequent to their hydrolysis by SIAL (Fig. 1c). Water-soluble mucopolysaccharides are detected in large amounts in liver and spleen of GM1-gangliosidosis patients (191, 194). In several reports these storage products were identified as keratan sulfate on the basis of its composition and electrophoretic mobility (191, 194), while others detected the presence of two forms of keratan sulfate, normal and undersulfated (195, 196).



2. MORQUIO B DISEASE

Morquio B disease manifests as a generalized skeletal dysplasia (197). Symptoms resemble those of Morquio A patients (inherited GALNS deficiency), but the clinical course is more prolonged, so the patients usually develop normally until early childhood, when they start to show growth retardation (198, 199). Patients do not develop facial dysmorphism or visceromegaly. No cherry-red spots have been recorded, but corneal clouding has been described in many cases (197, 200). Neurological manifestations, with the exception of several cases (199), are negative. Storage compounds in Morquio B patients mostly consist of keratan sulfate (197-199, 201). In one case, undersulfated keratan sulfate was detected as a main storage product (200). Storage of glycoprotein-derived oligosaccharides has also been reported. GM1-Ganglioside is not stored in Morquio B patients and is normally converted by cultured skin fibroblasts derived from them (202). Discovery of Morquio B disease raised the question of why GAL deficiency results in two distinct disorders with different biochemical backgrounds. A partial answer to this question was obtained from the analysis of molecular defects in these diseases. About 40 different mutations in the GAL gene have been reported in GM1-gangliosidosis patients (reviewed in Refs. 92 and 187). Genotype and biochemical lesions show strong correlation with the phenotype, the most severe being reported for frameshift or splice mutants that do not produce functional protein. In contrast, the Morquio B phenotype in 12 of 13 reported cases is caused by the W273L mutation, and in one case by the W509C change. Based on the sequence homology of GAL to a bacterial 6-phospho-β-galactosidase, whose tertiary structure has been resolved by x-ray analysis, Callahan (92) predicted that the W273 residue is involved in substrate binding.
This led to the hypothesis that W273L mutation specifically changes the affinity of GAL for keratan sulfate, but not for GMl-ganglioside, since the binding of ganglioside in the active site of GAL requires formation of a triple complex GAL-GMl-ganglioside-SapB (203). This hypothesis is supported by the expression studies of GAL W273L mutant, which showed a significantly decreased affinity of GAL toward the synthetic substrate (204). The accumulation of keratan sulfate occurs in both G~al-gangliosidosis and Morquio B patients. The amount of stored oligosaccharide is proportional to the residual activity of GAL. Infantile and late infantile GMl-gangliosidosis patients with low residual activity of GAL store significant amount of keratan sulfate, whereas in adult/chronic patients the product is hardly detectable. Our studies (27) showed that GALNS is a part of LMC and that keratan sulfate is also stored in GS patients with primary CathA deficiency. Therefore, accumulation of keratan sulfate happens in all conditions in which LMC is disrupted, suggesting that the association of GAL and GALNS in the complex is essential for keratan sulfate catabolism.



The majority of mutations in the GAL gene will affect both spliced products, GAL and EBP. Direct expression studies of the mutant EBP are required to determine conclusively whether deficiency of the latter protein contributes to the phenotype of GM1-gangliosidosis and Morquio B patients, although this was recently suggested for several GM1-gangliosidosis patients with cardiomyopathy, which is not normally associated with this disease (205).

B. Galactosialidosis

Autosomal inherited deficiency of CathA causes the lysosomal storage disorder galactosialidosis (GS) (for recent reviews, see Refs. 206 and 207). Three clinical phenotypes of GS are recognized on the basis of the age of onset and clinical manifestations. Patients with the early infantile type of GS usually present prenatally with fetal hydrops, or postnatally between birth and the third month of life with ascites or renal failure. Other clinical features include neonatal edema, coarse facies, visceromegaly, psychomotor delay, and skeletal changes. Patients with the late infantile type of GS present in the first months of life with coarse facies, hepatosplenomegaly, dysostosis multiplex (a number of skeletal changes typical of many lysosomal storage diseases), bilateral macular cherry-red spots, and mental and motor retardation. The juvenile/adult type of GS includes patients with an age of onset between 1 and 40 years. Most of these patients are of Japanese origin. Clinical symptoms may include coarse facies and skeletal (mostly spinal) changes, but generalized dysostosis is rare. Neurological manifestations include myoclonus, cerebral ataxia, seizures, and mental retardation. Patients develop progressive loss of vision, bilateral macular cherry-red spots, and corneal clouding. Patients with severe forms of GS often die in childhood from cardiac failure or airway obstruction (67). In contrast, those with a relatively mild form of the disease may survive into adulthood, often without mental retardation or visceromegaly, but with myoclonus and ataxia.

The compounds excreted in the urine of GS patients and stored in their tissues are mostly sialooligosaccharides and sialoglycopeptides, while in the brain and spinal cord gangliosides (mostly GM3 and GD1a) are also stored. This composition of storage products resembles that in sialidosis patients (208, 209) and implicates SIAL deficiency as the major pathogenic defect underlying GS.

Site-directed mutagenesis of active site residues of CathA (44) showed that GS is caused by the absence of structural interactions of CathA with GAL and SIAL, but not by the lack of catalytic activity of the enzyme. Transient expression studies (210, 211) suggested that most of the missense mutations in GS patients affect the processing or the stability of CathA. Determination of the x-ray structure of the CathA precursor (56) allowed a comprehensive analysis of the structural changes induced by the mutations in CathA (212). The analysis revealed a correlation between the effects of mutations on protein



structure and the clinical phenotypes of the affected patients. None of the mutations occurred in the active site or at the protein surface. Of 11 amino acid substitutions that have been modeled, 9 were found in patients affected with the severe early or late infantile type of GS (Q21R, S23Y, W37R, S62L, V104M, L208P, Y367C, M378T, and G411S, shown in red in Fig. 4; see color insert). These mutations, located in the central core domain of CathA, introduced unsatisfied charged groups, hydrogen bonds, or bulkier side chains into the protein core, or created cavities in the protein interior and interfaces. All these changes would dramatically alter the folding of mutant CathA, resulting in impaired sorting and rapid degradation. In contrast, the other two mutations (F412V and Y221N, shown in green in Fig. 4; see color insert), which were present in patients with a moderate clinical phenotype and relatively high residual CathA activity, were located in the α-helical cap domain of the enzyme (see Section III,A,3). These mutations were predicted to have a milder effect on the protein structure. For the Q21R, W37R, S62L, Y221N, Y367C, and F412V mutations, similar predictions have been made on the basis of homology modeling of the CathA structure (55). Therefore, only those mutations in CathA that result in defective folding of the whole protein affect both the GAL- and SIAL-binding sites, producing the simultaneous deficiency of the two enzymes.

The molecular mechanism of GAL deficiency in GS involves rapid proteolytic degradation of the enzyme in the lysosome (reviewed in Sections II and IV). Recent studies demonstrated that CathA deficiency in GS cells also results in improper proteolytic processing of the C-terminal region of the GAL precursor (100). In normal cells, the 84-kDa GAL precursor is rapidly processed to the 67- and 64-kDa mature forms of the enzyme. In GS fibroblasts, processing of GAL resulted in the appearance of abnormal 80- and 72-kDa enzymatically inactive polypeptides. The 80-kDa polypeptide was then rapidly cleaved into a 45-kDa product. Cleavage of the 84-kDa precursor first into the 80-kDa and then into the 45-kDa product could be prevented by leupeptin, a potent inhibitor of thiol-dependent proteases, whereas the amount of the abnormal 72-kDa polypeptide was not influenced by leupeptin treatment. Altogether, these data showed that binding of the GAL precursor to CathA is absolutely essential to ensure proper processing of GAL and to protect it from at least two separate proteolytic attacks (100).

Three different molecular mechanisms can be suggested to explain SIAL deficiency in GS. First, SIAL activity has always been associated with the LMC (22, 23, 122, 123, 179). Dissociation of the complex resulted in complete inactivation of SIAL; however, the activity could be restored after reassociation of the complex in vitro (24), suggesting that association with the complex is required for SIAL to adopt a catalytically active conformation. This hypothesis is supported by the finding (155) that reconstitution of LMC in vitro increased SIAL affinity toward its substrates;



however, direct studies of SIAL tertiary structure are required to understand the molecular mechanism of this phenomenon. Second, it was suggested that association with CathA is required for lysosomal targeting of SIAL (150). Thus, in the cells of GS patients, which lack functional CathA, SIAL may not reach lysosomes and is either secreted from the cell or accumulated in the ER or Golgi and degraded. The data that contradict this hypothesis were reviewed in Section III,C,4. Third, by analogy with GAL, one can hypothesize that CathA may protect SIAL in the lysosome. In accordance with this hypothesis, immunofluorescent and immunoelectron microscopy showed that in both normal and GS cells transiently expressed wild-type SIAL was able to reach lysosomes, but the amount of protein found in GS cells was reduced ~5-fold (150). Metabolic labeling studies demonstrated that the 48.3-kDa mature active form of SIAL was stable in normal fibroblasts (half-life ~2.7 h), whereas in GS fibroblasts the enzyme was rapidly converted (half-life ~30 min) into 38.7- and 24-kDa catalytically inactive forms.
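
As a rough consistency check (not a calculation taken from the cited studies), the two half-lives quoted above can be compared under the assumption of simple first-order decay and an equal rate of SIAL synthesis in both cell types. The symbols k (decay constant), s (synthesis rate), and N_ss (steady-state amount) are introduced here for illustration only.

```latex
\[
k = \frac{\ln 2}{t_{1/2}}, \qquad
k_{\text{normal}} \approx \frac{0.693}{2.7\ \text{h}} \approx 0.26\ \text{h}^{-1}, \qquad
k_{\text{GS}} \approx \frac{0.693}{0.5\ \text{h}} \approx 1.39\ \text{h}^{-1}
\]
\[
N_{\text{ss}} = \frac{s}{k}
\quad\Longrightarrow\quad
\frac{N_{\text{ss,normal}}}{N_{\text{ss,GS}}}
= \frac{k_{\text{GS}}}{k_{\text{normal}}}
= \frac{2.7\ \text{h}}{0.5\ \text{h}} \approx 5.4
\]
```

Under these assumptions, the faster turnover alone would lower the steady-state amount of mature SIAL in GS cells by a factor of about five, which is of the same order as the reduction seen by microscopy and is in line with the third (protection) mechanism.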

C. Sialidosis

Sialidosis (also called mucolipidosis I and cherry-red spot myoclonus syndrome) is an autosomal recessive lysosomal storage disease caused by the genetic deficiency of lysosomal SIAL (see Refs. 209, 213, and 214 for recent reviews). Sialidosis is subdivided into two main clinical variants with different ages of onset and severity. Sialidosis type I, or the nondysmorphic type, is a late-onset mild form characterized by bilateral macular cherry-red spots, progressively impaired vision, and myoclonus syndrome (215-219). Sialidosis type II, or the dysmorphic type, is the infantile-onset form, which is also associated with skeletal dysplasia, Hurler-like phenotype, dysostosis multiplex, mental retardation, and hepatosplenomegaly (220-223). A severe form of the disease manifests prenatally and is associated with ascites and hydrops fetalis (224-226). The age of onset and severity of the clinical manifestations correlate with the amount of residual SIAL activity, suggesting the existence of considerable genetic heterogeneity. For both types, storage products accumulated in tissues and excreted in urine consist mostly of sialylated oligosaccharides and glycoproteins (209).

An understanding of the molecular defects in sialidosis became possible after the recent cloning of the SIAL gene and characterization of the protein (124-126). So far, 27 mutations have been identified, including 6 nonsense and 21 missense mutations leading to amino acid substitutions (124, 125, 141, 142, 227; H. Sakuraba, private communication). In addition to mutations identified in humans, a 625C→A (Leu209Ile) change in the SIAL gene was reported in the SM/J mouse strain, which is characterized by reduced SIAL activity (127, 129). Localization of the missense mutations on the SIAL structural model (Fig. 5; see color insert) suggested that several of them (shown in blue in Fig. 5; see color insert) affect active site residues (Tyr370Cys) or may interfere with their



correct positions (Leu91Arg with the active site residue R78; Pro80Leu with R97; dup399HisTyr with E394; Pro316Ser with R280; and Pro335Gln with R341). The Leu363Pro mutation is situated on a β-strand adjacent to that containing the active site residue Tyr370. Leu363 is probably necessary to anchor this β-strand to the one containing Y370, so the Leu363Pro mutation can potentially also have an effect on the active site. However, in contrast to mutations in CathA, which mostly affect the enzyme central core and cause unfolding of the protein, the majority of SIAL mutations involve residues on the surface of the enzyme and are not expected to result in significant structural changes. Moreover, the distribution of mutations on the SIAL surface is uneven. The region that contains the majority of mutations resulting in complete or almost complete inactivation of the enzyme and causing the severe sialidosis type II phenotype is easily detectable (shown in red in Fig. 5; see color insert). In particular, this region contains the mutations Gly227Arg, Phe260Tyr, Leu270Phe, and Ala298Val (141); Arg294Ser, Leu231His, and Gly218Ala (227); Trp240Arg (H. Sakuraba, private communication); as well as Val217Met and Gly243Arg (142).

We have expressed eight SIAL mutants in COS-7 cells, four of which contained amino acid substitutions in the defined surface patch (Gly227Arg, Phe260Tyr, Leu270Phe, Ala298Val) and four at the opposite surface of the SIAL molecule (Gly68Val, Ser182Gly, Gly328Ser, and Leu363Pro), and studied the sorting, activity, and stability of the produced protein (229). The experiments revealed that the SIAL mutants containing amino acid substitutions in the surface patch had common properties. First, they had very low or absent SIAL activity. Second, the stability of the SIAL mutants in cellular homogenates, or their half-life in the cell estimated by metabolic labeling, was significantly lower than that of the wild-type enzyme, and Western blot (141) demonstrated the presence of 37-kDa, 26-kDa, and 24-kDa degradation products of SIAL protein similar to those previously observed for wild-type SIAL expressed in the cells of a GS patient that lack functional CathA (155). Third, in the extracts of cells transfected with the Ala298Val, Leu270Phe, and Phe260Tyr mutants, we could not detect a high-molecular-weight complex of SIAL with CathA. Instead, we observed that mutant SIAL sedimented during density gradient centrifugation together with low-molecular-weight markers. Altogether, the data indicated that the Ala298Val, Leu270Phe, and Phe260Tyr mutations in SIAL destabilize its association with CathA in the complex, leading to inactivation of the enzyme.

VII. Future Perspectives

The data reviewed here provide important information about the biogenesis and function of lysosomal enzymes. However, a number of intriguing questions



remain. Is LMC limited to four enzymes (GAL, SIAL, GALNS, and CathA), or does it include other glycosidases, perhaps metabolically related? So far, biochemical purification has been the primary method of studying the composition of LMC. However, weak protein-protein interactions can be affected by the purification procedures. This may be particularly important for the enzymes of the lysosomal matrix, which is characterized by extremely low water content and high protein concentration. The situation has changed in recent years with the development of proteomics and mass spectrometry, which make possible the detection of microquantities of even rapidly dissociating complexes.

Further characterization of LMC will allow testing of our hypothesis that it represents a functional supramolecular organization, in which metabolic intermediates are directly transferred between the active sites of the associated enzymes (182). In the pathways of the intralysosomal degradation of keratan sulfate (Fig. 1a), GALNS and GAL catalyze sequential hydrolytic reactions, while SIAL and GAL catalyze consecutive reactions in the catabolism of gangliosides (Fig. 1b) or complex-type oligosaccharide chains of glycoproteins (Fig. 1c). In all pathways, the step following removal of 2,3-linked sialic acid residues and 1,4-linked galactose is the hydrolysis of N-acetylgalactosamine residues (or N-acetylglucosamine in the case of glycoproteins), which is catalyzed by lysosomal hexosaminidase A (HEXA). Association of this enzyme with LMC could be functionally important, since it might significantly accelerate the whole process and help to overcome the low solubility of glycolipids, which become more and more hydrophobic in the process of hydrolysis. These studies, as well as finding the precise mechanisms of activation and stabilization of enzymes of the complex, will require the determination of the tertiary structures of SIAL, GAL, and GALNS, and of LMC itself.

Knowledge of the enzyme structures may also provide new solutions for therapeutic intervention, such as the recently described "enzyme activation therapy," in which molecular chaperones are used to modify the conformation of mutant lysosomal enzymes that would otherwise be retained and degraded in the ER, so as to obtain a more stable structure closer to that of the wild-type enzyme (228). The development of therapeutic tools will require more information about the physiological role of LMC. Interestingly, nearly all components of LMC have been shown to have other biological functions outside the lysosome. Lysosomal SIAL is targeted both to the lysosome and to the cell surface, where it is involved in signaling during the immune response. An alternatively spliced form of GAL, EBP, also targeted to the plasma membrane, is a major protein responsible for cell binding to elastin fibers. CathA is secreted into the blood plasma by platelets and lymphocytes and may be involved in the processing of regulatory peptides. Future studies should show whether impairment of the nonlysosomal functions of LMC components can contribute to the clinical phenotype of patients affected with lysosomal storage disorders.



The ongoing structural, enzymological, and clinical studies should extend our knowledge about the role of LMC in cell physiology and provide an effective treatment for the severe inherited diseases that are related to it.

ACKNOWLEDGMENTS

We thank our colleagues Drs. Alessandra d'Azzo, Marc-Andre Elsliger, Hitoshi Sakuraba, and Kohji Itoh, who generously provided unpublished information. We also acknowledge Dr. Grant Mitchell for critical reading of the review, and the Canadian Institutes of Health Research (MT-15079), the Canadian Foundation for Innovation, the Fonds de la Recherche en Santé du Québec, and the Vaincre les Maladies Lysosomales Foundation (France) for providing financial support.

REFERENCES

1. H. Koening, Nature (London) 195, 782-785 (1962).
2. S. Okada and J. S. O'Brien, Science 160, 1002-1004 (1968).
3. M. C. B. Loonen, L. van de Lugh, and L. C. Franke, Lancet ii, 785 (1974).
4. L. Pinsky, J. Miller, B. Shanfield, G. Watters, and L. S. Wolfe, Am. J. Hum. Genet. 26, 563-577 (1974).
5. H. Galjaard, A. Hoogeveen, H. A. de Wit-Verbeek, A. J. Reuser, M. W. Ho, and D. Robinson, Nature (London) 257(5521), 60-62 (1975).
6. D. A. Wenger, T. J. Tarby, and C. Wharton, Biochem. Biophys. Res. Commun. 82, 589-595 (1978).
7. G. Andria, G. Strisciuglio, W. S. Pontarelli, W. S. Sly, and W. E. Dodson, "Sialidases and Sialidoses" (P. Durand et al., eds.), p. 365. Edi. Ermes, Milano, 1981.
8. M. Cantz, J. Gehler, and J. Spranger, Biochem. Biophys. Res. Commun. 74, 732-738 (1977).
9. A. T. Hoogeveen, F. W. Verheijen, A. d'Azzo, and H. Galjaard, Nature (London) 285(5765), 500-502 (1980).
10. A. Hoogeveen, A. d'Azzo, R. Brossmer, and H. Galjaard, Biochem. Biophys. Res. Commun. 103(1), 292-300 (1981).
11. O. P. van Diggelen, A. T. Hoogeveen, P. J. Smith, A. J. Reuser, and H. Galjaard, Biochim. Biophys. Acta 703(1), 69-76 (1982).
11a. Y. Suzuki, H. Sakuraba, K. Hayashi, K. Suzuki, and K. Imahori, J. Biochem. 90, 271-273 (1981).
12. A. d'Azzo, A. Hoogeveen, A. J. Reuser, D. Robinson, and H. Galjaard, Proc. Natl. Acad. Sci. U.S.A. 79, 4535-4539 (1982).
13. A. G. Norden, L. L. Tennant, and J. S. O'Brien, J. Biol. Chem. 249(24), 7969-7976 (1974).
14. R. G. Frost, E. W. Holmes, A. G. Norden, and J. S. O'Brien, Biochem. J. 175(1), 181-182 (1978).
15. S. Tomino and M. Meisler, J. Biol. Chem. 10, 7752-7758 (1975).
16. J. K. Anderson, J. E. Mole, and H. J. Baker, Biochemistry 17(3), 467-473 (1978).
17. J. Hoeksema, O. P. van Diggelen, and H. Galjaard, Biochim. Biophys. Acta 566, 72-79 (1979).
18. A. T. Hoogeveen, F. W. Verheijen, and H. Galjaard, J. Biol. Chem. 258, 12143-12146 (1983).
19. A. T. Hoogeveen, H. Graham-Kawashima, A. d'Azzo, and H. Galjaard, J. Biol. Chem. 259(3), 1974-1977 (1984).
20. Y. Yamamoto, M. Fujie, and K. Nishimura, J. Biochem. 92, 13-21 (1982).



21. C. S. Jones, D. Mahuran, J. A. Lowden, and J. W. Callahan, Can. J. Biochem. Cell Biol. 64, 529-534 (1984).
22. F. Verheijen, R. Brossmer, and H. Galjaard, Biochem. Biophys. Res. Commun. 108, 868-875 (1982).
23. F. Verheijen, S. Palmeri, A. T. Hoogeveen, and H. Galjaard, Eur. J. Biochem. 149, 315-321 (1985).
24. G. T. van der Horst, N. J. Galjart, A. d'Azzo, H. Galjaard, and F. W. Verheijen, J. Biol. Chem. 264(2), 1317-1322 (1989).
25. R. M. D'Agrosa and J. W. Callahan, Biochem. Biophys. Res. Commun. 157(2), 770-775 (1988).
26. M. Hiraiwa, T. Yamauchi, S. Tsuji, M. Nishizawa, T. Miyatake, K. Oyanagi, F. Ikuta, and Y. Uda, J. Biochem. (Tokyo) 114(6), 901-905 (1993).
27. A. V. Pshezhetsky and M. Potier, J. Biol. Chem. 271, 28359-28365 (1996).
28. A. J. Barrett, N. D. Rawlings, and J. F. Woessner (eds.), "Handbook of Proteolytic Enzymes," pp. 393-398. Academic Press, London, 1998.
29. J. S. Fruton and M. Bergmann, J. Biol. Chem. 130, 19-27 (1939).
30. J. S. Fruton, G. W. Irving, Jr., and M. Bergmann, J. Biol. Chem. 138, 249-262 (1941).
31. A. A. Iodice, Arch. Biochem. Biophys. 121, 241-242 (1967).
32. A. A. Iodice, V. Leong, and I. M. Weinstock, Arch. Biochem. Biophys. 117, 477-486 (1966).
33. Y. Kawamura, T. Matoba, T. Hata, and E. Doi, J. Biochem. 76, 915-924 (1974).
34. Y. Kawamura, T. Matoba, T. Hata, and E. Doi, J. Biochem. 77, 729-737 (1975).
35. Y. Kawamura, T. Matoba, T. Hata, and E. Doi, J. Biochem. 81, 435-441 (1977).
36. A. I. Logunov and V. N. Orekhovich, Biochem. Biophys. Res. Commun. 46, 1161-1168 (1972).
37. K. Matsuda and E. Misaka, J. Biochem. 78, 31-39 (1975).
38. K. Matsuda and E. Misaka, J. Biochem. 76, 639-649 (1974).
39. K. Matsuda, J. Biochem. 80, 659-669 (1976).
40. N. J. Galjart, N. Gillemans, A. Harris, T. Gijsbertus, J. van der Horst, F. W. Verheijen, H. Galjaard, and A. d'Azzo, Cell (Cambridge, Mass.) 54, 755-764 (1988).
41. N. J. Galjart, N. Gillemans, D. Meijer, and A. d'Azzo, J. Biol. Chem. 265, 4678-4684 (1990).
42. J. Tranchemontagne, L. Michaud, and M. Potier, Biochem. Biophys. Res. Commun. 168, 22-29 (1990).
43. H. L. Jackman, F. L. Tan, H. Tamei, C. Beurling-Harbury, X. Y. Li, R. A. Skidgel, and E. G. Erdos, J. Biol. Chem. 265, 11265-11272 (1990).
44. N. J. Galjart, H. Morreau, R. Willemsen, N. Gillemans, E. J. Bonten, and A. d'Azzo, J. Biol. Chem. 266, 14754-14762 (1991).
45. A. V. Pshezhetsky and M. Potier, Arch. Biochem. Biophys. 313, 64-70 (1994).
46. S. J. Remington, Curr. Opin. Biotechnol. 4, 462-468 (1993).
47. A. V. Pshezhetsky, M. V. Vinogradova, M.-A. Elsliger, F. El-Zein, S. K. Svedas, and M. Potier, Anal. Biochem. 230, 303-307 (1995).
48. S. J. Remington and K. Breddam, Methods Enzymol. 244, 231-248 (1994).
49. M.-A. Elsliger, A. V. Pshezhetsky, M. V. Vinogradova, V. K. Svedas, and M. Potier, Biochemistry 35(47), 14899-14909 (1996).
50. T. Chikuma, K. Matsumoto, A. Furukawa, N. Nakayama, R. Yajima, T. Kato, Y. Ishii, and A. Tanaka, Anal. Biochem. 233, 36-41 (1996).
51. H. Ostrowska, C. Wojcik, S. Wilk, S. Omura, L. Kozlowski, T. Stoklosa, K. Worowski, and P. Radziwon, Int. J. Biochem. Cell. Biol. 32(7), 747-757 (2000).
52. K. Itoh, N. Takiyama, R. Kase, K. Kondoh, A. Sano, A. Oshima, H. Sakuraba, and Y. Suzuki, J. Biol. Chem. 268, 1180-1186 (1993).
53. J. Worowski, Experientia 31, 637-638 (1975).
54. H. Morreau, N. J. Galjart, R. Willemsen, N. Gillemans, X. Y. Zhou, and A. d'Azzo, J. Biol. Chem. 267, 17949-17956 (1992).



55. M.-A. Elsliger and M. Potier, Proteins 18, 81-93 (1994).
56. G. Rudenko, E. Bonten, A. d'Azzo, and W. G. J. Hol, Structure 3, 1249-1259 (1995).
57. M. Shimmoto, Y. Nakahori, I. Matsuhita, T. Shinka, Y. Kuroki, K. Itoh, and H. Sakuraba, Biochem. Biophys. Res. Commun. 220, 802-806 (1996).
58. E. J. Bonten, N. J. Galjart, R. Willemsen, M. Usmany, J. M. Vlak, and A. d'Azzo, J. Biol. Chem. 270, 26441-26445 (1995).
59. A. Satake, K. Itoh, M. Shimmoto, T. C. Saido, H. Sakuraba, and Y. Suzuki, Biochem. Biophys. Res. Commun. 205, 38-43 (1994).
60. A. V. Pshezhetsky and M. Potier, Biochem. Biophys. Res. Commun. 195, 354-362 (1993).
61. N. Marks, L. Sachs, and F. Stern, Peptides 2, 159-164 (1981).
62. J. J. Miller, D. G. Changaris, and R. S. Levy, Biochem. Biophys. Res. Commun. 154, 1122-1129 (1988).
63. M. Yanagisawa and T. Masaki, Trends Pharmacol. Sci. 10(9), 374-378 (1989).
64. H. L. Jackman, P. W. Morris, P. A. Deddish, R. A. Skidgel, and E. G. Erdos, J. Biol. Chem. 267, 2872-2875 (1992).
65. H. L. Jackman, P. W. Morris, S. F. Rabito, G. B. Johansson, R. A. Skidgel, and E. G. Erdos, Hypertension 21, 925-928 (1993).
66. W. L. Hanna, J. M. Turbov, H. L. Jackman, F. Tan, and C. J. Froelich, J. Immunol. 153, 4663-4672 (1994).
67. K. Itoh, R. Kase, M. Shimmoto, H. Satake, and Y. Suzuki, J. Biol. Chem. 270, 515-518 (1995).
68. C. Richard, J. Tranchemontagne, M. A. Elsliger, G. A. Mitchell, M. Potier, and A. V. Pshezhetsky, Hum. Mutat. 11(6), 461-469 (1998).
69. Z. A. Abassi, J. E. Tate, E. Golomb, and H. R. Keiser, Hypertension 20(1), 89-95 (1992).
70. X. Y. Zhou, H. Morreau, R. Rottier, D. Davis, E. Bonten, N. Gillemans, D. Wenger, F. G. Grosveld, P. Doherty, K. Suzuki, G. Grosveld, and A. d'Azzo, Genes Dev. 9, 2623-2634 (1995).
71. N. de Geest, L. Mann, J. de Sousa-Hitzler, C. Hahn, R. Rottier, and A. d'Azzo, Am. J. Hum. Genet. 65, N4 (suppl.), A26 (1999).
72. I. P. Ashmarin, E. Buzinova, M. Vinogradova, M. Potier, and A. Pshezhetsky, Neurosci. Res. Commun. 21, 153-162 (1997).
73. B. van der Loo, M. J. Fenton, and J. D. Erusalimsky, Exp. Cell Res. 241(2), 309-315 (1998).
74. J. R. Litaker, J. Pan, Y. Cheung, D. K. Zhang, Y. Liu, S. C. Wong, T. S. Wan, and S. W. Tsao, Int. J. Oncol. 13(5), 951-956 (1998).
75. Y. Suzuki, S. Yokota, N. Kobayashi, and T. Kato, Acta Paediatr. Jpn. 23, 44 (1981).
76. T. Furuya, Y. Suzuki, and T. Momoi, J. Biochem. (Tokyo) 99(2), 437-443 (1986).
77. D. N. Arvidson, P. Youderian, T. D. Schneider, and G. D. Stormo, Biotechniques 11(6), 733-738 (1991).
78. M. Naoi, M. Kondoh, T. Mutoh, T. Takahashi, T. Kojima, T. Hirooka, and T. Nagatsu, J. Chromatogr. 426(1), 75-82 (1988).
79. M. W. Ho and J. S. O'Brien, Clin. Chim. Acta 30(2), 531-534 (1970).
80. M. W. Ho and J. S. O'Brien, Clin. Chim. Acta 32(2), 443-450 (1971).
81. J. A. Kint, FEBS Lett. 36(1), 53-56 (1973).
82. M. W. Ho and A. Fluharty, Nature (London) 253(5493), 660 (1975).
83. M. H. Meisler, Biochim. Biophys. Acta 410(2), 347-353 (1975).
84. O. P. van Diggelen, H. Galjaard, M. L. Sinnott, and P. J. Smith, Biochem. J. 188(2), 337-343 (1980).
85. S. Zhang, J. D. McCarter, Y. Okamura-Oho, F. Yaghi, A. Hinek, S. G. Withers, and J. W. Callahan, Biochem. J. 304(Pt. 1), 281-288 (1994).
86. A. Zschoche, W. Furst, G. Schwarzmann, and K. Sandhoff, Eur. J. Biochem. 222(1), 83-90 (1994).



87. M. Hiraiwa, B. M. Martin, Y. Kishimoto, G. E. Conner, S. Tsuji, and J. S. O'Brien, Arch. Biochem. Biophys. 341(1), 17-24 (1997).
88. T. Kolter and K. Sandhoff, J. Inherit. Metab. Dis. 21(5), 548-553 (1998).
89. J. S. O'Brien and Y. Kishimoto, FASEB J. 5(3), 301-308 (1991).
90. J. D. McCarter, D. L. Burgoyne, S. Miao, S. Zhang, J. W. Callahan, and S. G. Withers, J. Biol. Chem. 272(1), 396-400 (1997).
91. S. Zhang, R. Bagshaw, W. Hilson, Y. Oho, A. Hinek, J. T. Clarke, and J. W. Callahan, Biochem. J. 348(Pt. 3), 621-632 (2000).
92. J. W. Callahan, Biochim. Biophys. Acta 1455(2-3), 85-103 (1999).
93. H. J. Sips, H. A. de Wit-Verbeek, J. de Wit, A. Westerveld, and H. Galjaard, Hum. Genet. 69(4), 340-344 (1985).
94. Y. Yamamoto, C. A. Hake, B. M. Martin, K. A. Kretz, A. J. Ahern-Rindell, S. L. Naylor, M. Mudd, and J. S. O'Brien, DNA Cell Biol. 9(2), 119-127 (1990).
95. T. Takano and Y. Yamanouchi, Hum. Genet. 92(4), 403-404 (1993).
96. H. Morreau, N. J. Galjart, N. Gillemans, R. Willemsen, G. T. van der Horst, and A. d'Azzo, J. Biol. Chem. 264(34), 20655-20663 (1989).
97. A. Hinek, M. Rabinovitch, F. Keeley, Y. Okamura-Oho, and J. Callahan, J. Clin. Invest. 91(3), 1198-1205 (1993).
98. S. Privitera, C. A. Prody, J. W. Callahan, and A. Hinek, J. Biol. Chem. 273(11), 6319-6326 (1998).
99. B. Overdijk, E. K. J. Hiensch-Goormachtig, E. P. Beem, G. J. van Steijn, L. A. W. Trippelvitz, J. J. W. Lisman, H. van Halbeek, J. H. G. M. Mutsaers, and J. F. G. Vliegenthart, Glycoconjugate J. 3, 339-345 (1986).
100. Y. Okamura-Oho, S. Zhang, W. Hilson, A. Hinek, and J. W. Callahan, Biochem. J. 313, 787-794 (1996).
101. Y. Okamura-Oho, S. Zhang, J. W. Callahan, M. Murata, A. Oshima, and Y. Suzuki, FEBS Lett. 419(2-3), 231-234 (1997).
102. A. van der Spoel, E. Bonten, and A. d'Azzo, J. Biol. Chem. 275(14), 10035-10040 (2000).
103. N. Yamamoto and R. Kumashiro, J. Immunol. 151(5), 2794-2802 (1993).
104. N. Yamamoto and V. R. Naraparaju, J. Immunol. 157(4), 1744-1749 (1996).
105. J. Matsuda, O. Suzuki, A. Oshima, A. Ogura, Y. Noguchi, Y. Yamamoto, T. Asano, K. Takimoto, K. Sukegawa, Y. Suzuki, and M. Naiki, Glycoconjugate J. 14(6), 729-736 (1997).
106. G. K. Hirst, Science 94, 22-23 (1941).
107. A. Gottschalk and P. E. Lind, Nature (London) 164, 232-233 (1949).
108. J. E. Galen et al., Infect. Immun. 60, 406-415 (1992).
109. T. Corfield, Glycobiology 2, 509-521 (1992).
110. F. W. Verheijen, H. C. Janse, O. P. van Diggelen, H. D. Bakker, M. C. Loonen, P. Durand, and H. Galjaard, Biochem. Biophys. Res. Commun. 117, 470-478 (1983).
111. T. Miyagi, J. Sagawa, K. Konno, and S. Tsuiki, J. Biochem. 107, 794-798 (1990).
112. H. R. Schneider-Jakob and M. Cantz, Biol. Chem. Hoppe-Seyler 372, 443-450 (1991).
113. T. Miyagi, K. Konno, Y. Emori, H. Kawasaki, K. Suzuki, A. Yasui, and S. Tsuiki, J. Biol. Chem. 268(35), 26435-26440 (1993).
114. E. Monti, A. Preti, C. Nesti, A. Ballabio, and G. Borsani, Glycobiology 9(12), 1313-1321 (1999).
115. T. Wada, Y. Yoshikawa, S. Tokuyama, M. Kuwabara, H. Akita, and T. Miyagi, Biochem. Biophys. Res. Commun. 261(1), 21-27 (1999).
116. T. Miyagi and S. Tsuiki, J. Biol. Chem. 260, 6710-6716 (1985).
117. H. Akita, T. Miyagi, K. Hata, and M. Kagayama, Histochem. Cell. Biol. 107(6), 495-503 (1997).
118. K. Sato and T. Miyagi, Biochem. Biophys. Res. Commun. 221(3), 826-830 (1996).
119. S. Tokuyama, S. Moriya, S. Taniguchi, A. Yasui, J. Miyazaki, S. Orikasa, and T. Miyagi, Int. J. Cancer 73(3), 410-415 (1997).



120. J. Kopitz, C. von Reitzenstein, K. Sinz, and M. Cantz, Glycobiology 6(3), 367-376 (1996).
121. J. Kopitz, C. von Reitzenstein, M. Burchert, M. Cantz, and H. J. Gabius, J. Biol. Chem. 273(18), 11205-11211 (1998).
122. M. Hiraiwa, Y. Uda, M. Nishizawa, and T. Miyatake, J. Biochem. (Tokyo) 101(5), 1273-1279 (1987).
123. M. Hiraiwa, M. Nishizawa, Y. Uda, T. Nakajima, and T. Miyatake, J. Biochem. (Tokyo) 103(1), 86-90 (1988).
124. E. J. Bonten, A. van der Spoel, M. Fornerod, G. Grosveld, and A. d'Azzo, Genes Dev. 10, 3156-3168 (1996).
125. A. V. Pshezhetsky, C. Richard, L. Michaud, S. Igdoura, S. Wang, M.-A. Elsliger, J. Qu, D. Leclerc, R. Gravel, L. Dallaire, and M. Potier, Nature Genet. 15, 316-320 (1997).
126. C. M. Milner, S. V. Smith, M. B. Carrillo, G. L. Taylor, M. Hollinshead, and R. D. Campbell, J. Biol. Chem. 272, 4549-4558 (1997).
127. M. B. Carrillo, C. N. Milner, S. T. Ball, M. Shock, and R. D. Campbell, Glycobiology 7, 975-986 (1997).
128. S. A. Igdoura, C. Gafuik, C. Mertineit, F. Saberi, A. V. Pshezhetsky, M. Potier, J. M. Trasler, and R. A. Gravel, Hum. Mol. Genet. 7, 115-121 (1998).
129. R. J. Rottier, E. Bonten, and A. d'Azzo, Hum. Mol. Genet. 7, 313-321 (1998).
130. J. C. Michalski, G. Strecker, B. Fournet, M. Cantz, and J. Spranger, FEBS Lett. 79, 101-104 (1977).
131. G. Strecker, M. C. Peers, J. C. Michalski, T. Hondi-Assah, B. Fournet, G. Spik, J. Montreuil, J. P. Farriaux, P. Maroteaux, and P. Durand, Eur. J. Biochem. 72(2), 391-403 (1977).
132. L. Dorland, J. Haverkamp, J. F. Vliegenthart, G. Strecker, J. C. Michalski, B. Fournet, G. Spik, and J. Montreuil, Eur. J. Biochem. 87(2), 323-329 (1978).
133. J. van Pelt, J. P. Kamerling, J. F. G. Vliegenthart, F. W. Verheijen, and H. Galjaard, Biochim. Biophys. Acta 965, 36-45 (1988).
134. B. Ulrich-Bott, B. Klem, R. Kaiser, J. Spranger, and M. Cantz, Enzyme 38(1-4), 262-266 (1987).
135. H. Yoshino et al., J. Neurol. Sci. 97(1), 53-65 (1990).
136. H. Sakuraba, Y. Suzuki, M. Akagi, M. Sakai, and N. Amano, Ann. Neurol. 13(5), 497-503 (1983).
137. G. M. Mancini, A. T. Hoogeveen, H. Galjaard, J. E. Mansson, and L. Svennerholm, Hum. Genet. 73(1), 35-38 (1986).
138. R. Fingerhut, G. T. van der Horst, F. W. Verheijen, and E. Conzelmann, Eur. J. Biochem. 208(3), 623-629 (1992).
139. J. Q. Huang, J. M. Trasler, S. Igdoura, J. Michaud, N. Hanal, and R. A. Gravel, Hum. Mol. Genet. 6(11), 1879-1885 (1997).
140. S. A. Igdoura, C. Mertineit, J. M. Trasler, and R. A. Gravel, Hum. Mol. Genet. 8(6), 1111-1116 (1999).
141. K. E. Lukong, M. A. Elsliger, Y. Chang, C. Richard, G. Thomas, W. Carey, A. Tylki-Szymanska, B. Czartoryska, T. Buchholz, G. R. Criado, S. Palmeri, and A. V. Pshezhetsky, Hum. Mol. Genet. 9(7), 1075-1085 (2000).
142. Y. Naganawa, K. Itoh, M. Shimmoto, K. Takiguchi, H. Doi, Y. Nishizawa, T. Kobayashi, S. Kamei, K. E. Lukong, A. V. Pshezhetsky, and H. Sakuraba, J. Hum. Genet. 45(4), 241-249 (2000).
143. J. N. Varghese, W. G. Laver, and P. M. Colman, Nature (London) 303, 35-40 (1983).
144. W. P. Burmeister, R. W. H. Ruigrok, and S. Cusack, EMBO J. 11, 49-56 (1992).
145. S. J. Crennell, E. F. Garman, W. G. Laver, E. R. Vimr, and G. Taylor, Proc. Natl. Acad. Sci. U.S.A. 90, 9852-9856 (1993).
146. A. Gaskell, S. Crennell, and G. Taylor, Structure 3, 1197-1205 (1995).

112


147. S. J. Crennell, E. F. Garman, W. G. Laver, E. R. Vimr, and G. Taylor, Structure 2, 535-544 (1994). 148. T. Oohira, N. Nagata, I. Akaboshi, I. Matsuda, and S. Naito, Hum. Genet. 70, 341-343 (1985). 149. J. E. Womack, D. L. Yah, and M. Potier, Science 212, 63-65 (1981). 150. A. van der Spoel, E. Bonten, and A. Azzo, EMBOJ. 17, 1588-1597 (1998). 151. T. Miyagi, K. Hata, K. Konno, and S. Tsuiki, Tohoku]. Exp. Med. 168(2), 223-229 (1992). 152. T. Miyagi, K. Hata, A. Hasegawa, and T. Aoyagi, Glycoconjugate]. 10(1), 45-49 (1993). 153. N. F. Landolfi, J. Leone, J. E. Womack, and R. G. Cook, Immunogenetics 22(2), 159-167 (1985). 154. V. R. Naraparaju and N. Yamamoto, Immunol. Lett. 43(3), 143-148 (1994). 155. M. V. Vinogradova, L. Michand, A. V. Mezentsev, K. E. Lukong, M. E1-Alfy,C. R. Morales, M. Potier, and A. V. Pshezhetsky, Biochem. J. 330(pt. 2), 641-650 (1998). 156. c. Peters and K. von Figura, FEB8 Lett 346(1), 108-114 (1994). 157. B. M. Pearse, C. J. Smith, and D. J. Owen, Curt. Opin. Struct. Biol. 10(2), 220-228 (2000). 158. J. Hirst and M. S. Robinson, Biochim. Biophys. Acta 1404(1-2), 173-193 (1998). 159. X. P. Chen, E. Y. Enioutina, and R. A. Daynes, J. Immunol. 158(7), 3070-3080 (1997). 160. X. P. Chen, X. Ding, and R. A. Daynes, Cytokine 12(7), 972-985 (2000). 162. N. F. Landolfi and R. G. Cook, Mol. Immunol. 23(3), 297-309 (1986). 163. N. M. Di Ferrante, L. C. Ginsburg, P. V. Donnelly, D. T. DiFerrante, and C. T. Caskey, Science 199, 79-81 (1978). 164. J. 8ingh, N. M. DiFerrante, P. Niebes, and D. Tavella, J. Clin. Invest. 57, 1036-1040 (1976). 165. J. Glossl, W. Truppe, and H. Kresse, Biochem. ]. 181(1), 37-46 (1979). 166. S. Tomatsu, S. Fukuda, M. Masue, K. Sukegawa, T. Fukao, A. Yamagishi, T. Hori, H. Iwata, T. Ogawa, Y. Nakashima, Y. Hanyu, T. Hashimoto, K. Titani, R. Oyama, M. Suzuki, K. Yagi, Y. Hayashi, and T. Orii, Biochem. Biophys. Res. Commun. 181,677-683 (1991). 167. M. Masue, K. Sukegawa, T. Orii, andT. Hashimoto,J. Biochem. 
(Tokyo) 110(6), 965-70 (1991). 168. R. Basher, H. Kresse, and K. yon Figura, J. Biol. Chem. 254(4), 1151-1158 (1979). 169. J. Glossl and H. Kresse, Clin. Chim. Acta 88(1), 111-119 (1978). 170. O. P. van Diggelen, H. Zhao, W. J. Kleijer, H. C. Janse, B. J. Poorthuis, J. van Pelt, J. P. Kamerling, and H. Galjaard, Clin. Chim. Acta 187(2), 131-139 (1990). 171. K. Sukegawa, H. Nakamura, Z. Kato, S. Tomatsu, A. M. Montano, T. Fukao, G. Toietta, I?. Tortora, T. Orii, and N. Kondo, Hum. Mol. Genet. 9(9), 1283-1290 (2000). 172. E. Baker, X.-H. Guo, A. M. Orsbom, G. R. Sutherland, D. E Callen, J. J. Hopwood, and C. P. Morris, Am. J. Hum. Genet. 52, 96-98 (1993). 173. M. Masuno, S. Tomatsu, Y. Nakashima, T. Hori, S. Fukuda, M. Masue, K. Sukegawa, and T. Orii, Genomics 16, 777-778 (1993). 174. C. I?. Morris, X.-H. Guo, S. Apostolou, J. J. Hopwood, and H. S. Scott, Genomics 22, 652-654 (1994). 175. Y. Nakashima, S. Tomatsu, T. Hori, S. Fukuda, K. Sukegawa, N. Kondo, Y. Suzuki, N. Shimozawa, and T. Orii, Genomics 20, 99-104 (1994). 176. E. F. Neufeld and J. Muenzer, in "The Metabolic and Molecular Bases of Inherited Disease" (C. R. Scriver, A. L. Beaudet, W. S. Sly, D. Valle, eds.), Vol. 2, chap. 90. McGraw-Hill, New York, 1995. 177. R. M. D'Agrosa, M. Hubbes, S. Zhang, R. Shankaran, and J. w. Callahan, Biochem. J. 285(Pt. 3), 833-838 (1992). 178. A. V. Pshezhetsky, M.-A. Elsliger, M. V. Vinogradova, and M. Potier, Biochemistry 34, 24312440 (1995). 179. F. W. Verheijen, S. Palmeri, and H. Galjaard, Eur J. Biochem. 162(1), 63-67 (1987).


113

180. M. Hiraiwa, M. Saitoh, N. Arai, T. Shiraishi, S. Odani, Y. Uda, T. Ono, and j. s. O'Brien, Biochim. Biophys. Acta 1341(2), 189-199 (1997). 181. L. I. Ashmarina, A. V. Pshezhetsky, H. O. Spivey, and M. Potier, Anal. Biochem. 219(2), 349-355 (1994). 182. A. V. Pshezhetsky, A. V. Levashov, and G. Y. Wiederschain, Biochim. Biophys. Acta 1122(2), 154-160 (1992). 183. K. Itoh, Y. Naganawa, S. Kamei, M. Shimmoto, and H. Sakuraba, Biochem. Biophys. Res. Commun. 253(2), 228-234 (1998). 184. M. Hiraiwa, Y. Uda, S. Tsuji, T. Miyatake, B. M. Martin, M. Tayama, J. s. O'Brien, and Y. Kishimoto, Biochem. Biophys. Res. Commun. 177(3), 1211-1216 (1991). 185. S. Tsuji, T. Yamauchi, M. Hiraiwa, T. Isobe, T. Okuyama, K. Sakimura, Y. Takahashi, M. Nishizawa, Y. Uda, and T. Miyatake, Biochem. Biophys. Res. Commun. 163(3), 1498-1504 (1989). 186. A. Hinek, Biol. Chem. 377(7-8), 471-480 (1996). 187. Y. Suzuki, H. Sakuraba, and A. Oshima, in "The Metabolic and Molecular Bases of Inherited Disease" (C. R. Scriver, A. L. Beaudet, W. S. Sly, D. Valle, eds.), Vol. 2, chap. 90. McGraw-Hill, New York, 1995. 188. Y. Suzuki, N. Nakamura, K. Fukuoka, Y. Shimada, and M. Uono, Hum. Genet. 36(2), 219-229 (1977). 189. J. S. O'Brien, M. B. Stem, B. H. Landing, J. K. O'Brien, and G. N. Donnell, Am. J. Dis. Child. 109, 338 (1965). 190. Y. Suzuki, A. C. Crocker, and K. Suzuki, Arch. Neural. 24(1), 58-64 (1971). 191. K. Suzuki and S. Kamoshita, J. Neuropathol. Exp. Neurol. 28(1), 25-73 (1969). 192. T. Kasama and T. Taketomi, Jpn. J. Exp. Med. 56(1), 1-11 (1986). 193. T. Kobayashi and K. Suzuki, Ann. Neurol. 9(5), 476-483 (1981). 194. K. Suzuki, Science 159(822), 1471-1472 (1968). 195. L. S. Wolfe, J. Callahan, J. s. Fawcett, F. Andermann, and C. R. Scriver, Neurology 20(1), 23-44 (1970). 196. J. w Callahan and L. S. Wolfe, Biochim. Biophys. Acta 215(3), 527-543 (1970). 197. J. S. O'Brien, E. Gugler, A. Giedion, U. Wiessmann, N. Herschkowitz, C. Meier, andJ. Leroy, Clin. Genet. 9(5), 495-504 (1976). 198. 
J. J. van Gemund, M. A. Giesberts, R. E Eerdmans, W. Blom, and W. J. Kleijer, Hum. Genet. 64(1), 50-54 (1983). 199. R. Giugliani, M. Jackson, S. J. Skinner, C. M. Vimal, A. H. Fensom, N. Fahmy, A. Sjovall, and P. F. Benson, Clin. Genet. 32(5), 313-325 (1987). 200. A. I. Arhisser, K. A. Donnelly, C. I. Scott, Jr., N. DiFerrante, J. Singh, R. E. Stevenson, A. S. Aylesworth, and R. R. Howell, Am. J. Med. Genet. 1(2), 195-205 (1977). 201. H. Groebe, M. Krins, H. Schmidherger, K. von Figura, K. Harzer, H. Kresse, E. Paschke, A. Sewell, and K. Ullrich, Am. ]. Hum. Genet. 32(2), 258-272 (1980). 202. T. Kohayashi, N. Shinnoh, and Y. Kuroiwa, Biochim. Biophys. Acta 875(1), 115-121 (1986). 203. E. Paschke and H. Kresse, Biochem. Biophys. Res. Commun. 109(2), 568-575 (1982). 204. A. Oshima, K. Yoshida, M. Shimmoto, Y. Fukuhara, H. Sakuraba, and Y. Suzuki, Am. J. Hum. Genet. 49(5), 1091-1093 (1991). 205. A. Morrone, T. Bardelli, M. A. Donati, M. Giorgi, M. Di Rocco, R. Gatti, R. Parini, R. Ricci, G. Taddeucci, A. D'Azzo, and E. Zammarchi, Hum. Mutat. 15(4), 354-366 (2000). 206. A. d'Azzo, G. Andria, P. Strisciuglio, and H. Galijaard, in "Metabolic and Molecular Bases of Inherited Disease" (C. R. Scriver, A. L. Beaudet, W S. Sly, and D. Valle, eds.), pp. 3811-3826. McGraw-Hill, New York, 2001. 207. Y. Okamura-Oho, S. Q. Zhang, and J. w. Callahan, Biochem. Biophys. Acta. 1225, 244-254 (1994).

114


208. J. A. Lowden and J. S. O'Brien, Am. J. Hum. Genet. 31(1), 1-8 (1979). 209. G. H. Thomas, in "The Metabolic and Molecular Bases of Inherited Disease" (C. R. Scriver,

A. L. Beaudet, W. S. Sly, D. Valle, eds.), pp. 3507-3534. McGraw-Hill, New York, 2001. 210. M. Shimmoto, Y. Fukuhara, K. Itoh, A. Oshima, H. Sakuraba, and Y. Suzuki, J. Clin. Invest.

91, 2393-2398 (1993). 211. x. Y. Zhou, N. J. Galjart, R. Willemsen, N. Gillemans, H. Galjaard, and A. d'Azzo, EMBOJ.

10, 4041-4048 (1991). 212. G. Rudenko, E. Bonten, W. G. Hol, and A. d'Azzo, Proc. Natl. Acad. Sci. U.S.A. 95(2), 621-625

(1998). 213. A. Federico, S. Battistini, G. Ciacci, N. de Ste£ano, R. Gatti, P. Durand, and G. C. Guazzi, Dev. Neurosci. 13, 320-326 (1991). 214. M. Cantz and B. Ulrich-Bott, J. Inherit. Metab. Dis. 13, 523-537 (1990). 215. P. Durand, R. Gatti, S. Cavalieri, C. Borrone, M. Tondeur, J. C. Michalski, and C. Strecker, Helv. Paediatr. Acta 32, 391-400 (1977). 216. A. Federico, A. Cecio, G. A. Battini, J. C. Michalski, G. Strecker, and G. C. Guazzi, J. Neurol.

Sc/. 48, 157-169 (1980). 217. J. S. O'Brien, Clin. Genet. 14, 55-60 (1979). 218. I. Rapin, S. Goldfischer, R. Katzman, J. Engel, Jr., andJ. S. O'Brien, Ann. Neurol. 3, 234-242

(1978). 219. R. L. Sogg, L. Steinman, B. Rathjen, B. R. Tharp, J. S. O'Brien, and K. R. Kenyon, Ophthalmology 86, 1861-1874 (1979). 220. J. S. Till, E. S. Roach, and B. K. Burton, J. Clin. Neur.-Ophthalmol. 7, 40-44 (1987). 221. T. E. Kelly and G. Graetz, Am. J. Med. Genet. 1, 31-46 (1977). 222. T. Oohira, N. Nagata, I. Akaboshi, I. Matsuda, and S. Naito, Hum. Genet. 70, 341-343 (1985). 223. R. M. Winter, D. M. Swallow, M. Baraitser, and P. Purkiss, Clin. Gen. 18, 203-210 (1980). 224. A. S. Aylswort, G. H. Thomas, J. L. Hood, N. Malou£, and J. Libert, J. Pediatr 96, 662-668

(1980). 225. W. G. Johnson, C. S. Cohen, A. F. Miranda, S. P. Waran, and A. M. Chutorian, Am. J. Hum. Genet. 32, 43A (1980). 226. M. Beck, S. W. Bender, H. L. Reiter, W. Otto, R. Bassler, H. Dancygier, and J. Gehler, Eur. J. Pediatr. 143, 135-139 (1984). 227. E. J. Bonten, W. F. Arts, M. Beck, A. Covanis, M. A. Donati, R. Parini, E. Zammanchi, and A. d'Azzo, Hum. Mol. Genet. 9, 2715-2725 (2000). 228. J. Q. Fan, S. Ishii, N. Asano, and Y. Suzauki, Nat. Med. 5, 112-115 (1999). 229. K. E. Lukong, K. Landry, M. A. Elsliger, Y. Chang, S. Lefrancois, C. R. Morales, and A. V. Pshezhetsky, J. Biol. Chem 276, 17286-17290 (2001).

Phosphoribosylpyrophosphate Synthetase and the Regulation of Phosphoribosylpyrophosphate Production in Human Cells 1

MICHAEL A. BECKER

The University of Chicago
University of Chicago Medical Center
5841 South Maryland Avenue
Chicago, Illinois 60637

I. PRPP and the PRPP Synthetase Reaction
II. PRPP Utilization
III. PRPP: Regulatory Role
IV. PRPP Synthetase
    A. Structure and Proposed Mechanism of a Bacterial PRPP Synthetase
    B. The Human PRPP Synthetase Gene and Isoform Family
    C. Effectors of the Human PRPP Synthetase Reaction
    D. Structures of Human PRPP Synthetases
    E. PRPP Synthetase-Associated Proteins (PAPs)
    F. Inherited Human PRPP Synthetase Overactivity
V. Regulation of PRPP Synthesis in Human Cells
References

5-Phosphoribosyl 1-pyrophosphate (PRPP) is a substrate common to the biosynthesis of purine, pyrimidine, and pyridine nucleotides and is also a regulator of the pathways of purine and pyrimidine nucleotide synthesis de novo. Synthesis of PRPP in human cells is controlled at several levels of genetic information transfer by mechanisms ultimately influencing the PRPP synthetase reaction, which is catalyzed by a family of homologous isoforms (PRS isoforms) encoded by separate (PRPS) genes. Studies of X chromosome-linked PRPP synthetase overactivity have confirmed two determinants of regulation of PRPP synthesis in human cells: allosteric control of PRPP synthetase activity by antagonism between purine nucleoside diphosphate inhibition and inorganic phosphate (Pi) activation; and intracellular concentration of the PRS1 isoform. The operation of additional determinants of rates of PRPP synthesis in human cells is suggested by: (1) multiple PRS isoforms with distinctive physical and kinetic properties; (2) nearly immediate activation of intracellular PRPP synthesis in response to mitogens, growth-promoters, and increased intracellular Mg2+ concentrations; (3) tissue-specific differences in PRS1 and PRS2 transcript and isoform expression; and (4) reversible association of PRS subunits with one another and/or with PRS-associated proteins (PAPs), as a result of which the catalytic and perhaps regulatory properties of PRS isoforms are modified. © 2001 Academic Press.

1 Abbreviations: PRPP, 5-phosphoribosyl 1-pyrophosphate; PRS, PRPP synthetase isoform; PRPS, PRPP synthetase gene; Pi, inorganic phosphate; PAP, PRPP synthetase-associated protein; Rib-5-P, ribose-5-phosphate; PRTase, phosphoribosyltransferase; PPi, pyrophosphate; MAP kinase, mitogen-activated protein kinase; DPG, 2,3-diphosphoglycerate; GAPDH, glyceraldehyde-3-phosphate dehydrogenase.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 69. Copyright © 2001 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/01 $35.00

Between the mid-1940s and the 1960s, the intermediates and enzymatic reactions comprising the pathways of purine and pyrimidine nucleotide biosynthesis were identified and studied (1, 2). Among the key advances in this process were the demonstration that the activated sugar phosphate 5-phosphoribosyl α-1-pyrophosphate (PRPP) is the immediate donor of the ribose-5-phosphate (Rib-5-P) moiety of the nucleotides and the characterization of the PRPP synthetase (ATP: D-Rib-5-P pyrophosphotransferase; EC 2.7.6.1) reaction, by means of which PRPP is formed (3). Subsequently, concepts of PRPP as a regulator as well as a substrate for purine and pyrimidine nucleotide synthesis in mammalian cells (4-6) have been confirmed (7), and refinement of these concepts has contributed insight into integration in the control of salvage and de novo arms of the respective pathways and into coordination between purine and pyrimidine nucleotide synthesis. After a description of PRPP as a substrate and as a regulator, this review will focus on PRPP production in human cells and on biochemical and genetic mechanisms regulating this process by influencing activity of PRPP synthetase.
Since the last review of these subjects from this laboratory (8), substantial advances in the knowledge of mammalian PRPP production have occurred, including (1) recognition of multiple homologous PRPP synthetase (PRS) isoform products of separate PRPS genes (9); (2) delineation of the respective amino acid and nucleotide sequences of PRS isoforms and cDNAs (9-12); (3) recognition of the considerable homology among mammalian and bacterial PRPP synthetases (9-15), a prototype of the latter having recently been characterized at protein structural and mechanistic levels (16); (4) characterization of PRPP synthetase-associated proteins (17), which suggest an additional level of posttranslational control of enzyme activity and the likelihood that PRS isoforms participate in heteroprotein complex formation; (5) extended description of the functional (18, 19) and, in some cases, structural (18) defects in human PRPP synthetases, which result in enzyme overactivity with metabolic and neurologic consequences (20); and (6) demonstration of tissue-differential and isoform-specific expression of PRPP synthetase activity, processes that appear to be regulated, at least in part, at a pretranslational level (21, 22).


I. PRPP and the PRPP Synthetase Reaction

PRPP is a substrate in the synthesis of purine, pyrimidine, and pyridine nucleotides and is an important regulator of the rates of the pathways of purine and pyrimidine nucleotide synthesis de novo (4-7). In bacteria, PRPP is also a precursor in the synthesis of the amino acids tryptophan and histidine. PRPP is synthesized uniquely in the reaction

MgATP + Rib-5-P → PRPP + AMP.

This reaction is catalyzed by PRPP synthetase, an essential enzyme present in one or more forms in all cells. PRPP synthetases have been purified from several species and subjected to kinetic (23-30) and structure-function (16, 25, 31-35) studies. Genes encoding PRPP synthetases have been identified and sequenced from an even broader array of organisms, spanning all kingdoms (reviewed in Ref. 35). Substantial sequence homology has been demonstrated among PRPP synthetases, including those of mammals and Gram-positive and Gram-negative bacteria (9-15). Although members of the PRPP synthetase enzyme family have no overall similarity to other proteins, two important homologies have been noted (Fig. 1): first, a short region with a PRPP-binding motif shared with type 1 phosphoribosyltransferases (PRTases) (13); second, significant resemblance to noncatalytic mammalian proteins, called PRPP synthetase-associated proteins (PAPs) (17, 36-39), that are believed to form complexes with PRS isoforms and to inhibit catalytic and perhaps regulatory functions of these enzymes (17, 36, 40). Structural and mechanistic analysis of B. subtilis PRPP synthetase confirms the prediction that regions of the conserved PRPP synthetase sequence generally correspond to residues or groups of residues critical to the catalytic process or its regulation (16, 35). As such, the features of the PRPP synthetase reaction catalyzed by human PRS isoforms are likely to correspond substantially to those identified for the B. subtilis model. In the synthesis of PRPP, the terminal pyrophosphoryl group of ATP (in the form of a magnesium-ATP complex) is transferred to the C-1 position of Rib-5-P. The reaction requires Mg2+ both for formation of the magnesium-ATP substrate complex (13, 31, 32) and bound to the enzyme (41).
All mammalian PRPP synthetases have an absolute requirement for inorganic phosphate (Pi) (29, 30, 32) and are allosterically inhibited by the purine nucleoside diphosphates ADP and GDP (26, 28-30, 32-34). 2 The human PRPP synthetase reaction, like its bacterial counterparts, follows an ordered sequential mechanism, in which substrates bind to the enzyme after Mg2+ and PRPP is the last product released (26, 31). MgATP is the first substrate bound by S. typhimurium PRPP synthetase (31), preceding Rib-5-P, and this sequence has been confirmed for the Bacillus enzyme as well (44). Reports that Rib-5-P binds first to human PRPP synthetase (26, 45) are discordant with the catalytic mechanism recently defined for the Bacillus enzyme (16, 35). The discrepancy may, in fact, reflect differences in the pH at which the bacterial (pH 8) and human (pH < 7.5) enzymes were tested (26, 35, 45). The mechanism of the reaction catalyzed by bacterial PRPP synthetases proceeds via a nucleophilic attack of the anomeric hydroxyl of Rib-5-P on the β-phosphorus atom of ATP in an SN2 inversion (16, 46).

2 Recently, evidence has emerged from the study of some isoforms of PRPP synthetase from spinach (42) and Arabidopsis thaliana (43) suggesting a novel class of the enzyme, provisionally referred to as type II PRPP synthetase. Type II PRPP synthetases show activity independent of Pi and may be insensitive to allosteric inhibition by ADP and GDP.

FIG. 1. Alignment of the predicted amino acid sequences of human X chromosome-linked PRPP synthetase isoforms (PRS1 and PRS2) (10, 11) and of PRPP-associated proteins, PAP39 and PAP41 (38, 39). Closed circles and closed squares denote residues homologous to those that take part in binding the ADP analog, mADP, respectively, at the active (ATP) site and the regulatory binding site of B. subtilis PRPP synthetase (16). Regions of sequence corresponding to those of the flexible loop, the Rib-5-P binding loop, and the PPi binding loop of the B. subtilis enzyme (16) are also shown. The Rib-5-P binding region corresponds to the PRPP binding motif of type 1 PRTases (13).

II. PRPP Utilization

Hydrolysis of PRPP by nonspecific phosphatases may occur, but PRPP utilization proceeds in large part through specific PRTase-catalyzed reactions, in each of which the ribosylphosphate moiety of PRPP is transferred to one or more nitrogenous base acceptors, with formation of an N-glycoside bond at the C-1 position (1, 47). A variety of PRTases have been described in mammals (Table I), including those utilizing quinolinate or imidazoleacetic acid as acceptors. The major PRTase reactions in mammalian cells, however, are those directed to the synthesis of purine, pyrimidine, and pyridine nucleotides. In order to maintain cellular economy in the setting of a potential multiplicity of utilization reactions, the metabolic fate of PRPP must be regulated. Potential determinants of PRPP utilization include the abundances of the individual PRTases and their absolute and relative affinities for PRPP and cosubstrates, intracellular concentrations of PRPP and cosubstrates, and the action of effectors (such as allosteric regulators) on individual enzyme activities. For example, marked increases in intracellular PRPP concentrations (but not in PRPP synthesis) accompany increased rates of purine synthesis de novo in cells severely deficient in hypoxanthine-guanine PRTase (48, 49). This implies that certain cells normally have high rates of PRPP utilization through this reaction of purine base salvage and that PRPP utilization is redirected toward purine synthesis de novo in the enzyme deficiency state. As discussed below, such a formulation is in accord with the kinetic and regulatory properties of hypoxanthine-guanine PRTase and amidoPRTase (glutamine PRPP PRTase), enzymes catalyzing PRPP utilization in purine salvage and de novo pathways, respectively. By way of contrast, human cells deficient in adenine PRTase have normal PRPP concentrations and rates of purine synthesis de novo despite the fact that adenine PRTase has the highest affinity for PRPP among PRTases (8).
Normally lower rates of PRPP utilization in adenine than in hypoxanthine salvage are implied by this finding, which is best explained by limited intracellular availability of adenine, a compound synthesized in mammalian cells largely by phosphorolysis of methylthioadenosine rather than adenosine (50).
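As a rough illustration of how relative affinities and abundances could partition PRPP among competing PRTases, the sketch below treats each enzyme as a simple Michaelis-Menten catalyst with PRPP as the only varied substrate. This is an assumption made for illustration, not a model taken from the studies cited. The enzyme names and all Vmax and Km values are hypothetical, in arbitrary units, and cosubstrate availability and allosteric effectors are ignored.

```python
def prpp_flux_partition(prpp, enzymes):
    """Split PRPP consumption among enzymes obeying Michaelis-Menten kinetics.

    prpp    -- PRPP concentration (arbitrary units, must be positive)
    enzymes -- dict mapping an enzyme name to a (vmax, km) tuple
    Returns a dict mapping each name to its fraction of total PRPP use.
    """
    rates = {
        name: vmax * prpp / (km + prpp)
        for name, (vmax, km) in enzymes.items()
    }
    total = sum(rates.values())
    return {name: rate / total for name, rate in rates.items()}


# Hypothetical enzymes and parameters, for illustration only.
ENZYMES = {
    "high_affinity_PRTase": (1.0, 0.05),  # (vmax, km): low Km, binds PRPP tightly
    "low_affinity_PRTase": (1.0, 0.5),    # same vmax, tenfold higher Km
}

scarce = prpp_flux_partition(0.01, ENZYMES)   # PRPP well below both Km values
abundant = prpp_flux_partition(1.0, ENZYMES)  # PRPP above both Km values
```

With these made-up values, the high-affinity enzyme carries most of the flux when PRPP is scarce, and its share declines as PRPP rises toward saturation of both enzymes.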

III. PRPP: Regulatory Role

Characterization of PRTases, as well as studies of the effects of genetic and pharmacological alterations of cellular PRPP synthesis and utilization, has helped define the mechanisms controlling intracellular rates of purine and pyrimidine nucleotide synthesis and has clarified the role of PRPP as a regulator and coordinator of these processes. The most extensively studied route of PRPP utilization in mammalian cells is purine nucleotide synthesis (Fig. 2). Regulation of purine nucleotide synthesis involves the interplay of the pathways of purine synthesis de novo and purine base salvage (1, 2, 4, 5, 7). PRPP, which is synthesized solely by means of the PRPP synthetase reaction, is a substrate in both pathways.

FIG. 2. Schematic representation of the pathways of PRPP (PP-Rib-P) and purine nucleotide synthesis and the regulation of rates of purine synthesis de novo by PRPP and purine nucleotide end products. Curved arrows depict single-step purine base salvage pathways of purine nucleotide synthesis, requiring PRPP and catalyzed by the PRTase enzymes, hypoxanthine (hyp)-guanine (gua) PRTase (HGPRT) and adenine (ade) PRTase (Ade PRT). Purine synthesis de novo is shown by a solid arrow representing the initial and rate-limiting amidoPRTase (AmidoPRT) reaction and a dashed arrow depicting the final nine steps in this sequence of reactions. Purine nucleotide inhibition (-) of PRPP synthetase and amidoPRTase is indicated by the heavy hatched arrow, and allosteric activation (+) of AmidoPRTase by PRPP is shown by the heavy dark arrow. Reproduced from Ref. 51 by permission.

In purine synthesis de novo, a biosynthetic sequence of 10 reactions is required for assembly of the complete purine ring (of inosine 5'-monophosphate) on a Rib-5-P backbone provided by PRPP in the first reaction committed to the pathway (2, 5, 7). This reaction is catalyzed by amidoPRTase, an enzyme with kinetic (4, 5, 52, 53) and structural (4, 53-56) properties consistent with a regulatory role in the pathway. In fact, a wealth of evidence supports the concept that the sequential PRPP synthetase and amidoPRTase reactions comprise the dominant regulatory domain for purine nucleotide synthesis (4, 5, 7, 8). Within this domain, amidoPRTase serves as the rate-limiting reaction in purine synthesis de novo and is the primary site of regulation of the pathway (7, 57). Analysis of the structure and catalytic mechanisms of B. subtilis amidoPRTase has confirmed that phosphoribosylamine synthesis is allosterically regulated and that catalysis involves sequential steps at two domains, effecting, respectively, hydrolysis of glutamine to NH3 and then PPi displacement from PRPP by NH3 (53-56).
Regulation is achieved by virtue of an antagonistic interaction of PRPP and purine nucleotide pathway products acting as allosteric effectors of enzyme structure and thus catalysis (Fig. 2). Human amidoPRTase is both inhibited (4, 52) and converted to an inactive dimer from an active monomer (4, 58) by purine nucleotides, and inhibition of amidoPRTase by purine mononucleotides shows synergy between nucleotides bearing different (amino or hydroxyl) substituents at position 6 of the purine ring (52, 55, 56). These inhibitory effects can be overcome by increasing concentrations of PRPP (4, 52, 58). At physiological purine nucleotide concentrations, activity of the human enzyme responds to increasing PRPP concentrations in a sigmoidal fashion (4, 58). The fact that PRPP concentrations in normal cells are less than the apparent affinity constant of amidoPRTase for PRPP suggests that PRPP availability is the basis of rate limitation of the pathway at the amidoPRTase reaction. With few exceptions (51), in fact, increased intracellular PRPP concentrations accelerate purine synthesis de novo, and depletion of the compound slows the rate of the pathway (8). An additional contribution to the regulation of purine synthesis and its potential coordination with other nucleotide biosynthetic pathways is provided by purine nucleotide inhibition of PRPP synthetase activity (7, 49, 57). Because sensitivity of this enzyme to nucleotide inhibitors is less than that of amidoPRTase (49, 57), the latter serves as the primary site of regulation of purine nucleotide synthesis de novo. Nevertheless, dual sites of regulation of pathway activity provide the potential for a flexible system of control, such that small changes in purine nucleotide pools may modulate amidoPRTase activity without substantial alteration in the generation of PRPP for its other major metabolic roles. More marked changes in nucleotide pools, however, may alter the state of activation of PRPP synthetase, regulating availability of PRPP as a substrate and/or activator in pyrimidine and pyridine as well as purine nucleotide synthetic pathways.
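The antagonism between PRPP and purine nucleotides at amidoPRTase can be pictured with a phenomenological Hill-type expression in which the inhibitor raises the apparent half-saturation constant for PRPP. The functional form, the Hill coefficient, and every constant below are assumptions chosen only to reproduce the qualitative behavior described above (a sigmoidal response to PRPP and inhibition that increasing PRPP can overcome). They are not measured properties of the human enzyme.

```python
def amido_prt_activity(prpp, nucleotide, vmax=1.0, k_half=1.0, hill=2.0, k_i=1.0):
    """Relative amidoPRTase activity under an assumed Hill-type rate law.

    prpp       -- PRPP concentration (arbitrary units)
    nucleotide -- inhibitory purine nucleotide concentration (arbitrary units)
    vmax       -- maximal activity (hypothetical)
    k_half     -- PRPP giving half-maximal activity without inhibitor (hypothetical)
    hill       -- Hill coefficient; values above 1 give a sigmoidal curve (hypothetical)
    k_i        -- inhibitor concentration that doubles the apparent k_half (hypothetical)
    """
    # The inhibitor shifts the curve to the right without lowering vmax,
    # so sufficiently high PRPP restores activity.
    k_apparent = k_half * (1.0 + nucleotide / k_i)
    return vmax * prpp**hill / (k_apparent**hill + prpp**hill)


uninhibited = amido_prt_activity(prpp=1.0, nucleotide=0.0)
inhibited = amido_prt_activity(prpp=1.0, nucleotide=1.0)
rescued = amido_prt_activity(prpp=10.0, nucleotide=1.0)
```

In this sketch, adding nucleotide at fixed PRPP lowers activity, and a tenfold increase in PRPP brings activity back close to the maximum, which is the pattern the kinetic studies cited above report qualitatively.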
The structural and kinetic studies suggesting that PRPP is both a rate-limiting substrate for and an allosteric regulator ofpurine synthesis de novo are supported by the results of investigations in intact cells and organisms (reviewed in Ref. 8). In brief, depletion of PRPP with compounds such as orotic acid and nicotinic acid results in diminished rates of purine synthesis de novo in cultured human fibroblasts, even though the nucleotide derivatives of these compounds do not inhibit amidoPRTase or PRPP synthetase. Conversely, increases in intracellular PRPP concentrations resulting from activation of the pentose phosphate pathway in mammalian cells incubated at supraphysiological Pi concentrations are accompanied by accelerated purine nucleotide synthesis. Finally, increased availability of PRPP for purine synthesis de novo, manifested by increased intracellular PRPP concentrations, is associated with excessive rates of the pathway in persons with the two X chromosome-linked human disorders, hypoxanthine-guaninePRTase deficiency (48, 49, 59) and PRPP synthetase overactivity (49, 60). In the alternative pathways ofpurine nucleotide synthesis, preformed purine bases are converted to their respective nucleoside monophosphates in PRTase


MICHAEL A. BECKER

reactions with PRPP (Table I; Fig. 2). The two enzymes catalyzing the purine base salvage reactions are hypoxanthine-guaninePRTase, which converts hypoxanthine, guanine, and xanthine into IMP, GMP, and XMP, respectively; and adeninePRTase, for which adenine and aminoimidazolecarboxamide serve as natural substrates. The capacity of each enzyme to catalyze conversion of several synthetic purine base analogs (for example, 6-thioguanine and 6-mercaptopurine) into biologically active nucleoside monophosphate derivatives has been widely exploited in cell biology and chemotherapeutics (1, 2). Although operation of the purine base salvage pathway does not result in net synthesis of body purine compounds, the requirement for only one mole of ATP (for PRPP synthesis) per mole of nucleotide product represents considerable economy vis-à-vis the pathway of purine synthesis de novo, which requires six moles of ATP for the synthesis of one mole of inosine 5'-monophosphate. The reciprocal relationship between rates of the pathways of purine nucleotide synthesis appears to be governed, at least at the moment-to-moment level, by competition for the common substrate, PRPP, and by differences among PRTases in sensitivity to inhibition by purine nucleotide end products. For example, when compared in extracts of human cells, purine PRTases have higher affinities and amidoPRTase has lower affinity for PRPP (61). In pyrimidine nucleotide synthesis de novo, PRPP is a substrate in the orotatePRTase reaction in which orotate is converted to the ribonucleoside monophosphate, OMP, which is then decarboxylated to UMP. In contrast to the case in purine salvage, the major pathway of reutilization of pyrimidine compounds in animal cells is PRPP-independent phosphorylation of pyrimidine nucleosides.
Despite this fundamental difference in the organization of purine and pyrimidine biosynthetic pathways, considerable evidence implicates PRPP as a regulator as well as a substrate in mammalian pyrimidine nucleotide synthesis (6, 62). In mammals, the first three reactions of pyrimidine ring formation are catalyzed by a multifunctional 243-kDa protein (CAD) with glutamine-dependent carbamyl phosphate synthetase II (EC 2.7.2.9), aspartate transcarbamylase (EC 2.1.3.2), and dihydroorotase (EC 3.5.2.3) specificities. In keeping with this structural organization, in which the first enzyme committed to pyrimidine ring formation is carbamyl phosphate synthetase (rather than aspartate transcarbamylase, as in bacteria), control of rates of mammalian pyrimidine nucleotide synthesis is ordinarily directed at this enzyme (62). Specifically, activity of the carbamyl phosphate synthetase II domain of CAD appears to be regulated by UTP feedback inhibition and PRPP activation operating within the B3 subdomain of CAD. Recent evidence indicates that the B3 region is a site of phosphorylation by the mitogen-activated protein (MAP) kinase cascade (63). When phosphorylated by MAP kinase in vitro or activated by epidermal growth factor in vivo, CAD becomes less sensitive to feedback inhibition and more sensitive to activation.



These effects favor enhanced pyrimidine nucleotide synthesis. Support for PRPP regulation in this pathway comes from studies in human fibroblasts derived from individuals with excessive rates of PRPP synthesis due to mutations diminishing allosteric control of PRPP synthetase activity (18, 49). In these cells, intracellular pyrimidine as well as purine nucleoside triphosphate concentrations are increased (49).

IV. PRPP Synthetase

A. Structure and Proposed Mechanism of a Bacterial PRPP Synthetase

Recently, the three-dimensional structures of B. subtilis PRPP synthetase in complex with analogs of the activator Pi and the inhibitor ADP were solved (16, 35). The enzyme monomer contains 316 amino acid residues, but the functional form of the enzyme is a homohexamer, a quaternary structural state that is essential to the proposed mechanisms of catalysis and its regulation. This, together with the high degree of primary structure homology with mammalian as well as other bacterial PRPP synthetases (9-15), suggests that the hexameric structure will be a general rule among type I PRPP synthetases (16, 35). The PRPP synthetase monomer has amino- and carboxyl-terminal domains with secondary folds that are markedly similar and are related by twofold symmetry. This finding, which suggests a remote gene duplication, had not been predicted from a consideration of the primary structure that shows only 20% sequence homology between amino- and carboxyl-terminal domains. The secondary structure of each domain contains a five-stranded parallel β-sheet with two α-helices on each side and a flag region; each domain closely resembles a type 1 PRTase. A PRPP-binding fingerprint motif in the carboxyl-terminal domain of the PRPP synthetase subunit (Fig. 1) is believed to be involved in Rib-5-P binding, and residues likely to bind the terminal pyrophosphoryl group of ATP (PPi loop) are also present in this domain. The N-terminal domains of adjacent subunits, in a bent head-to-head arrangement, contribute, through their α-helices, to interactions involved in the nesting of the catalytic ATP binding site. It is believed that MgATP is bound to amino-terminal domain residues on the opposite side of a catalytic cleft (located between the amino- and carboxyl-terminal domains) from the carboxyl-terminal domain-bound Rib-5-P (16).
The hypothesized mechanism of catalysis involves bridging of the cleft by the terminal pyrophosphate of ATP (complexed with Mg2+), binding of this moiety to the PPi loop, and interaction with conserved basic residues in the C-terminal domain flag region and the N-terminal domain flexible loop. In contrast, the mechanism for allosteric regulation by ADP and Pi is believed to involve


effector interactions with residues from three subunits of the hexamer, including residues from the flexible loop. The inhibitor, ADP, and the activator, Pi, appear to compete at this regulatory site, which is located at the interface of two dimers of the hexamer. Although some aspects of this model were speculative at the time of publication, more recent success in crystallizing B. subtilis PRPP synthetase in the presence of Mg2+ has permitted growth and analysis of crystals representing the inhibited form of PRPP synthetase (complexed with GDP and substrates) and a complex (transition-state structure) containing one substrate (Rib-5-P), one product (AMP), and the transition-state analog aluminum fluoride.³ These studies have extended and confirmed the structural scheme and have led to a detailed elucidation of the catalytic and regulatory mechanisms of the PRPP synthetase reaction.

B. The Human PRPP Synthetase Gene and Isoform Family

The human PRPP synthetase gene family is composed of three PRPP synthetase genes (PRPS 1-3) encoding catalytic isoforms (PRS 1-3) of identical length (10-12), and two PRPP synthetase-associated genes (PRPSAP 1 and 2) encoding noncatalytic polypeptides of 39 kDa (38) and 41 kDa (39), respectively. Human PRPS1 and PRPS2 map, respectively, to the long and the short arms of the X chromosome (64), PRPS1 to the interval Xq22-q24 and PRPS2 to the interval Xp22.2-p22.3 (65). Both of these genes are widely or universally expressed in tissues (21, 22). PRPS3 maps to human chromosome 7p (64) and appears to be transcribed only in the testes (21). X-linked human PRS1 and PRS2 cDNAs show 80% nucleotide sequence identity throughout their 954-bp translated regions, but the corresponding 5' and 3' untranslated regions lack homology. PRS1 and PRS2 cDNAs hybridize with distinct transcripts of 2.3 and 2.7 kb, respectively (65), and no immature or alternative transcripts have been identified to date. A PRS3 protein corresponding to the 1.4-kb PRS3 transcript in testicular tissue has not yet been identified (17). Each X-linked PRPS gene contains seven exons with virtually identical exon-intron borders. 5' Promoter regions of the genes are, however, structurally distinct (66). Normal human PRS1 and PRS2 cDNAs were expressed in E. coli, and purified recombinant PRS isoforms were compared (30). Despite 95% amino acid sequence identity (Fig. 1), recombinant PRS1 and PRS2 isoforms differ in several physical and kinetic properties. The proteins differ in isoelectric points, and this has been exploited to achieve separation and quantitation of PRS1 (pI = 6.8) and PRS2 (pI = 6.6) in tissue samples (Fig. 3). PRS2 activity is more thermolabile than that of PRS1 and also undergoes substantial immediate but reversible inactivation when diluted in phosphate buffer lacking Mg2+ and ATP, conditions

³ F. Nygaard, University of Copenhagen, unpublished.


[Figure 3: immunoblot of an isoelectric focusing gel, lanes 1-9; the PRS1 bands are bracketed near pH 6.8.]

FIG. 3. Immunoblot analysis of normal human purified recombinant PRS1 and PRS2 (lanes 5-9) and of PRS isoforms in supernatant fractions of crude extracts of human B lymphoblasts derived from normal individuals and patients with PRPP synthetase "catalytic" overactivity (lanes 1-4). After electrophoresis of samples on a polyacrylamide-urea isoelectric focusing gel and electrotransfer of protein to a polyvinylidene difluoride membrane, immunoblotting was carried out, all as described in Ref. 67. Recombinant PRS1 and PRS2 migrate as doublets of immunoreactive bands (brackets) with pIs near 6.8 and 6.6, respectively. For PRS1, minor bands migrating with more acidic pIs are seen (lanes 5-7) when increasing amounts of PRS1 are applied. Lanes 1 and 3: Normal lymphoblast extracts (25 μg); lanes 2 and 4: extracts from patients 7 and 9 (Table V), respectively (25 μg each); lanes 5-9: recombinant PRS1 (61.2, 30.6, 15.3, 7.6, and 3.8 ng, respectively) and PRS2 (37.5, 18.8, 9.4, 4.7, and 2.3 ng, respectively). Note that normal human B lymphoblasts contain comparable levels of PRS1 and PRS2, compared with much higher levels of PRS1 than PRS2 in normal fibroblasts (Figs. 5 and 6). Lymphoblasts from patients show more modest increases in PRS1 levels (about 2-fold relative to those in normal lymphoblasts) than is the case in fibroblasts (2-6-fold; Table V). Reproduced in modified form from Ref. 67 by permission.

in which PRS1 activity remains virtually intact. PRS2 activity is, however, stable when diluted in the presence of 0.3 mM ATP and 6 mM MgCl2, and activity can be restored completely on readdition of ATP and Mg2+ to dilution-inactivated enzyme. Prior studies of purified human erythrocyte PRPP synthetase (discussed in Section IV,D) demonstrated that reversible aggregation of the enzyme subunits in Pi buffer depends on concentrations of Mg2+ and ATP and that enzyme activity resides in the largest aggregates (32-34). Accordingly, the recombinant PRS isoforms were studied by molecular exclusion high-pressure liquid chromatography in the presence and absence of ATP and Mg2+ (30). When these small-molecule effectors were present, both isozymes eluted as single peaks with molecular masses >1000 kDa. In the absence of stabilizing agents, PRS1 eluted in two peaks, a major peak at >1000 kDa and a minor peak at ~100 kDa. In contrast, PRS2 eluted as a single peak at 100 kDa. These studies imply that human PRSs require ATP and Mg2+ to stabilize a state or states of subunit aggregation required for enzyme activity and establish that PRS1 and PRS2 respond differentially to changes in concentrations of these effector compounds.

C. Effectors of the Human PRPP Synthetase Reaction

Complexity in the structural and kinetic characteristics of human and microbial PRPP synthetases is consistent with the view that enzyme activity is


influenced by a variety of compounds, including substrates, inhibitors, and activators. In fact, analyses of PRPP synthesis in intact mammalian cells containing either normal or variant PRPP synthetases (7, 18, 49) confirm regulation of intracellular PRPP synthetase activity by some of these potential effectors. Other studies suggest additional mechanisms of regulation of PRPP production operating at genetic (19, 67), protein structural (30, 34), and protein interactional (17, 40) levels. Prior to recognition of the multiplicity of PRPP synthetase isoforms (9-12) and the availability of human recombinant PRS1 and PRS2 (30), studies of the human PRPP synthetase utilized purified erythrocyte enzyme (largely isoform 1) (26, 32-34) or partially purified enzyme preparations from cultured fibroblasts (80-90% isoform 1) or B lymphoblasts (isoforms 1 and 2 in nearly equal amounts; Fig. 3) (67). To the extent that the isoforms differ in structural and kinetic properties, or that active heteroaggregates of PRS subunits may in fact be present, the earlier studies may be viewed as providing composite and potentially inconsistent data. Where available, emphasis is given here to results obtained with purified human recombinant PRS isoforms 1 and 2.

1. SUBSTRATES

Among naturally occurring nucleoside triphosphates, only ATP and dATP are substrates for mammalian PRPP synthetases (28, 32), and ATP is bound to PRPP synthetase in the form of a magnesium-ATP complex, with optimal enzyme activity also requiring the presence of free Mg2+ (13, 31, 32, 35, 41). A high degree of substrate specificity is also demonstrable for Rib-5-P as the pyrophosphoryl receptor, although ribulose-5-phosphate can serve less well as a substrate for the human enzyme (32). Kinetic constants for Rib-5-P and MgATP were determined for purified human recombinant PRS1 and PRS2 (30) (Table II). The Km value for Rib-5-P was lower for isoform 1 than for isoform 2. The corresponding kinetic constant for MgATP was also lower for isoform 1. The double reciprocal plot for Rib-5-P saturation of each isoform was linear, as was the plot for MgATP saturation of recombinant PRS1. Although the MgATP saturation curve for PRS2 was consistent with negative cooperativity of MgATP binding, the instability of isoform 2 at low MgATP concentrations appeared a more likely explanation (30). The low values for MgATP saturation relative to intracellular ATP and Mg2+ concentrations suggest saturation for MgATP of both isoforms in the cell. In an earlier report (28) involving purified rat liver PRPP synthetase, a sigmoidal relationship between MgATP concentration and initial reaction velocity was observed when ATP was present in excess of Mg2+ or with equimolar concentrations of each. This suggested that free Mg2+ as well as MgATP could be a determinant of intracellular enzyme activity. As described in Section IV,F, recombinant superactive human PRS1s with mutations that diminish or abolish the noncompetitive inhibition of PRPP synthetase by ADP and GDP retain a



TABLE II
KINETIC CONSTANTS FOR THE PRPP SYNTHETASE REACTION CATALYZED BY HUMAN RECOMBINANT PRS1 AND PRS2ᵃ

                        Constant    rPRS1      rPRS2
Substrate affinity
  Rib-5-P               Km          52 μM      83 μM
  MgATP                 Km          21 μM      --
  MgATP                 S0.5        --         67-70 μM
Activation
  Mg2+                  Ka          70 μM      110 μM
  Pi                    Ka          0.7 mM     2.1 mM
Inhibition
  ADP                   I0.5        160 μM     260 μM
  GDP                   I0.5        420 μM     800 μM
  DPG                   I0.5        4.5 mM     2.0 mM

ᵃ Adapted from Ref. 30 by permission. Substrate affinities and Mg2+ activation were studied at 50 mM Pi; inhibition by ADP and GDP was studied at 5 mM Pi and by DPG at 1.0 mM Pi.
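As a rough illustration of how the substrate constants in Table II can be read, the sketch below evaluates the simple Michaelis-Menten expression v/Vmax = [S]/(Km + [S]) for each Km listed. It treats each substrate in isolation with the other assumed saturating, which is a simplification for a two-substrate enzyme, and the two concentrations used (`ASSUMED_RIB5P_UM`, `ASSUMED_MGATP_UM`) are hypothetical values chosen for illustration, not measurements from the text. The S0.5 value for rPRS2 with MgATP is omitted because it is not a simple Km.

```python
# Km values in micromolar, copied from Table II (Ref. 30).
KM_UM = {
    ("rPRS1", "Rib-5-P"): 52.0,
    ("rPRS2", "Rib-5-P"): 83.0,
    ("rPRS1", "MgATP"): 21.0,
}

# Hypothetical substrate concentrations, chosen only for illustration.
# They are assumptions, not values reported in the chapter.
ASSUMED_RIB5P_UM = 50.0
ASSUMED_MGATP_UM = 1000.0


def fractional_velocity(substrate_um: float, km_um: float) -> float:
    """Return v/Vmax for simple single-substrate Michaelis-Menten kinetics."""
    if substrate_um < 0 or km_um <= 0:
        raise ValueError("concentration must be >= 0 and Km must be > 0")
    return substrate_um / (km_um + substrate_um)


if __name__ == "__main__":
    for (isoform, substrate), km in KM_UM.items():
        conc = ASSUMED_MGATP_UM if substrate == "MgATP" else ASSUMED_RIB5P_UM
        frac = fractional_velocity(conc, km)
        print(f"{isoform} {substrate} at {conc:.0f} uM: v/Vmax = {frac:.2f}")
```

Under these assumptions, a substrate present near its Km leaves the enzyme well short of saturation, whereas a substrate present far above its Km leaves it close to saturation, which is the contrast the text draws between Rib-5-P and MgATP.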

potent mechanism of competitive inhibition by ADP (with respect to ATP) and have unaltered Km values for MgATP (18). This suggests that, at least for PRS1, neither conformational changes associated with binding of allosteric inhibitors nor the presence of significant ADP concentrations is of importance in altering affinity of PRS for MgATP in cells. A recent study of rat liver PRPP synthetase and its catalytic isoforms (68) concluded, however, that at least a portion of free Mg2+ activation of the enzyme and isoforms involves antagonism between Mg2+ and purine nucleoside diphosphate inhibitors. Sources of intracellular Rib-5-P include the oxidative and nonoxidative pentose phosphate pathways; phosphorolysis of purine and pyrimidine nucleosides and conversion of ribose-1-phosphate to Rib-5-P in a phosphoribomutase reaction; and, of minor importance, direct conversion of ribose to Rib-5-P in a ribokinase reaction. Of these, conversion of sugar phosphates by means of the pentose phosphate pathways constitutes the major source of Rib-5-P production. The PRPP synthetase reaction, utilizing Rib-5-P in the formation of PRPP, can thus be viewed as linking the pentose phosphate pathway and nucleotide biosynthetic pathways. Only a relatively small proportion of Rib-5-P generated is ordinarily converted to PRPP (69), but intracellular Rib-5-P concentrations are below or near the Michaelis constants of PRPP synthetase for this compound (70, 71). Thus, a role for Rib-5-P as a regulator as well as a substrate in the PRPP synthetase reaction would provide an attractive means to relate sugar and purine metabolic pathway activities. In a number of experimental circumstances in which Rib-5-P availability is increased, either through acceleration of the oxidative pentose phosphate


pathway or increased rate of nucleoside phosphorolysis, increased rates of PRPP production and/or increased rates of purine or pyrimidine nucleotide production have been demonstrated in human and other mammalian cells (reviewed in Ref. 8). In these studies, however, accompanying intracellular Pi concentrations have not been reported. Indeed, other studies in human erythrocytes, leucocytes, and fibroblasts and in rat liver slices have reported that increased PRPP synthesis in response to demonstrated or presumed increases in Rib-5-P levels is dependent on the Pi concentration at which the study is conducted (8). In one study (72), changes in oxidative pentose phosphate generation were completely dissociated from changes in PRPP synthesis. This study supported an important contribution of nonoxidative pentose phosphate generation in Rib-5-P production but concluded that factors other than Rib-5-P supply determine rates of PRPP production. Thus, a role for Rib-5-P in controlling PRPP production has not been fully established; however, as discussed below, it is postulated that Pi concentration exerts primary control over PRPP synthetase activity. In this scheme, rate-limiting control of PRPP production by Rib-5-P, if operative at all, is manifested only at Pi concentrations higher than those in the physiological range.

2. INHIBITORS

Kinetic studies of purified erythrocyte PRPP synthetase indicated at least three sites of interaction between the enzyme and inhibitors and described a broad array of inhibitory nucleotides, including pyrimidine and pyridine as well as purine nucleotides and reaction products (26). Subsequent studies of purified rat liver PRPP synthetase (28) and rat (29) and human (30) recombinant PRS1 and PRS2 have found a much more restricted range of inhibitors of PRPP synthetase activity. The likely basis for the discrepant findings is the presence or absence of stabilizing compounds, such as albumin, EDTA, and dithiothreitol in the reaction mixture.
In the presence of such agents, the range of enzyme inhibitors narrows sharply and includes only purine nucleoside diphosphates and 2,3-diphosphoglycerate (DPG). The most potent inhibitor of human PRPP synthetase was ADP, which showed a competitive pattern of inhibition with respect to MgATP. The apparent inhibition constant (Ki) was 10 μM, a value below intracellular ADP concentrations. This suggested a physiological role for the competitive interaction of adenylates in controlling enzyme activity (26). Subsequent studies of purified normal and mutant human recombinant PRS1s (18, 73) (Table II; Fig. 4), however, indicate that physiological adenylate (and guanylate) control of enzyme activity involves an allosteric mechanism of regulation that reflects a noncompetitive rather than competitive mechanism of purine nucleotide inhibition. Noncompetitive inhibition (with respect to substrates) is normally more clearly seen with GDP as inhibitor (18), because this process is obscured with ADP by the coexisting competitive mechanism (Fig. 4). ADP is the more

[Figure 4: double reciprocal plots, panels A-D; horizontal axis 1/ATP (μM).]

FIG. 4. Double reciprocal plots of the inhibition of purified recombinant PRS1s by ADP (A and B) and GDP (C and D) at varying ATP concentrations. (A) Recombinant normal PRS1. ADP concentrations were: 0 (○); 10 μM (●); 20 μM (□); and 30 μM (■). (B) Recombinant PRS1 from patient 1 (Tables III and IV). ADP concentrations were: 0 (○); 10 μM (●); 30 μM (□); and 50 μM (■). (C) Recombinant normal PRS1. GDP concentrations were: 0 (○); 10 μM (●); 50 μM (□); and 100 μM (■). (D) Patient 1 recombinant PRS1. GDP concentrations were: 0 (○); 100 μM (●); and 250 μM (□). Ki slope values, calculated from secondary plots of the slopes determined in panels A and B versus ADP concentrations, were 3 and 6 μM for normal and patient 1 PRS1s, respectively. The Ki intercept value for normal PRS1 was 7 μM. Note that the noncompetitive component of nucleoside diphosphate inhibition of normal PRS1 (panels A and C) is not present in plots of inhibition of patient 1 PRS1 (panels B and D). Reproduced in modified form from Ref. 18 by permission.
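The Ki slope and Ki intercept values quoted in the legend can be related to the plotted lines through the standard mixed-inhibition form of the double reciprocal equation, 1/v = (Km/Vmax)(1 + [I]/Ki,slope)(1/[S]) + (1/Vmax)(1 + [I]/Ki,intercept). The following is a minimal sketch under stated assumptions: Vmax is normalized to 1, the Km for MgATP is taken from Table II for rPRS1 and assumed (consistent with the unaltered Km reported for the superactive mutants) to apply to patient 1 PRS1 as well, and the absence of a noncompetitive component in patient 1 is represented by an infinite Ki intercept. The function name `line_parameters` is illustrative only.

```python
import math

KM_MGATP_UM = 21.0        # Table II, rPRS1
K_IS_NORMAL_UM = 3.0      # Ki slope, normal PRS1 (Fig. 4 legend)
K_II_NORMAL_UM = 7.0      # Ki intercept, normal PRS1 (Fig. 4 legend)
K_IS_PATIENT1_UM = 6.0    # Ki slope, patient 1 PRS1 (Fig. 4 legend)


def line_parameters(inhibitor_um, km_um, k_is_um, k_ii_um=math.inf, vmax=1.0):
    """Return (slope, intercept) of the 1/v versus 1/[S] line.

    k_is_um scales the slope (competitive component); k_ii_um scales the
    intercept (noncompetitive component). An infinite k_ii_um leaves the
    intercept unchanged, i.e. purely competitive inhibition.
    """
    slope = (km_um / vmax) * (1.0 + inhibitor_um / k_is_um)
    intercept = (1.0 / vmax) * (1.0 + inhibitor_um / k_ii_um)
    return slope, intercept


if __name__ == "__main__":
    for adp_um in (0.0, 10.0, 30.0):
        normal = line_parameters(adp_um, KM_MGATP_UM, K_IS_NORMAL_UM, K_II_NORMAL_UM)
        patient = line_parameters(adp_um, KM_MGATP_UM, K_IS_PATIENT1_UM)
        print(f"ADP {adp_um:>4.0f} uM  normal: {normal}  patient 1: {patient}")
```

In this sketch the lines for normal PRS1 change in both slope and intercept as ADP rises, whereas the lines for patient 1 PRS1 change in slope only and share a common intercept, which is the qualitative difference the legend describes.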


potent nucleoside diphosphate inhibitor of PRS1 and PRS2 isoforms at all Pi concentrations (29, 30). The potency of ADP and GDP inhibition of each recombinant PRS isoform diminishes with increasing Pi concentration, so that curves relating Pi concentration to initial velocity become increasingly sigmoid in the presence of either or both inhibitors (30). This finding is another reflection of the allosteric mechanism of regulation of PRS activity, which by analogy with the Bacillus model (16, 35), implies competition between Pi as activator and purine nucleotide inhibitors at amino acid residues in the regulatory site of the enzyme. Mammalian purified recombinant PRS2s are less sensitive to inhibition by ADP and GDP than are their PRS1 counterparts (29, 30) (Table II). A third site of erythrocyte PRPP synthetase-inhibitor interaction was invoked to explain the action of DPG and PRPP, which showed competitive patterns of inhibition with respect to Rib-5-P (26). Although the Ki for PRPP product inhibition was substantially higher than intracellular PRPP concentrations, the corresponding value for DPG was similar to that of DPG concentrations in the erythrocyte. The possibility that erythrocyte PRPP synthesis could be regulated indirectly by hemoglobin oxygenation, a process accompanied by changes in DPG, free Mg2+, and MgATP concentrations, was raised (74) but remains unproved. Among small-molecule effectors of recombinant PRS isoform activities, DPG is unique in having greater potency for PRS2 than for PRS1 (30) (Table II). Complexity in the interaction of PRS isoforms with this compound is suggested by two additional findings: (1) Uniquely among small-molecule effector compounds, incubation of erythrocyte PRPP synthetase with DPG results in reversible subunit disaggregation to the monomer state (34); (2) at Pi concentrations between 5 and 10 mM, DPG reproducibly stimulates rather than inhibits recombinant PRS1 and, to a lesser extent, PRS2 activities (30).

3. ACTIVATORS

Human PRPP synthetase has an absolute requirement for divalent cations, the most effective of which is Mg2+ (32). The Mg2+ requirement includes the need for a MgATP complex as a substrate and for free Mg2+ (13, 31, 32, 35, 41). Substitution of other divalent cations, such as Mn2+ or Cd2+, results in measurable but impaired activity (30). Ca2+ inhibits purified recombinant PRS isoform activities. Binding of Mg2+ to PRS isoforms appears to precede binding of MgATP, reinforcing the view of the cation as an activator (16). Ka values calculated for free Mg2+ were lower for human recombinant PRS1 than for PRS2 (30) (Table II). Inorganic phosphate is required for activity of human PRPP synthetase, and removal of Pi from the enzyme results in complete but reversible loss of activity. Pi activation of human recombinant PRS isoforms was optimal at 40-50 mM when studied at pH 7.5, but Ka values for Pi were substantially lower (Table II),



especially for PRS1 (30). Activation in both cases was sigmoidal, even in the absence of nucleotide inhibitors. The sigmoid shape of the Pi activation curves was progressively shifted to the right by addition of increasing concentrations of ADP or GDP. When SO42- was substituted for Pi, PRS isoforms could be partially activated to approximately 30% of maximal Pi activation at 50 mM SO42- for PRS1 and 17% of maximal Pi activation at 100 mM SO42- for PRS2. In the absence of other effectors, addition of Pi to erythrocyte PRPP synthetase (34) or purified recombinant PRS isoforms (30) is unaccompanied by changes in the quaternary structure of the enzyme. Aggregation and disaggregation of PRPP synthetase subunits in response to other effectors, however, require the presence of Pi (30, 32, 34). Potentiation of aggregation clearly cannot account for the magnitude of Pi activation of PRPP synthetase, at least part of which appears to involve Pi binding to the allosteric regulatory region of the enzyme to promote an active conformation or to reverse an inhibitor-induced or effector-null inactive conformation (18). Further understanding of this process is of importance in view of the primary role assigned to Pi in the control of intracellular PRPP synthetase activity and hence the rate of PRPP production.
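The sigmoidal Pi activation described above, and its rightward shift in the presence of ADP or GDP, can be pictured with a Hill-type curve. The sketch below is illustrative only: it treats the Ka values of Table II as half-maximal Pi concentrations, uses a hypothetical Hill coefficient (`ASSUMED_HILL_N`, not reported in the text), and represents inhibitor action simply as an increase in the apparent Ka (`shift_factor`, also an assumption). It does not attempt to model the optimum at 40-50 mM Pi.

```python
# Ka values for Pi in millimolar, copied from Table II (Ref. 30).
KA_PI_MM = {"rPRS1": 0.7, "rPRS2": 2.1}

# Hypothetical Hill coefficient; the chapter states only that the
# activation curves are sigmoidal, not how steep they are.
ASSUMED_HILL_N = 2.0


def pi_activation(pi_mm: float, ka_mm: float, hill_n: float = ASSUMED_HILL_N,
                  shift_factor: float = 1.0) -> float:
    """Fractional activation by Pi from a Hill-type curve.

    shift_factor > 1 raises the apparent Ka, a crude stand-in for the
    rightward shift produced by ADP or GDP.
    """
    if pi_mm < 0 or ka_mm <= 0 or hill_n <= 0 or shift_factor <= 0:
        raise ValueError("inputs must be positive (Pi may be zero)")
    apparent_ka = ka_mm * shift_factor
    term = pi_mm ** hill_n
    return term / (apparent_ka ** hill_n + term)


if __name__ == "__main__":
    for pi_mm in (0.5, 1.0, 2.0, 5.0):
        free = pi_activation(pi_mm, KA_PI_MM["rPRS1"])
        inhibited = pi_activation(pi_mm, KA_PI_MM["rPRS1"], shift_factor=3.0)
        print(f"Pi {pi_mm} mM: no inhibitor {free:.2f}, shifted {inhibited:.2f}")
```

With these assumptions the inhibited curve lies below the uninhibited one at every Pi concentration and converges toward it as Pi rises, mirroring the observation that inhibitor potency diminishes with increasing Pi.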

D. Structures of Human PRPP Synthetases

Human PRS1 and PRS2 isoforms (Fig. 1) are polypeptides of 317 residues (after removal of N-terminal Met) with calculated and observed molecular masses of 34.7 kDa and 34.6 kDa, respectively (10, 11). Homology between the isoforms is 95.3%, with 11 single, mostly conservative, amino acid substitutions widely distributed and two tandem sequential amino acid substitutions at residues 4 and 5 and residues 17 and 18. Homologies with B. subtilis PRPP synthetase are 47.6% and 46.6%, respectively, and, with the E. coli enzyme, 48.9% for each human PRS (35, 43). (The two bacterial enzymes are 51.8% homologous.) Of particular note, when aligned sequences of 27 PRPP synthetases from 19 disparate species are compared (16), the human isoforms are identical with B. subtilis PRPP synthetase at all but four of 70 sites sharing an identical residue in at least 22 of the 27 enzymes. In fact, the human isoforms are identical with the B. subtilis enzyme in all 30 residues shared by 26 or 27 of the PRPP synthetases. Secondary folding predictions (M. A. Becker and M. Ahmed, unpublished), based on the amino acid sequences of PRS1 and PRS2, indicate strong similarities between human and B. subtilis enzymes, especially with regard to the parallel β-sheet and α-helical regions, as well as for the PPi loop, the PRPP binding fingerprint motif, and the flexible loop (Fig. 1). A relationship between PRPP synthetase quaternary structure and activity is supported by studies of human recombinant PRS isoforms (30) (Section IV,B) and PRPP synthetase isolated from cells (34). The major forms of the enzyme purified from mammalian tissues appear to be high-molecular-weight aggregates


(26, 28, 33, 34, 75). When purified erythrocyte enzyme was subjected to structural analysis, evidence for multiple states of aggregation was shown, each composed of PRS1 subunits only.⁴ Reversibility of the aggregation-disaggregation process in vitro was also established in studies in which the number of subunits comprising each aggregate was estimated by gel filtration and sucrose density gradient centrifugation, carried out in the presence of appropriate effectors of enzyme activity (32-34). In brief, Pi is required for aggregation and disaggregation of subunits, but neither Pi nor Rib-5-P induces aggregation or disaggregation in the absence of other effectors. Mg2+ and MgATP promote subunit aggregation to forms estimated to contain 16 and 32 subunits (33, 34).⁵ Nucleotide inhibitors of enzyme activity also promote aggregation and appear to exert their effects directly on enzyme subunits in the largest aggregated states (33). In contrast, DPG not only induces dissociation to monomers but also antagonizes Mg2+- and MgATP-mediated aggregation. When the activities of the individual aggregates were assessed under conditions permitting measurement of enzyme activity but no shift in states of aggregation, only the largest aggregates of PRPP synthetase exhibited enzyme activity (34). Overall, aggregation and disaggregation of PRPP synthetase subunits is an enzyme and effector concentration-dependent process associated with several potential mechanisms through which quaternary structure and activity could be related to the action of inhibitors, activators, and substrates. Whether, however, this process operates in the control of PRPP production in intact cells remains uncertain (34). Furthermore, the role, if any, of heteroaggregates of human PRS1 and PRS2 subunits in regulation of enzyme activity is unexplored to date.
Studies with purified rat liver PRPP synthetase support the existence of such PRS heteroaggregates (75) and even more complex heteroaggregates that contain not only PRS1 and PRS2 catalytic isoforms but also specific noncatalytic PRPP synthetase-associated proteins (17, 36-40, 75).

E. PRPP Synthetase-Associated Proteins (PAPs)

In the course of purification and structural analysis of rat liver PRPP synthetase, Tatibana and colleagues identified proteins with approximate molecular masses of 39 kDa and 41 kDa (PAP39 and PAP41, respectively) that copurified with 34-kDa PRS catalytic isoforms (17, 75). The complex could be partially dissociated by gel filtration in the presence of 1 M MgCl2, yielding purified PRPP synthetase with increased specific activity. The suggestion that mammalian PRPP

⁴ The latter point was supported by the demonstration of PRS1 but not PRS2 peptide sequences in enzymatic and chemical digests of the purified protein (M. A. Becker, unpublished).

⁵ A discrepancy, currently unresolved, exists between the estimated numbers of subunits in aggregates of human PRPP synthetase (2, 4, 8, 16, and 32) and the established hexameric structure of the Bacillus enzyme (16, 35). It is possible the methods used to estimate the human aggregate molecular weights (33) did not provide the accuracy achievable by contemporary structural analysis.



synthetase subunits might interact with a protein or proteins suppressing the activity of PRS catalytic isoforms led to studies in which the cloning of cDNAs for rat PAP39 and PAP41 (and, subsequently, for human PAP39 and PAP41) was achieved (36-39). The deduced amino acid sequences of rat and human PAP39s and PAP41s (Fig. 1) show considerable homology with PRS isoforms from the respective species; in the case of human PAP39, exclusive of two entirely dissimilar regions of sequence, homologies with PRS1 and PRS2 isoforms are 44% (38). For human PAP41, the corresponding homologies are 50% (39). Despite sequence homology, there are multiple substitutions in sequences and in specific residues otherwise conserved among mammalian and bacterial PRPP synthetases. This, and the presence of nonhomologous regions, favors the view that PAPs are noncatalytic proteins (17, 36, 40), which can, however, be regarded as members of a subfamily derived from the PRPP synthetase gene family (17). Interestingly, PAP39 and PAP41 transcripts are detectable in all rat and human tissues and cells examined to date (36-39). Patterns of tissue expression differ for each PAP transcript, however, revealing tissue-specific regulation of PAP expression, which appears to be independent of the tissue-specific expression of PRS1 and PRS2 (21, 36-39). Molecular interaction between PAP39 and PRS isoforms was confirmed by covalent crosslinking experiments and by coimmunoprecipitation of rat liver PRPP synthetase with PAP39 after incubation with monoclonal anti-PAP39 (36). An increase in the specific activity of rat liver PRPP synthetase as a result either of dissociating the PAP-PRS complex or of limited trypsin digestion of the complex was consistent with an enzyme-inhibitory role for PAP39 (17, 36). Further evidence was provided by inhibition of PRPP synthetase activity in a PAP39 dose-dependent fashion when the enzyme complex was partially reconstituted in E.
coli by coexpression of rat PAP39 and either PRS1 or PRS2 (40). Although less firmly established, evidence from in vitro reconstitution experiments implies that, in addition to inhibiting PRPP synthetase catalytic activity, PAPs may lower the sensitivity of the enzyme to nucleotide inhibition (17, 36). With regard to the structural basis of the PAP-PRS interaction, the approximate molar ratio of the components has been estimated to be 20:5:8:1 (PRSl:PRS2:PAP39:PAP41), indicating a molecular mass exceeding 1000 kDa (17). Sedimentation properties of the rat liver complex, however, suggest that further aggregation of the complex is likely as well (17). Selective sensitivity of PAP39 to digestion during incubation of the complex with trypsin implies a spatial arrangement in which PAP39 subunits are arrayed on the surface of the complex, providing access to the proteolytic enzyme (40). Furthermore, the PAP39 components of partially reconstituted complexes with PRS1 are more sensitive to tryptic digestion than are the PAP39 components of complexes with PRS2 (40). Overall, the identification and study of PAPs have provided evidence for heteroprotein interactions involving PRS isoforms, with resulting alterations

136

M I C H A E L A. BECKER

in catalytic and possibly regulatoryproperties. Much remains to be learned about the magnitude of these effects and their physiological significance. On the other hand, tissue-differential expression of both PRS isoforms (21, 22) and PAPs (36--39), as well as evidence that most or all of the PRS isoforms may be complexed in cells (36), suggests the potential for a major posttranscriptional role for PAPs in the regulation of intracellular PRPP production.
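The lower bound on the mass of the PAP-PRS complex quoted in this section follows from the molar ratio and the subunit masses. A minimal sketch of that arithmetic, assuming that one unit of the complex contains exactly the stated number of each subunit and that PRS1 and PRS2 both have the 34-kDa mass given for the catalytic isoforms (the function and variable names are illustrative only):

```python
# Subunit masses in kDa as quoted in the text. Treating PRS1 and PRS2 as
# both 34 kDa is an assumption of this sketch.
SUBUNIT_MASS_KDA = {"PRS1": 34, "PRS2": 34, "PAP39": 39, "PAP41": 41}

# Approximate molar ratio PRS1:PRS2:PAP39:PAP41 reported for the complex (17).
MOLAR_RATIO = {"PRS1": 20, "PRS2": 5, "PAP39": 8, "PAP41": 1}


def complex_mass_kda(ratio, masses):
    """Sum (copies x subunit mass) over every component of the complex."""
    return sum(copies * masses[name] for name, copies in ratio.items())


if __name__ == "__main__":
    total = complex_mass_kda(MOLAR_RATIO, SUBUNIT_MASS_KDA)
    print(f"Minimal complex mass: {total} kDa")
```

With these inputs the sum is 1203 kDa, which agrees with the statement that the molecular mass exceeds 1000 kDa. The sedimentation data suggesting further aggregation would raise the figure, not lower it.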

F. Inherited Human PRPP Synthetase Overactivity

Much of the information relevant to specific processes involved in the control of PRS isoform expression in human cells comes from the study of inherited states in which an excess of PRPP synthetase activity results in disordered purine metabolism (60, 76). Inherited overactivity of PRPP synthetase is an X chromosome-linked disorder (77) associated with purine nucleotide and uric acid overproduction, gout (60, 76), and, in some families, neurodevelopmental impairment (20). The metabolic aspects of PRPP synthetase overactivity are, in all affected individuals, consequences of increased PRPP availability (49, 78), leading to acceleration of purine nucleotide and uric acid synthesis. The kinetic bases of enzyme overactivity are, however, heterogeneous, and include (1) regulatory defects, in which allosteric control of PRPP synthetase activity by purine nucleotides and Pi is impaired (18, 20, 78-80); and (2) "catalytic" defects, characterized by increased maximal reaction velocity but otherwise normal kinetic properties with respect to substrates, inhibitors, and activators (27, 81-83). Individuals in whom altered allosteric regulation comprises the sole or a major portion of the defect in PRPP synthetase activity show the most severe biochemical and clinical phenotypes (20).

The genetic heterogeneity implied by differences in the kinetic abnormalities and phenotypic expression of inherited PRPP synthetase overactivity has been confirmed: point mutations in the translated region of PRPS1 provide the genetic basis for altered allosteric control of PRS1 activity (18, 84); in contrast, PRPP synthetase "catalytic" overactivity results from an increased concentration of the normal PRS1 isoform (67), apparently due to selective acceleration of transcription of PRPS1 (19).

We carried out reverse transcription-polymerase chain reaction amplification of patient and normal fibroblast RNA and sequenced the resulting PRS cDNAs (18). We identified and confirmed single base substitutions in the PRS1 cDNAs derived from six unrelated hemizygous male patients with ADP- and GDP-resistant, overactive PRSs (Table III). For each individual, the base change in PRS1 cDNA predicted a single amino acid substitution in PRS1. The altered residues ranged from amino acid residue 51 to residue 192 of the 317 amino acid PRS1 polypeptide. In order to assess the functional significance of the unique amino acid substitution predicted for each mutant PRS1, we expressed all six variant PRS1 cDNAs in E. coli and purified the mutant recombinant PRS1s (18). Each of



TABLE III
MUTATIONS IN PRS1 IN PATIENTS WITH PRS OVERACTIVITY DUE TO ALTERED ALLOSTERIC REGULATION a

Patient   Base substitution   Deduced amino acid replacement
1         G154 → C            Asp51 → His
2         A341 → G            Asn113 → Ser
3         C385 → A            Leu128 → Ile
4         G547 → C            Asp182 → His
5         C569 → T            Ala189 → Val
6         C579 → G            His192 → Gln

a Adapted from Ref. 18 by permission.
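The base positions and residue numbers in Table III can be cross-checked against each other. The sketch below assumes two conventions that the text does not state explicitly: that base 1 is the A of the initiator ATG, and that residue numbering omits the initiator methionine. Under those assumptions all six table entries are reproduced; the helper names are hypothetical.

```python
# (base position, deduced residue number) pairs taken from Table III.
TABLE_III = [(154, 51), (341, 113), (385, 128), (547, 182), (569, 189), (579, 192)]


def residue_from_base(base_position, initiator_met_counted=False):
    """Map a 1-based coding-sequence position to a residue number.

    Position 1 is assumed to be the A of the initiator ATG. When the
    initiator methionine is not counted (the convention that fits
    Table III), the residue number is the codon index minus one.
    """
    codon_index = (base_position + 2) // 3  # ceiling of base_position / 3
    return codon_index if initiator_met_counted else codon_index - 1


def codon_position(base_position):
    """Return 1, 2, or 3: where the base sits within its codon."""
    return (base_position - 1) % 3 + 1


if __name__ == "__main__":
    for base, residue in TABLE_III:
        print(base, residue_from_base(base), residue, codon_position(base))
```

For example, base 154 falls at the first position of codon 52, that is, residue 51 when the initiator methionine is not counted, as expected for a GAY (Asp) to CAY (His) change.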

these enzymes showed a pattern of aberrant responses to purine nucleoside diphosphate inhibition and to Pi activation (Table IV) that corresponded closely to that found for PRS in extracts of cells derived from the respective individual. That is, each mutation resulted in disruption of the noncompetitive mechanism of inhibition of PRS1 by ADP and GDP (Fig. 4) and lowered the apparent Ka for Pi (Table IV). In no instance, however, was the affinity for MgATP altered (18). Overall, these findings establish that point mutations in the translated region of PRPS1 provide the genetic basis of inherited PRPP synthetase overactivity due to disordered allosteric regulation. That disruption of allosteric control of PRPP synthetase activity results in marked PRPP and purine overproduction confirms a physiological role for this process in the control of intracellular PRPP synthesis.

TABLE IV
KINETIC CONSTANTS FOR THE PRPP SYNTHETASE REACTION CATALYZED BY NORMAL AND MUTANT RECOMBINANT PRS1s a

                   I0.5 (µM) for          Activation constant for Pi
Source of rPRS1    ADP b      GDP b       Ka (mM)      nH c
Normal             21         52          0.88         3.1
Patient 1          75         225         0.26         2.3
Patient 2          78         790         0.32         2.1
Patient 3          128        >1000       0.15         1.7
Patient 4          150        810         0.14         2.1
Patient 5          69         565         0.13         2.1
Patient 6          141        770         <0.10        --

a Adapted from Ref. 18 by permission.
b Determined at 1.0 mM Pi at pH 8.0.
c Hill coefficient.

Moreover, the fact that


single residue substitutions dispersed over a major portion of PRS1 result in diminished responsiveness to inhibition by ADP and GDP suggests that these compounds share a complex and probably extensive allosteric nucleotide inhibitory mechanism. It seems likely that these widespread mutations alter the transmission of allosteric effects to the active site of PRS1 rather than the primary structure of the nucleotide-binding residues in the allosteric site (18). Overall, multiple residues and regions of the PRS1 polypeptide appear to be involved in determining isoform stability, activity, and responsiveness to inhibitors. These suggestions are in accord with the model for regulation of B. subtilis PRPP synthetase activity (16, 35) and also with studies in which domains of rat PRS1 and PRS2 were switched and the kinetic differences between the isoforms were examined (85). Furthermore, concurrent impairment of nucleotide inhibition and of Pi responsiveness suggests that these effectors share a mechanism in which inhibited conformations favored by nucleotide binding are in equilibrium with more active conformations favored by Pi binding. In such a model, mutant PRS1s would likely be altered in residues involved in stabilizing the inhibited conformations, rather than directly in Pi or nucleotide binding.

The kinetics of ADP and GDP inhibition of purified recombinant normal and mutant PRS1s also provide information regarding mechanisms by which inhibition is effected in vivo (18). ADP inhibition of normal PRS1 involves noncompetitive as well as competitive mechanisms (18), but only a competitive mechanism of ADP inhibition is demonstrable for mutant recombinant PRS1s (Fig. 4). Despite a potent mechanism of competitive inhibition, however, cells bearing mutant PRS1s clearly resist inhibition of PRPP and purine nucleotide synthesis by both ADP and the noncompetitive inhibitor GDP (49). These findings are best reconciled by the view that these point mutations in PRS1 disrupt the major noncompetitive component of the allosteric nucleotide inhibitory mechanism (18, 73) without altering the ATP substrate binding region and that the competitive mechanism exerts little or no control on PRS1 activity under physiological conditions in which ATP concentrations are saturating for the enzyme.

When reverse transcription-polymerase chain reaction analysis of PRS1 and PRS2 transcripts was applied to the cells from six male patients with X-linked "catalytic" overactivity of PRPP synthetase, no alterations in the sequences of the translated regions of PRS1 or PRS2 cDNAs were detected (67), showing that mutations in the translated region of X-linked PRPS1 or PRPS2 do not account for this class of enzyme overactivity. The prior kinetic studies of PRPP synthetase "catalytic" overactivity (27, 81-83) were consistent with the possibility that this enzyme abnormality results from increased expression of one or both X-linked PRS isoforms with normal primary sequence. We then compared levels of X chromosome-linked PRS isoforms and transcripts in cultured cells from normal and affected individuals in order to determine whether one or both PRS



isoforms were involved in expression of PRPP synthetase "catalytic" overactivity and to clarify where in the genetic information process PRS isoform overexpression was likely to be determined (67). In these studies, we found that PRPP synthetase "catalytic" overactivity in fibroblasts is associated with a selective 2-6-fold increase in concentrations of PRS1 isoform and that total PRS activities in normal and patient cells correlate closely with total PRS isoform contents (Table V), nearly 90% of which is PRS1 (67). Enzyme overactivity is thus clearly associated with increased PRS1 content. In addition, PRS isoform specific activities (PRPP synthetase activity per mg PRS isoform) (67) in cell extracts are comparable in normal and patient cells and are similar to those of purified recombinant normal PRS isoforms (30). "Catalytic" overactivity is therefore not a result of preferential inhibition of normal PRPP synthetase activity, as might occur, for example, if the inherited defect altered interaction of normal PRS isoforms with an aberrant PAP39 or PAP41 (17, 36). Finally, steady-state levels of PRS1 transcripts are also selectively increased in patient cells (67) (Table V). The increases in PRS1 transcript and isoform levels in patient cells are coordinate, supporting the view that PRS1 overexpression is determined at least in major part by an altered pretranslational mechanism of control of PRPS1 gene expression. We have subsequently determined that neither PRPS1 gene amplification nor alterations in PRS1 transcript sequence or stability explain increased PRS1 transcript and isoform levels in cells from affected individuals (19). In contrast,

TABLE V
COORDINATE INCREASES IN PRPS1 TRANSCRIPTION RATES, PRS1 TRANSCRIPT LEVELS, AND PRS ISOFORM CONTENTS AND ACTIVITIES IN FIBROBLASTS FROM INDIVIDUALS WITH INHERITED OVERACTIVITY OF NORMAL PRPP SYNTHETASE a

                     PRPS1 relative           PRS1 relative          Total PRS isoforms     PRPP synthetase
                     transcription rate       transcript level       (PRS1 + PRS2)          activity b
Fibroblast source    (PRPS1/GAPDH) (× 10^2)   (PRS/GAPDH) (× 10^2)   (µg/mg protein)        (mUnits/mg protein)
Normal (n = 5)       5.4 (1.0)                2.5 (1.0)              0.31 (1.0)             8.11 (1.0)
Patient 7            (4.5)                    (5.9)                  (6.0)                  (4.9)
Patient 8            (3.3)                    (4.7)                  (4.5)                  (3.9)
Patient 9            (3.0)                    (3.4)                  (3.5)                  (3.3)

a Adapted from Ref. 19 by permission. Values are derived from the means of at least three separate determinations for each fibroblast strain. Means for normal fibroblast strains were averaged and are shown. Each normal value was assigned a relative value of 1.0, to which the relative mean values measured in patient cells (in parentheses) are compared.
b At 32 mM Pi.


rates of PRPS1 transcription (by nuclear runoff analysis) were 3- to 4-fold greater relative to those of glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and PRPS2 transcription in patient fibroblasts than in normal cells (Table V). The relative increases in PRPS1 transcription rates are also coordinate with the relative increases in PRS transcript and isoform levels and PRPP synthetase activities in patient fibroblasts. The molecular defect underlying the selectively excessive rate of PRPS1 transcription is currently unknown, but does not involve mutation in the PRPS1 transcribed region or in the promoter or 5'-adjacent 3 kb of PRPS1 DNA (22). Nevertheless, the fact that, despite intact allosteric control of PRS1 in cells from affected individuals (49), rates of PRPP, purine nucleotide, and uric acid production are excessive (81-83) establishes PRS1 isoform concentration as a determinant in the regulation of PRPP and purine nucleotide synthesis.
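The argument made earlier in this section, that a purely competitive inhibitor exerts little control when substrate is saturating whereas a noncompetitive component still does, can be illustrated with the textbook mixed-inhibition rate law. This is a sketch only: the rate law is the standard single-substrate form and not the fitted model of Ref. 18, and every numerical constant below is hypothetical.

```python
import math


def rate(s, i, vmax=1.0, km=0.05, kic=math.inf, kiu=math.inf):
    """Single-substrate mixed-inhibition rate law.

    kic: competitive constant (inhibitor binds the free enzyme).
    kiu: uncompetitive constant (inhibitor binds the enzyme-substrate complex).
    A pure noncompetitive inhibitor has kic == kiu; math.inf switches a term off.
    All concentrations and constants share one arbitrary unit.
    """
    return vmax * s / (km * (1 + i / kic) + s * (1 + i / kiu))


def fractional_activity(s, i, **constants):
    """Rate in the presence of inhibitor divided by the uninhibited rate."""
    return rate(s, i, **constants) / rate(s, 0.0, **constants)


if __name__ == "__main__":
    substrate = 5.0   # hypothetical: 100 x Km, i.e. near-saturating
    inhibitor = 0.1   # hypothetical: 5 x Ki
    competitive_only = fractional_activity(substrate, inhibitor, km=0.05, kic=0.02)
    noncompetitive = fractional_activity(substrate, inhibitor, km=0.05, kic=0.02, kiu=0.02)
    print(f"competitive only: {competitive_only:.2f}")
    print(f"noncompetitive:   {noncompetitive:.2f}")
```

With these illustrative numbers the competitive-only enzyme retains about 95% of its activity, while the noncompetitive case falls to about 17%. This is the qualitative behavior invoked to explain why loss of the noncompetitive component alone is sufficient for nucleotide resistance in cells.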

V. Regulation of PRPP Synthesis in Human Cells

Comparison of the properties of human PRS isoforms with the known intracellular concentrations of their effectors assures that conditions optimal to the full expression of enzyme activity in vitro never exist in cells. Rather, it is likely that substrate and free Mg2+ availability and, especially, intracellular concentrations of allosteric effectors play major roles in the moment-to-moment control of PRPP synthesis in intact cells under ordinary conditions of growth and nutrition.

With regard to substrate availability, high intracellular concentrations of ATP and Mg2+ in comparison to association constants of PRPP synthetases for MgATP make this substrate an unlikely candidate for a regulatory role under physiological circumstances. For reasons cited earlier (68), however, a potential role for free Mg2+ in antagonizing nucleotide inhibitors of PRPP synthetase warrants further investigation. This is especially pertinent to a proposed role for increases in intracellular free Mg2+ (86, 87) as the basis for the rapidly enhanced PRPP availability noted in growth-arrested mouse 3T6 (88) and 3T3 (89, 90) fibroblasts incubated with fresh serum or mitogens and growth-promoting factors.

Baseline concentrations of Rib-5-P that are at or below the Michaelis constants of PRS isoforms (70, 71) make this substrate attractive as a potential determinant of PRPP synthetic rate. Until, however, evidence is presented to document that increased Rib-5-P availability through any of the pathways of generation of this compound (69, 72) activates PRPP synthesis independent of (or synergistically with) prevailing intracellular Pi concentrations (8, 91), this suggested link (92, 93) between activities of pentose phosphate and purine biosynthetic pathways will remain unestablished. In fact, the weight of experimental evidence to date indicates that the concentration of Pi is a more potent determinant of PRPP synthesis than is the



concentration of Rib-5-P (91, 94-96). Even though intracellular Pi concentrations are usually quite low with respect to maximal Pi activation of PRS isoforms, cooperative Pi activation patterns (30) and association constants of PRS isoforms for Pi that are similar to intracellular Pi concentrations in some tissues (91) provide support for this contention. Finally, the primacy of Pi and of purine nucleoside diphosphate inhibitors as posttranslational regulators of PRPP synthetase activity is strongly supported by structural and mechanistic studies of the mammalian (30, 68) and bacterial (16, 35) enzymes and by the functional consequences of mutations in PRPS1 that abolish allosteric control of enzyme activity (18, 84).

A number of reports document or suggest mammalian cell type-specific increases in either PRPP concentrations or PRPP synthetase activities or rates of PRPP-dependent pathways in response to fresh serum, growth-promoting or mitogenic agents (88-90, 97, 98), hormones (99, 100), or dietary (101) or environmental (102) manipulations. In some instances, evidence for activation of PRPP synthesis is observable within minutes to a few hours of experimental manipulation. For example, both purine synthesis de novo and purine base phosphoribosylation are accelerated in quiescent, serum-depleted 3T3 (89, 90) and 3T6 (88) mouse fibroblasts within minutes of addition of serum or growth-promoting agents. The increased metabolic flux through PRPP in these circumstances (90) most likely results from activation of preformed PRS isoforms, which, in turn, may reflect increased intracellular divalent cation (Mg2+) concentration (86, 87). Changes in free Mg2+ may activate PRPP synthetase directly or, alternatively, through changes in the state of aggregation of PRS isoforms or the composition of isoform aggregates. In the latter case, a posttranslational mechanism of increased PRPP production could involve isoform homoaggregation, the formation of heteroaggregates of PRS1 and PRS2 isoforms, or changes in the structure or composition of PRS isoform complexes with PAPs. Whether any or all of these possibilities play a role in physiological or pharmacological models of growth stimulation remains uncertain, but in vitro support for the existence of these protein-protein interactions (17, 32, 34, 36, 40) provides a likely additional level of complexity of posttranslational regulation of PRPP synthetase activity.

PRPP and purine nucleotide overproduction characterizes cells from individuals in whom inherited excesses in concentrations of the normal PRS1 isoform result from accelerated PRPS1 transcription (19, 49). This association demonstrates that PRS1 enzyme concentration can, even under circumstances of intact allosteric regulation, act as a determinant of the rate of PRPP production. Involvement at the transcriptional level in this disorder implies that alteration in any pretranslational mechanism of PRPS1 gene expression could lead to excessive PRS1 isoform concentration and also result in PRPP overproduction. If, as seems likely under some circumstances and in some tissues, PRPS2 overexpression can similarly underlie PRPP overproduction (97), a framework


for approaching investigation of tissue-differential and isoform-specific differences in PRPP synthesis is provided. There is, in addition, some evidence (see below) that stable increases in rates of PRPP production occurring many hours to several days after mitogenic or hormone/nutrient stimulation of cells result from increased enzyme isoform abundance (22).

Synthesis of PRPP in human cells thus appears to be regulated in a complex manner by multiple posttranslational and pretranslational mechanisms, ultimately influencing activities of PRS isoforms. Our recent work has been aimed at delineating apparent pretranslational mechanisms underlying tissue-differential and isoform-specific expression of structurally normal PRPS genes. PRPP synthetase activity is present in all human and rodent organs, tissues, and cell lines studied (21, 22, 97) and is highest in rapidly dividing cell lines and in tissues populated largely by cells with high rates of turnover (M. A. Becker and M. Ahmed, unpublished). Although both X-linked PRS isoforms are present in all cell and tissue extracts, the relative contributions of PRS1 and PRS2 isoforms to total enzyme activity differ according to sample source (21, 97). PRS1 isoform concentrations vary over a range of about 3-fold; by comparison, PRS2 levels vary from just detectable to >50% of total PRS isoforms, resulting in a greater relative (although not necessarily absolute) range of expression for PRS2 (22). Both isoforms are most highly represented in cells that are dividing rapidly. In all instances studied to date, there is coordinate expression of the respective PRS transcript and isoform, suggesting that control of expression of the isoforms in this context is pretranslationally determined (22).

Results of studies in normal human fibroblasts and B lymphoblasts exemplify these statements. Fibroblasts and lymphoblasts contain comparable concentrations of PRS1 isoform and transcript, but PRS2 transcript and isoform levels are substantially greater in lymphoblasts than in fibroblasts (67) (Table VI). PRS2 transcript and isoform levels in normal peripheral blood lymphocytes are barely detectable and considerably lower than those of PRS1 (Fig. 5). Within 72 hours after addition of concanavalin A, however, or after EB virus lysate-mediated establishment of permanent B lymphoblast lines, both PRS1 and, especially, PRS2 transcript and isoform levels are increased so that PRS1 and PRS2 transcript and isoform levels are comparable. Similarly, in primary cultures of human fibroblasts, PRS2 comprises 10% or less of total PRS isoforms (Figs. 5 and 6). In contrast, permanent fibroblast lines, such as the human 293 kidney fibroblast line, show higher PRS enzyme activities and isoform levels, which are accompanied by PRS transcript and isoform levels with a greater proportion of PRPS2 gene products (Fig. 6). These findings suggest a preliminary model of isoform-specific and tissue-differential expression of PRPS genes in which PRPS1 expression is more or less constitutive, and PRPS2 responds in a more robust fashion to proliferative and transforming signals. The coordinate relationship between PRS transcript and isoform levels implies that regulation of PRPP



TABLE VI
PRS ISOFORM AND TRANSCRIPT EXPRESSION IN NORMAL HUMAN FIBROBLAST STRAINS AND B LYMPHOBLAST LINES

                             Isoform concentration a                  Transcript level b
                             (µg PRS/mg protein)                      (PRS/GAPDH c × 10^2)
Cell type                    PRS1          PRS2          %PRS2        PRS1         PRS2         %PRS2
Fibroblast strains (n = 8)   0.28 ± 0.06   0.03 ± 0.01   9.6          2.5 ± 0.4    0.4 ± 0.1    13.7
Lymphoblast lines (n = 5)    0.26 ± 0.04   0.22 ± 0.04   45.8         3.5 ± 0.3    4.3 ± 0.5    55.1

a Values given are means ± 1 SD for the respective cell type. Mean values for each cell strain or cell line were based on at least three determinations.
b Values are the group means ± 1 SD of three determinations in each cell strain or cell line of the ratios of densities in the respective bands on Northern blots measured on a PhosphorImager screen during 16-h exposure.
c GAPDH = glyceraldehyde-3-phosphate dehydrogenase transcript.
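The %PRS2 columns of Table VI can be recomputed from the tabulated means. The sketch below assumes that %PRS2 is defined as PRS2 / (PRS1 + PRS2) × 100, which the table does not state explicitly but which matches the printed values to within the rounding of the means; the dictionary layout and function name are illustrative only.

```python
# Mean (PRS1, PRS2) values from Table VI, keyed by (cell type, measurement).
TABLE_VI_MEANS = {
    ("fibroblast", "isoform"): (0.28, 0.03),
    ("fibroblast", "transcript"): (2.5, 0.4),
    ("lymphoblast", "isoform"): (0.26, 0.22),
    ("lymphoblast", "transcript"): (3.5, 4.3),
}

# %PRS2 values as printed in Table VI.
TABLE_VI_PERCENT = {
    ("fibroblast", "isoform"): 9.6,
    ("fibroblast", "transcript"): 13.7,
    ("lymphoblast", "isoform"): 45.8,
    ("lymphoblast", "transcript"): 55.1,
}


def percent_prs2(prs1, prs2):
    """PRS2 as a percentage of total X-linked PRS (PRS1 + PRS2)."""
    return 100.0 * prs2 / (prs1 + prs2)


if __name__ == "__main__":
    for key, (prs1, prs2) in TABLE_VI_MEANS.items():
        print(key, round(percent_prs2(prs1, prs2), 1), TABLE_VI_PERCENT[key])
```

The recomputed fibroblast values differ from the printed ones by about 0.1 percentage point, presumably because the printed percentages were derived from unrounded means.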

production is determined, in part at least, at a pretranslational level (19, 67), but the consequent expression of enzyme activity may also reflect the more modest sensitivity of PRS2 than PRS1 to allosteric effectors (29, 30). We have sought to identify structural and functional bases for tissue-specific regulation of PRPS gene expression in studies of the distinctive promoter regions and adjacent 5' DNAs of the two genes (66). Expression of both PRS isoforms in all cell types (21, 97) and the presence of GC-rich regions surrounding the promoters of each (66) are consistent with the designation of both as housekeeping

FIG. 5. Immunoblot analysis of X chromosome-linked PRS isoforms in the supernatant layers of extracts of human and mouse cells. Isoelectric focusing and immunoblotting were carried out as in Fig. 3 and Ref. 67. Lanes 1 and 3: normal human B lymphoblast extract, 50 µg and 100 µg protein, respectively; lane 2: normal primary human fibroblast extract, 50 µg; lane 4: mouse NIH3T3 fibroblast extract, 50 µg; lane 5: mouse A20 B lymphoma cell extract, 25 µg; lane 6: human peripheral blood lymphocyte extract, 80 µg. Note the comparable expression of PRS2 and PRS1 in lymphoblast lines and the relative dearth of PRS2 expression in the fibroblast strains as well as in blood lymphocytes.

FIG. 6. Immunoblot analysis of X chromosome-linked PRS isoforms in supernatant layers of extracts of human lymphoblast and 293 fibroblast cell lines and in primary fibroblast cell strains. Isoelectric focusing and immunoblotting were carried out as described in Fig. 3 and Ref. 67. Lane 1: Jurkat (T) lymphoma cell extract, 30 µg; lane 2: B lymphoblast extract, 58 µg; lane 3: extract of normal primary fibroblast culture, 114 µg; lane 4: kidney fibroblast cell line 293 extract, 82 µg; and lane 5: extract of a second normal primary fibroblast culture, 98 µg. Note that the established T lymphoma, EB virus-stimulated B lymphoblast, and fibroblast 293 lines all express PRS2 per mg extract protein at high levels compared with primary human fibroblast strains. In the case of Jurkat and 293 cell lines, PRS1 expression is also increased but to a relatively smaller degree.

genes. Also consistent with this view is the good agreement between human PRPS promoter activities and steady-state levels of the respective PRS transcripts in several transformed human cell lines (66). Nevertheless, both genes contain TATA-like sequences located appropriately for roles in transcriptional regulation, and both contain multiple promoter region consensus sequences associated with binding of transcription regulating proteins, such as Sp1, which have been implicated in the tissue-differential regulation of other housekeeping genes.

We have prepared PRPS promoter region/reporter gene constructs and have transiently transfected mouse NIH-3T3 fibroblasts and A20 B lymphoma cells as well as normal human B lymphoblasts and fibroblasts. In all of these cell lines, PRPS1 promoter expression is nearly constant independent of the length of PRPS1 5' proximal DNA (up to nearly 4 kbp from the transcription initiation sites) contained in the reporter gene construct. In contrast, fibroblast PRPS2 promoter expression studies indicate a negative regulatory element within about 1 kbp of the transcription initiation site, because reporter gene expression in constructs containing up to 2.2 kbp is less than one-third that in constructs with

TRANSCRIPTIONAL REGULATION OF THE ETHANOL UTILIZATION PATHWAY

20 times lower than that for NAD, and reduction of acetaldehyde is ~5 times faster than oxidation of ethanol. This indicates that the ADHI equilibrium of ethanol/acetaldehyde interconversion could be influenced by the redox balance in the cell.

Besides alcA, A. nidulans possesses at least two other genes coding for typical alcohol dehydrogenases (EC 1.1.1.1). alcB (25) codes for ADHII (15). ADHIII is the product of the alcC gene (26). The three genes are unlinked (27; H. M. Sealy-Lewis, personal communication). In contrast to alcA and aldA, the alcB and alcC genes are not positively controlled by the transactivator of ethanol catabolism, AlcR (15, 28, 29). The deduced amino acid sequences of the three A. nidulans ADHs show that they belong to the zinc-dependent enzyme subclass of the medium-chain dehydrogenase/reductase protein family (30, 31). The Saccharomyces cerevisiae counterparts Adh1p, Adh2p, and Adh3p also belong to this subclass.

alcA and alcC are highly similar: 78% identity at the protein level and 71% at the DNA level for the coding region. The two genes cross-hybridize when less stringent hybridization and washing conditions are applied (32). The positions of both introns are conserved between alcA and alcC, which suggests that these genes have evolved from a recent gene duplication. The two genes are also highly similar to their yeast counterparts: up to 57% identity at the protein level. Interestingly, alcB is less closely related. At the protein level the similarity is limited to 40-43% identity with both ADHI and ADHIII as well as with the S. cerevisiae proteins, the enzymes yielding the highest identity figures. The alcB gene contains three introns in the first half of its coding region. Unfortunately, ADHII and ADHIII have never been purified or kinetically characterized.

Evidence has accumulated indicating that neither alcB nor alcC is actively involved in ethanol catabolism. Gene disruption of alcC does not affect the growth characteristics of A. nidulans on ethanol (32). Sealy-Lewis (33) has isolated mutants in an alcA/alcR-deleted background that exhibited some ability to utilize ethanol as a sole carbon source on plates. The additive effects of two "up-mutations," alcD and alcE, resulted in a 100-fold overexpression of the alcB gene, and apparently, the basal level expression of aldA in absence of AlcR was sufficient to support the observed growth on ethanol. So, although ADHII and ADHIII appear able to convert ethanol into acetaldehyde (3), neither alcB nor alcC contributes to wild-type growth on ethanol as the sole source of carbon. Their promoters are simply not strong enough to support a catabolic flux.

B. The aldA Gene/ALDH

The aldA gene was cloned and sequenced by Pickett et al. (19). The gene product is a single 1.8-kb messenger encoding a primary translation product of 497 amino acids. It contains two short introns, one at each end of the coding region. The deduced amino acid sequence predicts a molecular mass of 54.1 kDa for the translation product, in perfect agreement with the apparent molecular mass of the subunit from the purified protein (54 kDa) upon SDS-PAGE (34). The native protein has an apparent molecular mass of 265 kDa (34), indicating a multimeric structure for the active enzyme. aldA encodes a typical ALDH (EC 1.2.1.5) with a broad substrate specificity characterized by a low Km for acetaldehyde (5 µM) (34). The enzyme exclusively converts aldehydes into carboxylic acids and not vice versa: acetaldehyde and benzaldehyde are the best substrates tested. It can use the cofactors NAD and NADP equally well (Km values of 0.55 and 0.29 mM, respectively). The purified enzyme is stable only in the presence of thiol reducing agents in fairly high concentration and its activity can be stimulated by potassium.

The sequence of the A. nidulans enzyme is highly similar to those of the human liver cytosolic ALDH 1 and mitochondrial ALDH 2 isozymes (35), exhibiting up to 60% identity with the latter. In contrast to its human counterpart ALDH 2, the aldA primary translation product lacks a typical N-terminal mitochondrial target sequence. Although its subcellular localization has never been addressed experimentally, sequence alignments suggest the absence of a target presequence, and we presume that the fungal enzyme resides in the cytosol. The identity of two functional analogs in S. cerevisiae, cytosolic Ald6p and mitochondrial Ald4p, was revealed only recently (36-38). To date, nothing has been published on the regulation of the yeast ALD genes.

B. FELENBOK ET AL.

Sequence analysis of three loss-of-function alleles of aldA (38a) has provided novel structural information on the ALDH reaction mechanism in addition to that known from ALDHs for which the three-dimensional structures have been resolved (39-41). A nonsense mutation aldA67 (2) results in a truncated protein of 130 amino acids without enzymatic activity. This mutant is unable to grow on ethanol but grows normally on other nonrelated carbon sources. aldA, the sole ALDH-encoding gene identified in A. nidulans, is therefore not essential for the fungus. Another strong mutant aldA57 (2) contains a missense mutation that impairs ALDH activity. The ALDH superfamily sequence alignment (35) shows that an invariant residue was changed (Gly 338 → Ser). The third mutant aldA15 (5) has a leaky phenotype caused by a missense mutation in a region coding for an α-helix involved in the substrate entrance tunnel of the protein (41). This suggests that the mutation (Ala 286 → Val) in aldA15 results in a reduced affinity for the substrate. These sequencing data show that the A. nidulans ALDH is highly similar to eukaryotic ALDHs. The amino acids involved in the active site are highly conserved and sequencing of aldA alleles has shown that they are functional in A. nidulans.

C. The alcR Gene

The positive-acting gene alcR encodes an 821-amino acid protein of the zinc binuclear class Zn2Cys6 (9, 42). A single small intron of 60 bp interrupts the ORF and maps close to the translation start site. A putative TATA box is localized 324 bp upstream of the start codon (9). Two transcription initiation points of the alcR gene were identified. These are separated by 467 bp and correspond to a short and a long mRNA species of 2.6 and 3.0 kb, respectively (13). The utilization of the two transcription initiation sites in the alcR gene depends on the presence of an externally applied coinducer. Without an external inducer, the proximal site is used, which corresponds to a 2.6-kb mRNA. The presence of an external coinducer leads, in addition, to synthesis of a second, longer transcript (3.0 kb), which has been shown to be dispensable for induced transcription (13).

II. AlcR: The Transcriptional Activator of the Ethanol Regulon in A. nidulans

A. AlcR Is a Zinc Binuclear Cluster Protein

The deduced amino acid sequence of AlcR contains in its N-terminal region a sequence motif with six cysteine residues, Cys-X2-Cys-X6-Cys-X16-Cys-X2-Cys-X6-Cys (Zn2Cys6). This motif is closely related to DNA-binding domains of

TRANSCRIPTIONAL REGULATION OF THE ETHANOL UTILIZATION PATHWAY


fungal transcriptional factors that belong to the so-called zinc binuclear cluster family (43, 44). The proteins of this class, as exemplified by Gal4p, Lac9p, Ppr1p, and Hap1p whose three-dimensional structures have been resolved (45-50), have been shown to form a compact cloverleaf-like structure in which six cysteine residues are involved in chelating two atoms of Zn. Several important features differentiate the AlcR putative DNA-binding domain from the other proteins of this class (Fig. 2). The first evidence for Zn requirement was obtained by in vitro DNA-binding experiments with a

FIG. 2. The DNA-binding domain of AlcR. (A) A "cloverleaf structure" model proposed for the AlcR(1-60) binuclear cluster. Six Cys residues coordinating two atoms of Zn are underlined. The distance between the two atoms of Zn and that between Zn and the sulfur of Cys are indicated. The extended loop of 16 residues between the third and the fourth Cys results in asymmetry of the cluster. Amino acids interacting with the DNA-specific target are circled. (B) Three-dimensional structure of the AlcR-DNA complex resolved by NMR. The AlcR(1-60) DNA-binding domain is represented as a ribbon diagram. The chelated Zn(II) ions are shown as small balls. Side chains of the amino acid residues Lys 19 and Trp 45, shown as sticks, establish specific contacts with bases within the consensus sequence 5'-TGCGG-3', which are indicated on the DNA helix. The Arg 6 residue presumably interacts with the DNA backbone and/or bases adjacent to the conserved core. The extended loop between the third and fourth Cys and the region following the zinc cluster form α-helical structures.


B. FELENBOK ET AL.

GST-AlcR(7-58) fusion protein (20, 51). Removal of the intrinsic Zn with Chelex completely abolished the DNA-binding ability, whereas it was restored upon the addition of Zn. Measurements of the Zn content by atomic absorption spectroscopy gave a value of 2.5 ± 0.3 mol of Zn(II) per mole of protein (52), indicating the presence of two atoms of Zn per molecule of AlcR. Additional information concerning distances, type, and number of atoms coordinated at the metal site was obtained using X-ray absorption spectroscopy (EXAFS and XANES). The average Zn-S bond length is 2.34 ± 0.02 Å, whereas the distance between the two Zn atoms is 3.16 ± 0.02 Å (53). These short distances are within the range expected for bridging sulfur atoms and are very similar to those found in other zinc binuclear cluster proteins, such as Gal4p and Ppr1p (45, 46, 48). In fact, two possibilities for Zn ligation could formally exist for AlcR, since the six cysteine residues are preceded by a histidine residue that may serve as a possible candidate for Zn coordination (Fig. 2A). Thus, either six Cys or five Cys and one His could chelate the Zn atoms. In order to discriminate between the ZnS4 and ZnS3N modes of ligand coordination, one of the six Cys, Cys 49, and His 10 were independently mutagenized. Spectroscopic results clearly show that in the absence of Cys 49, AlcR is unable to chelate Zn and maintain the correct conformation of the DNA-binding domain (53). Consequently, this results in a loss of DNA binding. Similar data have been obtained with Hap1p by mutating the sixth Cys (54), whereas the mutation of the third Cys in the Gal4p DNA-binding domain only reduced the binding affinity (46). Apparently, mutation of a central Cys has a less prominent effect, as the other central Cys is able to maintain the sulfur-zinc coordination, resulting in a proper three-dimensional structure.
In contrast, His 10 is not directly involved in Zn ligation, but rather contributes to DNA binding (53). This finding was finally confirmed upon resolving the three-dimensional structure of the AlcR DNA-binding domain (55). All these data are in favor of a zinc binuclear cluster model in which six cysteine sulfurs ligate two Zn atoms. However, in contrast to other members of the Zn2Cys6 family, the DNA-binding motif of AlcR has a strongly asymmetric structure resulting from the 16 residues between the third and the fourth Cys, compared to the 6-8 residues usually found (Fig. 2). Furthermore, the proline residue conserved in this loop in all other zinc cluster domains is absent in AlcR. This Pro is necessary for correct folding of the protein, joining the two substructures of the cluster by a cis-peptide bond, as exemplified by Gal4p (47). In fact, in AlcR a proline residue is present between the third and the fourth Cys but not at the consensus position. Its replacement by Ala has no effect either on the Zn content or on the binding affinity (53). The absence of Pro at the proper position raises the question of how AlcR can adopt a folded structure similar to that of other zinc cluster proteins. This issue was resolved when the three-dimensional structure of the AlcR(1-60) DNA-binding domain was established by NMR (55). Our results



clearly show that the sequence of 16 amino acids constituting the second loop between the third and the fourth cysteines is long enough to accommodate an alternative secondary structure which can replace the conserved proline residue while retaining the overall structure of the zinc binuclear cluster (Fig. 2B). The overall geometry of the core of the cluster is similar to that of the other proteins of this class whose structures are known (45-50, 56-58). This region of AlcR notably forms an α-helical structure (Fig. 2B), while in other proteins it is present in an extended-strand conformation. α-Helices are frequently involved in various types of interactions. Therefore, this loop can be considered as a potential interacting surface that could be implicated in recruitment of an additional regulating factor or another AlcR molecule. Currently, there is no evidence to support this hypothesis; however, it has been established that the second loop is not involved in the recognition of DNA targets (Section II,D). Another unique feature revealed by the three-dimensional structure of the AlcR peptide is a helical region immediately downstream of the zinc cluster (residues 52-60) (Fig. 2B). In the other known Zn2Cys6 structures, the linker region connecting the DNA-binding and dimerization domains is very flexible in the absence of DNA. It forms a β-strand or an extended conformation when bound to DNA. Taken together, our structural data unambiguously confirm that the substantial differences between AlcR and other members of the Zn2Cys6 family have no effect on the coordination of the two Zn atoms by six cysteine sulfurs or on the global structure of the AlcR DNA-binding domain.

B. Dimer or Monomer?

Most of the zinc binuclear cluster proteins form stable dimers or dimerize when bound to their DNA targets via a coiled-coil dimerization motif of a leucine-zipper type, found downstream of the DNA-binding domain (43, 44). Such a dimerization interface is absent in AlcR. In fact, two stretches of leucine heptad repeats located within the region from residues 102 to 177 are interrupted by several proline residues, which are known to disrupt continuous α-helical structures. A bacterially expressed truncated AlcR(1-197) protein which comprises this region behaves like a monomer both in solution and upon binding to DNA (59, 60). According to gel filtration chromatography, the estimated molecular weight of the truncated AlcR is ~18 kDa, which is expected for a monomer. Although the physiological AlcR DNA targets are always organized as inverted or direct repeats (20, 51), in vitro DNA-binding experiments showed that a single AlcR molecule binds with a high affinity (Kd 5 × 10⁻⁸ M) to a single site (60). No cooperativity characteristic of dimer formation was observed when inverted-repeat targets were tested. Two molecules can be fixed independently on DNA with nearly the same affinities (Kd 2 × 10⁻⁸ M). In this respect, AlcR resembles Adr1p, a transcriptional factor required for the derepression of the ADH2 gene encoding the repressible alcohol dehydrogenase II of S. cerevisiae (61). Two


Adr1p monomers bind to a palindromic sequence in the ADH2 promoter. It should be pointed out that AlcR and Adr1p, although partly equivalent in their role, are otherwise unrelated in both structure and function. Adr1p is a DNA-binding activator of the Cys2His2 type and is involved in many different processes rather than being a pathway-specific activator (62, 63). However, the truncated AlcR(1-197) protein could be lacking a downstream dimerization motif. Unfortunately, all attempts to isolate the full-length protein have been unsuccessful to date. Nevertheless, indirect lines of evidence strongly support the monomeric mode of AlcR binding. First, none of the AlcR regions fused to the DNA-binding domain of the cI λ-phage repressor was able to promote its dimerization. Furthermore, glutaraldehyde cross-linking of full-length AlcR protein expressed in vitro did not reveal any dimeric forms (59). Therefore, there is no predicted dimerization domain consisting of a coiled coil downstream of the DNA-binding domain. It is still possible that the entire protein might dimerize on its DNA targets, as described for Hap1p (64). Physiological studies by site-directed mutagenesis of AlcR targets in the alcA promoter do not favor a dimeric mode of AlcR binding (10). Disruption of any individual binding site within the direct-repeat target e has only a weak effect on the alcA promoter strength (see Fig. 7). However, in order to maintain transcriptional activation in these mutants, two AlcR binding sites should be present, whatever the spacing distance and the orientation of the repeats. This result would not be expected if AlcR were a dimer. To explain the necessity of coupled DNA-binding sites, the involvement of another element, including other regions of AlcR, should be hypothesized (see Fig. 11). Whatever the complexity of the situation in vivo, it would imply the binding of one AlcR molecule per consensus site.
AlcR is the first example of this type described.
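The notion of independent, noncooperative binding used in this section can be illustrated numerically. The sketch below is not taken from the cited studies: it treats the constants quoted in the text (of the order of 10⁻⁸ M) as dissociation constants, approximates the free protein concentration by the total concentration, and uses hypothetical function names. For independent sites, the probability that both are occupied is simply the product of the single-site occupancies.

```python
def fractional_occupancy(protein_conc, kd):
    """Fraction of a single site occupied at equilibrium (simple 1:1 binding).

    protein_conc and kd are in the same units (e.g. mol/L). Assumes the free
    protein concentration is approximately equal to the total concentration.
    """
    return protein_conc / (protein_conc + kd)


def both_sites_occupied(protein_conc, kd1, kd2):
    """Probability that two independent (noncooperative) sites are both bound."""
    return (fractional_occupancy(protein_conc, kd1)
            * fractional_occupancy(protein_conc, kd2))


if __name__ == "__main__":
    kd_single = 5e-8   # illustrative value of the order quoted for a single site
    kd_pair = 2e-8     # illustrative value of the order quoted for each repeat site
    for conc in (1e-9, 1e-8, 5e-8, 1e-7, 1e-6):
        single = fractional_occupancy(conc, kd_single)
        double = both_sites_occupied(conc, kd_pair, kd_pair)
        print(f"[protein] = {conc:.0e} M  single site: {single:.2f}  both sites: {double:.2f}")
```

With cooperative binding, the doubly occupied state would be populated more strongly than this product predicts; the text reports that no such effect was observed for AlcR.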

C. Specificity of the DNA Binding

The monomeric structure of AlcR may account for its unusual capacity to recognize and bind both symmetric and asymmetric types of targets. Both types are functional in vivo (10). In fungi, zinc cluster proteins generally bind to their DNA targets, which are organized either as inverted or direct repeats, via direct interaction with the consensus triplets at both ends (43, 44). In most cases these are 5'-CGG-3' or 5'-CGC-3' triplets, though a more degenerate sequence can also be encountered (65-67). DNase I protection footprint analysis demonstrated that the consensus core of the AlcR binding site is extended to the site 5'-WGCGG-3' (20, 51). For both inverted and direct-repeat targets, depurination interference footprinting revealed a strong interference of all the G residues within the consensus

motif in both strands (Fig. 3A) (59). The interaction occurs mainly through direct contacts in the major groove of the DNA. Methylation of two G residues at positions 2 and 4 interferes strongly with AlcR binding, while the adjacent G (position 5) contributes weakly to binding. Within each recognition motif, these direct sequence-specific contacts are restricted to bases positioned on the same strand. Methylation of the G (position 3) in the C-rich strand results in a weak interference. As will be discussed in Section II,D, the Lys 19 residue between the second and the third cysteines of the zinc cluster, which is conserved in the Zn2Cys6 protein family, makes specific contacts with the two G residues of the CGG triplet. A similar pattern of interaction was observed for all types of targets (inverted or direct repeats or an artificial single-copy site), indicating that AlcR exhibits a unique mode for DNA recognition (Fig. 3A). Notably, there is no site preference for in vitro binding either to direct or inverted repeats in the alcA promoter. All sites can be occupied randomly as they have nearly equal affinities. It is, however, important to mention that only one AlcR molecule can bind to any of the natural direct-repeat targets, whereas two molecules can simultaneously occupy a palindromic target if the interacting cores are separated by 2 bp (Fig. 3B). The distance between physiologically active direct repeats in the alcA promoter varies from 7 to 8 bp. Hence, the interacting bases in both sites partly lie on the same face of the DNA helix. Given a head-to-tail orientation of AlcR in this case, such overlapping positioning of direct-repeated sites obviously prevents simultaneous occupancy of both by AlcR in vitro (Fig. 3B).
The second site becomes accessible for a second AlcR molecule when the interacting 5'-GCGG-3' cores are separated from each other by insertion of an additional 3 bp into the spacer region (59). The phasing of the sites is also crucial when the sites are inversely oriented. The 2-bp spacing between inverted repeats results in positioning of the interacting cores on opposite sides of the DNA helix (Fig. 3B), thereby avoiding steric hindrance between two AlcR molecules in a head-to-head orientation. This explains why the spacer size in all symmetric targets identified in the AlcR-regulated promoters (in the alcA, aldA, alcR, and alcS/M genes) is restricted to 2 bp (59). Increasing the spacer length by one nucleotide completely abolished binding of the second AlcR molecule. One possible explanation for this observation could be a conformational change in the DNA induced by the fixation of the first AlcR molecule. A second hypothesis would imply an interaction between two AlcR molecules. However, such an interaction would involve DNA-binding cooperativity, which has not been observed to date, in contrast to Gal4p (68). Simultaneous binding can be restored when both sites are placed on the same face of the double helix but one turn apart (59). By contrast, the composition of the spacer region between inverted repeats appears to be less essential for binding. Substitution of the inner


FIG. 3. Recognition by AlcR of different types of DNA targets. (A) Schematic diagram of contacts revealed by interference footprinting assays between AlcR(1-197) and its DNA-binding sites, which are direct-repeat targets a and e, inverted-repeat target b in the alcA promoter (see also Fig. 7), and an artificial single site b1. Horizontal arrows indicate the orientation of the consensus sequence 5'-TGCGG-3'. Positions numbered above the sequence correspond to the 5'-to-3' orientation of this motif. Circles represent purine contacts identified by base-missing interference, whereas squares symbolize G contacts identified by methylation interference footprinting. Top-strand sequences shown in panel B: head to head, 5'-TGCGGAACCGCA-3'; head to tail, 5'-TCCGCACGGGATGTCCGCA-3'.

sequence CT in the alcA palindromic site by GC bases (never encountered in the natural targets) reduced the binding affinity only 3- to 5-fold. The interacting motif GCGG represents a combination of overlapping CGG and CGC triplets found on opposite strands. Mutational analysis of the core allowed discrimination between the two triplets and the estimation of the contribution to specific binding of each base of the core, as well as adjacent nucleotides. Like other zinc cluster proteins, AlcR interacts mainly with the CGG triplet. Changes of any bases within this triplet resulted in a complete loss of binding. Apart from the consensus repeat itself, sequence alignment of symmetric sites found in AlcR-regulated promoters showed that similar nucleotides are present adjacent to the core. They appeared to contribute significantly to the binding affinity. In addition, an AT-rich 3' flanking sequence favors optimal binding. This finding indicates that a certain DNA structure rather than a definite sequence could play a role in binding. For example, an adjacent AT-rich region could facilitate DNA bending and thereby enhance the AlcR affinity for its site. This has been suggested for Mig1p, a glucose catabolite repressor in S. cerevisiae (69). Requirements for the composition of this region could explain why some of the direct-repeat targets, in the alcR promoter for example, are not functional in vivo (13). Taking all our data together, the optimal AlcR-binding site extends beyond the canonical CGG triplet, resulting in a sequence 5'-RNGCGG-AT-rich-3' (59). This result is not surprising. Considering a monomeric mode of binding by AlcR, the CGG triplet alone cannot be sufficient to provide the binding selectivity required, since statistically it will be found on average once every 64 bases.
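The statistical argument above can be made explicit with a short calculation. The sketch below assumes, as a simplification not stated in the text, that the four bases occur independently and with equal probability, and it counts matches on one strand only. The function name expected_spacing and the small table of IUPAC codes are illustrative choices.

```python
from fractions import Fraction

# Number of bases allowed by each IUPAC code used in the text
# (W = A or T, R = A or G, N = any base).
IUPAC_CHOICES = {"A": 1, "C": 1, "G": 1, "T": 1, "W": 2, "R": 2, "N": 4}


def expected_spacing(motif):
    """Average number of bases between occurrences of a motif on one strand.

    Assumes equiprobable, independent bases (an idealization of real genomes).
    """
    probability = Fraction(1)
    for code in motif.upper():
        probability *= Fraction(IUPAC_CHOICES[code], 4)
    return 1 / probability


if __name__ == "__main__":
    for motif in ("CGG", "GCGG", "WGCGG", "RNGCGG"):
        print(f"{motif}: once every {expected_spacing(motif)} bases on average")
```

Under these assumptions the CGG triplet recurs on average every 64 bases, the GCGG core every 256, and both 5'-WGCGG-3' and 5'-RNGCGG-3' every 512. The AT-rich flank, which has no fixed sequence, is not modeled here.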
Discrimination between different but related DNA targets (considering the spacing between recognition triplets and their relative orientation) by a typical zinc binuclear cluster protein is mediated by a linker region that joins the dimerization and the DNA-binding domains (70, 71). As noted before, AlcR obviously does not possess such elements. In fact, the sequence C-terminal of the DNA-binding module contributes substantially to DNA binding, presumably by increasing the stability of the AlcR-DNA complex rather than by additional

Filled symbols denote strong base contacts and open symbols correspond to weak ones. Weak contacts observed for direct-repeat targets result from a random occupancy of each binding site by AlcR. (B) Model for AlcR binding to inverted and direct repeats. Two AlcR molecules shown in gray interact with both sites within palindromic target b, whereas one AlcR molecule can bind to direct-repeat target e. A putative position of another molecule is shown by a dashed line. The interacting bases of the conserved motif GCGG on both strands are displayed on a split projection of B-form DNA. Only the top strand is indicated. Filled circles indicate G residues on opposite strands which are involved in contacts with AlcR. Open circles denote putative interacting G residues of a second binding site, provided that the first one is occupied by AlcR. The head-to-head and head-to-tail orientations of AlcR molecules correspond to the rotational symmetry of the target.


specific contacts (60). Apparently, other determinants might contribute in vivo to the unusual binding specificity of AlcR. One such determinant appears to be the N-terminal sequence adjacent to the zinc cluster. Deletion of the first six amino acids severely reduced AlcR binding both to inverted and direct-repeat targets a and b in the alcA promoter, but without consequences for the direct-repeat target e (see Fig. 7) (72). The absence of binding is not caused by misfolding of AlcR lacking the N terminus, as its deletion does not affect the structure of the zinc cluster as shown by NMR (55, 73). Thus, the zinc cluster alone is not able to recognize all its functional targets but requires residues outside the cluster for recognition. Interestingly, mutation of Arg 6 almost completely mimics the effect of the deletion, implying that this residue plays a crucial role in binding both in vitro and in vivo (see Fig. 2A). Therefore, the DNA recognition module of AlcR is not exclusively restricted to the zinc cluster and includes at least the Arg 6 residue. The drastic effect of the Arg 6 mutation on the DNA recognition process resulted in a loss of the transcriptional induction of the alcR gene and, by a cascade mechanism, that of the two structural genes alcA and aldA (72). This mutant strain exhibited impaired growth on ethanol. The N terminus of AlcR appears to be responsible for its nuclear localization (Section II,E). However, Arg 6 is not involved in the translocation of AlcR into the nucleus.
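The site architecture discussed in this section (a 5'-WGCGG-3' consensus occurring as direct or inverted repeats with a defined spacer) can be illustrated with a small scanning routine. This is only a sketch: the function names are hypothetical, the spacer is measured between the 5-bp consensus motifs, and the two test sequences are the top strands printed in Fig. 3B.

```python
import re

MOTIF_LENGTH = 5  # length of the 5'-WGCGG-3' consensus


def find_sites(sequence):
    """Locate 5'-WGCGG-3' on either strand of a top-strand sequence.

    Returns a sorted list of (start, strand) tuples, where start is the
    0-based position on the top strand and strand is "+" (motif reads
    WGCGG on the top strand) or "-" (top strand reads CCGCW, i.e. the
    motif lies on the bottom strand).
    """
    sequence = sequence.upper()
    sites = [(m.start(), "+") for m in re.finditer(r"(?=[AT]GCGG)", sequence)]
    sites += [(m.start(), "-") for m in re.finditer(r"(?=CCGC[AT])", sequence)]
    return sorted(sites)


def describe_pairs(sites):
    """Classify each pair of neighbouring sites and report the spacer in bp."""
    pairs = []
    for (start1, strand1), (start2, strand2) in zip(sites, sites[1:]):
        kind = "direct" if strand1 == strand2 else "inverted"
        spacer = start2 - (start1 + MOTIF_LENGTH)
        pairs.append((kind, spacer))
    return pairs


if __name__ == "__main__":
    targets = {
        "head to head (Fig. 3B)": "TGCGGAACCGCA",
        "head to tail (Fig. 3B)": "TCCGCACGGGATGTCCGCA",
    }
    for name, seq in targets.items():
        print(name, find_sites(seq), describe_pairs(find_sites(seq)))
```

For the head-to-head sequence the sketch reports two oppositely oriented sites separated by 2 bp, and for the head-to-tail sequence two sites in the same orientation separated by 8 bp, in line with the spacings quoted in the text.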

D. NMR Structure of the AlcR(1-60)-DNA Complex

The unusual mode of AlcR binding to its cognate targets originally revealed by biochemical approaches was recently confirmed by NMR structural analysis (55). AlcR(1-60) forms a stable complex with a single site (core 5'-TGCGG-3') with a lifetime 2 orders of magnitude lower than that of Gal4p (73). This result is consistent with a monomeric structure of AlcR. The main reason for the stronger affinity and longer residence time of the complex lies in the network of specific interactions established uniquely by the AlcR DNA-binding domain outside the core. The side chains of Arg 16, Lys 19, and Arg 21 have orientations similar to those of the corresponding residues in Gal4p, Ppr1p, and Put3p which contact DNA (47, 48, 57). Conserved basic residues at equivalent positions between the second and the third Cys residues and their relative orientation led us to propose a model of primary recognition similar to that of Gal4p (Fig. 2B). Recognition of the internal CGG triplet within the consensus sequence WGCGG occurs in the major groove of DNA and is mainly mediated by the conserved Lys 19 residue, as in the other zinc cluster proteins. In addition, significant chemical-shift variations were observed for the side chain of Trp 45 and thymine 1, indicating contacts. In our model Trp 45 faces this base and is close enough to form a hydrophobic contact. Interestingly, this interacting Trp 45 was also identified at a protein/DNA interface by mass spectrometry upon chemical modification of Trp residues with Koshland reagent. In addition, Trp 53 was also shown to be involved in DNA recognition (74). Possible interactions



also exist between Arg 44 and the facing guanine 2 in the consensus core. The additional stability of the complex is acquired through the contacts between the basic residue Arg 6 located in the N-terminal arm of AlcR and the bases in the 3' AT-rich flanking region. These interactions occur in the minor groove of the DNA. The N terminus becomes more ordered when bound to DNA. All these additional contacts distinguish AlcR from the other members of the Zn2Cys6 protein class and highlight its original mode of binding (B. Cahuzac, R. Cerdan, B. Felenbok, and E. Guittet, unpublished results).

E. Subcellular Localization

To activate the transcription of the ethanol regulon, AlcR has to be targeted to the nucleus. The nuclear import of proteins is a complex process and is mediated by short stretches of specific, predominantly basic amino acids known as nuclear localization signals, NLS (75, 76). The N-terminal sequence of AlcR encompassing the first 75 residues is very rich in basic amino acids (a positive charge of +12). We have studied nuclear translocation of AlcR in A. nidulans, using green fluorescent protein (GFP) as a probe for subcellular localization. Fluorescence microscopic studies revealed a nuclear accumulation of the GFP:AlcR(1-75) chimera (Fig. 4) (72). A similar pattern was obtained with the entire AlcR protein (results to be published elsewhere). Nuclear import of AlcR does not require the presence of the inducer, implying that the inducing signal must be transmitted

FIG. 4. Nuclear localization of AlcR: minimal sequences required for nuclear transport of AlcR. Basic residues organized in five regions (I-V) within residues 1-75 are shown in bold and are all necessary for AlcR nuclear transport. The positive charge of each region is indicated above the sequence. Cys residues involved in zinc binuclear cluster formation are underlined. Left: GFP fluorescence showing the nuclear accumulation of the AlcR(1-75):GFP fusion protein (noninduced growth conditions). Right: DAPI staining of nuclei in the same filament.


to AlcR inside the nucleus. Some other fungal transcription factors, such as PrnA from A. nidulans (77) or Gal4p from S. cerevisiae (78), are also translocated in the absence of their specific inducer. By contrast, Mig1p from S. cerevisiae is translocated into the nucleus only in the presence of its corepressor glucose (79). The minimal sequence for nuclear entrance and/or retention of the AlcR:GFP fusion protein comprises at least four out of five stretches of basic amino acids dispersed throughout the N-terminal region (Fig. 4). Thus, the N-terminal part of AlcR plays a dual role. It directs the protein into the nucleus, where it serves as a DNA recognition unit, binding its specific targets in the responsive promoters to activate alc gene expression. This colocalization of the NLS and the DNA-binding domain seems to be encountered among transcriptional factors and may reflect a coevolutionary selective pressure to ensure that a protein that binds DNA is able to access the nucleus. The NLS of AlcR has a more complex structure than the "classical" mono- or bipartite NLS known to interact with the importin α/β heteromeric complex (75, 76). The NLS of PrnA, another Zn2Cys6 protein of A. nidulans, consists of three short stretches of basic amino acids located within its N-terminal region and overlapping the zinc cluster domain (77). Moreover, Gal4p contains the NLS within its DNA-binding domain and is recognized directly by importin β rather than the conventional NLS-binding importin α subunit (80). The fact that all zinc cluster proteins share a significant homology within their DNA-binding motifs (43) could imply that they exhibit a similar mechanism of nuclear import. The mechanism of AlcR recognition by different types of transporters is currently under study.

F. Other Domains

Little information about the structure-function organization of other domains of AlcR is currently available. Deletions of various regions suggest that total integrity of the alcR coding sequence is essential for its activity. Fusing different regions of AlcR to the DNA-binding domain of Gal4p allowed us to roughly map the AlcR transactivation domain in a one-hybrid yeast system. One strong activation domain is localized immediately C-terminal of the zinc cluster, between residues 60 and 197. In other Zn2Cys6 proteins, the main activation domain is C-terminally located (43). In Gal4p and Pdr3p, a second weak activation domain is localized near the zinc cluster (81, 82). In Adr1p, the activation region is found close to the zinc finger motif (83). Transcriptional activation domains are commonly rich in glutamine, proline, isoleucine, or acidic residues. A short acidic sequence is present within the AlcR putative activation domain. Further experiments are required to determine whether it is functional or not. A truncated AlcR protein comprising the DNA-binding domain and putative activation domain is unable to drive expression of the alc genes and does not provide growth on ethanol.

Transcriptional activity of several zinc cluster proteins is modulated by the central region, which represents the major part of those proteins. For example, in Gal4p this region contains an inhibitory domain involved in glucose repression (84). Mutations or deletions within this region result either in constitutivity, for example, of Leu3p (85), or in a loss of inducibility, as in the case of Hap1p (86) or AmdR (87), suggesting the involvement of specific sequences in more specialized functions, such as recognition of metabolic intermediates. Furthermore, an additional role could be assigned to the central domain. On the basis of sequence comparison, it has been proposed that regions with moderate similarity within the central part of the Zn2Cys6 proteins could assist the zinc cluster in DNA target discrimination (43). Similar stretches of moderate similarity occur within the AlcR sequence. A systematic analysis is in progress to gain insight into the precise role of the other domains of AlcR. This study is necessary to analyze precisely the interactions between AlcR and the other factors through which it contacts the transcriptional machinery.

III. Induction of the alc Gene System: The Inducer

Induction of the alc gene system absolutely requires the presence of an inducer compound. In strains in which the transcription of a structurally functional alcR gene is controlled by a constitutive and nonrepressible promoter (gpdA or pyrG), the permanent presence of high levels of alcR mRNA does not lead to alc gene induction unless an inducing compound is present (11, 12, 38a). As a prelude to their purification of A. nidulans ADHI and ALDH, Creaser et al. (24, 34) monitored the induction spectrum of these two enzymes in a wild-type strain. They found that certain small ketones (acetone and 2-butanone) and corresponding secondary alcohols were very good inducers, while primary alcohols (ethanol) could provoke only a marginal induction of ADH activity. The only growth substrate with a really potent inductive capacity was L-threonine. By contrast, ALDH was strongly induced only by 3-oxobutyric acid and L-threonine, while ethanol gave rise to moderate induction. Early in our research, we adopted 2-butanone (ethyl methyl ketone, EMK) as the most potent, gratuitous inducing compound, and virtually all of our transcriptional studies have been carried out with this inducer. At the level of transcription, EMK promotes a very high level of induction of the ADH-encoding gene alcA and the ALDH-encoding gene aldA (7-9). In an early study of alcA promoter strength using the A. nidulans β-tubulin genes as reporters in a phenotypic test on plates, Waring et al. (88) confirmed that ketones are more potent inducers than the convertible inducers ethanol and L-threonine. They established cyclopentanone as the best inducer and found that L-threonine and ethanol are more or less equally good inducers of alcA.


A. Direct Involvement of the alc Gene Regulon in the Catabolism of Compounds Other Than Ethanol

L-Threonine and ethylamine can serve as sole sources of nitrogen and carbon for A. nidulans. The first step in L-threonine degradation generates glycine and acetaldehyde and is catalyzed by threonine aldolase (EC 4.1.2.5) (E. Creaser, unpublished results). In yeast, this enzyme is responsible for the biogenesis of glycine during growth on D-glucose (89). Aliphatic monoamines are oxidized by copper-containing amine oxidase (EC 1.4.3.6) or by monoamine oxidase (EC 1.4.3.4). This reaction yields ammonium and the corresponding aldehyde with concomitant formation of hydrogen peroxide. Both ethylamine and L-threonine require induced ALDH activity, i.e., a functional AlcR transactivator, in order to serve as a source of carbon for the fungus (3, 90). Moreover, the phenotypes of all mutants in ethanol utilization appear to be essentially identical on ethanol, L-threonine, and ethylamine as sole carbon sources. The alc gene system is not necessary for the utilization of L-threonine and ethylamine as sole nitrogen sources. The alc system is also responsible for the conversion of n-propanol and n-butanol in the presence of a growth-supporting substrate. Like ethanol, these toxic alcohols are oxidized to aldehydes and carboxylic acids, which are converted to their respective CoA esters. This conversion is mediated by ADHI, ALDH, and acetyl-CoA synthetase (ACS, EC 6.2.1.1) (Fig. 5). Aliphatic alcohols inhibit fungal growth when present in excess owing to their lipophilic character (91). Toxicity of n-propanol at lower concentrations, however, occurs upon its bioconversion (unpublished results). Like allyl alcohol, n-propanol does not inhibit growth of alc mutants on plates containing 2 mM alcohol and 50 mM L-glutamate, a nonrepressing carbon source. Wild-type strains and aldA mutants do not grow on such plates.
Strains carrying mutations in the ACS-encoding facA gene (92, 93) are able to grow on L-glutamate in the presence of 2 mM n-propanol, but their growth is more restricted. n-Propanol toxicity at low concentrations thus occurs mainly at the level of propionyl-CoA formation. These results are complementary to the evidence recently presented by Brock et al. (94). These authors studied propionate toxicity at much higher concentrations in the presence of D-glucose. They established the existence of the methylisocitrate cycle in A. nidulans that enables the fungus to convert propionate into acetyl-CoA via pyruvate (95). It is likely that propionate inhibits its own conversion to acetyl-CoA because it drains CoA from cellular metabolism. In catabolic pathways leading to the formation of aliphatic aldehydes, the alc gene system becomes induced upon growth on glycerol and D-galacturonate (96). In A. nidulans, glycerol is catabolized to dihydroxyacetone phosphate via glycerol 3-phosphate. D-Galacturonate is converted into pyruvate and D-glyceraldehyde, which is reduced to glycerol. A prominent NADH-dependent D-glyceraldehyde reductase activity could be ascribed to ADHI (96). The alc gene system is, however, dispensable for the utilization of glycerol and D-galacturonate. Growth of alc mutants on plates was not impaired because A. nidulans produces constitutive high levels of an NADPH-dependent reductase activity.

FIG. 5. Catabolic pathways related to the ethanol utilization pathway in A. nidulans. In addition to primary alcohols like ethanol, primary monoamines like ethylamine and L-threonine can be metabolized into acetaldehyde. Therefore, acetaldehyde is at the junction of several metabolic pathways. In A. nidulans, it seems that a unique aldehyde dehydrogenase encoded by aldA is responsible for the oxidation of acetaldehyde into acetate. This gene is under the control of AlcR.

B. Classes of Inducing Compounds

In the framework of a current study on signal transmission, we have recently tested the inductive capacity of a series of compounds at the level of transcription in wild-type and various mutant strains. The full results of our experiments will be published in the near future; here, we present a summary. Our study reconfirms that ketones are the most potent inducers of the alc gene system, 2-butanone (EMK) being the strongest one tested. However, not all ketones induce: dihydroxyacetone and 3-pentanone, for instance, do not provoke induction. In contrast to the results of Creaser and co-workers (24, 34), we found that the transcriptional induction of the three AlcR-responsive genes (alcR, alcA, and aldA) is always coordinated. Besides L-threonine, small primary alcohols and small primary aliphatic monoamines provoke a considerable induction, although methylamine and methanol do not. L-Threonine, ethanol, and ethylamine give rise to similar levels of induction, around 30% of the EMK level. In sharp contrast with the work of Creaser and co-workers, we found that the secondary alcohols 2-propanol and 2-butanol do not induce, even after a prolonged induction period. However, we identified a new class of convertible substances, small acetylesters, that are able to induce the alc gene system.

C. Metabolism of the Inducing Signal: The Physiological Inducer and ALDH

1. EFFECTS OF aldA LOSS-OF-FUNCTION MUTANTS ON THE REGULATION OF THE alc GENES

The involvement of ALDH in multiple catabolic pathways prompted a closer examination of the regulation of the aldA gene. In contrast to S. cerevisiae (38), only one ald gene actively participates in ethanol catabolism in A. nidulans. Strains carrying strong mutations at the aldA locus are unable to grow on either ethanol or L-threonine as sole source of carbon. Leaky aldA mutants have also been isolated (5). Transcriptional analysis comparing noninducing conditions with ethanol induction revealed an interesting regulatory phenomenon in the three aldA loss-of-function mutants aldA15, aldA57, and aldA67 (38a). Their transcriptional behavior is normal upon ethanol induction. However, they express the alc gene system to considerable levels under noninducing conditions. Apparently, an inducing compound is produced from regular cellular metabolism that is a genuine substrate for ALDH and is accumulated in structural aldA mutants. A schematic view of this phenomenon is presented in Fig. 6. Accordingly, the acquired constitutivity is reverted upon introduction of a functional aldA gene in an aldA67 mutant background (38a). Conversely, some alc gene induction can be provoked in an aldA+ strain upon addition of the ALDH suicide inhibitor disulfiram to noninduced biomass in the micromolar range (2.5-12.5 μM), amounts that do not affect the transcription efficiency (unpublished results). This phenomenon characteristic of aldA loss-of-function mutants is termed pseudo-constitutive expression. It requires the integrity of the regulatory alcR gene and, as a consequence, is subject to carbon catabolite repression. Because acetaldehyde is the main substrate for A. nidulans ALDH (34), we presume that acetaldehyde is accumulated in aldA mutants. Continuous formation of acetaldehyde during growth could be due to L-threonine turnover (97) or result from a permanent flux through C2 catabolism by virtue of some constitutive pyruvate decarboxylase (EC 4.1.1.1) activity. A single PDC gene has been described in A. nidulans (98). Interestingly, the levels of pseudo-constitutive expression correlate with the specific nature of the mutation in aldA. In the leaky mutant aldA15, the levels of pseudo-constitutive expression are lower than those observed in the strong mutant aldA57, while the highest levels are produced in the absolute loss-of-function mutant aldA67. The pseudo-constitutive alc gene transcript level in this latter mutant is almost equal to that caused by the addition of ethanol in all these strains. It appears that the levels of pseudo-constitutive expression in the three aldA mutants reflect the intracellular amounts of the coinducer accumulated. This implies that ALDH15 has retained a considerable enzymatic activity. The difference between aldA57 and aldA67 could be explained by presuming that the ALDH57 protein can still bind acetaldehyde, but is unable to convert it, while aldA67 does not produce ALDH protein at all (Section I,B). The direct correlation between the level of pseudo-constitutive expression and the intracellular concentration of the coinducing compound could be confirmed by in vivo titration with the aldehyde scavenger semicarbazide (38a, 98a). Pseudo-constitutive expression in aldA67 can be suppressed progressively with increasing amounts of semicarbazide. Apparently, the scavenger is able to reduce the intracellular amounts of the accumulated inducing compound in a concentration range that does not affect general transcription efficiency.

FIG. 6. Scheme outlining the role of acetaldehyde in the induction process of alc genes. Right: In aldA loss-of-function mutants, acetaldehyde is no longer metabolized into acetate. Therefore, acetaldehyde is accumulated under noninducing growth conditions. This phenomenon, called pseudo-constitutivity, is indicated by the thick gray arrow. Since acetaldehyde is the physiological inducer of the alc system, it activates the alcR gene and, by a cascade mechanism, the alcA gene. This observed pseudo-constitutivity is related to the stringency of the aldA mutation. Left: When the aldA transcription is driven from a strong constitutive promoter, gpdA:aldA, the acetaldehyde is immediately oxidized into acetate and no accumulation of acetaldehyde occurs. Under these conditions, the induction of the alcR gene by ethanol is drastically decreased, which is indicated by a dotted arrow, and the alcA gene cannot be properly induced.

2. IDENTIFICATION OF THE PHYSIOLOGICAL INDUCER OF THE alc GENE SYSTEM IN THE PRESENCE OF ETHANOL

The phenomenon of pseudo-constitutive expression in aldA loss-of-function mutants indicates a close relationship between in situ ALDH activity and the AlcR-mediated induction of the alc gene system. We have further investigated this aspect of specific induction in vivo in the opposite sense, in a situation in which ALDH is produced to high levels under all growth conditions (38a). The coding part of the aldA gene was thus put under the control of the strong, constitutive, nonrepressible promoter of the A. nidulans gpdA gene. Transformants carrying the chimeric gpdA:aldA gene produce aldA transcript in equally high amounts under all growth conditions examined. However, the resulting constitutive high steady-state level of ALDH protein in these transformants has serious consequences for the induction of the alc gene system by ethanol (see Fig. 6). The levels of alcA and alcR transcript decline with increasing mRNA levels of aldA present in the various gpdA:aldA transformants. In the most extreme case only a faint induction of alcA could be provoked by ethanol.
Apparently, the acetaldehyde initially formed from ethanol is drained away in such transformants, slowing down a buildup of intracellular acetaldehyde. This leads to a failure to induce the alc gene system properly as insufficient ADHI is produced to increase the acetaldehyde concentration. On the other hand, 2-butanone (EMK) is not a substrate of ALDH. The induction provoked by this gratuitous inducer is essentially not affected by a constitutive overexpression of ALDH. From these results, it is clear that alcohol itself cannot provoke an induction. It has to be metabolized into acetaldehyde first.

3. INDUCTION AND INTOXIFICATION BY ACETALDEHYDE

The results in structural aldA mutants and in strains constitutively overexpressing ALDH suggest that the induction of the alc gene system by ethanol depends on the presence of a certain level of acetaldehyde within the cell. Recently, we have obtained experimental evidence showing that neither L-threonine nor ethylamine is a direct physiological inducer (unpublished results). We have established unambiguously that acetaldehyde is the sole



physiological inducer of the alc gene system in all three catabolic pathways sharing acetaldehyde as a common intermediate (see Fig. 5). In the 1980s, it was already suggested that acetaldehyde could be a common physiological inducer (2, 15). However, its capacity as an inducer of the alc gene system could not be proved experimentally. Using transcriptional analysis, we have shown for the first time that acetaldehyde is indeed capable of inducing the alc gene system. The highest induction could be accomplished when amounts were administered that yielded low external concentrations of acetaldehyde (0.5-3 mM). In the wild type, the induction provoked by 1 mM acetaldehyde is essentially identical to that induced by 50 mM ethanol. The induction profile throughout the concentration range tested was similar in a wild-type strain and in a structural alcA mutant, suggesting that interconversion with ethanol does not affect induction. Some other aldehydes likewise cause maximal induction at external concentrations around 1 mM. As with ketones, there are also aldehydes that do not induce the alc gene system. Above 3 mM, the general transcription efficiency decreases rapidly with increasing acetaldehyde concentration, as seen by the actin transcript level diminishing to virtually undetectable at 50 mM with equal amounts of ribosomal RNAs loaded. This inhibitory effect of acetaldehyde on de novo transcription becomes apparent at concentrations similar to those causing malfunctions in mitosis at the level of chromosome segregation (91). The aneugenic effect of acetaldehyde occurs at concentrations 2 orders of magnitude lower than that of ethanol. Higher concentrations of acetaldehyde lead to growth arrest (91), which could be due to the severe inhibition of de novo transcription that was observed at concentrations >8 mM. It is generally recognized that enzyme activities depending on essential thiol groups are extremely susceptible to aldehyde inactivation (99).
A continuous acetaldehyde formation from cellular metabolism, which presumably causes pseudo-constitutive alc gene expression in structural aldA mutants (as shown earlier), provides an explanation for the nonrepressible basal level expression of the aldA gene. Conversion of acetaldehyde into acetate could be a physiologically relevant function of the ALDH constitutively present in growing biomass by virtue of aldA basal level expression as it avoids accumulation of a highly toxic compound under all growth conditions.

D. The Substrates of ALDH Are the Physiological Inducers of the alc Gene System

Ethanol utilization is an example of a catabolic pathway induced by a toxic catabolic intermediate and not by the growth substrate itself. Purine degradation in A. nidulans is another example of such a pathway. Structural mutants in the urate oxidase (uaZ) gene (100) exhibit pseudo-constitutive expression of the structural genes of this pathway owing to an intracellular accumulation of uric acid (101). Like aldA, the uaZ gene is constitutively expressed to relatively high levels. Such basal expression levels allow the organism to convert toxic metabolic intermediates under all growth conditions. The alc gene system is induced only under conditions in which the intracellular acetaldehyde concentration exceeds the conversion capacity of the constitutively present ALDH. Apparently, this is the case when ethanol, L-threonine, or ethylamine is utilized as sole source of carbon. In ethanol conversion, the alcA gene encoding alcohol dehydrogenase I, which is responsible for continuous formation of the inducer, is induced in coordination with the structural aldA gene coding for the aldehyde dehydrogenase, which converts it to acetate. A subtle fine-tuning of alcA and aldA expression appears to be essential for the offset and maintenance of an optimal acetaldehyde concentration between inducing and toxic levels. Constitutive overexpression of the transcriptional activator AlcR, as occurs in the gpdA:alcR transformants, already disturbs this balance and results in a slower growth rate on ethanol plates. Besides aldehydes, two other classes of compounds are apparently able to induce the alc gene system directly: acetylesters and structurally related ketones. The latter compounds act as gratuitous inducers as they are metabolically inert [ketones cannot be reduced to secondary alcohols by any A. nidulans ADH (2, 3)]. We propose that the physiological inducers of the alc gene system are in vivo substrates of ALDH. It is well established that cytoplasmic and mitochondrial ALDHs from a variety of mammalian tissues possess a carboxyl esterase activity. S. cerevisiae mitochondrial Ald4p, one of the analogs of A. nidulans ALDH, also harbors this second enzymatic activity (102). Methyl ketones should therefore be considered as substrate analogs of acetylesters. Given that AlcR is active only in the presence of an inducing compound, this raises the important question of how AlcR recognizes compounds as genuine substrates of ALDH.
This could be accomplished by the formation of a covalent bond between AlcR and the physiological inducer. Direct interaction is essential for the function of the aryl hydrocarbon receptor (AHR), a mammalian transactivator involved in the response to certain xenobiotics (103-105). Alternatively, activation of AlcR could proceed via an additional protein. Such a situation occurs in the case of the activator of D-galactose catabolism in S. cerevisiae, Gal4p (106, 107). The mechanism by which the inductive signal is transmitted to the AlcR transactivator is one of the subjects currently under study in our laboratory.

IV. The CreA Repressor

CreA is the transcriptional repressor mediating carbon catabolite repression in A. nidulans. In the early 1970s, Arst and Cove (6) and Bailey and Arst (108) used different strategies to select several derepressed CreA mutations. Each isolated creA allele resulted in a partially or totally derepressed expression of the gene products normally subjected to carbon catabolite repression.

A. The CreA DNA-Binding Domains

The creA gene has been cloned (109) and sequenced (110) and encodes a protein of 416 amino acids. CreA is a DNA-binding protein that has two zinc fingers, which resemble the Cys2His2 class of zinc finger described for transcription factor TFIIIA and a number of transcriptional regulators of the Zif268 type (111-113), including Mig1p. This latter repressor mediates glucose repression in yeast (114, 114a) and has two zinc fingers which bind to GC-rich motifs in responsive promoters. As the three-dimensional crystal structure of the Zif268-DNA complex had already been resolved (115), deducing the CreA and Mig1p binding mode was straightforward. The zinc-binding domains of CreA and Mig1p are very similar, the first zinc finger showing 27/31 identical amino acids and the second zinc finger showing 28/32 identical residues (110). It should be stressed, however, that the homology between these two fungal repressors is restricted to the DNA-binding domain. The first zinc fingers of both CreA and Mig1p share a high similarity with the fingers of those other members of this family able to make contact with the triplet target 5'-GGG-3'. The second zinc fingers of the two fungal repressors are, in turn, very similar to the fingers able to make contacts with the 5'-GCG-3' triplet. The binding of the fingers to DNA is antiparallel; that is, the amino terminus of the zinc finger regions binds the 3' end of the cognate DNA targets. It was therefore proposed (21) that the first finger of both CreA and Mig1p proteins recognizes the triplet 5'-GGG-3', while the second finger recognizes the triplet 5'-SYG-3'. The CreA consensus binding site was further extended to 5'-SYGGRG-3' (116). The three zinc fingers in the DNA-binding domain of Zif268 bind in the major groove of the DNA. Each finger contacts the triplet "GCG" in the guanine-rich strand (115).
These interactions were transposed to CreA to build a model for the interaction of the two CreA zinc fingers with its DNA target (4, 21, 117).
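
The degenerate consensus notation used here can be made concrete with a short computational sketch. The Python snippet below is purely illustrative and is not a method taken from the cited studies: it expands the standard IUPAC codes (S = C or G, Y = C or T, R = A or G, W = A or T) into a regular expression and lists every match to 5'-SYGGRG-3' on either strand of a sequence. The function names and the toy input sequence are hypothetical. A sequence match identifies only a putative site; it gives no indication of whether the site is functional in vivo.

```python
import re

# IUPAC nucleotide codes needed for the consensus sites quoted in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "[CG]",  # strong pair: C or G
    "Y": "[CT]",  # pyrimidine: C or T
    "R": "[AG]",  # purine: A or G
    "W": "[AT]",  # weak pair: A or T
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq, consensus="SYGGRG"):
    """List (position, strand) for every consensus match on both strands.

    Positions are 0-based and always refer to the leftmost base of the
    site on the given (top) strand, whichever strand carries the match.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping matches are all reported.
    pattern = re.compile("(?=(" + "".join(IUPAC[c] for c in consensus) + "))")
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    n, k = len(seq), len(consensus)
    for m in pattern.finditer(reverse_complement(seq)):
        hits.append((n - m.start() - k, "-"))
    return sorted(hits)


# Toy sequence (hypothetical): one site on each strand.
print(find_sites("AAGCGGGGTTCCCCGCAA"))
```

The same function accepts any consensus written with the codes in the table, so the AlcR core 5'-WGCGG-3' can be searched with `find_sites(seq, "WGCGG")`.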

B. Specific CreA Functional Targets

Several thousand CreA consensus sites should exist in the A. nidulans genome. A fusion GST-CreA protein is able to bind in vitro to every CreA consensus site, regardless of the origin of the DNA (21). However, we know that in the alcA promoter, two of the seven putative CreA consensus sites are functional (12), and in the alcR promoter, four of the nine putative CreA sites are functional (13) (Fig. 7). Additional elements must therefore play a role in the specificity of CreA recognition. The region adjacent to the CreA binding sites could be relevant. In this respect, the importance of a flanking AT-rich region in site selection by Mig1p was shown by Lundin et al. (69). As the mode of DNA recognition by Mig1p and CreA is similar, a similar role for the flanking region might also be expected. The prnB promoter (116) shows AT-rich regions upstream from the GC consensus sequences. However, in the alcA promoter, the two functional CreA sites (B1 and B2) have completely different flanking regions while no AT-rich region was identified (12). The same was observed in the alcR promoter (13) and in certain Mig1p sites in GAL1 and GAL4. Conversely, in vitro binding of CreA to its putative sites in the ipnA promoter region is AT context-dependent, but these sites appeared to be nonfunctional (118). Therefore, it seems that AT-rich regions flanking CreA functional sites do not play an important role in site selection repression.

Another characteristic of functional CreA targets in A. nidulans is their frequent organization as pairs. This is the case for alcR (13), alcA (12) (Fig. 7), in the prn cluster (116), and in creA itself (119). Mig1p sites are also often organized as repeats, e.g., in the promoters of GAL1 and SUC2 (114). However, in the xlnA promoter, three CreA consensus sites are present, but only a single site seems to be functional (120). Therefore, we do not yet know if functional CreA sites always work in pairs.

From regulation studies of the alc system (Section V) and of the prnB gene (121; V. Gavrias and C. Scazzocchio, personal communication), it appears that CreA is the sole repressor responsible for carbon catabolite repression in A. nidulans. In S. cerevisiae, it has been shown that another zinc finger repressor, Mig2p, binds to the same binding sites as Mig1p and is involved in glucose repression of SUC2 expression (122). Disruption of Mig1p relieves most of the glucose repression of GAL1 and GAL4 expression and partially relieves SUC2 repression, but does not affect glucose repression of other genes (123). As few systems have been studied in great detail, it is still possible that other repressors are involved in carbon catabolite repression in A. nidulans.

FIG. 7. Schematic representation of the regulatory elements of the promoter regions of the alcR, alcA, and aldA genes. The arrows, ST, are positioned on the transcription initiation sites and point toward the direction of transcription. In alcR, two sites (ST1 and ST2) are used. The black ovals represent functional AlcR targets organized as direct and/or inverted repeats. The horizontal arrows indicate the orientation of the AlcR consensus core, 5'-WGCGG-3'. The CreA targets, whose consensus sites are 5'-SYGGRG-3', are represented by gray triangles. The alcR promoter encompasses one functional palindromic site for the activator AlcR, a, and two pairs of functional CreA sites, A1A2 and C1C3, respectively. The alcA promoter contains three upstream activating regions containing AlcR repeated sites: a direct repeat, a; an inverted repeat, b; and a more complex region with a direct repeat overlapping an inverted repeat, c. Two functional CreA sites, B1 and B2, are predominantly responsible for carbon catabolite repression. The aldA promoter contains a single palindromic site for AlcR and is not subject to CreA's control.

C. Structure-Function Relations in CreA

The CreA protein contains, in addition to the DNA-binding domain, an alanine-rich region, an acidic region, and a region that presents similarity with the yeast Rgr1p protein (124, 125). This global regulator is a component of the mediator and RNA polymerase II holoenzyme complexes and can play both negative and positive roles in transcriptional regulation, presumably at the level of nucleosome positioning (125a, 125b). However, the relevance of these regions for CreA function is still unknown. A number of creA alleles have been isolated (6, 108, 126-129), as described in several reviews (4, 117, 130, 131). These various alleles display nonhierarchical heterogeneity toward derepression of the different systems analyzed. For example, the creA1 mutation results in strong derepression of the alc system (108), whereas the arabinofuranosidase gene is still repressed (132). The creA2 and creA25 alleles have opposite effects on the alc genes and the prnB gene (B. Cubero, M. Mathieu, C. Scazzocchio, and B. Felenbok, unpublished results). Some of these sequenced creA alleles are missense mutations within the DNA-binding domain but do not concern amino acids shown to contact DNA in Zif268 (115). These alleles do not result in total derepression of the systems. In fact, they alter rather than abolish DNA-binding activity, which probably results from a reduced affinity for DNA (128). Therefore, the heterogeneous phenotype of these mutants could be explained by differences in affinity of the CreA mutant proteins for various cognate targets. Different derepressing mutations resulting from C-terminally truncated CreA proteins also display heterogeneous phenotypes. The most extreme derepressed CreA mutations are those resulting in a truncated zinc finger, abolishing binding to DNA (129). Both the deletions in the 3' region of the creA gene (129) and the pericentric inversion in the creAd30 mutant (127) result in altered growth and morphology. However, it appears from these studies that creA is not an essential gene for the cell. Interestingly, mutations in CreA targets in CreA-controlled promoters such as alcR or alcA result in total derepression and are more efficient than any known mutations in creA itself (12, 13). Therefore, the regions in CreA encompassing these mutations appear to be important but not essential for CreA activity. In Mig1p, several important domains have been identified in addition to the DNA-binding domain: two regulatory domains mediate suppression of Mig1p activity in the absence of glucose, while an effector domain is involved in Mig1p-dependent repression (133). This latter domain, together with the DNA-binding domain, is sufficient for glucose repression. Such functional dissection studies are still to be performed with CreA.

D. Regulation of CreA Function at the Transcriptional and Posttranscriptional Levels

Regulation of CreA expression appears to be very complex. Three mechanisms seem to be involved (119). First, the addition of a repressing or a derepressing monosaccharide to carbon-depleted mycelium triggers a rapid and transient increase of creA transcription. This increase is dependent on monosaccharide uptake, is presumed to be under the control of the as yet uncharacterized creB gene (134), and is not controlled by CreA. Second, the addition of a repressing carbon source results in a transcriptional autorepression of creA. This downregulation is mediated via the two CreA-binding sites in the creA promoter. It does not occur upon addition of nonrepressible carbon sources (such as ethanol or L-arabinose) or in case of disruption of the CreA-binding sites in the creA promoter (119). Third, under derepressing conditions, the amount of creA mRNA is higher than in glucose-grown cultures (128). This elevated concentration of creA mRNA does not correlate with a subsequent increase in functional CreA protein. The change from active to inactive CreA might be caused by covalent modification and/or protein degradation. Strauss et al. (119) hypothesized that either CreA or a factor activating CreA on glucose could be a target for degradation, and that the shift to repressing conditions would require de novo synthesis. As in the case of Mig1p, other levels of posttranscriptional regulation of CreA could exist. Mig1p function is regulated at the level of nuclear import/export and by phosphorylation/dephosphorylation. In the presence of glucose, Mig1p is imported into the nucleus and, upon glucose depletion, exported into the cytoplasm. These changes in subcellular localization are coincident with changes in the phosphorylation status of Mig1p (79). Snf1p is a protein kinase (135, 136) that appears to circumvent Mig1p function in the absence of glucose (137, 138). The addition of glucose inactivates Snf1p and results in dephosphorylation of Mig1p and subsequent nuclear import. Once in the nucleus, Mig1p binds to its target promoters to repress transcription (79). Another mechanism of Mig1p action in the absence of glucose involves the masking of its effector domain by means of an intramolecular interaction with an internal domain (133). It is possible that CreA is also targeted to the nucleus in the presence of glucose. It was shown that the CreA homolog in Sclerotinia sclerotiorum, Cre1, is imported into the nucleus of A. nidulans in the presence of glucose (139). Homologous Snf1p proteins have been identified in S. sclerotiorum and A. nidulans (S. Vacher and M. Fevre, personal communication). Therefore, the elements exist to regulate CreA function at the level of nuclear import/export in response to a repressing carbon source. Repression of transcription by Mig1p requires the recruitment of corepressors. When Mig1p binds to target sites in responsive promoters, it recruits the Ssn6p and Tup1p repressor proteins (140). This complex is the actual transcriptional repressor (141, 142). We do not know yet how CreA interacts with the transcriptional machinery in order to repress transcription.

E. CreA Homologs in Filamentous Fungi and Mode of Action

The cloning and the sequencing of the creA gene from A. nidulans (110) have not only opened up the field of CreA protein/DNA interaction studies (4), but also facilitated the isolation of CreA homologs in other filamentous fungi. Such CreA-like repressors have also been identified in Aspergillus niger (131, 143), Trichoderma reesei (144-146), S. sclerotiorum (139), Humicola grisea (147), Gibberella fujikuroi, Botrytis cinerea (148), and Neurospora crassa (148a). Table I presents a nonexhaustive list of different creA homologous genes in filamentous fungi. These are characterized by a high percentage of identity with A. nidulans CreA, including the zinc finger DNA-binding domain. Their specific cognate GC-rich targets also present the same consensus, and protein-DNA interactions are equivalent to those found for CreA. It is therefore not surprising that creA heterologous gene complementation proved efficient in the filamentous fungi tested. Carbon catabolite repression is a general phenomenon found in fungi. Therefore, it is expected that the mechanism of repression by these repressors will present common features similar to those elucidated in A. nidulans. CreA homologs may directly repress the activator genes, resulting in a subsequent repression of the structural genes such as aldA (38a). They may also act at the level of both structural and activator genes and exhibit a "double-lock mechanism" such as for alcR and alcA (11-13). These homologs could also act only on the structural genes. All these mechanisms exist in A. nidulans (4, 116, 120).

TABLE I
creA HOMOLOGS IN FILAMENTOUS FUNGI

Gene abbreviation    Fungal species                                Reference
creA                 Aspergillus nidulans                          109, 110
creA                 Aspergillus niger                             143
cre1                 Trichoderma reesei                            144-146
creA                 Humicola grisea                               147
cre1                 Sclerotinia sclerotiorum                      139
creA                 Gibberella fujikuroi and Botrytis cinerea     148
cre1                 Neurospora crassa                             148a

V. Mechanism of Regulation of the alc Genes

The molecular mechanism of induced transcriptional activation and repression of the alc system in A. nidulans has been worked out with respect to three genes: alcR, alcA, and aldA. In this section, we shall see how these three genes are regulated differentially by the two regulators AlcR and CreA at the molecular level and how the two regulatory circuits interact to modulate the expression of the alc system in response to nutrient modification in the medium.

A. alcR Regulation 1. ALCR-SPECIFICINDUCTIONAND ITS EFFECTS ON alcA AND aldA REGULATION The alcR gene is controlled at the transcriptional level by a positive-feedback loop. This was shown in a nonsense alcR mutant (8) in which the steady-state level of the alcR transcript is the same under"noninduced" and"induced" growth conditions. Recently, we identified a functional AlcR target in the alcR promoter (13). The expression of the two structural genes alcA and aldA is turned on following the induction of the transactivator, AlcR. The induced levels of both alcA and aldA transcripts are very high (8). ADHI and ALDH are the major translation products of RNA extracted from mycelium induced with threonine (2, 7, 9, 16). In alcR mutant strains, there is no induced expression of the autoregulated alcR gene and, as a consequence, no expression ofalcA and aldA (3, 7, 8, 9). Under noninduced conditions, the basal level of alcA mRNA is low but still also depends on the presence of AlcR. The alcR basal level is itself partly controlled by AlcR. On the contrary, that ofaldA is higher and is independent of AlcR (149). As we have already seen, the transcriptional activation of the alcR gene is absolutely dependent on the presence of an external coinducer compound. Therefore, the AlcR protein is active to induce transcription only in the presence of a coinducer. In the alcR promoter, two AlcR consensus repeated sites are present: a direct repeat adjacent to a palindromic site. Gel band shift experiments, using truncated AlcR recombinant proteins, have shown that AlcR binds in vitro with

TRANSCRIPTIONAL REGULATION OF THE ETHANOL UTILIZATION PATHWAY 181

low affinity to the direct repeat and with high affinity to the palindromic site. We have shown that only the palindromic target site a is involved in the control of the alcR-induced transcription (Fig. 7). This functional element is very similar to other symmetric elements found in other AlcR-responsive promoters whose consensus sites are separated by 2 bp (59, 150). The palindromic AlcR target a is necessary for the initiation of transcription at both the distal and the proximal transcription start sites. The messengers specify the same coding sequence, and both lead to a functional AlcR protein able to activate transcription of the alc structural genes. Disruption of this site does not, however, abolish growth on ethanol. The basal level of alcR expression in the presence of a coinducer is enough to ensure sufficient transcription of alcA and aldA. Therefore, alcR endogenous regulation is not necessary for the utilization of ethanol (13). Most of the specific pathway activators, like NirA and UaY in A. nidulans, present low constitutive levels of transcription (151, 152). Activators that are subject to autogenous regulation (like AlcR) are expressed at relatively high levels, e.g., the wide-domain activator genes areA (153) and pacC (154). In the alc system, it has been shown that there is a close correlation between the level of expression of the alc genes and that of alcR (10, 11, 155). Therefore, alcR positive autoregulation is an important mechanism that amplifies the external induction signal and results in a rapid and efficient response of the responsive genes to utilize ethanol as a sole carbon source.

2. CARBON CATABOLITE REPRESSION

The other regulatory circuit controlling alcR gene expression is carbon catabolite repression mediated by CreA. The alcR gene is almost totally repressed by glucose, even in the presence of an inducer (8, 9). CreA represses alcR transcription directly. This was shown (8, 11, 13) in the CreA loss-of-function mutants creAd1 and creAd30 (108, 134), in which the alcR gene is derepressed. It was shown that CreA binds in vitro to GC-rich targets 5'-SYGGRG-3' in the alcR promoter (21). Among nine perfect consensus CreA sites located in the alcR promoter, four are functional. Two partly overlap the AlcR palindromic target. Two others are localized between the two transcription starts of the alcR gene. The mechanisms of repression mediated via these two sets of CreA sites are completely different. One mechanism is based on a direct competition shown to occur between the AlcR and CreA proteins, which exert mutual antagonism in the alcR promoter. The overlapping of the AlcR and CreA targets in the alcR promoter accounts for this competition, as demonstrated in vitro by gel band shift experiments (11). Inactivation of just one CreA binding site, A1 or A2 (Fig. 7), or their simultaneous inactivation results in the same striking overinduction (4-fold) in the absence of glucose (compared to the alcR wild-type induced mRNA level) and a similar partial derepression in the presence of glucose.


B. FELENBOK ET AL.

Assuming that transcription reflects AlcR binding, this result implies that the apparent dissociation constant of the active AlcR protein for the alcR promoter decreases drastically in the mutated derepressed promoter. The mechanism of this competition could result from steric hindrance between AlcR (92 kDa) and CreA (47 kDa) for the overlapping targets in the promoter region (Fig. 8). The CreA sites, A1 and A2, are both functional in repressing alcR transcription and they act as a pair. It is important to note that the repression conferred through this CreA couple, A1A2, is only partial. Therefore, the total repression of the alcR promoter involves the two other CreA sites. These two sites, C1 and C3 (Fig. 7), are separated by a consensus, nonfunctional CreA site, C2. Mutagenesis of either of the two sites results in an almost total derepression of alcR transcription, showing that these sites act as a pair, constituting the major cis-acting element for CreA repression of alcR. Individual or simultaneous disruption of C1 and C3 does not change the induced steady-state level of alcR gene transcription. This indicates a direct mechanism of CreA repression, irrespective of alcR activation, which does not occur via competition with AlcR. The localization of these sites between the two alcR transcription starts may be an important feature in the efficiency of the repression process, as this region could become inaccessible to the transcriptional machinery. In conclusion, we clearly showed that CreA controls alcR expression by two different mechanisms. The first one involves competition with the activator AlcR via the CreA sites overlapping the functional AlcR induction target in the alcR promoter, although this repression is only partial. However, this mechanism is expected to play a decisive role under physiological growth conditions in which

FIG. 8. Competition model between AlcR and CreA for the same alc gene promoter region. The CreA protein is indicated by a half crown and its cognate target by a half-circular core. The AlcR protein is marked by a square pattern and its cognate target by a black triangle. These regulatory targets are both overlapping the same promoter region of the alcX gene (which could be alcR or alcA). Under induced growth conditions, AlcR is active and binds to its target, preventing CreA from occupying its cognate site. In glucose growth conditions, the opposite occurs; CreA binds to its target, preventing AlcR binding.

an inducer of the alc regulon is present. The second repression mechanism, mediated via the CreA targets localized between the two transcription starts, is drastic and might directly affect the transcriptional machinery. This should be very efficient under physiological conditions in which a rich carbon source is present. Other molecular mechanisms, such as promoter bending provoked by another protein and/or nucleosome positioning, as shown in the niiA/niaD intergenic region (156) and in the 5' region of the amdS gene (157), might be involved in selection of the transcription initiation site. Moreover, the coexistence of two populations of alcR messengers opens up the possibility that other regulatory mechanisms affect alcR expression, for example, at the level of translation initiation or elongation. Two important conclusions follow from these studies: that glucose repression does not operate through inducer exclusion, and that CreA is the unique carbon catabolite repressor responsible for alcR repression.
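The competition between AlcR and CreA for overlapping targets can be illustrated with a textbook equilibrium model of two proteins that bind one promoter region in a mutually exclusive way. This is only a sketch under stated assumptions: the function names, concentrations, and dissociation constants below are hypothetical and are not values from the studies cited here.

```python
def occupancy(alcr, crea, kd_alcr, kd_crea):
    """Return (AlcR, CreA) fractional occupancy of one shared promoter region.

    Assumes equilibrium and mutually exclusive binding. All inputs are
    hypothetical values expressed in the same arbitrary concentration units.
    """
    a = alcr / kd_alcr
    c = crea / kd_crea
    total = 1.0 + a + c
    return a / total, c / total


def apparent_kd_alcr(kd_alcr, crea, kd_crea):
    """Apparent AlcR dissociation constant when CreA competes for the region."""
    return kd_alcr * (1.0 + crea / kd_crea)


if __name__ == "__main__":
    # Illustrative numbers only (not measured values).
    for crea in (0.0, 1.0, 10.0):
        theta_alcr, theta_crea = occupancy(1.0, crea, kd_alcr=1.0, kd_crea=1.0)
        print(crea, round(theta_alcr, 3), round(theta_crea, 3))
```

In this sketch, disrupting the overlapping CreA sites corresponds to setting `crea` to zero, which lowers the apparent dissociation constant of AlcR. That is qualitatively in line with the overinduction observed when site A1 or A2 is inactivated, but the model makes no quantitative claim about the alcR promoter.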

B. alcA Regulation

One intriguing question concerned the molecular mechanisms of alcA transcriptional activation. The alcA promoter is one of the strongest inducible promoters in A. nidulans and is widely used to overexpress homologous and heterologous proteins (7) (Section VI).

1. ALCR TARGETS IN THE alcA PROMOTER INVOLVED IN TRANSCRIPTIONAL ACTIVATION

The transcriptional induction of the alcA gene occurs when the active AlcR activator is bound to its cognate targets in the alcA promoter (Fig. 7). Three types of AlcR repeated binding sites, which occupy a short 150-bp region in the alcA promoter, have been identified in vitro by gel retardation assays. A palindromic target, b, is flanked upstream by two direct-repeat sites, target a, and downstream by a more complex target, c, encompassing a direct repeat separated by 16 bp from an inversely oriented single copy site. Both AlcR fusion proteins GST-AlcR(1:60) and His-tagged AlcR(1:197) bind in vitro to each type of target (10, 20, 51, 60) (Section II). The high induction level of alcA gene transcription could be explained by taking into account three characteristics found in the alcA promoter. The first one is the presence of three functional AlcR targets a, b, and c, which are completely different in their configuration and contribute differently to the activation of alcA. Target a is not essential for growth on ethanol but contributes to alcA induction. Simultaneous disruption of targets a and b leads to a drastic decrease in ADHI activity, leaving a residual noninducible activity sufficient to allow only weak growth on ethanol. Target c is also essential for growth on ethanol and for ADHI induction (Fig. 9). To be functional, AlcR targets in the alcA promoter have to be organized in repeats, either in tandem or in palindrome. Mutations in target b, leaving a


single site, result in a total loss of alcA transcription (Fig. 9). This is also true in the case of target c, which contains three sites and has an organization with direct sites (c1-c2) and an overlapping inverted repeat (c2-c3). In fact, the direct repeat sites (c1-c2) contribute predominantly to target c efficiency (Fig. 9). However, the inverted repeat couple (c2-c3), in which the spacing is important (16 nucleotides), also makes a substantial contribution (10). Such an alternative utilization of binding sites has also been described for another transactivator of A. nidulans, AreA, which can utilize alternative GATA-binding sites in the areA promoter region (153). The high diversity of targets found in the alcA promoter could be an important element contributing to the strength of AlcR activation.

FIG. 9. Schematic representation of the molecular mechanisms of alcA transcriptional activation. The alcA wild-type promoter is at the top. ADHI activity was monitored in native electrophoretic gels, and phenotype on ethanol was scored on growth plates. Three upstream activating regions are localized in the alcA promoter (see Fig. 7), encompassing targets a, b, and c. The role of each AlcR individual target was determined after site-directed mutagenesis, resulting in the mutated strains ma, mb, and mc, respectively. The position effect of the AlcR targets on the alcA promoter was shown by replacing target b by a, changing the position of targets a and b, and disrupting target c. Synergistic activation was deduced after deletion of targets a and of a plus b, respectively.

The second characteristic of the alcA promoter is the position effect of the targets in this promoter. The upstream target a is the least important among the three, since its deletion or its mutagenesis does not abolish inducibility of the alcA promoter. However, a drastic effect was observed when target b sites were mutated, resulting in total loss of alcA transcription. Disrupting target c strongly reduces alcA induction mediated by AlcR binding to the remaining targets a and b, but the effect is not total. These results indicate that AlcR is no longer able to occupy the remaining targets in the alcA promoter. Both target positions, b and c, are important. Half-site organization, either in symmetric or asymmetric repeats, is not involved in this position effect. Replacing the palindromic site b by the direct repeat site a results in an alcA promoter containing two direct repeat targets, which allows transcriptional induction (Fig. 9). Therefore, the position occupied by target b appears to be crucial for the transcriptional activation of alcA. Note that Gal4p behaves differently vis-à-vis its targets. A single copy of a Gal4p palindromic site is able to activate transcription, with a wide tolerance in position in the GAL4-controlled promoter (68, 158). The conclusion is that each AlcR target appears to contribute differently to alcA expression, in the order of importance b > c > a. The third characteristic is the synergism observed between the three AlcR targets a, b, and c, which is essential for the high transcriptional activation of the alcA gene (Fig. 9). An important parameter for synergism is the AlcR target position effect. This synergism cannot be explained by cooperative binding of AlcR to the three repeated binding sites, which was not observed in vitro in DNA-binding experiments (59, 60).
The absence of cooperativity distinguishes AlcR from Gal4p, for which cooperative DNA binding provides an explanation for the synergistic effects observed in vivo (68, 159). Another nonexclusive model could explain synergistic transactivation. The multiple contact model suggests that synergism is a manifestation of multiple contacts between activators and the general transcriptional machinery. Multiple DNA-bound molecules of a single activator could contact a common target protein interacting with the transcriptional apparatus (160, 161). In conclusion, the number of AlcR targets, their position in the alcA promoter, and the synergistic mode of transcriptional activation are the key elements explaining the exceptional strength of the alcA promoter. Our results completely rule out in vivo utilization of AlcR single sites. They also do not favor dimerization of AlcR in the presence of properly spaced DNA sites. AlcR binds in vitro to a single site, whereas the in vivo functional cis-acting element is a repeat. However, our results cannot eliminate an interaction between AlcR molecules on the DNA. A yet unidentified factor could also contribute, enabling AlcR to activate transcription. This has been suggested in the case of the Gal4p and Ppr1p activators from S. cerevisiae (47, 48). The interaction of another protein with AlcR contacting the transcriptional machinery could account for


both in vitro binding experiments and physiological results as discussed earlier (see the model presented in Fig. 11).

2. alcA TRANSCRIPTIONAL REPRESSION

alcA transcription is completely repressed when glucose is added simultaneously with an external inducer. We know that alcA expression is absolutely dependent on an active AlcR protein. Therefore, the repression of alcR could be sufficient to explain alcA repression. However, seven CreA binding sites (5'-SYGGRG-3') are localized in the alcA promoter. We have addressed several questions: (1) Which CreA binding sites are functional in vivo? (2) Is CreA the only repressor responsible for alcA glucose repression? (3) Is there competition between the activator AlcR and the repressor CreA for the same region in the alcA promoter in A. nidulans? These seven CreA binding sites in the alcA promoter are all able to bind a CreA fusion protein GST:CreA in vitro. DNase I footprints have shown that the protected bases correspond to the CreA consensus site (21). In order to study alcA repression independently from induction, it was necessary to perform these studies in an alcR-derepressed background in which the direct repression of alcA could be monitored. In the gpdA:alcR strain, in which alcR is constitutively expressed and derepressed, alcA transcription is substantially derepressed (50%) in glucose medium. Therefore, the 50% remaining repression should be the result of the independent repression of the alcA promoter by CreA. Among the seven CreA binding sites, two sites (B1 and B2) are largely responsible for alcA repression (Fig. 7). The CreA B1 and B2 sites work as a pair, since the same almost total ADHI derepression was observed when the sites were mutagenized individually or simultaneously. Disruption of the two CreA targets, B1 and B2, results in totally derepressed alcA expression. This result suggests that CreA is the only repressor, as in the case of alcR, acting directly on alcA during carbon catabolite repression (12).
As we have seen, in the alcA promoter, CreA sites B1 and B2 are very close to, and even overlap with, AlcR binding sites. Therefore, it is expected that, as for alcR (11, 149), competition occurs between AlcR and CreA for the same promoter region (Fig. 8). Indeed, this was observed, since disruption of the site B1 or B2 resulted in an increased level of alcA induced transcription in addition to a total derepression in glucose medium (12). The induction of alcA is correlated with the level of AlcR produced, as observed in contexts in which AlcR is highly (gpdA:alcR) or moderately (pyrG:alcR) expressed. When the CreA sites are disrupted, AlcR can fully occupy its cognate targets, as in the case of alcR. These results are in agreement with a regulatory mechanism whereby the relative amounts of CreA and AlcR proteins govern the level of alc regulon expression under different growth conditions (12).
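The CreA consensus 5'-SYGGRG-3' used to locate candidate binding sites can be written as a pattern with the standard IUPAC nucleotide codes (S = C or G, Y = C or T, R = A or G). The following is a minimal sketch of such a scan on both strands; the function name and the example sequences are hypothetical and are not taken from the alcR or alcA promoters. A consensus match only marks a candidate site: as discussed above, only some consensus sites are functional in vivo.

```python
import re

# Forward-strand consensus SYGGRG, written with explicit base classes.
CREA_FWD = re.compile(r"(?=([CG][CT]GG[AG]G))")
# Reverse complement of SYGGRG is CYCCRS (site lies on the opposite strand).
CREA_REV = re.compile(r"(?=(C[CT]CC[AG][CG]))")


def find_crea_sites(seq):
    """Return sorted (position, strand, matched bases) for consensus matches.

    Positions are 0-based offsets on the given strand. Lookahead patterns
    are used so that overlapping candidate sites are all reported.
    """
    seq = seq.upper()
    hits = []
    for m in CREA_FWD.finditer(seq):
        hits.append((m.start(), "+", m.group(1)))
    for m in CREA_REV.finditer(seq):
        hits.append((m.start(), "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Made-up sequence for illustration only.
    print(find_crea_sites("AAGCGGGGTT"))
```

Scanning both strands matters here because a repressor site can lie in either orientation relative to the gene it controls.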


C. aldA Regulation

The aldA gene is strongly activated upon the addition of the external coinducer. This induced transcriptional activation appears as strong as that for alcA. The repression of the aldA gene is complete. However, as we will see, the regulatory mechanism of the aldA gene differs from that of the alcR and alcA genes.

1. INDUCTION OF aldA TRANSCRIPTION CORRELATES WITH THE INTRACELLULAR LEVEL OF AN ACTIVE ALCR

As for alcA, the aldA gene requires an active AlcR protein to be induced. Increasing the level of an active AlcR by driving it from the strong constitutive promoter (gpdA:alcR) results in a corresponding increase in aldA-induced transcription. Thus, the AlcR protein is limiting in the cell for aldA activation, as well as for that of alcA and for the other alc clustered genes (11, 155, 38a). However, the aldA basal level is independent of AlcR (149, 38a). In this latter respect, aldA expression differs markedly from the other structural gene of the alc regulon, alcA.

2. ALCR CONTROLS THE aldA PROMOTER VIA A UNIQUE PALINDROMIC TARGET

The shortest complementation unit of the aldA67 mutant contains 220 bp upstream of the aldA proposed start codon. Within this region, a single palindromic AlcR target is localized upstream of the start codon (see Fig. 7). This AlcR site presents a symmetric change of the first base pair, A to T and T to A, respectively, compared to functional symmetrical AlcR targets in the alcR and alcA promoters. In the aldA promoter, the His-tagged AlcR(1-197) protein was able to bind these two symmetric sites in vitro exactly as for alcA and alcR (60). This in vitro result is in agreement with in vivo experiments (38a). The palindromic site is the only functional one in the aldA promoter. Its disruption results in a major loss of inducibility of aldA, whereas the basal level of transcription is unchanged. As expected, the aldA promoter mutant is unable to grow on ethanol as a sole carbon source. Hence, this palindromic site constitutes the cis-acting element involved in AlcR-mediated induction of aldA, which is absolutely required for growth on ethanol.

3. aldA REPRESSION

A single putative CreA consensus site is present in the aldA promoter, but it is not functional. As shown in several alcR derepressed backgrounds, when CreA site C3 was disrupted, the aldA gene is transcribed to the same extent under induced and repressed conditions (38a). Hence, unlike alcA, repression of aldA occurs solely by CreA-mediated repression of the regulatory alcR gene.


In conclusion, although the alcA and aldA genes are coordinately regulated, they exhibit different mechanisms of regulation toward the activator AlcR and the repressor CreA. This is not surprising, since aldA is found at a junction between three metabolic pathways (ethanol, threonine, and ethylamine), whereas alcA is involved only in the first pathway (Fig. 5). As we discussed earlier, the intermediate product, acetaldehyde, is lethal when accumulated in the cell. It has to be degraded by ALDH, which may explain the less tight control of the aldA gene by AlcR and by CreA.

D. The alc Cluster

In filamentous fungi, catabolic "dispensable" pathways are sometimes organized in gene clusters. One possible explanation is that a close linkage of pathway genes is somehow involved in their mutual regulation. In A. nidulans, genes involved in various metabolic pathways, such as proline, nitrate, and quinate, are clustered (121, 162-165). AlcR-responsive genes are clustered as well (155). A thorough analysis of transcription units localized in the alcR-alcA region on chromosome VII has shown that five other genes are induced by the gratuitous inducer 2-butanone (EMK) and are carbon catabolite repressed. Gene order and orientations for this region (24 kb) have been established to be ORFP, alcR, alcO, alcA, alcM, alcS, and alcU, as shown in Fig. 10. These genes appear to

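[Fig. 10 graphic: map of the alc cluster on chromosome VII, showing glucose repression, induction by the inducing signal (2-butanone), and transcript sizes as printed, left to right: 1.4, 3.0/2.6, 1.4, 1.4, 1.2/1.0, 1.4, and 1.8 kb.]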

FIG. 10. The alc cluster: state of our current knowledge. On chromosome VII, five alc genes (alcR, alcO, alcA, alcM, and alcS) are under AlcR control in the presence of a coinducer, which is indicated by a black arrow. alcU exhibits only a weak control by AlcR, shown by a dotted arrow, whereas ORFP is induced by ketones but is not under AlcR's control. The transcript size of each gene is indicated underneath. Direct repression by CreA occurs in the promoters of ORFP, alcR, alcO, alcA, alcS, and alcU and is indicated by black bars. Two intergenic regions share the same regulatory elements: alcR/alcO and alcM/alcS.

be differentially regulated by the specific transactivator AlcR, while not all are under the direct control of the CreA repressor. To determine whether and how the regulatory proteins AlcR and CreA regulate the expression of the five alc genes, a thorough transcriptional analysis was carried out in three regulatory mutant strains. (1) If induction of transcription is dependent on AlcR, it should be abolished in an alcR loss-of-function strain (alcR125; 14). (2) To determine if CreA is involved in the repression, regulation studies were performed in a creA-derepressed background (creAd30; 127). (3) Finally, to establish if CreA directly represses these alc genes, alcR repression had to be separated from induction in a context in which alcR is driven by a constitutive and derepressed promoter, gpdA:alcR (11). The transcriptional regulation patterns fall into three distinct classes: (1) those genes regulated directly by AlcR and CreA (i.e., alcA, alcR, alcS, and alcO); (2) those whose regulation involves AlcR only (i.e., alcM), carbon catabolite repression being mediated by a cascade mechanism in which only alcR is repressed by CreA; and (3) one gene, alcU, which shows a relatively high constitutive level of expression, carbon catabolite repression being mediated only via alcR repression. These new alc genes not only have different patterns of regulation vis-à-vis the transactivator AlcR and the repressor CreA, but also vary considerably in the steady-state amounts of their transcripts. alcO and alcU are very weakly induced, alcM shows intermediate expression, and alcS is highly expressed but not as highly as alcA. The results were used to build a model for the regulation of expression of the alc gene cluster (Fig. 10). Two pairs of divergently transcribed genes and three unidirectionally transcribed genes make it possible to speculate on the regulatory mechanisms involved.
Between the two sets of divergently transcribed genes (alcM/alcS and alcR/alcO), a common cis-acting element could mediate regulation as well as unidirectional functional elements. The impact of cis-acting elements on the transcription of either gene is likely to depend on several parameters, such as the distance to the start of transcription (and hence to the transcription complex), the affinity of the binding site for the regulatory protein, and chromatin structure. Interplay between cis-regulatory elements could explain the distinct patterns of expression of divergently transcribed genes. Examination of the sequence of the intergenic region between alcM and alcS (S. Fillinger, I. Nikolaev, and B. Felenbok, unpublished) and between alcR and alcO (13, 150) identified two types of cis-acting regulatory elements targeted by AlcR and CreA: bidirectional ones, involved in the regulation of both genes, and unidirectional elements, acting on only one of these genes. Two types of specific regulatory elements have also been described in the intergenic region of the niiA-niaD genes in the nitrate utilization pathway in A. nidulans (166). A cis-acting element was identified by Arst and MacDonald (121) in derepressed prnd mutants and was shown to encompass CreA-binding sites in the intergenic region (116).


In the alc cluster, in addition, another individually transcribed ORF was localized adjacent to alcR. It is carbon catabolite-repressed and shows no control by AlcR and a relatively high constitutive level of expression. Recent data indicate that ORFP is inducible by ketones but is not controlled by AlcR (J. Pinchard, M. Flipphi, and B. Felenbok, unpublished). Therefore, ORFP does not belong to the alc system. That is the reason why this gene, which was previously called alcP (155), was renamed ORFP. The function of the new alc genes is unknown. Complementation of an alc mutant completely deleted for the whole region with alcR and alcA restores normal growth on ethanol and shows a normal transcriptional pattern of regulation (155). The expression of the alcR, alcA, and aldA genes is inducible and glucose-repressed, as in a wild-type strain. Apparently, the three genes alcO, alcM, and alcS do not play an evident role in ethanol oxidation. However, it is unlikely that alcS, considering its high expression level, would have no function. Interestingly, in fungal metabolic pathways, gene clustering is not always conserved for the same function. In proline utilization, genes subject to the same control systems are clustered in A. nidulans (167), whereas they are dispersed in S. cerevisiae (168). The structural genes of the nitrate-utilization pathway are clustered in A. nidulans (169), while they are scattered in Neurospora crassa (170). In the case of alcohol catabolism in S. cerevisiae, the alcohol dehydrogenase 2-encoding gene (ADH2) is not linked to its regulatory gene ADR1 (171). This situation might be because Adr1p is not a pathway-specific activator. Adr1p activity has never been reported to depend on a coinducer compound. Adr1p is also involved in glycerol and acetate catabolism and in peroxisome function and biogenesis (62, 63, 172). The physiological significance of alc gene clustering is an open question.

E. Discussion

It is tempting to compare how the two antagonizing control circuits, pathway-specific induction and general carbon catabolite repression, impose the regulation of the three principal genes of ethanol catabolism. In both alcR and aldA, activation is mediated via a single AlcR inverted-repeat target, while in alcA three functional targets are present. The targets in alcA have been shown to act in synergy (10), and this could well account for the strength of the alcA promoter upon induction of the alc gene system. However, the aldA promoter should be considered at least as powerful as that of alcA, despite the fact that it contains only one activation target. The direct involvement of the CreA repressor in the expression of alcA and alcR could explain this. Repressor and activator compete for binding the same promoter area in both these genes under all growth conditions (149). Carbon catabolite repression is a phenomenon that depends on the catabolic flow. In other words, carbon catabolite repression is never totally absent in growing cells. As a consequence, disruption of functional CreA targets in both alcR and alcA genes does not only lead to a derepression in the

presence of D-glucose, but also results in an overexpression under induced growth conditions (11-13). Induction of aldA transcription is regulated more straightforwardly, since the aldA promoter is subject only to AlcR-mediated activation. aldA gene expression thus reflects the true force of the AlcR-activation cascade mechanism mediated via a single cis-acting element. Additionally, the differences in promoter strengths could depend on the composition and the context in which the various functional AlcR targets reside. Previous studies have shown that core context, spacing between the two half-sites of a target, and orientation of the half-sites influence AlcR binding in vitro (59, 60). The basic Arg 6 residue outside the AlcR zinc binuclear cluster is involved in DNA binding and also in transcriptional regulation. The "downtranscription" effect on the aldA gene is likely to be caused by the reduced affinity for the inverted repeat target when this basic residue is mutated (59). Another possible important parameter could be the distance between the activation target and the site where the transcription machinery assembles. Moreover, the aldA promoter may be more accessible to the general transcription machinery. Finally, considering the ALDH involvement in multiple catabolic pathways (Fig. 5), the existence of a second, as yet unidentified, transactivator for aldA should not be excluded. It is also worthwhile to compare the basal level of expression of the three alc genes. The regulation of the basal level of the three alc genes involved in ethanol oxidation is completely different and should play a role in the onset of induction. To enable initial induction of the alc system, a constitutive basal level of alcR expression is required. Indeed, alcR basal transcription is substantial, as observed in alcR loss-of-function mutants (11, 13, 59, 149).
However, the control of the alcR basal level seems more complex, since part of it is controlled by AlcR and it is still subject to CreA-mediated repression (149). The second prerequisite for alc induction is the presence of a low level of acetaldehyde, the physiological inducer, able to activate AlcR (38a). This metabolite is likely to be formed continuously from regular metabolism. However, the constitutive basal level of expression of aldA is independent of both AlcR and CreA and serves to control the intracellular levels of acetaldehyde under any growth conditions. This is of considerable physiological relevance for the organism, since it is generally accepted that acetaldehyde is toxic for the cell (91). By contrast, in aldA loss-of-function mutants, the level of alc gene expression under noninduced growth conditions depends on the level of acetaldehyde accumulation, directly related to the stringency of the mutation (38a). For the onset of the induction process, the level of acetaldehyde should exceed the conversion capacity of the basal constitutively present ALDH. The alcA gene could play a role in this initial coinducer accumulation, since its basal level is directly controlled by an active AlcR protein. Furthermore, it was clearly shown that ADHI is present in low amounts in noninduced grown cells (10). Moreover, this alcA basal level is high enough to provide the expression of a reporter


gene (uaZ) driven by the alcA promoter, which allows growth even under noninduced growth conditions (unpublished results). In the presence of ethanol as the sole carbon source (see Fig. 1), limiting amounts of acetaldehyde might trigger the formation of extra acetaldehyde, as the ADHI reaction equilibrium is fully shifted toward oxidation of ethanol. Alternatively, the initial formation of acetaldehyde might result from some ethanol conversion by a constitutively present dehydrogenase activity. A preferential induction of alcA in the early stages of the induction process would facilitate further accumulation of acetaldehyde. As intracellular acetaldehyde increases in a cascade-like manner, the activation system can adjust the expression profile of the responsive genes until a steady-state acetaldehyde concentration is reached. That balanced concentration would permit an optimal catabolic flux from ethanol without the risk of acetaldehyde toxicity. The presence of three synergistically acting AlcR targets in the alcA promoter and differences in affinity among the functional AlcR targets in alcA, alcR, and aldA could contribute to a preferential alcA induction in response to limiting amounts of acetaldehyde.

We have shown that mutants carrying a functional alcR gene that is not subject to autoactivation are perfectly able to grow on ethanol as the sole carbon source (13), even though the induction of the alc-responsive genes is far weaker than in the wild type. This puts the autoregulation of the transactivator-encoding alcR gene in another perspective: adaptation in response to increasing coinducer levels apparently does not require induced alcR expression but is facilitated by the AlcR positive-feedback loop.

Changes in acetaldehyde concentration are important in determining the onset and maintenance of the induction process. Clear evidence for this is provided by transformants expressing aldA from the strong constitutive gpdA promoter. Prominent expression of aldA in the early phases of induction in such transformants slows down the accumulation of intracellular acetaldehyde. As a consequence, the induction process is suppressed due to a failure to induce alcA expression sufficiently (38a). This implies that the apparent in situ level of ALDH plays an important role in the transduction of the coinducer signal to the transcriptional activator AlcR. A subtle control of both aldA and alcA gene expression by the transactivator AlcR in response to differential intracellular concentrations of acetaldehyde could therefore be essential for the onset and maintenance of an optimal catabolic flow from ethanol.
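The feedback reasoning in this section can be made concrete with a deliberately simple numerical sketch. It is not a model taken from the cited work: the two-variable scheme, every rate constant, and the function name `simulate` are assumptions chosen only to illustrate the qualitative argument, namely that acetaldehyde-dependent induction of alcA is self-reinforcing and that a high constitutive ALDH level (as in the gpdA-driven aldA transformants) can keep acetaldehyde too low for induction to take off.

```python
def simulate(aldh_constitutive, t_end=100.0, dt=0.01):
    """Euler integration of a toy acetaldehyde / alcA feedback loop.

    Returns (acetaldehyde, alcA_expression) at t_end, in arbitrary units.
    All constants below are illustrative assumptions, not measured values.
    """
    basal = 0.01    # uninduced alcA synthesis rate
    beta = 1.0      # maximal AlcR-dependent synthesis rate
    delta = 1.0     # first-order loss of the alcA product
    k_half = 1.0    # acetaldehyde level giving half-maximal induction
    k_prod = 5.0    # acetaldehyde formed per unit ADHI (ethanol in excess)

    acetaldehyde = 0.0
    alcA = basal / delta                      # start from the uninduced level
    for _ in range(int(t_end / dt)):
        signal = acetaldehyde / (k_half + acetaldehyde)
        aldh = aldh_constitutive + alcA       # aldA assumed co-induced with alcA
        d_alcA = basal + beta * signal - delta * alcA
        d_acet = k_prod * alcA - aldh * acetaldehyde
        alcA += dt * d_alcA
        acetaldehyde += dt * d_acet
    return acetaldehyde, alcA


wild_type = simulate(aldh_constitutive=1.0)        # low basal ALDH
overexpressor = simulate(aldh_constitutive=10.0)   # mimics constitutive aldA overexpression
print("wild type:     acetaldehyde=%.3f alcA=%.3f" % wild_type)
print("overexpressor: acetaldehyde=%.3f alcA=%.3f" % overexpressor)
```

With these assumed parameters the low-ALDH run settles at a high, steady level of both acetaldehyde and alcA expression, whereas the high-ALDH run stays close to the basal level. The sketch says nothing about real kinetics; it only shows that the verbal argument is internally consistent.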

VI. The alc System as a Heterologous Expression Tool in Fundamental Research and Biotechnology

For a number of reasons, the ethanol regulon may be exploited as a suitable system for expression of heterologous proteins in filamentous fungi (7). First, the structural alcA and aldA genes are very highly expressed. The corresponding promoters are considered to be among the strongest described thus far in A. nidulans. Under certain induction conditions, the specific transcripts comprise 1% of the total mRNA content and the two enzymes represent the major translated proteins (9). Nevertheless, the basal level of expression, especially under glucose-repressed conditions, remains very low. A very high induction ratio and a relatively high rate of mRNA synthesis give the alc expression system a considerable advantage: the induction phase can be separated from the initial rapid accumulation of biomass, and the induction parameters can be controlled by varying the carbon source in the medium. This property could resolve possible problems arising from the toxic effects of a foreign gene product. Moreover, the inducibility of the system provides alternative approaches to studying the expression of genes whose function is essential for the cell, by creating conditionally null mutants. Second, the chemical inducers are relatively simple organic molecules, cheap and biodegradable, making the alc system readily applicable to commercial purposes. Finally, as the minimal expression system comprises only two elements, namely AlcR and the alcA promoter, it can be transferred to industrially important fungal species or even to other organisms.

Undoubtedly, the alcR-alcA expression system has found its principal application in fundamental research in A. nidulans. In particular, the alcA promoter has been widely used to study genes controlling cell division and developmental cycles in A. nidulans. Overexpression of the developmental regulatory genes brlA, abaA, and wetA caused growth inhibition and development at times scheduled by these regulators (173, 174). Moreover, several previously uncharacterized genes whose overexpression resulted in specific phenotypic changes of development have been identified.
The role of β-tubulins in vegetative growth (88) as well as that of NIMA kinase in nuclear division control (175) was established by overexpressing the corresponding genes from the alcA promoter. In addition, induction studies demonstrated the role of myosin I and calmodulin in polarized growth and control of cell proliferation, respectively (176, 177). Recently, an attempt to elucidate the role of histone H1 in the organization of chromatin structure was made using the same approach (178). Overexpression of the genes for penicillin biosynthesis in A. nidulans not only allowed the determination of a rate-limiting step in antibiotic production but also resulted in a 30-fold increase in penicillin yield (179, 180).

In the early 1990s, efforts were made to adapt and optimize the alcR-alcA expression system for the controlled production of mammalian proteins in A. nidulans. A number of human proteins, mainly for therapeutic use (interferon 2, interleukin 6a, growth hormone, epidermal growth factor, superoxide dismutase, and lactoferrin), were expressed from the alcA promoter (181, 182). In most cases, the recombinant proteins were biologically active. The product yields varied from several micrograms to 100 mg per liter of culture medium, depending on the protein produced. Although the particular expression cassette was designed for secretion of foreign proteins by virtue of either an artificial or a glucoamylase-derived signal peptide, it has proved equally useful for intracellular production (181). Secretion can be increased up to 100-fold when a heterologous gene is fused in frame to the alcA-driven glucoamylase gene (183). Given the limited amounts of AlcR in A. nidulans, further improvement of the expression level has been obtained either by introducing additional copies of the alcR gene or by placing it under the control of the strong constitutive gpdA promoter (11). In the latter case, a 5-fold increase of endogenous alcA transcription was observed. Further optimization of the alc expression system was achieved by releasing the alcA gene from carbon catabolite repression by disrupting CreA targets within its promoter region (12, 183). Such a glucose-derepressed expression system permits the production of an alcA-driven product even in the presence of glucose. Cultivation of A. nidulans on glucose significantly decreases the endogenous protein background in the culture medium and prevents degradation of a recombinant product by extracellular proteases coexpressed at the time of maximal secretion.

On the whole, the transcription level of the alc expression system does not seem to be a limiting factor for foreign protein production. Other problems, such as mRNA instability, misfolding and incorrect processing of a recombinant protein, or its degradation by endogenous proteases, are the bottlenecks for biotechnological applications of the system. These problems, common to filamentous fungi, await resolution.

While A. nidulans is clearly the most amenable organism for fundamental molecular biology and genetics, A. niger is of such great industrial importance that development of a convenient expression system is a high-priority goal. Therefore, the efficiency of the alcR-alcA expression system was tested in A. niger (results will be published elsewhere). A. niger possesses alcohol and aldehyde dehydrogenases, but they show a different mode of regulation (28, 184).
Endogenous ADH activities are constitutively expressed at low levels and do not interfere with the regulation of the introduced alcA gene. Expression of either the exogenous alcA gene or the reporter β-glucuronidase gene driven from the alcA promoter was dependent on the integrity of the gpdA:alcR fusion construct introduced into the A. niger genome. The system was highly inducible by ethanol to levels equivalent to those observed in A. nidulans. Unfortunately, the elevated basal level may cause undesirable effects arising from the production of recombinant proteins that are toxic for the cell (I. Nikolaev, M. Mathieu, P. van de Vondervoort, J. Visser, and B. Felenbok, unpublished results).

The usefulness of the alcR-alcA expression system beyond the fungal kingdom was demonstrated in transgenic plant studies (185). A high level of expression of a yeast invertase transcribed from the alcA promoter in transgenic tobacco, resulting in characteristic phenotypic modifications in developing leaves, was observed when plants were treated with ethanol. After removal of the inducer, the plants continued to grow normally. The alc system thus demonstrated its potential by allowing inducible manipulation of carbon metabolism in transgenic plants. It could prove an invaluable tool in the study of essential genes in organisms other than filamentous fungi.

VII. Conclusions and Prospects

In A. nidulans, the structural genes of the ethanol-utilization pathway, alcA encoding alcohol dehydrogenase I and aldA encoding aldehyde dehydrogenase, are among the most efficiently transcribed genes. The system is strictly regulated at the transcriptional level by two regulatory circuits whose interplay modulates the alc system.

Specific induction is mediated by the transactivator protein AlcR in the presence of a coinducer. The physiological inducer is an intermediate catabolite of the pathway, acetaldehyde. Aldehyde dehydrogenase oxidizes acetaldehyde into acetate and plays a crucial role in the induction process: this enzyme maintains a balance that allows acetaldehyde to accumulate sufficiently to induce the alc system while avoiding intoxication. The mechanism by which the inducing signal is transmitted to AlcR is still unknown. Isolation of alcR constitutive mutants should help us to understand whether acetaldehyde binds directly to AlcR or whether the signal is transduced via another protein interacting with AlcR. Signal transduction pathways starting from acetaldehyde may also involve a cascade of specific kinases/phosphatases, which, however, remains to be elucidated.

AlcR is a zinc binuclear cluster protein with features that set it apart from the other regulators of this class. Structurally, the DNA-binding domain of AlcR presents an asymmetric motif. Functionally, AlcR is able to bind in vitro to a single site as a monomer. As expected for a monomer, no dimerization sequence has been identified. No heterodimer is formed between AlcR proteins of different length in a reticulocyte transcription-translation system, and neither part of AlcR is able to drive dimerization of the λ cI repressor. However, the in vivo situation is more complex, as all AlcR functional targets located in responsive promoters are repeated sites with the same consensus core. These can occur as inverted and direct repeats.
Several lines of in vivo evidence favor the binding of one AlcR molecule per consensus site. Spacing between two repeated sites can vary from 2 to 16 bp. The different targets, in which the sites are oriented symmetrically or asymmetrically, appear equally efficient in eliciting transcription. Furthermore, the in vivo function of one target (c) in the alcA promoter, which encompasses three AlcR physiological binding sites (a direct repeat overlapping an inverted one), should involve individual binding of AlcR molecules to each site. This does not exclude an interaction between AlcR molecules, which could occur when DNA bends upon AlcR binding, or the involvement of another protein that could interact with AlcR and contact the transcriptional machinery (see Fig. 11). The apparent paradox between our in vitro and in vivo results should be resolved by searching for possible interactions between AlcR molecules and/or between AlcR and other as yet unidentified factors.

FIG. 11. Putative model of the transcriptional mechanism of induction of the three genes, alcR, alcA, and aldA, involved in ethanol oxidation. This model takes into account in vitro AlcR binding data and in vivo regulation studies after target disruption in the alc promoters. Nothing is yet known about the transducing pathway. AlcR functional palindromic sites in the alcR, alcA, and aldA promoters are able to bind two AlcR molecules, which is indicated by ×2. AlcR functional direct-repeat sites in the alcA promoter bind one AlcR molecule in vitro (×1), whereas in vivo two AlcR molecules (×2) are necessary for transcriptional activation and three AlcR molecules (×3) may even bind region c. These data lead us to propose a coactivator protein interacting with AlcR, which could be in contact with the transcriptional machinery. This could explain the AlcR position effect and the synergistic activation.

The unique AlcR DNA-binding properties are reflected by interactions between amino acids and DNA that are not found in other zinc binuclear cluster proteins. In addition, an amino acid N-terminal to the zinc binuclear cluster, Arg 6, is directly involved in AlcR DNA-binding specificity as well as in transcriptional activation. NMR studies showed that this residue interacts in the minor groove with bases adjacent to the consensus core. The structural approach will be continued in order to resolve the three-dimensional structure of the full-length AlcR protein complexed with different DNA targets.

All the repeated DNA sites are found in the AlcR-responsive promoters and play a role in the proper induction of ethanol catabolism. The alcR gene is subject to positive autoregulation mediated by one functional target, a palindromic site. This positive autoregulation amplifies the external inducing signal, thereby increasing induced alcR transcription. One direct consequence is the strong transcriptional induction of the alcA and aldA genes, which is correlated with, and dependent on, the AlcR level in the cell. This can explain the high level of induced transcription of the alc genes.

The second element is the mechanism of AlcR transcriptional activation of the structural genes, for which the alcA and aldA genes represent different cases. In the alcA promoter, three different types of targets are present, all of which are necessary for its full activation. There is a definite position effect of the two distal AlcR targets upstream of the initiation site, in which the position of the central target b plays a crucial role. In addition, a strong synergistic interaction among these three targets greatly increases induced alcA transcription. In the aldA promoter, a single functional AlcR palindromic target has been localized, which is very efficient for induced aldA transcription. In this case, other parameters could play a role, such as the chromatin structure or the interaction with components of the transcriptional machinery. Indeed, the elevated levels of the structural alc gene transcripts correspond to abundant translation products, ADHI and ALDH.

The second regulatory circuit, carbon catabolite repression, is mediated by the CreA repressor, a protein of the zinc finger class. CreA mediates repression via its binding to GC-rich targets in responsive promoters and appears to be the sole repressor of the alc system. Three different mechanisms of repression occur among the three alc genes. The alcR gene is directly repressed by CreA via its binding to two pairs of CreA sites, each functioning as a pair.
One pair overlaps the AlcR target. alcR repression mediated by this pair is only partial and involves direct competition with AlcR for the same promoter region, probably as the result of steric hindrance between the two DNA-binding proteins. Competition between the two regulators was also found in the alcA promoter, which contains one pair of CreA sites overlapping the proximal AlcR target. This target mediates all direct CreA repression of alcA. Steric competition accounts for an overinduction at the transcriptional level in the absence of glucose when one of these functional CreA binding sites is disrupted in either promoter. This mechanism plays a decisive role under physiological growth conditions in which an inducer of the alc regulon is present. The second repression mechanism is found in the alcR promoter and is mediated by a second pair of CreA sites localized between the two transcription starts. Repression mediated via this pair of CreA sites is virtually absolute and presumably interferes directly with the general transcription machinery. It should be very efficient under physiological conditions in which a rich carbon source is present. A third mechanism occurs in the aldA gene, which is not under the direct control of CreA. The total carbon catabolite repression of the alcR gene is sufficient to prevent aldA transcriptional induction by a direct cascade mechanism. However, the absence of direct repression could be essential for the basal level of aldA expression, which has been shown to be of physiological importance for the organism under all growth conditions.

It is clear that the interplay between the two regulatory circuits, which in some cases could interact directly, operates under any growth conditions and modulates efficient utilization of nutrients when growth conditions change. These results from the dissection of the alc promoters permit a good understanding of the molecular mechanisms of specific induction and carbon catabolite repression. What remains to be understood is the regulation at the chromatin level, i.e., the role of these regulators in nucleosome positioning and its link with induced transcriptional activation. Another feature yet to be elucidated is the transducing pathway and the interaction with the transcriptional machinery. Finally, considering that five alc genes are organized in a cluster, it would be interesting to analyze further the functional involvement of this cluster organization in the regulation of ethanol pathway gene expression, as well as its relationships with other carbon catabolic pathways in A. nidulans.
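The site organization summarized in this section (repeated consensus cores arranged as direct or inverted repeats, spaced 2 to 16 bp apart, with some repressor sites overlapping activator sites) lends itself to a small sequence-scanning sketch. Everything specific in it is an assumption made for illustration: the two core patterns `ALCR_CORE` and `CREA_CORE` are placeholders that are not given in this section and should be replaced by the cores reported in the cited binding studies, the `PROMOTER` string is synthetic, and the function names are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# ASSUMPTION: placeholder core patterns, not taken from this section.
ALCR_CORE = "[AT]GCGG"         # illustrative 5-bp activator core
CREA_CORE = "[CG][CT]GG[AG]G"  # illustrative 6-bp GC-rich repressor core

# Synthetic toy sequence, built so that the sites are easy to check by eye.
PROMOTER = ("AAAA" + "TGCGG" + "AT" + "CCGCA" + "A" * 20
            + "TGCGG" + "ACA" + "TGCGGGG" + "AAAA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq, core_regex):
    """Return sorted (start, end, strand) tuples in forward-strand coordinates."""
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # A lookahead is used so that overlapping matches are not skipped.
        for m in re.finditer(f"(?=({core_regex}))", s):
            width = len(m.group(1))
            start = m.start() if strand == "+" else len(seq) - m.start() - width
            sites.append((start, start + width, strand))
    return sorted(sites)


def classify_repeats(sites, max_gap=16):
    """Pair neighbouring sites: same strand = direct, opposite strands = inverted."""
    repeats = []
    for (s1, e1, strand1), (s2, _, strand2) in zip(sites, sites[1:]):
        gap = s2 - e1
        if 0 <= gap <= max_gap:
            kind = "direct" if strand1 == strand2 else "inverted"
            repeats.append((kind, s1, s2, gap))
    return repeats


def overlapping_sites(sites_a, sites_b):
    """Start positions of site pairs whose footprints share at least one bp."""
    return [(a[0], b[0]) for a in sites_a for b in sites_b
            if a[0] < b[1] and b[0] < a[1]]


alcr_sites = find_sites(PROMOTER, ALCR_CORE)
crea_sites = find_sites(PROMOTER, CREA_CORE)
print(classify_repeats(alcr_sites))
print(overlapping_sites(alcr_sites, crea_sites))
```

On the toy sequence this reports one inverted repeat with a 2-bp spacer, one direct repeat with a 3-bp spacer, and one activator core that overlaps a repressor core. A real analysis would also need the flanking bases, which the text notes contribute to AlcR binding specificity, and it cannot by itself say which predicted sites are functional in vivo.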

ACKNOWLEDGMENTS

We are indebted to Prof. Barry Holland for correcting the English. We are grateful to Christian Velot for his constructive comments. We thank Martine Mathieu for help with figures and Claire Denis for typing. The work was supported in part by the Centre National de la Recherche Scientifique, by the Université Paris-Sud, and by grants from the European Commission (Nos. BIO4-CT975028, BIO4-CT0535, and QLCK3-CT1999-00729).

REFERENCES

1. H. N. Arst, Jr., and C. Scazzocchio, in "Gene Manipulations in Fungi" (J. W. Bennett and L. L. Lasure, eds.), Vol. 13, pp. 310-337. Academic Press, New York, 1985.
2. J. A. Pateman, C. H. Doy, J. E. Olsen, U. Norris, E. H. Creaser, and M. Hynes, Proc. R. Soc. London B 217, 243-264 (1983).
3. B. Felenbok and H. M. Sealy-Lewis, in "Aspergillus: 50 Years On" (S. D. Martinelli and J. R. Kinghorn, eds.), Progress in Industrial Microbiology, Vol. 29, pp. 141-179. Elsevier, Amsterdam-London-New York-Tokyo, 1994.
4. B. Felenbok and J. M. Kelly, in "Biochemistry and Molecular Biology" (R. Brambl and G. A. Marzluf, eds.), The Mycota III, pp. 369-380. Springer-Verlag, Berlin-Heidelberg-New York, 1996.
5. M. M. Page, "Genetics and Biochemical Studies on the Catabolism of Amines and Alcohols in Aspergillus nidulans," Ph.D. Thesis, University of Cambridge, Cambridge, UK, 1971.
6. H. N. Arst, Jr., and D. J. Cove, Mol. Gen. Genet. 126, 111-141 (1973).
7. B. Felenbok, J. Biotechnol. 17, 11-18 (1991).
8. R. Lockington, C. Scazzocchio, D. Sequeval, M. Mathieu, and B. Felenbok, Mol. Microbiol. 1, 275-281 (1987).
9. B. Felenbok, D. Sequeval, M. Mathieu, S. Sibley, D. I. Gwynne, and R. W. Davies, Gene 73, 385-396 (1988).
10. C. Panozzo, V. Capuano, S. Fillinger, and B. Felenbok, J. Biol. Chem. 272, 22859-22865 (1997).
11. M. Mathieu and B. Felenbok, EMBO J. 13, 4022-4027 (1994).
12. C. Panozzo, E. Cornillot, and B. Felenbok, J. Biol. Chem. 273, 6367-6372 (1998).
13. M. Mathieu, S. Fillinger, and B. Felenbok, Mol. Microbiol. 36, 123-131 (2000).
14. T. Roberts, S. Martinelli, and C. Scazzocchio, Mol. Gen. Genet. 177, 57-64 (1979).
15. H. M. Sealy-Lewis and R. A. Lockington, Curr. Genet. 8, 253-259 (1984).
16. R. A. Lockington, H. M. Sealy-Lewis, C. Scazzocchio, and R. W. Davies, Gene 33, 137-149 (1985).
17. R. R. Rando, Biochem. Pharmacol. 23, 2328-2331 (1974).
18. C. H. Doy, J. A. Pateman, J. E. Olsen, H. J. Kane, and E. H. Creaser, DNA 4, 105-114 (1985).
19. M. Pickett, D. I. Gwynne, F. P. Buxton, R. Elliott, R. W. Davies, R. A. Lockington, C. Scazzocchio, and H. M. Sealy-Lewis, Gene 51, 217-226 (1987).
20. P. Kulmburg, D. Sequeval, F. Lenouvel, M. Mathieu, and B. Felenbok, Mol. Cell. Biol. 12, 1932-1939 (1992).
21. P. Kulmburg, M. Mathieu, C. Dowzer, J. Kelly, and B. Felenbok, Mol. Microbiol. 7, 847-857 (1993).
22. D. I. Gwynne, F. P. Buxton, S. Sibley, R. W. Davies, R. A. Lockington, C. Scazzocchio, and H. M. Sealy-Lewis, Gene 51, 205-216 (1987).
23. S. E. Unkles, in "Applied Molecular Genetics of Filamentous Fungi" (J. R. Kinghorn and G. Turner, eds.), pp. 28-53. Blackie Academic & Professional, Chapman & Hall, London, 1992.
24. E. H. Creaser, R. L. Porter, K. A. Britt, J. A. Pateman, and C. H. Doy, Biochem. J. 225, 449-454 (1985).
25. G. D. Hunter, I. G. Jones, and H. M. Sealy-Lewis, Curr. Genet. 29, 122-129 (1996).
26. G. L. McKnight, H. Kato, A. Upshall, M. D. Parker, G. Saari, and P. J. O'Hara, EMBO J. 4, 2093-2099 (1985).
27. I. G. Jones and H. M. Sealy-Lewis, Curr. Genet. 17, 81-83 (1990).
28. J. M. Kelly, M. R. Drysdale, H. M. Sealy-Lewis, I. G. Jones, and R. A. Lockington, Mol. Gen. Genet. 222, 323-328 (1990).
29. H. M. Sealy-Lewis and V. Fairhurst, Microbiology 141, 2295-2300 (1995).
30. H.-W. Sun and B. V. Plapp, J. Mol. Evol. 34, 522-535 (1992).
31. M. F. Reid and C. A. Fewson, Crit. Rev. Microbiol. 20, 13-56 (1994).
32. I. G. Jones and H. M. Sealy-Lewis, Curr. Genet. 15, 135-142 (1989).
33. H. M. Sealy-Lewis, Curr. Genet. 18, 65-70 (1990).
34. E. H. Creaser, R. L. Porter, and J. A. Pateman, Int. J. Biochem. 19, 1009-1012 (1987).
35. J. Perozich, H. Nicholas, B.-C. Wang, R. Lindahl, and J. Hempel, Protein Sci. 8, 137-146 (1999).
36. P. G. Meaden, F. M. Dickinson, A. Mifsud, W. Tessier, J. Westwater, H. Bussey, and M. Midgley, Yeast 13, 1319-1327 (1997).
37. W. D. Tessier, P. G. Meaden, F. M. Dickinson, and M. Midgley, FEMS Microbiol. Lett. 164, 29-34 (1998).
38. X. Wang, C. J. Mann, Y. Bai, L. Ni, and H. Weiner, J. Bacteriol. 180, 822-830 (1998).
38a. M. Flipphi, M. Mathieu, I. Cirpus, C. Panozzo, and B. Felenbok, J. Biol. Chem. 276, 6950-6958 (2001).
39. Z.-J. Liu, Y.-J. Sun, J. Rose, Y.-J. Chung, C.-D. Hsiao, W.-R. Chang, I. Kuo, J. Perozich, R. Lindahl, J. Hempel, and B.-C. Wang, Nat. Struct. Biol. 4, 317-326 (1997).


40. K. Johansson, M. El-Ahmad, S. Ramaswamy, L. Hjelmqvist, H. Jörnvall, and H. Eklund, Protein Sci. 7, 2106-2117 (1998).
41. S. A. Moore, H. M. Baker, T. J. Blythe, K. E. Kitson, T. M. Kitson, and E. N. Baker, Structure 6, 1541-1551 (1998).
42. P. Kulmburg, T. Prangé, M. Mathieu, D. Sequeval, C. Scazzocchio, and B. Felenbok, FEBS Lett. 280, 11-16 (1991).
43. P. Schjerling and S. Holmberg, Nucleic Acids Res. 24, 4599-4607 (1996).
44. R. B. Todd and A. Andrianopoulos, Fungal Genet. Biol. 21, 388-405 (1997).
45. J. D. Baleja, R. Marmorstein, S. C. Harrison, and G. Wagner, Nature (London) 356, 450-453 (1992).
46. P. J. Kraulis, A. R. C. Raine, P. L. Gadhavi, and E. D. Laue, Nature (London) 356, 448-450 (1992).
47. R. Marmorstein, M. Carey, M. Ptashne, and S. Harrison, Nature (London) 356, 408-414 (1992).
48. R. Marmorstein and S. C. Harrison, Genes Dev. 8, 2504-2512 (1994).
49. K. H. Gardner, S. F. Anderson, and J. E. Coleman, Nat. Struct. Biol. 2, 898-905 (1995).
50. D. A. King, Li Zhang, L. Guarente, and R. Marmorstein, Nat. Struct. Biol. 6, 64-71 (1999).
51. P. Kulmburg, N. Judewicz, M. Mathieu, F. Lenouvel, D. Sequeval, and B. Felenbok, J. Biol. Chem. 267, 21146-21153 (1992).
52. D. Sequeval and B. Felenbok, Mol. Gen. Genet. 242, 33-39 (1994).
53. I. Ascone, F. Lenouvel, D. Sequeval, H. Dexpert, and B. Felenbok, Biochim. Biophys. Acta 1343, 211-220 (1997).
54. N. Defranoux, M. Gaisne, and J. Verdière, Mol. Gen. Genet. 242, 699-707 (1994).
55. R. Cerdan, B. Cahuzac, B. Felenbok, and E. Guittet, J. Mol. Biol. 295, 729-736 (2000).
56. K. J. Walters, K. T. Dayie, R. J. Reece, M. Ptashne, and G. Wagner, Nat. Struct. Biol. 4, 744-750 (1997).
57. K. Swaminathan, P. Flynn, R. Reece, and R. Marmorstein, Nat. Struct. Biol. 4, 751-759 (1997).
58. J. Timmerman, A.-L. Vuidepot, F. Bontems, J.-V. Lallemand, M. Gervais, E. Shechter, and B. Guiard, J. Mol. Biol. 259, 792-804 (1996).
59. I. Nikolaev, F. Lenouvel, and B. Felenbok, J. Biol. Chem. 274, 9795-9802 (1999).
60. F. Lenouvel, I. Nikolaev, and B. Felenbok, J. Biol. Chem. 272, 15521-15526 (1997).
61. S. K. Thukral, A. Eisen, and E. T. Young, Mol. Cell. Biol. 11, 1566-1577 (1991).
62. M. Simon, G. Adam, W. Rapatz, W. Spevak, and H. Ruis, Mol. Cell. Biol. 11, 699-704 (1991).
63. M. Grauslund, J. M. Lopes, and B. Ronnow, Nucleic Acids Res. 27, 4391-4398 (1999).
64. Li Zhang and L. Guarente, EMBO J. 15, 4676-4681 (1996).
65. J. Strauss, M. I. Muro-Pastor, and C. Scazzocchio, Mol. Cell. Biol. 18, 1339-1348 (1998).
66. R. B. Todd, A. Andrianopoulos, M. A. Davis, and M. J. Hynes, EMBO J. 17, 2042-2054 (1998).
67. O. I. Sirenko, B. Ni, and R. B. Needleman, Curr. Genet. 27, 509-516 (1995).
68. H. E. Xu, T. Kodadek, and S. A. Johnston, Proc. Natl. Acad. Sci. U.S.A. 92, 7677-7680 (1995).
69. M. Lundin, J. O. Nehlin, and H. Ronne, Mol. Cell. Biol. 14, 1979-1985 (1994).
70. R. Reece and M. Ptashne, Science 261, 909-911 (1993).
71. J. W. R. Schwabe and D. Rhodes, Nat. Struct. Biol. 4, 680-682 (1997).
72. I. Nikolaev, M.-F. Cochet, F. Lenouvel, and B. Felenbok, Mol. Microbiol. 31, 1115-1124 (1999).
73. R. Cerdan, D. Collin, F. Lenouvel, B. Felenbok, and E. Guittet, FEBS Lett. 408, 235-240 (1997).
74. G. Marie, L. Serani, O. Laprévote, B. Cahuzac, E. Guittet, and B. Felenbok, Protein Sci. 10, 99-107 (2001).

75. D. Görlich and U. Kutay, Annu. Rev. Cell Dev. Biol. 15, 607-660 (1999).
76. S. Nakielny and G. Dreyfuss, Cell (Cambridge, Mass.) 99, 677-690 (1999).
77. A. Pokorska, C. Drevet, and C. Scazzocchio, J. Mol. Biol. 298, 585-596 (2000).
78. P. A. Silver, L. P. Keegan, and M. Ptashne, Proc. Natl. Acad. Sci. U.S.A. 81, 5951-5955 (1984).
79. M. J. DeVit, J. A. Waddle, and M. Johnston, Mol. Biol. Cell 8, 1603-1618 (1997).
80. C.-K. Chan and D. A. Jans, FEBS Lett. 462, 221-224 (1999).
81. J. Ma and M. Ptashne, Cell (Cambridge, Mass.) 48, 847-853 (1987).
82. T. Delaveau, A. Delahodde, E. Carvajal, J. Subik, and C. Jacq, Mol. Gen. Genet. 244, 501-511 (1994).
83. E. T. Young, J. Saario, N. Kacherovsky, A. Chao, J. S. Sloan, and K. M. Dombek, J. Biol. Chem. 273, 32080-32087 (1998).
84. G. Stone and I. Sadowski, EMBO J. 12, 1375-1385 (1993).
85. P. Friden, C. Reynolds, and P. Schimmel, Mol. Cell. Biol. 9, 4056-4060 (1989).
86. K. Pfeifer, K.-S. Kim, S. Kogan, and L. Guarente, Cell (Cambridge, Mass.) 56, 291-301 (1989).
87. L. M. Parsons, M. A. Davis, and M. J. Hynes, Mol. Microbiol. 6, 2999-3007 (1992).
88. R. B. Waring, G. S. May, and N. R. Morris, Gene 79, 119-130 (1989).
89. N. Monschau, K. P. Stahlmann, H. Sahm, J. B. McNeil, and A. L. Bognar, FEMS Microbiol. Lett. 150, 55-60 (1997).
90. M. M. Page and D. J. Cove, Biochem. J. 127, Suppl. Proc. Biochem. Soc., 17P (1972).
91. R. Crebelli, G. Conti, L. Conti, and A. Carere, Mutat. Res. 215, 187-195 (1989).
92. S. Armitt, W. McCullough, and C. F. Roberts, J. Gen. Microbiol. 92, 263-283 (1976).
93. I. F. Connerton, J. R. S. Fincham, R. A. Sandeman, and M. J. Hynes, Mol. Microbiol. 4, 451-460 (1990).
94. M. Brock, R. Fischer, D. Linder, and W. Buckel, Mol. Microbiol. 35, 961-973 (2000).
95. T. Tabuchi and H. Uchiyama, Agric. Biol. Chem. 39, 2035-2042 (1975).
96. D. H. A. Hondmann, R. Busink, C. F. B. Witteveen, and J. Visser, J. Gen. Microbiol. 137, 629-636 (1991).
97. R. Lindahl, Crit. Rev. Biochem. Mol. Biol. 27, 283-335 (1992).
98. R. A. Lockington, G. N. Borlace, and J. M. Kelly, Gene 191, 61-67 (1997).
98a. J. March, in "Advanced Organic Chemistry" John Wiley and Sons, p. 805, New York, 1985.
99. R. P. Jones, Enzyme Microb. Technol. 11, 130-153 (1989).
100. N. Oestreicher, H. M. Sealy-Lewis, and C. Scazzocchio, Gene 132, 185-192 (1993).
101. A. J. Darlington, C. Scazzocchio, and J. A. Pateman, Nature (London) 206, 599-600 (1965).
102. F. M. Dickinson and G. W. Haywood, Biochem. J. 247, 377-384 (1987).
103. J. C. Rowlands and J.-A. Gustafsson, Crit. Rev. Toxicol. 27, 109-134 (1997).
104. M. E. Hahn, Comp. Biochem. Physiol., Part C 121, 23-53 (1998).
105. C. L. Wilson and S. Safe, Toxicol. Pathol. 26, 657-671 (1998).
106. M. Johnston, Microbiol. Rev. 51, 458-476 (1987).
107. A. Platt and R. J. Reece, EMBO J. 17, 4086-4091 (1998).
108. C. Bailey and H. N. Arst, Jr., Eur. J. Biochem. 51, 573-577 (1975).
109. C. E. A. Dowzer and J. M. Kelly, Curr. Genet. 15, 457-459 (1989).
110. C. E. A. Dowzer and J. M. Kelly, Mol. Cell. Biol. 11, 5701-5709 (1991).
111. V. P. Sukhatme, X. M. Cao, L. C. Chang, C. H. Tsai-Morris, D. Stamenkovich, P. C. P. Ferreira, D. R. Cohen, S. A. Edwards, T. B. Shows, T. Curran, M. M. LeBeau, and E. D. Adamson, Cell (Cambridge, Mass.) 53, 37-43 (1988).
112. D. A. Haber, A. J. Buckler, T. Glazer, K. M. Call, J. Pelletier, R. L. Sohn, E. C. Douglass, and D. E. Housman, Cell (Cambridge, Mass.) 61, 1257-1269 (1990).
113. J. Nardelli, T. J. Gibson, C. Vesque, and P. Charnay, Nature (London) 349, 175-178 (1991).
114. J. O. Nehlin and H. Ronne, EMBO J. 9, 2891-2898 (1990).


114a. J. O. Nehlin, M. Carlberg, and H. Ronne, EMBO J. 10, 3373-3377 (1991).
115. N. P. Pavletich and C. O. Pabo, Science 252, 809-817 (1991).
116. B. Cubero and C. Scazzocchio, EMBO J. 13, 407-415 (1994).
117. C. Scazzocchio, V. Gavrias, B. Cubero, C. Panozzo, M. Mathieu, and B. Felenbok, Can. J. Bot. 73(Suppl. 1), S160-S166 (1995).
118. E. A. Espeso and M. A. Peñalva, FEBS Lett. 342, 43-48 (1994).
119. J. Strauss, H. K. Horvath, B. M. Abdallah, J. Kindermann, R. L. Mach, and C. P. Kubicek, Mol. Microbiol. 32, 169-178 (1999).
120. M. Orejas, A. P. MacCabe, J. A. Pérez González, S. Kumar, and D. Ramón, Mol. Microbiol. 31, 177-184 (1999).
121. H. N. Arst, Jr., and D. W. MacDonald, Nature (London) 254, 26-31 (1975).
122. L. L. Lutfiyya and M. Johnston, Mol. Cell. Biol. 14, 4790-4797 (1996).
123. H. Ronne, Trends Genet. 11, 12-17 (1995).
124. A. Sakai, Y. Shimizu, and F. Hishinuma, Genetics 119, 499-506 (1988).
125. A. Sakai, Y. Shimizu, S. Kondou, T. Chibazakura, and F. Hishinuma, Mol. Cell. Biol. 10, 4130-4138 (1990).
125a. Y. Li, S. Bjorklund, Y. W. Jiang, Y. J. Kim, W. S. Lane, D. J. Stillman, and R. D. Kornberg, Proc. Natl. Acad. Sci. U.S.A. 92, 10864-10868 (1995).
125b. D. R. Moss and P. J. Laybourn, Mol. Microbiol. 36, 1293-1305 (2000).
126. M. J. Hynes and J. M. Kelly, Mol. Gen. Genet. 150, 381-204 (1977).
127. H. N. Arst, Jr., D. Tollervey, C. E. A. Dowzer, and J. M. Kelly, Mol. Microbiol. 4, 851-854 (1990).
128. R. A. Shroff, R. A. Lockington, and J. M. Kelly, Can. J. Microbiol. 42, 950-959 (1996).
129. R. A. Shroff, S. M. O'Connor, M. J. Hynes, R. A. Lockington, and J. M. Kelly, Fungal Genet. Biol. 22, 28-38 (1997).
130. J. M. Kelly, in "Aspergillus: 50 Years On" (S. D. Martinelli and J. R. Kinghorn, eds.), Progress in Industrial Microbiology, Vol. 29, pp. 355-367. Elsevier, Amsterdam-London-New York-Tokyo, 1994.
131. G. J. G. Ruijter and J. Visser, FEMS Microbiol. Lett. 151, 103-114 (1997).
132. P. van der Veen, H. N. Arst, Jr., M. J. A. Flipphi, and J. Visser, Arch. Microbiol. 162, 433-440 (1994).
133. J. Östling, M. Carlberg, and H. Ronne, Mol. Cell. Biol. 16, 753-761 (1996).
134. H. N. Arst, in "Genetics as a Tool in Microbiology" (S. W. Glover and D. A. Hopwood, eds.), Soc. Gen. Microbiol. Symp. Vol. 31, pp. 131-160. Cambridge University Press, Cambridge, 1981.
135. A. Woods, M. R. Munday, J. Scott, X. Yang, M. Carlson, and D. Carling, J. Biol. Chem. 269, 19509-19515 (1994).
136. R. Jiang and M. Carlson, Genes Dev. 10, 3105-3115 (1996).
137. M. Johnston, J. S. Flick, and T. Pexton, Mol. Cell. Biol. 14, 3834-3841 (1994).
138. L. G. Vallier and M. Carlson, Genetics 137, 49-54 (1994).
139. G. Vautard, P. Cotton, and M. Fèvre, FEBS Lett. 453, 54-58 (1999).
140. C. A. Keleher, M. J. Redd, J. Schultz, M. Carlson, and A. D. Johnson, Cell (Cambridge, Mass.) 68, 709-719 (1992).
141. J. P. Cooper, S. Y. Roth, and R. T. Simpson, Genes Dev. 8, 1400-1410 (1994).
142. M. J. Redd, M. B. Arnaud, and A. D. Johnson, J. Biol. Chem. 272, 11193-11197 (1997).
143. M. R. Drysdale, S. E. Kolze, and J. M. Kelly, Gene 130, 241-245 (1993).
144. J. Strauss, R. L. Mach, S. Zeilinger, G. Hartler, G. Stoffler, M. Wolschek, and C. P. Kubicek, FEBS Lett. 27, 103-107 (1995).
145. M. Ilmén, C. Thrane, and M. Penttilä, Mol. Gen. Genet. 251, 451-460 (1996).
146. S. Takashima, A. Nakamura, H. Iikura, H. Masaki, and T. Uozumi, Biosci. Biotechnol. Biochem. 60, 173-176 (1996).

TRANSCRIPTIONAL REGULATION OF THE ETHANOL UTILIZATION PATHWAY 203 147. S. Takashima, A. Nakamura, M. Hidaka, H. Masaki, and T. Uozumi, Biosci. Biotechnol. Biochem. 62, 2364-2370 (1998). 148. B. Tudzynski, S. Liu, and J. M. Kelly, FEMS Microbiol. Lett. 184, 9-15 (2000). 148a. I. de la Serna, D. Ng, and B. M. Tyler, Fungal Genet. Biol. 26, 253-269 (1999). 149. S. Fillinger, C. Panozzo, M. Mathieu, and B. Felenbok, FEBS Lett. 368, 547-550 (1995). 150. S. Fillinger, "Identification et Etude Fonctionelle de Nouveaux Genes Appartenant an ReguIon Ethanol chezAspergillus nidulans," Ph.D. Thesis, Universit6 de Paris-Sud, Orsay, France, 1996. 151. G. Burger, J. Strauss, C. Scazzocchio, and B. F. Lang, Mol. Cell. Biol. 11,795-802 (1991). 152. T. Su~rez, N. Oestreicher, M. A. Pefialva, and C. Scazzocchio, Mol. Gen. Genet. 230, 369-375 (1991). 153. T. Langdon, A. Sheerins, A. Ravagnani, M. Gielkens, M. X. Caddiek, and H. N. Arst, Jr., Mol. Microbiol. 17, 877-888 (1995). 154. J. Tilburn, S. Sarkar, D. A. Widdick, E. A. Espeso, M. Orejas, J. Mungroo, M. A. Pefialva, and H. N. Arst, Jr., EMBOJ. 14, 779-790 (1995). 155. S. Fillinger and B. Felenbok, Mol. Microbiol. 20, 475-488 (1996). 156. M.I. Muro-Pastor, R. Gonzales, J. Strauss, E M. Narendja, and C. Scazzocchio, EMBOJ. 18, 1584-1597 (1999). 157. F. M. Narendja, M. A. Davis, and M. J. Hynes, Mol. Cell. Biol. 19, 6523-6531 (1999). 158. S.D. Liang, R. Marmorstein, S. C. Harrison, and M. Ptashne, Mol. Cell. Biol. 16, 3773-3780 (1996). 159. E. Giniger and M. Ptashne, Proc. Natl. Acad. Sci. U.S.A. 85, 382-386 (1988). 160. Y.-S. Lin, M. Carey, M. Ptashne, and M. R. Green, Nature (London) 345, 359-361 (1990). 161. M. Carey, Y.-S. Lin, M. R. Green, and M. Ptashne, Nature (London) 345, 361-364 (1990). 162. I. L. Johnstone, P. C. McCabe, P. Greaves, S. J. Gun', G. E. Cole, M. A. D. Brown, S. E. Unkles, A. J. Clutterbuck, J. R. Kinghorn, and M. A. Innis, Gene 90, 181-192 (1990). 163. A. R. Hawkins, H. K. Lamb, and C. F. 
Roberts, in "AspergiUus: 50 Years On" (S. D. Martinelli and J. R. Kinghorn, eds.), Progress in Industrial Microbiology, Vol. 29, pp. 195-220. Elsevier Amsterdam-London-New York-Tokyo, 1994. 164. N. E Keller and T. M. Hohn, Fungal Genet. Biol. 21, 17-29 (1997). 165. V. Gavrias, B. Cubero, B. Cazelle, V. Sophianopoulou, and C. Scazzocchio, in "The Genus Aspergillus: From Taxonomy and Genetics to Industrial Application" (K. A. Powell, A. Renwick, and J. F. Peberdy, eds.), FEMS Symp. Ser., Vol. 69, pp. 225-232. Plenum Press, New York-London, 1994. 166. P.J. Punt, j. Strauss, R. Stair, J. R. Kinghorn, C. A. M. J. j. van den Hondel, and C. Scazzocchio, Mol. Cell. Biol. 15, 5688-5699 (1995). 167. S. A. Jones, H. N. Arst, Jr., and D. W. MacDonald, Curt. Genet. 3, 49-56 (1981). 168. K. A. Krzywicki and M. C. Brandiss, Mol. Cell. Biol. 4, 2837-2842 (1984). 169. D. J. Cove, Biol. Rev. 54, 291-327 (1979). 170. A. B. Tomsett and R. H. Gan.ett, Genetics 95, 649-660 (1980). 171. M. Ciriacy, Mol. Gen. Genet. 138, 157-164 (1975). 172. M. Simon, M. Binder, G. Adam, A. Hartig, and H. Ruis, Yeast 8, 303-309 (1992). 173. T. H. Adams and W. E. Timberlake, Proc. Natl. Acad. Sci. U.S.A. 87, 5405-5409 (1990). 174. J. F. Marhoul and T. H. Adams, Genetics 139, 537-547 (1995). 175. K. P. Lu and A. R. Means, EMBOJ. 13, 2103-2113 (1994). 176. C. A. McGoldrick, C. Gruver, and G. R. May,]. Cell. Biol. 128, 577-587 (1995). 177. K. E Lu, C. D. Rasmussen, G. S. May, and A. R. Means, Mol. Endocrinol. 6, 365-374 (1990). 178. A. C. Ram6n Pacheco, "Structure Chromatinienne et Expression G~nique chez Aspergillus nidulans," Ph.D. Thesis, Universit6 de Paris-Sud, Orsay, France, 2000. 179. J. M. Fernandez-Canon and M. A. Pefialva, Mol. Gen. Genet. 246, 110-118 (1995). 180. J. Kennedy and G. Turner, Mol. Gen. Genet. 253, 189-197 (1996).

204

B. FELENBOK ET AL.

181. R. W. Davies, in "Molecular Industrial Mycology Systems and Applications for Filamentous

182. 183. 184. 185.

Fungi" (S. A. Leong and R. M. Berka, eds.), pp. 45-58. Morris-Bekker, New York-BaselHong Kong, 1991. P. P. Ward, G. S. May, D, R. Headon, and O. M. Conneely, Gene 122, 219-223 (1992). W. E. Hintz, I. Kalsner, E. Plawinski, Z. Guo, and P. A. Lagosky, Can. J. Bot. 73(Suppl. 1), $876- $884 (1995). M.J. O'Connell andJ. M. Kelly, Curt. Genet. 14, 95-103 (1988). M. X. Caddick, A. J. Greenland, J. Jepson, K.-P. Krause, N. Qu, K. V. Riddell, M. G. Salter, W. Schuch, U. Sonnewald, and B. Tomsett, Nature Biotechnol. 16, 177-180 (1998).

The ROR Nuclear Orphan Receptor Subfamily: Critical Regulators of Multiple Biological Processes

ANTON M. JETTEN,1 SHOGO KUREBAYASHI, AND EIICHIRO UEDA

Cell Biology Section
Division of Intramural Research
National Institute of Environmental Health Sciences
National Institutes of Health
Research Triangle Park, North Carolina 27709

I. Introduction
II. Cloning and Expression Pattern of RORs
    A. RORα
    B. RORβ
    C. RORγ
    D. Insect Homologs of ROR
III. Structure of ROR Proteins
IV. Characterization of ROR Response Elements
V. Transcriptional Control by RORs
    A. Ligand-Dependent or -Independent Activation?
    B. Interaction of ROR with Corepressors and Coactivators
    C. Regulation of ROR-Dependent Transactivation by CaMKIV
VI. Genomic Structure and Chromosomal Localization
VII. Targeted Knockouts of RORs
    A. Phenotype of RORα-/- Mice
    B. Phenotype of RORβ-/- Mice
    C. Phenotype of RORγ-/- Mice
VIII. Overexpression of RORs
    A. Effect of RORγ on Thymopoiesis and Apoptosis
    B. Inhibition of Myogenesis by Dominant-Negative RORα
IX. Other Target Genes
X. Concluding Remarks
References

1 To whom correspondence should be addressed. Tel: 919-541-2768; fax: 919-541-4133; E-mail: [email protected].

Progress in Nucleic Acid Research and Molecular Biology, Vol. 69. Copyright © 2001 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/01 $35.00

The nuclear receptor superfamily, a group of structurally related, ligand-dependent transcription factors, includes a large number of orphan receptors for which no ligand has yet been identified. These proteins function as key regulators of many physiological processes that occur during embryonic development and in the adult. The retinoid-related orphan receptors (RORs) α, β, and γ comprise one nuclear orphan receptor gene subfamily. RORs exhibit a modular structure that is characteristic for nuclear receptors; the DNA-binding domain is highly conserved and the ligand-binding domain is moderately conserved among RORs. By a combination of alternative promoter usage and exon splicing, each ROR gene generates several isoforms that differ only in their amino terminus. RORs bind as monomers to specific ROR response elements (ROREs) consisting of the consensus core motif AGGTCA preceded by a 5-bp A/T-rich sequence. RORE-dependent transcriptional activation by RORs is cell type-specific and mediated through interactions with nuclear cofactors. RORs have been shown to interact with certain corepressors as well as coactivators, suggesting that RORs are not constitutively active but that their activity is under some regulatory control. RORs likely can assume at least two different conformations: a repressive state, which allows interaction with corepressor complexes, and an active state, which promotes binding of coactivator complexes. Whether the transition between these two states is regulated by ligand binding and/or by phosphorylation remains to be determined. Ca2+/calmodulin-dependent kinase IV (CaMKIV) can dramatically enhance ROR-mediated transcriptional activation. This stimulation involves CaMKIV-mediated phosphorylation not of RORs, but likely of specific nuclear cofactors that interact with RORs. RORα is widely expressed. In the cerebellum, its expression is limited to the Purkinje cells. RORα-/- mice and the natural RORα-deficient staggerer mice exhibit severe cerebellar ataxia due to a defect in Purkinje cell development. In addition, these mice have thin long bones, suggesting a role for RORα in bone metabolism, and develop severe atherosclerosis when placed on a high-fat diet. Expression of RORβ is very restricted. RORβ is highly expressed in different parts of the neurophotoendocrine system, the pineal gland, the retina, and suprachiasmatic nuclei, suggesting a role in the control of circadian rhythm. This is supported by observations showing alterations in circadian behavior in RORβ-/- mice. RORγ, which is most highly expressed in the thymus, plays an important role in thymopoiesis. Thymocytes from RORγ-/- mice undergo accelerated apoptosis. The induction of apoptosis is, at least in part, due to a downregulation of the expression of the antiapoptotic gene Bcl-XL. In addition to the thymic phenotype, RORγ-/- mice lack lymph nodes, indicating that RORγ is essential for lymph node organogenesis. Overexpression of RORγ has been shown to inhibit T cell receptor-mediated apoptosis in T cell hybridomas and to repress the induction of Fas-ligand and interleukin 2. These studies demonstrate that RORs play critical roles in the regulation of a variety of physiological processes. Further characterization of the mechanisms of action of RORs will not only lead to the identification of ROR target genes and provide additional insight into their normal physiological functions, but will also determine their roles in disease. © 2001 Academic Press.


I. Introduction

The nuclear hormone receptor superfamily consists of structurally related, ligand-dependent transcription factors (1-4). This family includes receptors for steroid hormones, retinoic acid, thyroid hormone, vitamin D3, eicosanoids, and bile acids (1, 5-7). In addition, a large number of genes have been cloned that encode orphan receptors, that is, receptors for which regulatory ligands have not yet been identified.

Nuclear receptors share a common modular structure composed of several domains: the amino-terminal domain (A/B region), the DNA-binding domain (DBD or C region), the ligand-binding domain (LBD or E region), and a flexible hinge domain (D region) connecting the DBD and LBD (1, 4, 8). Some receptors contain an extensive carboxyl-terminal domain for which a function has not yet been clearly established. In certain receptors, the amino-terminal domain contains a ligand-independent transactivation function (AF-1). This domain can also influence the affinity with which receptors bind DNA elements (9). The DBD, which is the most highly conserved region among nuclear receptors, targets the receptor to specific DNA sequences known as hormone response elements (REs). These REs are usually located in the upstream promoter region of target genes. The DBD encompasses two "zinc-finger" motifs, each containing an α-helix referred to as a P- or D-box. The P-box makes specific base contacts between the receptor and the major groove of the DNA helix, while the D-box is involved in protein-protein interactions, particularly in homo- and heterodimerization of nuclear receptors. The carboxyl-terminal extension (CTE), a region adjacent to the DBD, is also highly conserved among members of each nuclear receptor subfamily; this region influences the RE-binding affinity of the receptor. The nonconserved hinge domain can have multiple functions in repression and activation.

The LBD combines several important functions. In addition to forming a ligand-binding pocket, it contains regions that are critical in repression, activation, nuclear localization, and dimerization (4, 7). In certain receptors (e.g., the estrogen receptor) the LBD is involved in interactions with heat-shock proteins (10). Analysis of the crystal structure of the LBD of several receptors revealed a very similar canonical structure consisting of 11-12 helical regions (11, 12). Helices 3-5 play an essential role in the transcriptional regulation by nuclear receptors in that they provide the interaction surface for several coactivators and corepressors. Helix 12 contains the core motif of the transactivation function 2 (AF-2) and is critical in the control of transcriptional activity of nuclear receptors. The conformation of the agonist-bound (holo) receptor has been reported to differ significantly from that of the unliganded (apo) receptor. For example, in retinoid and PPAR receptors (11, 12), ligand binding induces an extensive shift in the position of helix 12, resulting in the dissociation of a multimeric corepressor complex that consists of corepressors, histone deacetylases, and other cofactors (13-15). This conformational change promotes the formation of a large multimeric coactivator complex containing coactivators, histone acetylases, and additional cofactors (11-13, 15, 16). The latter enzymes cause acetylation of nucleosomal histones and local remodeling of chromatin structure. Interaction of these multimeric complexes with the basal transcriptional machinery results in the activation of RNA polymerase II and enhanced transcription of target genes. However, nuclear receptors can influence gene expression by a number of other mechanisms, such as the inhibition of NF-κB- or AP-1-mediated transcription by glucocorticoid and retinoid receptors (17, 18).

Members of the nuclear hormone receptor superfamily have been reported to regulate a variety of physiological processes, including many aspects of embryonic development, differentiation, proliferation, homeostasis, and metabolism (3, 5-7, 19, 20). Genetic alterations and changes in the expression of several receptors have been implicated in a number of pathological conditions (5, 6, 21, 22). Identification of natural and synthetic agonists and antagonists has made it possible to interfere in normal as well as pathological processes and has led to novel strategies in drug development and new therapies for a variety of illnesses, including cancer and diabetes (22-24).

The retinoid-related orphan receptors (RORs) α, β, and γ, initially referred to as RZRs and named NR1F1, -2, and -3, respectively, by the Nuclear Receptor Nomenclature Committee, constitute one subfamily of nuclear orphan receptors (9, 25-28). In this chapter, we analyze and compare the structure, mechanism of action, and functions of this subfamily of nuclear receptors.

II. Cloning and Expression Pattern of RORs

A. RORα

ROR receptors were identified as a result of different strategies to clone novel members of the nuclear receptor superfamily. Nuclear receptors are particularly highly conserved in the two zinc fingers of the DBD. Using two degenerate primers, the sequences of which were based on the two most highly conserved DBD regions, and a template of poly(A)+ RNA from a variety of tissues, PCR amplification has led to the cloning of DBDs of many novel orphan receptors, including RORs (3, 8, 25, 27, 29). 5'-RACE and cDNA library screening have subsequently been used to obtain their respective full-length coding regions. hRORα, initially referred to as hRZRα, was the first member of the ROR subfamily to be cloned in this way from the RNA of human umbilical vein endothelial cells (25). Several cDNAs encoding multiple isoforms of hRORα were isolated by screening human retina and testis λgt11 cDNA libraries (30). Four different RORα RNA species (α1-4) have been identified in humans, while in mice only two isoforms, α1 and α4, have been detected (30, 31). These isoforms share the same DBD, hinge, and LBD regions but display different amino-terminal domains. These isoforms, which are generated by a combination of alternative promoter usage and exon splicing, have been reported to differ in their DNA-binding specificities and pattern of expression, and therefore regulate different physiological processes and target genes (30).

RORα mRNA has been detected in many tissues, including heart, brain, skin, muscle, lung, spleen, testis, ovary, thymus, and peripheral blood leukocytes (25). Peripheral blood leukocytes contain the highest level of RORα mRNA. In most tissues the predominant transcript is about 15 kb. Some tissues, including lung, testis, liver, and leukocytes, contain additional transcripts, 7.5, 5.5, and 2.3 kb in size, which may be generated by the use of alternative polyadenylation signals. Most mouse tissues, including skin, lung, kidney, thymus, and leukocytes, contain only RORα4 transcripts, while RORα2 and -α3 mRNA are exclusively detected in testis (31, 32). Mouse cerebellum, where RORα mRNA localizes only to the Purkinje neuronal cells, expresses both RORα1 and RORα4 transcripts. These cells arise from the proliferative zone above the fourth ventricle beginning on day 13 of murine embryogenesis and migrate along the glia from day 14 through 17. In situ hybridization of sections of E14 embryos revealed high expression of RORα in Purkinje precursor cells in the cerebellar anlage (33). The ataxia displayed by RORα-deficient mice is related to abnormalities in Purkinje cell differentiation (32, 33). RORα mRNA is also expressed in the thalamus and in the suprachiasmatic nuclei of the hypothalamus.

In the testis, RORα expression is observed only after sexual maturation and is localized specifically to the peritubular cells (32). Expression of RORα is also observed in the epithelial layer of the epididymis. In the skin, RORα is localized to the hair follicles, epidermis, and sebaceous glands. In the growing hair follicle (anagen stage) RORα expression is restricted to a discrete set of differentiating keratinocytes. Similarly, RORα is expressed in the differentiated, suprabasal layers of the epidermis. The latter indicates a role for RORα in the regulation of gene expression during epidermal differentiation.
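The degenerate-primer strategy described at the start of this section can be illustrated with a short sketch. It expands a primer written in IUPAC ambiguity codes into the individual sequences it represents and counts them. The primer used here is a hypothetical example chosen to encode the conserved DBD peptide Cys-Glu-Gly-Cys-Lys; it is not one of the primers used in the cited studies, and the function names are likewise illustrative.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count

def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]

# Hypothetical primer (assumption, for illustration only):
# TGY GAR GGN TGY AA -> Cys-Glu-Gly-Cys-Lys(first two bases)
EXAMPLE_PRIMER = "TGYGARGGNTGYAA"

if __name__ == "__main__":
    print(degeneracy(EXAMPLE_PRIMER), "sequences in the primer pool")
    print(expand("AYG"))
```

A highly degenerate pool covers more possible receptor genes but dilutes each individual primer, which is one reason such primers are usually placed on the most conserved residues.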

B. RORβ

RORβ was originally cloned using a similar PCR strategy with RNA isolated from rat brain (26). RORβ mRNA expression is much more restricted than that of RORα and is most abundant in brain, pineal gland, and eye (34). In situ hybridization studies showed that RORβ mRNA expression localizes particularly to several regions of the central nervous system (34, 35). RORβ mRNA has been detected in the nonpyramidal neurons of layers IV and V of the cerebral cortex and is most highly expressed in primary sensory cortices, particularly the primary visual, auditory, somatosensory, and motor cortex. In the hypothalamus RORβ mRNA was found to be most abundant in the suprachiasmatic nuclei. RORβ could not be detected in the hippocampus, striatum, cerebellum, the ventral part of the spinal cord, or the motor nuclei of the cranial nerves. In the spinal cord, RORβ localizes to layers of the dorsal horn that receive sensory input from the periphery (35). Developmental regulation of RORβ has been observed in the adenohypophysis (Rathke's pouch), in which RORβ is expressed highly during early development but at a low level in the adult; the reverse is true for the cerebral cortex (35). In situ hybridization has localized RORβ mRNA to the retina (in the retinal photoreceptor layer) and to the pineal gland, the principal site of melatonin synthesis. In the retina, the expression of RORβ in the inner and outer nuclear layer is highly regulated during development.

Thus far, two different isoforms, RORβ1 and -β2, have been identified, which are likely derived by transcription from alternative promoters (36). The two RORβ proteins differ only in their amino-terminal sequence and exhibit a different pattern of expression. Expression of RORβ2 is restricted to the pineal gland and retina, while RORβ1 is expressed highly in cerebral cortex, hypothalamus, and thalamus, and at low levels in the pineal gland and retina. RORβ2 mRNA expression in the pineal gland and retina has been reported to oscillate dramatically and to change as a function of the circadian rhythm (35, 37). Pineal glands from daytime animals contain the 10-kb RORβ1 mRNA transcript, while the pineal glands from nocturnal animals also express the 1.5-kb RORβ2 mRNA transcript (36). RORβ expression does not change in the suprachiasmatic nuclei or elsewhere (35). RORβ expression in the pineal gland has been reported to be under photoneural regulation, which involves an adrenergic and cAMP-dependent mechanism (37).

The distribution pattern of RORβ indicates that RORβ is most highly expressed in tissues involved in processing sensory information and in anatomical components implicated in the regulation of circadian rhythm. The latter is supported by observations showing fluctuations in RORβ2 mRNA expression with circadian changes and suggests that the RORβ2 promoter is controlled by the circadian clock (35, 37). Thus, RORβ2 may regulate genes encoding proteins involved in the regulation of the processing of sensory information and circadian rhythm. As discussed below, abnormalities in circadian behavior observed in RORβ-/- mice are in agreement with such a hypothesis.

C. RORγ

hRORγ was first cloned by PCR using poly(A)+ RNA from human pancreas and two degenerate primers, the sequences of which were based on the two most highly conserved regions in the DBDs of the RAR and RXR receptors (27). The murine homolog of RORγ, also named TOR, was cloned by screening a mouse muscle (28) and a T cell cDNA library (38). Two different isoforms, referred to as RORγ1 and RORγ2 (also named RORγt), have been identified (39). RORγ2 lacks an A/B domain and therefore is a truncated form of RORγ1; it is derived by transcription from an alternative promoter (40). Northern blot analysis indicated that RORγ generates two mRNAs of different size, 2.4 kb and 3.5 kb. These different-size transcripts are derived from the use of two alternative polyadenylation signals (28).

RORγ1 has been identified in many tissues and is most highly expressed in the thymus, skeletal muscle, liver, mammary gland, and kidney. It is also highly expressed in brown fat tissue but not in white fat tissue, suggesting a possible role in the regulation of brown fat-specific genes (27, 28, 38). However, both RORγ and RORα have been shown to be induced during adipocyte differentiation in cultured 3T3-L1 and D1 preadipocytes, which function as in vitro models for white fat cell differentiation (41). The cytokines TNF-α and TGF-β1, which inhibit adipocyte differentiation, also suppress the induction of RORα and -γ. What function RORγ has in fat cell differentiation awaits further study.

RORγ1 transcripts can be found in all tissues where RORγ is expressed, but RORγ2 transcripts are restricted to the thymus. In the thymus RORγ2 is most highly expressed in double-positive (DP) CD4+CD8+ thymocytes but not in mature, single-positive (SP) CD4+ or CD8+ thymocytes or in thymic epithelial cells (39, 40, 42). RORγ2 mRNA is also found in immature, double-negative (DN) CD44+CD25- cells but not in other subpopulations of DN thymocytes. These observations indicate that RORγ2 expression is tightly controlled during thymopoiesis and suggest that RORγ2 regulates gene expression at discrete stages of T cell development. Expression of RORγ has also been observed in the murine thymocyte-like cell line S49, in a number of T cell lymphomas (including mouse EL-4 and YAC-1 cells), and in human cutaneous T cell lymphoma HUT78 (38). RORγ is undetectable in spleen, bone marrow, natural killer (NK) cells, and B lymphocytes. No expression of RORγ mRNA was found in several B cell lymphomas and monocytic cell lines. In E14.5 embryos, RORγ has been found in regions where lymph nodes develop; particularly, CD3-CD4+CD45+IL-7Rα+ lymph node precursor cells express high levels of RORγ mRNA (43). These findings suggest a role for RORγ in lymph node development. The latter is supported by the observed absence of lymph nodes in RORγ-/- mice (43, 44).

D. Insect Homologs of ROR

DHR3 and MHR3 are genes identified in Drosophila melanogaster and Manduca sexta, respectively; these genes encode transcription factors structurally related to the nuclear hormone receptor family (45-52). Sequence comparison has indicated that these genes are most closely related to the ROR subfamily and may represent the insect homologs of ROR. In particular, their DBD and AF-2 regions show high homology with those of RORs. Recent studies have demonstrated that DHR3 is required for the prepupal-pupal transition and differentiation of adult structures during Drosophila metamorphosis (45, 53). Mutant DHR3 has been shown to cause defects in pattern formation of the peripheral nervous system (47). These studies indicate that DHR3 plays an important role in the regulation of normal Drosophila development.

20-Hydroxyecdysone, a hormone that controls insect molting and metamorphosis, has been demonstrated to induce DHR3 and MHR3 expression (53). This induction is not immediate and requires protein synthesis. Although several ecdysone REs have been identified in the promoter of MHR3, only one of the three putative ecdysone REs has been found to bind a heterodimeric complex consisting of the ecdysone receptor EcR-B1 and the RXR homolog USP-1 (51). Future studies must determine whether this binding site has any functional role in the regulation of DHR3 and MHR3 by 20-hydroxyecdysone in vivo or whether this control occurs via an indirect mechanism.

III. Structure of ROR Proteins

The RORs have a domain structure very similar to that of other members of the nuclear receptor family and contain an amino-terminal domain, DBD, hinge domain, and LBD (1, 25-28, 30, 38). The different isoforms generated by each ROR gene differ only in their amino-terminal sequence (30, 36, 39, 40). A schematic comparison of the different domains in human RORα1, -β1, and -γ1 and those of several other nuclear receptors is shown in Fig. 1. RORγ1 exhibits a 54% and 51.5% identity with RORα1 and RORβ1, respectively, while RORβ1 and DHR3 are, respectively, 64.6% and 38.7% identical to RORα1. The DBD is the most highly conserved domain among RORs. The DBDs of RORβ and RORγ are, respectively, 91% and 88% identical to the DBD of RORα. The DBD of the Drosophila homolog DHR3 is 77% homologous to that of RORα. The DBD of the RAR receptor shares the next highest (68%) identity with those of RORs, while the DBDs of all other receptors exhibit less homology. The hinge domains of RORs exhibit little homology, while the LBDs of RORs are moderately conserved. The LBDs of RORβ and RORγ exhibit, respectively, a 63% and a 58% identity with the LBD of RORα. Among vertebrate receptors, the LBDs of RORs are most closely related to those of Rev-Erb and T3R receptors, exhibiting 35-40% identities among one another.

FIG. 1. Comparison of the modular structures of RORs with those of several other nuclear receptors. The percentages indicate the percent homology of the respective DBD or LBD with those of the RORα1 receptor. The AF-2 regions are indicated by black boxes. DHR3, Drosophila homolog of RORs; RAR, retinoic acid receptor; RXR, retinoid X receptor; PPAR, peroxisome proliferator-activated receptor; T3R, thyroid hormone receptor.

A comparison of the amino acid sequences of RORα1, RORβ1, and RORγ1 is shown in Fig. 2. In addition to the zinc-finger region of the DBD, the carboxyl-terminal extension (CTE) of the DBD is also highly conserved. The CTE has been shown to play a role in determining the affinity of RORα for the RORE (9), as discussed below in more detail. As has been demonstrated for other nuclear receptors (11, 12), the LBD of RORs contains 12 α-helical regions. Two regions in the LBD, one comprising helices 3-5 and the other helix 12, are particularly highly conserved among RORs. Helices 3-5 form the interaction surface for several coactivators and corepressors. The helix 12 region, consisting of the sequence PPLYKELF at the carboxyl terminus, is absolutely conserved among the three RORs and contains the consensus AF-2 motif ΦΦXEΦΦ (Φ represents a hydrophobic amino acid, and X is any amino acid) (54). The AF-2 in the Drosophila homolog DHR3 differs from that of ROR in only one amino acid. As will be discussed below, the AF-2 domain has a critical function in controlling the interaction of RORs with corepressors and coactivators, and hence the activity of RORs.

FIG. 2. Amino acid sequence comparison of human RORα1, -β1, and -γ1. Amino acids in RORβ1 and -γ1 that are identical to those in RORα1 are indicated by dashed lines; gaps are indicated by dotted lines. The DBDs and CTEs are most highly conserved among RORs. The cysteines that are part of the two zinc-finger motifs in the DBD are underlined and bold. The 12 helices (H1-12) of the LBD are indicated. H3-5 and H12 exhibit the highest degree of homology among RORs. H12 contains the AF-2 consensus motif ΦΦXEΦΦ.
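The AF-2 consensus described above (two hydrophobic residues, any residue, a glutamate, then two more hydrophobic residues) can be expressed as a simple pattern search. The following is a minimal sketch, not a validated motif scanner: the set of residues counted as hydrophobic and the function name are assumptions made for illustration, and different hydrophobicity definitions would give slightly different matches.

```python
import re

# Assumed set of hydrophobic residues (one-letter codes); this choice is
# an illustration and is not defined in the text.
HYDROPHOBIC = "AVILMFWYC"

# Consensus: hydrophobic, hydrophobic, any residue, Glu, hydrophobic, hydrophobic.
AF2_PATTERN = re.compile(
    f"[{HYDROPHOBIC}]{{2}}.E[{HYDROPHOBIC}]{{2}}"
)

def find_af2(protein_seq):
    """Return (0-based start, matched residues) for each AF-2-like match."""
    return [(m.start(), m.group())
            for m in AF2_PATTERN.finditer(protein_seq.upper())]

if __name__ == "__main__":
    # Carboxyl-terminal helix 12 sequence quoted in the text.
    print(find_af2("PPLYKELF"))
```

Applied to PPLYKELF, the pattern matches LYKELF, with the conserved glutamate in the fourth position of the motif.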

IV. Characterization of ROR Response Elements

The characteristics of the interactions of nuclear receptors with REs can vary substantially among receptors. A nuclear receptor can bind an RE as a monomer, as a homodimer, or as part of a heterodimer. Formation of heterodimeric complexes is usually in partnership with one of the retinoid X receptors (RXRs). Dimeric complexes can interact with direct, everted, or inverted (palindromic) repeats of the core motif AGGTCA spaced by 0-7 nucleotides (3, 7). Monomeric receptors bind to REs containing variations of the single core motif. To define the consensus sequence of hormone response elements that bind RORs, an electrophoretic mobility shift assay (EMSA)/PCR-based strategy was used that selects for oligonucleotides with the highest affinity for ROR from a pool of degenerate oligonucleotides. These studies revealed that RORs bind with highest affinity to DNA elements, referred to as ROREs, consisting of the core motif AGGTCA preceded by an AT-rich sequence (26, 28, 30, 38, 55). Variations in the A/T-rich half of the RORE can greatly influence the binding of ROR, indicating the importance of this sequence in determining the affinity and specificity of ROR binding.
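The description of an RORE as the core motif AGGTCA preceded by an AT-rich sequence can be illustrated with a simple sequence scan. The following is a minimal sketch only: the function name `find_rore_candidates`, the 6-nucleotide flank length, the threshold of at least 5 A/T bases, and the example sequence are all hypothetical choices made for illustration and are not parameters taken from the cited binding-site selection studies. Only the given strand is scanned.

```python
import re

CORE = "AGGTCA"  # core half-site motif named in the text


def find_rore_candidates(seq, flank_len=6, min_at=5):
    """Return (start, site) pairs for AGGTCA cores with an AT-rich 5' flank.

    flank_len and min_at are illustrative assumptions, not published values.
    """
    seq = seq.upper()
    hits = []
    # A lookahead is used so that overlapping core matches are not skipped.
    for match in re.finditer("(?=" + CORE + ")", seq):
        core_start = match.start()
        if core_start < flank_len:
            continue  # not enough upstream sequence to evaluate the flank
        flank = seq[core_start - flank_len:core_start]
        at_count = sum(base in "AT" for base in flank)
        if at_count >= min_at:
            hits.append((core_start - flank_len, flank + CORE))
    return hits


if __name__ == "__main__":
    # Hypothetical test sequence, not a sequence from any cited gene.
    print(find_rore_candidates("GGTATAACTAGGTCAGG"))
```

A scan of this kind only flags candidate sites; as noted above, variations in the A/T-rich half strongly affect binding, so a simple A/T count cannot substitute for measured affinities.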



Two-hybrid analysis and EMSA have shown that RORs bind to ROREs as monomers and do not form homodimers (26, 28, 30, 55). Although EMSA using a DR7 response element has demonstrated the formation of two ROR:nucleotide complexes representing the binding of either one or two RORs, the binding of the two ROR molecules appears to occur independently and does not involve dimerization of the two ROR proteins. A number of nuclear receptors have been reported to form heterodimeric complexes with RXRs; however, RORs have been found to be unable to heterodimerize with RXRs (26, 30). Deletion mutation analysis has demonstrated that the two zinc fingers are not sufficient for ROR binding to the RORE and that additional regions are required (9). To identify such regions, the effects of several amino- and carboxyl-terminal deletions on the binding of RORα1 to the RORE were examined. Deletion of the amino terminus of RORα1 up to Ser35 (Fig. 2) had little effect on the binding of RORα1 to the RORE; however, deletion of an additional 10 residues caused a dramatic reduction in binding, while deletion of another 10 amino acids did not further decrease binding. These results suggest that the region of RORα1 between Ser35 and Val45 is important for optimal binding. C-terminal deletions up to Gln166 had little effect on the binding of RORα1; however, deletion up to Lys150 totally abolished binding. These results indicate that deletion of the LBD does not affect binding in a major way, suggesting that the LBD is not required for optimal binding. However, the CTE, the region flanking the C-terminal side of the DBD from Met138 to Gln166, is critical for optimal RORα binding (9). The CTE is highly conserved among RORs (Fig. 2) as well as Rev-Erb receptors but shows little homology with CTEs of other receptors, including NGFI-B and SF-1, which likewise bind as monomers to similar REs (56-58).
Specific mutations within the CTE region totally abolished DNA binding of RORα, supporting its critical importance in ROR binding. Methylation interference studies suggested that the zinc fingers of RORα1 containing the P-box contact the major groove at the AGGTCA half of the RORE, while the CTE interacts with the adjacent minor groove at the 5'-A/T-rich half of the RORE (9). Although all ROR receptors bind REs consisting of the core motif AGGTCA preceded by an AT-rich motif, different isoforms exhibit distinct affinities for different ROREs, as has been demonstrated for RORα1 and -α2 and RORβ1 and -β2 (30, 36). Since the amino terminus is the only difference in amino acid sequence between ROR isoforms, this region is likely involved in influencing the RORE binding specificity of RORs. This was corroborated by experiments comparing the binding specificities of RORα2 mutants carrying various deletions in the amino terminus. These results showed that such mutations greatly influenced the binding of this receptor to ROREs (30). In addition, experiments using hybrid receptors in which the amino terminus of the thyroid hormone receptor β (T3Rβ) was replaced by the amino terminus of either RORα1


or -α2 showed that the ROR amino-terminal domains impose DNA-binding specificity upon the heterologous nuclear receptor. The mechanism by which the amino terminus controls the binding affinity for ROREs has not yet been fully established. Although the amino terminus could make contacts with the AT-rich region itself and provide an additional DNA-binding site for the receptor, experimental evidence appears not to support this concept. Based on circular permutation and methylation interference analysis of ROR-RORE complexes, it was proposed that changes in the amino terminus alter the tertiary structure of the DBD and adjacent CTE, thereby affecting their contacts with DNA (59). The latter may not only explain the differences in affinity of ROR isoforms for different ROREs but also provide a mechanism for differential regulation of target genes by ROR isoforms. In the case of RORγ, the γ2 isoform is a truncated form of γ1 and has only three additional amino acids upstream from the DBD (39, 40). Both isoforms are able to bind the consensus RORγ-RE with high affinity and to enhance RORE-dependent transactivation to a similar degree (60). These results suggest that the amino terminus is not a requirement for binding and transactivation. However, these observations do not rule out a role for the amino terminus in fine-tuning the binding specificity of RORγ, as has been reported for RORα (30). Some of the binding characteristics of RORs are shared with those of other nuclear receptors, such as Rev-Erbα and -β, SF-1, RTR, Nur77, and estrogen receptor-related receptors (ERRs) (57, 58, 61, 62). Therefore, these receptors could bind some of the same REs and compete with each other for binding. The type and extent of cross-talk between different receptor signaling pathways depend on whether the receptors are coexpressed in the same cell, the presence of their respective ligands and cofactors, and the affinities of the receptors for the same RE.
The orphan receptors Rev-Erbα and -β have been reported to act as dominant-negative repressors of transcription and can bind to some of the same REs to which RORα and RORγ bind. Therefore, by competing for the same DNA-binding site, Rev-Erb can inhibit the transcriptional activation by ROR (28, 41, 63-65). In the case of N-Myc, the reverse has been reported. The repression of N-Myc by Rev-Erbβ can be abrogated by expression of RORα through a mechanism that involves competition between ROR and Rev-Erbα for the same RE (65). A number of nuclear receptors bind as part of a heterodimer to REs consisting of a direct repeat (DR) spaced by 0-7 nucleotides. Depending on the specific sequence of these DRs, ROR has been found to be able to suppress the transcriptional activation mediated by some receptors by competing for binding to the same site. For example, the CRBPI gene contains a DR2 that is able to bind the RAR/RXR heterodimer. RORβ is able to bind this RE as well and competes with RAR/RXR for binding to this RE (26). Similar observations have been reported for RAREs and TREs (38).



V. Transcriptional Control by RORs

A. Ligand-Dependent or -Independent Activation?

Nuclear receptors can function as repressors as well as activators of transcription, and for many receptors these activities are controlled by ligands. Crystallographic studies with retinoid and PPAR receptors have demonstrated that ligand binding causes a change in the conformation of the receptor that results in the dissociation of a corepressor complex and the association of a coactivator complex (11-13, 15, 16). Through histone acetylation, the coactivator complex induces local changes in chromatin structure and mediates interaction of the receptor with the basal transcriptional machinery. Stimulation of RNA polymerase II activity then results in enhanced transcription of target genes. However, for certain receptors, such as the constitutive androstane receptor (CAR), ligand binding (in this case, androstanol) acts in the reverse manner and results in repression of target gene expression (66). With the discovery of nuclear orphan receptors, a number of questions have been raised about receptor activation by ligands. Do all nuclear receptors have ligands, or do certain receptors act as constitutive repressors or activators of transcription? Are certain receptors activated by mechanisms other than ligand binding, such as phosphorylation? Most, if not all, nuclear receptors appear to be phosphorylated. Alterations in phosphorylation can affect a receptor in a variety of ways, including modulation of its activity and protein stability. For example, phosphorylation of the PPARγ receptor at its amino terminus by a MAPK-activated signaling pathway has been shown to inhibit transcriptional activation by this receptor (67, 68). Mutation of a single tyrosine in the LBD of the estrogen receptor results in a constitutively active receptor (69), while phosphorylation of a serine residue in the hinge domain of SF-1 enhances the transactivation by this receptor (70).
Phosphorylation sites in the glucocorticoid receptor have been reported to be involved in the control of its stability (71). Like other nuclear receptors, RORs are likely phosphoproteins; however, the precise sites of phosphorylation have not yet been determined. RORα1 contains potential protein kinase C phosphorylation sites at Ser35 and Thr53 and a potential protein kinase A phosphorylation site at Ser49 (9). RORγ also contains several potential PKA and PKC phosphorylation sites. The AF-2 domains of RORs contain a Tyr residue that could be a target for phospho-Tyr kinases. As discussed below, mutation of this residue to Phe abrogates the interaction of RORγ with the steroid receptor coactivator-1 (SRC-1) and abolishes the transactivating activity of RORγ but has no effect on its interaction with the nuclear receptor corepressor (N-CoR) (60, 72, 73). These results indicate the importance of this residue in the activation function of RORγ. Although this Tyr may have only a structural role, its phosphorylation could control the


activation of ROR by inducing a conformational change in its LBD and promoting the association of coactivators, such as SRC-1. Future studies have to determine whether phosphorylation of this Tyr plays any role in regulating the activity of RORs. RORs are considered orphan receptors because it is not known whether their activity is regulated by ligands. Reporter gene assays, in which an ROR expression vector and an RORE-dependent reporter gene plasmid are cotransfected into mammalian cells, have demonstrated that RORs are potent activators of transcription in many cell types (26, 28, 38, 55). In general, it appears that one RORE is insufficient to induce ROR-mediated transactivation and that two or more ROREs are required to obtain optimal transcriptional activation. The absence of fetal calf serum from the medium does not influence transactivation by RORs, suggesting that potential ligands in serum are not required for ROR-mediated transactivation (38). Interestingly, transcriptional activation by both RORα and RORβ has been reported to be cell type-dependent. RORβ can increase RORE-dependent transcription in neuronal cells but not in nonneuronal cells (55), while RORα-mediated transcriptional activation has been observed in human choriocarcinoma JEG-3 cells but not in human kidney 293T cells (74). This apparent cell type-specific transactivation by RORs could be due to the cell type-specific expression or activation of one or more coactivators. Alternatively, the activation of ROR itself could be cell type-specific and depend on the cell type-specific synthesis of a ligand or phosphorylation of ROR or cofactors by a cell type-specific kinase. Since RORβ is highly expressed in the pineal gland, the principal source of melatonin, it has been hypothesized that melatonin could be a ligand for RORs. Initial studies reported that melatonin was able to bind to RORβ and to enhance the transcriptional activation by RORβ (34, 75).
However, subsequent studies by several laboratories were unable to demonstrate binding and activation by melatonin (55, 76) (A. M. Jetten, unpublished results), indicating that melatonin does not function as a ligand for RORs. Several thiazolidine derivatives, including CGP52608, which has potent antiarthritic activity, have been reported to specifically enhance the transactivation by RORα and RORβ (77, 78). However, further analysis is needed to confirm these agents as true synthetic ligands for RORs. Therefore, the question remains: Do RORs act as constitutively active receptors or are their activities controlled by a ligand-dependent or -independent mechanism?

B. Interaction of ROR with Corepressors and Coactivators

Transcriptional activation by nuclear receptors is mediated through interaction with multiprotein coactivator complexes that consist of histone acetylases, coactivators, and other cofactors (13, 15, 16). Recently, an increasing number


of cofactors have been identified, including SRC-1, glucocorticoid receptor-interacting protein-1 (GRIP-1, also known as TIF-2 or N-CoA-2), receptor-interacting protein 140 (RIP-140), T3R-interacting proteins (TRIPs), T3R-associated proteins (TRAPs), and cAMP response element-binding protein (CBP) (13, 15, 16). Some of these cofactors bind to a limited number of nuclear receptors whereas others exhibit a low specificity. Mammalian two-hybrid analysis has demonstrated that RORα can interact with coactivators TRIP-1, transcription intermediary protein-1 (TIF-1), TRIP230, peroxisome proliferator-binding protein (PBP; also named hTRAP220), and GRIP-1 (79). RORγ can interact with several coactivators, including SRC-1 and CBP (60). Pulldown analyses have demonstrated that these coactivators physically interact with RORs. In addition, SRC-1, GRIP-1, and CBP are able to enhance ROR-mediated transcription, suggesting a physiological role for these coactivators in the induction of gene expression by RORs (Fig. 3). GRIP-1 and

FIG. 3. Model of transcriptional activation by RORs and the potential role of CaMKIV. RORs bind as monomers to ROR-response elements (ROREs) consisting of the consensus core motif AGGTCA preceded by an AT-rich sequence. In the transcriptionally inactive form, RORs interact with a corepressor complex and repress transcription. Ligand binding and/or phosphorylation induce(s) changes in the conformation of ROR, causing dissociation of the corepressor complex and association of a coactivator complex. The corepressor RIP-140 may compete with coactivators for binding to ROR, thereby inhibiting transactivation. Signaling pathways that increase Ca2+ result in the activation of CaMKIV and the subsequent phosphorylation of one or more nuclear cofactors. Phosphorylation of such cofactors may increase their affinity for ROR, promote the assembly of specific coactivator complexes, and induce ROR-mediated transactivation. An asterisk indicates an activated or phosphorylated form.


CBP exhibit intrinsic histone acetylase activity that leads to acetylation of nucleosomal histones, opening of the chromatin structure, and subsequently enhanced transcription. RORγ also interacts with RIP-140 (60), which has been reported to function as a corepressor as well as a coactivator (80, 81). RIP-140 was shown to suppress RORE-dependent transactivation by RORγ (60). Similarly, PBP decreased rather than increased the transactivation by RORα (79). These observations indicate that RIP-140 and possibly PBP function as repressors of ROR-mediated transcriptional activation, likely by competing with coactivators for ROR binding (Figs. 3 and 4). Although RORγ is a very effective inducer of transcription, it is also able to interact with the corepressor N-CoR in both two-hybrid and pulldown analyses (60). RORγ is unable to interact with the corepressor SMRT (silencing mediator for retinoic acid and thyroid hormone receptor). Thus, RORγ can interact with both the corepressor N-CoR and the coactivator SRC-1. Studies with several other nuclear receptors have demonstrated that, upon ligand binding, the LBD of the receptor undergoes a conformational change. The apo-receptor is usually transcriptionally inactive and permits binding of corepressors, while the conformation of the holo-receptor promotes interaction with coactivators (11-13, 15, 16). It appears unlikely that ROR displays only one conformational state that enables it to interact with corepressors as well as coactivators. It is more likely that ROR can assume two or more different conformations (Fig. 4); one conformation allows interaction of RORγ with corepressors, such as N-CoR, while another conformation permits association with coactivators, such as SRC-1, or the corepressor RIP-140. This interpretation implies that RORs are not constitutive activators of transcription but that their activities are regulated by some mechanism.
Although the shift between different conformations of ROR could be independent of ligand or phosphorylation and occur as part of a constant thermodynamic equilibrium (as in the shift between conformations I and III; Fig. 4), it appears more likely that the transition between different conformations is controlled by ligand binding or through phosphorylation by a protein kinase (as in the shift between conformations I and II; Fig. 4). A number of different regions in the nuclear receptor have been implicated in the interactions of receptors with corepressors and coactivators. Some of these regions serve as an interaction surface, while others control binding through conformational changes in the LBD. Deletion and point mutation analyses were carried out to identify the regions important in the interaction of RORs with corepressors and coactivators. Amino-terminal deletion up to Q221 of RORγ1 largely abolishes the interaction with N-CoR but has little effect on the binding of SRC-1 and RIP-140 (60). These results suggest that the amino-terminal region of the hinge domain is important for the binding of N-CoR. It is, however, not required for the interaction of ROR with SRC-1 or RIP-140. The hinge domain has also been implicated in the binding of N-CoR to other receptors,


FIG. 4. The transcriptional activity of RORs is dependent on different conformations. ROR in conformation I can interact with the corepressor N-CoR and acts as an active repressor, while conformations II and III promote interaction with SRC-1 and RIP-140, resulting in activation or repression of transcription, respectively. Although the shift between two conformations (e.g., I and III) may not require ligand binding or phosphorylation of ROR, it appears more likely that the transition is regulated by ligand binding and/or phosphorylation by a specific protein kinase (as in the shift between I and II). The point mutation Y501F in hRORγ1 abolishes binding of SRC-1 and RIP-140 but does not affect its interaction with N-CoR. This mutation may retain RORγ in a conformation similar to I and make RORγ behave as a constitutive repressor. The mutation E503Q abolishes binding of hRORγ to SRC-1, RIP-140, and N-CoR and may represent another conformation (IV) of RORγ1 which functions as a passive repressor (16).

but the regions within the hinge domain required for this interaction vary among receptors and do not exhibit any sequence similarities (72). It appears that these regions in the hinge domain are structurally important instead of providing an interaction surface for N-CoR. Helix 12 and helices 3-5 in the LBD of nuclear receptors have been reported to be critical elements in the binding of coactivators and corepressors


(15, 16, 82). Helices 3-5 are part of the interaction surface for the LXXLL motif in coactivators, such as SRC-1 and CBP, as well as for certain corepressors (83-85). This region is moderately conserved among ROR receptors (see Fig. 2). As expected, deletion of this region totally abolishes the ability of ROR to bind the coactivators SRC-1, CBP, and GRIP-1, and the corepressor RIP-140 (60, 79). In addition, RORα(V335R), containing a point mutation in helix 3, no longer interacts with either GRIP-1 or PBP. The helix 12 region constitutes the carboxyl-terminal end of RORs (Fig. 2). The amino acid sequence of helix 12 contains the nuclear receptor AF-2 consensus sequence ΦΦXE/DΦΦ (54). This region has been demonstrated to play a critical role in controlling the binding of coactivators and hence the activity of the receptor (82, 86, 87). The role that helix 12 plays in the interaction of nuclear receptors with corepressors is somewhat different for each receptor. Deletion of the AF-2 region in RORγ or RORα completely abolishes its interaction with SRC-1, CBP, GRIP-1, PBP, N-CoR, and RIP-140 (60, 79). The AF-2 point mutation Y501F abolishes the binding of RORγ1 to SRC-1 and RIP-140 but does not affect the interaction with N-CoR (60). This mutation may retain RORγ in an inactive conformation (similar to conformation I in Fig. 4), making it behave as a constitutive repressor. Whether Tyr501 has only a structural role or whether its potential phosphorylation can modulate the conformation and activity of RORs has yet to be established. The AF-2 mutation E503Q abolishes the binding of RORγ to SRC-1, RIP-140, and N-CoR, indicating that this mutation induces a change in conformation of the LBD (as in conformation IV in Fig. 4) that does not allow interaction with any of these three proteins. The fact that different mutations affect the binding of SRC-1 and N-CoR differently suggests that each mutation induces a different conformational change in the LBD of RORγ (Fig. 4).
Recently, using RORβ as a bait in yeast two-hybrid screening, a novel protein referred to as neuronal interacting factor X1 (NIX1) was identified (88). In addition to binding RORβ, NIX1 was also able to interact with ligand-bound RAR and T3R but not with RXR or several steroid hormone receptors. NIX1 is exclusively expressed in brain, with significant expression in the dentate gyrus of the hippocampus and in the thalamus, hypothalamus, and brainstem nuclei. NIX1 is a 27-kDa nuclear protein that contains two LXXLL motifs. These motifs are found in many coactivators and are critical elements in receptor-cofactor interactions. The AF-2 of RORβ is required for NIX1 binding, and only one of the LXXLL motifs in NIX1 is necessary for binding RORβ. No intrinsic transcriptional activity is associated with NIX1, and like RIP-140, it inhibits transactivation by RORβ, possibly by competing with coactivators for receptor binding. Two-hybrid analysis identified the nucleoside diphosphate kinase NM23 and the coactivator TRIP-1 as proteins interacting with RORβ (89). NM23 has been reported to play a role in organogenesis and differentiation, and its expression



is inversely related to metastasis. Pulldown analysis confirmed interactions of NM23 with RORα or -β. However, whether these interactions have any physiological significance has yet to be established.
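The LXXLL motif referred to throughout this section (L is leucine, X is any amino acid) is a simple sequence pattern that can be located in a protein sequence by direct scanning. The following is a minimal sketch only: the function name `find_lxxll` and the example peptide are hypothetical and introduced for illustration; the peptide is not the sequence of NIX1 or of any cofactor named above.

```python
import re


def find_lxxll(protein):
    """Return (start, motif) pairs for every LXXLL occurrence, overlaps included."""
    protein = protein.upper()
    # The lookahead with a capture group reports overlapping matches as well.
    pattern = re.compile(r"(?=(L[A-Z]{2}LL))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(protein)]


if __name__ == "__main__":
    # Hypothetical peptide used only to exercise the function.
    print(find_lxxll("MKLHRLLQDS"))
    # The conserved ROR helix 12 sequence quoted in the text has no LXXLL motif.
    print(find_lxxll("PPLYKELF"))
```

The presence of an LXXLL pattern alone does not establish a functional receptor-interaction motif; as described above for NIX1, only one of its two LXXLL motifs is needed for binding RORβ.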

C. Regulation of ROR-Dependent Transactivation by CaMKIV

CaMKIV is a multifunctional Ser/Thr protein kinase that can phosphorylate a variety of substrates (90). CaMKIV is expressed in several tissues, including brain, T lymphocytes, and testis, where it is found in spermatogonia and spermatids (90, 91). CaMKIV is rapidly activated upon elevation of the intracellular Ca2+ concentration and is predominantly localized to the nucleus. Its nuclear localization suggested a possible role for CaMKIV in the regulation of transcription. This was supported by reports showing that several transcription factors, including cAMP response element-binding protein (CREB), activating transcription factor-1 (ATF-1), and serum response factor (SRF), are targets for CaMKIV phosphorylation (92, 93). Recent studies have shown that CaMKIV can also enhance transcriptional activation mediated by members of the ROR family (94). Cotransfection of expression vectors encoding RORα and a Ca2+/calmodulin-independent form of CaMKIV enhanced RORE-dependent transcriptional activation of a reporter gene 20- to 30-fold. Cotransfection of a catalytically inactive CaMKIV had no effect (94). Stimulation of ROR-mediated transactivation was also observed in epidermal HaCaT cells after activation of endogenous CaMKIV by the Ca2+ ionophore ionomycin. CaMKIV was able to enhance not only transcriptional activation mediated by RORα1, but also that by RORα2 and RORγ and, to a lesser extent, that by COUP-TF1. CaMKIV did not increase T3Rα- or ER-mediated transactivation, indicating that this type of activation is limited to a distinct group of nuclear receptors. Stimulation of ROR-mediated transactivation was also observed with CaMKI but not with CaMKII. Deletion studies demonstrated that the LBD of RORα is required for the CaMKIV-induced activation.
Although RORα contains two putative CaMKIV phosphorylation sites at the amino terminus, mutation analysis indicated that these sites are not involved in CaMKIV-induced transactivation. In addition, CaMKIV was unable to phosphorylate in vitro transcribed RORα (94). These observations suggest that the increase in ROR-mediated transactivation by CaMKIV may involve phosphorylation of other proteins. Since transactivation by RORs is mediated through interactions with other nuclear proteins, such cofactors may be putative targets for CaMKIV phosphorylation. Alternatively, CaMKIV-stimulated transactivation could result from modification of a biosynthetic enzyme involved in the production of ROR ligands or activation of another kinase.


As discussed above, RORs can interact with several nuclear cofactors, including SRC-1, GRIP-1, CBP, and p300 (60, 79, 89). Any of these cofactors could potentially be involved in the CaMKIV-induced transactivation by RORs. These coactivators interact with the LBD of nuclear receptors through their signature LXXLL motifs (13). Two-hybrid analyses examining the interaction of VP16-RORα(LBD) with a series of Gal4(DBD)-peptides containing various LXXLL motifs showed that several peptides containing the consensus HVXXHPLLΦXLL are able to bind RORα (94). Constitutively active CaMKIV dramatically enhances transactivation in this two-hybrid system. These peptides are also able to inhibit CaMKIV-stimulated, RORE-dependent transactivation by RORα1 and RORγ. The sequence HVXXHPLLΦXLL has not yet been identified in any known cofactor, suggesting that a novel, as yet unidentified, cofactor may mediate ROR-dependent transactivation. Since CaMKIV is a Ca2+-dependent kinase, one could hypothesize that the transcriptional activation by RORs may be modulated by Ca2+ influx through the activation of CaMKIV. Therefore, signaling pathways that induce Ca2+ influx should be able to dramatically enhance ROR-dependent transactivation in cells that express both RORs and CaMKIV (94). Figure 3 shows a putative model of the mechanism of ROR-mediated transcriptional activation and the potential role of CaMKIV. It is interesting to note that CaMKIV is expressed in several tissues, including the cerebellum, retina, and thymus, where RORs control important functions (27, 32, 36, 39, 90). In addition, CaMKIV-/- mice exhibit several phenotypic changes similar to those observed in RORα-/- mice (94). However, in contrast to ROR-knockout mice, spermatogenesis is greatly affected in CaMKIV-/- mice and the mice are infertile (95), suggesting that this phenotype involves alterations in signaling pathways other than RORs.
Thymocytes from mice expressing a catalytically inactive CaMKIV undergo rapid cell death when placed in culture, as do thymocytes from RORγ-/- mice (43, 44, 91). These observations further support a possible link between CaMKIV and ROR-mediated transcriptional activation, at least in the regulation of certain biological processes.

VI. Genomic Structure and Chromosomal Localization

The genomic structure of the RORγ gene was determined from a P1 vector clone containing the entire mouse RORγ gene (96). A schematic representation of the RORγ genomic structure is shown in Fig. 5. The mouse RORγ gene spans more than 21 kb and consists of 12 exons separated by 11 introns. As mentioned above, the RORγ gene generates two isoforms that differ in their amino termini. The amino terminus of RORγ1 is encoded by two exons, 1a and 2, while that of

FIG. 5. (A) Schematic presentation of the genomic structure of the mouse RORγ gene. The RORγ gene consists of 12 exons (black boxes). The start and stop codons are indicated. (B) Comparison of the structure of RORγ1 and RORγ2 mRNA and protein. Sparse stippling indicates 5'- or 3'-UTR; diagonal stripes indicate DBD or LBD; dense stippling indicates amino terminus or hinge domain; black boxes indicate AF-2 regions. The regions of the RORγ mRNAs corresponding to the various exons are indicated. Through the usage of alternative promoters, the RORγ gene generates two isoforms, RORγ1 and RORγ2. RORγ2 is identical to RORγ1 except that it lacks the amino-terminal domain of RORγ1. Exons 1a and 2 encode the 5'-UTR and amino terminus of RORγ1, while exon 1b encodes the 5'-UTR and three amino acids at the amino terminus of RORγ2. The RORγ gene generates several transcripts, 2.1 kb and 2.8 kb in size, by the usage of different promoters and alternative polyadenylation signals PAS1 and PAS2.

RORγ2 is encoded by a single exon, 1b (39, 40, 96). The positions of these exons are shown in Fig. 5. Based on the genomic structure and the different cell type-specific patterns of expression exhibited by the two RORγ isoforms, one can conclude that these isoforms are regulated by different promoters. The DBD of RORγ, spanning the region from Cys31 to Cys91, is contained within exons 3 and 4. Exon 5 encodes the hinge domain, while the remaining exons encode the entire LBD. The sites of several intron/exon junctions in RORγ are conserved with those in other nuclear receptors. The location of the second intron (at Ser24 in RORγ1) is shared with an equivalent splice site in the RORα gene. Intron 3 is located between the exons encoding the two zinc fingers of RORγ, at Lys54. The location of this intron is identical to that of equivalent introns


in thyroid hormone and retinoid receptors but differs from those in steroid hormone receptors. The position of intron 4 at the C terminus of the DBD of RORγ (Ala100 in RORγ1) is highly conserved among nuclear receptors. Based on the locations of these splice sites, the receptors have been divided into several evolutionarily divergent subgroups. In this respect, RORs fit into the T3R/RAR subgroup. The chromosomal localizations of mouse and human RORγ were determined by fluorescence in situ hybridization (FISH) analysis using 100-kb fragments of genomic DNA as probes (96). These studies mapped the mouse RORγ gene to a position that is 54% of the distance from the heterochromatic-euchromatic boundary to the telomere of chromosome 3, an area that corresponds to 3F2.1-2.2. The human RORγ gene was mapped to chromosome 1, an area that corresponds to 1q21 (96). The RORα gene was mapped to human chromosome 15q21-q22. To map the mouse RORα gene, a partial mouse cDNA clone was isolated from brain. Using interspecific backcross analysis, the RORα gene was mapped to mouse chromosome 9 (33, 97), 12 centimorgans from the thy-1 locus (98). RORβ was mapped to human chromosome 9q22, a region syntenic with mouse chromosome 4 (99).

VII. Targeted Knockouts of RORs

A. Phenotype of RORα-/- Mice

Disruption of the RORα gene has been linked to the phenotype observed in homozygous staggerer (sg/sg) mice (31, 33). This natural mutant mouse strain was first described in 1962, and the affected allele mapped to chromosome 9, where RORα also resides (31, 33, 100, 101). Sg/sg mice show tremor, body imbalance, and small body size, and die shortly after weaning. These mice exhibit severe cerebellar ataxia due to a defect in Purkinje cell development. Defective development of the thymus and immunological abnormalities have also been reported in sg/sg mice (102, 103). Positional cloning using genetic and physical mapping revealed a 6.5-kb deletion in the genomic sequence of the RORα gene (31, 33) that results in the deletion of an exon encoding the amino-terminal part of the ligand-binding domain. This deletion also causes a shift in the reading frame at amino acid 273 of RORα1 and creates a premature stop codon 27 amino acids downstream. Such a deletion results in a truncated RORα that retains the DNA-binding activity of RORα but lacks the ligand-binding domain. Mice lacking a functional RORα gene have also been generated by targeted disruption using a knockout vector in which the β-galactosidase (β-gal) or neomycin phosphotransferase (neo) gene replaced the second zinc finger of the DBD of RORα (32, 104). RORα-/- mice exhibit a phenotype very



similar to that of sg/sg mice (32, 33). As in sg/sg mice, RORα-/- mice have an abnormal body balance and die a month after birth. Their motor coordination is reduced, as indicated by increased stumbling frequency. Tests to determine muscle strength and equilibrium showed that these capabilities are significantly reduced. The morphological and electrophysiological characteristics of the cerebella from RORα-/- mice are indistinguishable from those of sg/sg mice. Subsequent studies have shown that the cerebellar cortex in RORα-/- mice is grossly underdeveloped. The granular layer is almost nonexistent and depleted of granule cells, while the Purkinje cells are immature and reduced in number (32, 104). Expression of calbindin and GAD67 mRNAs is unaffected in Purkinje cells from sg/sg mice, in agreement with the hypothesis that the sg/sg defect occurs in developing Purkinje cells after the initiation of differentiation (105). Although in heterozygous mice the morphology of the cerebellar cortex appears normal, a significant loss of cerebellar neurons occurs during aging. The onset of Purkinje cell loss occurs earlier in males than in females (106).

The thyroid hormone (T3) also plays a key role in cerebellar development. Hypothyroid rodents exhibit abnormal Purkinje cell neurogenesis similar to that seen in sg/sg mice (107). Since both RORα and T3R are expressed in these cells, the question has been raised as to whether there is any link between the mechanisms by which these two receptor signaling pathways affect Purkinje cell neurogenesis and whether RORα acts upstream of T3R or the reverse. Interestingly, the response of Purkinje cells to T3 is blocked in sg/sg mice. The Purkinje cell protein-2 (pcp-2) gene has been identified as a putative target for regulation by T3R (108) and RORα (109). Expression of this protein is undetectable in sg/sg mice despite the presence of T3R~. One study has concluded that the effect of RORα on cerebellar development may be mediated through an influence on the T3 signaling pathway (33). A different study has shown that T3 can alter the timing of RORα expression during development and may, as a consequence, influence Purkinje cell neurogenesis (110). Another consideration is that a subset of TREs may serve as response elements for both RORα and T3R. Changes in the level of expression of either receptor may affect the competition between T3R and RORα for such binding sites and alter the transcription of specific genes. Future studies are needed to provide further insight into the precise mechanisms underlying the interactions between these two receptor-signaling pathways.

Although high levels of RORα are normally expressed in the suprabasal layers of the epidermis and in hair follicles, no changes in the epidermis of RORα-/- mice were observed (32). However, these mice develop a significantly less dense fur that grows back much more slowly after shaving. RORα is also expressed in testis; however, spermatogenesis in RORα-/- mice appears normal and the animals are fertile.


Recently, RORα has also been implicated in the control of bone metabolism (111). Bone is a metabolically highly active tissue in which homeostasis is maintained through a balance between the activities of osteoblasts and osteoclasts. RORα is expressed in human mesenchymal stem cells in bone marrow, and its expression is increased when these stem cells undergo osteogenic differentiation. A functional role for RORα in bone metabolism has been indicated by studies showing that sg/sg mice have thin long bones that are osteopenic (111). The total bone mineral content in the tibia was found to be significantly diminished in sg/sg mice compared to sg/+ and wt mice. These results suggest a positive role for RORα in the regulation of bone metabolism and bone homeostasis. Osteoblasts produce a number of proteins, including collagen I, bone sialoprotein, osteopontin, and osteocalcin, important in the formation of the bone extracellular matrix and mineralization. An RORE has been identified in the promoter of the bone sialoprotein gene, and RORα has been shown to enhance transcription through this promoter in rat osteosarcoma ROS 17/2.8 cells (111). In contrast, RORα inhibits the activation of the osteocalcin promoter by vitamin D, suggesting potential cross-talk between these two receptor-signaling pathways.

The RORα gene also appears to influence the susceptibility to atherosclerosis (112). Sg/sg mice put on a 9-week high-fat diet developed many atherosclerotic lesions in the small and large coronary arteries and displayed a profound hypoalphalipoproteinemia. The latter was associated with decreased plasma levels of the HDL proteins, apolipoprotein AI and AII (apoA-I and A-II). The reduction in apoA-I levels was due to a decreased expression of the apoA-I gene in the intestine but not in liver. In this regard, it is interesting to note that RORα1 was shown to activate apoA-I transcription in intestinal Caco-2 cells (113). This activation might be mediated by the RORα-response element present in the promoter region of the apoA-I gene. These results suggest that apoA-I may be a potential target gene for RORα.

The sg/sg mutation also causes developmental and regulatory changes in the immune system (101). The development of the thymus is delayed; the spleen is undersized and the lymph nodes are enlarged. This immune phenotype is different from that observed in RORγ-/- mice (43, 44). The formation of helper T cells appears normal, but sg/sg mice are deficient in the generation of suppressor cells (103). Splenocytes from sg/sg mice treated in vitro with lipopolysaccharide (LPS) show dramatically higher levels of induction of interleukin (IL)-1α, IL-1β, and TNFα than do those from wild-type mice treated with LPS (102). Similarly, treatment of sg/sg intraperitoneal macrophages with LPS or N-acetylmuramyl-L-alanyl-D-isoglutamine increases IL-1α mRNA and IL-1β protein to levels 5-10-fold higher than those attained in macrophages from wild-type mice, demonstrating that these agents induce a hyperexcitable state in macrophages from sg/sg mice.



B. Phenotype of RORβ-/- Mice

To investigate the biological role of RORβ, André and co-workers (99) disrupted the RORβ gene using a targeting vector in which the second zinc finger of RORβ was replaced by the β-gal gene. The pattern of β-gal activity generated in the RORβ-/- mice correlates well with the expression of RORβ. In agreement with in situ hybridization analysis (35), β-gal activity matches RORβ expression in the retina, pineal gland, spinal cord, and several areas in the brain. RORβ-/- mice do not show gross anatomical changes in the pineal gland, brain, or spinal cord of adult mice, suggesting that these tissues undergo normal development. Young RORβ-/- mice are undersized and initially manifest diminished muscular strength and ataxic movements. Later in adulthood they display a characteristic "duck-like" gait. The biological defect underlying this gait abnormality may be due to an impaired integration of sensory input information (99). The phenotype of the RORβ-/- mice resembles that of the extinct, spontaneous mouse strain vacillans. RORβ-/- mice are generally fertile, except that males do not sexually reproduce during the first 6 months. β-Gal activity is found in the epithelial cell lining of the epididymis and vas deferens, while testis and prostate are negative. No difference in this expression is observed between young and old RORβ-/- mice that could explain infertility at early age.

Histological analysis of the eyes of adult RORβ-/- mice showed that the retina is greatly malformed. The retina is disorganized and seems to be collapsed. Shortly after birth, the developing retina of RORβ-/- mice is not very different from that of wt mice; however, several weeks later, the retina appears to exhibit defects in cellular differentiation and manifests degenerative cell loss (99). Adult mice are therefore blind, and tests to analyze the visual capabilities demonstrated that RORβ-/- mice do not have any visual activity.

Although the RORβ-/- mice exhibit a very different circadian behavior, the system of nonvisual photoreceptors in the retina that mediates light-induced circadian responses was not impaired. Under constant darkness the free-running circadian period is lengthened by about 0.4 h. Mutations in several other proteins have been reported to increase the free-running period, including the prion protein and the transcription factor CLOCK, a member of the helix-loop-helix PAS family (114, 115). Whether RORβ acts upstream or downstream from these proteins has yet to be established. It has been suggested that RORβ might regulate the transcription of effectors of circadian rhythm, such as melatonin. Circumstantial evidence for this is provided by observations showing that RORβ is expressed in the pineal gland and in photoreceptors, the two principal producers of melatonin. In addition, the level of RORβ correlates with melatonin biosynthesis. For example, the onset of the rhythmic expression of RORβ in retina and pineal gland coincides with the induction of melatonin synthesis. Although melatonin could potentially act as a ligand for RORβ, several laboratories have demonstrated that this is not the case (55, 76).


Future studies are needed to determine the precise molecular mechanisms by which RORβ controls circadian rhythm.

C. Phenotype of RORγ-/- Mice

1. ROLE FOR RORγ IN LYMPH NODE ORGANOGENESIS

Two different laboratories have reported on the disruption of the RORγ gene in mice (43, 44). The general appearance of RORγ-/- mice is normal and the mice are healthy during early stages of life. Their reproductive capacity is not compromised. Necropsy studies have revealed that RORγ-/- mice lack all lymph nodes. Peripheral (e.g., popliteal, inguinal, cervical), paraaortic, and mesenteric lymph nodes as well as Peyer's patches are absent, indicating that lymph node development is arrested. In contrast, lymphatic vessels appear to be normal. These observations suggest that RORγ plays a critical role in lymph node organogenesis. Recent studies have demonstrated the importance of several proteins in the regulation of lymph node development, including members of the tumor necrosis factor (TNF) family, their receptors, and the transcription factor Id2. Like RORγ-/- mice, mice deficient in lymphotoxin (LT) α or LT receptor (LTR) β usually lack all lymph nodes and Peyer's patches and, in contrast to RORγ-/- mice, have a disorganized spleen that lacks germinal centers (116-119). LTβ-/- mice lack most lymph nodes but retain mesenteric and cervical lymph nodes (120). RANK is a member of the TNF receptor family; RANK-/- mice lack all lymph nodes except mucosal-associated lymph nodes and also display a deficiency in B lymphocytes and osteoclast differentiation (121). Alymphoplasia (aly) mice, which carry a point mutation in the NF-κB-inducing kinase (NIK) gene, are characterized by the systemic absence of lymph nodes and Peyer's patches, disorganized splenic and thymic architectures, and immunodeficiency (122, 123). These observations indicate that lymph node organogenesis is complex and that formation of different lymph nodes involves control by different signaling pathways. Whether there is any link between expression of RORγ and the LT and LTR signaling pathways has yet to be investigated.
Mice disrupted in the Id2 gene, which encodes a transcriptional factor of the basic helix-loop-helix family, lack lymph nodes and Peyer's patches (124). In contrast to RORγ-/- mice, Id2-/- mice also lack natural killer cells, but do not appear to exhibit any abnormal thymic or splenic phenotype. Recent studies have indicated that CD3-CD4+CD45+IL-7Rα+ progenitor cells are important in the development of secondary lymphoid organs and NK cells (125). These early precursor cells express both RORγ and Id2 and are absent in RORγ-/- and Id2-/- mice, which suggests that both of these transcription factors are essential for differentiation and/or survival of these progenitor cells (43, 124). Whether the RORγ- and Id2-signaling pathways are engaged



in any cross-talk or whether RORγ acts down- or upstream of Id2 are intriguing questions that await further study.

2. ROLE FOR RORγ IN THYMOPOIESIS AND APOPTOSIS

In addition to the absence of lymph nodes, RORγ-/- mice exhibit several changes in thymopoiesis (43, 44). At 2-3 months of age, the thymus is relatively smaller and the total number of T lymphocytes is reduced by 75%. The number of peripheral blood T lymphocytes in RORγ-/- mice is about one-sixth that in wild-type mice, whereas the number of B lymphocytes does not change. Although the spleen contains almost 3 times more lymphocytes, splenic architecture is normal. The number of B lymphocytes in spleen is increased relative to that of T lymphocytes. T lymphocyte maturation in the thymus is a well-defined, multistep process that involves proliferation, differentiation, apoptosis, selection, and commitment to different lineages (126-133). A schematic presentation of thymopoiesis is shown in Fig. 6. Early during thymopoiesis, the immature CD4-CD8- double-negative (DN) thymocytes, which represent a minority (3-5%) in the adult thymus, undergo a series of changes. The thymic lymphoid CD44+CD25- progenitors differentiate via two intermediate stages, CD44+CD25+ and CD44-CD25+, into CD44-CD25- pre-T cells. During this differentiation, the T cell receptor (TCR) β gene undergoes rearrangements and becomes expressed. The CD44-CD25- cells differentiate further into CD4+CD8+ double-positive (DP)

FIG. 6. Schematic of the multistep pathway of thymocyte maturation. The expression of RORγ and Bcl-XL is indicated.

thymocytes, which constitute the majority (80-85%) of the thymocyte population. At this stage TCRα rearrangements take place and TCRα begins to be expressed. A majority of DP thymocytes, which do not recognize complexes of major histocompatibility complex (MHC) proteins and peptides, undergo apoptosis (death by neglect). Negative selection eliminates, via apoptosis, DP thymocytes that express self-reactive T cell antigen receptors, while thymocytes exhibiting low affinities for MHC-peptide complexes undergo positive selection. Only a small fraction of the surviving, positively selected DP cells mature further into single-positive (SP) CD4+ helper and SP CD8+ cytotoxic lineages. This differentiation depends critically on the specificity of the interactions between TCRs and class I and II major histocompatibility complexes. DP cells differ from SP cells in several ways. DP cells express much lower levels of TCR and do not proliferate or produce IL-2 after stimulation with anti-CD3 antibodies or calcium ionophore. In the last stage, SP thymocytes locate to the thymic medulla and are released into the blood. RORγ has been reported to be expressed at specific stages during thymopoiesis (Fig. 6). RORγ2 is expressed at high levels only in the very immature CD44+CD25- thymocyte precursor cells and immature DP thymocytes and is undetectable in fully mature SP CD4+ and SP CD8+ cells (39, 42). These results indicate that RORγ is induced when CD44-CD25- pre-T cells differentiate into DP thymocytes. Evidence has been provided that suggests that the induction of RORγ expression may be mediated by the pre-TCR signaling cascade (40). This induction precedes the expression of TEA (T early alpha). Interestingly, the TEA promoter contains an RORE able to bind RORγ2, suggesting that RORγ may regulate TEA and, as a consequence, Vα-to-Jα rearrangements (40).
CD4+CD8low HSA (heat-stable antigen)high cells, which are an intermediate step in the maturation from DP to SP CD4+HSAlow thymocytes, express moderate levels of RORγ2. The latter suggests that RORγ2 is downregulated gradually during thymocyte maturation. These observations show that RORγ2 expression is tightly regulated during thymopoiesis and suggest that RORγ controls gene expression at very specific stages of thymopoiesis. Study of the various CD4/CD8 subpopulations in thymi from RORγ-/- mice showed that the percentages of DP, SP CD4+, and SP CD8+ cells are significantly reduced compared to those of wild-type mice (43, 44). Although the percentage of DN cells is greatly enhanced, their total number is not changed significantly. In addition, little change is observed in the expression of CD44 and CD25, indicating that the early stages of thymopoiesis proceed normally. The reduction in DP, SP CD4+, and SP CD8+ thymocytes in RORγ-/- mice could be due to changes in differentiation, proliferation, and/or apoptosis. SP CD4+ cells, although dramatically reduced in number, contain normal levels of TCR and CD4, suggesting that they have undergone positive selection. Examination of tissue sections stained by eosin/hematoxylin revealed the presence of



an increased number of apoptotic cells in thymi of RORγ-/- mice (43, 44). This was confirmed by the observed increase in TUNEL staining, a measurement of the extent of DNA fragmentation. TUNEL staining is localized to the cortical regions where the DP thymocytes reside. Flow cytometric analysis confirmed that accelerated apoptosis is associated with the DP thymocytes (44). Crosses between RORγ-/- mice and TCRα-/- mice, in which DP cells are unable to undergo negative selection, demonstrated that TCRα-/-/RORγ-/- thymocytes have the same phenotype as RORγ-/- thymocytes, suggesting that accelerated apoptosis in RORγ-/- thymocytes is not due to enhanced negative selection but to increased death by neglect (43). Apoptosis is a multistep process of programmed cell death in which dissipation of mitochondrial transmembrane potential, release of cytochrome c, and activation of caspases, a family of cysteine proteases, often are important events (134-137). Caspases can be involved in the initiation of apoptosis as well as in its execution (138, 139). Activation of caspases leads to cleavage of various protein substrates, DNA fragmentation, and translocation of phosphatidylserine from the inner layer of the plasma membrane to the outer layer. The latter can be monitored by measuring the binding of FITC-conjugated annexin V, a Ca2+-binding protein with high affinity for phosphatidylserine, by flow cytometry (140). Analysis of caspase activity and annexin V binding have confirmed that RORγ-/- thymocytes undergo apoptosis at an accelerated rate (44, 141). Annexin V binding studies showed that within 5 h more than 80% of cultured RORγ-/- thymocytes are undergoing apoptosis compared to 10% of RORγ+/+ thymocytes (43, 44). Z-VAD-FMK, a cell-permeable peptide that at high concentrations irreversibly inhibits the activity of many caspases, effectively suppresses the progression of apoptosis in cultured RORγ-/- thymocytes (141) (Fig. 7).
The release of cytochrome c from mitochondria into the cytosol is a critical step in the induction of many, but not all, apoptotic signaling pathways (136, 137). When released into the cytosol, cytochrome c forms a complex with apoptosis-activating factor (Apaf)-1, caspase 9, and dATP that results in the activation of caspase 9 and subsequently other downstream effector caspases, ultimately leading to cell death (134-137) (Fig. 7). In certain cell systems, release of cytochrome c from mitochondria appears to be controlled by members of the Bcl-2 family that includes the antiapoptotic proteins, Bcl-2, Mcl-1, and Bcl-XL, and the proapoptotic proteins, Bax, Bak, Bad, Bid, and Bcl-Xs (142). Bcl-2 and Bcl-XL inhibit release of cytochrome c from mitochondria, while proteins that promote apoptosis induce this release. Bcl-2 family members may regulate exit of cytochrome c by modulating the activity of existing channels or by forming new channels in the mitochondrial membrane (137). A recent study demonstrated that Bax and Bak facilitate the opening of the permeability transition pore (PTP), resulting in the collapse of the mitochondrial transmembrane

FIG. 7. Model of apoptotic events in RORγ-/- thymocytes. RORγ-/- thymocytes undergo accelerated apoptosis that is associated with repression of Bcl-XL, activation of cdk2, translocation of Bax from the cytosol to the mitochondria, dissipation of ΔΨm, release of cytochrome c into the cytosol, activation of caspases, increased annexin V binding, and ultimately DNA fragmentation. The precise relationship between several of these events has not yet been determined, and what is presented is a possible chain of events. Inhibition of caspase activity by Z-VAD-FMK inhibits the execution of apoptosis. Activation of cdk2 appears to play a critical role in this induction of apoptosis, since the cdk2 inhibitor roscovitine inhibits apoptosis. Cdk2 activation is upstream from the dissipation of ΔΨm, release of cytochrome c, and caspase activation. Cdk2 may induce phosphorylation and thereby change the activity of proteins involved in the apoptotic process, such as Bax or p53. The link between Bcl-XL repression and cdk2 activation is not known. Activation of cdk2 could be downstream of Bcl-XL repression. This appears to be supported by observations showing that overexpression of Bcl-XL blocks apoptosis and confers on RORγ-/- thymocytes a normal cell cycle behavior. Alternatively, cdk2 activation might occur independently of Bcl-XL and act synergistically with the reduction in Bcl-XL. Repression of Bcl-XL may result in change of activity of several proteins, including PTP and Apaf-1.

potential (ΔΨm) and possibly permeation of cytochrome c through the channel, while Bcl-XL stimulates closure of this channel (143). The modulation of this channel may be mediated through interactions of Bcl-XL, Bax, and Bak with the voltage-dependent anion channel (VDAC), one of the components of the PTP. However, release of cytochrome c can also occur in the absence of a collapse in ΔΨm (144), suggesting the existence of other targets for members of the Bcl-2 family.
The latter is illustrated by a report showing that Bcl-XL can physically interact with Apaf-1 and inhibit maturation of caspase 9 (145). Clearly, the



relationships between the various proteins and activities that have been associated with apoptosis are still very controversial and await further study. Although the precise role of RORγ in the control of apoptosis is not yet fully understood, RNase protection assays as well as Northern and Western blot analyses have shown that the expression of Bcl-XL mRNA and protein is dramatically reduced in RORγ-/- thymocytes (43, 44). Little change in the expression of Bax, Bak, and Bcl-2 mRNA is detected. Overexpression of Bcl-XL under the control of the lck proximal promoter is able to rescue RORγ-/- thymocytes from undergoing cell death (43). It is interesting to note that thymocytes from Bcl-XL-/- mice also exhibit decreased survival, as do RORγ-/- thymocytes (146). These observations support a model (Fig. 7) in which downregulation of the antiapoptotic protein Bcl-XL may be at least part of the mechanism by which accelerated apoptosis occurs in RORγ-/- thymocytes. The loss of Bcl-XL and the observed translocation of Bax to mitochondria in RORγ-/- thymocytes (141) may facilitate opening of the PTP and subsequently result in a collapse of ΔΨm, release of cytochrome c, and apoptosis (Fig. 7). The rapid collapse in ΔΨm and release of cytochrome c into the cytosol observed in RORγ-/- thymocytes placed in culture are in agreement with this concept (141). However, the loss of Bcl-XL may have an effect on the activity of other proteins, such as Apaf-1, as well. Apoptosis and mitosis have many features in common (147). Many gene products that control the cell cycle, including p53, c-myc, and retinoblastoma (Rb) protein (148-150), also have an effect on the susceptibility of cells to undergo apoptosis, while gene products involved in apoptosis, such as Bcl-2 and Bax, can regulate cell growth (151-153).
In addition to the observed increase in apoptosis in RORγ-/- thymocytes, the percentage of thymocytes in S phase is dramatically increased from 4.4% in wild-type to 25.7% in RORγ-/- mice (43, 44). Moreover, thymocytes from RORγ-/- mice contain reduced levels of the cyclin-dependent kinase (cdk) 2 inhibitor p27kip1 while cdk2 activity is dramatically increased (43). These changes are probably only in part due to the increase in the percentage of rapidly proliferating CD44-CD25- thymocytes observed in RORγ-/- mice, since inhibition of cdk2 activity by roscovitine greatly reduces apoptosis in RORγ-/- DP thymocytes. These observations indicate not only that the increase in cdk2 activity is associated with apoptosis but that cdk function is required for the accelerated apoptosis in RORγ-/- DP thymocytes (43) (Fig. 7). Activation of cdk2 has been reported to play a critical role in the induction of apoptosis in thymocytes by a number of apoptotic stimuli (153, 154). These studies indicated that activation of cdk2 acts upstream of caspases and the ΔΨm, and demonstrated that it is a step of no return. However, these studies reached different conclusions about whether cdk2 acts up- or downstream of p53 and Bcl-2. The activation of cdk2 in RORγ-/- DP thymocytes also


occurs upstream of caspases and ΔΨm (Fig. 7); however, its relationship with the downregulation of Bcl-XL has not yet been established. Expression of Bcl-XL in RORγ-/- thymocytes restores normal cell cycle behavior and inhibits apoptosis (43), suggesting that Bcl-XL may function upstream of cdk2 activation. Alternatively, activation of cdk2 may be induced independently of Bcl-XL when thymocytes are placed in culture and act synergistically with the downregulation of Bcl-XL. The inhibition of apoptosis in RORγ-/- thymocytes by roscovitine may support the latter hypothesis. One mechanism by which cdk2 might induce apoptosis is through phosphorylation of other apoptosis-regulatory proteins, such as proteins that control PTP function or caspase activation (153) (Fig. 7). Bax and p53 have been proposed as possible target proteins for cdk2 phosphorylation. Repression of Bcl-XL expression observed in RORγ-/- thymocytes might imply that the inverse could also be true. Expression of RORγ may induce Bcl-XL expression and thereby function as a suppressor of apoptosis and enhance survival of thymocytes. The expression of Bcl-XL during thymopoiesis has been reported to be restricted to DP thymocytes (146, 155) (Fig. 6), and Bcl-XL is therefore coexpressed with RORγ2 in the same cells. Whether Bcl-XL is a direct target gene for RORγ or whether RORγ regulates Bcl-XL expression by an indirect mechanism has yet to be established (Fig. 7). Examination of the 750-bp 5'-regulatory region of the Bcl-XL gene has not identified any sequence resembling an RORE. RNase protection analysis showed relatively little change in the expression of Fas or FasL, suggesting that the increased apoptosis in RORγ-/- thymocytes is not due to increased FasL expression (44). This is supported by observations showing that gld/gld/RORγ-/- thymocytes, obtained from RORγ-/- mice crossed with gld/gld mice defective in FasL function, underwent apoptosis to the same extent as RORγ-/- thymocytes (43). In addition, in contrast to apoptosis in RORγ-/- thymocytes, induction of apoptosis by FasL cannot be blocked by Bcl-2 or Bcl-XL (156).

VIII. Overexpression of RORs

A. Effect of RORγ on Thymopoiesis and Apoptosis

He and coinvestigators used a different approach to study the role of RORγ2 in thymopoiesis (42). Transgenic mice were generated in which the ectopic expression of RORγ2 was driven by the hCD2 promoter/enhancer regulatory region. This promoter targets expression of RORγ2 to all T cells, immature as well as mature. Therefore, in these RORγ2 transgenic mice, RORγ2 can be detected in several DN subpopulations, SP CD4+ and SP CD8+ cells, and in



T lymphocytes of spleen and lymph nodes where RORγ2 normally is not expressed. The number of thymocytes in RORγ2 transgenic mice is reduced by 85%. The percentage of DP thymocytes is dramatically lower than in their wild-type littermates while the percentage of DN thymocytes increases. Although the percentage of SP CD4+ and SP CD8+ thymocytes is higher in transgenics, the absolute number of these cells is significantly reduced compared to control mice. The number of T lymphocytes in spleen and lymph nodes is also reduced, by 50-60%. These observations indicate that targeted expression of RORγ2 affects the transition of DN to DP thymocytes and suggest that downregulation of RORγ during early thymopoiesis is essential for normal thymocyte differentiation to proceed. Analysis of triple-negative (TN) CD4-CD8-CD3- cells showed several changes in the distribution of the different TN subpopulations from RORγ2 transgenic mice compared to those from control mice. The percentage of CD44-CD25+ cells is dramatically enhanced, whereas few CD44-CD25- cells are detected. The changes in these two cell populations could be attributed to an inhibition of the differentiation of the CD44-CD25+ subpopulation into CD44-CD25- cells, to an inhibition of the proliferation and expansion of the CD25-CD44- subpopulation, or to increased apoptosis in CD25-CD44- cells (42). The last possibility can be ruled out, because no increase in apoptosis is observed. Cell cycle analysis showed that less than 3% of the CD44-CD25- subpopulation from RORγ2 transgenic mice is in the S and G2/M phase of the cell cycle compared to 30% of the CD44-CD25- cells from control mice. These observations strongly indicate that ectopic expression of RORγ2 in these transgenic mice inhibits the proliferation of CD44-CD25- thymocytes. In addition to inhibiting CD44-CD25- cells, ectopic expression of RORγ2 also suppresses the proliferation of mature SP CD4+ cells induced by phorbol ester and ionomycin.
Ectopic expression of RORγ2 affects the expression of several genes. The expression of TCR in SP thymocytes and peripheral T cells from spleen and lymph nodes is downregulated in RORγ2 transgenic mice, while FasL expression is only slightly affected (42). The induction of IL-2 by PMA and ionomycin is 3- to 6-fold lower in SP CD4+ and splenic T cells from RORγ2 transgenic mice compared to those of wild-type mice. The regulation of IL-2 is complex and has been reported to involve multiple transcription factors, including Egr family members and c-rel (157). Although ectopic expression of RORγ2 negatively regulates c-rel expression, ectopic expression of c-rel is unable to reverse the effect of RORγ2, indicating that downregulation of IL-2 by RORγ2 does not involve c-rel or involves other factors in addition to c-rel. Similar results were obtained in T cell hybridoma KMIs-8.3.5 cells overexpressing RORγ2 (39).

As mentioned above, study of RORγ−/− mice has indicated that RORγ expression suppresses apoptosis in thymocytes. Such a negative regulatory role has also been observed in T cell hybridomas. Ectopic expression of RORγ in the T cell hybridoma cells DO11.10, 2B4, and KMIs-8.3, which normally do not express RORγ, greatly inhibits the induction of apoptosis (39) (S. Kurebayashi and A. M. Jetten, unpublished observations). T cell hybridomas expressing RORγ become refractory to both TCR-mediated apoptosis induced by anti-CD3 monoclonal antibodies and TCR-independent apoptosis stimulated by phorbol ester plus ionomycin. However, the induction of apoptosis by FasL, dexamethasone, ceramide, and the kinase inhibitor staurosporine is not affected by RORγ (39). These results indicate that RORγ affects specific apoptotic signaling pathways in T cells. Both the DBD and the LBD of RORγ are required for this inhibitory effect. RORγ2 was shown to be more effective in inhibiting apoptosis than was RORγ1.

TCR-mediated apoptosis is a complex, multistep process that, through the activation of various kinases and phosphatases, results in the activation and increased expression of several transcription factors (158). These factors include the nuclear orphan receptor Nur77 (159-161) and members of the nuclear factor of activated T cells (NFAT) family (162, 163), the forkhead family (164), and the Egr family (165, 166). The induction of these transcription factors leads to an increase in the transcription of several other genes, including Fas ligand and interleukin 2. FasL is secreted and, after binding to the Fas receptor, induces T cell death. Since the induction of apoptosis by FasL is not inhibited by RORγ, the inhibition of apoptosis by RORγ appears to occur at the level of FasL expression or at a step further upstream of this induction. Northern analysis demonstrated that the inhibition of TCR-mediated apoptosis in T cell hybridomas by RORγ is related to the repression of FasL mRNA induction.
The antagonism of TCR-mediated apoptosis in T cell hybridomas by retinoic acid or glucocorticoids is unlikely to be mediated by RORγ, because studies have shown that RORγ is not induced by these agents (39). RORγ also inhibits the TCR-mediated increase in IL-2 mRNA but does not affect the induction of CD69 and CD44, indicating that RORγ inhibits a specific step that may be common to the control of FasL and IL-2 expression. Interestingly, some of the same transcription factors, including members of the Egr family, have been implicated in the regulation of both of these genes.

The mechanism by which RORγ suppresses FasL induction remains to be elucidated. The regulation of FasL gene expression is complex, and many transcription factors have been implicated in its control. Activation of TCR results in a dramatic increase in the expression of Nur77, Egr-2, and Egr-3 mRNAs (39). However, RORγ does not inhibit the induction of these transcription factors, suggesting that the repression of FasL by RORγ occurs at a different level, downstream of this induction. Preliminary results have indicated that RORγ can suppress Egr-mediated transactivation, suggesting that antagonism between the RORγ and Egr signaling pathways may be responsible for repression of FasL expression by RORγ (M. Sakaue and A. M. Jetten, unpublished observations).


B. Inhibition of Myogenesis by Dominant-Negative RORα

RORα has been reported to be expressed in skeletal muscle tissue and in the mouse myoblast cell line C2C12 (25, 167). Its expression does not change when proliferating C2C12 cells differentiate into postmitotic, multinucleated myotubes upon serum withdrawal. To examine the role of RORα in muscle differentiation, a dominant-negative RORα (dn-RORα) expression vector was stably transfected into C2C12 cells and its effect on myogenesis was determined (167). The results showed that ectopic expression of dn-RORα delays and inhibits muscle cell differentiation. Forty-eight hours after serum withdrawal, C2C12 cells expressing dn-RORα do not express skeletal myosin heavy chain (MHC) or form myotubes, in contrast to parental cells. However, MHC and myotubes appear after 96 h, although not to the same level as in control cells. Expression of dn-RORα inhibits the induction of both MyoD and myogenin, two basic helix-loop-helix transcription factors critical in the control of myogenesis. The dn-RORα also inhibits the induction of the cdk inhibitor p21WAF1/Cip1, a marker for cell cycle exit. Based on this inhibitory function of dn-RORα, it was concluded that RORα may positively regulate myogenesis. This study also provided evidence for a direct interaction between RORα and MyoD (167). This interaction requires the amino-terminal activation domain of MyoD and the DBD of RORα. The precise role of this interaction in the control of myogenesis is yet to be determined.

IX. Other Target Genes

A number of potential ROR target genes have been identified. These include 5-lipoxygenase, γF-crystallin, apoA-I, laminin B1, cellular retinol binding protein (CRBP), oxytocin, Purkinje cell protein-2, TEA, and p21WAF1 (32, 40, 76, 113, 168-170). The identification of these target genes is based largely upon the presence of ROREs in their 5'-promoter flanking regions. However, little evidence has accumulated to indicate that these genes are true targets for RORs in a physiological setting. The regulation of TEA has already been discussed above.

The promoter of the gene encoding the neuropeptide oxytocin has been found to be activated by RORα (170). This gene is expressed in specific hypothalamic neuroendocrine cells, the pineal gland, the uterine epithelium, fetal membranes, and corpus luteum, all sites of high RORβ expression. Two RORE-like elements have been found in the oxytocin promoter region. Mutations in these elements significantly reduced transcriptional activation by ROR. A single RORE has been identified in each of the 5-lipoxygenase and mCRBP promoters.


Gel shift analyses demonstrated binding of RORα to these DNA elements (168). In Drosophila SL-3 cells, RORα was able to enhance transcription of a reporter under the control of these elements. The RORE in the 5-lipoxygenase promoter was able to bind RORα1 but not RORα2 or RORα3. Although studies with promoter-reporter constructs and cotransfection with ROR expression vectors can demonstrate that ROR is able to bind to these sites, they do not prove that these genes are truly targets for ROR in vivo.

The laminin B1 gene contains three core motifs spaced by 3 and 13 bp. All three core motifs are necessary to confer ROR- and RAR-dependent transactivation (169). Cotransfection of a reporter under the control of the −460-bp promoter flanking region of the laminin B1 gene along with an RORα1 expression vector resulted in a sevenfold increase in reporter activity. The activation was inhibited by cotransfection with RAR. This inhibition is likely due to competition for binding to the same elements.

RORα1 has been reported to enhance the transcriptional activation mediated by the apoA-I promoter in colon carcinoma Caco-2 cells (113). In this study, neither RORα2 nor RORα3 was able to induce transcriptional activation through the apoA-I promoter. An ATATATAGGTCA sequence was found that overlapped with the TATA box. Mutation of the AGGTCA core motif abolished the transactivation by RORα1. EMSA showed that RORα1 could bind to the wild-type RE but not to the mutated RE. To analyze the physiological significance of this regulation, the levels of apoA-I expression in wt and sg/sg mice were compared. The results showed that the level of apoA-I mRNA was significantly lower in the intestine of sg/sg mice than in that of wt mice. These observations appear to support the hypothesis that the apoA-I gene is a target gene for RORα1 and indicate a potential role for RORα in the regulation of genes involved in lipid and lipoprotein metabolism.
The severe atherosclerosis observed in sg/sg mice kept on a high-fat diet is in agreement with such a concept.

Scanning the databases for the presence of RORE sites in sequences involved in transcriptional regulation revealed an RORE in the first intron of the N-Myc gene, a region reported to be implicated in the control of this gene (65). In further studies, EMSA demonstrated that RORα was able to bind to this RORE site. In addition, RORα was shown to induce transcriptional activation of N-Myc significantly when an RORα expression vector and the entire N-Myc transcription unit of 7.3 kb were cotransfected into COS-7 cells. The inhibition of N-Myc induction during differentiation of embryonal carcinoma P19 cells by the nuclear receptor Rev-Erbβ was reversed by RORα. This antagonism is likely due to competition between the two receptors for the same response element. N-Myc in combination with activated Ha-ras causes oncogenic transformation in rat embryo fibroblasts and an increase in the formation of transformed foci. Concomitant expression of RORα causes a twofold increase in foci formation, indicating that RORα enhanced the transformed phenotype in these cells (65). These results suggest that expression of RORα may contribute to the progression of certain neoplasias.

RORα is also expressed in the murine lens. Tini et al. (76) have shown that RORα can activate transcription through the γF-crystallin promoter. An RE was identified between nucleotides −210 and −185 of the γF-crystallin promoter that can bind either an ROR monomer or an RAR/RXR heterodimer, which suggests that these receptors compete with each other for binding to this element. The constitutive activation of this element by ROR can be suppressed by the RAR/RXR heterodimer. However, further studies must discern whether this antagonism has any physiological significance and whether γF-crystallin is indeed a true target gene for ROR.
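The database scan for RORE sites mentioned in this section can be illustrated with a short sketch. The pattern used here is a simplifying assumption and not the search criteria of the cited studies: it treats an RORE as the AGGTCA core motif preceded by six A/T bases, which is consistent with the ATATATAGGTCA element described above for the apoA-I promoter. The function names are hypothetical.

```python
import re

# Assumed, simplified RORE pattern: six A/T bases followed by the AGGTCA core.
# The lookahead lets overlapping candidate sites be reported.
RORE_PATTERN = re.compile(r"(?=([AT]{6}AGGTCA))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_rore_sites(seq):
    """Return sorted (start, strand, site) tuples for candidate ROREs.

    `start` is the 0-based position of the site's leftmost base on the
    forward strand; `site` is always written 5'->3' on its own strand.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    # Forward strand.
    for m in RORE_PATTERN.finditer(seq):
        hits.append((m.start(), "+", m.group(1)))
    # Reverse strand: scan the reverse complement, then map coordinates back.
    for m in RORE_PATTERN.finditer(reverse_complement(seq)):
        site = m.group(1)
        hits.append((n - (m.start() + len(site)), "-", site))
    return sorted(hits)


# The apoA-I element from the text, with arbitrary flanking bases.
wild_type = "GGCC" + "ATATATAGGTCA" + "GGCC"
# The same element with an arbitrary change in the AGGTCA core.
mutated = "GGCC" + "ATATATAGCACA" + "GGCC"

print(find_rore_sites(wild_type))
print(find_rore_sites(mutated))
```

A sequence match of this kind only nominates a candidate site. As the text stresses, binding and reporter assays, and ultimately in vivo evidence, are needed before a gene can be called a true ROR target.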

X. Concluding Remarks

It is clear from the studies reviewed above that much progress has been made in understanding the mechanisms of action and biological functions of RORs. However, these studies have also left some questions open and raised new ones. The unique pattern of expression of the various ROR isoforms suggests that each isoform is under different transcriptional control and regulates different physiological processes. These studies have also demonstrated that RORs play critical roles in the regulation of a number of physiological processes, including motor coordination, circadian rhythm, bone metabolism, thymopoiesis, apoptosis, and lymph node development. Future studies must determine the exact role of RORs in these biological processes and the precise mechanisms and target genes through which RORs regulate them. For example, they will have to determine the molecular mechanism by which cdk2 activity is induced and Bcl-XL is repressed in RORγ−/− thymocytes, what the relationship is between these two changes, and how they relate to the induction of apoptosis.

Further characterization of the molecular mechanisms of action of RORs will not only provide greater insight into their functions in normal physiological processes but will also determine whether RORs are implicated in any diseases. Recent studies have provided preliminary evidence for a potential role of RORs in atherosclerosis and immune disorders. It has also become evident that RORs do not act as constitutively active receptors but that their activities are regulated by a mechanism that could involve ligands and/or specific kinases. The link between CaMKIV and ROR activation is intriguing and suggests that the activity of RORs may be under the control of signaling pathways that regulate Ca2+ concentration. Future study of the mechanisms that control ROR activation may lead to the development of novel therapeutic strategies.

ACKNOWLEDGMENTS

This work was supported by the Japan Society for the Promotion of Science. The authors thank Drs. D. Newman and D. Mehta (NIEHS) for their comments on the manuscript.

REFERENCES

1. R. M. Evans, Science 240, 889-895 (1988).
2. V. Laudet, J. Mol. Endocrinol. 19, 207-226 (1997).
3. P. J. Willy and D. J. Mangelsdorf, in "Hormones and Signaling" (B. W. O'Malley, ed.), pp. 308-358. Academic Press, San Diego, 1998.
4. R. Kumar and E. B. Thompson, Steroids 64, 310-319 (1999).
5. B. Desvergne and W. Wahli, Endocr. Rev. 20, 649-688 (1999).
6. S. A. Kliewer, J. M. Lehmann, M. V. Milburn, and T. M. Willson, Recent Prog. Horm. Res. 54, 345-367 (1999).
7. R. Sladek and V. Giguere, in "Hormones and Signaling" (B. W. O'Malley, ed.), pp. 23-87. Academic Press, San Diego, 2000.
8. V. Giguere, Endocr. Rev. 20, 689-725 (1999).
9. V. Giguere, L. D. McBroom, and G. Flock, Mol. Cell. Biol. 15, 2517-2526 (1995).
10. B. Chambraud, M. Berry, G. Redeuilh, P. Chambon, and E. E. Baulieu, J. Biol. Chem. 265, 20686-20691 (1990).
11. J. P. Renaud, N. Rochel, M. Ruff, V. Vivat, P. Chambon, H. Gronemeyer, and D. Moras, Nature (London) 378, 681-689 (1995).
12. D. Moras and H. Gronemeyer, Curr. Opin. Cell Biol. 10, 384-391 (1998).
13. C. K. Glass and M. G. Rosenfeld, Genes Dev. 14, 121-141 (2000).
14. T. Heinzel, R. M. Lavinsky, T. M. Mullen, M. Soderstrom, C. D. Laherty, J. Torchia, W. M. Yang, G. Brard, S. D. Ngo, J. R. Davie, E. Seto, R. N. Eisenman, D. W. Rose, C. K. Glass, and M. G. Rosenfeld, Nature (London) 387, 43-48 (1997).
15. L. Xu, C. K. Glass, and M. G. Rosenfeld, Curr. Opin. Genet. Dev. 9, 140-147 (1999).
16. N. J. McKenna, J. Xu, Z. Nawaz, S. Y. Tsai, M. J. Tsai, and B. W. O'Malley, J. Steroid Biochem. Mol. Biol. 69, 3-12 (1999).
17. N. Auphan, J. A. DiDonato, C. Rosette, A. Helmberg, and M. Karin, Science 270, 286-290 (1995).
18. R. Schule, P. Rangarajan, N. Yang, S. Kliewer, L. J. Ransone, J. Bolado, I. M. Verma, and R. M. Evans, Proc. Natl. Acad. Sci. U.S.A. 88, 6092-6096 (1991).
19. O. Wendling, P. Chambon, and M. Mark, Proc. Natl. Acad. Sci. U.S.A. 96, 547-551 (1999).
20. X. Luo, Y. Ikeda, D. Lala, D. Rice, M. Wong, and K. L. Parker, J. Steroid Biochem. Mol. Biol. 69, 13-18 (1999).
21. H. de The, C. Lavau, A. Marchio, C. Chomienne, L. Degos, and A. Dejean, Cell (Cambridge, Mass.) 66, 675-684 (1991).
22. S. Kersten, B. Desvergne, and W. Wahli, Nature (London) 405, 421-424 (2000).
23. L. Novotny, P. Ranko, A. Vachalkova, and M. Peterson-Biggs, Neoplasma 47, 3-7 (2000).
24. T. M. Willson, P. J. Brown, D. D. Sternbach, and B. R. Henke, J. Med. Chem. 43, 527-550 (2000).
25. M. Becker-Andre, E. Andre, and J. F. DeLamarter, Biochem. Biophys. Res. Commun. 194, 1371-1379 (1993).
26. C. Carlberg, R. Hooft van Huijsduijnen, J. K. Staple, J. F. DeLamarter, and M. Becker-Andre, Mol. Endocrinol. 8, 757-770 (1994).
27. T. Hirose, R. J. Smith, and A. M. Jetten, Biochem. Biophys. Res. Commun. 205, 1976-1983 (1994).
28. A. Medvedev, Z. H. Yan, T. Hirose, V. Giguere, and A. M. Jetten, Gene 181, 199-206 (1996).
29. T. Hirose, W. Fujimoto, T. Tamaai, K. H. Kim, H. Matsuura, and A. M. Jetten, Mol. Endocrinol. 8, 1667-1680 (1994).
30. V. Giguere, M. Tini, G. Flock, E. Ong, R. M. Evans, and G. Otulakowski, Genes Dev. 8, 538-553 (1994).
31. U. Matysiak-Scholze and M. Nehls, Genomics 43, 78-84 (1997).
32. M. Steinmayr, E. Andre, F. Conquet, L. Rondi-Reig, N. Delhaye-Bouchaud, N. Auclair, H. Daniel, F. Crepel, J. Mariani, C. Sotelo, and M. Becker-Andre, Proc. Natl. Acad. Sci. U.S.A. 95, 3960-3965 (1998).
33. B. A. Hamilton, W. N. Frankel, A. W. Kerrebrock, T. L. Hawkins, W. FitzHugh, K. Kusumi, L. B. Russell, K. L. Mueller, V. van Berkel, B. W. Birren, L. Kruglyak, and E. S. Lander, Nature (London) 379, 736-739 (1996).
34. M. Becker-Andre, I. Wiesenberg, N. Schaeren-Wiemers, E. Andre, M. Missbach, J. H. Saurat, and C. Carlberg, J. Biol. Chem. 269, 28531-28534 (1994).
35. N. Schaeren-Wiemers, E. Andre, J. P. Kapfhammer, and M. Becker-Andre, Eur. J. Neurosci. 9, 2687-2701 (1997).
36. E. Andre, K. Gawlas, and M. Becker-Andre, Gene 216, 277-283 (1998).
37. R. Baler, S. Coon, and D. C. Klein, Biochem. Biophys. Res. Commun. 220, 975-978 (1996).
38. M. A. Ortiz, F. J. Piedrafita, M. Pfahl, and R. Maki, Mol. Endocrinol. 9, 1679-1691 (1995).
39. Y. W. He, M. L. Deftos, E. W. Ojala, and M. J. Bevan, Immunity 9, 797-806 (1998).
40. I. Villey, R. de Chasseval, and J. P. de Villartay, Eur. J. Immunol. 29, 4072-4080 (1999).
41. S. Austin, A. Medvedev, Z. H. Yan, H. Adachi, T. Hirose, and A. M. Jetten, Cell Growth Differ. 9, 267-276 (1998).
42. Y. W. He, C. Beers, M. L. Deftos, E. W. Ojala, K. A. Forbush, and M. J. Bevan, J. Immunol. 164, 5668-5674 (2000).
43. Z. Sun, D. Unutmaz, Y. R. Zou, M. J. Sunshine, A. Pierani, S. Brenner-Morton, R. E. Mebius, and D. R. Littman, Science 288, 2369-2373 (2000).
44. S. Kurebayashi, E. Ueda, M. Sakaue, D. D. Patel, A. Medvedev, F. Zhang, and A. M. Jetten, Proc. Natl. Acad. Sci. U.S.A. 97, 10132-10137 (2000).
45. G. Lam, B. L. Hall, M. Bender, and C. S. Thummel, Dev. Biol. 212, 204-216 (1999).
46. Y. Kageyama, S. Masuda, S. Hirose, and H. Ueda, Genes Cells 2, 559-569 (1997).
47. G. E. Carney, A. A. Wade, R. Sapra, E. S. Goldstein, and M. Bender, Proc. Natl. Acad. Sci. U.S.A. 94, 12024-12029 (1997).
48. G. T. Lam, C. Jiang, and C. S. Thummel, Development 124, 1757-1769 (1997).
49. M. A. Horner, T. Chen, and C. S. Thummel, Dev. Biol. 168, 490-502 (1995).
50. M. R. Koelle, W. A. Segraves, and D. S. Hogness, Proc. Natl. Acad. Sci. U.S.A. 89, 6167-6171 (1992).
51. Q. Lan, K. Hiruma, X. Hu, M. Jindra, and L. M. Riddiford, Mol. Cell. Biol. 19, 4897-4906 (1999).
52. S. R. Palli, K. Hiruma, and L. M. Riddiford, Dev. Biol. 150, 306-318 (1992).
53. C. S. Thummel, Cell (Cambridge, Mass.) 83, 871-877 (1995).
54. P. S. Danielian, R. White, J. A. Lees, and M. G. Parker, EMBO J. 11, 1025-1033 (1992).
55. E. F. Greiner, J. Kirfel, H. Greschik, U. Dorflinger, P. Becker, A. Mercep, and R. Schule, Proc. Natl. Acad. Sci. U.S.A. 93, 10105-10110 (1996).
56. D. Lopez, T. W. Sandhoff, and M. P. McLean, Endocrinology 140, 3034-3044 (1999).
57. D. S. Lala, D. A. Rice, and K. L. Parker, Mol. Endocrinol. 6, 1249-1258 (1992).


58. T. E. Wilson, T. J. Fahrner, M. Johnston, and J. Milbrandt, Science 252, 1296-1300 (1991).
59. L. D. McBroom, G. Flock, and V. Giguere, Mol. Cell. Biol. 15, 796-808 (1995).
60. S. Kurebayashi and A. M. Jetten, in preparation (2001).
61. R. Sladek, J. A. Bader, and V. Giguere, Mol. Cell. Biol. 17, 5400-5409 (1997).
62. Z. H. Yan, A. Medvedev, T. Hirose, H. Gotoh, and A. M. Jetten, J. Biol. Chem. 272, 10565-10572 (1997).
63. B. M. Forman, J. Chen, B. Blumberg, S. A. Kliewer, R. Henshaw, E. S. Ong, and R. M. Evans, Mol. Endocrinol. 8, 1253-1261 (1994).
64. R. Retnakaran, G. Flock, and V. Giguere, Mol. Endocrinol. 8, 1234-1244 (1994).
65. I. Dussault and V. Giguere, Mol. Cell. Biol. 17, 1860-1867 (1997).
66. B. M. Forman, I. Tzameli, H. S. Choi, J. Chen, D. Simha, W. Seol, R. M. Evans, and D. D. Moore, Nature (London) 395, 612-615 (1998).
67. H. S. Camp and S. R. Tafuri, J. Biol. Chem. 272, 10811-10816 (1997).
68. E. Hu, J. B. Kim, P. Sarraf, and B. M. Spiegelman, Science 274, 2100-2103 (1996).
69. R. White, M. Sjoberg, E. Kalkhoven, and M. G. Parker, EMBO J. 16, 1427-1435 (1997).
70. G. D. Hammer, I. Krylova, Y. Zhang, B. D. Darimont, K. Simpson, N. L. Weigel, and H. A. Ingraham, Mol. Cell 3, 521-526 (1999).
71. J. C. Webster, C. M. Jewell, J. E. Bodwell, A. Munck, M. Sar, and J. A. Cidlowski, J. Biol. Chem. 272, 9287-9293 (1997).
72. A. J. Horlein, A. M. Naar, T. Heinzel, J. Torchia, B. Gloss, R. Kurokawa, A. Ryan, Y. Kamei, M. Soderstrom, C. K. Glass, et al., Nature (London) 377, 397-404 (1995).
73. S. A. Onate, S. Y. Tsai, M. J. Tsai, and B. W. O'Malley, Science 270, 1354-1357 (1995).
74. H. P. Harding, G. B. Atkins, A. B. Jaffe, W. J. Seo, and M. A. Lazar, Mol. Endocrinol. 11, 1737-1746 (1997).
75. C. Carlberg and I. Wiesenberg, J. Pineal Res. 18, 171-178 (1995).
76. M. Tini, R. A. Fraser, and V. Giguere, J. Biol. Chem. 270, 20156-20161 (1995).
77. M. Missbach, B. Jagher, I. Sigg, S. Nayeri, C. Carlberg, and I. Wiesenberg, J. Biol. Chem. 271, 13515-13522 (1996).
78. I. Wiesenberg, M. Chiesi, M. Missbach, C. Spanka, W. Pignat, and C. Carlberg, Mol. Pharmacol. 53, 1131-1138 (1998).
79. G. B. Atkins, X. Hu, M. G. Guenther, C. Rachez, L. P. Freedman, and M. A. Lazar, Mol. Endocrinol. 13, 1550-1557 (1999).
80. V. Cavailles, S. Dauvois, F. L'Horset, G. Lopez, S. Hoare, P. J. Kushner, and M. G. Parker, EMBO J. 14, 3741-3751 (1995).
81. E. Treuter, T. Albrektsen, L. Johansson, J. Leers, and J. A. Gustafsson, Mol. Endocrinol. 12, 864-881 (1998).
82. X. Hu and M. A. Lazar, Trends Endocrinol. Metab. 11, 6-10 (2000).
83. E. M. McInerney, D. W. Rose, S. E. Flynn, S. Westin, T. M. Mullen, A. Krones, J. Inostroza, J. Torchia, R. T. Nolte, N. Assa-Munt, M. V. Milburn, C. K. Glass, and M. G. Rosenfeld, Genes Dev. 12, 3357-3368 (1998).
84. P. M. Henttu, E. Kalkhoven, and M. G. Parker, Mol. Cell. Biol. 17, 1832-1839 (1997).
85. X. Hu and M. A. Lazar, Nature (London) 402, 93-96 (1999).
86. J. Zhang, X. Hu, and M. A. Lazar, Mol. Cell. Biol. 19, 6448-6457 (1999).
87. L. Nagy, H. Y. Kao, J. D. Love, C. Li, E. Banayo, J. T. Gooch, V. Krishna, K. Chatterjee, R. M. Evans, and J. W. Schwabe, Genes Dev. 13, 3209-3216 (1999).
88. E. F. Greiner, J. Kirfel, H. Greschik, D. Huang, P. Becker, J. P. Kapfhammer, and R. Schule, Proc. Natl. Acad. Sci. U.S.A. 97, 7160-7165 (2000).
89. G. Paravicini, M. Steinmayr, E. Andre, and M. Becker-Andre, Biochem. Biophys. Res. Commun. 227, 82-87 (1996).
90. A. R. Means, T. J. Ribar, C. D. Kane, S. S. Hook, and K. A. Anderson, Recent Prog. Horm. Res. 52, 389-406 (1997).



91. K. A. Anderson, T. J. Ribar, M. Illario, and A. R. Means, Mol. Endocrinol. 11, 725-737 (1997).
92. R. P. Matthews, C. R. Guthrie, L. M. Wailes, X. Zhao, A. R. Means, and G. S. McKnight, Mol. Cell. Biol. 14, 6107-6116 (1994).
93. C. K. Miranti, D. D. Ginty, G. Huang, T. Chatila, and M. E. Greenberg, Mol. Cell. Biol. 15, 3672-3684 (1995).
94. C. D. Kane and A. R. Means, EMBO J. 19, 691-701 (2000).
95. J. Y. Wu, T. J. Ribar, D. E. Cummings, K. A. Burton, G. S. McKnight, and A. R. Means, Nat. Genet. 25, 448-452 (2000).
96. A. Medvedev, A. Chistokhina, T. Hirose, and A. M. Jetten, Genomics 46, 93-102 (1997).
97. V. Giguere, B. Beatty, J. Squire, N. G. Copeland, and N. A. Jenkins, Genomics 28, 596-598 (1995).
98. T. J. Roderick and M. T. Davisson, in "Handbook on Genetically Standardized Jax Mice," 3rd Ed., p. 5.110. Bar Harbor, ME, 1982.
99. E. Andre, F. Conquet, M. Steinmayr, S. C. Stratton, V. Porciatti, and M. Becker-Andre, EMBO J. 17, 3867-3877 (1998).
100. M. C. Green and P. W. Lane, J. Heredity 58, 225-228 (1967).
101. R. L. Sidman, P. W. Lane, and M. M. Dickie, Science 137, 610-612 (1962).
102. B. Kopmels, J. Mariani, N. Delhaye-Bouchaud, F. Audibert, D. Fradelizi, and E. E. Wollman, J. Neurochem. 58, 192-199 (1992).
103. E. Trenkner and M. K. Hoffmann, J. Neurosci. 6, 1733-1737 (1986).
104. I. Dussault, D. Fawcett, A. Matthyssen, J. A. Bader, and V. Giguere, Mech. Dev. 70, 147-153 (1998).
105. G. D. Frantz, C. W. Wuenschell, A. Messer, and A. J. Tobin, J. Neurosci. Res. 44, 255-262 (1996).
106. M. Doulazmi, F. Frederic, Y. Lemaigre-Dubreuil, N. Hadj-Sahraoui, N. Delhaye-Bouchaud, and J. Mariani, J. Comp. Neurol. 411, 267-273 (1999).
107. J. Bouvet, Y. Usson, and J. Legrand, Int. J. Dev. Neurosci. 5, 345-355 (1987).
108. G. W. Anderson, S. G. Hagen, R. J. Larson, K. A. Strait, H. L. Schwartz, C. N. Mariash, and J. H. Oppenheimer, Mol. Cell. Endocrinol. 131, 79-87 (1997).
109. T. Matsui, Genes Cells 2, 263-272 (1997).
110. N. Koibuchi and W. W. Chin, Endocrinology 139, 2335-2341 (1998).
111. T. Meyer, M. Kneissel, J. Mariani, and B. Fournier, Proc. Natl. Acad. Sci. U.S.A. 97, 9197-9202 (2000).
112. A. Mamontova, S. Seguret-Mace, B. Esposito, C. Chaniale, M. Bouly, N. Delhaye-Bouchaud, G. Luc, B. Staels, N. Duverger, J. Mariani, and A. Tedgui, Circulation 98, 2738-2743 (1998).
113. N. Vu-Dac, P. Gervois, T. Grotzinger, P. De Vos, K. Schoonjans, J. C. Fruchart, J. Auwerx, J. Mariani, A. Tedgui, and B. Staels, J. Biol. Chem. 272, 22401-22404 (1997).
114. I. Tobler, S. E. Gaus, T. Deboer, P. Achermann, M. Fischer, T. Rulicke, M. Moser, B. Oesch, P. A. McBride, and J. C. Manson, Nature (London) 380, 639-642 (1996).
115. M. H. Vitaterna, D. P. King, A. M. Chang, J. M. Kornhauser, P. L. Lowrey, J. D. McDonald, W. F. Dove, L. H. Pinto, F. W. Turek, and J. S. Takahashi, Science 264, 719-725 (1994).
116. Y. X. Fu and D. D. Chaplin, Annu. Rev. Immunol. 17, 399-433 (1999).
117. A. Futterer, K. Mink, A. Luz, M. H. Kosco-Vilbois, and K. Pfeffer, Immunity 9, 59-70 (1998).
118. P. De Togni, J. Goellner, N. H. Ruddle, P. R. Streeter, A. Fick, S. Mariathasan, S. C. Smith, R. Carlson, L. P. Shornick, J. Strauss-Schoenberger, J. H. Russell, R. Karr, and D. D. Chaplin, Science 264, 703-707 (1994).
119. P. D. Rennert, D. James, F. Mackay, J. L. Browning, and P. S. Hochman, Immunity 9, 71-79 (1998).
120. P. A. Koni, R. Sacca, P. Lawton, J. L. Browning, N. H. Ruddle, and R. A. Flavell, Immunity 6, 491-500 (1997).


121. W. C. Dougall, M. Glaccum, K. Charrier, K. Rohrbach, K. Brasel, T. De Smedt, E. Daro, J. Smith, M. E. Tometsko, C. R. Maliszewski, A. Armstrong, V. Shen, S. Bain, D. Cosman, D. Anderson, P. J. Morrissey, J. J. Peschon, and J. Schuh, Genes Dev. 13, 2412-2424 (1999).
122. S. Fagarasan, R. Shinkura, T. Kamata, F. Nogaki, K. Ikuta, K. Tashiro, and T. Honjo, J. Exp. Med. 191, 1477-1486 (2000).
123. R. Shinkura, K. Kitada, F. Matsuda, K. Tashiro, K. Ikuta, M. Suzuki, K. Kogishi, T. Serikawa, and T. Honjo, Nat. Genet. 22, 74-77 (1999).
124. Y. Yokota, A. Mansouri, S. Mori, S. Sugawara, S. Adachi, S. Nishikawa, and P. Gruss, Nature (London) 397, 702-706 (1999).
125. R. E. Mebius, P. Rennert, and I. L. Weissman, Immunity 7, 493-504 (1997).
126. W. Ellmeier, S. Sawada, and D. R. Littman, Annu. Rev. Immunol. 17, 523-554 (1999).
127. H. J. Fehling and H. von Boehmer, Curr. Opin. Immunol. 9, 263-275 (1997).
128. S. C. Jameson, K. A. Hogquist, and M. J. Bevan, Annu. Rev. Immunol. 13, 93-126 (1995).
129. N. Killeen, B. A. Irving, S. Pippig, and K. Zingler, Curr. Opin. Immunol. 10, 360-367 (1998).
130. P. Kisielow and H. von Boehmer, Adv. Immunol. 58, 87-209 (1995).
131. J. C. Zuniga-Pflucker and M. J. Lenardo, Curr. Opin. Immunol. 8, 215-224 (1996).
132. G. Anderson, N. C. Moore, J. J. Owen, and E. J. Jenkinson, Annu. Rev. Immunol. 14, 73-99 (1996).
133. E. Sebzda, S. Mariathasan, T. Ohteki, R. Jones, M. F. Bachmann, and P. S. Ohashi, Annu. Rev. Immunol. 17, 829-874 (1999).
134. A. Gross, J. M. McDonnell, and S. J. Korsmeyer, Genes Dev. 13, 1899-1911 (1999).
135. J. C. Reed, Oncogene 17, 3225-3236 (1998).
136. G. Kroemer and J. C. Reed, Nat. Med. 6, 513-519 (2000).
137. M. O. Hengartner, Nature (London) 407, 770-776 (2000).
138. G. S. Salvesen and V. M. Dixit, Cell (Cambridge, Mass.) 91, 443-446 (1997).
139. N. A. Thornberry, T. A. Rano, E. P. Peterson, D. M. Rasper, T. Timkey, M. Garcia-Calvo, V. M. Houtzager, P. A. Nordstrom, S. Roy, J. P. Vaillancourt, K. T. Chapman, and D. W. Nicholson, J. Biol. Chem. 272, 17907-17911 (1997).
140. I. Vermes, C. Haanen, H. Steffens-Nakken, and C. Reutelingsperger, J. Immunol. Methods 184, 39-51 (1995).
141. E. Ueda, S. Kurebayashi, and A. M. Jetten, in preparation (2001).
142. D. T. Chao and S. J. Korsmeyer, Annu. Rev. Immunol. 16, 395-419 (1998).
143. S. Shimizu, M. Narita, and Y. Tsujimoto, Nature (London) 399, 483-487 (1999).
144. D. M. Finucane, E. Bossy-Wetzel, N. J. Waterhouse, T. G. Cotter, and D. R. Green, J. Biol. Chem. 274, 2225-2233 (1999).
145. Y. Hu, M. A. Benedict, D. Wu, N. Inohara, and G. Nunez, Proc. Natl. Acad. Sci. U.S.A. 95, 4386-4391 (1998).
146. A. Ma, J. C. Pena, B. Chang, E. Margosian, L. Davidson, F. W. Alt, and C. B. Thompson, Proc. Natl. Acad. Sci. U.S.A. 92, 4763-4767 (1995).
147. L. O'Connor, D. C. Huang, L. A. O'Reilly, and A. Strasser, Curr. Opin. Cell Biol. 12, 257-263 (2000).
148. S. Bates and K. H. Vousden, Cell. Mol. Life Sci. 55, 28-37 (1999).
149. G. I. Evan, A. H. Wyllie, C. S. Gilbert, T. D. Littlewood, H. Land, M. Brooks, C. M. Waters, L. Z. Penn, and D. C. Hancock, Cell (Cambridge, Mass.) 69, 119-128 (1992).
150. E. Y. Lee, N. Hu, S. S. Yuan, L. A. Cox, A. Bradley, W. H. Lee, and K. Herrup, Genes Dev. 8, 2008-2021 (1994).
151. H. J. Brady, G. Gil-Gomez, J. Kirberg, and A. J. Berns, EMBO J. 15, 6991-7001 (1996).
152. L. A. O'Reilly, D. C. Huang, and A. Strasser, EMBO J. 15, 6979-6990 (1996).
153. G. Gil-Gomez, A. Berns, and H. J. Brady, EMBO J. 17, 7209-7218 (1998).
154. A. Hakem, T. Sasaki, I. Kozieradzki, and J. M. Penninger, J. Exp. Med. 189, 957-968 (1999).
155. D. A. Grillot, R. Merino, and G. Nunez, J. Exp. Med. 182, 1973-1983 (1995).
156. D. C. Huang, M. Hahne, M. Schroeter, K. Frei, A. Fontana, A. Villunger, K. Newton, J. Tschopp, and A. Strasser, Proc. Natl. Acad. Sci. U.S.A. 96, 14871-14876 (1999).
157. J. Jain, C. Loh, and A. Rao, Curr. Opin. Immunol. 7, 333-342 (1995).
158. J. D. Mountz, T. Zhou, X. Su, J. Wu, and J. Cheng, Clin. Immunol. Immunopathol. 80, S2-14 (1996).
159. Z. G. Liu, S. W. Smith, K. A. McLaughlin, L. M. Schwartz, and B. A. Osborne, Nature (London) 367, 281-284 (1994).
160. J. D. Woronicz, B. Calnan, V. Ngo, and A. Winoto, Nature (London) 367, 277-281 (1994).
161. S. L. Lee, R. L. Wesselschmidt, G. P. Linette, O. Kanagawa, J. H. Russell, and J. Milbrandt, Science 269, 532-535 (1995).
162. K. M. Latinis, L. A. Norian, S. L. Eliason, and G. A. Koretzky, J. Biol. Chem. 272, 31427-31434 (1997).
163. C. J. Holtz-Heppelmann, A. Algeciras, A. D. Badley, and C. V. Paya, J. Biol. Chem. 273, 4416-4423 (1998).
164. A. Brunet, A. Bonni, M. J. Zigmond, M. Z. Lin, P. Juo, L. S. Hu, M. J. Anderson, K. C. Arden, J. Blenis, and M. E. Greenberg, Cell (Cambridge, Mass.) 96, 857-868 (1999).
165. P. R. Mittelstadt and J. D. Ashwell, Mol. Cell. Biol. 18, 3744-3751 (1998).
166. P. R. Mittelstadt and J. D. Ashwell, J. Biol. Chem. 274, 3222-3227 (1999).
167. P. Lau, P. Bailey, D. H. Dowhan, and G. E. Muscat, Nucleic Acids Res. 27, 411-420 (1999).
168. M. Schrader, C. Danielsson, I. Wiesenberg, and C. Carlberg, J. Biol. Chem. 271, 19732-19736 (1996).
169. T. Matsui, Biochem. Biophys. Res. Commun. 220, 405-410 (1996).
170. K. Chu and H. H. Zingg, J. Mol. Endocrinol. 23, 337-346 (1999).

PDE4 cAMP-Specific Phosphodiesterases

MILES D. HOUSLAY¹

Division of Biochemistry and Molecular Biology
Institute of Biomedical and Life Sciences
Wolfson and Davidson Buildings
University of Glasgow
University Avenue, Glasgow G12 8QQ
Scotland, United Kingdom

I. Entities Involved in cAMP Metabolism
II. Cyclic Nucleotide Phosphodiesterase Families
III. Cyclic Nucleotide Phosphodiesterases and Intracellular Targeting
IV. PDE4 cAMP-Specific Phosphodiesterases
V. PDE4 Gene Organization
VI. Domain Structure of PDE4 Isoenzymes
    A. The Catalytic Unit of PDE4 Enzymes
    B. Conformational Changes in PDE4 Catalytic Unit
VII. Intracellular Targeting of PDE4 Isoforms
    A. Membrane Insertion
    B. SH3 Domain Interaction
    C. Particulate Targeting of Other PDE4 Isoforms
    D. Scaffold Complexes: Binding to RACK1
    E. ERK Docking and Targeting Sites
VIII. Phosphorylation
    A. Activation by Protein Kinase A
    B. Phosphorylation of PDE4 Enzymes by ERK MAP Kinase
    C. T Cell Receptor-Induced Phosphorylation of PDE4B2
IX. Activation of PDE4A Isoforms through a PI-3 Kinase-Mediated Process
X. Activation of the Long PDE4D3 Isoform by Phosphatidic Acid
XI. Induction of PDE4 Isoforms
XII. PDE4 Knockouts
XIII. Conclusion
References

1 Fax: +44-141-330-4620; Phone: +44-141-330-5903; E-mail: [email protected]. Abbreviations: PDE, cyclic nucleotide phosphodiesterase; PDE4, type-IV, family 4, cyclic AMP-specific phosphodiesterase; PKA, cAMP-dependent protein kinase A; cAMP, cyclic 3',5'-adenosine monophosphate; cGMP, cyclic 3',5'-guanosine monophosphate; UCR, upstream conserved region in PDE4 enzymes; LR, linker region in PDE4 enzymes; FSH, follicle stimulating hormone; TSH, thyroid stimulating hormone; EGF, epidermal growth factor.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 69


Copyright © 2001 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/01 $35.00


MILES D. HOUSLAY

I. Entities Involved in cAMP Metabolism

Cyclic AMP serves as a ubiquitous second messenger. It mediates the action of a variety of hormones and neurotransmitters, and influences cell growth, differentiation, survival, and inflammatory processes (1-5). It is generated from ATP by the action of adenylyl cyclases, which form a large family of enzymes; however, the reason for this diversity is still poorly understood (6, 7). Certainly, there are differences in the regulatory properties of the various adenylyl cyclase isoenzymes, their expression patterns (6), and their localization within different domains of the cell surface plasma membrane (8-10). In contrast, much is known about the various transmembrane receptors that either stimulate or inhibit adenylyl cyclase through the regulatory Gs and Gi proteins, respectively (11, 12).

Having generated an intracellular second messenger, it is crucial to have mechanisms that allow for its degradation. In the case of cAMP, this is done uniquely through the action of cAMP phosphodiesterases, which convert cAMP to 5'-AMP. A large multigene family of enzymes can degrade cyclic nucleotides (Table I), of which some families are active upon cAMP. These enzymes exhibit cell type-specific expression patterns and differ markedly in their modes of regulation, intracellular distribution, relative activities, and Km values (13-23).

It is a curious observation that basic research on systems that serve to terminate signals, such as phosphatases and phosphodiesterases, is relatively neglected compared to that done on systems that serve to generate signals, such as kinases and G-protein receptor-coupled cyclases. However, it is apparent from the very diversity of phosphatases and phosphodiesterases that nature brooks no such view. Rather, the enzymes that serve to terminate second-messenger actions have a pivotal role to play in determining cellular function. Certainly, the development of inhibitors that are selective for various PDE enzyme families has highlighted the importance of controlling the degradation of cAMP in determining cellular responsiveness (21-32). Indeed, nature exploits the diversity of PDE isoenzymes to generate a sophisticated system for adjusting cAMP levels in a cell type-specific fashion. This has been effectively exploited through the development of selective inhibitors specific for the PDE3 (16) and PDE4 cAMP-specific phosphodiesterases (23, 28, 33) as effective therapeutic agents. Indeed, these two classes of cAMP-specific phosphodiesterases clearly appear to control functionally distinct "pools" of cAMP (34). For example, while cardiocytes and cells of the immune system can exhibit similar activity levels of each of these enzymes, only PDE3-selective inhibitors act as positive inotropic agents and only PDE4 inhibitors serve as anti-inflammatory agents. These and other independent approaches serve to demonstrate that, while the action of a range of phosphodiesterases is to degrade cAMP, the various isoenzymes play remarkably different roles in regulating specific cellular processes. How can this be?


TABLE I
PDE SUPERFAMILY

PDE family  Genes        Substrates     Selective inhibitor                    Regulation                                                Domain pairs
PDE1        1A, B, C     cAMP and cGMP  Nicardipine; Vinpocetine; KS-505a      Ca2+/calmodulin; PKA/PKG phosphorylation                  CaM binding
PDE2        2A           cAMP and cGMP  EHNA                                   Stimulated by cGMP                                        GAF
PDE3        3A, B        cAMP           Cilostamide; Milrinone                 Inhibited by cGMP; phosphorylated by PKB                  -
PDE4        4A, B, C, D  cAMP           Rolipram; Ro-20-1724; RP73401; Ariflo  Phosphorylated by ERK; phosphorylated by PKA              UCR
PDE5        5A           cGMP           Sildenafil                             Binds cGMP; phosphorylated by PKA; phosphorylated by PKG  GAF
PDE6        6A, B, C     cGMP           -                                      Activated by transducin                                   GAF
PDE7        7A, B        cAMP           -                                      -                                                         -
PDE8        8A, B        cAMP           -                                      -                                                         -
PDE9        9A           cGMP           -                                      -                                                         -
PDE10       10A          cAMP and cGMP  -                                      -                                                         GAF
PDE11       11A          cAMP and cGMP  -                                      -                                                         -
PDE12       12A          cAMP and cGMP  -                                      -                                                         -

One obvious difference between members of various PDE families is the range of different Km values that these enzymes show for cAMP (13-15, 35). These range from the submicromolar for the PDE7 and PDE8 families, to the micromolar level for PDE3 and PDE4 enzymes, and to the tens of micromolar for the PDE1 enzyme family. This means that the relative importance of the various PDE enzymes will vary according to the prevailing cAMP concentrations. It is possible that PDE7 and PDE8 enzymes predominantly play a scavenging role and work to keep basal cAMP levels low when adenylyl cyclase is not stimulated. As such, their importance may lie predominantly in their capacity to regulate basal cAMP levels and their relative importance can be expected to wane as cAMP levels rise upon adenylyl cyclase stimulation. Under conditions of such stimulation, cAMP levels can be expected to rise to 10-20 µM and possibly to even greater levels. This will suffice to bring the full activities of PDE3 and PDE4 into action, with PDE1 and PDE2 enzyme activities being significant only at the high cAMP levels, where they probably act to buffer cAMP concentrations. Of course, the relative importance of the various PDE isoenzymes in contributing to cAMP degradation will depend upon whether they are expressed in a particular cell type and their relative activities. This allows considerable scope for regulation; hence the "shaping" of the cAMP response over time can be achieved by altering the expression pattern and activity level of the various PDE isoenzymes that are present in a particular cell type. The importance of this has been clearly established for the PDE3 and PDE4 enzymes by virtue of the availability of highly selective and potent inhibitors (13-23). Insight into the roles of other cAMP-metabolizing PDEs should become apparent upon the availability of inhibitors selective for these various isoenzymes.

Many, if not all, cells are highly polarized and can even undergo substantial spatial reorganization in the face of activation and stress. Thus, while PDEs are differentially expressed among various cell types, we can also expect, and indeed find, differential localization of PDE isoenzymes within a single cell type (8). A striking demonstration of this is given in neuronal olfactory sensing cells found in the very upper respiratory tract (nasal epithelium) (8). Here Beavo's group has shown that, within a single neuron, the Ca2+-regulated PDE1 enzyme is exclusively localized to the cilia at the ends of neurites projecting out into the cavity. In complete contrast, the Ca2+-insensitive PDE4A is totally excluded from this part of the cell and, instead, is found within the neurite and the cell body.
This clearly demonstrates that mechanisms determine the differential intracellular targeting of PDE isoenzymes within a single cell. Thus these two phosphodiesterases can be expected to control distinct pools of cAMP within cells.

The notion that cAMP signaling in cells is compartmentalized was first mooted quite some time ago in the pivotal studies of Brunton and Mayer (36, 37). These studies indicated that the RI and RII isoenzymes of PKA could be differentially activated in cardiocytes through stimulation of adenylyl cyclase activity using different G-protein-coupled receptors. Further insight into compartmentalized signaling in cardiocytes has come from the work of Fischmeister's group (38, 39) and others (40, 41), who have identified compartmentalized signaling determined by specific PDE isoenzymes in the control of Ca2+ channel function and contractile effects. In the mesangial cells of the kidney glomeruli, Dousa and coworkers (34, 42) have studied two cAMP-regulated processes that they have shown to be regulated by distinct PDE isoenzymes. One such process is the generation of superoxide during the respiratory burst, which is a factor in the pathogenesis of glomerulonephritis; this was inhibited by PDE4-selective inhibitors, such as rolipram, but not by the PDE3-specific inhibitor, cilostamide. In contrast, the mitogenic stimulation of DNA synthesis was inhibited by PDE3 inhibitors, but was unaffected by PDE4 inhibitors. These data implied that two different PDE isoenzymes regulate two separate pools of cAMP that control, respectively, DNA synthesis and superoxide generation. The specific involvement of PDE4 in the regulation of the respiratory burst suggests that PDE4 inhibitors might be developed for use in treating inflammatory glomerulopathies without affecting DNA synthesis. Another example comes from studies done on guinea pig airways, where PDE4 and PDE1 activities controlled the kinetics of contractility, while PDE3 activity was essentially involved in the modulation of the resting tone (43).

The machinery for compartmentalized cAMP signaling is thus now beginning to be appreciated. First, it is highly unlikely that cAMP is uniformly generated over the entire cytosolic surface of the plasma membrane. This is because both adenylyl cyclase isoenzymes and their regulatory Gs protein can be localized to discrete regions (8-10, 44), thereby providing a starting point for the generation of gradients of cAMP within cells. Second, PDE isoenzymes can be anchored to distinct intracellular sites (44-47), providing a means of (re)shaping these gradients in a regulated fashion. Third, anchored PKA isoenzymes, plasma membrane-restricted cAMP-gated ion channels, and anchored cAMP-GTP exchange factors (cAMP-GEF/EPACs) provide a system able to sample these intracellular gradients and so can be differentially activated on a cell type-specific basis (3, 48-51). Indeed, the molecular basis of compartmentalized cAMP signaling has received considerable impetus over the past few years by the work of both the Scott (48-51) and the Rubin laboratories (3). They have demonstrated that a family of proteins, called AKAPs, serves to anchor PKA to distinct intracellular sites.
Such localized populations of PKA serve to sense the prevailing cAMP levels; thus they are able to sample gradients of cAMP within cells and provide an appropriate output. In so doing, they allow for compartmentalized cAMP signaling. Thus the spatially distinct localization of both synthetic and degradative enzymes provides a system for generating three-dimensional gradients of cAMP within a cell that change with time (1, 19, 20, 34, 52-55). PDE subfamilies appear to provide a pivotal component of compartmentalized signaling systems. To date, the clearest examples of these are provided by the PDE3 and PDE4 families, which appear to regulate very different processes in cells where both are expressed together.
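The argument made earlier in this section, that PDE families with different Km values dominate at different cAMP concentrations, can be illustrated with a minimal sketch. It assumes simple Michaelis-Menten kinetics and uses hypothetical Km values picked only to match the orders of magnitude quoted in the text (submicromolar, micromolar, tens of micromolar); the function and variable names are likewise illustrative. It ignores Vmax and expression level, which, as noted above, also determine the relative contribution of each isoenzyme in a given cell.

```python
"""Fractional Michaelis-Menten velocity for PDE classes with different Km values."""

# Hypothetical Km values (in µM), chosen only to match the orders of magnitude
# quoted in the text; they are not measured constants for any real isoenzyme.
ILLUSTRATIVE_KM_UM = {
    "PDE7/PDE8 (submicromolar)": 0.1,
    "PDE3/PDE4 (micromolar)": 2.0,
    "PDE1 (tens of micromolar)": 30.0,
}


def fractional_velocity(camp_um: float, km_um: float) -> float:
    """Return v/Vmax = [S] / (Km + [S]) for a simple Michaelis-Menten enzyme."""
    if camp_um < 0 or km_um <= 0:
        raise ValueError("cAMP must be non-negative and Km must be positive")
    return camp_um / (km_um + camp_um)


def saturation_by_class(camp_um: float) -> dict:
    """Fractional velocity of each illustrative PDE class at one cAMP level."""
    return {
        name: fractional_velocity(camp_um, km)
        for name, km in ILLUSTRATIVE_KM_UM.items()
    }


if __name__ == "__main__":
    # Scan from a basal-like level up to the 10-20 µM range mentioned in the text.
    for camp in (0.1, 1.0, 10.0, 20.0):
        row = ", ".join(
            f"{name}: {v:.2f}" for name, v in saturation_by_class(camp).items()
        )
        print(f"cAMP = {camp:5.1f} µM -> {row}")
```

Under these assumed values, the low-Km class is already half-saturated at 0.1 µM cAMP, whereas the high-Km class is still well below saturation at 20 µM, which is consistent with the scavenging and buffering roles suggested in the text.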

II. Cyclic Nucleotide Phosphodiesterase Families

The PDE superfamily is large and complex. At present, it consists of at least 12 families that catalyze the hydrolysis of cAMP, cGMP, or both (13, 15, 35).


These PDE families can be further distinguished according to their primary structures, their ability to hydrolyze either cAMP or cGMP or both, their modes of regulation, and their inhibition by specific inhibitors (Table I). Each of the PDE families contains one to four genes, and many of these genes generate several isoforms by alternative 5' mRNA splicing and the use of different transcription initiation sites. This results in PDE isoenzymes with common central and COOH-terminal regions and variable NH2-terminal regions. In addition, alternative 3' mRNA splicing has been observed for members of the PDE1 family. The isoenzymes within each of these various PDE families show distinct patterns of expression in different tissues and have different characteristics. A summary of the general characteristics of each of these families is given in Table I.

In denoting specific PDE isoenzymes in this complex superfamily, a system of nomenclature has been developed. This can be explained by taking the example of HSPDE4A1, where HS is the species of origin (in this case, Homo sapiens), PDE4 is the gene family, A is the gene, and 1 is the splice variant. Sometimes a letter is also given after the splice variant number; such a letter refers to the individual GenBank report for a specific splice variant that has been cloned separately by two or more groups.

All PDE isoenzymes have the same general structure (Fig. 1). The catalytic units of PDEs generally occupy the central to COOH-terminal region of these enzymes. This region is conserved between members of the same gene family. Essential histidine residues play an important role in chelating the tightly bound divalent cations (Zn2+ and Mg2+) that are essential for catalytic activity (56-60). The regions that are immediately N-terminal to the catalytic unit of PDEs are, however, variable and contain sequence motifs involved in the regulation of PDE catalytic activity and in intracellular targeting.
FIG. 1. The core components of PDEs. (Schematic labels: N-terminal region with paired regulatory domains; catalytic unit; C-terminal region.)

For instance, the PDE1 isoenzymes have two Ca2+/calmodulin (Ca2+/CaM)-binding domains in their N-terminal regions. These enzymes can hydrolyze both cAMP and cGMP, and the binding of Ca2+/CaM causes the activation of these activities. The PDE1 family contains three genes, and the diversity of this family is further increased by 5' and 3' alternative mRNA splicing, which results in isoenzymes that differ at their N and C termini (61-63). One consequence of the 5' alternative mRNA splicing is the generation of splice variants with different Ca2+/CaM-binding affinities. For instance, the N-terminal regions of PDE1C2 and the other PDE1C splice variants diverge at a point midway between the two Ca2+/CaM-binding domains. As a result, PDE1C2 has a significantly greater sensitivity to Ca2+ when compared with PDE1C1, PDE1C3, PDE1C4, and PDE1C5 isoforms (62). The NH2-terminal regions of PDE1C isoenzymes also contain sites for regulation by phosphorylation by PKA. Such phosphorylation of PDE1A1 and PDE1A2 has a functional outcome in that it reduces the affinity of these isoenzymes for Ca2+/CaM (14, 62). In addition, it has been shown that receptors able to activate phosphatidyl inositol signaling in CHO cells can rapidly induce the expression of PDE1B transcripts in a manner akin to that seen for immediate early genes (61). Thus PDE1 isoenzymes appear to provide a means of cross-talk between the cAMP and Ca2+ signaling systems.

The PDE2, PDE5, PDE6, PDE10, and the recently identified PDE11 families all contain two GAF domains within their N-terminal regions (13, 64-69). GAF domains have been identified in a diverse range of proteins, including those from plants, bacteria, and vertebrates (70, 71). The acronym GAF is derived from the names of three proteins that contain these conserved domains: cGMP-specific phosphodiesterases, cyanobacterial Anabaena adenylyl cyclase, and the Escherichia coli transcriptional regulator FhlA. Despite the conserved nature of these domains, they do not appear to have a common functional role.
Indeed, in many instances the functional significance of these domains remains to be elucidated, although it is generally considered that they are likely to perform a regulatory role. In the case of PDE2, PDE5, and PDE6 isoenzymes, these GAF domains serve to bind cGMP (72-75). The function of the cGMP-binding GAF domain in PDE2 isoenzymes is to activate markedly their cAMP- and cGMP-hydrolyzing catalytic activity in an allosteric (positive homotropic) fashion. This leads to the possibility that PDE2 may serve to allow for cross-talk between the cAMP and cGMP signaling pathways. Thus, for example, the NO-mediated activation of guanylyl cyclase (76) may increase cGMP levels and in so doing elicit the increased hydrolysis of cAMP by PDE2, causing a fall in cAMP levels and deactivation of cAMP signaling pathways. PDE2 activity may also, however, integrate cross-talk with certain lipid signaling pathways, as the particulate PDE2 isoform has been shown to be phosphorylated and activated by PKC (77).

PDE3 isoenzymes are characterized by a high affinity and specificity for cAMP hydrolysis (16, 78). However, unlike the PDE4 isoenzymes, PDE3 enzymes are able to bind cGMP with high affinity. This occurs by the binding of cGMP to the PDE3 active site; thus cGMP acts as a potent inhibitor of the hydrolysis of cAMP by PDE3 enzymes. In this fashion PDE3 enzymes provide the corollary to PDE2 enzymes, where cGMP binding to the paired GAF domains elicits activation. Thus cross-talk between the NO pathway and PDE3 can be expected to cause an increase in cAMP levels owing to PDE3 inhibition (79). The PDE3A and PDE3B subfamilies are the products of two distinct but related genes (16, 80). There is no evidence for alternative mRNA splicing. However, it is considered that diverse forms can be generated by transcription initiation from different sites. Both membrane-bound and cytosolic forms of PDE3 are generated (81). Such membrane association appears to be due to at least two distinct N-terminal sites, allowing for integration of the protein through transmembrane helices (82). The use of selective inhibitors has demonstrated that these enzymes have key roles in regulating important metabolic processes, such as lipolysis and glycogenolysis, as well as cardiac contractility. A key facet of these enzymes is, however, their ability to be activated by insulin (16). This is of fundamental importance to the regulatory role that these enzymes perform in controlling lipolysis and glycogenolysis (16, 83). Thus PDE3B has been shown to mediate the inhibitory effects of insulin on lipolysis in adipocytes. To achieve this, insulin activates PDE3B via a pathway that involves IRS-1, PI3-kinase, and PDK, with final phosphorylation of PDE3B through PKB/Akt (84). This activation of PDE3B results in a fall in the level of cAMP, causing a reduction in the activity of PKA. This leads to the net dephosphorylation and inactivation of hormone-sensitive lipase (HSL) and a decrease in the hydrolysis of stored triglyceride.

The PDE4 isoenzymes will be discussed in more detail below.
However, it is pertinent to note that they also have two conserved regions (UCR1 and UCR2) at their extreme NH2 termini (17, 85) (Fig. 2). These are similarly placed, N-terminal to the catalytic unit, to the paired GAF domains found in PDE2, PDE5, PDE6, and PDE10 and the paired Ca2+/CaM-binding domains found in PDE1. As such, it seems (as will be developed later in this review) that they are poised to serve a regulatory function.

FIG. 2. The core components of PDE4 isoforms. This demonstrates a long isoform. (Schematic labels: isoform-specific N-terminal region, targeting and regulatory(?); UCR1; LR1; UCR2; LR2; catalytic unit; subfamily-specific C-terminal region, regulatory and targeting(?).)

There is currently much interest in the cGMP-specific PDE5 enzyme because this is the isoenzyme that is selectively inhibited by Viagra™ (73). The binding of cGMP to the N-terminal GAF domains of PDE5 appears to elicit a conformational change that allows phosphorylation and activation by PKG and PKA (86-89). It has also been suggested that PDE5 may interact with γ subunits, which are either akin or identical to those found in retinal rods and cones, and that these can also serve to prevent activation by PKA phosphorylation (90). The exact physiological role of this phosphorylation is unclear. However, such studies indicate that there is a series of controls able to determine the functioning of this enzyme. PDE5 function appears to play a pivotal role in controlling smooth muscle relaxation (73); it is therefore important to understand its control and regulation because this will have important consequences for the definition and treatment of a range of pathological conditions.

The cGMP-specific PDE6 isoenzymes play a pivotal role in visual signal transduction and are exclusively expressed in the rod and cone photoreceptors of the retina. PDE6 is a multisubunit protein consisting of two types of catalytic subunits (α and β) that form a complex with two molecules of the inhibitory γ subunit. The α and β subunits are called PDE6A and PDE6B, respectively, as they represent distinct gene products. Visual signal transduction is initiated by a photon-induced change in the conformation of rhodopsin that triggers the heterotrimeric G protein transducin to exchange GDP for GTP (91, 92). Upon binding GTP, transducin undergoes a conformational change that releases a GTP-bound activated transducin α subunit. This interacts with the PDE6 holoenzyme and causes the release of its inhibitory γ subunit. The activated PDE6 catalytic subunits rapidly hydrolyze cGMP, leading to the closure of cGMP-gated Ca2+ channels in the rod photoreceptor plasma membrane. This results in membrane hyperpolarization. The activity and kinetics of cGMP hydrolysis by PDE6 must be carefully regulated, and a contributory factor to this is believed to be the binding of cGMP to their NH2-terminal GAF domains (66). Indeed, the binding of cGMP to PDE6 is, seemingly, increased in the holoenzyme by a process believed to be dependent upon the presence of the inhibitory γ subunit (92-94). In addition, there is a dramatic increase (300-fold) in the affinity of PDE6 for its inhibitory γ subunit when cGMP is bound to the GAF domain (95).

The isoenzymes generated by PDE8 genes are cAMP-specific and are expressed in a number of tissues including testis, ovary, and intestine (96). Interestingly, these isoenzymes contain Per-ARNT-Sim (PAS) homology domains in their N-terminal regions (97, 98). PAS domains function as input modules in proteins that sense oxygen, redox potential, light, and some other stimuli. Specificity in sensing arises, in part, from different cofactors that may be associated with the PAS fold. Transduction of redox signals may be a common mechanistic theme in many different PAS domains. Human PAS proteins include hypoxia-inducible factors and voltage-sensitive ion channels, while other PAS proteins are integral components of circadian clocks. These domains have been shown to mediate the homo- and heterodimerization of a number of proteins, including the Drosophila proteins Per and Sim (99, 100). The association of the aryl hydrocarbon receptor (AhR) with the Ah receptor nuclear translocator (ARNT) is mediated by their PAS domains (101, 102). In addition, PAS domains are also believed to mediate the heterodimerization of White collar 1 and 2, which are involved in blue-light signal transduction in the fungus Neurospora crassa (103). A PAS domain has also been identified as playing a crucial role in the regulation of dimerization and DNA-binding specificity of the dioxin receptor (104). Therefore, PAS domains may be involved in protein-protein interactions that may determine either the dimerization of PDE8 or its interaction with other proteins.

It would thus appear that the N-terminal regions of PDE enzyme families are likely to play a key role in regulating the function of the catalytic unit.
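The isoenzyme nomenclature introduced at the start of this section (species, gene family, gene, splice variant, and an optional letter for the GenBank report) is regular enough to be split mechanically. The following is a minimal sketch only: the regular expression, the two-letter species prefix, and the names `PdeName` and `parse_pde_name` are assumptions made for illustration, not part of any official nomenclature tool.

```python
import re
from typing import NamedTuple, Optional


class PdeName(NamedTuple):
    """Components of a PDE isoenzyme name such as HSPDE4A1 (illustrative type)."""
    species: str           # e.g. "HS" for Homo sapiens
    family: int            # gene family number, e.g. 4 for PDE4
    gene: str              # gene letter within the family, e.g. "A"
    variant: int           # splice variant number, e.g. 1
    report: Optional[str]  # optional letter for the individual GenBank report


# Assumed pattern: two-letter species code, "PDE", family digits, one gene
# letter, splice variant digits, and an optional trailing report letter.
_PDE_NAME = re.compile(
    r"^(?P<species>[A-Z]{2})PDE(?P<family>\d{1,2})(?P<gene>[A-Z])"
    r"(?P<variant>\d+)(?P<report>[A-Za-z])?$"
)


def parse_pde_name(name: str) -> PdeName:
    """Split a name of the form <species>PDE<family><gene><variant>[report]."""
    match = _PDE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised PDE isoenzyme name: {name!r}")
    return PdeName(
        species=match["species"],
        family=int(match["family"]),
        gene=match["gene"],
        variant=int(match["variant"]),
        report=match["report"],  # None when no report letter is present
    )


if __name__ == "__main__":
    print(parse_pde_name("HSPDE4A1"))
```

Names that do not carry a species prefix, such as the shorthand PDE4D3 used in running text, are deliberately rejected by this sketch rather than guessed at.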

III. Cyclic Nucleotide Phosphodiesterases and Intracellular Targeting

Different PDE isoenzymes invariably show distinct patterns of intracellular localization (1). In many cases these differences have been attributed to the unique NH2-terminal regions of PDE isoenzymes (45, 46, 80, 105, 106). The targeting of PDE isoenzymes in the cell is likely to be a crucial aspect of their function, particularly in determining the compartmentalization of cAMP (1, 48). For instance, in olfactory neurones the splice variant PDE1C2 colocalizes with the adenylyl cyclase isoform, AC3, in the cilia that extend from the dendrites into the nasal epithelium (8). AC3 and PDE1C2 are believed to be responsible for the rapid, transient cAMP response to odorants that bind to the G protein-coupled receptors in the cilial plasma membrane. PDE1C2 and AC3 are not found in other parts of the neurone, such as the axons and the cell body. However, olfactory neurones express as yet undefined PDE4A isoforms that are found in the axons and the cell body but, fascinatingly, are completely excluded from the cilial plasma membrane. This PDE4A splice variant is likely to be involved in controlling cAMP signaling in the cell body of the neurone. The presence of two differentially targeted classes of PDE isoenzymes in one cell strongly implies the existence of a minimum of two separately regulated pools of cAMP in olfactory neurones.

There are two isoforms of PDE2 that differ simply by 5' domain swaps, thus generating two proteins that have distinct N-terminal regions (107). One of these is entirely cytosolic while the other is located within the cell particulate fraction, where it is, presumably, anchored by virtue of its N-terminal region. In hepatocytes, at least, this isoform would appear to be targeted to the cell plasma membrane (108), although the basis of both such targeting and membrane association remains to be defined. The functional relevance of having two PDE2 isoforms whose sole distinction is related to their intracellular disposition has yet to be determined. Nevertheless, it adds further support to the notion that intracellular targeting of PDE isoenzymes is of fundamental importance (1, 20, 52). It should be noted that PDE2 enzymes have GAF domains that bind cGMP and lead to stimulatory, positive homotropic effects on the functioning of the enzyme catalytic unit (15, 109). They thus provide a route whereby an increase in cGMP levels could serve to enhance the hydrolysis of both cGMP and cAMP. Thus both negative feedback and cross-talk can be achieved through this enzyme. As such, the ability to deploy such an activity in either the cytosol or the membrane fraction could provide fundamentally distinct controls.

PDE3 enzymes provide a clear example of the way in which the NH2-terminal regions of PDE isoenzymes can determine their intracellular localization (16, 78, 81, 82, 110).
These two isoenzymes contain a hydrophobic segment in their NH2-terminal regions that is predicted to form six transmembrane helices. This may explain why these isoenzymes are targeted to the membrane fractions of adipocytes and cardiac myocytes. Certainly, recombinant PDE3A1 and PDE3B are both associated with the particulate fractions of Sf9 cells (110). However, truncated species that lacked the NH2-terminal hydrophobic domain were found localized to the cytosolic fraction. The PDE3A2 isoenzyme has a very short NH2-terminal splice domain that lacks the hydrophobic region. As might be predicted if the N-terminal region of PDE3A1 is responsible for membrane targeting, recombinant PDE3A2 was found to be localized to the cytosolic fraction of Sf9 cells (110). Thus the NH2-terminal hydrophobic region appears to be responsible for targeting PDE3A1 and PDE3B to the membrane fraction of cells. This has been recently evaluated (82) in more detail by using chimeric species of PDE3A1 and green fluorescent protein (GFP). GFP is normally found as a soluble protein when expressed in mammalian cells, but when it is generated as a fusion protein with PDE3A1, it is clearly localized to the endoplasmic reticulum. Deletion constructs identified two regions within the N-terminal portion of PDE3A1 that were responsible for membrane association. One of these is clearly integrated into the membrane and may form a transmembrane structure. The other determinant probably interacts with membrane components via ionic bonds. This dual anchoring is not unusual in membrane proteins. For example, many G protein-linked receptors and their kinases have transmembrane stretches plus tethering of specific regions via palmitoylation (111, 112). Similarly, SRC is localized to membrane structures both through protein-protein interactions and through N-terminal acylation and C-terminal farnesylation (112).

In cells where PDE3 isoenzymes are expressed, then, their targeting to specific intracellular sites undoubtedly underpins compartmentalized cAMP signaling. It is possible that membrane-associated PDE3B is targeted to a region within the cell where it is in close proximity to an AKAP-anchored PKA and HSL. In this way, PDE3B could control a pool of cAMP that specifically regulates lipolysis via PKA and HSL. Undoubtedly, the specific intracellular localization of PDE3 and PDE4 isoenzymes explains why inhibitors selective for PDE3, and not PDE4, influence metabolic regulation by insulin of lipolysis in adipocytes and glycogenolysis in hepatocytes. In a similar fashion, while human cardiac myocytes contain both PDE3 and PDE4 enzymes, only PDE3-selective inhibitors serve to generate positive inotropic actions (16). This provides one of a growing number of examples demonstrating that the major cAMP-hydrolyzing enzymes in cells, namely PDE3 and PDE4, regulate distinct cellular processes. A key feature of such isoenzymes is that their expression is targeted to distinct intracellular regions; this undoubtedly plays a major role in underpinning such selectivity, especially in systems where they are expressed at similar levels of activity.
Thus it is important to identify and understand the molecular basis of intracellular targeting of PDE isoenzymes. Subsequently, it may prove possible to exploit such information to develop novel therapeutics that act by disrupting targeting of specific isoforms, thereby removing them from functionally relevant intracellular compartments.

Two PDE5 splice variants have been identified with distinct N-terminal regions (113, 114). However, it is not known whether these affect intracellular targeting. Interestingly, it has recently been suggested (115) that glutamic acid-rich proteins (GARPs) may allow PDE6A and B (α and β) to associate with a signaling complex in the outer segments of rod photoreceptor cells. The proteins GARP1 and GARP2 are tightly bound to the membrane fraction of rod outer segments, where they serve a scaffolding role to allow for the binding of PDE6, guanylate cyclase, and the ATP-binding cassette transporter (ABCR). One possibility is that the generation of such a complex may serve to increase the efficiency of signal transduction. However, the association of light-activated PDE6 with GARP actually appears to result in the inhibition of PDE6 activity. One interpretation of this is that the association of guanylate cyclase and PDE6 may prevent the unnecessary cycling of cGMP in rod photoreceptors that have been saturated by high light conditions (115).

Two PDE7 splice variants have been cloned (116, 117): PDE7A1, which is found in multiple tissues, and PDE7A2, which is expressed mainly in skeletal muscle and cardiac myocytes. These species have distinct N-terminal regions that affect their intracellular targeting. Thus the N-terminal region of PDE7A2 contains hydrophobic sequences, which may explain why it is found predominantly in the particulate fractions of cells. In contrast, the intracellular localization of PDE7A1 appears to differ, depending upon the source. Thus in fetal skeletal muscle PDE7A1 is predominantly cytosolic, whereas in brain it is found in both cytosolic and particulate fractions. As the N-terminal region of PDE7A1 contains a large number of polar and positively charged residues, it may be that these confer particulate association through interaction with particulate-associated proteins. In this case it could be that tissue-specific differences in the availability of such anchor proteins could underpin such observations. Of course, PDE7A1 may also interact with one or more targeting proteins that have distinct tissue-specific patterns of expression.

As yet, there is little information concerning the regulation and targeting of the newly discovered PDE8, PDE9, PDE10, PDE11, and PDE12 families (13). However, the PDE10 and PDE11 genes generate a variety of isoforms that differ in their N-terminal regions (67, 69, 113) that could, potentially, play a role in intracellular targeting. Certainly they appear to play a functional role in that the various PDE11 isoforms differ in their susceptibility to phosphorylation by PKA (118).
In the case of PDE8, while the function of its PAS domains may be to elicit homodimerization, if they were to allow interaction with other proteins, then this might provide a means for allowing the intracellular targeting of PDE8 isoenzymes. The association of PDE isoenzymes with scaffold proteins may thus turn out to be a recurrent theme in the PDE superfamily. There are clearly diverse mechanisms that achieve the regulation and intracellular targeting in the PDE superfamily as a whole.

IV. PDE4 cAMP-Specific Phosphodiesterases

Work in our laboratory has focused on PDE4 cAMP-specific phosphodiesterases. The considerable interest in these enzymes is based upon observations that PDE4-selective inhibitors can serve as potent antiinflammatory agents that are of potential therapeutic benefit for a wide range of disorders. These disease states include asthma, chronic obstructive pulmonary disease (COPD), rheumatoid arthritis (RA), and Crohn's disease. In addition, they have potential use as

MILES D. HOUSLAY

antidepressants (119-123) and cognitive enhancers (124, 125) as well as possible uses in the treatment of AIDS (126, 127) and chronic lymphocytic leukemia (CLL) (128). The utility and action of PDE4-selective inhibitors have been reviewed extensively by others, and thus form no part of this review, which focuses on PDE4 isoforms themselves (23-29, 31-33, 129-141). Understanding the functional significance of individual members of this enzyme family poses a considerable challenge, as at least 18 different isoforms are encoded by four PDE4 genes (4A, 4B, 4C, 4D). These various species are expressed in a cell type-specific fashion and show distinct activities, intracellular distribution, and modes of regulation. Nevertheless, the potent pharmacological effect of PDE4-selective inhibitors, in both physiological and cell-based systems, clearly demonstrates that the activity of these enzymes has a profound effect on cellular function. The PDE4A1 (RD1) isoenzyme was the first PDE to be cloned (142). This was done by Davis and collaborators (142) based upon their identification of the product of the D. melanogaster dunce gene encoding a cAMP-specific phosphodiesterase (143). The pivotal studies of Conti and coworkers (144) then showed that there were PDE4 isoenzymes, presumably generated by different genes, with additional complexity due to alternative mRNA splicing. Then Bolger and collaborators (85) clearly showed the presence of four distinct subfamilies of human PDE4 forms and their rodent equivalents. They also pointed out (85) the occurrence of two regions of homology that were located immediately N-terminal to the catalytic unit. These were called UCR1 and UCR2, for upstream conserved region 1 and 2, respectively. The 60-amino acid UCR1 and the 80-amino acid UCR2 are uniquely identified with enzymes of the PDE4 subfamily (17, 85) and are located in cognate positions to the paired calmodulin-binding domains and GEF domains found in other PDE forms (13).
The two splice junctions that generate active PDE4 isoforms are located such that the so-called long isoforms exhibit both UCR1 and UCR2. However, the short isoforms lack UCR1 and have either an intact UCR2 (short isoforms) (85) or an N-terminally truncated UCR2 (supershort isoforms) (145). The characterization of the PDE4A gene locus then demonstrated unequivocally that distinct exons encoded the unique N-terminal regions of long and short isoforms and authenticated the presumed locations of the 5' splice junctions that give rise to long and short forms (145).

V. PDE4 Gene Organization

There are four PDE4 genes (4A, 4B, 4C, 4D) distributed on three different chromosomes in humans. PDE4A is located at chr19p13.1 (145-147), PDE4C at chr19p13.2 (148), PDE4B at chr1 (149, 150), and PDE4D at chr5


(149, 150). These are large, complex genes of ~50 kb with 18 or more exons in each instance (145, 148). The murine PDE4A gene has also been characterized (151), and a partial characterization has been reported of the murine PDE4B and PDE4D genes (152). PDE4 genes generate so-called long, short, and supershort isoforms (Fig. 3). These various isoforms thus arise from alternative mRNA splicing and the use of distinct promoters. Table II lists the current status of identified PDE4 isoforms. No doubt others remain to be identified. In the case of the PDE4A gene, distinct 5' exons encode the unique N-terminal regions that describe each long isoform. The exon organization of the human PDE4A and PDE4C gene loci (145, 148) is shown in Fig. 4. These have been labeled 1-15, with the acknowledgment that the extreme 5' exon, namely exon 1, will be different for each particular long isoform. Such unique long-form 5' exons have been labeled as exon 1-x, where x is the isoform descriptor. Thus for PDE4A4 the specific 5' exon is designated exon 1-4A4. For short isoforms, such as PDE4B2 and PDE4D1, any unique 5' exon would be embedded between exon 5 and exon 6, whereas for supershort forms such as PDE4A1 (RD1) and PDE4D2, it would be located between exon 6 and exon 7.

[Figure 3 schematic: long forms have UCR1 + UCR2 ahead of the catalytic unit; short forms have an intact UCR2; supershort forms have a truncated UCR2.]

FIG. 3. The types of active PDE4 isoforms. Filled arrows indicate the splice junctions.

TABLE II
PDE4 ISOFORMSᵃ

Human           Clone                Rat             Clone                                   Type
4A1 (hRD1)      U97584               4A1 (RD1)       M26715, L27062                          Supershort
4A4B (pde46)    L20965               4A5 (rpde6)     L27057                                  Long
4A7 (2el)*      U18088               4A7             ?                                       Catalytically inactive
4A8             ?                    4A8 (rpde39)    L36467                                  Long
(TM3)           L20967 (partial)     (TM3)           ?                                       Long (putative)
4A10            AF073745             4A10            AF110461                                Long
4B1             L20966               4B1 (DPD)       J04563 (partial)                        Long
4B2             M97515, L20971       4B2             L27058                                  Short
4B3             U85048               4B3             U95748                                  Long
4B4             ?                    4B4             AF202733                                Long
4C1             Z46632               4C1             L27061, M25347, M28410 (all partial)    Long
4C2             U88712               4C2             ?                                       Long
4C3             U88713 (partial)     4C3             ?                                       Long
4D1             U50157               4D1             M25349, M28412                          Short
4D2             U50158, AF012074     4D2             U09456                                  Supershort
4D3             L20970, U50159       4D3             U09457                                  Long
4D4             L20969               4D4             AF031373                                Long
4D5             AF012073             4D5             ?                                       Long

ᵃ Cognate human and rat isoforms are listed. All members of the PDE4B and PDE4D family indicated have been expressed and characterized. The cognate human and rodent forms have been named similarly. PDE4C isoforms have yet to be fully characterized and demonstrated in native expression systems. The naming of PDE4A species is the source of some confusion. A number of PDE4A species were named before a unified nomenclature was adopted.

These forms lack UCR1, but because UCR2 is encoded by exons 6-8, short forms have an intact UCR2 while supershort forms have a truncated UCR2. UCR1 is also encoded by three exons, exons 2-4, and the common long-form splice junction is located immediately 5' to exon 2. Analyses of the human PDE4A and PDE4C genes (145, 148), as well as the rodent PDE4B and PDE4D genes (35, 152), clearly show that six exons encode the catalytic unit of PDE4 isoenzymes. The final exon, exon 15, encodes not only part of the enzyme catalytic unit but also the extreme C-terminal region found in active PDE4 isoforms. This region is completely different in each of the four PDE4 subfamilies. The functional significance, if any, of such differences is not understood. However, it has been usefully exploited to generate subfamily-specific antisera (46, 153-157). In each case they are able to detect all active members of any particular subfamily, as the sequence encoded by this exon is common to all members of a particular subfamily. The linker regions, LR1 and LR2, encode PDE4 subfamily-specific sequences that serve to connect UCR1 to UCR2 and UCR2 to the catalytic unit,

[Figure residue: Human PDE4A gene organization.]
G substitution in the anticodon resulted in a 30-fold decrease in kcat and a 20-fold increase in KM for glutamine (68). Likewise, an Arg341 → Ala substitution (in the anticodon-binding region of the protein) resulted in a nearly fourfold increase in the KM for glutamine, with no effect on kcat (68). The structural basis for this functional link was proposed to be a long two-stranded β-ribbon that extends from the two β-barrels of the anticodon-binding domain and packs against the active-site KMSKS motif (66). In the presence of cognate tRNAGln, this ribbon may transmit a signal to the active site domain, resulting in a productive conformation for catalysis.

TABLE III
KINETIC CONSTANTS FOR AMINOACYLATION OF tRNAGln VARIANTS BY E. coli GlnRS

tRNA              KM (μM)    kcat (sec^-1)    Relative aminoacylation (kcat/KM)
U1 (wild-type)    0.15       0.2              0.95
G1ᵃ               0.66       0.92             1.0
→U73              8.0        6.8 × 10^-3      6.0 × 10^-4
→G38              3.0        6.0 × 10^-2      1.4 × 10^-2
→U37              1.0        9.3 × 10^-3      6.6 × 10^-3
→A36              6.0        5.0 × 10^-3      5.9 × 10^-4
→C35              6.7        3.4 × 10^-4      3.6 × 10^-5
→A34              2.5        6.5 × 10^-4      1.9 × 10^-4

ᵃ Guanine substitution at the first position was a consequence of in vitro tRNA transcription using T7 RNA polymerase. From Ref. 67.
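
The relative aminoacylation column of Table III can be recomputed from the KM and kcat columns. The following Python sketch does so; the variable and function names are ours, and normalizing to the G1 transcript is an assumption inferred from its tabulated relative value of 1.0.

```python
# Kinetic constants transcribed from Table III: (KM in uM, kcat in 1/sec).
# Keys name the nucleotide present in each variant; they are labels of
# convenience, not the authors' notation.
table_iii = {
    "U1 (wild-type)": (0.15, 0.2),
    "G1": (0.66, 0.92),
    "U73": (8.0, 6.8e-3),
    "G38": (3.0, 6.0e-2),
    "U37": (1.0, 9.3e-3),
    "A36": (6.0, 5.0e-3),
    "C35": (6.7, 3.4e-4),
    "A34": (2.5, 6.5e-4),
}


def relative_specificity(data, reference="G1"):
    """Return kcat/KM of every entry divided by kcat/KM of the reference.

    Assumption: the reference is the G1 transcript, because Table III
    lists its relative aminoacylation as 1.0.
    """
    km_ref, kcat_ref = data[reference]
    ref_value = kcat_ref / km_ref
    return {name: (kcat / km) / ref_value for name, (km, kcat) in data.items()}


relative = relative_specificity(table_iii)
for name, value in relative.items():
    print(f"{name:15s} {value:.2g}")
```

With these inputs the recomputed ratios agree with the tabulated column to within rounding, which supports the column assignment used in the table as laid out here.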

REBECCA W. ALEXANDER AND PAUL SCHIMMEL

Functional determinants for aminoacylation of yeast tRNAAsp by AspRS are located principally at the discriminator base (G73) and the anticodon (GUC) (69). The structure of the yeast tRNAAsp:AspRS complex was determined (12, 70) and revealed that specific contacts are made between tRNA identity elements and AspRS. Nucleotide substitutions at the discriminator base and at the conserved core of tRNAAsp affected the KM of aminoacylation, likely due to removal of these required contacts. In contrast, substitutions at the anticodon were dominated by kinetic effects (69). For example, a G34C anticodon replacement increased KM only 4-fold while decreasing kcat 100-fold. Similar effects were observed at other anticodon positions (69). Clearly, nucleotides at the anticodon affect the orientation of the 3' end of tRNAAsp and its presentation in the active site of the enzyme. The cocrystal structure indeed revealed conformational changes in the tRNA upon enzyme binding (12). The acceptor end of tRNAAsp maintained a regular helical orientation (in contrast to the hairpin conformation of GlnRS-bound tRNAGln), although a modest change was observed at the three terminal base pairs. More significant was a distortion of the anticodon loop upon AspRS binding. The complexed tRNAAsp structure deviated from the free structure (71) beginning at base pair G30:U40, which seemed to act as a "hinge" point within the anticodon stem. In addition, the anticodon bases were unstacked upon complex formation. Recognition of these bases (three of the five identity elements within tRNAAsp) is provided by seven amino acids within the N-terminal domain of AspRS (70, 72). Further mutational data supported the notion of functional coupling between tRNA determinants, with substitutions at the discriminator and anticodon (G35/A73 tRNAAsp) resulting in cooperative losses in aminoacylation efficiency (Fig. 10) (73).
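
The cooperativity referred to here can be made quantitative from the kinetic constants tabulated in Fig. 10. The Python sketch below compares the activity loss of the G35/A73 double variant with the product of the two single-variant losses, which is what independent effects would give. The function name, the 25 °C temperature, and the conversion of the excess factor to a coupling free energy via RT ln are our assumptions for illustration, not part of the cited analysis.

```python
import math

R_KCAL = 1.987e-3      # gas constant in kcal/(mol*K)
TEMPERATURE_K = 298.15  # assumed 25 degrees C; not stated in the text

# (kcat x 10^3 per sec, KM in uM) as tabulated in Fig. 10.
fig10 = {
    "wt": (520.0, 0.05),
    "A73": (19.0, 0.29),
    "G35": (122.0, 0.22),
    "G35/A73": (0.024, 0.74),
}


def activity_loss(data, variant):
    """Fold loss in kcat/KM of a variant relative to wild type."""
    kcat_wt, km_wt = data["wt"]
    kcat, km = data[variant]
    return (kcat_wt / km_wt) / (kcat / km)


loss_a73 = activity_loss(fig10, "A73")
loss_g35 = activity_loss(fig10, "G35")
loss_double = activity_loss(fig10, "G35/A73")

# If the two sites acted independently, the losses would multiply.
expected_if_independent = loss_a73 * loss_g35
coupling_factor = loss_double / expected_if_independent
coupling_energy = R_KCAL * TEMPERATURE_K * math.log(coupling_factor)

print(f"A73 loss: {loss_a73:.0f}, G35 loss: {loss_g35:.0f}")
print(f"double variant loss: {loss_double:.3g}")
print(f"expected if independent: {expected_if_independent:.3g}")
print(f"coupling factor: {coupling_factor:.0f} (~{coupling_energy:.1f} kcal/mol)")
```

The single-variant losses reproduce the 160 and 19 listed in Fig. 10, while the double variant loses roughly two orders of magnitude more activity than independent effects would predict.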
Several crystal structures of Thermus thermophilus SerRS have been determined, in isolation and with various substrates or analogs. This dimeric class II enzyme has several novel features. For example, SerRS does not contact the anticodon of tRNASer. Instead, it recognizes the elongated variable arm through interaction with a helical motif at the enzyme's N terminus (74). Deletion of the N-terminal helical arm reduced the efficiency of aminoacylation by 4 orders of magnitude relative to the full-length enzyme (75). Substitution of the long variable arm of type II tRNASer with a shorter type I tRNA loop reduced aminoacylation by 3 orders of magnitude (76). A comparison of the uncomplexed synthetase with the SerRS:tRNASer complex (which also contained a nonhydrolyzable seryl adenylate analog) demonstrated that the helical arm is more flexible in the absence of cognate tRNA (77). Upon tRNA binding, this helical motif maximizes contacts with the variable arm and directs the acceptor stem into the active site for aminoacylation. The most dramatic structural change in SerRS upon tRNASer binding is a switch in the conformation of the motif 2 loop. Because SerRS binds only one

DOMAIN-DOMAIN COMMUNICATION

[Figure 10: cloverleaf structure of tRNAAsp with the A73 and G35 substitutions indicated, and the following kinetic data.]

tRNAAsp     kcat (×10^3/sec)    KM (μM)    Activity loss
wt          520                 0.05       1
A73         19                  0.29       160
G35         122                 0.22       19
G35/A73     0.024               0.74       320,000

FIG. 10. Aminoacylation determinants (circled) of tRNAAsp. Substitutions at G73 in the acceptor stem and U35 in the anticodon result in cooperative effects (69, 73).

molecule of tRNA across the dimer interface, only one monomer has tRNASer entering the active site, allowing direct comparisons between the tRNA-bound and unbound conformations. In the presence of ATP, or of the nonhydrolyzable adenylate analog, the loop adopts the previously observed "A conformation" (78), while tRNA binding induces a change to the "T conformation." These two orientations are mutually exclusive, stabilized by different interactions with the same set of conserved residues (77). In the absence of substrates, the motif 2 loop is disordered (79), demonstrating that different enzyme conformations are sampled as the aminoacylation reaction proceeds. Indeed, other structural studies also identified conformational changes in AARSs, particularly in the active site, at various stages in the aminoacylation reaction. As further examples, conformational changes in either tRNA or protein have been documented in complexes between T. thermophilus LysRS and E. coli or T. thermophilus tRNALys (80), E. coli ThrRS and tRNAThr (81), and E. coli AspRS and tRNAAsp (82), among others. Conformational changes were also predicted to occur upon adenylate formation in Bacillus stearothermophilus TrpRS (83).

The distortion of tRNAGln upon E. coli GlnRS binding, which has already been discussed (63, 66), was the model for a general evaluation of the conformational dynamics of complexes between tRNAs and AARSs (84). This study used a Gaussian network model (GNM) to predict flexibility and cooperative motions of both tRNA and AARS. In agreement with crystallographic data, GNM predictions identified the anticodon and acceptor stems as exhibiting the highest structural flexibility. Nucleotides in the D and T arms and variable loops were determined to play crucial roles in global hinge-bending motions (84). Furthermore, invariant nucleotides were most highly restricted, as were conserved protein residues of GlnRS. Regions of the enzyme that demonstrated significant mobility, in contrast, were involved in the recognition of substrates (84).

VI. Communication by Conformational Changes in tRNA Studied in Solution

Given the abundance of structural data amassed in recent years, there should be evidence in solution for conformational changes that occur within cognate complexes. Indeed structural and kinetic studies raised the possibility that, in many cases, tRNA bound to its cognate AARS adopts a conformation distinct from that of tRNA in isolation. In this connection, early temperature-jump experiments investigating the tRNASer-induced quenching of yeast SerRS fluorescence showed at least two relaxation processes. The results were consistent with a bimolecular reaction between the cognate partners and a conformational change of the complex (85). Fast kinetic studies extended these observations, comparing cognate and noncognate interactions for E. coli TyrRS, yeast SerRS, and E. coli and yeast PheRS (86, 87). Together, the results led to the suggestion that tRNA discrimination occurs in two steps. In the first step, an AARS scans through many possible protein-tRNA interactions, transiently binding even noncognate tRNAs. This scanning occurs at diffusion-limited rates. In the second step, specific contacts with the cognate tRNA result in conformational changes within the complex that trigger selective aminoacylation. A simplified view of this proposed mechanism would attribute KM contributions to the first step and kcat contributions to the second, as these two parameters account for the specificity of aminoacylation (42). The conformational flexibility of tRNAAsp was evident in I2-footprinting experiments in the presence and absence of AspRS (88). Substitution of determinant nucleotides within the anticodon reduced protections within and outside of the anticodon loop, emphasizing the interdependence of contacts within the protein-RNA complex (88). This interdependence may correlate with long-range communication between anticodon and acceptor arms of tRNAAsp,

particularly considering that substitutions in the anticodon affect primarily the kcat parameter of aminoacylation (69). Although a cocrystal structure is not available for the MetRS system, comparison of an early E. coli MetRS structure with the GlnRS structure predicted that tRNAMet would adopt a conformation similar to that of tRNAGln when in complex with its cognate enzyme (89). Thus, as the observed hairpin conformation of the tRNAGln acceptor stem was necessary for the 3' end to reach the catalytic site for aminoacylation, such a distortion could be expected for tRNAMet as well. An evaluation of tRNAMet-derived microhelices determined that engineered destabilization at the first position of the acceptor stem resulted in enhanced aminoacylation by MetRS (90). A microhelix lacking the 5'-terminal nucleotide was aminoacylated at a rate 16-fold higher than the wild-type microhelixMet substrate. This enhancement corresponded to a reduction of 1.6 kcal/mol in the apparent free-energy barrier for transition-state formation. An enhancement of aminoacylation by IleRS was also observed for the 5'-truncated minihelixIle construct (90). Although the engineered destabilization of the acceptor stem increased the rate of aminoacylation of both the microhelixMet and minihelixIle substrates, addition of an anticodon stem-loop construct did not further enhance aminoacylation. If a distortion at the 3' end of some tRNAs is a consequence of domain-domain communication, continuity of the tRNA backbone is essential.
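
The correspondence between a 16-fold rate enhancement and a 1.6 kcal/mol lower barrier follows from the transition-state relation ΔΔG‡ = RT ln(rate ratio). A minimal Python check is given below; the function name and the 25 °C temperature are assumptions made for illustration, since the experimental temperature is not given here.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def barrier_reduction(rate_enhancement, temperature_k=298.15):
    """Apparent lowering of the activation free energy (kcal/mol).

    Uses ddG = R * T * ln(k_variant / k_wild_type). The default
    temperature of 298.15 K is an assumption, not a value from the text.
    """
    return R_KCAL * temperature_k * math.log(rate_enhancement)


ddg = barrier_reduction(16.0)
print(f"{ddg:.2f} kcal/mol")
```

At the assumed temperature the result rounds to the 1.6 kcal/mol quoted above; the value scales linearly with the absolute temperature chosen.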

VII. Communication in Aminoacylation That Requires Covalent Continuity of the tRNA Demonstrated by Functional Analysis

A well-studied example is the aminoacylation by MetRS of RNAs that recapitulate the acceptor stem of elongator and initiator tRNAsMet. E. coli MetRS aminoacylates minihelices, microhelices, and duplexes with a catalytic efficiency that is reduced ~6 orders of magnitude relative to full-length tRNAMet (91, 92). Such aminoacylation is sequence-specific, with substitutions in the acceptor stem decreasing aminoacylation in ways quantitatively similar to the reductions seen in the full-length substrate (91). The decrease in catalytic efficiency (kcat/KM) is not primarily due to a binding defect (Fig. 11). Gel-electrophoresis binding studies (93) determined that the apparent dissociation constant (Kd) for the MetRS:microhelixfMet interaction was only ~20-fold higher than that for tRNAfMet. Other microhelix constructs (either microhelixfMet variants or microhelixAla substrates) bound with similar (or higher) affinities but were not methionylated by MetRS (48). Discrimination of the microhelixfMet substrate by MetRS therefore occurs in the transition state of catalysis.
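
The argument that binding cannot account for the loss in catalytic efficiency can be illustrated on a free-energy scale, using the Kd values and the relative kcat/KM listed in Fig. 11. The Python sketch below is only an illustration: the 25 °C temperature and the direct comparison of a binding free energy with an apparent transition-state free energy are our simplifying assumptions.

```python
import math

R_KCAL = 1.987e-3       # gas constant in kcal/(mol*K)
TEMPERATURE_K = 298.15  # assumed 25 degrees C; not stated in the text
RT = R_KCAL * TEMPERATURE_K

# Values listed in Fig. 11.
kd_trna_um = 0.51            # Kd of tRNAfMet for MetRS
kd_microhelix_um = 12.0      # Kd of microhelixfMet for MetRS
relative_efficiency = 3.4e-7  # microhelix kcat/KM relative to tRNAfMet

# Free-energy cost of the weaker binding of the microhelix.
binding_penalty = RT * math.log(kd_microhelix_um / kd_trna_um)

# Apparent free-energy cost of the full loss in kcat/KM.
total_penalty = RT * math.log(1.0 / relative_efficiency)

fraction_from_binding = binding_penalty / total_penalty

print(f"binding penalty: {binding_penalty:.1f} kcal/mol")
print(f"total penalty:   {total_penalty:.1f} kcal/mol")
print(f"fraction attributable to binding: {fraction_from_binding:.0%}")
```

Under these assumptions, weaker binding accounts for only about a fifth of the apparent penalty, consistent with the conclusion that discrimination arises mainly in the transition state.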

[Figure 11: secondary structures of tRNAfMet, microhelixfMet, and the anticodon stem-loop fMet, with the following binding and kinetic data.]

Substrate                    Kd (μM)        Relative aminoacylation rate (kcat/KM)
tRNAfMet                     0.51 ± 0.14    1
MicrohelixfMet               12 ± 3         3.4 × 10^-7
G72, G73 microhelixfMet      12 ± 1         0.25 × 10^-7
Anticodon SL fMet            22 ± 6         nd

FIG. 11. Stem-loop helices that recapitulate the arms of tRNAfMet. Microhelix mimics of the tRNAfMet acceptor arm are bound by MetRS and yet are aminoacylated at a low rate. An anticodon stem-loop hairpin is also bound by the enzyme but does not enhance microhelix aminoacylation (48). Domain-domain communication is dependent in this case upon covalent continuity of the tRNA backbone.

Binding of an isolated anticodon stem-loop hairpin to MetRS was also observed by affinity coelectrophoresis, with a Kd only 2-fold weaker than that of the microhelix (48). The anticodon stem-loop mimic bound MetRS in a manner similar to its binding in the context of the full tRNAfMet, such that it was a competitive inhibitor of the tRNAfMet aminoacylation reaction (94). As stated earlier, addition of this anticodon stem-loop to the microhelixfMet aminoacylation reaction did not increase the efficiency of microhelixfMet charging (90). In some cases a slight increase in the rate of minihelix aminoacylation has been reported upon the addition of an anticodon fragment (55, 95), although the efficiency remained well below that of the corresponding full-length tRNA. Together, these results further suggested that efficient aminoacylation requires, at a minimum, communication that depends on covalent continuity between the acceptor stem and anticodon portions of tRNA.

VIII. Domain Communication in Editing

The ability of an AARS to discriminate between cognate and noncognate amino acids is limited primarily to binding interactions, and is more difficult when two substrates have similar structures. For example, valine differs from isoleucine by a single methylene group, while threonine and valine are isosteric. IleRS and ValRS, respectively, differentiate these noncognate from cognate amino acids using editing functions that are distinct from their aminoacylation activities. A "double-sieve" mechanism is thought to ensure amino acid selectivity (96). In the first sieve, amino acids larger than the cognate substrate are excluded from the catalytic active site, while smaller noncognate amino acids bind and are then hydrolyzed at a second active site for editing. This hydrolysis occurs either before (pretransfer) or after (posttransfer) attachment of the noncognate amino acid to tRNA (97). In the case of E. coli IleRS, misactivation of valine occurs at a rate approximately 1/180 that of isoleucine activation (36), but the tRNAIle-dependent editing reaction ensures that misincorporation of valine at isoleucine codons occurs with a frequency of less than 1 in 3000 (98).

IleRS·(Val-AMP) + tRNAIle → IleRS·(Val-AMP)·tRNAIle → IleRS·(Val-tRNAIle) + AMP
                                  ↓ Pretransfer editing          ↓ Posttransfer editing
                            IleRS + Val + AMP + tRNAIle     IleRS + Val + AMP + tRNAIle

Biochemical and genetic studies showed that aminoacylation and editing functions of IleRS are contributed by distinct domains (35, 36, 99). A single mutation (Gly56 → Ala) in the IleRS catalytic site decreased discrimination for isoleucine over valine in the amino acid activation step (36). However,

posttransfer editing of Val-tRNAIle was unaffected by this mutation within the catalytic core. The location of the editing activity was identified by crosslinking a reactive analog of valine-misacylated tRNAIle (N-bromoacetyl-Val-tRNAIle) to IleRS (35). The misacylated analog crosslinked to connective polypeptide 1 (CP1, Fig. 4), an insertion that splits the catalytic domain between the third and fourth β-strands of the Rossmann fold (35, 100). In contrast, the reactive Ile-tRNAIle analog crosslinked only to the active site. The structural and functional independence of the editing site was demonstrated by expression of the isolated CP1 domain, which efficiently deacylated Val-tRNAIle (99). Mutation or deletion of conserved residues within the CP1 domain severely diminished the editing activity of IleRS (34, 101-103). High-resolution crystal structures of IleRS from T. thermophilus (101) and Staphylococcus aureus (104) demonstrated that the structural domains are physically separate. The structure of T. thermophilus IleRS showed isoleucine bound to the conserved active site domain within the Rossmann fold. Valine bound to both the active site (for aminoacylation) and a second site within CP1. The editing and aminoacylation active sites are at least 25 Å apart (101, 104), necessitating long-range communication between the active site and editing domains for efficient amino acid discrimination. In parallel with the distinct domains responsible for aminoacylation and editing activities of IleRS, RNA determinants for the two functions are separate. The anticodon of tRNAIle is a major identity element for aminoacylation by IleRS (51, 53). Small RNA substrates (minihelices and microhelices) of tRNAIle are aminoacylated (albeit inefficiently) in a sequence-specific manner. Thus, determinants for isoleucinylation are also contained in the acceptor stem (55, 90). In contrast, nucleotides in the D loop of tRNAIle trigger the editing reaction of IleRS (Fig. 12) (105-107).
Replacement of G16, D20, and G21 with their tRNAVal counterparts abolished the editing response in the presence of valine and tRNAIle. However, these substitutions had no effect on aminoacylation with isoleucine (105). Each of the three nucleotides contributed to the editing response, because any substitution at these positions adversely affected the editing activity (106). Substitutions in the D loop of tRNAIle that affect editing do not decrease binding to IleRS, as determined by gel retardation assays (106). Indeed, the D loop is thought not to make contact with IleRS, as nucleotides essential for editing are not protected from chemical modification in the presence of the enzyme (53). A conformational change in tRNAIle, mediated by nucleotides in the D loop, may be responsible for inducing editing by IleRS. Furthermore, this conformational change must involve a form of domain-domain communication. For example, when a minihelix is mixed with the D loop-containing domain of tRNAIle, no editing of misactivated adenylate occurs. Thus, covalent continuity of the tRNA is required for the domain-domain communication.


FIG. 12. Aminoacylation and editing determinants of tRNAIle. Nucleotides critical for aminoacylation by IleRS are circled (53), while those required for IleRS editing activity are boxed (105).

The crystal structure of S. aureus IleRS with bound tRNAIle (and the antibiotic mupirocin) revealed an editing complex structurally distinct from the expected catalytic conformation. Although the 3' end of tRNAIle was disordered, the remaining structure dictated that the acceptor terminus be an A-form helix that makes a direct path into the editing active site (104). In contrast, comparison of the structures of uncomplexed IleRS (101) and a tRNAGln:GlnRS complex (63) predicted that the 3' end of tRNAIle must be distorted to access the catalytic active site for aminoacylation (Fig. 13). Other tRNAIle:IleRS contacts were different from those predicted to occur in the catalytic complex. For example, within the class I-conserved K595MSKS peptide, the backbone amide of Lys 595 and the backbone carbonyl of Gly 593 bound to the tRNA backbone, stabilizing the extended conformation of the 3' end (104). The KMSKS sequence of the ligand-free IleRS was in a different conformation, suggesting that tRNA binding induced the structural change. Other evidence suggested a tRNA-induced conformational change in IleRS leading to editing. A DNA aptamer able to trigger hydrolysis of Val-AMP demonstrated that editing could be achieved independently of aminoacylation (108). The DNA aptamer lacked the terminal 2'-OH necessary for aminoacylation by IleRS and previously thought to be required as a "tentative acceptor" in the editing reaction (109, 110). The aptamer could not be folded into a tRNA-like structure and bore no particular sequence similarity to tRNAIle (108). A reactive aptamer crosslinked to the CP1 region of IleRS (111).

[Figure 13 panels: synthetic mode and editing mode, shown for a polymerase and for a synthetase.]

FIG. 13. Translocation between catalytic and editing sites in two systems. The 3' end of DNA lies in the polymerase active site for replication and shifts to the exonuclease site for removal of misincorporated nucleotides. Similarly, the 3' end of tRNA translocates between the IleRS catalytic and editing sites. However, it is not clear whether this mechanism can explain pretransfer editing, where the misactivated adenylate shuttles from the active site to the editing site. Reprinted with permission from Ref. 104. Copyright 1999 American Association for the Advancement of Science.

Editing of misactivated amino acid (whether isolated or attached to tRNA) requires two steps: translocation to the editing site and chemical hydrolysis. Energy transfer experiments using a fluorescent analog of ATP demonstrated that translocation is the rate-limiting step for editing (103). Furthermore, D-loop nucleotides G16, D20, and G21 are important for translocation. Substitution of these nucleotides within the context of tRNAIle did not affect the hydrolysis


of misacylated Val-tRNAIle (106). A minihelixIle (which lacked the D-loop determinants for editing) was misacylated with valine because of a defect in the translocation step (107). The efficiency of the DNA aptamer described above may lie, therefore, in its ability to trigger the translocation step of pretransfer editing and properly position the adenylate for enzyme-catalyzed hydrolysis.

IX. Role of Induced Fit

Together, the conformational changes observed in both AARS and tRNA components of the aminoacylation and editing reactions suggest that domain-domain communication proceeds through an induced-fit mechanism. As proposed by Koshland (112, 113), induced fit occurs when binding of a substrate causes a conformational change in the enzyme to align the catalytic groups properly. In RNA-protein complexes, structural changes are often seen in both protein and RNA components upon binding (114). For example, the U1A protein regulates its polyadenylation by binding a loop in the 3'-untranslated region (UTR) of its mRNA. Upon binding, the C-terminal helix of U1A swings away from the face of the protein to allow close contacts with the mRNA loop. The RNA loop also shows altered nucleotide stacking interactions, demonstrating the mutually induced fit of the complex (115). Ribosomal proteins also have been shown to bind RNA (either rRNA or their own mRNA) through a mutually induced fit mechanism (114). Similarly, most conformational changes observed upon AARS:tRNA complex formation occur in both protein and RNA components. In addition to the examples cited above, several further cases can be cited. These include examples from structural analysis as well as investigations in solution. A comparison of the tRNAPro-bound ProRS (from T. thermophilus) with its unbound counterpart indicated conformational flexibility in the isolated enzyme (116). Most significant was a hinge movement of the anticodon-binding domain in relation to the catalytic domain. Also, several loops near the active site were less tightly constrained in the absence of small substrates (either prolyl adenylate or a nonhydrolyzable analog). In contrast, the β-sheet making up the catalytic core of the enzyme was rigid even in the isolated enzyme, as was the C-terminal zinc-binding domain (116).
The tRNAPro in complex with ProRS was bound in a noncatalytic orientation, as the acceptor end was disordered and not in the enzyme active site (116). The only enzyme-tRNA contact was at the anticodon, which was distorted relative to the structure of uncomplexed tRNAPhe. An active ProRS:tRNAPro complex was modeled based on the crystal structure of the closely related class IIa ThrRS:tRNAThr complex from E. coli (81). In order for the acceptor stem of tRNAPro to reach the ProRS active site, a significant change in the tRNA

342


t R N A pr° FIG. 14. Structural comparison of tRNATM and tRNArr° in enzyme-bound conformations. Top: tRNATM superimposed on Thermus thermophilus ProRS:tRNArr°:ProAMS ternary complex (116). The two monomers of ProRS are outlined in gray and black, and the tRNAl'r° backbone is black. ProAMS, a nonhydrolyzable prolyl adenylate analog, is located in the enzyme active site. The position of tRNATM (gray) is that from its complex with E. coli ThrRS (81), with the catalytic domains of ThrRS (not shown) and ProRS aligned. The ThrRS:tRNATM complex is in a catalytically productive conformation, while the 3~ end of tRNAl'r° (black) is disordered and would require a significant conformational change to reach the enzyme active site. Bottom: The overall conformations of enzyme-bound tRNATM and tRNAl'r° are similar, except at the anticodon loops. Figure kindly provided by Dr. Stephen Cusack.

conformation was demonstrated to be necessary (Fig. 14). This may be the result of a reorientation of the anticodon-binding domain relative to the catalytic domain or of other protein-induced changes in the tRNA Pro structure, perhalps upon adenylate formation (116). Thus the catalytically active ProRS:tRNAr~° complex differs structurally from its isolated components. The recent structure of the ternary yeast ArgRS:tRNAarg:L-arginine complex provides further insight into conformational changes leading to domain-domain communication (117). Comparison with the "tRNA-free" yeast ArgRS structure (118) revealed significant distortions in the anticodon loop and acceptor end



of tRNAArg, as has been seen for other enzyme-bound tRNAs (63, 81, 116). Recognition of the anticodon loop by ArgRS involves formation of a bulge at A38, intercalation of A37 into the last base pair of the anticodon stem, and the splaying out of anticodon bases U33, I34, and C35. Conserved residues in the ArgRS anticodon-binding domain stabilize this conformation of the loop. For example, all ArgRS sequences end in methionine, which is Met 607 in the yeast enzyme. This methionine interacts with G36 and A38 in the distorted conformation of the loop. The 3' end of tRNAArg forms a hairpin structure to access the catalytic site of the enzyme, similar to that previously observed for enzyme-bound tRNAGln (63). However, the molecular mechanisms used are different in each case. The terminal base pair (Ψ1:A72) of tRNAArg remains paired, and the hairpin is stabilized by enzyme interactions with C74 and A76.

Communication between the anticodon-binding and catalytic domains of yeast ArgRS is mediated through conformational changes in two helices (H15 and H6) that link the domains. A helix (H15) encompassing Phe 417 to Lys 435 forms one side of the pocket recognizing G36 and G38, and is conformationally altered upon tRNA binding. Structural changes in helix H15 induce changes in the class I HIGH and KMSKS signature peptides (in yeast ArgRS the corresponding residues are H159AHG and M408STR). Thus, upon tRNA binding, the MSTR loop flips and the helix (H6) containing HAHG moves to produce a more open active site. This may be the structural basis for the requirement of ArgRS for tRNAArg binding prior to aminoacyl adenylate formation. Furthermore, comparison of the binary complex (lacking arginine) with the ternary complex revealed that amino acid binding triggers the proper orientation of the tRNA CCA end.
When arginine is bound, the conserved Tyr 347 interacts with both the amino acid and the adenine ring of A76, continuing the stacking interaction of A76 and C75 in the hairpin conformation. In the absence of arginine, Tyr 347 is hydrogen-bonded to the carbonyl of Trp 192 and does not contact A76. In the binary complex, therefore, the tRNA CCA end is disordered, and the position of G73 suggests that the acceptor stem maintains its helical conformation rather than forming the productive hairpin structure. Thus, both tRNAArg and arginine binding induce conformational changes that result in a catalytically competent conformation of ArgRS. These changes are mediated by the long helix H15 that serves as a structural link between anticodon-binding and active-site domains, and by conserved residues in the active site that are conformationally flexible (117).

Experimental evidence suggests that some substrate binding energy may be used to disrupt the AARS and/or tRNA conformation, as predicted by Fersht (97). For example, comparison of the crystal structures of ligand-free B. stearothermophilus TyrRS and its tyrosyl adenylate complex revealed a conformational change in the enzyme active site upon adenylate formation (119).


REBECCA W. ALEXANDER AND PAUL SCHIMMEL

Kinetic studies of TyrRS mutants evaluated the energy levels of the reaction intermediates to predict how particular residues interact with components of the activation step (tyrosine, ATP, the transition state, adenylate-PPi, and adenylate) (120). This analysis revealed that Lys 82, Arg 86, Lys 230, and Lys 233 interacted with the transition state of activation, despite their distance from the active site in the crystal structure of the free enzyme. Indeed, these residues were localized to two flexible loops in the catalytic domain (119). In an induced-fit mechanism, the loops were proposed to wrap around the pyrophosphate portion of ATP upon tyrosine binding, then open again once the adenylate was formed (120).

Binding energies for the acceptor helix and anticodon arms of tRNAfMet in complex with MetRS have been determined based on affinity coelectrophoresis analysis (93). The sum of free energies of binding for the two arms was much higher (by 7 kcal/mol) than for the full-length tRNAfMet. Much of this difference may be the cost associated with distorting the tRNA upon binding to MetRS. Furthermore, binding and activation energies were compared for microhelixfMet and tRNAfMet. Whereas the difference in apparent free energy of activation was calculated to be 9.2 kcal/mol, binding differed by only 1.9 kcal/mol (48). For the more active tRNA substrate, then, some or all of the energy cost associated with strain of the tRNA upon binding may result in a conformation of the enzyme:tRNA complex that more closely resembles the transition state of catalysis. This may be manifest in the elevated kcat for aminoacylation of tRNAfMet (compared to the microhelix) because of the reduced activation energy barrier (48). A portion of this reduction was achieved by engineering a destabilization in the acceptor stem of microhelixfMet (90).
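
To give a feel for the size of the free-energy differences quoted above, the following minimal sketch converts a free-energy difference into the corresponding ratio of rate or equilibrium constants using ratio = exp(ΔΔG/RT). The function name fold_change and the temperature of 298.15 K are assumptions made for illustration only; the experimental temperature is not stated here, so the printed ratios are order-of-magnitude guides rather than reported values.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def fold_change(ddg_kcal_per_mol, temperature_k=298.15):
    """Ratio of rate (or equilibrium) constants implied by a free-energy
    difference, using ratio = exp(ddG / RT).

    The default temperature of 298.15 K is an assumption for illustration;
    it is not taken from the studies discussed in the text.
    """
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temperature_k))


if __name__ == "__main__":
    # Free-energy differences quoted in the text for tRNAfMet vs. microhelixfMet.
    for label, ddg in (("activation", 9.2), ("binding", 1.9)):
        print(f"{label}: ddG = {ddg} kcal/mol -> about {fold_change(ddg):.3g}-fold")
```

Under these assumptions the 9.2 kcal/mol difference in activation corresponds to a ratio in the millions, whereas the 1.9 kcal/mol difference in binding corresponds to a ratio in the tens, which illustrates why the text attributes discrimination mainly to the catalytic step rather than to binding.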
Additionally, mutations of MetRS in two of the helices known to be involved in anticodon binding showed that induced fit of tRNAfMet could be interrupted without affecting adenylate formation, microhelix aminoacylation, or anticodon binding (50). Alanine substitutions at Asn 387 and Asn 452 resulted in a variant enzyme with significantly decreased binding and aminoacylation of tRNAfMet. Yet the microhelix and anticodon arms in isolation bound the double mutant with affinities approximately equivalent to those with the wild-type enzyme, as evidenced by identical microhelix aminoacylation rates and inhibition of tRNAfMet aminoacylation by anticodon stem-loop helices, respectively (50). Thus, substitutions of amino acids in a domain distinct from the catalytic core reduced aminoacylation because the binding energies of the individual tRNA arms could not be converted into the conformational change necessary to produce the active AARS:tRNA complex.

As described above, residues in the C-terminal half of E. coli MetRS are responsible for recognition of the CAU anticodon of tRNAfMet. A genetic study showed that several residues in the anticodon-recognizing helix-loop peptide (Lys 439 to Gly 468) could be substituted with little effect on enzyme activity, while a small number of residues (Asn 452, Arg 453, Pro 460, Trp 461, and Lys 465) were invariant (121, 122). Molecular dynamics simulations carried out



on both active and inactive variants of the peptide demonstrated that inactivity correlated with increased flexibility of the peptide (49). This result suggested that the difference between inactive and active variants of the enzyme (in some cases differing by a single residue) was the result of the increased energy required to constrain the RNA-binding region of the protein. If indeed induced fit is utilized in AARS:tRNA complex formation, fixation of the residues at the tRNA binding site may reduce the entropic cost of the induced-fit mechanism.

Induced fit has been demonstrated for other tRNA-binding enzymes. Methionyl-tRNA formyltransferase (MTF) is essential for generating the formylated version of Met-tRNAfMet recognized as the initiator tRNA in bacterial and organellar protein synthesis (123). The crystal structure of E. coli MTF revealed its domain organization, with an N-terminal domain highly homologous to glycinamide ribonucleotide formyltransferase, another formylating enzyme. The C-terminal domain of MTF was proposed to make nonspecific contacts with tRNAfMet. As with some AARSs, the two domains of MTF are linked by a peptide loop (124). The peptide was flexible enough to be disordered in the crystal structure, and was susceptible to protease cleavage in the absence of tRNA. In contrast, formation of an active complex between MTF and tRNAfMet protected the peptide loop from cleavage (125). Protection did not occur when the complex contained mutations (in either MTF or tRNAfMet) known to inhibit formylation. Apparent dissociation constants of the variant MTF:tRNAfMet complexes were largely unchanged relative to the interaction between wild-type components, demonstrating again that specificity is achieved at the catalytic (rather than binding) step. A cocrystal structure of the complex between MTF and fMet-tRNAfMet revealed that tRNA binding did indeed cause a conformational change in the peptide linker (126).
In the case of MTF, therefore, catalysis is achieved through an induced-fit mechanism in which tRNA is the activating substrate. Binding of AA-tRNA has also been proposed to trigger a conformational change in the decoding center of 16S ribosomal RNA, in an induced-fit mechanism for substrate selection at the ribosome (127).

X. Conclusion

One of the key early questions in the investigation of the aminoacylation reaction was the nature of the tRNA identity elements recognized by individual AARSs. Much has been worked out in this area through genetic, chemical, and kinetic studies (45, 128). Strides have also been made in structure determinations, such that ligand-free and liganded AARSs can be compared, as can isofunctional enzymes from different organisms. One outstanding challenge is a detailed description of the molecular mechanism of aminoacylation and editing. This will require further structural studies, including descriptions of enzymes



in the presence of substrates, transition-state analogs, and inhibitors. But it is important also to revisit some of the early kinetic work that sought to understand tRNA discrimination on a fast time scale (85-87). With the tools and techniques now available, comparisons of structural changes occurring in the presence of cognate, noncognate, and near-cognate tRNAs (including tRNA pieces) may further illuminate the basis for domain-domain communication in AARSs. Finally, because these enzymes are intimately linked to the origin of life, the evolution of tRNA synthetase structure remains an important problem for the future. This evolution is inseparable from consideration of the evolution of tRNA and the full development of the universal genetic code. During this long evolution, domain-domain communication played an essential role.

ACKNOWLEDGMENTS

This work was supported by grants GM 15539 and 23562 from the National Institutes of Health and by a fellowship from the National Foundation for Cancer Research.

REFERENCES

1. P. R. Schimmel and D. Söll, Annu. Rev. Biochem. 48, 601 (1979).
2. D. Söll and P. Schimmel, Enzymes 10, 489 (1974).
3. A. Rich, in "Horizons in Biochemistry" (M. Kasha and B. Pullman, eds.), p. 103. Academic Press, New York, 1962.
4. C. R. Woese, in "Organization and Control of Prokaryotic and Eukaryotic Cells: 20th Symposium of the Society for General Microbiology" (H. P. Charles and B. C. J. G. Knight, eds.), p. 39. Cambridge University Press, London, 1970.
5. G. M. Nagel and R. F. Doolittle, J. Mol. Evol. 40, 487 (1995).
6. T. Webster, H. Tsai, M. Kula, G. A. Mackie, and P. Schimmel, Science 226, 1315 (1984).
7. S. W. Ludmerer and P. Schimmel, J. Biol. Chem. 262, 10801 (1987).
8. G. Eriani, M. Delarue, O. Poch, J. Gangloff, and D. Moras, Nature (London) 347, 203 (1990).
9. S. Cusack, C. Berthet-Colominas, M. Hartlein, N. Nassar, and R. Leberman, Nature (London) 347, 249 (1990).
10. S. T. Rao and M. G. Rossmann, J. Mol. Biol. 76, 241 (1973).
11. C. Hountondji, F. Lederer, P. Dessen, and S. Blanquet, Biochemistry 25, 16 (1986).
12. M. Ruff, S. Krishnaswamy, M. Boeglin, A. Poterszman, A. Mitschler, A. Podjarny, B. Rees, J. C. Thierry, and D. Moras, Science 252, 1682 (1991).
13. M. Ibba, S. Morgan, A. W. Curnow, D. R. Pridmore, U. C. Vothknecht, W. Gardner, W. Lin, C. R. Woese, and D. Söll, Science 278, 1119 (1997).
14. P. Schimmel, R. Giegé, D. Moras, and S. Yokoyama, Proc. Natl. Acad. Sci. U.S.A. 90, 8763 (1993).
15. M. Delarue and D. Moras, Bioessays 15, 675 (1993).
16. S. D. Putney, R. T. Sauer, and P. R. Schimmel, J. Biol. Chem. 256, 198 (1981).
17. M. Jasin, L. Regan, and P. Schimmel, Nature (London) 306, 441 (1983).

18. C. Ho, M. Jasin, and P. Schimmel, Science 229, 389 (1985).
19. S. J. Park and P. Schimmel, J. Biol. Chem. 263, 16527 (1988).
20. Y. M. Hou and P. Schimmel, Nature (London) 333, 140 (1988).
21. W. H. McClain and K. Foss, Science 240, 793 (1988).
22. C. Francklyn and P. Schimmel, Nature (London) 337, 478 (1989).
23. S. J. Park, Y. M. Hou, and P. Schimmel, Biochemistry 28, 2740 (1989).
24. K. Musier-Forsyth, N. Usman, S. Scaringe, J. Doudna, R. Green, and P. Schimmel, Science 253, 784 (1991).
25. D. D. Buechter and P. Schimmel, Biochemistry 32, 5267 (1993).
26. N. Y. Sardesai and P. Schimmel, J. Amer. Chem. Soc. 120, 3269 (1998).
27. L. Ribas de Pouplana, D. Buechter, N. Y. Sardesai, and P. Schimmel, EMBO J. 17, 5449 (1998).
28. L. Regan, J. Bowie, and P. Schimmel, Science 235, 1651 (1987).
29. Y. M. Hou and P. Schimmel, Biochemistry 31, 10310 (1992).
30. M. Jasin, L. Regan, and P. Schimmel, J. Biol. Chem. 260, 2226 (1985).
31. S. Kim and P. Schimmel, J. Biol. Chem. 267, 15563 (1992).
32. S. Kim, J. A. Landro, A. J. Gale, and P. Schimmel, Biochemistry 32, 13026 (1993).
33. A. Shepard, K. Shiba, and P. Schimmel, Proc. Natl. Acad. Sci. U.S.A. 89, 9964 (1992).
34. T. L. Hendrickson, T. K. Nomanbhoy, and P. Schimmel, Biochemistry 39, 8180 (2000).
35. E. Schmidt and P. Schimmel, Biochemistry 34, 11264 (1995).
36. E. Schmidt and P. Schimmel, Science 264, 265 (1994).
37. J. J. Burbaum and P. Schimmel, Biochemistry 30, 319 (1991).
38. K. Shiba and P. Schimmel, Proc. Natl. Acad. Sci. U.S.A. 89, 1880 (1992).
39. K. Shiba and P. Schimmel, J. Biol. Chem. 267, 22703 (1992).
40. S. Blanquet, M. Iwatsubo, and J. P. Waller, Eur. J. Biochem. 36, 213 (1973).
41. T. Meinnel, Y. Mechulam, D. Le Corre, M. Panvert, S. Blanquet, and G. Fayat, Proc. Natl. Acad. Sci. U.S.A. 88, 291 (1991).
42. J. P. Ebel, R. Giegé, J. Bonnet, D. Kern, N. Befort, C. Bollack, F. Fasiolo, J. Gangloff, and G. Dirheimer, Biochimie 55, 547 (1973).
43. M. Jasin, L. Regan, and P. Schimmel, Cell (Cambridge, Mass.) 36, 1089 (1984).
44. G. Ghosh, H. Pelka, and L. H. Schulman, Biochemistry 29, 2220 (1990).
45. L. H. Schulman, Prog. Nucleic Acid Res. Mol. Biol. 41, 23 (1991).
46. L. H. Schulman and H. Pelka, Science 246, 1595 (1989).
47. H. Y. Kim, H. Pelka, S. Brunie, and L. H. Schulman, Biochemistry 32, 10506 (1993).
48. A. J. Gale, J. P. Shi, and P. Schimmel, Biochemistry 35, 608 (1996).
49. L. Ribas de Pouplana, D. S. Auld, S. Kim, and P. Schimmel, Biochemistry 35, 8095 (1996).
50. R. W. Alexander and P. Schimmel, Biochemistry 38, 16359 (1999).
51. T. Muramatsu, K. Nishikawa, F. Nemoto, Y. Kuchino, S. Nishimura, T. Miyazawa, and S. Yokoyama, Nature (London) 336, 179 (1988).
52. L. Pallanck and L. H. Schulman, Proc. Natl. Acad. Sci. U.S.A. 88, 3872 (1991).
53. O. Nureki, T. Niimi, T. Muramatsu, H. Kanno, T. Kohno, C. Florentz, R. Giegé, and S. Yokoyama, J. Mol. Biol. 236, 710 (1994).
54. T. Muramatsu, S. Yokoyama, N. Horie, A. Matsuda, T. Ueda, Z. Yamaizumi, Y. Kuchino, S. Nishimura, and T. Miyazawa, J. Biol. Chem. 263, 9261 (1988).
55. O. Nureki, T. Niimi, Y. Muto, H. Kanno, T. Kohno, T. Muramatsu, G. Kawai, T. Miyazawa, R. Giegé, C. Florentz, and S. Yokoyama, in "The Translational Apparatus" (K. H. Nierhaus, F. Franceschi, A. R. Subramanian, V. A. Erdmann, and B. Wittmann-Liebold, eds.), p. 59. Plenum Press, New York, 1993.
56. T. A. Kleeman, D. Wei, K. L. Simpson, and E. A. First, J. Biol. Chem. 272, 14420 (1997).

57. B. A. Steer and P. Schimmel, J. Biol. Chem. 274, 35601 (1999).
58. H. Himeno, T. Hasegawa, T. Ueda, K. Watanabe, and M. Shimizu, Nucleic Acids Res. 18, 6815 (1990).
59. B. A. Steer and P. Schimmel, Proc. Natl. Acad. Sci. U.S.A. 96, 13644 (1999).
60. R. Giegé, Proc. Natl. Acad. Sci. U.S.A. 93, 12078 (1996).
61. R. Giegé, C. Florentz, and T. W. Dreher, Biochimie 75, 569 (1993).
62. C. Florentz, T. W. Dreher, J. Rudinger, and R. Giegé, Eur. J. Biochem. 195, 229 (1991).
63. M. A. Rould, J. J. Perona, D. Söll, and T. A. Steitz, Science 246, 1135 (1989).
64. S. H. Kim, F. L. Suddath, G. J. Quigley, A. McPherson, J. L. Sussman, A. H. Wang, N. C. Seeman, and A. Rich, Science 185, 435 (1974).
65. J. D. Robertus, J. E. Ladner, J. T. Finch, D. Rhodes, R. S. Brown, B. F. Clark, and A. Klug, Nature (London) 250, 546 (1974).
66. M. A. Rould, J. J. Perona, and T. A. Steitz, Nature (London) 352, 213 (1991).
67. M. Jahn, M. J. Rogers, and D. Söll, Nature (London) 352, 258 (1991).
68. M. Ibba, K. W. Hong, J. M. Sherman, S. Sever, and D. Söll, Proc. Natl. Acad. Sci. U.S.A. 93, 6953 (1996).
69. J. Putz, J. D. Puglisi, C. Florentz, and R. Giegé, Science 252, 1696 (1991).
70. J. Cavarelli, B. Rees, M. Ruff, J. C. Thierry, and D. Moras, Nature (London) 362, 181 (1993).
71. E. Westhof, P. Dumas, and D. Moras, J. Mol. Biol. 184, 119 (1985).
72. J. Cavarelli, B. Rees, J. C. Thierry, and D. Moras, Biochimie 75, 1117 (1993).
73. J. Putz, J. D. Puglisi, C. Florentz, and R. Giegé, EMBO J. 12, 2949 (1993).
74. V. Biou, A. Yaremchuk, M. Tukalo, and S. Cusack, Science 263, 1404 (1994).
75. F. Borel, C. Vincent, R. Leberman, and M. Hartlein, Nucleic Acids Res. 22, 2963 (1994).
76. J. R. Sampson and M. E. Saks, Nucleic Acids Res. 21, 4467 (1993).
77. S. Cusack, A. Yaremchuk, and M. Tukalo, EMBO J. 15, 2834 (1996).
78. H. Belrhali, A. Yaremchuk, M. Tukalo, C. Berthet-Colominas, B. Rasmussen, P. Bosecke, O. Diat, and S. Cusack, Structure 3, 341 (1995).
79. M. Fujinaga, C. Berthet-Colominas, A. D. Yaremchuk, M. A. Tukalo, and S. Cusack, J. Mol. Biol. 234, 222 (1993).
80. S. Cusack, A. Yaremchuk, and M. Tukalo, EMBO J. 15, 6321 (1996).
81. R. Sankaranarayanan, A. C. Dock-Bregeon, P. Romby, J. Caillet, M. Springer, B. Rees, C. Ehresmann, B. Ehresmann, and D. Moras, Cell (Cambridge, Mass.) 97, 371 (1999).
82. S. Eiler, A. Dock-Bregeon, L. Moulinier, J. C. Thierry, and D. Moras, EMBO J. 18, 6532 (1999).
83. V. A. Ilyin, B. Temple, M. Hu, G. Li, Y. Yin, P. Vachette, and C. W. Carter, Jr., Protein Sci. 9, 218 (2000).
84. I. Bahar and R. L. Jernigan, J. Mol. Biol. 281, 871 (1998).
85. R. Rigler, U. Pachmann, R. Hirsch, and H. G. Zachau, Eur. J. Biochem. 65, 307 (1976).
86. G. Krauss, D. Riesner, and G. Maass, Eur. J. Biochem. 68, 81 (1976).
87. D. Riesner, A. Pingoud, D. Boehme, F. Peters, and G. Maass, Eur. J. Biochem. 68, 71 (1976).
88. J. Rudinger, J. D. Puglisi, J. Pütz, D. Schatz, F. Eckstein, C. Florentz, and R. Giegé, Proc. Natl. Acad. Sci. U.S.A. 89, 5882 (1992).
89. J. J. Perona, M. A. Rould, T. A. Steitz, J. L. Risler, C. Zelwer, and S. Brunie, Proc. Natl. Acad. Sci. U.S.A. 88, 2903 (1991).
90. R. W. Alexander, B. E. Nordin, and P. Schimmel, Proc. Natl. Acad. Sci. U.S.A. 95, 12214 (1998).
91. S. A. Martinis and P. Schimmel, Proc. Natl. Acad. Sci. U.S.A. 89, 65 (1992).
92. S. A. Martinis and P. Schimmel, J. Biol. Chem. 268, 6069 (1993).
93. A. J. Gale and P. Schimmel, Pharm. Acta Helv. 71, 45 (1996).
94. T. Meinnel, Y. Mechulam, S. Blanquet, and G. Fayat, J. Mol. Biol. 220, 205 (1991).
95. M. Frugier, C. Florentz, and R. Giegé, Proc. Natl. Acad. Sci. U.S.A. 89, 3990 (1992).

96. A. R. Fersht and C. Dingwall, Biochemistry 18, 1250 (1979).
97. A. R. Fersht, "Enzyme Structure and Mechanism." W. H. Freeman, New York, 1985.
98. R. B. Loftfield and D. Vanderjagt, Biochem. J. 128, 1353 (1972).
99. L. Lin, S. P. Hale, and P. Schimmel, Nature (London) 384, 33 (1996).
100. R. M. Starzyk, T. A. Webster, and P. Schimmel, Science 237, 1614 (1987).
101. O. Nureki, D. G. Vassylyev, M. Tateno, A. Shimada, T. Nakama, S. Fukai, M. Konno, T. L. Hendrickson, P. Schimmel, and S. Yokoyama, Science 280, 578 (1998).
102. O. Nureki, D. G. Vassylyev, M. Tateno, A. Shimada, T. Nakama, S. Fukai, M. Konno, T. L. Hendrickson, P. Schimmel, and S. Yokoyama, Science 283, 459 (1999).
103. T. K. Nomanbhoy, T. L. Hendrickson, and P. Schimmel, Mol. Cell 4, 519 (1999).
104. L. F. Silvian, J. Wang, and T. A. Steitz, Science 285, 1074 (1999).
105. S. P. Hale, D. S. Auld, E. Schmidt, and P. Schimmel, Science 276, 1250 (1997).
106. M. A. Farrow, B. E. Nordin, and P. Schimmel, Biochemistry 38, 16898 (1999).
107. B. E. Nordin and P. Schimmel, J. Biol. Chem. 274, 6835 (1999).
108. S. P. Hale and P. Schimmel, Proc. Natl. Acad. Sci. U.S.A. 93, 2755 (1996).
109. A. N. Baldwin and P. Berg, J. Biol. Chem. 241, 839 (1966).
110. F. von der Haar and F. Cramer, Biochemistry 15, 4131 (1976).
111. S. P. Hale and P. Schimmel, Tetrahedron 53, 11985 (1997).
112. D. E. Koshland, Jr., Proc. Natl. Acad. Sci. U.S.A. 44, 98 (1958).
113. D. E. Koshland, Jr., G. Nemethy, and D. Filmer, Biochemistry 5, 365 (1966).
114. A. D. Frankel, Nat. Struct. Biol. 6, 1081 (1999).
115. J. R. Williamson, Nat. Struct. Biol. 7, 834 (2000).
116. A. Yaremchuk, S. Cusack, and M. Tukalo, EMBO J. 19, 4745 (2000).
117. B. Delagoutte, D. Moras, and J. Cavarelli, EMBO J. 19, 5599 (2000).
118. J. Cavarelli, B. Delagoutte, G. Eriani, J. Gangloff, and D. Moras, EMBO J. 17, 5438 (1998).
119. P. Brick and D. M. Blow, J. Mol. Biol. 194, 287 (1987).
120. A. R. Fersht, J. W. Knill-Jones, H. Bedouelle, and G. Winter, Biochemistry 27, 1581 (1988).
121. S. Kim, L. Ribas de Pouplana, and P. Schimmel, Proc. Natl. Acad. Sci. U.S.A. 90, 10046 (1993).
122. S. Kim, L. Ribas de Pouplana, and P. Schimmel, Biochemistry 33, 11040 (1994).
123. U. L. RajBhandary, J. Bacteriol. 176, 547 (1994).
124. E. Schmitt, S. Blanquet, and Y. Mechulam, EMBO J. 15, 4749 (1996).
125. V. Ramesh, C. Mayer, M. R. Dyson, S. Gite, and U. L. RajBhandary, Proc. Natl. Acad. Sci. U.S.A. 96, 875 (1999).
126. E. Schmitt, M. Panvert, S. Blanquet, and Y. Mechulam, EMBO J. 17, 6819 (1998).
127. T. Pape, W. Wintermeyer, and M. Rodnina, EMBO J. 18, 3800 (1999).
128. P. J. Beuning and K. Musier-Forsyth, Biopolymers 52, 1 (1999).
129. P. Schimmel and L. Ribas de Pouplana, Cell (Cambridge, Mass.) 81, 983 (1995).
130. L. Ribas de Pouplana, D. D. Buechter, M. W. Davis, and P. Schimmel, Protein Sci. 2, 2259 (1993).
131. P. Schimmel and T. Ripmaster, Trends Biochem. Sci. 20, 333 (1995).
132. J. G. Arnez and D. Moras, Trends Biochem. Sci. 22, 211 (1997).
133. C. Francklyn, K. Musier-Forsyth, and S. A. Martinis, RNA 3, 954 (1997).
134. G. Ghosh, H. Y. Kim, J. P. Demaret, S. Brunie, and L. H. Schulman, Biochemistry 30, 11767 (1991).

Index

A

dimer vs. monomer, 159-160 overexpression, 174 specificity of DNA binding, 160-164, 196 subeellular localization, 165-166 zinc binuelear cluster protein, 156-159

AARS, see Aminoacyl-tRNA synthetases Acetaldehyde induction and intoxification ofalc gene system, 172-173 intraeellular levels, 191-192 N-Acetylgalactosamine-6-sulfatesulfatase, see GALNS Activating transcription factor-I, 223 Activators PRPP synthetase reaction, 132-133 RORs as, ligand role, 217-218 Adenylyl cyclase, AC3, colocalization with PDE1C2, 258-259 ADP, inhibitor of PRPP synthetase, 130-132 Affinity, vs. specificity, antisense oligomer, 16-18 AF-2 region, ROR, 211-212 AGGTCA core motif mutation, 240 ROREs, 214-215 Alanyl-tRNA synthetase fragment 461N, 321-322 G3:U70 base pair, 320--321,326-327

alcR

eharacteristies, 156 gene expression, control of, 181-183 promoter, 180-181 alcR-aleA, expression system, 193-195 AlcR(I:60)-DNA complex, NMR structure, 164-165 aldA

absence of direct repression, 197-198 loss~of-function mutants, 170-172 promoters, 191 regulation, 187-188 aMA/ALDH, A. nidulans, 155-156 Aldehyde dehydrogenase a/dA encoded by, 167 characteristics, 155-156 role in multiple catabolic pathways, 170-172 substrates: inducers ofalc gene system, 173-174 ALDH, see Aldehyde dehydrogenase AmidoPRTase, in purine synthesis, 122-123 Amino acids, misaetivated, editing of, 340 Amino acid substitution, PRS 1 cDNAs, 136-137 Aminoacylation communication in, 335-337 efficiency, domain-domain communication and, 32.,.5-329 systems, noneovalent assembly, 323-325 tRNA, 322--324 Aminoacyl-tRNA synthetases cataly'dc domains, 318 conformational changes, 333 tRNA identity elements recognized by, 345 2'-Aminoethoxy substitution, and triplex stabilization, 22 Aminoglycosides, RNA ligands, 4 Antibiotics, complexes with aptamers, 38

alcA

promoter, AlcR targets in, 18,3-186 transcriptional repression, 186 a/cA/ADH I, A. nidulans, 154 a/c genes cluster, 188-190 regulation: mechanism, 180-192 a/c gene system induction, 167-174 role in research, 192-195 Aleohol dehydrogenase, three forms, 154-155 AlcR alcA promoter binding sites, 195 targets in, 183-186 in alc gene expression, 189 consensus repeated sites, 180-181 control ofa/dA promoter, 187


352 Anticodon-binding domain absent in TyrRS, 328 hinge movement, 341 MetRS, 323-324 reorientation, 342 Anfiinflammatory agents, PDE4-selective inhibitors, 261-262 Antisense oligonueleotides affinity or specificity, 16-18 invader, chemically modified, 14-16 mini-exon sequence target for, 8-9 principles and limitations, 5-8 reactive, 18-19 RNA structures invaded by, 9--13 Apaf-1, interaction with Bd-XL, 234 APC-TCF pathway, link with Dnmtl, 58--59

Apoptosis PDE4A5 role, 283 ROR F effect on, 236-238 ROR~, role in, 231-236 Aptamers complex with antibiotics, 38 hairpins, 27-28 identified by SELEX, 26-27 to TAR RNA of HIV-1, 28--31 ArgRS, anticodon loop, 342-343 asp-box motifs, in sialidases, 93 AspergiUus nidulans a/c gene regulon, 168-169 a/c genes and products, 152-156 CreA repressor, 174-180 ethanol regulon, transcriptional activation, 156-167 ethanol utilization pathway, 151 mammalian protein production in, 193-194 AspRS, complex with tRNA~, 332 Atherosclerosis, RORot role, 228 ATP, substrate for PRPP synthetase reaction, 128-129 A3:U70 substitution, in AlaRS, 326-327

Bacillus subttlis, PRPP synthetase, 125--126 Base pairs, G3:U70, in MARS, 320-321, 326-327 Base triples, in dimerization initiation site, 34

INDEX Bcl-XL effect on cytochrome c release, 233 interaction with Apaf-1, 234 mRNA expression, 235-236 Biological function, RNA-RNA kissing complexes, 31-32 Biotechnology, a/c system as tool, 192-195 Bone metabolism, RORa role, 228

CaMKIV link with ROR activation, 241 regulation of ROR-dependent transactivation, 223-224 cAMP anti conformation binding, 269-270 compartmentalized signaling, 252-253, 299 degradation, 250 hydrolysis, PDE3 isoenzymes with affinity for, 255--256 cAMP response element-binding protein, ROR cofactor, 219-222 Cancer cells cell cycle control of DNMT1 in, deregulation, 66-67 DNMT1 overexpression, 65 lung, DNA replication inhibition, 63--64 tumor suppressors hypermethylated in, 67 Carbon catabolite, repression, 181-183, 190-191, 197 Carboxyl-terminal extension, RORs, 214-215 Caspase-3 cleavage of PDE4A5, 282-283 PDE4A4/5 activation, 306 Catalytic overactivity, PRPP synthetase, 138--139, 145 Catalytic unit PDE4 isoenzymes, 266-279 UCR2 connected to, 264 ~-Catenin, APC binding to, 58-59 Cathepsin A activity and structure, 85-86 complex with EBP and sialidase, 98-99 GAL, 97-98 mutations in, 102-103 precursor, sialidase association with, 94

INDEX protective role, 84-85 synthesis and function, 86-87 CBP, see cAMP response element-binding protein cdk2, activation in RORy - / - DP thymocytes, 235-236 Cell cycle control of DNMT1, deregulation in cancer cells, 66-67 regulation of demethylase expression, 59-61 Cellular signaling compartmentalized, 252-253 sialidase role, 94-95 Cellular transformation, D n m t l expression role, 56--58 cGMP cross-talk with cAMP, 255 PDE5 and PDE6 specific for, 256-257 Chimeras, antisense, 17 Chromatin structure, accessibilityof demethylase to DNA gated by, 69-70 Chromosomal localization, RORy, 224-226

c-Jun in cooperation with Rb, 58 docking site for JNK, 287 Conformational changes AARS, 333 PDE4 catalytic unit, 271-279 in tRNA, communication by, 334-335 CopA, stem-loop structure, 32 Corepressors recruited by Mbds, 51-53 ROR interactions with, 218-223 Corrective factor, missing in GS cells, 83--84 CpG islands free of methylation during replication, 72 multiple, methylation, 68-69 CpG sequence DNA methylation and, 50-52 methylated, 54, 65 CreA A. nidulans, 174-175 DNA-binding domains, 175 functional targets, 175-177 homologs in filamentous fungi, 179-180 role in alc gene expression, 189 carbon eatabolite repression, 181-183, 197-198

353 structure--function relationships, 177-178 target disruption, 186 Cyclie-nueleotide phosphodiesterase, see PDE Cysteine residues, AIcR, 158-159 Cytochrome c, antiapoptotic proteins and, 233-234

D D04 aptamer, binding to TAR RNA, 29 Demethylase accessibilityto DNA, 69-70 methylated CpG sites, 69 expression, regulated by cell cycle, 59-61 repair or replication mechanisms, 54-55 Dimerization initiation site HIV-1 isolates, 3-D structures, 33-34 stem-loop structure, 32 Dimers, AlcR as, 159-160 2,3-Diphosphoglyeerate, inhibitor of PRPP synthetase, 130-132 DNA-binding domain AlcR, 157-159 CreA repressor, 175 Gal4p, 166-167 RORu, 208-209 ROR proteins, 212-214 DNA methylation coordinated with DNA replication, 55--63 ectopic, suppression of gene expression, 50-51 as epigenetic modification of DNA, 49-50 interference with transcription factor binding, 51 DNA methyltransferases, candidate, 53 DNA replication coordinated with DNA methylation, 55-63 inhibition by DNMT1 inhibition, 63-64 DNMT1 de novo, 68-69 inhibition, and DNA replication inhibition, 63--64 methylation pattern inheritance, 59-61 oncogenesis and, 64-71 role in growth suppressor gene expression, 62--63 targeted to replication fork, 61-62

354 Dnmt l

expression, role in cellular transformation, 56-58 posttranscriptional regulation, 59 regulatory regions, 56 Domain communication, in editing, 337-341 Domain-domain communication and aminoacylation efficiency, 32,5--329 by conformational changes in tRNA, 334-335 via noncovalent assembly of aminoacylation systems, 323---325 role of induced fit, 341--345 Domain functions, bARS, separable, 320-323 Double-positive thymocytes, 231-233, 237 Drosophila melanogaster, ROR homologs, 211-212

E EBP, see Elastin-binding protein Editing, domain communication in, 337-341 Efficiency aminoacylation, 325-329 antisense oligonucleotides, 6-8 Elastin-binding protein, complex with sialidase and cathepsin A, 98-99 Endothelin-1, hydrolysis, cathepsin A role, 87 Enzyme activation therapy, 106 Epigenome DNA methylation as component of, 50 reversibility, 49 ERK docking, PDE4 species, 286-288 MAP kinases, phosphorylation of PDE4 enzymes, 290-297 Esterase activity, cathepsin A, 85 Ethanol catabolism, in A. nidulans, 152-156 induction of alc gene system in presence of, 172 regulon, AlcR role, 156-167 utilization pathway, characteristics, 151-152 Exons, PDE4 genes, 263-266 Expression patterns RORα mRNA, 208-209 RORβ mRNA, 209-210 RORγ mRNA, 210-211 Expression profiles, Mbds, 52 Expression tool, heterologous, alc system as, 192-195

F FasL, gene expression, 238 Fibroblasts, PRS isoform levels, 142-143 Fomivirsen, FDA-approved oligonucleotide, 8 Fragment 461N, AlaRS, 321-322 Frameshifting HIV-1 gag-pol signal, 19 HTLV-I gag-pro hairpin, 21 retroviral, 3 Fungi filamentous, CreA homologs in, 179-180 metabolic pathways and gene clustering, 190

G GAF domains, PDE isoenzymes, 255-258 GAL, see β-Galactosidase Galactosialidosis, 83 storage products, 102 β-Galactosidase activity and structure, 88-90 complex with cathepsin A, 97-98 features, 87-88 precursor, 103 synthesis and function, 90 β-Galactosidosis, GM1-gangliosidosis, 100 GALNS features and specificity, 95-96 structure and function, 96 Gal4p, DNA-binding domain, 166-167 Gangliosides, sialylated, hydrolysis, 91-92 GARP, see Glutamic acid-rich proteins Gaussian network model, 334 GCGG motif, interacting, 161, 163 Gene expression alc, 153-154, 189 alcR, control of, 181-183 FasL, 238 growth suppressor, DNMT1 regulation of, 62-63 pseudo-constitutive, aldA mutants, 171-172 suppression by DNA methylation, 50-51 Gene organization, PDE4, 262-266


Genomic structure, RORγ, 224-226 GlnRS, complex with tRNAGln, 330, 339 Glucose, A. nidulans cultivation on, 194 Glutamic acid-rich proteins, role in PDE6 binding, 260-261 GM1-ganglioside β-galactosidase, see β-Galactosidase GM1-gangliosidosis, 83 spectra of storage products, 100 Growth regulatory circuits, coordination with DNMT1 levels, 63-64 Growth suppressor genes, DNMT1 regulatory role, 62-63 GS, see Galactosialidosis G3:U70, base pair in AlaRS, 320-321, 326-327

H Hairpins antisense RNA-binding, 10-11 ArgRS, 343 complex with aptamers, 27-28 DNA, high-affinity, 35 double, 24-25 duplexes with Pu and Py moieties, 20-21 Hammerhead ribozymes, 12 Helix 12, LBD of nuclear receptors, 207-208, 221-222 Hexosaminidase A, association with LMC, 106 Histone deacetylases recruited by Mbds, 51-53 and tumor suppressor gene hypermethylation, 70-71 Homologs CreA, in filamentous fungi, 179-180 ROR, insect, 211-212 HSPDE4A7 isoform, 266 HVXXHPLLd~XLL consensus, in ROR transactivation, 224 Hybridization, antisense compounds, 11 Hybridomas, T cell, 238 Hypermethylation DNA, 52 regional, and global hypomethylation, 64 tumor suppressor genes, 65, 67-68 histone deacetylases and, 70-71 Hypermethylator phenotype, 68

Hypomethylation, global, 54-55 and regional hypermethylation, 64

I Id2 gene, and RORγ, 230-231 Immune system, sg/sg mutation effects, 228 Induced fit, role in domain-domain communication, 341-345 Inducibility, ethanol utilization pathway, 151 Inhibitors PDE4-selective, 261-262, 304-305 PRPP synthetase reaction, 130-132 Inorganic phosphate, activator of PRPP synthetase, 132-133 Intercalating agents, for triple helices, 22-23 Interleukin 4, production, sialidase role, 95 Internal ribosome entry site, 4 Intoxification, alc gene system by acetaldehyde, 172-173 Inverse immunoregulation, 87 Iron-regulatory proteins, 3 Iron-response element, in mRNAs, 3 Isoleucyl-tRNA synthetase aminoacylation and editing, 337-338 isoaccepting tRNAs, 327-328 split-protein constructs, 325 K Keratan sulfate, in Morquio B disease, 101-102 Ketones, inducers of alc gene, 169-170 Kinase interaction motifs, 287 docking site, 291-292 Kinetics kissing complexes, 35-37 oligonucleotide invasion of RNA structures, 12-13 Kissing complexes RNA-RNA, 31-32 structure, 32-35 thermodynamic and kinetic analyses, 35-37 KMSKS sequence AARS, 318 ligand-free IleRS, 339 Knockout mouse models GS, 91-92 PDE4, 304-305 RORs, 226-236


L Laminin B1 gene, and RORα1, 240 LBD, see Ligand-binding domain Leishmania amazonensis, mini-exon sequence, 8-9, 18, 25 leu363Pro mutation, sialidase, 105 Ligand-binding domain nuclear hormone receptors, 207-208 ROR proteins, 212-215 Linker regions 1 and 2 LR2 of PDE4A, 275-276 PDE4 subfamily, 264-265 LMC, see Lysosomal multienzyme complex Locked nucleic acid, efficiency in vivo, 14 Loop-closing pair, aptamer, 36-37 Loop-loop complexes, see Kissing complexes Loop palindromes, 33-34 Luciferase assay, 11-12 LXXLL motif, CBP and SRC-1, 222 Lymph nodes, organogenesis, RORγ role, 230-231 Lymphoblasts, PRS isoform levels, 142-143 Lysosomal carboxypeptidase A, see Cathepsin A Lysosomal multienzyme complex components, 84-97 discovery, 82-84 GALNS associated with, 96 molecular pathology, 99-107 stoichiometry and structure, 97-98

M Magnesium complex with ATP, 128-129, 132 DNA aptamer and, 29 Mg2+ bound and free states, 277-278 effect on PRPP synthetase, 141 TAR binding and, 37 Manduca sexta, ROR homologs, 211-212 MAP kinases, ERK, phosphorylation of PDE4 enzymes, 290-297 Mbds, see Methylated DNA binding proteins Me2+ binding sites, PDE4 isoenzymes, 267, 269-270, 278-279 Melatonin putative ligand for RORs, 218 transcription, RORβ role, 229

Membrane insertion, PDE4A1, 280-281 Messenger RNA Bcl-XL, 235-236 iron-response element, 3 PDE1, 5' alternative splicing, 255 RORα, 209 RORβ, 209-210 RORγ, 210-211 Methionyl-tRNA formyltransferase, induced fit, 345 2'-O-Methoxyethyl, RNA binding affinity enhanced by, 14 Methylated DNA binding proteins, corepressors recruited by, 51-53 MetRS complex with tRNAfMet, 344 interface region, 324 microhelix aminoacylation, 322 constructs, 335, 337 tRNAMet, 327 Microhelix, MetRS aminoacylation, 322 constructs, 335, 337 Mig1p changes in subcellular localization, 178 repression of transcription by, 179 sites, organized as repeats, 177 zinc fingers, 175 Mini-exon sequence, Leishmania folding into hairpin structure, 25 oligonucleotides complementary to, 18 as target for antisense oligonucleotides, 8-9 Mismatched duplex, RNA-oligonucleotide, 16-18 Molecular pathology, LMC, 99-107 Monomers, AlcR as, 159-160 Morquio B disease, keratan sulfate in, 101 Morquio disease, GALNS deficiency, 95-96 Mutations aldA loss-of-function, 170-172 cathepsin A, 102-103 CreA, derepressed, 177-178 PDE4B, 278 PRS isoforms, 138 sialidase, 104-105 W273L, 101 Myogenesis, inhibition by dominant-negative RORα, 239


N N-CoR, interaction with RORγ, 220-222 Neuraminidase, see Sialidase Neurogenesis, abnormal, Purkinje cells, 227 Neuronal interacting factor X1, ROR binding, 222 NF-κB, potential binding sites, 301 N3'→P5' phosphoramidate backbone, 14 Nuclear localization signals, AlcR, 165-166 Nuclear receptors, complexes with RXR, 214-215 Nucleoside diphosphate kinase, NM23, interaction with RORβ, 222-223 Nucleotide pools, purine, 123 Nucleotides D-loop, 340-341 modified, for triple helices, 22-23

O ODN2, chimeric antisense, 17 Oligonucleotides antisense, see Antisense oligonucleotides chimeric, 17 clamp and circular, 23-24 identified via combinatorial approaches, 25-37 morpholino, 15 platinated, 18-19 RNA ligands, 5 selective-binding complementary, 15 Oligosaccharides, sialylated, 91 Oncogenesis, DNMT1 and, 64-71 Organogenesis, lymph nodes, RORγ role, 230-231 Orphan receptors retinoid-related, see Retinoid-related orphan receptors Rev-Erbα and -β, 216 Overactivity, human PRPP synthetase, inherited, 136-140 Overexpression AlcR, 174 developmental regulatory genes, 193 DNMT1, in cancer cells, 65 RORs, 236-239 Oxytocin, promoter region, RORE in, 239

P p21, displacement from PCNA, 66 PAPs, complex with PRS isoforms, 134-136 Particulate targeting, PDE4 isoforms, 284-285 PAS domains, PDE8, 258 PCNA, see Proliferating cell nuclear antigen PDE families, 253-258 intracellular targeting, 258-261 isoenzymes regulation of cAMP levels, 251-252 structural similarity, 254-255 PDE2, isoforms with 5' domain swaps, 259 PDE3 high affinity for cAMP hydrolysis, 255-256 isoenzymes, intracellular localization, 259-260 PDE5, GAF domains, 256 PDE6 association with GARP, 260-261 cGMP binding, 257-258 PDE7, splice variants, 261 PDE8, PAS domains, 258 PDE4B2, phosphorylation, TCR-induced, 297-298 PDE1C2, colocalization with AC3, 258-259 PDE4 cAMP-specific PDEs, and PDE4-selective inhibitors, 261-262 PDE4D3, long isoform, activation, 300-301 PDE4 genes, organization, 262-266 PDE4 isoenzymes catalytic unit, 266-271 conformational changes, 271-279 PDE4 isoforms activation via PI-3 kinase, 298-300 induction of, 301-304 intracellular targeting, 279-288 phosphorylation, activation by PKA, 288-290 PDE4 knockouts, 304-305 Peptide nucleic acids, cRNA binding, 15 Phosphatidic acid, activation of long PDE4D3 isoform, 300-301 5-Phosphoribosyl α-1-pyrophosphate, see PRPP Phosphoribosyltransferase, and PRPP utilization, 119-121 Phosphorothioates, oligonucleotide analogs, 8

Phosphorylation PDE4 enzymes, by ERK MAP kinase, 290-297 PDE4 isoforms, activation by PKA, 288-290 PI-3 kinase, role in activation of PDE4 isoforms, 298-300 Plasma membrane complex, EBP, cathepsin A, and sialidase, 98-99 Platinum derivatives, conjugated to oligonucleotides, 18-19 Position effect, AlcR targets in alcA promoter, 185 PPARγ, phosphorylation, 217 Proliferating cell nuclear antigen, complex with DNMT1, 61-62 Proline residue, PDE4C, 270-271 Promoters alcA, AlcR targets in, 183-186 alcA and aldA, 151 alcR, 189-181 AlcR-regulated, 161, 163 aldA, 187, 191 PRPS, 144-145 ProRS, complex with tRNAPro, 341-342 Prostaglandin synthesis, 294 Protective protein, 32-kDa, 84-85 Protein kinase A activation of PDE4 phosphorylation, 288-290 interaction with ERK2, 295-296 isoenzymes, 253 target serine in UCR1, 275 Protein kinase C, phosphorylation sites on RORα1, 217-218 Protein-protein interactions DNMT1 and, 71 DNMT1 regulatory role and, 62-63, 66-67 Protein-RNA complex, 334 PRPP reaction, 117-119 regulatory role, 121-125 synthesis, regulation in human cells, 140-145 utilization, 119-121 PRPP synthetase allosteric control, disruption of, 137-138 associated proteins, see PAPs bacterial, proposed mechanism of, 125-126 human inherited overactivity, 136-140 and isoform family, 126-127

reaction effectors, 127-133 structures, 133-134 reaction, 117-119 PRPS genes, 126-127 PRPS promoters, 144-145 PRPS1 transcription, 139-141 PRS isoforms complex with PAPs, 134-136 in fibroblasts and lymphoblasts, 142-143 heteroaggregates, 141 mutant, 138 Pseudo-halfknots, RNA-binding, 10 Purine base salvage pathway, 124 nucleotide synthesis, regulation by PRPP, 121-123 Purine motif triplexes, 20-21 Purkinje cells, abnormal neurogenesis, 227 Pyrimidine, nucleotide synthesis, PRPP role, 124-125 Pyrimidine motif triplexes, 20, 24-25 Pyrophosphate loop, 125

Q Quaternary structure, PRPP synthetase, 133-134

R RACK1, PDE4D5 binding, 285-286 Ras-c-Jun signaling pathway, 56 Rb, repression of dnmt1, 58 Receptor-interacting protein 140, 219-222 Regulatory circuits, cell cycle, disruption, 66-67 Regulatory role antisense RNAs, 31-32 DNMT1 for growth suppressor genes, 62-63 PRPP, 121-125 RORγ for TEA promoter, 232 Regulon alc gene, in catabolic reactions, 168-169 ethanol, A. nidulans, 156-167 Repair mechanisms causing site-specific demethylation, 54 demethylase, 69 Repeats

DNA sites, in AlcR-responsive promoters, 196-197 Mig1p sites organized as, 177 organized, AlcR targets in alcA promoter, 183-184 Replication fork demethylases in, 59-61 DNMT1 targeted to, 61-62 Repression aldA, 187-188 carbon catabolite, 181-183, 190-191 mechanisms, in alc genes, 197-198 transcriptional, alcA, 186 Research, alc system in, 192-195 Retina, in RORβ-/- mice, 229-230 Retinoid-related orphan receptors dn-RORα, 239 interaction with coactivators and corepressors, 218-223 overexpression, 236-239 protein structure, 212-214 response elements, see ROREs RORα, 208-209 RORα-/- mice, 226-228 RORβ, 209-210 RORβ-/- mice, 229-230 RORγ, 210-211 genomic structure, 224-226 RORγ-/- mice, 230-236 transactivation, CaMKIV role, 223-224 transcriptional control by, 217-224 Retinoid X receptors, complexes with nuclear receptors, 214-215 Rev-Erb~, association with RORs, 216 Reversibility, epigenome, 49 Rev response element, HIV-1, 26 Ribonuclease H antisense effects mediated by, 6-7 RNA-oligonucleotide duplex and, 16-17 Ribose-5-phosphate, substrate in PRPP synthetase reaction, 128-130 RNA ligands aminoglycoside antibiotics, 4 oligonucleotides, 5 RNA structures functional, 3-4 invaded by oligonucleotide, 9-13 RNPDE4A1, membrane-bound, 280-281 Rolipram, PDE4 isoform sensitivity to, 265, 271, 274-279

Rop protein, loop-loop complex binding, 33-35 ROREs characterization, 214-216 in 5-lipoxygenase, 239-240 in transcriptional activation, 218

S Saposin B, activation of hydrolysis, 88 Scaffold complexes, RACK1 binding, 285-286 Second messenger, degradation, 250 Selective-binding complementary bases, 38 Selective-binding complementary oligonucleotides, 15 SELEX, aptamers identified by, 26 Sensitivity PAP39, to digestion, 135 rolipram, 265, 271, 274-279 SerRS, tRNASer binding, 332-333 sg/sg mice, RORα gene disruption in, 226-228 SH3 domains interactions among PDE4 isoforms, 281-284 PDE4 isoform binding, 265, 274 Sialidase catalytically active conformation, 103-104 complex with EBP and cathepsin A, 98-99 features and specificity, 91-92 mutants, 104-105 structure, 92-93 synthesis and function, 93-95 Sialidosis, dysmorphic and nondysmorphic, 104-105 Signaling pathways, nodal mitogenic and oncogenic, 56 Spacer length, AlcR-regulated promoter targets, 161 Specificity vs. affinity, antisense oligomer, 16-18 AlcR DNA binding, 160-164 GALNS, 95-96 sialidase, 91-92 SRC-1, see Steroid receptor coactivator-1 Stem-loop structures, 31-33, 39 anticodon, 344 Steroid receptor coactivator-1, ROR cofactor, 217-222 Stoichiometry, LMC, 97-98 Stop-start sequence, insertion, 324

Storage products galactosialidosis, 102 GM1-gangliosidosis, 100 Structure-function relationship, CreA, 177-178 Subcellular localization AlcR, 165-166 Mig1p, 178 Substrates ALDH, physiological inducers of alc gene system, 173-174 for PRPP synthetase reaction, 128-130 Synergism, between AlcR targets, 185 Systematic evolution of ligands by exponential enrichment, see SELEX

T Targeted knockouts, RORs, 226-236 Targeting intracellular PDE4 isoforms, 279-288 PDEs, 258-261 particulate, PDE4 isoforms, 284-285 Tay-Sachs disease, 92 T cell receptors gene rearrangements, 231-232 PDE4B2 phosphorylation induced by, 297-298 T cells from RORγ2 mice, 237-238 sialidase in, 94-95 TEA promoter, regulatory role of RORγ, 232 Thermodynamics kissing complexes, 35-37 oligonucleotide invasion of RNA structures, 10-12 Thermodynamic stability, triple-stranded complexes, 20 Thiostrepton, rRNA-binding, 4 L-Threonine, degradation, 168 Thymocytes, double-positive, 231-233 Thymopoiesis RORγ effect on, 236-238 RORγ role in, 231-236 Thyroid hormone receptor RORα and, 227 RORα1 and, 215-216 T3R-interacting proteins, see TRIP-1

Thyroid stimulating hormone, induction of PDE4 isoforms, 302 Tissue distribution cathepsin A, 86 GAL, 90 PAPs, 135 RORα mRNA, 209 RORβ mRNA, 210 RORγ, 211 T3R, see Thyroid hormone receptor trans-activating responsive element HIV-1, 4-5, 39 aptamers to, 28-31 RNA, complex with aptamer, 36-37 Transcription factors, binding, DNA methylation interference, 51 Transfer RNA aminoacylation, 322-324 conformational changes in, communication by, 334-335 covalent continuity, communication and, 335-337 identity elements, recognized by AARS, 345 tRNAAla variants, 326-327 tRNAGln:GlnRS complex, 330 tRNAIle D loop substitutions, 338 tRNA2Ile isoacceptor, 327-328 tRNAPro:ProRS complex, 341-342 tRNASer binding to SerRS, 332-333 Transgenic plants, alc system in, 194-195 TRIP-1 interaction with RORβ, 222 ROR coactivator, 219 Triple-stranded complexes chemistry, 19-21 clamp and circular oligonucleotides, 23-24 double hairpin, 24-25 modified nucleotides and intercalating agents for, 22-23 Tropoelastin, EBP binding, 99 Trp 45, at protein/DNA interface, 164 Truncation, C-terminal of AlaRS, 321-322 Tumor suppressor genes CpG islands in, 52 hypermethylation, 65 in cancer cells, 67 histone deacetylases and, 70-71 selective advantage conferred by methylation, 67-68


Turnip yellow mosaic virus, RNA aminoacylation, 328-329 TyrRS absence of anticodon-binding domain, 328 crystal structures, 343-344

U UCR1 interaction with UCR2, 289-290 PDE4, 262 PKA-mediated phosphorylation, 269, 275, 277 UCR2 connected to catalytic unit, 264 in ERK phosphorylation, 296-297

V ValRS, variant RNA-bound, 328-329 Val-tRNAIle, 338 Viomycin, recognizing RNA pseudoknots, 4

W White blood cells, GAL function in, 90

Z Zinc binuclear cluster protein AlcR, 156-159 dimer vs. monomer, 159-160 target discrimination by, 163-164
