PREFACE
As the decade of the 1990s draws to a close, it is appropriate to assess the changes that have taken place in the drug discovery process. Previously, an infectious agent was identified and cultured for two purposes: (1) so that tests could be established to screen compounds for inhibition of growth of the infectious organism, and (2) to permit isolation of proteins that could serve as drug targets in favorable cases where sufficient quantities could be obtained. In this decade a new paradigm has emerged. Isolation of the genetic material of the infectious organism, followed by sequence analysis at the DNA level by rapid, automated methods, can reveal the entire genomic structure in a short time. Straightforward analysis using sequence-searching algorithms can lead to the identification of possible critical functional activities. Cloning of the specific region of the genome can yield the target for new drug development. With the exploding database of protein structure, and with homology modeling programs, it is now possible to predict structure and initiate the drug discovery process while waiting for solution of the X-ray structure. During the period following the identification of sequences within the HIV genome that fit the template for the structure of an aspartic protease, drugs were designed and synthesized, and the effect on viral growth was demonstrated. Now that FDA-approved drugs have been shown to suppress viral growth and improve the health of thousands of patients, we can conclude that the new paradigm for drug discovery has been successful. Of course, this account should not fail to point out the continuing problems of patient noncompliance, low bioavailability, and rapid development of resistant viral strains. Nonetheless, the value of protease-directed drugs, as well as the best pathway to finding them, is clear. Fittingly then, this book begins with an account of the development of antiviral agents targeted for AIDS. 
Important lessons derived from this work include the demonstration, through catalytic mutation, that an active protease was essential for viral replication. Also, the value of the early determination of the three-dimensional structure of the enzyme and enzyme-inhibitor complexes has been established. Next, the iterative design process, based on structure determination, marks the HIV-1 protease case as a defining example for future work. Finally, the problems associated with drug metabolism and the resistance question complete the catalog of lessons learned in this case. Other chapters in this book summarize current knowledge regarding new drug targets from other infectious organisms. A significant new effort is directed toward a serine protease encoded by the hepatitis C virus. Due to the relatively recent identification of this virus and the lack of a cell-culture system, progress was very slow. In the past three years, however, expression of the protease and the solution of its structure have dramatically stimulated the search for drugs in this area. In addition, a shift occurred when many pharmaceutical companies cut back their HIV efforts because of the success of the Roche, Abbott, and Merck compounds. Hepatitis C was the next disease likely to have a significant public health impact in the United States and elsewhere due to the spread of the infection through transfusion before its discovery. As detailed in the chapter by Urbani, De Francesco, and Steinkühler on this subject, a strategy similar to that in the HIV case was followed. A huge body of work, conducted largely in pharmaceutical companies, is summarized by Qiu and Abdel-Meguid in their chapter on the human herpesviruses. Here again, the paradigm of discovery of a viral sequence, cloning, expression, and structure determination has proven to be the route to drug design. In this case especially, the structural insights reveal novel mechanisms of action that could provide clues for potent and selective inhibitor design.
The Candida genus provides an example of an infectious agent that is not a problem for a healthy human. However, in the case of a patient whose immune system has been impacted by infectious agents such as HIV or through treatment with immune-suppressing agents to avoid transplant rejection, severe systemic infections can be the ultimate cause of death. In the chapter by Stewart, Goldman, and Abad-Zapatero, several related crystal structures derived from protease variants cloned from C. albicans are described. Unique insights that point the way toward selective inhibitor design are derived. Work in the picornavirus arena has been under way for some time; however, the new information described by Bergmann and James seems likely to stimulate this area significantly. Progress in this field will have widespread impact, as the picornavirus family includes the rhinoviridae, which bring us the common cold, as well as many others. The chapters by Berry, on malaria, and by Cazzulo, on Chagas disease, represent the field of protozoan infectious species. Berry provides a thorough summary of current knowledge on hemoglobin degradation during the blood-borne
stage of malaria. The parasite is somewhat unique in presenting two targets for drug discovery: a cysteine protease (falcipain) and an aspartic protease (plasmepsin). The interplay of these two enzymes in the complicated process of breakdown of the globin chain provides several strategies for attack. In the case of T. cruzi, the organism that causes Chagas disease, the major antigen is a cysteine protease, cruzipain (or cruzain). Cazzulo describes the involved life cycle, the properties of cruzipain, and other putative serine proteases of the organism. Successful infection by foreign agents frequently requires some function of the host. In the chapter by Kido, Chen, Murakami, Beppu, and Towatari, the role of cellular proteases in infection by influenza A and Sendai viruses in the respiratory tract and tryptase TL2 in T lymphocytes is described. The involvement of a cellular enzyme in the entry of HIV-1 into cells is also discussed. The implications for future therapeutic intervention are clear, but the complications of attempting to alter the function of a normal cellular enzyme are also significant. While it has been clearly established that polyprotein processing in the case of viruses is an essential feature of their life cycle, the world of bacteria is not so straightforward. One point of attack is the necessity for processing newly synthesized proteins targeted by the bacteria for export. Lively discusses the bacterial signal peptidases in a chapter that precedes structure determination. Nonetheless, the ground is well-prepared for the anticipated structural information. In this case as in the others, the differences between the bacterial enzyme and the related human enzyme will be critical to development of selective inhibitors. The world of plants is represented by the chapter of Garcia, Fernández-Fernández, and López-Moya describing plant viruses. This area is unique in that a wide range of protease mechanisms are found, lacking only the metalloproteases.
Given the huge impact of plant viruses on crop production, this is a field with a large potential for future growth. While this compilation is not encyclopedic, the chapters presented cover a wide range of infectious agents and mechanisms. There also are obvious differences in the state of knowledge of the different proteases described. This is appropriate, as efforts in the area of proteases will continue to expand into research in new diseases and infectious agents as they are discovered. The task before us is clear: Find the critical protease and develop a potent and selective inhibitor. Ben M. Dunn
CONTRIBUTORS
Numbers in parentheses indicate the pages on which the authors' contributions begin.
CELE ABAD-ZAPATERO (117) Protein Crystallography D-46Y, Abbott Laboratories, Abbott Park, IL 60064
SHERIN S. ABDEL-MEGUID (93) Department of Macromolecular Science, SmithKline Beecham, King of Prussia, PA 19406
YOSHIHITO BEPPU (205) Department of Enzyme Chemistry, Tokushima University of Medical School, Tokushima 770, Japan
ERNST M. BERGMANN (139) Dept. of Biochemistry, University of Alberta/Edmonton, Edmonton, Alberta T6G 2H7, Canada
COLIN BERRY (165) Cardiff School of Biosciences, Cardiff University, Cardiff CF1 3US, Wales, UK
JUAN JOSÉ CAZZULO (189) Instituto de Investigaciones Biotecnológicas, Universidad Nacional de General San Martín, San Martín 1650, Buenos Aires, Argentina
YE CHEN (205) Department of Enzyme Chemistry, Tokushima University of Medical School, Tokushima 770, Japan
RAFFAELE DE FRANCESCO (61) Instituto di Ricerche di Biologia Moleculare, 00040 Pomezia, Rome, Italy
MICHAEL A. EISSENSTAT (1) Structural Biochemistry Program, NCI-FCRDC, Frederick, MD 21702
JOHN W. ERICKSON (1) Structural Biochemistry Program, NCI-FCRDC, Frederick, MD 21702
MARIA ROSARIO FERNANDEZ-FERNANDEZ (233) Centro Nacional de Biotecnologia, Campus de la Universidad Autónoma, 28049-MADRID, Spain
JUAN ANTONIO GARCIA (233) Centro Nacional de Biotecnologia, Campus de la Universidad Autónoma, 28049-MADRID, Spain
ROBERT C. GOLDMAN (117) Anti-infective Group, D-47, AP-9A, Abbott Laboratories, Abbott Park, IL 60064
MICHAEL N. G. JAMES (139) Department of Biochemistry, University of Alberta/Edmonton, Edmonton, Alberta T6G 2H7, Canada
HIROSHI KIDO (205) Department of Enzyme Chemistry, Tokushima University of Medical School, Tokushima 770, Japan
MARK O. LIVELY (219) Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC 27157
JUAN JOSÉ LOPEZ-MOYA (233) Centro Nacional de Biotecnologia, Campus de la Universidad Autónoma, 28049-MADRID, Spain
MEIKO MURAKAMI (205) Department of Enzyme Chemistry, Tokushima University of Medical School, Tokushima 770, Japan
XIAYANG QIU (93) Department of Macromolecular Science, SmithKline Beecham, King of Prussia, PA 19406
CHRISTIAN STEINKÜHLER (61) Instituto di Ricerche di Biologia Moleculare, 00040 Pomezia, Rome, Italy
KENT STEWART (117) Molecular Modeling Group, D-46Y, AP-10, Abbott Laboratories, Abbott Park, IL 60064-3500
TAKAE TOWATARI (205) Department of Enzyme Chemistry, Tokushima University of Medical School, Tokushima 770, Japan
ANDREA URBANI (61) Instituto di Ricerche di Biologia Moleculare, 00040 Pomezia, Rome, Italy
HIV Protease as a Target for the Design of Antiviral Agents for AIDS
JOHN W. ERICKSON AND MICHAEL A. EISSENSTAT
Structural Biochemistry Program, National Cancer Institute-Frederick Cancer Research and Development Center, Frederick, Maryland 21702
I. Introduction
II. Retroviruses, HIV, and AIDS
III. HIV Protease: Biology, Biochemistry, and Structure
IV. Design of HIV Protease Inhibitors
V. Drug Resistance
VI. Future Challenges in HIV Protease Inhibitor Design
VII. Summary and Conclusions
References
I. INTRODUCTION
The design of clinically effective inhibitors of HIV protease has been a major success story for modern antiviral therapy and has raised the awareness and attractiveness of viral proteases in general as drug design targets.² Current interest in viral proteases as drug design targets is highlighted by the chapters in this book devoted to the proteases of herpesviruses, adenoviruses, plant viruses, picornaviruses, hepatitis C viruses, and HIV. A common strategy in the life cycle of viruses is the utilization of polycistronic messenger RNAs that can be translated into precursor polyproteins which subsequently are processed enzymatically into mature, functional proteins during virus assembly (Kräusslich et al., 1988; Kay and Dunn, 1990). Processing enzymes may be either cellular or viral encoded and offer novel targets for intervention. Virus-encoded proteases afford particularly attractive therapeutic targets for the design of antiviral agents that are highly specific and nontoxic to their host cells. Viral proteases can be classified according to their mechanism of action as serine, cysteine, aspartic, or metalloproteinases. This chapter discusses the key biological and structural features of HIV protease (HIV PR) that define its usefulness as an antiviral target. A brief review of, as well as an update on, efforts to discover and design inhibitors of HIV PR is presented. Finally, the problem of drug resistance to HIV PR inhibitors is discussed with an emphasis on viral strategies for resistance and on implications for future drug discovery efforts in this area.
¹ The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services nor does mention of trade names, commercial products, or organization imply endorsement by the U.S. Government.
² There are two distinct types of human immunodeficiency virus, HIV-1 and HIV-2. In addition, multiple subtypes have been identified for HIV-1. For clarity, HIV will be used throughout this chapter to denote HIV-1, subtype B unless otherwise indicated.
Proteases of Infectious Agents. Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.
II. RETROVIRUSES, HIV, AND AIDS
The human immunodeficiency virus type 1 (HIV) causes acquired immunodeficiency syndrome (AIDS) in humans. This disease was recognized only recently in the U.S., around 1981, as a unique clinical syndrome manifested by opportunistic infections or malignancies associated with an underlying defect of the immune system characterized by the progressive loss of CD4 helper T cells. Nearly always fatal unless treated, reported AIDS cases currently number upward of 2.0 million worldwide, and the true figure may be closer to 10 million based on recent estimates from the 1998 AIDS meeting held in Geneva, Switzerland. The number of HIV-infected individuals is much higher, around 30 million, and current projections of global prevalence predict 40-100 million HIV-infected individuals by the year 2000.
Human immunodeficiency virus is a lentivirus and belongs to the family Retroviridae, which are enveloped, positive-sense, single-stranded RNA viruses. Retroviruses induce a variety of neoplastic diseases and are widely distributed among vertebrate species. A defining characteristic of these viruses is their use of an RNA-dependent DNA polymerase, or reverse transcriptase (RT), for replication of the viral RNA. Many of the molecular events specific to HIV-1 infection have been characterized, including functions common to the retrovirus life cycle (Fig. 1). Infection by a retrovirus results in the synthesis of one or more double-stranded DNA intermediates by the action of RT, and at least one DNA copy of the viral genome is integrated into the host DNA. This proviral DNA serves to direct its own transcription, translation, and assembly of new virions. This ability of retroviruses to incorporate their genomes into that of their host cells endows them with the capability to be stably maintained during the life of the host and even to be transmitted through the germ line.
While most germline "endogenous" retroviruses found in animals and humans are believed to be
nonpathogenic, tumor-causing retroviruses are fairly common in animals, and their discovery dates back to 1901. However, the first definitive reports that associated a retrovirus with disease in humans would not appear until 1980 when it was discovered that adult T-cell leukemia (ATL) was caused by human T-cell leukemia virus (HTLV-I) (Poiesz et al., 1980). First described in 1977, ATL is highly malignant; median survival is measured in months (Kawano et al., 1985). Current prevalence estimates of ATL are around 10-20 million cases worldwide. There is currently no effective therapy for ATL. Human T-cell leukemia virus has also been implicated as the cause of a myelopathy disorder as well as an aggressive form of non-Hodgkin's T-cell lymphoma. Shortly after the discovery of HTLV-I, another C-type retrovirus closely related to HTLV-I, named HTLV-II, was isolated from a patient with hairy cell leukemia (Kalyanaraman et al., 1982). Two years later a third human retrovirus, initially called HTLV-III, was linked to AIDS (Gonda et al., 1985) and was later found to be nearly identical to lymphadenopathy-associated virus isolated a year earlier (Wain-Hobson et al., 1991). When the capsid morphology and genetic structure clearly identified HTLV-III as a lentivirus (Gonda et al., 1985), its name was changed to HIV-1. The lentiviruses comprise a subfamily of retroviruses that characteristically cause chronic infections and disease in animals and include bovine, feline, and simian immunodeficiency viruses, equine infectious anemia virus, and visna virus of sheep. A second human AIDS virus, HIV-2, was discovered in the mid-1980s (Clavel et al., 1986). Human immunodeficiency virus 2 has a distinct geographic distribution from HIV-1 and is found mainly in West Africa.
Human immunodeficiency virus type 2 appears to have a lower morbidity and longer clinical latency than HIV-1 and is closely related in sequence to SIV, a lentivirus that causes an AIDS-like disease in macaques (Gao et al., 1994). This brief history underscores the fact that the identification of disease-associated retroviruses in humans is fairly recent. Thus, much of our current understanding of HIV infection and, in particular, the identification and characterization of targets for antiviral drug design have drawn heavily on basic virology studies obtained with animal retroviruses prior to the discovery of HIV (for a comprehensive review, see Coffin et al., 1997). The discovery of HIV-1 in 1984 led to a parallel explosion of research on the molecular virology of this infectious agent and to an intensive, global search for a cure for this fatal disease that continues to the present. Efforts to inhibit HIV continue to represent one of the most active areas of antiviral research today, and nearly every aspect of the viral life cycle has become a target for antiviral drug discovery (Mitsuya and Broder, 1987; De Clercq, 1995). Modern antiviral strategies are turning more and more to the use of structure- and mechanism-based approaches for the design of safer, more specific, and effective drugs. The successful introduction of HIV PR inhibitors for AIDS treatment is a testimonial to the importance of structure- and mechanism-based approaches in the
discovery and design of more effective, less toxic antiviral therapies. To date, there are four FDA-approved PR inhibitors: Saquinavir, Ritonavir, Indinavir, and Nelfinavir. These compounds are highly potent and play a central role in the development of the highly active antiretroviral therapies that comprise the current standard of care and that, for the first time in the brief history of HIV infection, provide dramatic and durable suppression of HIV replication. It is probably safe to assume that much of the attention currently being paid by the pharmaceutical industry to other viral proteases as drug design targets is a "coattail" effect based on the HIV success story. It can be argued that much of the current focus on viral proteases as targets for antiviral therapy really stems from the collision of two disparate disciplines: protease biochemistry and virology. Thus, it is worth reflecting on the scientific developments that led up to the identification of HIV PR as an attractive target for drug design.
III. HIV PROTEASE: BIOLOGY, BIOCHEMISTRY, AND STRUCTURE
A. ROLE OF HIV PROTEASE IN THE VIRAL LIFE CYCLE
The HIV genome, like all other retroviral genomes, is a single-stranded, positive-sense RNA molecule that is organized into three major coding elements: the gag, pol, and env genes. The gag and pol gene products are translated from a single unspliced polycistronic mRNA that encodes both genes (Fig. 2). A stop codon in the unspliced RNA leads to the translation of a 55-kDa Gag polyprotein, Pr55gag, that contains sequences of the structural proteins of the virion, namely matrix (MA), capsid (CA), and nucleocapsid (NC), along with the peptides p2, p1, and p6 that are involved in the assembly and morphogenesis of mature capsids. The pol gene encodes the viral enzymes necessary for replication: protease (PR), reverse transcriptase (RT), and integrase (IN). These proteins
FIGURE 1 The HIV life cycle. Human immunodeficiency virus infects a T cell via recognition of the CD4 receptor on the cell surface. Fusion of the viral envelope and cell membranes leads to cytoplasmic invasion by the nucleoprotein core of the virus. Proviral DNA is synthesized using the virion-associated reverse transcriptase and tRNA as a primer. Integration of proviral DNA is mediated by a viral integrase, also present in the infecting virion, and host factors. Transcription of proviral DNA into spliced and unspliced RNAs provides mRNAs for translation of the gag, pol, and env gene products as well as viral RNA for packaging. Assembly of the Gag and Gag-Pol precursor proteins and packaging of the viral RNA occurs at the cell membrane. Extracellular budding of virions results in the acquisition of an envelope which contains the viral env proteins required for subsequent rounds of receptor recognition and fusion. Processing of the Gag and Gag-Pol polyproteins occurs during budding and release and is mediated by a viral protease. Reprinted from Coffin (1996) with permission.
FIGURE 2 Genome organization and translational strategy for HIV. Structural (gag, pol, and env) genes are shaded; regulatory (tat and rev) and accessory (vif, nef, vpr, and vpu) genes are clear. Common to all retroviruses, the gag and pol gene products are translated on free ribosomes in the cytoplasm from newly synthesized unspliced viral RNA. Translation usually occurs through to a stop codon at the 3' end of the gag gene resulting in the structural polyprotein Pr55gag. About 5% of the time ribosome frameshifting during translation of gag results in the synthesis of a Gag-Pol fusion protein, Pr160gag-pol. The frameshift site (fs) is located upstream of the Gag p6 protein such that a transframe polypeptide, TF, is incorporated into Gag-Pol in place of p6. The functions of the p6 and TF proteins are unclear. The total number of amino acids contained by each polyprotein is indicated at the end of each molecule. See text for individual protein abbreviations. Reprinted from Swanstrom and Wills (1997) with permission.
are also translated as part of a larger polyprotein precursor, Pr160gag-pol, which results from ribosomal frameshift and readthrough during translation of the gag gene. The frameshift site has been mapped to lie between the NC and p6 coding sequences. The gag and gag-pol gene products in mature virions are found in a ratio of 20:1, which represents the frequency of ribosomal frameshifting, about 5%. Thus, frameshifting is used as a regulatory mechanism to ensure that large numbers of the structural proteins of the virion are synthesized relative to the viral enzymes, which are needed in only catalytic amounts. The N-termini of Pr55 and Pr160 both contain a covalently attached myristic acid moiety that is added cotranslationally and targets the polyproteins to the cellular membrane where virus assembly and budding take place (Fig. 3). The Pr55 precursor protein is believed to play a central role in directing virion assembly and RNA packaging based on studies with other retroviruses that show that enveloped nucleoprotein core particles can form from Gag precursor proteins in the absence of pol and env gene products. Proteolytic processing of Pr55gag and Pr160gag-pol during virus assembly and maturation is performed by the viral PR, which is itself encoded by the pol gene (Swanstrom and Wills, 1997, and references therein). The env gene product, gp160, is processed into gp120 and gp41 by a cellular protease. The processing products of HIV PR include the gag-encoded structural proteins and peptides (MA, CA, NC, p1, p2, and p6) and the pol enzymes (RT, IN, and PR). All of
FIGURE 3 Budding and maturation of HIV-1 from an infected cell. Both immature (third panel from left) and mature (fourth panel from left) forms are shown. The mature virion exhibits the elongated capsid morphology typical of most lentiviruses. Inactivating mutations in the viral protease gene, or the presence of protease inhibitors, blocks the morphogenesis of immature to mature virion. Reprinted from Swanstrom and Wills (1997) with permission.
these products are found in mature infectious viral particles and result from cleavages at unique amino acid sequences that span the N- and C-termini of the mature proteins (Debouck, 1992) (Fig. 4). The sequences recognized by HIV PR are diverse, but certain general features emerge. Hydrophobic amino acids are preferred at the P1-P1' residues that flank the scissile peptide bond, aliphatic and Glu/Gln residues are often found at P2', aromatic residues are almost never found at P3', and small residues are preferred at P2. Several sequences contain an aromatic residue at P1 followed by a Pro at P1'. Although
PROCESSING SITES FOR HIV-1 PROTEASE
Site             P4  P3  P2  P1 / P1' P2' P3' P4'
MA/CA            -Ser-Gln-Asn-Tyr/Pro-Ile-Val-Gln-
CA/p2            -Ala-Arg-Val-Leu/Ala-Glu-Ala-Met-
p2/NC            -Ala-Thr-Ile-Met/Met-Gln-Arg-Gly-
NC/p1            -Arg-Gln-Ala-Asn/Phe-Leu-Gly-Lys-
p1/p6            -Pro-Gly-Asn-Phe/Leu-Gln-Ser-Arg-
TF/PR            -Ser-Phe-Asn-Phe/Pro-Gln-Ile-Thr-
PR/RT            -Thr-Leu-Asn-Phe/Pro-Ile-Ser-Pro-
RT/IN            -Arg-Lys-Val-Leu/Phe-Leu-Asp-Gly-
RT (internal)    -Ala-Glu-Thr-Phe/Tyr-Val-Asp-Gly-
FIGURE 4 Cleavage site sequences in Gag and Gag-Pol polyproteins recognized by HIV PR. Cleavage occurs between residues in the P1/P1' positions and they are indicated in bold and separated by a slash. The RT (internal) represents a PR-mediated cleavage at the junction of the p51/RNase H domains which yields the active p66/p51 heterodimer found in isolated virus particles (Tomasselli et al., 1993). The nomenclature of Schechter and Berger (1967) is used to designate residue positions in the substrate sequence.
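As a rough illustration of the general features noted in the text, the Figure 4 sequences can be written down as data and tallied by position. This is a minimal Python sketch, not part of the original chapter: the names `CLEAVAGE_SITES`, `HYDROPHOBIC`, `residues_at`, `count_hydrophobic`, and `aromatic_pro_sites` are hypothetical, and the membership of the hydrophobic set is an assumption chosen for the example.

```python
# Cleavage-site sequences transcribed from Figure 4 as (P4..P1, P1'..P4').
# Assumptions: the PR/RT P1'-P4' residues are unreadable in this copy, so the
# commonly reported Pro-Ile-Ser-Pro is used; "Alu" in the RT (internal) row
# is read as Ala.
CLEAVAGE_SITES = {
    "MA/CA": ("Ser Gln Asn Tyr", "Pro Ile Val Gln"),
    "CA/p2": ("Ala Arg Val Leu", "Ala Glu Ala Met"),
    "p2/NC": ("Ala Thr Ile Met", "Met Gln Arg Gly"),
    "NC/p1": ("Arg Gln Ala Asn", "Phe Leu Gly Lys"),
    "p1/p6": ("Pro Gly Asn Phe", "Leu Gln Ser Arg"),
    "TF/PR": ("Ser Phe Asn Phe", "Pro Gln Ile Thr"),
    "PR/RT": ("Thr Leu Asn Phe", "Pro Ile Ser Pro"),
    "RT/IN": ("Arg Lys Val Leu", "Phe Leu Asp Gly"),
    "RT (internal)": ("Ala Glu Thr Phe", "Tyr Val Asp Gly"),
}

# Schechter-Berger positions, N-terminal to C-terminal across the scissile bond.
POSITIONS = ["P4", "P3", "P2", "P1", "P1'", "P2'", "P3'", "P4'"]

# Assumed working definition of "hydrophobic" for this illustration only.
HYDROPHOBIC = {"Ala", "Val", "Leu", "Ile", "Met", "Phe", "Tyr", "Trp", "Pro"}
AROMATIC = {"Phe", "Tyr", "Trp"}


def residues_at(position):
    """Return {site name: residue} for one substrate position."""
    index = POSITIONS.index(position)
    result = {}
    for site, (n_side, c_side) in CLEAVAGE_SITES.items():
        residues = n_side.split() + c_side.split()
        result[site] = residues[index]
    return result


def count_hydrophobic(position):
    """Count sites whose residue at this position is in HYDROPHOBIC."""
    return sum(1 for res in residues_at(position).values() if res in HYDROPHOBIC)


def aromatic_pro_sites():
    """List sites with an aromatic P1 residue followed by Pro at P1'."""
    p1 = residues_at("P1")
    p1_prime = residues_at("P1'")
    return [s for s in CLEAVAGE_SITES if p1[s] in AROMATIC and p1_prime[s] == "Pro"]


if __name__ == "__main__":
    for pos in ("P1", "P1'"):
        print(pos, "hydrophobic in", count_hydrophobic(pos), "of", len(CLEAVAGE_SITES))
    print("Aromatic P1 / Pro P1' sites:", aromatic_pro_sites())
```

Under these assumptions, three of the nine listed sites (MA/CA, TF/PR, and PR/RT) pair an aromatic P1 residue with Pro at P1', and NC/p1 is the only listed site whose P1 residue (Asn) falls outside the assumed hydrophobic set.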
all retroviral proteases appear to be structurally and functionally related, their cleavage site preferences vary widely. Efforts to predict cleavage sites for HIV PR have met with limited success (Poorman et al., 1991; Chou and Zhang, 1993) and our understanding of the basis of PR specificity is incomplete (Dunn et al., 1994; Katz and Skalka, 1994). However, identification of cleavage site sequences quickly led to the successful generation of a variety of synthetic substrates that facilitated the design of rapid and quantitative assays of PR activity (Hellen, 1994; Krafft and Wang, 1994). The HIV PR has long been known to be toxic to cells and this has prompted a search to identify cellular proteins that may be cleaved by HIV PR. Several investigators have shown that key proteins, such as NF-κB and certain cytoskeletal proteins, are cleaved in HIV-infected cells (Shoeman et al., 1993). The possible involvement of PR in the early stages of retroviral replication was suggested initially on the basis of observations with equine infectious anemia virus (Roberts and Oroszlan, 1989) and later with HIV (Baboonian et al., 1991; Nagy et al., 1994). However, recent data from several groups have demonstrated that PR inhibitors fail to block the synthesis of proviral DNA, its integration into cellular DNA, and transcription (Jacobsen et al., 1992; Uchida et al., 1997). Similar conclusions were reached using conditional lethal HIV-1 PR mutants as a probe (Kaplan et al., 1996). In 1988, the late I. Sigal and co-workers observed that deletion mutagenesis of the HIV PR gene resulted in the production of virus particles that had an immature morphology and were noninfectious (Kohl et al., 1988). These results were confirmed by mutation of the active site aspartic acids and subsequently by chemical inhibition with PR inhibitors (Seelmeier et al., 1988; McQuade et al., 1990).
This seminal experiment provided conclusive proof that HIV PR is essential for the life cycle of HIV and defined this enzyme as an important target for the design of specific antiviral agents for AIDS. A similar conclusion had been reached for the PR of murine leukemia virus in 1985 (Crawford and Goff, 1985; Katoh et al., 1985). However, the HIV PR studies provided the boost needed by many groups to launch drug discovery programs for this target.
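The roughly 20:1 ratio of Gag to Gag-Pol quoted earlier in this section follows from the frameshift frequency by simple arithmetic. The Python sketch below is an illustration only; it assumes that each translation event independently yields Gag-Pol with probability f and Gag otherwise, and the function name `gag_to_gagpol_ratio` is hypothetical.

```python
def gag_to_gagpol_ratio(frameshift_frequency):
    """Expected number of Gag polyproteins made per Gag-Pol polyprotein.

    Assumes every translation event ends in either Gag (no frameshift)
    or Gag-Pol (frameshift), with a fixed frameshift probability.
    """
    if not 0.0 < frameshift_frequency < 1.0:
        raise ValueError("frameshift frequency must lie strictly between 0 and 1")
    return (1.0 - frameshift_frequency) / frameshift_frequency


if __name__ == "__main__":
    ratio = gag_to_gagpol_ratio(0.05)  # about 5% frameshifting, as in the text
    print(f"Gag : Gag-Pol is about {ratio:.0f} : 1")
```

With f = 0.05 the expected ratio is 19:1, which is consistent with the approximate 20:1 figure given in the text.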
B. STRUCTURE AND MECHANISM OF HIV PROTEASE
Identification of the mechanistic family that a viral protease belongs to is the key to predicting its structure and function and may unlock strategies for inhibitor design that were previously developed for homologous members of the family. This concept was extremely valuable for the design of HIV PR inhibitors and led directly to the design of the first approved drug, Saquinavir, long before
the structure of the enzyme was known. Homology modeling and biochemical inhibition studies on retroviral proteases had led to the hypothesis that these enzymes were related mechanistically to the aspartic proteinase family, typified by pepsin (Toh et al., 1985; Katoh et al., 1987). The active site of these bilobed enzymes contains two aspartic acids, one from each lobe, that participate in the catalysis of peptide bond breakage. Crystal structures of aspartic proteinases from cellular organisms revealed that the N- and C-domains associate to form an active site with approximate twofold symmetry at the protein backbone level (Davies, 1990). These observations led to the suggestion that the cellular enzymes had evolved via a duplication event of a primordial aspartic protease gene (Tang et al., 1978). Since the sequence length of retroviral proteases is about one-third that of typical aspartic proteases, it was proposed that the former enzymes are composed of two identical subunits, each of which contributes a single aspartic acid to the active site (Pearl and Taylor, 1987). This hypothesis was verified by the crystal structure determination of Rous sarcoma virus (RSV) protease by Wlodawer and colleagues (Miller et al., 1989) and subsequently of HIV PR by several laboratories (Lapatto et al., 1989; Navia et al., 1989; Wlodawer et al., 1989; Spinelli et al., 1991). So far, retroviruses are the only family of viruses whose proteases have adopted an aspartic proteinase mechanism. The HIV PR dimer consists of two identical, noncovalently associated subunits of 99 amino acid residues associated in a twofold (C2) symmetric fashion (Wlodawer and Erickson, 1993) (Fig. 5a). The dimer is stabilized by a four-stranded antiparallel β-sheet formed by the interlocking N- and C-termini of each subunit. The active site of the enzyme is actually formed at the dimer interface and contains two conserved catalytic aspartic acid residues, one from each monomer.
The substrate binding cleft is composed of equivalent residues from each subunit and is bounded on one side by the active site aspartic acids, Asp25 and Asp125, and on the other by a pair of twofold related, antiparallel β-hairpin structures, or "flaps." Comparison of the structure of HIV PR with that of a complex with a peptide-based inhibitor (Miller et al., 1989) shows that the flap undergoes significant structural changes upon binding and that it makes several direct interactions with inhibitor (Fig. 5b). Molecular dynamics studies indicate that the flaps are highly flexible and must undergo large localized conformational changes during the binding and release of inhibitors and substrates (Collins et al., 1995). Crystal packing forces apparently maintain the flap in a conformation that is unsuitable for substrate binding in the structure of the uncomplexed form of the enzyme (York et al., 1993). The crystal structures of RSV and HIV PR revealed that, despite the apparent lack of sequence homology, aspartic proteinases and retroviral proteases display considerable structural homology at the backbone level (Rao et al., 1991). Fully one-third of the main chain atoms of RSV PR can be superposed onto the backbone of
John W. Erickson and Michael A. Eissenstat
FIGURE 5 Chain tracings of HIV PR from crystal structures of (A) unbound and (B) inhibited forms of the enzyme. Inhibitor is drawn as stick figure.
porcine pepsin to within 1.5 Å root-mean-square deviation. As expected, most of the structural correspondence is in the active site region. However, the overall chain topologies of the two families of enzymes are more similar than a simple superposition analysis reveals and are indicative of a distant but definite relationship to a common, ancestral aspartic proteinase gene.
HIV Protease
C. STRUCTURE OF HIV PROTEASE INHIBITOR COMPLEXES
To date, several hundred crystal structures have been solved for various HIV protease/inhibitor complexes, a testimony to the importance placed on structural information in the process of inhibitor design (Fitzgerald and Springer, 1991; Huff, 1991; Tomasselli et al., 1991; Meek, 1992; Abdel-Meguid, 1993; Appelt, 1993; Erickson, 1993; Wlodawer and Erickson, 1993). Structural comparison of the inhibitor complexes reveals certain common features (Fig. 6). The inhibitor and enzyme make a pattern of complementary hydrogen bonds between their backbone atoms. In some instances, these hydrogen bonds are mediated by bridging water molecules. A unique feature found in the structure of HIV PR/inhibitor complexes is the presence of a water molecule that forms bridging hydrogen bonds between the NH atoms of Ile50 and Ile150 in the two flaps and the P2 and P1' backbone carbonyl groups of the inhibitor. This water is close to the twofold axis of the enzyme and is distinct from the water molecule that has been identified in the active site of uncomplexed structures of aspartic proteinases, including HIV PR. The latter has been implicated in substrate catalysis. The enzyme also contains a number of well-defined pockets, or subsites, in its active site region into which inhibitor side-chains protrude, resulting in tight binding interactions between enzyme and inhibitor. Since a similar pattern of hydrogen bonds is believed to be made for both substrates and peptidomimetic inhibitors, specificity is believed to reside in the pattern of largely nonpolar subsite interactions between inhibitor and enzyme side-chain atoms. Overall, knowledge of the structure and function of HIV PR and its relationship to other aspartic proteinases has led to the successful development of a wide variety of potent and chemically diverse inhibitors as is discussed in the next section.
IV. DESIGN OF HIV PROTEASE INHIBITORS
The design of clinically effective HIV PR inhibitors has been a major success story for structure-based design (for reviews on clinically effective protease inhibitors see Vacca and Condra, 1997; Flexner, 1998). Several HIV PR inhibitors are currently in widespread use for the treatment of patients with AIDS (Mous et al., 1994; Ho et al., 1995; Wei et al., 1995; Markowitz et al., 1998). These compounds represent a new class of therapeutic agents that complement already-licensed antivirals (AZT, ddI, ddC, and d4T), all of which inhibit HIV RT. Although the initial lead compounds have been generated in various ways, the availability of protein crystal structures of these leads with
HIV PR facilitated structure-based approaches to the optimization of interactions to increase potency. Once potent inhibitors of the enzyme were generated, considerations such as improving antiviral potency, improving bioavailability (Kempf, 1994), and reducing cost could be addressed within the known structural limitations provided by these crystal structures. We make no attempt to describe in detail the enormous amount of medicinal chemistry that has been done over the past decade that has led to the development of these successful drugs. Rather, we describe the various approaches that have been used to generate potent HIV PR inhibitors with an emphasis on recent developments and in particular how they might address the two major challenges remaining for drugs of this class: pharmacokinetics and effectiveness against resistant mutants. The inhibitors can be classified on the basis of the rationale used for their design as substrate-based, structure-based, nonpeptidic, and irreversible inhibitors. These distinctions are somewhat arbitrary since many inhibitors can fall into more than one category. We apologize in advance for placing compounds into a class that others may not agree with. One indication of the enormous activity in this field is the large number of cores or templates that have been used for elaboration of HIV protease inhibitors. Table I lists the various cores along with representative examples of potent inhibitors. Cores are arranged approximately by size according to the portion delimited by a heteroatom (usually nitrogen) or carbonyl connected by atoms to another heteroatom or carbonyl, allowing for a shorthand nomenclature for grouping purposes. When looking at the compounds in this way one sees that there are close similarities between the structures of the series pursued by the various groups. 
As will become apparent (see below), the clinically successful compounds often resulted from incorporating pieces discovered by various groups onto a proprietary core or introducing a proprietary group onto an established core. In the discussions below structural references to a specific core in Table I are made whenever possible.
A. SUBSTRATE-BASED INHIBITORS
The close structural and functional relationships between the retroviral and cellular aspartic proteases, together with knowledge of the HIV PR cleavage site sequences on the Gag and Gag-Pol polyproteins, immediately opened the avenue of peptidomimetic substrate-based approaches that had been developed for designing inhibitors of human renin, an aspartic protease that was a popular target for the design of antihypertensive agents in the 1980s. Screening compounds generated in these programs rapidly identified reasonably potent inhibitors of HIV PR. Many of the cores shown in Table I had been previously generated as substrate mimetics for renin programs. Substrate peptidomimetic
TABLE I HIV Inhibitor Cores
(Structure drawings are not reproduced; where a compound name accompanied a drawing, it is listed in the Structure column.)

Core (shorthand) | Core (formula) | Structure | Source
NCCN | NH(CH2)2NH | - | Glaxo (Humber et al., 1993)
NCCC=O | NHCH(R)CH(OH)CO | KNI-272 | Japan Energy (Mimoto et al., 1992; Kageyama et al., 1993; El-Farrash et al., 1994); Syntex (Tam et al., 1992); Takeda (Kitazaki et al., 1994)
 | NHCH(R)COCO | - | Takeda (Kitazaki et al., 1994); Scripps (Slee et al., 1995)
NCCCN | NHCH(R)CH(OH)CH(R')NH | A-74704 | Abbott (Erickson et al., 1990; Kempf et al., 1990)
 | NHCH(R)CH(OH)CH2N(R)CO | SC-52151 | Monsanto-Searle (Getman et al., 1993)
 | NHCH(R)CH(OH)CH2N(R)SO2 | VX-478 (Amprenavir) | Monsanto-Searle (Getman et al., 1993); Vertex (Kim et al., 1995)
 | NHCH(R)CH(OH)CH2NR2 | Ro 31-8959 (Saquinavir); AG 1343 (Nelfinavir); BILA 2011 (Palinavir) | Roche (Roberts et al., 1990; Thomas et al., 1994); Boehringer, Biomega (…; Beaulieu et al., 1997); Agouron, Lilly (Kalish et al., 1995; Kaldor et al., 1997)
 | NHCH(R)CH(OH)CH2NR'NH | - | Abbott (Sham et al., 1993, 1995); Ciba-Geigy (Fassler et al., 1993); Narhex (Grobelny et al., 1997)
 | NHCH(R)CH(OH)CH2NR'R''NH | AQ-148 | UCSF (Rutenber et al., 1996)
NCPCN | NHCH(R)PO(OH)CH(R')NH | SB-204,144 | SKB (Abdel-Meguid et al., 1993a)
NCCCC=O | NHCH(R)CH(OH)CH2CO | - | Glaxo (Holmes et al., 1993)
 | NHCH(R)CH(OH)CH(NHR')CO | - | Sandoz (Billich et al., 1994; Scholz et al., 1994)
 | NHCH(R)COCF2CO | MDL-74695 | MMD (Taylor et al., 1997)
NCCCS | NHCH(R)CH(OH)CH2S | - | Lilly (Cho et al., 1994)
OCCCO | OCH(R)CH(OH)CH(R')O | - | Lederle (Babine et al., 1992)
NCCCCN | NHCH(R)CH(OH)CH(OH)CH(R')NH | A-77003 | Abbott (Kempf et al., 1990, 1993, 1995a); SKB (Dreyer et al., 1993); Hoechst, Bayer (Budt et al., 1995; Lange-Savage et al., 1997); NCI (Hosur et al., 1994; Randad et al., 1996)
 | NHCH(R)CH(OH)CH2CH(R')NH | A-80987; A-84538 (ABT-538, Ritonavir) | Abbott (Kempf et al., 1993)
 | NHCH(R)CH(OH)CF2CH(R')NH | - | Abbott (Kempf et al., 1991)
 | NHCH(R)COCF2CH(R')NH | A-79295 | Abbott (Kempf et al., 1991)
 | NHCH(R)CH(O)CHCH2NH | - | Lilly (Munroe and Hornback, 1993); LG Chemical (Park et al., 1995)
NCPCCN | NHCH(R)P(O)(OH)CHOHCH(R')NH | - | Hoechst (Stowasser et al., 1992)
NCCCCC=O | NHCH(R)CH(OH)CH2CH(R')CO | - | Merck (Vacca et al., 1991); SKB (Dreyer et al., 1992); Upjohn (Mulichak et al., 1993)
 | NR2CH2CH(OH)CH2CH(R')CO | MK-639 (L-735,524, Indinavir) | Merck (Dorsey et al., 1994a,b)
 | NHCH(R)CH(O)CHCH2CO | LB-71350 | LG Chemical (Lee et al., 1996; Yoon et al., 1997)
NCCNCC=O | NHCH(R)CH2NHCH(R')CO | - | Czech Academy of Science (Urban et al., 1992)
NCCCCCN | NHCH(R)CH(OH)CH2CH2CH(R')NH | - | U. Montreal (Hanessian and Devasthale, 1996)
O=CCCCCCC=O | COCH(R)CH2CH(OH)CH2CH(R')CO | L-700417 | Merck (Bone et al., 1991); Lederle (Babine et al., 1992)
NCCCNCCCN | NHCH(R)CH(OH)CH2NHCH2CH(OH)CH(R)NH | BMS-186318 | BMS (Patick et al., 1995)
Nonpeptide | NHCH(R)CH(OH)CH2R' | AG-1254 | Agouron (Kalish et al., 1995); Roche (Thomas et al., 1994)
 | NHCH(R)CH(OH)CH2CH(R')Ar | SB-206343 | SKB (Thompson et al., 1994)
 | NHCH(R)CH(O)CH2CH2SO2NR' | - | LG Chemical (Choy et al., 1997)
Xiayang Qiu and Sherin S. Abdel-Meguid
tissues. The γ-subfamily is specific for either B- or T-lymphocytes (lymphotropic) and exists at either a latent or lytic stage but does not produce infectious progeny. Viruses of the α-subfamily are among those causing serious diseases. The herpes simplex viruses were the first of the human herpesviruses to be discovered and are among the most extensively investigated of all viruses. Herpes simplex virus type 1 (HSV-1) is the virus responsible mainly for herpes labialis (cold sores), while HSV-2 causes genital herpes. The latter disease is of increasing public health importance. The recurrent nature of the infection, its differing clinical manifestations, and complications such as aseptic meningitis and neonatal infection are of great concern to patients and health care providers. Varicella-zoster virus (VZV) is responsible for chickenpox, shingles, and postherpetic neuralgia. Primary exposure to VZV results in chickenpox, reactivation of the virus following a period of latency gives rise to shingles, and postherpetic neuralgia is probably the result of nerve damage during the active replication phase of shingles. Cytomegalovirus (CMV) is a ubiquitous opportunistic pathogen that can cause life-threatening infections in congenitally infected infants, immunocompromised individuals, and immunosuppressed transplant patients. Both HHV-6 and HHV-7 have been associated with childhood diseases such as roseola. Epstein-Barr virus (EBV) can cause mononucleosis, and HHV-8 has recently been linked to the development of Kaposi's sarcoma.
II. ROLE OF THE PROTEASE IN THE VIRUS LIFE CYCLE
Herpesviruses are enveloped double-stranded DNA viruses that share a common pathway of assembly. The DNA is packaged into an icosahedral capsid in the nucleus of infected cells. The icosahedral capsid is surrounded by an amorphous material referred to as the tegument and enclosed within a lipid envelope of cellular origin that is acquired while the virus buds from the infected cell. Packaging of the viral DNA requires processing of an assembly protein precursor designated ICP35 in HSV-1. The processed ICP35 appears to form an inner scaffold that supports the proper assembly of the capsid. It is found in the capsid prior to DNA packaging, but is absent in the mature virions. The precursor ICP35 is processed through removal of 25 amino acid residues from its C-terminus by a virally encoded 635-amino-acid serine protease that contains the assembly protein at its C-terminus (Fig. 1). This site of cleavage is known as the maturation (M) site. The protease is also capable of catalyzing its own cleavage at the release (R) site to produce a 247-residue N-terminal domain that has full catalytic activity (Fig. 1).
Human Herpesvirus Proteases
FIGURE 1 HSV-1 protease (UL26 gene product) and its substrate (UL26.5 gene product). Maturation (M-site) and release (R-site) cleavages are indicated.
Liu and Roizman (1991) and Welch et al. (1991) were the first to report the identification of these serine proteases from herpesviruses; the former identified the protease from HSV-1 and the latter from CMV. These two enzymes are the most studied of all human herpesvirus proteases. Gao et al. (1994) showed, using a null mutant virus, that the HSV-1 protease is essential for capsid formation and production of infectious virus, making it an attractive target for therapeutic intervention. This and other studies showing that herpesvirus proteases are essential for the virus life cycle have been summarized in recent reviews by Holwerda (1997) and Gibson and Hall (1997).
III. PRIMARY STRUCTURES
The full-length proteases of the various human herpesviruses range from 512 amino acids in HHV-7 to 708 amino acids in CMV. The N-terminal (catalytic) domains range from 226 amino acids in HHV-7 to 256 amino acids in CMV, indicating that most of the variability in the size of these proteases is in the C-terminal (assembly protein) portion. The catalytic domains show significant sequence homology within each subfamily of herpesvirus, but only limited homology between the different subfamilies (Fig. 2 and Table II). For example, the amino acid sequence of the HSV-1 protease catalytic domain is 91 and 50% identical to that of HSV-2 and VZV, respectively, while it is only 26% identical to that of CMV protease. Extensive homology searches against all known sequence databases revealed little homology to any other protein. Liu and Roizman (1992) showed that the HSV-1 protease was inhibited by
TABLE II Amino Acid Sequence Identities (%) between Human Herpesvirus Proteases(a)

(%) | HSV-1 | HSV-2 | VZV | CMV | HHV-6 | HHV-7 | EBV | HHV-8
HSV-1 | -- | 91 | 54 | 30 | 24 | 25 | 31 | 30
HSV-2 | 91 | -- | 54 | 30 | 24 | 28 | 31 | 30
VZV | 50 | 51 | -- | 30 | 21 | 24 | 26 | 30
CMV | 26 | 26 | 26 | -- | 41 | 39 | 34 | 37
HHV-6 | 23 | 23 | 21 | 38 | -- | 60 | 29 | 35
HHV-7 | 19 | 19 | 20 | 35 | 59 | -- | 29 | 30
EBV | 26 | 27 | 23 | 31 | 31 | 27 | -- | 45
HHV-8 | 27 | 28 | 26 | 33 | 33 | 28 | 45 | --

(a) Boldface used as in Fig. 2 where the alignment was improved based on crystal structures; italic from GCG BESTFIT.
some serine protease inhibitors, but not by inhibitors of metallo-, acidic, or thiol proteases. This was surprising, given the absence in herpesvirus proteases of the conserved G-X-S/C-G-G motif for chymotrypsinlike and G-T-S-M/A for subtilisinlike proteases, and an early clue that these enzymes could be a novel class of serine proteases.
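The pairwise values in Table II are percent identities computed from sequence alignments. As a rough illustration of what such a number means, the following sketch counts identical residues over the aligned, ungapped columns of two pre-aligned strings. The function name, the gap character, and the choice of denominator (columns without a gap) are assumptions made here for illustration; programs such as BESTFIT may normalize differently, and the two input strings are short toy fragments, not full protease sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must already be aligned (equal length, gaps marked with `gap`).
    The denominator convention is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Toy example: two 14-residue fragments that differ at one position.
fragment_1 = "GALVNASSAAHVDV"
fragment_2 = "GALVNASSAAHVNV"
print(round(percent_identity(fragment_1, fragment_2), 1))
```

Because the result depends on how the alignment was produced, the same pair of proteins can give different identities under a structure-based alignment and under an automated one.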
IV. ENZYMATIC ACTIVITY
All human herpesvirus proteases cleave a peptide bond between an alanine and a serine, with CMV protease being the most extensively characterized in terms of its enzymatic activity (Burck et al., 1994; Pinko et al., 1995; Liang et al., 1998). The purified wild-type CMV protease shows a clip between Ala143 and Ala144. This site of cleavage is referred to as the internal (I) site. The CMV protease is the only member of the family known to show this I-site clip; all others have been purified as a single chain. A number of mutations to ablate cleavage at the I-site have been engineered (Pinko et al., 1995; Qiu et al., 1996), with the resulting single-site mutant having nearly the same catalytic activity as the wild-type (clipped) enzyme (Table III). This suggests that the immediate vicinity of the I-site is involved in neither catalysis nor substrate binding, as was confirmed later from the crystal structure of CMV protease (the I-site being part of a disordered loop on the surface of the enzyme). The CMV protease differs in its catalytic activity toward the R- and M-sites. The turnover rate of the M-site (GVVNA↓SCRLA) cleavage is an order of magnitude faster than that of the R-site (SYVKA↓SVSPE), while the Km values are similar. Moreover, the hydrolysis of the R-site peptide has an optimal pH of about 7, while the hydrolysis of the M-site peptide has a biphasic pH dependence with optima of approximately 7 and 9, probably due to the protonation
of the arginines and the lysine in the peptide substrate (Burck et al., 1994). Unlike CMV protease, HSV-1 protease does not have a preference for the M- over the R-site. Its pH optimum is approximately 8.0, and it is about 10 times slower than CMV protease (DiIanni et al., 1993; Darke et al., 1994; Hall and Darke, 1995). The catalytic activity of the herpesvirus proteases is influenced significantly by the nature and concentration of cosolvents. For CMV protease, the maximal specific activity is increased by about 10-fold in the presence of 30% glycerol (Margosiak et al., 1996). Similar enhancement in catalytic activity has also been reported for the HSV-1 protease (Hall and Darke, 1995). The most dramatic cosolvent effect is observed with sodium citrate. The CMV protease catalytic efficiency increases by 290-fold (kcat/Km of 1.24 min⁻¹ μM⁻¹) in the presence of 0.8 M citrate over that in the absence of the cosolvent (0.0044 min⁻¹ μM⁻¹). Similarly, HSV-1 protease has a kcat/Km of 0.25 min⁻¹ μM⁻¹ (kcat 4 min⁻¹, Km 16 μM) in 0.8 M citrate, compared to the kcat/Km value of 0.0003 min⁻¹ μM⁻¹ in the absence of citrate, an increase of 800-fold in catalytic efficiency. The cosolvent effects led to the discovery that both CMV and HSV-1 proteases are active as homodimers (Margosiak et al., 1996; Schmidt and Darke, 1997). For CMV protease, the dissociation constant is 6.6 or 0.55 μM in 10 or 20% glycerol, respectively (Darke et al., 1996). For HSV-1 protease, the values are 0.96 or 0.23 μM in 20% glycerol and 0.2 or 0.5 M citrate, respectively (Schmidt and Darke, 1997). The dimer association is rather weak, probably an artifact of working with only the catalytic domain. The cosolvent effects on herpesvirus protease activity are likely due to their ability to enhance dimerization and stabilize the conformation of active dimers. Another important observation is the apparent low catalytic efficiency of herpesvirus protease when peptide substrates were used (Table III).
Using authentic M-site derived peptides in 50% glycerol (Burck et al., 1994), CMV protease catalytic efficiency is about 10⁴ times worse than that of chymotrypsin and about 30 times worse than that of HIV protease. The apparent low efficiencies of the herpesvirus proteases may be an important property of their biological functions in the well-orchestrated events during capsid maturation (Babe and Craik, 1997). Although data support the view that herpesvirus proteases are less active than other viral proteases, this may just be an artifact of working with only the catalytic domains. In fact, when the assembly protein is used as substrate, the Km is improved by over 100-fold (with similar kcat) even in the absence of glycerol (Burck et al., 1994). It is also known that HSV-1 protease, when coexpressed with ICP35, exhibits greater catalytic efficiency (Deckman et al., 1992). This suggests that substrate binding may involve interactions beyond the catalytic domain, i.e., with the C-terminal domain in the full-length enzyme.
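The efficiency figures quoted for the citrate effect are simple ratios, and it can help to see the arithmetic spelled out. The sketch below recomputes kcat/Km for HSV-1 protease from the kcat and Km quoted in the text, and then the fold enhancements for both enzymes from the quoted efficiencies. The function names are illustrative only, and because the inputs are the rounded values given in the text, the computed ratios agree with the quoted 800-fold and 290-fold enhancements only approximately.

```python
def catalytic_efficiency(kcat_per_min: float, km_micromolar: float) -> float:
    """Return kcat/Km in min^-1 uM^-1."""
    return kcat_per_min / km_micromolar


def fold_enhancement(with_cosolvent: float, without_cosolvent: float) -> float:
    """Ratio of catalytic efficiencies with and without the cosolvent."""
    return with_cosolvent / without_cosolvent


# HSV-1 protease in 0.8 M citrate: kcat 4 min^-1, Km 16 uM (values from the text).
hsv1_citrate = catalytic_efficiency(4.0, 16.0)

# Fold enhancements from the rounded efficiencies quoted in the text.
hsv1_fold = fold_enhancement(hsv1_citrate, 0.0003)
cmv_fold = fold_enhancement(1.24, 0.0044)

print(f"HSV-1 kcat/Km in citrate: {hsv1_citrate:.2f} min^-1 uM^-1")
print(f"HSV-1 enhancement: about {hsv1_fold:.0f}-fold")
print(f"CMV enhancement: about {cmv_fold:.0f}-fold")
```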
V. THREE-DIMENSIONAL STRUCTURES
A. NOVEL FOLD
The crystal structure of CMV protease has been determined by several groups (Qiu et al., 1996; Shieh et al., 1996; Tong et al., 1996; Chen et al., 1996), followed by those of VZV, HSV-1, and HSV-2 proteases (Qiu et al., 1997; Hoog et al., 1997). Unlike the structures of classic serine proteases having two distinct β-barrel domains, the CMV protease is a single-domain protein. Its overall fold can be described as a seven-stranded orthogonally packed β-barrel core surrounded by seven to eight α-helices and connecting loops (Fig. 3, see color plate). The core β-barrel is mostly antiparallel, except for strands B2 and B6, which are parallel. To our knowledge, the herpesvirus protease three-dimensional fold has not been observed in any other protein. The overall fold of the four herpesvirus proteases with known three-dimensional structures is very similar. As expected, the structures of the three α-herpesvirus (VZV, HSV-1, and HSV-2) proteases are nearly identical, and despite limited sequence identity (Table II), the overall fold of the α-herpesvirus proteases is similar to that of the β-herpesvirus CMV protease (Qiu et al., 1997; Hoog et al., 1997). Superposition of the 197 pairs of equivalent α-carbon atoms of HSV-2 and VZV proteases (Fig. 4) gives an rms deviation of 0.9 Å, while superposition of 142 pairs of those of VZV and CMV protease (Fig. 4) gives 1.3 Å. The core β-barrels of the VZV and CMV proteases superimpose even better (Fig. 4), with rms differences between the 52 pairs of α-carbon atoms
FIGURE 4 Superimposition of α-carbon atoms of VZV and CMV proteases (left), and VZV and HSV-2 proteases (right). The VZV protease structures are in light and thick lines and CMV protease or HSV-2 protease structure is in dark and thin lines.
being only 0.7 Å. As anticipated, most of the significant structural differences between CMV and VZV protease are in the loops (Qiu et al., 1997). Moreover, an additional loop containing a small α-helix (referred to as the AA loop) has been observed in the VZV protease structure (Fig. 3) but not in the CMV protease. The corresponding segment in the CMV protease was either totally or partially disordered in apo-CMV protease structures.
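The rms deviations quoted for these superpositions summarize the distances between equivalent α-carbon pairs. A minimal sketch of that final step is given below. It assumes the two coordinate sets have already been optimally superposed and that the lists are ordered so that equivalent atoms share an index; it does not perform the least-squares fitting itself. The function name and the toy coordinates are illustrative and are not taken from any of the structures discussed.

```python
import math


def rms_deviation(coords_a, coords_b):
    """Root-mean-square distance (same units as the input, e.g. angstroms)
    between paired atoms of two already-superposed coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: two atom pairs, each displaced by exactly 1 unit.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, -1.0)]
print(rms_deviation(set_a, set_b))
```

Note that the value depends on which atom pairs are counted as equivalent, which is why the text reports the number of pairs alongside each deviation.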
B. NOVEL CATALYTIC TRIAD
Unlike all previously known serine proteases having a catalytic triad comprised of a serine, a histidine, and an aspartic acid (Perona and Craik, 1995), the herpesvirus protease's catalytic triad consists of a serine and two histidines. The catalytic serine of HSV-1 protease was identified by DFP-modification experiments (DiIanni et al., 1994). Site-directed mutagenesis was able to determine the catalytic serine and histidine of cytomegalovirus protease (Welch et al., 1993), but failed to identify the third member of the catalytic triad. It was only after the determination of the crystal structure of the CMV protease (Qiu et al., 1996; Shieh et al., 1996; Tong et al., 1996; Chen et al., 1996) that the novel Ser-His-His catalytic triad for the herpesvirus proteases was identified. In CMV protease, the catalytic triad residues are Ser132, His63, and His157. In this chapter, we use CMV protease numbering (Qiu et al., 1997; Fig. 2) to describe all herpesvirus protease residues. This should eliminate confusion and help to standardize numbering of catalytic triad residues as has been done with the chymotrypsin family of serine proteases. The active site of CMV protease is situated in the only region of the core β-barrel that is not sheltered by helices and flanking loops (Fig. 3). The active site region is very shallow with the catalytic residues exposed to solvent. This shallowness is not unreasonable considering the P1-P1' residues (Ala-Ser) are small. Residues of the triad are absolutely conserved among all human herpesvirus proteases (Fig. 2) and superimpose almost perfectly on the Ser195-His57-Asp102 catalytic triad of trypsin (Fig. 5; Qiu et al., 1996; Qiu et al., 1997; Hoog et al., 1997). When the second histidine (His157) is mutated to an alanine, the CMV protease is nearly inactive (Welch et al., 1993) and the HSV-1 protease is completely inactive (Liu and Roizman, 1992).
The residual activity in the CMV H157A mutant protease is reminiscent of that seen in classical serine proteases when the catalytic aspartate is mutated (Perona and Craik, 1995).
C. DIMER INTERFACE
As indicated above, it has been shown that CMV and HSV-1 proteases are active as homodimers (Margosiak et al., 1996; Darke et al., 1996; Schmidt and Darke,
FIGURE 5 Stereoview of the superimposition of catalytic triad residues of CMV protease (light) and trypsin (dark).
1997). A dimer interface related by a twofold crystallographic axis was first identified in the structure of CMV protease (Qiu et al., 1996; Fig. 6). The most distinct structural element in the CMV dimer interface is helix A6, with the two A6 helices almost parallel and A6 of each monomer interacting with helices A1, A2, A3, and A6 of the other (Qiu et al., 1996). Subsequently, similar dimer interfaces were also found in the structures of α-herpesvirus proteases (Qiu et al., 1997; Hoog et al., 1997; Fig. 6). The interface area between the two monomers of these proteases is about 1300 Å². This is in general agreement with the reported (Margosiak et al., 1996; Darke et al., 1996; Schmidt and Darke, 1997) micromolar dissociation constants of herpesvirus proteases and suggests that the crystallographically observed dimer is indeed the biologically active dimer. Although the herpesvirus proteases are active as homodimers, each monomer has a well-defined active site containing all the residues necessary for catalytic activity. The two active sites are on opposite sides of the dimer (Fig. 6, see color plate). Since the dimer interface is distal to the catalytic triad (Fig. 6), dimerization can influence enzymatic activity only indirectly. We had speculated (Qiu et al., 1996) that this occurs by stabilization of the conformation of helix A6. This helix is part of a shallow groove running across the catalytic site. One side of the groove is relatively wide and deep. It is defined by the end of helix A6, the end of strand B6, His63, and the highly conserved Gly164-Arg165-Arg166 segment. This side of the groove has been proposed (Qiu et al., 1997) to be the S' subsite. In the absence of dimer formation, helix A6, the core of the dimer interface, could move toward the active site to block substrate access, thus rendering the enzyme inactive or much less active.
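To give a feel for what micromolar dissociation constants imply for an enzyme that is active only as a dimer, the sketch below solves the simple monomer-dimer equilibrium Kd = [M]²/[D] together with the mass balance Ct = [M] + 2[D] for the fraction of protein present as dimer. This two-state model, the function name, and the 1 μM total protein concentration used in the example are assumptions made here for illustration; only the two Kd values for CMV protease (6.6 and 0.55 μM in 10 and 20% glycerol) come from the text.

```python
import math


def dimer_fraction(total_monomer_um: float, kd_um: float) -> float:
    """Fraction of protein chains present in dimers for 2M <-> D.

    Solves 2[M]^2/Kd + [M] - Ct = 0 for the free monomer concentration [M],
    where Ct is the total concentration expressed in monomer units.
    """
    if total_monomer_um <= 0 or kd_um <= 0:
        raise ValueError("concentrations and Kd must be positive")
    free_monomer = kd_um * (math.sqrt(1.0 + 8.0 * total_monomer_um / kd_um) - 1.0) / 4.0
    return 1.0 - free_monomer / total_monomer_um


# Assumed total protein concentration of 1 uM (illustrative, not from the text).
for glycerol, kd in (("10% glycerol", 6.6), ("20% glycerol", 0.55)):
    print(f"{glycerol}: Kd {kd} uM, dimer fraction {dimer_fraction(1.0, kd):.2f}")
```

Under these assumptions a lower Kd shifts a substantially larger share of the protein into the dimeric form at the same concentration, which is consistent with the view that cosolvents raise activity by promoting dimerization.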
Although dimer formation appears to be a common feature for all herpesvirus proteases, there are notable differences between the dimer interfaces
of α- and β-herpesvirus proteases. The two A6 helices of the dimer are almost parallel in the CMV protease structure but show an approximately 30° twist in the VZV and HSV protease structures (Qiu et al., 1997; Hoog et al., 1997; Fig. 6). Helix A2 adopts a very different conformation in VZV protease compared to that of CMV protease. It is interesting to note that the amino acid residues of helices at the dimer interface are less conserved than those of the β-strands, with the least conserved being those of helix A2 (Fig. 2). These differences further support the notion that the dimer interfaces do not directly contribute to catalysis, but do so indirectly.
VI. SUBSTRATE SPECIFICITY
As indicated above, human herpes protease precursor molecules undergo autoproteolytic cleavage at two sites (R and M), and they all cleave between an alanine and a serine. Studies with both CMV (Sardana et al., 1994) and HSV-1 (DiIanni et al., 1993; McCann et al., 1994) proteases have identified the P4-P1' residues of the M-site as highly conserved between the two proteases and as having sequence-specific interactions with the S and S' subsites of the protease. However, inspection of all sequences at the R- and M-sites (Table IV) shows a consensus of (V,L,I)-X-A↓S, where X is a hydrophilic residue. This recognition sequence is unique among known serine proteases.
TABLE IV Recognition Sites of Several Human Herpesvirus Proteases

Virus | M-site | R-site
HSV-1 | GALVNA*SSAAHVDV | HTYLQA*SEKFKMWG
HSV-2 | GALVNA*SSAAHVNV | HTYLQA*SEKFKIWG
VZV | VNAVEA*SSKAPLIQ | HVYLQA*STGYGLAR
CMV | AGVVNA*SCRLATAS | ESYVKA*SVSPEARA
HHV-6 | PSILNA*SLAPETVN | CTYIKA*SEPPVEII
HHV-7 | PSVVNA*SLTPGQDR | STYIKA*SENLTANN
EBV | KKLVQA*SASGVAQS | ESYLKA*SDAPDLQK
HHV-8 | SNRLEA*SSRSSPKS | PVYLKA*SQFPAGIQ
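The consensus described in the text, an aliphatic residue (V, L, or I) at P3, a hydrophilic residue at P2, alanine at P1, and serine at P1', can be written as a short pattern and checked against recognition sites from Table IV. The sketch below does this with a regular expression, using the asterisk from the table to mark the scissile bond. The particular set of residues treated as hydrophilic is an assumption of this sketch (the text says only "hydrophilic"), as are the names used.

```python
import re

# Assumed set of hydrophilic residues allowed at P2; illustrative choice only.
HYDROPHILIC = "DEKRNQSTH"

# P3 = V/L/I, P2 = hydrophilic, P1 = A, scissile bond (*), P1' = S.
CONSENSUS = re.compile(rf"[VLI][{HYDROPHILIC}]A\*S")

# A selection of M- and R-sites as printed in Table IV.
SITES = {
    "HSV-1 M": "GALVNA*SSAAHVDV",
    "HSV-1 R": "HTYLQA*SEKFKMWG",
    "HSV-2 M": "GALVNA*SSAAHVNV",
    "HSV-2 R": "HTYLQA*SEKFKIWG",
    "CMV M": "AGVVNA*SCRLATAS",
    "CMV R": "ESYVKA*SVSPEARA",
    "HHV-7 M": "PSVVNA*SLTPGQDR",
    "HHV-7 R": "STYIKA*SENLTANN",
    "EBV M": "KKLVQA*SASGVAQS",
    "EBV R": "ESYLKA*SDAPDLQK",
    "HHV-8 M": "SNRLEA*SSRSSPKS",
    "HHV-8 R": "PVYLKA*SQFPAGIQ",
}


def matches_consensus(site: str) -> bool:
    """True if the site contains the (V,L,I)-X-A*S pattern around the cleavage mark."""
    return CONSENSUS.search(site) is not None


for name, site in SITES.items():
    print(f"{name:8s} {site}  consensus: {matches_consensus(site)}")
```

A match to this short motif is necessary but clearly not sufficient for cleavage, since the discussion that follows notes that proteases sharing the same core sequence still differ in which sites they process.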
Despite sharing the same core M-site sequence (VNA↓S), HSV-1 protease does not cleave at the CMV protease M-site; however, CMV protease does cleave at the corresponding HSV-1 site (Welch et al., 1995). Likewise, the EBV protease can process both EBV and CMV assembly proteins, but CMV protease cannot process the EBV assembly protein (Donaghy and Jupp, 1995). The high selectivity of the HSV-1 protease is consistent with studies using peptide substrates. The smallest peptide mimic of the CMV protease M-site that is cleaved by that protease is P4-P4' (Sardana et al., 1994), whereas 13 residues from P5-P8' are required for cleavage by HSV-1 protease (DiIanni et al., 1993). This is surprising given the high sequence homology of residues lining the active site cavity of the two enzymes (Hoog et al., 1997). Therefore, it was suggested that HSV-1 protease has a more extended substrate binding pocket and that differences in substrate specificity between the two enzymes result from differences in loop conformations around the active site cavity (Hoog et al., 1997). These loops show low sequence homology and are of differing lengths.
VII. MECHANISM OF ACTION
A. CATALYTIC MECHANISM The herpesvirus protease structures suggest a novel Ser-His-His catalytic triad, while all other known serine proteases carry a Ser-His-Asp triad. There are two common models for the mechanism of serine proteases (Perona and Craik, 1995). In the "two-proton transfer" model, the negatively charged aspartic acid accepts a second proton to become uncharged during the transition state. In the second model, the most important role for the Asp is the ground-state stabilization of the required tautomer and rotamer of the catalytic histidine. In the crystal structures of herpesvirus protease, His63 forms hydrogen bonds with both Ser132 and His157 (Fig. 5). At the optimal pH (7 to 8) of these enzymes and with the two histidines mostly exposed to solvent, it is most probable that they are neutral. Thus, it is unlikely that in the transition state His157 would accept the complete transfer of the proton it shares with His63, unless it could transfer a proton to a negatively charged residue. On the other hand, His157 can assume the role of properly orienting His63 for catalysis. Therefore, the existence of an active Ser-His-His triad seems to support the second model. An aspartic acid (Asp65) was found in the CMV protease structure near His157, suggesting the possibility of a catalytic tetrad composed of Ser132, His63, His157, and Asp65 (Qiu et al., 1996). The likelihood of a catalytic tetrad was, however, diminished because Asp65 of the CMV protease is not conserved among all human herpes proteases; for example, it is a lysine in the VZV protease and an alanine in the HSV-1 and HSV-2 proteases (Fig. 2). Moreover, the activity of the D65A mutant CMV protease is reduced by only 35% (Liang et al., unpublished data).
Another important element of serine protease catalysis is the existence of an oxyanion hole to stabilize the negatively charged oxygen of the substrate in the transition state. Overlays of the catalytic triad of any of the herpesvirus protease structures with that of trypsin suggested that the highly conserved Arg165 and Arg166 are involved in stabilization of the oxyanion intermediate. Such overlays resulted in superimposition of the Arg165 backbone atoms with those of Gly193 of trypsin. The latter is known to stabilize the oxyanion intermediate through a hydrogen bond with its backbone NH. The Ser195 of trypsin is also known to stabilize the enzyme active site oxyanion intermediate through a hydrogen bond with its backbone NH. The equivalent residue in the herpes proteases is absent. Instead, a water molecule (Fig. 7) held by the sidechain of Arg166 in the viral proteases was proposed to form a hydrogen bond with the oxyanion (Qiu et al., 1996; 1997; Hoog et al., 1997). The roles of Arg165 and Arg166 in catalysis are further supported by the fact that both residues are absolutely conserved among all herpes proteases (Fig. 2) and the fact that the CMV R165A mutant protease still has 30% of the wild-type activity while the R166A mutant is about four orders of magnitude less active (Liang et al., 1998).
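The residual activities quoted for the D65A, R165A, and R166A mutants can be placed on a common energy scale. The sketch below is illustrative only: it assumes that each residual activity approximates a ratio of rate constants, that transition-state theory applies, and that the temperature is 25 °C. None of these assumptions comes from the cited studies, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3   # gas constant, kcal/(mol*K)
T_KELVIN = 298.15   # assumed temperature (25 C); not stated in the text


def ddg_from_residual_activity(fraction_of_wild_type: float) -> float:
    """Apparent loss of transition-state stabilization, in kcal/mol.

    Uses ddG = -RT * ln(activity_mutant / activity_wild_type), treating the
    activity ratio as a ratio of rate constants (an assumption).
    """
    if fraction_of_wild_type <= 0:
        raise ValueError("residual activity must be positive")
    return -R_KCAL * T_KELVIN * math.log(fraction_of_wild_type)


# Residual activities taken from the text above.
mutants = {
    "D65A": 0.65,    # activity reduced by 35%
    "R165A": 0.30,   # 30% of wild-type activity
    "R166A": 1e-4,   # about four orders of magnitude less active
}

for name, fraction in mutants.items():
    print(f"{name}: {ddg_from_residual_activity(fraction):.2f} kcal/mol")
```

Under these assumptions the D65A and R165A substitutions cost well under 1 kcal/mol, whereas R166A costs roughly 5.5 kcal/mol, which is in line with the argument that Arg166 (through its bound water) matters far more for oxyanion stabilization than the Arg165 sidechain or Asp65.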
B. MODE OF PROTEOLYTIC PROCESSING There is no direct experimental evidence to support dimer formation of full-length herpesvirus proteases. However, the extent and the intricate nature of the dimer interface of the catalytic domain suggest that the full-length proteases will also form dimers. Although greater enzymatic activity of the catalytic domain is attained with the influence of cosolvents that are thought to enhance dimer formation, the full-length protease is quite active in the absence of cosolvents. This implies that dimer formation may be further stabilized by the assembly protein or the C-terminal portion of the full-length protease. Little is known about autoprocessing of the herpes proteases. Inspection of the crystal structure shows the active site and the C-terminus are on opposite sides within a protease monomer (Fig. 6). Within a dimer, the C-terminus of one monomer is on the same side as the active site of the other monomer and they are connected by a well-defined groove (Fig. 6). However, not only are they 29 Å apart, but also considerable conformational change must occur in order for the C-terminus to position itself properly and assume the correct orientation in the active site. Thus, the structure suggests that the protease is unlikely to act in cis, at least at the R-site.
Xiayang Qiu and Sherin S. Abdel-Meguid
FIGURE 7 The active site of HSV-2 protease with di-isopropyl phosphate (DIP) bound. Key hydrogen bonds are connected by dashed lines. The oxyanion hole is predicted to be between Arg165 N and Wat2.
VIII. INHIBITORS Knowledge that the protease is essential for viral replication (Gao et al., 1994) has stimulated research by a number of pharmaceutical companies to identify inhibitors that can be used as drugs to combat diseases caused by herpesviruses, particularly CMV. As with most drug discovery programs, the goal of these
programs is to design novel, potent, low-molecular-weight molecules devoid of peptide character. Although the bulk of the work to develop inhibitors of herpesvirus proteases has yet to be made public, some of it has been recently reported (see below). Most of the inhibitors reported to date have been designed prior to knowledge of the three-dimensional structures of any of the herpesvirus proteases. They have been either derived from a substrate or designed based on molecules that are previously known to be classical inhibitors of serine proteases and act by covalently and reversibly binding to the active site serine hydroxyl. Given the shallowness of the active site cavity of the herpesvirus proteases, such molecules may offer an advantage over those that do not bind covalently.
A. PEPTIDE INHIBITORS Studies to identify peptide inhibitors were initiated to define the minimal element in the substrate that can act as a competitive inhibitor, i.e., the smallest peptide that binds but is not processed. This peptide inhibitor can then be used as a core structure against which nonpeptide inhibitors can be designed. Using peptides encompassing the sequence of the natural M-site substrate of CMV protease (GVVNA↓SCRLA), LaFemina et al. (1996) identified VVNA (P4-P1 of the substrate) as such a minimal element. This peptide was shown to have a Ki of 1.36 mM against CMV protease. They also reported that substitution of the P1' serine of an M-site P6-P5' peptide by an alanine improved the Ki by about threefold over the unaltered peptide and gave them their most potent peptide inhibitor, having a Ki of 72 µM.
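To give a feel for what these Ki values mean in an assay, the sketch below applies the textbook rate law for a purely competitive inhibitor. The two Ki values come from the text; the substrate concentration, the Km, and the 1 mM inhibitor dose are hypothetical placeholders chosen only for illustration, as is the function name.

```python
def fractional_velocity(s: float, km: float, i: float, ki: float) -> float:
    """Ratio v_i / v_0 for a purely competitive inhibitor.

    Derived from v = Vmax*S / (Km*(1 + I/Ki) + S); Vmax cancels in the ratio.
    All concentrations must be expressed in the same unit.
    """
    return (km + s) / (km * (1.0 + i / ki) + s)


KI_VVNA_MM = 1.36    # Ki of VVNA from the text, in mM
KI_BEST_MM = 0.072   # 72 uM, the most potent peptide in the text, in mM

# Hypothetical assay conditions (not from the text): substrate well below Km.
S_MM, KM_MM, INHIBITOR_MM = 0.01, 1.0, 1.0

for label, ki in (("VVNA", KI_VVNA_MM), ("P1' Ala peptide", KI_BEST_MM)):
    remaining = fractional_velocity(S_MM, KM_MM, INHIBITOR_MM, ki)
    print(f"{label}: {remaining:.0%} of uninhibited activity remains")
```

With substrate far below Km the ratio reduces to roughly 1 / (1 + [I]/Ki), so the inhibitor concentration that halves the rate is close to Ki itself. That is why a millimolar Ki marks VVNA as a starting scaffold rather than a usable inhibitor.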
B. AMINOMETHYLENE ISOSTERES (REDUCED PEPTIDE BOND) Holskin et al. (1995) reported the first peptidomimetic inhibitor designed for CMV protease. Also starting with the M-site of CMV protease, they prepared a reduced peptide bond inhibitor (RGVVNAψ[CH2NH]SSRLA-OH) having an inhibition constant of ~500 µM against CMV protease. Reduced peptide bond inhibitors are secondary amines in which the carbonyl group of the scissile peptide bond (-CO-NH-) is reduced to yield the methylene-containing group (-CH2-NH-). This peptide, spanning P6 to P5', differs from the amino acid sequence of the M-site at two positions, namely P6 (an arginine for alanine to increase solubility) and P2' (a serine for cysteine to prevent disulfide bond formation).
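The P/P' position bookkeeping used in this section can be checked mechanically. The sketch below labels residues on either side of the scissile bond and compares the inhibitor with the natural M-site. The natural P6-P5' sequence is assembled here from the M-site given earlier (GVVNA↓SCRLA) plus the alanine at P6 stated above; the helper name is hypothetical.

```python
def label_positions(nonprime: str, prime: str) -> dict:
    """Map position labels (P6..P1, P1'..P5') to one-letter residues.

    `nonprime` is read N- to C-terminal and ends at P1; `prime` starts at P1'.
    """
    n = len(nonprime)
    labels = {f"P{n - i}": aa for i, aa in enumerate(nonprime)}
    labels.update({f"P{j + 1}'": aa for j, aa in enumerate(prime)})
    return labels


# Natural CMV M-site, P6-P5' (P6 alanine as stated in the text).
natural = label_positions("AGVVNA", "SCRLA")
# Reduced-bond inhibitor of Holskin et al. (1995), same span.
inhibitor = label_positions("RGVVNA", "SSRLA")

# Positions at which the inhibitor departs from the natural sequence.
differences = {
    pos: (natural[pos], inhibitor[pos])
    for pos in natural
    if natural[pos] != inhibitor[pos]
}
print(differences)

# The minimal inhibitory element of LaFemina et al. (1996) spans P4-P1.
print("".join(natural[f"P{k}"] for k in (4, 3, 2, 1)))
```

The comparison recovers exactly the two substitutions described in the text (P6 and P2'), and the P4-P1 slice reproduces VVNA.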
C. KETONES A number of micromolar and submicromolar CMV protease inhibitors containing an activated carbonyl moiety (Fig. 8), such as fluoromethyl ketones and α-ketoamides, were recently reported (Bonneau et al., 1997). Molecules of this type are classic serine protease inhibitors that act by reversibly forming covalent hemiketal adducts with the active site serine hydroxyl. These inhibitors were used to study the effect of ligand binding on the intrinsic fluorescence and CD properties of the enzyme and to suggest that inhibition of CMV protease by peptidyl ketones involves a conformational change of the protease (Bonneau et al., 1997).
FIGURE 8 Some of the known inhibitors of human herpesvirus proteases. Structures shown (not reproduced here): ketones, benzoxazinones, thienoxazinones, spirocyclopropyl oxazolones, imidazolones, a fungal metabolite, and bripiodionen.
Human Herpesvirus Proteases
D. BENZOXAZINONES Benzoxazinones are a class of heterocyclic molecules (Fig. 8) initially identified as general mechanism-based inhibitors of serine proteases (Teshima et al., 1982) and later developed as specific inhibitors of human leukocyte elastase (Uejima et al., 1993). They inhibit by acylation of the active site serine through their carbonyl group (Radhakrishnan et al., 1987). Recently, Jarvest et al. (1996) reported inhibition of HSV-1 protease by benzoxazinones at micromolar potency. These inhibitors appear to interact specifically with HSV-1 protease, as suggested by SAR trends and stereoselectivity, and were shown to have a wide range of half-lives (1 to 171 h) at pH 7.5 in aqueous solutions. They showed that their most stable compound was one of their most potent, with an IC50 of 5 µM.
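The practical meaning of the 1 to 171 h half-life range is easier to see as the fraction of compound surviving a typical incubation. The sketch below assumes simple first-order hydrolysis, which the text does not state, and uses a hypothetical 24 h incubation; the function name is likewise illustrative.

```python
import math


def fraction_remaining(hours: float, half_life_h: float) -> float:
    """Fraction of intact compound after `hours`, assuming first-order decay."""
    rate_constant = math.log(2) / half_life_h   # k = ln 2 / t_half, in 1/h
    return math.exp(-rate_constant * hours)


INCUBATION_H = 24.0   # hypothetical incubation time, not from the text

for half_life in (1.0, 171.0):   # extremes of the range reported in the text
    left = fraction_remaining(INCUBATION_H, half_life)
    print(f"half-life {half_life:g} h: {left:.3g} of compound left after 24 h")
```

Under this assumption the least stable compound is essentially gone within a day while the most stable one is largely intact, which is why aqueous stability is reported alongside potency for this class.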
E. THIENOXAZINONES Jarvest et al. (1997) also reported the design and synthesis of a number of thienoxazinones (Fig. 8) and showed that they are potent, selective, mechanism-based inhibitors of the herpes proteases with good aqueous stability. These compounds were found to be submicromolar inhibitors of HSV-1 and HSV-2 proteases and moderate inhibitors of CMV protease.
F. OXAZOLONES AND IMIDAZOLONES Targeted screening of compounds that can acylate the active site serine of the herpes proteases identified the spirocyclopropyl oxazolones (Fig. 8) as submicromolar inhibitors of HSV-2 and CMV proteases (Pinto et al., 1996). These compounds were shown to be better inhibitors of herpesvirus proteases than other enzymes of the chymotrypsin superfamily. To enhance the stability of these compounds, the imidazolones (Fig. 8) were prepared and found to be selective for CMV protease, with little inhibition of HSV-2 protease, elastase, trypsin, and chymotrypsin (Pinto et al., 1996).
G. NATURAL PRODUCT INHIBITORS Two natural product inhibitors of CMV protease have been identified. A fungal metabolite (Fig. 8) isolated from an unidentified fungus was found to inhibit the enzyme with an IC50 of 9.8 µg/ml (Chu et al., 1996). A second inhibitor, bripiodionen (Fig. 8), was isolated from Streptomyces and shown to have an
IC50 of 30 µM (Shu et al., 1997). Furthermore, a cycloartanol sulfate from the green alga Tuemoya sp. was identified as a 4- to 7-µM inhibitor of both VZV and CMV proteases (Patil et al., 1997).
IX. LIGAND BINDING With the known three-dimensional structures of herpesvirus proteases, it is possible to speculate about the general characteristics of ligand binding. As mentioned above, despite a totally different protein fold of the herpesvirus proteases when compared to the classic serine proteases, residues of the catalytic triad as well as those of the oxyanion hole are quite superimposable. This suggests that the mode of substrate binding of herpesvirus proteases could be similar to that of classic serine proteases. As indicated above, the active site cavity of the herpes proteases is shallow with the catalytic residues mostly exposed to solvent. The prime side (right in Fig. 9, see color plate) of the groove is relatively wide, suggesting lack of specific recognition of the substrate. It is defined by the end of helix A6, the end of strand B6, His63, and the highly conserved Gly164-Arg165-Arg166 segment. Overlay of the Ser-His catalytic dyad of the VZV protease to that of the human leukocyte elastase from the crystal structure containing the turkey ovomucoid inhibitor shows that this part of the groove is analogous to the S' cavity of elastase, with the P1'-P3' residues of the inhibitor able to fit well in the VZV protease groove. It was speculated that the HSV-1 protease P2'-P8' residues may play a structural role that is more length dependent than sequence dependent (DiIanni et al., 1993), which agrees with the shape of the S' cavity that is large enough to accommodate a folded peptide. The nonprime side (left in Fig. 9) of the groove is rather narrow, suggesting more substrate specificity than the prime side. The nonprime region is delineated by strand B5, the Gly164-Arg165-Arg166 segment, and the beginning of the AA loop.
Since strand B5 is almost parallel to this groove, it is possible that the substrate could be inserted into the groove with its main chain in an extended conformation forming an antiparallel β-sheet with strands B5 and B6, a mode that is almost identical to that of classic serine proteases. We had speculated that the S1 site is between Arg166 and Leu32, the S2 site is at Leu133 and at a conserved water molecule bound to Arg166, and the S1' site is near His63 (Hoog et al., 1998; Fig. 7). However, the precise mode of substrate binding has yet to be determined experimentally. Surface loops are known to be important for the substrate specificity of serine proteases (Perona and Craik, 1997). Protruding from the herpesvirus protease structures are the two large surface loops: one contains the AA helix and is called the AA loop, and the other contains the I-site in CMV protease and is
called the I-loop. The sequences of these two regions are not well conserved among the herpesvirus proteases, with CMV protease having multiresidue insertions in both regions (Fig. 2). Figure 9 shows an approximate position of the M-site peptide in the VZV protease active site cavity. In this model the AA loop is important for forming the S2-S4 subsites and the I-loop could be important for recognizing substrate residues P4 and beyond. Therefore, the difference in substrate specificity between α- and β-herpesvirus proteases could be explained by the large differences in these loop regions. It is interesting that an "I-site" deletion mutant of CMV protease was shown to have altered substrate specificity (Welch et al., 1993). Structures of protease-ligand complexes are needed to support this model. While one can only speculate on the nature of interactions between herpesvirus proteases and the various classes of known inhibitors, the current knowledge of the enzymatic and structural properties is critical for future success in identifying drug candidates. Knowledge of the novel structural framework and active site, together with the delineation of the substrate binding groove, is particularly important in providing a template for rational approaches to the design of novel, potent inhibitors.
REFERENCES
Babe, L. M., and Craik, C. S. (1997). Viral proteases: Evolution of diverse structural motifs to optimize function. Cell 91, 427-430.
Bonneau, P. R., Grand-Maitre, C., Greenwood, D. J., Lagace, L., LaPlante, S. R., Massariol, M. J., Ogilvie, W. W., O'Meara, J. A., and Kawai, S. H. (1997). Evidence of a conformational change in the human cytomegalovirus protease upon binding of peptidyl-activated carbonyl inhibitors. Biochemistry 36, 12644-12652.
Burck, P. J., Berg, D. H., Luk, T. P., Sassmannshausen, L. M., Wakulchik, M., Smith, D. P., Hsiung, H. M., Becker, G. W., Gibson, W., and Villarreal, E. C. (1994). Human cytomegalovirus maturational proteinase: Expression in Escherichia coli, purification, and enzymatic characterization by using peptide substrate mimics of natural cleavage sites. J. Virol. 68, 2937-2946.
Chen, P., Tsuge, H., Almassy, R. J., Gribskov, C. L., Katoh, S., Vanderpool, D. L., Margosiak, S. A., Pinko, C., Matthews, D. A., and Kan, C.-C. (1996). Structure of the human cytomegalovirus protease catalytic domain reveals a novel serine protease fold and catalytic triad. Cell 86, 835-843.
Darke, P. L., Chen, E., Hall, D. L., Sardana, M. K., Veloski, C. A., LaFemina, R. L., Shafer, J. A., and Kuo, L. C. (1994). Purification of active herpes simplex virus-1 protease expressed in Escherichia coli. J. Biol. Chem. 269, 18708-18711.
Darke, P. L., Cole, J. L., Waxman, L., Hall, D. L., Sardana, M. K., and Kuo, L. C. (1996). Active human cytomegalovirus protease is a dimer. J. Biol. Chem. 271, 7445-7449.
Deckman, I. C., Hagen, M., and McCann, P. J., III (1992). Herpes simplex virus type 1 protease expressed in Escherichia coli exhibits autoprocessing and specific cleavage of the ICP35 assembly protein. J. Virol. 66, 7362-7367.
DiIanni, C. L., Mapelli, C., Drier, D. A., Tsao, J., Natarajan, S., Riexinger, D., Festin, S. M., Bolgar, M., Yamanaka, G., Weinheimer, S. P., Meyers, C. A., Colonno, R. J., and Cordingley, M. G. (1993). In vitro activity of the herpes simplex virus type 1 protease with peptide substrates. J. Biol. Chem. 268, 25449-25454.
DiIanni, C. L., Stevens, J. T., Bolgar, M., O'Boyle, D. R., II, Weinheimer, S. P., and Colonno, R. J. (1994). Identification of the serine residue at the active site of the herpes simplex virus type 1 protease. J. Biol. Chem. 269, 12672-12676.
Donaghy, G., and Jupp, R. (1995). Characterization of the Epstein-Barr virus proteinase and comparison with the human cytomegalovirus proteinase. J. Virol. 69, 1265-1270.
Gao, M., Matusick-Kumar, L., Hurlburt, W., DiTusa, S. F., Newcomb, W. W., Brown, J. C., McCann, P. J., III, Deckman, I., and Colonno, R. J. (1994). The protease of herpes simplex virus type 1 is essential for functional capsid formation and viral growth. J. Virol. 68, 3702-3712.
Gibson, W., and Hall, M. R. T. (1997). Assemblin, an essential herpesvirus proteinase. Drug Des. Discov. 15, 39-47.
Hall, D. L., and Darke, P. L. (1995). Activation of the herpes simplex virus type 1 protease. J. Biol. Chem. 270, 22697-22700.
Holskin, B. P., Bukhtiyarova, M., Dunn, B. M., Baur, P., de Chastonay, J., and Pennington, M. W. (1995). A continuous fluorescence-based assay of human cytomegalovirus protease using a peptide substrate. Anal. Biochem. 227, 148-155.
Holwerda, B. C. (1997). Herpesvirus proteases: Targets for novel antiviral drugs. Antiviral Res. 35, 1-21.
Hoog, S. S., Smith, W. W., Qiu, X., Janson, C. A., Hellmig, B., McQueney, M. S., O'Donnell, K., O'Shannessy, D., DiLella, A. G., Debouck, C., and Abdel-Meguid, S. S. (1997). Active site cavity of herpesvirus proteases revealed by the crystal structure of herpes simplex virus protease/inhibitor complex. Biochemistry 36, 14023-14029.
Jarvest, R. L., Connor, S. C., Gorniak, J. G., Jennings, L. J., Serafinowska, H. T., and West, A. (1997). Potent selective thienoxazinone inhibitors of herpes proteases. Bioorg. Med. Chem. Lett. 7, 1733-1738.
Jarvest, R. L., Parratt, M. J., Debouck, C. M., Gorniak, J. G., Jennings, L. J., Serafinowska, H. T., and Strickler, J. E. (1996). Inhibition of HSV-1 protease by benzoxazinones. Bioorg. Med. Chem. Lett. 6, 2463-2466.
Kraulis, P. J. (1991). MOLSCRIPT: A program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallogr. 24, 946-950.
LaFemina, R. L., Bakshi, K., Long, W. J., Pramanik, B., Veloski, C. A., Wolanski, B. S., Marcy, A. I., and Hazuda, D. J. (1996). Characterization of a soluble stable human cytomegalovirus protease and inhibition by M-site peptide mimics. J. Virol. 70, 4819-4824.
Liang, P.-H., Doyle, M. L., Brun, K. A., O'Donnell, K., Green, S. M., Baker, A. E., Feild, J. A., Blackburn, M. N., and Abdel-Meguid, S. S. (1998). Site-directed mutagenesis probing the catalytic role of arginines 165 and 166 of human cytomegalovirus protease. Biochemistry 37, 5923-5929.
Liu, F., and Roizman, B. (1991). The herpes simplex virus 1 gene encoding a protease also contains within its coding domain the gene encoding the more abundant substrate. J. Virol. 65, 5149-5156.
Liu, F., and Roizman, B. (1992). Differentiation of multiple domains in the herpes simplex virus 1 protease encoded by the UL26 gene. Proc. Natl. Acad. Sci. USA 89, 2076-2080.
Margosiak, S. A., Vanderpool, D. L., Sisson, W., Pinko, C., and Kan, C.-C. (1996). Dimerization of the human cytomegalovirus protease: Kinetic and biochemical characterization of the catalytic homodimer. Biochemistry 35, 5300-5307.
McCann, P. J., III, O'Boyle, D. R., II, and Deckman, I. C. (1994). Investigation of the specificity of the herpes simplex virus type 1 protease by point mutagenesis of the autoproteolysis sites. J. Virol. 68, 526-529.
Nicholls, A., and Honig, B. H. (1991). GRASP. J. Comp. Chem. 12, 435-445.
Patil, A., Freyer, A. J., Killmer, L., Breen, A., and Johnson, R. K. (1996). A cycloartanol sulfate from the green alga Tuemoya sp.: An inhibitor of VZV protease. Bioorg. Med. Chem. Lett. 6, 2467-2472.
Perona, J. J., and Craik, C. S. (1995). Structural basis of substrate specificity in the serine proteases. Prot. Sci. 4, 337-360.
Perona, J. J., and Craik, C. S. (1997). Evolutionary divergence of substrate specificity within the chymotrypsin-like serine protease fold. J. Biol. Chem. 272, 29987-29990.
Pinko, C., Margosiak, S. A., Vanderpool, D. L., Gutowski, J. C., Condon, B., and Kan, C.-C. (1995). Single-chain recombinant human cytomegalovirus protease. J. Biol. Chem. 270, 23634-23640.
Pinto, I. L., West, A., Debouck, C. M., DiLella, A. G., Gorniak, J. G., O'Donnell, K. C., O'Shannessy, D. J., Patel, A., and Jarvest, R. L. (1996). Novel, selective mechanism-based inhibitors of the herpes proteases. Bioorg. Med. Chem. Lett. 6, 2467-2472.
Qiu, X., Culp, J. S., DiLella, A. G., Hellmig, B., Hoog, S. S., Janson, C. A., Smith, W. W., and Abdel-Meguid, S. S. (1996). Unique fold and active site in cytomegalovirus protease. Nature 383, 275-279.
Qiu, X., Janson, C. A., Culp, J. S., Richardson, S. B., Debouck, C., Smith, W. W., and Abdel-Meguid, S. S. (1997). Crystal structure of varicella-zoster virus protease. Proc. Natl. Acad. Sci. USA 94, 2874-2879.
Radhakrishnan, R., Presta, L. G., Meyer, E. F., Jr., and Wildonger, R. (1987). Crystal structures of the complex of porcine pancreatic elastase with two valine-derived benzoxazinone inhibitors. J. Mol. Biol. 198, 417-424.
Sardana, V. V., Wolfgang, J. A., Veloski, C. A., Long, W. J., LeGrow, J., Wolanski, B., Emini, E. A., and LaFemina, R. L. (1994). Peptide substrate cleavage specificity of the human cytomegalovirus protease. J. Biol. Chem. 269, 14337-14340.
Schmidt, U., and Darke, P. L. (1997). Dimerization and activation of the herpes simplex virus type 1 protease. J. Biol. Chem. 272, 7732-7735.
Shieh, H. S., Kurumbail, R. G., Stevens, A. M., Stegeman, R. A., Sturman, E. J., Pak, J. Y., Wittwer, A. J., Palmier, M. O., Wiegand, R. C., Holwerda, B. C., and Stallings, W. C. (1996). Three-dimensional structure of human cytomegalovirus protease. Nature 383, 279-282.
Shu, Y. Z., Ye, Q., Kolb, J. M., Huang, S., Veitch, J. A., Lowe, S. E., and Manly, S. P. (1997). Bripiodionen, a new inhibitor of human cytomegalovirus protease from Streptomyces sp. WC76599. J. Nat. Prod. 60, 529-532.
Teshima, T., Griffin, J. C., and Powers, J. C. (1982). A new class of heterocyclic serine protease inhibitors. Inhibition of human leukocyte elastase, porcine pancreatic elastase, cathepsin G, and bovine chymotrypsin A alpha with substituted benzoxazinones, quinazolines, and anthranilates. J. Biol. Chem. 257, 5085-5091.
Tong, L., Qian, C., Massariol, M. J., Bonneau, P. R., Cordingley, M. G., and Lagace, L. (1996). A new serine-protease fold revealed by the crystal structure of human cytomegalovirus protease. Nature 383, 272-275.
Uejima, Y., Kokubo, M., Oshida, J., Kawabata, H., Kato, Y., and Fujii, K. (1993). 5-Methyl-4H-3,1-benzoxazin-4-one derivatives: Specific inhibitors of human leukocyte elastase. J. Pharmacol. Exp. Ther. 265, 516-523.
Welch, A. R., McNally, L. M., Hall, M. R., and Gibson, W. (1993). Herpesvirus proteinase: Site-directed mutagenesis used to study maturational, release, and inactivation cleavage sites of precursor and to identify a possible catalytic site serine and histidine. J. Virol. 67, 7360-7372.
Welch, A. R., Villarreal, E. C., and Gibson, W. (1995). Cytomegalovirus protein substrates are not cleaved by the herpes simplex virus type 1 proteinase. J. Virol. 69, 341-347.
Welch, A. R., Woods, A. S., McNally, L. M., Cotter, R. J., and Gibson, W. (1991). A herpesvirus maturational proteinase, assemblin: Identification of its gene, putative active site domain, and cleavage site. Proc. Natl. Acad. Sci. USA 88, 10792-10796.
The Secreted Proteinases from Candida: Challenges for Structure-Aided Drug Design
KENT STEWART,* ROBERT C. GOLDMAN,† AND CELE ABAD-ZAPATERO‡
*Molecular Modeling Group, †Anti-infective Group, and ‡Laboratory of Protein Crystallography, Abbott Laboratories, Abbott Park, Illinois 60064-3500
I. Introduction
II. Pathogenic Spectrum and Current Therapies
III. Secreted Aspartic Acid Proteases as Virulence Factors
IV. Secreted Aspartic Acid Protease Substrates and Early Inhibitors
V. Structural Characterization
VI. Candida Genomics
VII. Summary and Conclusions
VIII. Methods
References
Proteases of Infectious Agents. Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.
I. INTRODUCTION Candida albicans is a diploid, dimorphic fungus that exists in two dominant morphological forms, the yeast and the hyphal. As a human pathogen, Candida can cause diseases ranging from mild, transient, and readily curable infections to chronic, severe, and frequently fatal systemic conditions. Over the past decade the number of patients diagnosed with severe, life-threatening infection with C. albicans and other Candida species has increased dramatically in the hospital setting, especially in patients at risk due to underlying immunosuppression as a result of cancer chemotherapy, organ transplantation, and AIDS. At present Candida is the fourth most frequent infectious organism isolated from blood cultures in hospital settings, reflecting the increasing frequency of severe infection. Prominent members of this genus are Candida albicans, Candida tropicalis, and Candida parapsilosis. Secreted aspartic (acid) proteases (SAPs) from these and other fungal pathogens with unusually broad substrate specificity have been implicated as virulence factors (Cutler, 1991; Douglas, 1988; Ray et al., 1991; Ruchel et al., 1992; White et al., 1995). Although initially Candida strains were believed to express a single SAP, further research has documented the existence of at least seven distinct genes in C. albicans grouped into two subfamilies represented by SAPs 1-3 and SAPs 4-6, with SAP7 being the most divergent in sequence (Monod et al., 1994). Characterization of two additional members, SAP8 and SAP9, is still underway (Hube et al., 1997a). Prior research (Goldman et al., 1995) discussed the choice of the SAPs as antifungal targets, the discovery of the early specific inhibitors using a fluorogenic substrate, and the pharmacokinetics and in vivo activity of the inhibitors. In this chapter, we focus on the most recent data on pathogenicity and differential expression of the various SAPs, and also on the unique structural differences encountered in the proximity of the active site among the different members of SAPs from C. albicans and C. tropicalis. A detailed analysis reveals that although the various secreted enzymes are highly homologous, the residues next to the active site in the different SAP1-6 of C. albicans present significant variability. This diversity can be initially divided into two groups: SAP1-3 and SAP4-6. However, further analysis indicates that individual subsite specificity also exists within each SAP. A successful antifungal effort directed to the inhibition of the SAPs from Candida should be based on understanding this microdiversity at the different active site pockets, and on combining this information with the complex pattern of expression and regulation of the individual SAPs in the host.
II. PATHOGENIC SPECTRUM AND CURRENT THERAPIES Although C. albicans can exist as a mostly harmless colonizer of mucosal surfaces, the slightest weakening of the host immune system can lead to various severities of disease. Colonization of mucosal surfaces can lead to diseases such as thrush, oral and esophageal infection, and vaginitis, many cases of which can be cured with appropriate antifungal therapy (see below) when there are no underlying factors weakening the host immune system. Predisposing factors include cellular immunodeficiency, prolonged neutropenia, diabetes mellitus, use of broad-spectrum antibacterial therapy, and the presence of intravascular catheters. Candida spp. septicemia is frequent in patients with leukemia, organ transplant patients, and others receiving immunosuppressive
therapy. Candida albicans causes much more severe disease once it penetrates mucosal barriers, leading to dissemination within the host and colonization of various organ systems wherein functional damage occurs (for a detailed medical and microbiological review see Odds, 1987). In specific populations, mortality can be as high as 30% or more in spite of aggressive prophylactic and therapeutic intervention with antifungal agents (Lortholary and Dupont, 1997). Current therapies for the treatment of life-threatening Candida infection are limited, but include the use of the following approved chemotherapeutic agents: amphotericin B and its lipid formulations; azoles (fluconazole and itraconazole); and flucytosine, used much more infrequently and usually in combination with another antifungal drug. Amphotericin B must be administered intravenously, and such use is limited by renal toxicity and other infusion-related toxicities. The lipid formulations of amphotericin B are less toxic and as efficacious as amphotericin B, but require higher levels of administration and are quite costly. Given the limited alternatives currently in use for the treatment of severe Candida infections, the higher-than-acceptable failure rates with subsequent fatality, and the rising resistance to some agents, newer drug and treatment modalities are actively being sought.
III. SECRETED ASPARTIC ACID PROTEASES AS VIRULENCE FACTORS Many Candida species produce SAPs and the evidence that they contribute to virulence is substantial (Cutler, 1991; Data, 1994; Douglas, 1988; Odds, 1985; Ray et al., 1991; Ruchel et al., 1992; White et al., 1995). Secreted aspartic acid proteases produced by C. albicans are coded by a family of related aspartic proteinase genes (de Viragh et al., 1993; Hube et al., 1991; Magee et al., 1993; Miyasaki et al., 1994; Monod et al., 1994; Morrow et al., 1992; Mukai et al., 1992; White et al., 1993; Wright et al., 1992), and most of these genes are subsequently expressed (Hube et al., 1994; Hube et al., 1997a; Miyasaki et al., 1994; Morrow et al., 1992; White and Agabian, 1995; White et al., 1993; 1995; Wright et al., 1992). Three subfamilies of SAPs were defined on the basis of sequence similarity: SAP1-3, SAP4-6, and SAP7 (Monod et al., 1994). The SAP8 gene was also identified (Morrison et al., 1993) and expression of the SAP8 and SAP9 genes is being investigated (Sanglard et al., 1997; M. Monod, personal communication). The most recent data implicating Candida SAPs in disease are outlined below. New evidence supports the hypothesis that SAPs play a significant role early in the process of Candida dissemination (Fallon et al., 1997). The role of C. albicans SAPs in early disease progression was examined using a neutropenic
murine model of dissemination following intranasal inoculation with C. albicans. A significant dose-dependent protection against a subsequent lethal intranasal dose of C. albicans was observed by pretreatment of neutropenic mice with the aspartic protease inhibitor pepstatin A, and the efficacy was comparable to protection obtained with amphotericin B. The reduced mortality provided by pepstatin A also correlated with a reduction in the numbers of C. albicans recovered in the lungs, liver, and kidneys. Pepstatin A did not provide protection against C. albicans inoculated intravenously into mice. Within the limitations of this experimental system, these data are consistent with Candida SAPs playing a significant role in the early spread of the infection. Additional studies have implicated Candida SAPs in the invasion process. A possible role for SAPs in the invasion of the intestinal wall following oral-intragastric inoculation of infant mice has been reported (Colina et al., 1996). Digestion of labeled mucin was examined using a plate assay method devised for quantitation of protease and glycosidase activities. Culture filtrates of C. albicans contained proteolytic activity capable of digesting mucin. The activity was inhibited by pepstatin A, thus implicating Candida SAPs in the degradation of gastric mucin. Consequently, Candida SAPs may play a role in dissemination from colonization sites in the gastrointestinal tract. Moreover, induction of C. albicans acid proteinase caused degradation of extracellular matrix proteins produced by a human endothelial cell line, and this degradation was inhibited by pepstatin A (Morschhauser et al., 1997). Thus, a role is suggested also for C. albicans SAPs in the degradation of the subendothelial extracellular matrix, which could facilitate dissemination via the circulatory system. Candida SAP is known to degrade specifically the heavy chain of IgA and IgG (Ruchel, 1984, 1986).
Recent data (Kaminishi et al., 1995) revealed that killing of Staphylococcus aureus by human polymorphonuclear leukocytes was greatly reduced when bacteria were opsonized with human serum pretreated with Candida protease. Degradation of the Fc portion of immunoglobulin G by the action of C. albicans proteinase was observed, indicating that this was the cause of reduced bactericidal activity. Decreased bactericidal activity of human serum against Escherichia coli was also observed, and reduction of serum bactericidal activity was apparently due to proteolysis of complement proteins.
Recent data further support a role for SAP in the pathogenicity of vulvovaginal candidiasis. Passive transfer of vaginal washes from rats recovering from Candida vaginitis was able to enhance clearance in the receiving animals. Likewise, monoclonal antibody to SAP2, intravaginal immunization with SAP2, and use of pepstatin also led to more rapid clearance of Candida (De Bernardis et al., 1997). The SAP was localized to contact points between the fungal cell wall and the vaginal epithelial cell layer by immunoelectron microscopy (Stringaro et al., 1997).
Candida albicans SAP could also contribute to atopic asthma (Akiyama et al., 1994, 1996). Among patients with positive skin response to C. albicans acid protease, IgE antibodies were detected in 37% of the cases. The SAP also induced T-cell proliferation in 71% of patients showing a positive response to crude C. albicans antigen, and high levels of serum IgE correlated with the histamine-release response of peripheral blood leukocytes to protease. Patients with high levels of serum IgE antibodies against the SAP showed positive conjunctival and immediate bronchial responses when challenged with protease. These data suggest that the protease is a significant allergen in mucosal allergy caused by C. albicans.
If the C. albicans SAP gene family evolved to perform various functions during the process of establishing infection, one would assume that the expression of these genes would be regulated in response to the host environment. Secreted aspartic acid protease production is inducible by environmental stimuli, and our own work (Lerner and Goldman, 1993) clearly indicated that it is "protein" rather than small "peptides" that acts as the inducer, where small peptides are defined as those small enough to be transported by C. albicans peptide transport systems. Secreted aspartic acid protease was also induced by bovine serum albumin (BSA) in the absence of protein hydrolysis by SAP, indicating that some form of signal transduction event was occurring. In addition, SAPs are also regulated by the morphological switch pathway (Morrow et al., 1992; White and Agabian, 1995) and serum-induced hyphal formation (Homma et al., 1993; Traub, 1985). The SAP antigen is also present in infected tissue, as previously reviewed (Goldman et al., 1995); however, at the present time, except for vaginal infection, the mechanisms of tissue-specific expression or recognition of the various SAPs are not known.
Most recently, targeted gene disruption in Candida was used to substantiate and further elucidate the roles of SAP expression in virulence (Hube et al., 1997b; Sanglard et al., 1997). Lethality was reduced significantly in both mice and guinea pigs when Candida containing the triple disruption of SAP4-6 were injected intravenously (Sanglard et al., 1997). Slight attenuation was also observed when singly disrupted SAP1, SAP2, or SAP3 strains were examined. These two studies used iv administration of Candida, thus bypassing many of the events occurring during the normal process of colonization, invasion, and dissemination. Nonetheless, one can argue that SAP4-6 seem to be implicated as major elements of Candida virulence. Further studies using SAP gene disruptants are currently in progress (Hube, personal communication) to elucidate the effects of the deficiency of individual proteases on specific stages of the infection process. The outcome of these experiments should clarify the roles of the individual proteases during infection, dissemination, and invasion of host tissues.
IV. SECRETED ASPARTIC ACID PROTEASE SUBSTRATES AND EARLY INHIBITORS
Secreted aspartic acid proteases can cleave a variety of substrates, which function as barriers and host defense molecules (Goldman et al., 1995). A preference for cleavage of a His-Thr or Lys-Thr bond by SAP2 was observed, and SAP2 was quite active on the His-Thr bond at physiological pH. Cleavage site specificity of Candida SAPs was also observed using other synthetic substrates (Fusek et al., 1994). Degradation of cytoskeletal proteins of mammalian cells by SAP2 also occurs (Goldman et al., 1995). The intermediate filament protein vimentin is involved in the crosslinking of filaments (microtubules, microfilaments, and intermediate filaments) to themselves and other cell organelles. The primary cleavage site in vimentin was between Lys436 and Thr437. Cell morphology was altered, and growth was severely reduced when human skin fibroblasts were electroporated in the presence of SAP2. The addition of a potent inhibitor of SAP2 (A-70450, see below) at 20 nM restored normal growth and morphology in SAP-treated cells. These results indicate that host intermediate filament proteins, which form an essential structural scaffold, serve as substrate for Candida SAP2. Identification of additional host substrates may provide important clues to the role of SAPs during colonization and infection.
Given the volume of data implicating Candida SAPs in virulence, several groups have investigated the effects of standard inhibitors of SAPs on virulence. Two earlier studies reported the activity of pepstatin A in vivo in mouse models of intravenous infection (Ruchel et al., 1990; Zotter et al., 1990). Weak effects were observed in both studies. The lack of effects of pepstatin in the Candida mouse model was briefly mentioned in a third report (Edison and Manning-Zweerink, 1988).
Pepstatin does not have the optimum toxicity and pharmacokinetic profile, and this may explain the limited degree of activity observed (Ruchel et al., 1990). Other novel inhibitors of SAP were reported (Sato et al., 1994a,b), but testing of in vivo efficacy was not mentioned. Preliminary work suggests that certain synthetic inhibitors of SAPs decrease C. albicans adherence to endothelial cells in vitro (Frey et al., 1990).
The development of a sensitive, rapid assay system for SAP activity based on fluorogenic substrates was critical in the identification of the first subnanomolar inhibitors of SAP2 (Capobianco et al., 1992). It allowed the investigation of the biochemical properties of SAP2, its interaction with putative peptide sequences, and analysis of various kinetic parameters (Goldman et al., 1995). One inhibitor, A-70450 (Fig. 1), inhibited SAP2 with a Ki of 0.17 nM, a significant improvement when compared to a Ki of 2.9 nM for pepstatin (Capobianco et al., 1992). Additional analogs were synthesized with variations at either end of the molecule. One analog, A-79912, retained potent activity
FIGURE 1 Schematic representation of the A-70450 inhibitor with the different enzyme pockets corresponding to the inhibitor subsites (Table II). Labeled inhibitor positions: P3a, P3b, P2, P1, P1', P2'. Labeled enzyme subsites and residues: S1 (30, 84, 88, 119, 123); S2 (221, 225, 301, 303, 305); S3a (12, 13, 119, 220, 222); S3b (51, 86, 118, 120); S1' (193, 195, 305); S2' (35, 82, 131, 133).
against SAP2 and reduced activity against key host aspartic proteinases. In this compound the C-terminal butyl group of A-70450 is replaced with a 3-morpholinopropyl substituent (compound no. 6 of Fig. 5 in Abad-Zapatero et al., 1996). Yet, no efficacy was observed when Candida cells were injected iv into mice (bypassing the normal infection pathways) and treated with A-70450, A-79912, or pepstatin A.
The lack of in vivo antifungal activity by the SAP2 inhibitors discussed previously should not be used to dismiss Candida SAPs as possible targets for therapeutic intervention. Quite to the contrary, new evidence is accumulating from basic investigations of virulence of Candida. Of special interest is the systematic evaluation of the effects of targeted gene disruption of SAP genes, which should define specific roles for each SAP in the process of Candida pathogenesis. An important tool for targeting inhibition of specific SAPs as virulence factors will be potent inhibitors with specific SAP activity. To date, we know the most about SAP2. Unfortunately, the isoforms SAP1 and SAP3-7 are also involved in the pathogenesis of Candida and their inhibition characteristics are still unknown. We thus undertook a comparative study of the most closely homologous members (SAP1-6) of the SAP gene family by molecular modeling. This analysis has identified significant differences between the various SAP structures and active site residues which are relevant for the future of structure-based drug design.
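The Ki values quoted in this section for A-70450 (0.17 nM) and pepstatin (2.9 nM) can be put on a common scale with a short calculation. The sketch below is illustrative only: the temperature of 25 °C and the use of the standard relation ΔΔG = RT ln(Ki ratio) are assumptions made here, not details given in the cited work, and the function names are hypothetical.

```python
import math

# Ki values quoted in the text (Capobianco et al., 1992), in nM.
KI_A70450_NM = 0.17
KI_PEPSTATIN_NM = 2.9

# Assumptions for illustration only: 25 degrees C and the standard relation
# ddG = R * T * ln(Ki_reference / Ki_inhibitor).
GAS_CONSTANT_KCAL = 1.987e-3  # kcal / (mol * K)
TEMPERATURE_K = 298.15


def fold_improvement(ki_reference_nm: float, ki_inhibitor_nm: float) -> float:
    """Return how many times lower the inhibitor Ki is than the reference Ki."""
    return ki_reference_nm / ki_inhibitor_nm


def binding_energy_gain(ki_reference_nm: float, ki_inhibitor_nm: float) -> float:
    """Return the free-energy difference (kcal/mol) implied by two Ki values."""
    ratio = fold_improvement(ki_reference_nm, ki_inhibitor_nm)
    return GAS_CONSTANT_KCAL * TEMPERATURE_K * math.log(ratio)


fold = fold_improvement(KI_PEPSTATIN_NM, KI_A70450_NM)
ddg = binding_energy_gain(KI_PEPSTATIN_NM, KI_A70450_NM)
print(f"A-70450 Ki is about {fold:.0f}-fold lower than that of pepstatin")
print(f"Implied free-energy gain: about {ddg:.1f} kcal/mol")
```

Under these assumptions the roughly 17-fold difference in Ki corresponds to a gain of somewhat less than 2 kcal/mol in binding free energy.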
V. STRUCTURAL CHARACTERIZATION
Structural studies have focused on SAP2, which is the most abundantly secreted protein in vitro when BSA is used as the sole nitrogen source in fermentations. The gene for SAP2 codes for a 398-residue preproprotein, which is cleaved to a mature single polypeptide chain of 342 residues with a deduced molecular mass of 35,880 Da (Wright et al., 1992). Detailed reports of the three-dimensional structures of SAP2 (Cutfield et al., 1995) and a closely related clinical isolate (Abad-Zapatero et al., 1996) (referred to as SAP2X) complexed with the same potent inhibitor, A-70450 (Ki = 0.17 nM), have been published. In addition, Foundling and co-workers have reported the molecular structure of the secreted aspartic protease from C. tropicalis (SAPT) in complex with an unknown tetrapeptide tentatively identified as Thr-Ile-Thr-Ser (Symersky et al., 1997). A detailed discussion of the variations on the aspartic proteinase fold found in the SAPs from C. albicans and C. tropicalis has been presented elsewhere, together with the structural differences among SAP2, SAP2X, and SAPT. A brief outline of the implications and possible strategies for the structure-based design of antifungal agents has been discussed (Abad-Zapatero et al., 1998).
A. VARIATIONS WITHIN THE ASPARTIC PROTEINASE FOLD
The overall architecture of the SAPs from Candida conforms with the classic aspartic protease fold represented in all members of the group whose three-dimensional structure has been determined (Abad-Zapatero et al., 1990; Dealwis et al., 1994; Hsu et al., 1977; Sielecki et al., 1990; Subramanian, 1978; Subramanian et al., 1977; Tang et al., 1978). Available three-dimensional structures for mammalian and fungal aspartic proteinases as well as some of their complexes have been summarized by Aguilar and coworkers with tabulations of sequence identity and root-mean-square deviations among the different members (Aguilar et al., 1997). Detailed differences between the structure of the SAPs and the canonical pepsin fold have been illustrated elsewhere (Abad-Zapatero et al., 1998; Goldman et al., 1995) and discussed in detail by two groups (Cutfield et al., 1995; Abad-Zapatero et al., 1996). Briefly, SAPs from Candida present: (1) an 8-residue insertion near the first disulfide (Cys47-Cys59, SAP2) that results in a "broad" flap extending toward the active site; (2) a 7-residue deletion near helix hN2 (Ser118-Gln121), which enlarges the S3 pocket; (3) a short polar connection between the two rigid body domains that alters their relative orientation and projects a small "specificity ridge" into the active site; and (4) an ordered 12-residue addition at the carboxy terminus (Fig. 2, see color plate).
B. COMPARISON OF THE DIFFERENT SAP1-6 STRUCTURES
There is an intriguing and potentially significant trend in the total electrostatic charge of the SAP1-6 enzymes. SAP2 and -3 were the most negative, with net charges of -21 and -20, respectively. The SAPT from C. tropicalis also has a very large total negative charge (-20), consistent with its larger sequence identity with SAP2 and SAP3 (Table I, see color plate). SAP1 and SAP4-6 were significantly more positive, with net charges of -8, -5, +2, and +2, respectively. Inspection of the electrostatic potential surfaces indicates that the variation in charge was mainly due to changes over the entire surface rather than locally concentrated variation. This overall variation is illustrated in Figs. 3A and 3B (see color plate) for SAP3 and SAP6. We suggest that the net charge of the SAP1-6 enzymes plays an important role in the tissue distribution and mode of action of these enzymes. The increase in positive charge from SAP1 to SAP6 might be related to different optimum activities, different distribution in various host tissues, or to other environmental factors which have been shown to affect expression levels and isoenzyme distribution (White and Agabian, 1995). While this net charge refers to the entire protein, the variation in charge is also reflected in the active sites. For the purpose of the discussion, the residue numbers are given for SAP2 (Cutfield et al., 1995) as presented in reference (Abad-Zapatero et al., 1996). Because insertions/deletions in the SAP sequences cause different numbers for equivalent positions, residues in other SAP enzymes are listed without numbers to avoid confusion (Table I; see color plate).
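The text does not state how the net charges were computed. A common first approximation, shown here purely as an illustrative sketch, counts Asp and Glu as -1 and Lys and Arg as +1 and ignores His, the chain termini, and any pH dependence. The function name and the toy peptide are hypothetical; only the dictionary of reported charges comes from the paragraph above.

```python
from collections import Counter

# Assumption: Asp/Glu count as -1 and Lys/Arg as +1; His, the chain termini,
# and pH-dependent effects are ignored. The chapter does not state its method.
CHARGE_BY_RESIDUE = {"D": -1, "E": -1, "K": +1, "R": +1}

# Net charges as quoted in the text for the mature enzymes.
REPORTED_NET_CHARGE = {
    "SAP1": -8, "SAP2": -21, "SAP3": -20,
    "SAP4": -5, "SAP5": 2, "SAP6": 2, "SAPT": -20,
}


def net_charge(sequence: str) -> int:
    """Estimate net charge from a one-letter amino acid sequence."""
    counts = Counter(sequence.upper())
    return sum(charge * counts[res] for res, charge in CHARGE_BY_RESIDUE.items())


# Hypothetical toy peptide, not a real SAP sequence.
toy_peptide = "MDEDKRGSDE"
print(net_charge(toy_peptide))

# Rank the enzymes from most negative to most positive using the quoted values.
ranked = sorted(REPORTED_NET_CHARGE, key=REPORTED_NET_CHARGE.get)
print(ranked)
```

Applied to real sequences, a count of this kind would only approximate the values quoted above, since the treatment of His and of the termini is left open here.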
C. DESCRIPTION OF SAP1-6 ACTIVE SITES: CONSERVED REGIONS
The central portion of each active site of SAP1-6 bears high similarity to other fungal proteases such as rhizopuspepsin and endothiapepsin. Specifically, the Asp-Thr/Ser-Gly-Ser/Thr-Ser/Thr signature sequence for aspartyl proteases is intact for each isoform (residues 32-36 and 218-222 in SAP2) (Abad-Zapatero et al., 1996; Cutfield et al., 1995). In addition, the 85-loop, which comprises one of the active site flaps of SAP1-6, has similar counterparts in other fungal proteases. The Tyr residue (Tyr84 in SAP2) provides a boundary between the S1 and S2' binding pockets and is conserved in the Candida proteases. The Tyr-Gly-Asp-Gly-Ser sequence that follows (residues 84-88 in SAP2) shows only minor variation in the SAP1-6 family: Gly85 at the tip of the flap is replaced by Ala in SAP4 and SAP6 (Table II). This provides a slightly expanded hydrophobic region that falls at the interface of the S2 and S1' pockets in these two isoforms. The Asp at position 86 is conserved in SAP1-6 and
TABLE II Summary of Amino Acid Substitutions at the Different Active Site Subsites among the Various SAPs from Candida (a)
Subsite  Residue  SAP1  SAP2  SAP3  SAP4  SAP5  SAP6  SAPT  Comment
S1       30       Ile   Ile   Val   Ile   Ile   Ile   Val   Before signature motif
S1       84       Tyr   Tyr   Tyr   Tyr   Tyr   Tyr   Tyr   85-loop (act. site flap)
S1       88       Ser   Ser   Thr   Ser   Ser   Ser   Thr   Active site flap
S1       119      Ile   Ile   Val   Ala   Ala   Ala   Val   hN2 deletion
S1       123      Ile   Ile   Ile   Ile   Ile   Ile   Ile   After hN2 deletion
S2       221      Thr   Thr   Thr   Thr   Thr   Thr   Thr   Signature motif
S2       225      Tyr   Tyr   Tyr   Tyr   Tyr   Tyr   Tyr
S2       301      Ser   Asn   Ser   Ser   Ser   Ser   Asn   Specificity ridge
S2       303      Ala   Ala   Tyr   Asp   Asp   Asp   Ala   Specificity ridge
S2       305      Ile   Ile   Ile   Ile   Ile   Ile   Ile   Specificity ridge
S3a      12       Val   Val   Val   Ile   Ile   Ile   Pro   First β-turn
S3a      13       Ser   Thr   Ser   Thr   Thr   Thr   Ser   First β-turn
S3a      119      Ile   Ile   Val   Ala   Ala   Ala   Val   hN2 deletion
S3a      220      Gly   Gly   Gly   Gly   Gly   Gly   Gly   Signature motif
S3a      222      Thr   Thr   Thr   Thr   Thr   Thr   Thr   Signature motif
S3b      51       Arg   Tyr   -     Trp   Trp   Trp   Tyr
S3b      86       Asp   Asp   Asp   Asp   Asp   Asp   Asp
S3b      118      Ser   Ser   Ser   Ser   Ser   Ser   Ser
S3b      120      Pro   Asp   Asp   His   Arg   His   Asp
S4       223      Ile   Ile   Ile   Ile   Ile   Ile   Ile
S4       225      Tyr   Tyr   Tyr   Tyr   Tyr   Tyr   Tyr
S4       295      Gln   Gln   Gln   Glu   Glu   Glu   Tyr
S4       297      Leu   Leu   Leu   Arg   Arg   Arg   Gly
S4       281      Ala   Ala   Ala   Tyr   Tyr   Tyr   Leu
S4       299      Gly   Asp   Gly   Arg   Arg   Arg   Ser
S1'      193      Glu   Glu   Glu   Thr   Lys   Thr   Glu
S1'      195      Arg   Arg   Arg   Ser   Thr   Ser   Arg
S1'      216      Leu   Leu   Leu   Leu   Leu   Leu   Val
S1'      303      Ala   Ala   Tyr   Asp   Asp   Asp   Ala
S1'      305      Ile   Ile   Ile   Ile   Ile   Ile   Ile
S2'      35       Ser   Ser   Ser   Ser   Ser   Ser   Ser
S2'      82       Ile   Ile   Ile   Ile   Ile   Ile   Ile
S2'      131      Asn   Asn   His   Asn   Gly   Asn   Asp
S2'      133      Ala   Ala   Ala   Ala   Ala   Ala   Ala
Comments listed for S3b: Additional, broad flap; Consensus before hN2; hN2 deletion.
Comments listed for S4, S1', and S2': Domain interface; Domain interface; Domain interface; Possible salt bridge; Signature motif; Specificity ridge; Specificity ridge; Signature motif; Act. site flap.
(a) Data summarized from references (Abad-Zapatero et al., 1996, 1998; Cutfield et al., 1995; Symersky et al., 1997).
together with Asp32 and Asp218, provides the high-negative-charge character of the center of the active sites of each of SAP1-6 (Fig. 4, see color plate). Residue 87 is conserved in SAP1-6 to be Gly, but is Leu in SAPT. The side-chain of this residue is solvent exposed in SAPT and does not comprise an active site residue. The residue Ser88 is replaced by Thr in SAP3 and SAPT. Relative to Ser, the extra methyl of Thr extends into the S1 binding pocket and would serve to slightly decrease the available volume for an inhibitor side-chain in the P1 position. In summary, residues 32-36, 84-88, and 218-222 in SAP1-6 and SAPT represent regions of high sequence homology (Table I). They comprise a conserved core of three regions similar to other aspartyl proteinases and correspond to the sequences containing the two catalytic aspartates and the active site flap.
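The signature sequence described in this section, Asp-Thr/Ser-Gly-Ser/Thr-Ser/Thr, translates directly into a one-letter pattern that can be searched for in any protein sequence. The following is a minimal sketch; the function name and the toy sequence are hypothetical and are not taken from any SAP gene.

```python
import re

# Signature pattern from the text: Asp-Thr/Ser-Gly-Ser/Thr-Ser/Thr.
SIGNATURE = re.compile(r"D[TS]G[ST][ST]")


def find_signature_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched pentapeptide) pairs."""
    return [(m.start() + 1, m.group()) for m in SIGNATURE.finditer(sequence.upper())]


# Hypothetical toy sequence carrying two motif copies; not a real SAP sequence.
toy_sequence = "AAADTGSSLLLLDSGTTKK"
print(find_signature_motifs(toy_sequence))  # [(4, 'DTGSS'), (13, 'DSGTT')]
```

An aspartic proteinase of the pepsin family would be expected to return two hits, one for each catalytic aspartate, as at residues 32-36 and 218-222 of SAP2.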
D. DESCRIPTION OF SAP1-6 ACTIVE SITES: VARIABLE REGIONS
In contrast to the above-described core positions showing minor variations within the SAP1-6 family, more significant structural variation is observed at other positions. The discussion below will sequentially move clockwise through the active site covering the S1, S3a, S3b, S4, S2, S1', S3', and finally the S2' subsites (Fig. 1). A tabulation of the relevant residues is given in Table II.
The S1 pocket is composed of residues 30, 119, and 123 in addition to several conserved residues listed above. While residue 123 is a conserved Ile for SAP1-6, positions 30 and 119 show more hydrophobic variation, which will slightly alter the size of the S1 pocket. Residue 30 is Ile for SAP1, -2, and -4-6 and Val for SAP3 and SAPT. Residue 119 is Ile for SAP1 and -2, Val for SAP3 and SAPT, and Ala for SAP4-6. In summary, the position of largest variation within the S1 pocket is at residue 119 with medium hydrophobic side chains for SAP1-3 and a smaller Ala side chain for SAP4-6. Thus, one may expect corresponding substrate/inhibitor P1 residues to be larger hydrophobic residues for SAP4-6 than SAP1-3.
As described previously (Abad-Zapatero et al., 1996; Cutfield et al., 1995), the S3 subsite in SAP2 was found to be rather large and was operationally divided into sections a and b (Abad-Zapatero et al., 1998). This separation is extended now to the models of SAP1 and SAP3-6. The presence of an additional broad flap (residues 47-59, Table I; Fig. 2) is tantamount to a "second" active site loop. This insertion creates a large binding region in the S3 region and the inhibitor A-70450 explores this entire site through a unique ketopiperidine backbone amide bond replacement. The proteases SAP1, SAP2, SAP4-6, and SAPT possess the same loop length for the 50-loop, while SAP3 possesses a loop shorter by 1 residue. Residue 51 in this loop has the potential for contacting both inhibitors and substrates: the Tyr residue at position 51 in SAP2 was
observed to provide a long van der Waals contact with the P3b terminal piperazine group. As aligned, residue 51 corresponds to an Arg in SAP1 and it is conceivable that Tyr51 in SAP2 coincides spatially with Arg in SAP1. However, Arg51 in SAP1 is both preceded and followed by Pro residues, which could impart some special loop conformation not modeled well by the simple loop replacement routine used in our work. As stated, SAP3 possesses a deletion in this region and as modeled does not have a residue that corresponds to Tyr51. Given the uncertainties of the 50-loop structure in both SAP1 and SAP3, experimental structure studies would be required for a complete understanding. Residue 51 is Trp in SAP4-6 and we suggest that the Trp has a similar rotamer preference to Tyr51 in the SAP2 crystal structure. If so, then the S3b subsite will have diminished volume due to the larger aromatic residue in SAP4-6, relative to SAP2 (and SAP1 and -3). Therefore, the S3b region is a site of significant variation within the active site of SAP1-6 due to the different residues at position 51.
Residues 119-121 and 12-13 are near residue 51 and comprise the remainder of the S3a and S3b subsites. At residues 119-121, the Ile-Asp-Gln of SAP2 are replaced by Ile-Pro-Gln in SAP1, Val-Asp-Gln in SAP3 and SAPT, Ala-His-Lys in SAP4 and -6, and Ala-Arg-Lys in SAP5. These three residue changes, together with variants at residue 51, result in enlarging the S1 pocket and collapsing and adding positive character to the S3b pocket in SAP4-6, relative to SAP1-3 and SAPT. The Lys121 of SAP4-6 is positioned well for electrostatic interaction with the conserved Glu10. Residues 12 and 13 comprise the S3a subsite and show minor variation among the isoforms: Val-Ser in SAP1 and -3, Val-Thr in SAP2, and Ile-Thr in SAP4-6.
Since this variation at residues 12 and 13 involves only changes of a methyl group (Ser to Thr and Val to Ile), no large impact upon the S3a subsite volume is observed in the models of SAP1-6. The SAPT possesses a Pro and Ser at positions 12 and 13, respectively. The crystal structure shows that the Pro12 of SAPT overlays well with Val12 of SAP2X, providing a similar hydrophobic surface in the S3a subsite between SAP2X and SAPT. In summary, the S3 subsite of the SAPs may be further divided into two regions: the S3a region is relatively similar among SAP1-6 and SAPT, while the S3b region is more varied with SAP1-3 and SAPT having a larger and neutral character, and in SAP4-6 it is smaller and positively charged.
The S4 subsite provides a very clear-cut point of variation between the SAP1-3 and SAP4-6 subgroups. Residues Ile223, Tyr225, Gln295, Leu297, and Ala281 comprise this subsite in SAP2. Of these residues, Ile223 and Tyr225 are conserved in SAP1-6 and SAPT. While Gln295, Leu297, and Ala281 are conserved in SAP1-3, these residues are replaced by Glu, Arg, and Tyr, respectively, in SAP4-6. At the far end of the S4 pocket lies residue 299. This residue is Asp299 in SAP2, Gly in SAP1 and -3, and Arg in SAP4-6. The crystal structure of SAPT shows that residues 281, 295, 297, and 299 of SAP2 correspond
to Leu, Tyr, Gly, and Ser in SAPT, respectively, which are different from any of SAP1-6. In summary, the S4 subsite will be extremely different between the two subgroups of SAPs, both in shape and polarity: SAP1-3 will have neutral residues comprising this pocket, such as Ala at 281, Gln at 295, Leu at 297, and Gly at 299 (an exception is Asp299 for SAP2), while SAP4-6 will have an aromatic residue, Tyr at 281, or charged residues, Glu at 295 and Arg at 297 and 299. Because of the uncertainty in side-chain conformation prediction, particularly for Arg, we cannot predict the exact shape of the S4 subsite, but the residue changes described here strongly suggest that the P4 position of either substrates or inhibitors will have a corresponding large variation in character for SAP1-6 and SAPT.
The S2 subsite is composed of the side-chains of four residues at positions 225, 301, 303, and 305, the last three forming the "specificity ridge" (Abad-Zapatero et al., 1996). Residues 225 and 305 are conserved at Tyr and Ile, respectively, for SAP1-6 and SAPT. Residue 301 is Ser or Asn in the SAP1-6 family. Residue 303 forms the interface between the S2 and S1' subsites and is discussed further in the section below. This residue is either hydrophobic (Ala in SAP1 and -2, Tyr in SAP3) or polar (Asp in SAP4-6). The steric impact of the large Tyr residue is described below in the S1' subsite description. The presence of the negatively charged Asp at residue 303 in SAP4-6 will make the S1' subsite significantly more negative than in SAP1-3. The SAPT possesses S2 subsite residues identical to SAP2, and comparison of the two crystal structures shows these residues to be identically oriented.
In summary, the S2 subsites of SAP1-6 and SAPT appear to be well filled sterically by substituents the size of the P2 nor-Leu of A-70450, but the presence of a polar residue at position 301 and the variation in polarity at residue 303 suggest that both hydrophobic and hydrophilic substituents may be accommodated.
The S1' pocket is composed of the conserved residues Leu216 and Ile305 for SAP1-6 and varied residues at positions 193, 195, and 303. As mentioned above in the S2 subsite description, one large steric difference among the SAP enzymes occurs at position 303, within the specificity ridge (Abad-Zapatero et al., 1996). This residue is Ala303 in SAP1 and -2, Asp in SAP4-6, and Tyr in SAP3. The Tyr residue in SAP3 may limit the access to this pocket in SAP3 substrates to P1' residues with small side-chains, such as Val, Ala, and Thr. In addition, the Tyr at position 303 in SAP3 has the potential of contacting the 85-loop described above and closing the access to the active site. This effect may be seen in Fig. 4C, where the protein surface appears to be continuous from the flap to the carboxy domain and partially covers the inhibitor. The Asp303 in SAP4-6 is predicted to be very close to the side-chain of the P1' substituent and should disfavor binding of hydrophobic moieties. Residues Glu193 and Arg195 form a salt bridge in SAP1-3. We have previously commented on the special role that Arg195 may play in the observed backbone orientation of residues 301-305 (Abad-Zapatero et al., 1996). Importantly, we do not know
how removing this Arg195 side-chain will impact the 301-305 position or the orientation of the "mobile subdomain" (Abad-Zapatero et al., 1990; Sali et al., 1992; Sielecki et al., 1990). As modeled here, the Asp residue at 303 in SAP4-6 will likely make a direct interaction with the P1' substituent of substrates/inhibitors. The ionic residues at 193 and 195 are replaced with small polar neutral residues Thr and Ser in both SAP4 and SAP6, thus opening up a large volume for extended P1' residues for both substrates and inhibitors. Interestingly, SAP5 diverges from SAP4 and SAP6 in this S1' region in that residues 193 and 195 are Lys and Thr in SAP5, with Lys193 having the potential for salt bridge formation with Asp303. This proposed salt bridge between 193 and 303 in SAP5 would partially block the S1' pocket and decrease the available volume of this pocket, relative to SAP4 and SAP6. In summary, SAP1 and -2 and SAPT are predicted to have very similar S1' subsites; SAP3 has less volume available relative to SAP2; and SAP4-6 have polar S1' subsites.
The S2' pocket is composed of conserved residues Ile82 and Ser35 and two residues from a variable loop. The 131-loop is an irregularly shaped turn that links the β-strand of residues 122-127 (~8hl, central strand of the ~1 loop) and the conserved α-helix of residues 139-148 (helix ah2) (Abad-Zapatero et al., 1990; Sielecki et al., 1990). The length of this loop differs among the Candida proteases with SAP4-6 and SAPT possessing a loop length of 12 residues and SAP1-3 possessing a loop length of 11 residues. Fortunately, crystal structures are available from enzymes possessing loops with each of these different loop lengths, SAP2 and SAPT, so that suitable protein backbone templates were available for the homology modeling of SAP1 and SAP3-6. The loop in SAPT served as a reasonable template for SAP4-6 in this loop region as no clashes occurred during the modeling of the SAP4-6 structures.
Importantly, the 1-residue insertion at position 134 in the SAP4-6 loop does not directly contact the S2' pocket and instead projects into solvent, relative to the corresponding residues in SAP1-3. The two residues at positions 131 and 133, prior to the insertion position, comprise the S2' subsite in both SAPT and SAP2 crystal structures, and the side-chains overlap well. Therefore, we can safely predict that the protein backbone that forms the S2' subsite appears to be conserved for the Candida enzymes, and only the variation in side-chain structure will determine any subsite variation. Position 131 is Asn in SAP1, -2, -4, and -6; His in SAP3; Gly in SAP5; and Asp in SAPT. Residue 133 is Ala in all the Candida enzymes. Thus, the sole residue that generates variation in the S2' subsite is residue 131 and, with the exception of SAP5, this side-chain has polar character (Asn or His). The special case of SAP5 should be mentioned as the Gly residue at position 131 will not have a side-chain that forms a surrounding surface of the S2' subsite; therefore the S2' subsite in SAP5 will be much larger than in the other Candida enzymes. Should it ever be desirable to discover SAP5-specific inhibitors, this variation in the S2' subsite might be a suitable region to
start such a search. In the crystal structure with A-70450, a butyl group from the inhibitor P2' group fills the S2' subsite, with larger groups likely projecting into solvent. Thus, the S2' subsite in SAP1-4 and -6 would appear to be well filled sterically by groups like the P2' butyl group of A-70450, but the presence of polar side-chains nearby at residue 131 suggests that purely hydrophobic groups may not be optimal for occupying the S2' subsite. The S2' subsite of SAP5 diverges from SAP1-4 and -6 and SAPT because of its much larger volume.
In summary, the central portion of the active site of the SAP enzymes appears to be similar among all the fungal proteases. More variation is observed in the residues outlining the boundaries of the subsites. SAP1 and -2 are most similar to one another among the SAP enzymes, with their most significant difference being located at position 51 in the S3b subsite. The protease SAP3 is distinguished by the Tyr303 within the specificity ridge, which impacts the S1' subsite. The subgroup of SAP4-6 clearly diverges from the SAP1-3 subgroup. This is most evident in the larger S1 subsites; the Trp at residue 51 in the S3b subsite, causing a contracted volume in this subsite; a charge differential at the S3b subsite at residues 120-121; a charge differential at the S4 subsite at residues 295, 297, and 281; and a charge differential of the S1' subsite at residue 303. As illustrated in Figs. 4A-4F, the active sites of SAP4-6 possess significantly more positive character than the active sites of SAP1-3. Overall, SAPT is most similar to SAP2-3, but diverges from all SAP1-6 in the S4 subsite.
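The subgroup comparison summarized above can be retraced mechanically from Table II. The sketch below transcribes the table and lists, for each subsite, the positions at which SAP1-3 and SAP4-6 share no residue in common. That criterion, together with the helper names `row` and `subgroup_markers`, is an assumption introduced here for illustration; it is not the procedure used in the modeling study.

```python
ISOFORMS = ["SAP1", "SAP2", "SAP3", "SAP4", "SAP5", "SAP6", "SAPT"]


def row(residues: str) -> list[str]:
    """Expand a table row; one residue name means it is the same in all seven."""
    names = residues.split()
    return names * len(ISOFORMS) if len(names) == 1 else names


# Transcribed from Table II; "-" marks the SAP3 deletion at position 51.
TABLE_II = {
    "S1": {30: row("Ile Ile Val Ile Ile Ile Val"), 84: row("Tyr"),
           88: row("Ser Ser Thr Ser Ser Ser Thr"),
           119: row("Ile Ile Val Ala Ala Ala Val"), 123: row("Ile")},
    "S2": {221: row("Thr"), 225: row("Tyr"),
           301: row("Ser Asn Ser Ser Ser Ser Asn"),
           303: row("Ala Ala Tyr Asp Asp Asp Ala"), 305: row("Ile")},
    "S3a": {12: row("Val Val Val Ile Ile Ile Pro"),
            13: row("Ser Thr Ser Thr Thr Thr Ser"),
            119: row("Ile Ile Val Ala Ala Ala Val"),
            220: row("Gly"), 222: row("Thr")},
    "S3b": {51: row("Arg Tyr - Trp Trp Trp Tyr"), 86: row("Asp"),
            118: row("Ser"), 120: row("Pro Asp Asp His Arg His Asp")},
    "S4": {223: row("Ile"), 225: row("Tyr"),
           295: row("Gln Gln Gln Glu Glu Glu Tyr"),
           297: row("Leu Leu Leu Arg Arg Arg Gly"),
           281: row("Ala Ala Ala Tyr Tyr Tyr Leu"),
           299: row("Gly Asp Gly Arg Arg Arg Ser")},
    "S1'": {193: row("Glu Glu Glu Thr Lys Thr Glu"),
            195: row("Arg Arg Arg Ser Thr Ser Arg"),
            216: row("Leu Leu Leu Leu Leu Leu Val"),
            303: row("Ala Ala Tyr Asp Asp Asp Ala"), 305: row("Ile")},
    "S2'": {35: row("Ser"), 82: row("Ile"),
            131: row("Asn Asn His Asn Gly Asn Asp"), 133: row("Ala")},
}


def subgroup_markers(table: dict) -> dict:
    """For each subsite, list positions where SAP1-3 and SAP4-6 share no residue."""
    markers = {}
    for subsite, positions in table.items():
        markers[subsite] = [
            pos for pos, residues in positions.items()
            if set(residues[0:3]).isdisjoint(residues[3:6])
        ]
    return markers


markers = subgroup_markers(TABLE_II)
for subsite, positions in markers.items():
    print(subsite, positions)
```

By this simple criterion the S4 subsite carries the most discriminating positions (295, 297, 281, and 299) and the S2' subsite none, in line with the summary given in the text.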
E. STRUCTURE-ACTIVITY RELATIONSHIPS OF INHIBITORS OF THE SECRETED ASPARTIC ACID PROTEASE 2
In spite of the differences between SAP2 and other aspartic proteinases, pepstatin A binds in a canonical mode, although less tightly than in other well-characterized fungal proteinases such as rhizopuspepsin, endothiapepsin, and penicillopepsin (Cutfield et al., 1995). In addition to the structure of SAP2 inhibited with pepstatin A, structures of SAP2 and a clinical isolate variant of C. albicans SAP2 (SAP2X) have been determined in complex with the same subnanomolar inhibitor, A-70450 (Cutfield et al., 1995; Abad-Zapatero et al., 1996). Both complexes established the binding of the wedge-shaped inhibitor in an extended conformation with the broad side occupying the S3 subsite. Details of the interactions of the branched inhibitor with the enlarged S3 subsite have been presented elsewhere (Abad-Zapatero et al., 1996; Cutfield et al., 1995). Structure-activity relationships of some structural analogs of A-70450 as bound to SAP2X have been analyzed previously (Abad-Zapatero et al., 1996) and are briefly discussed. In view of the conservation of residues
between SAP2 and SAP2X in the proximity of the active site (Abad-Zapatero et al., 1998), the conclusions are applicable to SAP2 and can be used as a framework for further structure-based design. The compound A-70450 possesses a hydroxyethylene peptide bond isostere as a transition state mimic, with the hydroxylic carbon exhibiting the S-configuration (Fig. 1). The hydroxy group is located between Asp32 and Asp218, equivalent to the pepsin active site residues Asp32 and Asp215. Analogs of A-70450 have been described that identify several structural features required for potency. Changing the configuration at the P2 nor-Leu residue caused a significant drop in potency, whereas changing the configuration of the P3a benzyl led to an analog equipotent to A-70450. These changes in inhibitor potencies are consistent with the respective subsites: the S2 site is small and restrictive, and the S3 site is much more open. A reduced-bond analog of A-70450 at the P1'-P2' linkage (C=O being replaced by -CH2-) showed an almost 700-fold drop in potency relative to the parent. Most probably, this was caused by the loss of a hydrogen bond between the inhibitor and the amide nitrogen of Gly85 in the protein backbone. The urea carbonyl of A-70450 makes no direct contact with the protein and could be replaced by a sulfonyl linkage with only a twofold loss in potency. Finally, the P2' butyl group of the original A-70450 may be replaced with 3-morpholinopropyl, dimethylaminoethyl, or dimethylaminopropyl groups, with only a three- to fourfold drop in potency. The enzyme selectivity of these last three compounds is particularly interesting, as they show reduced activity against renin and sharply decreased affinity for cathepsin D (Abad-Zapatero et al., 1996). In addition, the conformation of the terminal methylpiperazine ring was found to be different in the two crystal forms. Possible reasons for the alternate conformation have been suggested.
The reasons proposed are the intrinsic flexibility of the methylpiperazine ring and the different pH of the crystals, which could alter the water structure in the proximity of residues Thr50 and Gln121 (Abad-Zapatero et al., 1998). The dual conformation of A-70450 in the crystals confirms the existence of a very large S3 subsite in the SAP2 isoenzyme, which could be modulated by the individual differences among SAPs discussed previously. Although active in vitro at the nanomolar level against SAP2, neither A-70450 nor A-77912 performed well in comparison with approved chemotherapeutic agents such as amphotericin B and fluconazole (Goldman et al., 1995). Two more inhibitors of the SAP family have been isolated from the fermentation broth of Streptomyces sp. (Sato et al., 1994a,b) with IC50 values in the low millimolar range; their future performance is uncertain. Yet, the detailed analysis of the different subsites for the various SAPs (Section V,D) suggests that the study of the specific interactions of SAP2 with A-70450 should provide a valuable structural framework for future structure-based drug design against any of the SAP variants. In particular, the suboptimal polarity match of A-70450 at its P2 and P2' sites is of note. In each of the S2 and S2' subsites, a hydrophobic butyl group from the inhibitor terminates near a region of polarity within the SAP enzymes, including SAP2, the enzyme used for cocrystallization. Since A-70450 has subnanomolar potency for SAP2 inhibition, this polarity mismatch is not completely detrimental to affinity; however, it is possibly not optimal for affinity either. It is straightforward to imagine analogs of A-70450 that might exploit these polar interactions within the S2 and S2' subsites. However, further experimentation will be required to learn whether these additional polar interactions will lead to more potent inhibitors or to effective anti-Candida agents.
VI. CANDIDA GENOMICS
An additional resource in the search for anti-Candida agents should be mentioned. The complete genome of C. albicans is currently being sequenced. Information about this sequencing effort may be found at the World Wide Web (WWW) address http://alces.med.umn.edu/Candida.html at the University of Minnesota. The goal of the C. albicans sequencing project is to provide 1.5x mean sequence coverage of the haploid genome from C. albicans strain SC5314. This effort should provide at least partial sequence for most genes in C. albicans and give a preliminary identification of function based on similarity to other species, especially S. cerevisiae. This Internet site also provides additional information on Candida biology and is a useful resource for Candida research. A search of the genome database for the keyword "protease" or "proteinase" yields 15 different proteases identified in C. albicans, including proteases that are not secreted and not part of the SAP family. There is a special section in this WWW site devoted to the SAP gene family, http://alces.med.umn.edu/candida/proteinase.html. Another Internet resource valuable in Candida research is the mailing list called "candidanews" originating from the University of Otago, New Zealand. Information on subscription to this listserver may be found at http://alces.med.umn.edu/candida/candidannews.html.
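As a rough guide to what 1.5x mean coverage implies, the sketch below applies the idealized Poisson (Lander-Waterman) model of random shotgun sequencing, in which the expected fraction of bases sampled at least once at mean coverage c is 1 - exp(-c). The model and the function name are assumptions introduced here for illustration; the sequencing project is not described in these terms above.

```python
import math


def expected_fraction_covered(mean_coverage: float) -> float:
    """Expected fraction of bases sequenced at least once.

    Assumes reads land uniformly at random, so the number of reads
    covering a given base is Poisson-distributed with the given mean.
    """
    if mean_coverage < 0:
        raise ValueError("mean coverage must be non-negative")
    # P(base covered) = 1 - P(zero reads cover it) = 1 - exp(-c)
    return 1.0 - math.exp(-mean_coverage)


if __name__ == "__main__":
    coverage = 1.5  # mean coverage quoted for the C. albicans project
    fraction = expected_fraction_covered(coverage)
    print(f"Expected fraction covered at {coverage}x: {fraction:.3f}")
```

Under these assumptions roughly three-quarters of the bases would be sampled at least once, which fits the expectation of at least partial sequence for most genes.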
VII. SUMMARY AND CONCLUSIONS
Important experimental results were gathered during the past few years with respect to Candida SAPs. Among the more noteworthy are: (1) the identification of multigene families (SAP1-3 and SAP4-6) for SAPs in Candida species and the elucidation of some aspects of differential transcriptional control; (2) induction of SAPs by intact proteins potentially via contact sensing of substratum surfaces; (3) demonstration of SAP activity under physiological conditions of pH and ionic strength; (4) detection of circulating SAP antigen during
infection in humans; (5) discovery of novel inhibitors of SAPs; and (6) perhaps most importantly for structure-based drug design and discovery efforts, the determination of the crystal structure of an SAP family member bound to a potent inhibitor. Yet, the poor efficacy of the leading SAP2 inhibitors A-70450 and A-79912 in murine models of Candida infection was both discouraging and unexpected. Several reasons have been invoked elsewhere (Abad-Zapatero et al., 1998) to rationalize these negative results. Some of them can be mentioned again: inability to penetrate cell barriers, insufficient potency, and, likely, the inability to inhibit all members of the SAP family, or at least those most important for virulence. We had ruled out lack of bioavailability and lack of activity at physiological pH as explanations early in our studies. However, we are now faced with the possibility that, in addition to the various isoenzymes present, we have to consider the factor of differential transcriptional control. The structural data and the modeling results reviewed here suggest that the active sites of the various secreted SAPs from C. albicans are indeed sufficiently different to allow for different specificities at the different protein subsites. In particular, the dichotomy between the active sites of SAP1-3 and SAP4-6 has been clearly documented. Also, the microdiversity within the individual amino acid residues bordering the substrate for the individual SAPs has been discussed and opens the way toward targeting either the individual SAPs or a certain subset of them. Chemical synthesis strategies to exploit the differences at the S3 and S4 subsites were outlined previously (Abad-Zapatero et al., 1998), including the possibility of "freezing" into one compound the two inhibitor conformations found in the crystalline complexes of SAP2 (Abad-Zapatero et al., 1996; Cutfield et al., 1995).
The large differences in overall charge among the different members of the family, from -21 (SAP2) to +2 (SAP5-6), open the possibility that the different isozymes target different tissues during the course of the infection, providing a different kind of specificity and targeting. In our view, the challenge of a successful antifungal program resides in combining the biological and genetic data on the differential expression, regulation, and virulence of the SAPs from Candida with the structural data on the microheterogeneity of the various enzyme subsites. No matter how long and difficult the road, the discovery of effective prophylactic and therapeutic antifungal agents targeted on the fungal SAPs will be our best course of action.
VIII. METHODS
Three-dimensional models of SAP1 and SAP3-6 were created by appropriate residue replacement and loop-searching techniques using the SAP2X structure as a template. Sequences of the SAP enzymes were taken from the GenBank database using the following entries: SAP1, X56867; SAP3, L22358; SAP4, L25388; SAP5, Z30191; SAP6, Z30192. The SAP2X sequence used in this work is a variant of SAP2 with 96% identity, GenBank entry M83663. The overall percentage identity between the SAP enzymes ranges from 40 to 75%, well above the problematic 20-30% range, so the SAP2X structure is believed to be a good template for the homology modeling of SAP enzymes. The SAP2X structure was used as template for the entire protein structure of SAP1 and SAP3-6 except as noted. Protein Data Bank entry 1EAG for SAP2 was used as template for the 245-loop, and the C. tropicalis SAPT protein was used as a template for the 131-loop of SAP4-6. Loop searches were carried out for residues 50-52 of SAP3 and the 210-loop of SAP1 and SAP3-6. After manual adjustment of side-chains that were inappropriate rotamers or that clashed with other protein residues, energy minimization of the protein for 200 cycles using a gradually decreasing restraint on the protein atoms produced a final structure with no gross VDW clashes. These final models were used in the subsequent DelPhi analysis of the charge distribution and are available to academic investigators upon request from the authors. DelPhi calculations were carried out with 1 Å spacing using dielectric constants for the protein and bulk solvent of 2 and 80, respectively.
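The template check described above rests on pairwise percentage identity. The following is a minimal sketch of that calculation, not the procedure actually used for the SAP models: it assumes the two sequences are already aligned with "-" for gaps, counts identity over ungapped columns only (other denominators are in common use), and runs on two short made-up fragments rather than real SAP sequences. The function names and the 30% cutoff argument are illustrative assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def above_problematic_range(identity: float, upper_bound: float = 30.0) -> bool:
    """True if identity exceeds the upper end of the 20-30% range cited in the text."""
    return identity > upper_bound


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    template = "DTGSSDLWV-FGSK"
    target = "DTGSSNLWVPFASK"
    identity = percent_identity(template, target)
    print(f"identity = {identity:.1f}%")
    print("usable as template:", above_problematic_range(identity))
```

Restricting the count to ungapped columns keeps the measure symmetric in the two sequences; a real workflow would take the alignment from a dedicated alignment program.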
ACKNOWLEDGMENTS
We appreciate the critical reading of the manuscript by Drs. J. Greer and C. Hutchins and the support of Drs. D. Norbeck and A. Rosenthal within the Pharmaceutical Products Division at Abbott Laboratories.
REFERENCES
Abad-Zapatero, C., Goldman, R. C., Muchmore, S. W., Hutchins, C., Oie, T., Stewart, K., Cutfield, S. M., Cutfield, J. F., Foundling, S. I., and Ray, T. L. (1998). In "Advances in Experimental Medicine and Biology" (M. G. N. James, Ed.), Aspartic Proteinases, pp. 297-313. Plenum, New York.
Abad-Zapatero, C., Goldman, R., Muchmore, S. W., Hutchins, C., Stewart, K., Navaza, J., Payne, C. D., and Ray, T. L. (1996). Structure of a secreted aspartic protease from C. albicans complexed with a potent inhibitor: Implications for the design of antifungal agents. Prot. Sci. 5, 640-652.
Abad-Zapatero, C., Rydel, T. J., and Erickson, J. W. (1990). Revised 2.3 Å structure of porcine pepsin: Evidence for a flexible subdomain. Prot. Struct. Funct. Genet. 8, 62-81.
Aguilar, C. F., Cronin, N. B., Badasso, M., Dreyer, T., Newman, M. P., Cooper, J. B., Hoover, D. J., Wood, S. P., Johnson, M. S., and Blundell, T. L. (1997). The three-dimensional structure at 2.4 Å resolution of glycosylated proteinase A from the lysosome-like vacuole of Saccharomyces cerevisiae. J. Mol. Biol. 267, 899-915.
Akiyama, K., Shida, T., Yasueda, H., Mita, H., Yamamoto, T., and Yamaguchi, H. (1994). Atopic asthma caused by Candida albicans acid protease: Case reports. Allergy 49, 778-781.
Akiyama, K., Shida, T., Yasueda, H., Mita, H., Yanagihara, Y., Hasegawa, M., Maeda, Y., Yamamoto, T., Takesako, K., and Yamaguchi, H. (1996). Allergenicity of acid protease secreted by Candida albicans. Allergy 51, 887-892.
Capobianco, J. O., Lerner, C. G., and Goldman, R. C. (1992). Application of a fluorogenic substrate in the assay of proteolytic activity and in the discovery of a potent inhibitor of Candida albicans aspartic proteinase. Anal. Biochem. 204, 96-102.
Colina, A. R., Aumont, F., Belhumeur, P., and de Repentigny, L. (1996). Development of a method to detect secretory mucinolytic activity from Candida albicans. J. Med. Vet. Mycol. 34, 401-406.
Cutfield, S. M., Dobson, E. J., Anderson, B. F., Moody, P. C. E., Marshall, C. J., Sullivan, P. A., and Cutfield, J. F. (1995). The crystal structure of a major secreted aspartic proteinase from Candida albicans in complexes with two inhibitors. Structure 3, 1261-1271.
Cutler, E. (1991). Putative virulence factors of Candida albicans. Ann. Rev. Microb. 45, 187-218.
Data, A. (1994). Pathogenecity of Candida albicans: Quest for a molecular switch. Brazilian J. Med. Biol. Res. 27, 2721-2732.
De Bernardis, F., Boccanegra, M., Adriani, D., Sprechini, E., Santoni, G., and Cassone, A. (1997). Protective role of antimannan and anti-aspartyl proteinase antibodies in an experimental model of Candida albicans vaginitis in rats. Infect. Immun. 65, 3399-3405.
de Viragh, P. A., Sanglard, D., Togni, G., Falchetto, R., and Monod, M. (1993). Cloning and sequencing of two Candida parapsilosis genes encoding acid proteases. J. Gen. Microbiol. 139, 335-342.
Dealwis, C. G., Frazao, C., Badasso, M., Cooper, J. B., Tickle, I. J., Driessen, H., Blundell, T. L., Murakami, H., Sueiras-Diaz, J., Jones, D. M., and Szelke, M. (1994). X-ray analysis at 2.0 Å resolution of mouse submaxillary renin complexed with a decapeptide inhibitor CH-667, based on the 4-16 fragment of rat angiotensinogen. J. Mol. Biol. 236, 342-360.
Douglas, L. J. (1988). Candida proteinases and candidosis. Crit. Rev. Biotechnol. 8, 121-129.
Edison, A. M., and Manning-Zweerink, M. (1988). Comparison of the extracellular proteinase activity produced by a low-virulence mutant of Candida albicans and its wild-type parent. Infect. Immun. 56, 1388-1390.
Fallon, K., Bausch, K., Noonan, J., Huguenel, E., and Tamburini, P. I. (1997). Role of aspartic proteases in disseminated Candida albicans infection in mice. Infect. Immun. 65, 551-556.
Frey, C. L., Barone, J. M., Dreyer, G., Koltin, Y., Petteway, S. R., and Drutz, D. J. (1990). Synthetic protease inhibitors inhibit Candida albicans extracellular protease activity and adherence to endothelial cells. Abst. Ann. Meet. Am. Soc. Microb. poster no. F-102.
Fusek, M., Smith, E. A., Monod, M., Dunn, B. M., and Foundling, S. I. (1994). Extracellular aspartic proteinases from Candida albicans, Candida tropicalis and Candida parapsilosis differ substantially in their specificities. Biochemistry 33, 9791-9799.
Goldman, R. C., Frost, D. J., Capobianco, J. O., Kadam, S., Rasmussen, R. R., and Abad-Zapatero, C. (1995). Antifungal drug targets: Candida secreted aspartyl protease and fungal wall β-glucan synthesis. Infect. Agents Dis. 4, 228-247.
Homma, M., Chibana, H., and Tanaka, K. (1993). Induction of extracellular protease in Candida albicans. J. Gen. Microbiol. 139, 1187-1193.
Hsu, I., Delbaere, L. T. J., James, M. N. G., and Hoffmann, T. (1977). Penicillopepsin from Penicillium janthinellum crystal structure at 2.8 Å and sequence homology with porcine pepsin. Nature 266, 140-145.
Hube, B., Monod, M., Schofield, D. A., Brown, A. J., and Gow, N. A. (1994). Expression of seven members of the gene family encoding secretory aspartyl proteinases in Candida albicans. Mol. Microbiol. 14, 87-99.
Hube, B., Sanglard, D., Monod, M., Brown, J. P., and Gow, N. A. R. (1997a). In "Host-Fungus Interplay: Proceedings of the Fifth Symposium on Topics in Mycology" (Bossche, H. V., Stevens, D. A., and Odds, F. C., Eds.), pp. 109-122. National Foundation for Infectious Diseases, Bethesda, MD.
Hube, B., Sanglard, D., Odds, F. C., Hess, D., Monod, M., Schafer, W., Brown, A. J. P., and Gow, N. A. R. (1997b). Disruption of each of the secreted aspartyl proteinase genes SAP1, SAP2, and SAP3 of Candida albicans attenuates virulence. Infect. Immun. 65, 3529-3538.
Hube, B., Turver, C. J., Odds, F. C., Eiffert, H., Boulnois, G. J., Kochel, H., and Ruchel, R. (1991). Sequence of the Candida albicans gene encoding the secretory aspartate proteinase. J. Med. Vet. Mycol. 29, 129-132.
Kaminishi, H., Miyaguchi, H., Tamaki, T., Suenaga, N., Hisamatsu, M., Mihashi, I., Matsumoto, H., Maeda, H., and Hagihara, Y. (1995). Candida SAP is known to degrade immunoglobulins. Infect. Immun. 63, 984-988.
Lerner, C. G., and Goldman, R. C. (1993). Stimuli that induce production of Candida albicans extracellular aspartyl proteinase. J. Gen. Microbiol. 139, 1643-1651.
Lortholary, O., and Dupont, B. (1997). Antifungal prophylaxis during neutropenia and immunodeficiency. Clin. Microb. Rev. 10, 477-504.
Magee, B. B., Hube, B., Wright, R. J., Sullivan, P. J., and Magee, P. T. (1993). The genes encoding the secreted aspartyl proteinases of Candida albicans constitute a family with at least three members. Infect. Immun. 61, 3240-3243.
Miyasaki, S. H., White, T. C., and Agabian, N. (1994). A fourth secreted aspartyl proteinase gene (SAP4) and a CARE2 repetitive element are located upstream of the SAP1 gene in Candida albicans. J. Bacteriol. 176, 1702-1710.
Monod, M., Togni, G., Hube, B., and Sanglard, D. (1994). Multiplicity of genes encoding secreted aspartic proteinases in Candida species. Mol. Microbiol. 13, 357-368.
Morrison, C. J., Hurst, S. F., Bragg, S. L., Kuykendall, R. J., Diaz, H., Pohl, J., and Reiss, E. (1993). Heterogeneity of the purified extracellular aspartyl proteinase from Candida albicans: Characterization with monoclonal antibodies and N-terminal amino acid sequence analysis. Infect. Immun. 61, 2030-2036.
Morrow, B., Srikantha, T., and Soll, D. R. (1992). Transcription of the gene for a pepsinogen, PEP1, is regulated by white-opaque switching in Candida albicans. Mol. Cell Biol. 12, 2997-3005.
Morschhauser, J., Virkola, R., Korhonen, T. K., and Hacker, J. (1997). Degradation of human subendothelial extracellular matrix by proteinase-secreting Candida albicans. FEMS Microbiol. Lett. 153, 349-355.
Mukai, H., Takeda, O., Asada, K., Kato, I., Murayama, S., and Yamaguchi, H. (1992). cDNA cloning of an aspartic proteinase secreted by Candida albicans. Mol. Cell Biol. 12, 2997-3005.
Odds, F. C. (1985). Candida albicans proteinase as a virulence factor in the pathogenesis of Candida infection. Zentralbl. Bakteriol. Mikrobiol. Hyg. A 260, 539-542.
Odds, F. C. (1987). Candida infections: An overview. Crit. Rev. Microb. 15, 1-5.
Ray, T. L., Payne, C. D., and Morrow, B. J. (1991). Candida albicans acid proteinase: Characterization and role in candidiasis. Adv. Exp. Med. Biol. 306, 173-183.
Ruchel, R. (1984). A variety of Candida proteinases and their possible targets of proteolytic attack in the host. Zentralbl. Bakteriol. Mikrobiol. Hyg. A 257, 266-274.
Ruchel, R. (1986). Cleavage of immunoglobulins by pathogenic yeasts of the genus Candida. Microbiol. Sci. 3, 316-319.
Ruchel, R., de Bernardis, F., Ray, T. L., Sullivan, P. A., and Cole, G. T. (1992). Candida acid proteinases. J. Med. Vet. Mycol. 30, 123-132.
Ruchel, R., Ritter, B., and Schaffrinski, M. (1990). Modulation of experimental systemic murine candidosis by intravenous pepstatin. Int. J. Med. Microbiol. 273, 391-403.
Sali, A., Veerapandian, B., Cooper, J. B., Moss, D. S., Hofmann, T., and Blundell, T. L. (1992). Domain flexibility in aspartic proteinases. Prot. Struct. Funct. Genet. 12, 158-170.
Sanglard, D., Huber, B., Monod, M., Odds, F. C., and Gow, N. A. R. (1997). A triple deletion of secreted aspartyl proteinase genes SAP4, SAP5, and SAP6 of Candida albicans causes attenuated virulence. Infect. Immun. 65, 3539-3546.
Sato, T., Nagai, K., Shibazaki, M., Abe, K., Takebayashi, Y., Lumanau, B., and Rantiatmodjo, R. M. (1994a). Novel aspartyl protease inhibitors, YF-0200R-A and B. J. Antibiot. (Tokyo) 47, 566-570.
Sato, T., Shibazaki, M., Yamaguchi, H., Abe, K., Matsumoto, H., and Shimizu, M. (1994b). Novel Candida albicans aspartyl protease inhibitor. II. A new pepstatin-ahpatinin group inhibitor, YF044P-D. J. Antibiot. (Tokyo) 47, 588-590.
Sielecki, A. R., Fedorov, A. A., Boodhoo, A., Andreeva, N. S., and James, M. N. G. (1990). The molecular and crystal structure of monoclinic porcine pepsin refined at 1.8 Å resolution. J. Mol. Biol. 214, 143-170.
Stringaro, A., Crateri, P., Pellegrini, G., Arancia, G., Cassone, A., and de Bernardis, F. (1997). Ultrastructural localization of the secreted aspartyl proteinase in Candida albicans cell wall in vitro and in experimentally infected rat vagina. Mycopathology 95, 105.
Subramanian, E. (1978). Molecular structure of acid-proteases. Trends Biochem. Sci. 3, 1-3.
Subramanian, E., Swan, I. D. A., Lie, M., Davies, D. R., Jenkins, J. A., Tickle, I. J., and Blundell, T. L. (1977). Homology among acid proteases: Comparison of crystal structures at 3 Å resolution of acid proteases from Rhizopus chinensis and Endothia parasitica. Proc. Natl. Acad. Sci. USA 74, 556-559.
Symersky, J., Monod, M., and Foundling, S. I. (1997). High-resolution structure of the extracellular aspartic proteinase from Candida tropicalis yeast. Biochemistry 36, 12700-12710.
Tang, J., James, M. N. G., Hsu, I. N., Jenkins, J. A., and Blundell, T. L. (1978). Structural evidence for gene duplication in the evolution of the acid proteases. Nature 271, 618-621.
Traub, P. (1985). Are intermediate filament proteins involved in gene expression? Ann. N.Y. Acad. Sci. 455, 68-78.
White, T. C., and Agabian, N. (1995). Expression of Candida albicans secreted aspartyl protease isoenzymes is determined by cell type, and levels are determined by environmental factors. J. Bacteriol. 177, 5215-5221.
White, T. C., Kohler, G. A., Miyasaki, S. H., and Agabian, N. (1995). Expression of virulence factors in Candida albicans. Can. J. Bot. 73(Suppl. 1), 1058-1064.
White, T. C., Miyasaki, S. H., and Agabian, N. (1993). Three distinct secreted aspartyl proteinases in Candida albicans. J. Bacteriol. 175, 6126-6133.
Wright, R. J., Carne, A., Hieber, A. D., Lamont, I. L., Emerson, G. W., and Sullivan, P. A. (1992). A second gene for a secreted aspartate proteinase in Candida albicans. J. Bacteriol. 174, 7848-7853.
Zotter, C., Haustein, U. F., Schonborn, C., Grimmecke, H. D., and Wand, H. (1990). Die Wirkung von Pepstatin A auf die Candida-albicans-Infektion der Maus. Dermatol. Monatsschr. 176, 189-198.
Proteolytic Enzymes of the Viruses of the Family Picornaviridae
ERNST M. BERGMANN AND MICHAEL N. G. JAMES
Department of Biochemistry and Medical Research Council of Canada Group in Protein Structure Function, University of Alberta, Edmonton, Alberta, T6G 2G7 Canada
I. Picornaviridae
II. Viral Replication and Polyprotein Processing
III. Picornaviral Proteinases
IV. Conclusions and Implications for Antiviral Strategies
References
I. PICORNAVIRIDAE
The Picornaviruses constitute a large family of positive-sense, single-stranded RNA viruses (+RNA viruses) (Rueckert, 1996). There are more than 200 known viruses that belong to this family and are classified into six genera (Table I). These viruses share the major features of the viral replication cycle, including the central role of the specific proteolytic processing of a viral polyprotein (Palmenberg, 1990). Individual details of the viral replication and of the polyprotein processing distinguish the genera of the family Picornaviridae (Ryan and Flint, 1997). Picornaviruses are small icosahedral viruses. There are examples of atomic resolution structures of individual viruses from four of the six genera (rhino-, entero-, cardio-, and aphtho-). The assembly of the precursor of the capsids is regulated by successive proteolytic cleavages of the structural proteins. The final assembly of the protomers into the procapsids requires the RNA genome and is not completely understood (Rueckert, 1996). A nonenzymatic, so-called maturation cleavage within the assembled procapsids then yields the infectious virus (Palmenberg, 1990).
Proteases of Infectious Agents. Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.
TABLE I
The Family Picornaviridae

Genus | Number of serotypes | Examples | Associated disease | Proteolytic enzymes
Entero- | 93 | Polio, Coxsackie, Echo (hnPV 1-3; Cox A1-A23, A24, B1-B6; EV 1-9, 11-21, 24-28, 68-71) | Myelitis, carditis, meningitis, encephalitis, herpangina, myalgia, pleurodynia, pneumonia | 2A, 3C
Rhino- | 105 | Rhino | Common cold | 2A, 3C
Aphtho- | 7 | FMDV | Foot-and-mouth disease of cloven-hoofed animals | L, 3C
Cardio- | 2 | EMCV | ? | 3C
Hepato- | 1 | HAV | Hepatitis A | 3C
Parecho- | 2 | Echo 22, Echo 23 (EV22, EV23) | Myocarditis | 3C
Picornaviruses cause a wide variety of diseases in humans and animals (Couch, 1996; Hollinger and Ticehurst, 1996; Melnik, 1996). These range from relatively mild and widespread infections, such as the common cold and hepatitis A, to rare but often severe enteroviral diseases (Table I). There is some evidence that Picornavirus infections are also involved in the onset of severe autoimmune diseases such as myocarditis, diabetes, and multiple sclerosis (Carthy et al., 1997; Steinmann and Conlon, 1997). Recently, a mouse model for a demyelinating disease that resembles multiple sclerosis provided a plausible immunological mechanism for the triggering of such diseases by a viral infection (Miller et al., 1997).
II. VIRAL REPLICATION AND POLYPROTEIN PROCESSING
A. THE PICORNAVIRAL LIFE CYCLE
The Picornaviruses release their single-stranded, mRNA-like genome into the cytosol of the host cell where it is translated into large polyproteins. The resulting polyproteins are then cleaved by specific viral proteinases into the structural and nonstructural proteins of the virus (Kräusslich and Wimmer, 1988). Figure 1 shows a simplified scheme of the life cycle of a typical Picornavirus.
[FIGURE 1. Simplified scheme of the life cycle of a typical Picornavirus.]
The virus binds to a specific receptor on the cell surface. The specific receptors are different for various Picornaviruses and in some cases are not yet known. Following attachment, the virus particles lose the VP4 protein and undergo a change that allows them to release their RNA genome into the cytosol of the host cell. Picornaviral RNA genomes have a small viral protein (VPg, the 3B gene product) covalently attached at the 5' terminus. After release of the RNA into the cytosol of a host cell, VPg is cleaved off by a cellular enzyme to yield a functional mRNA. The RNA is then translated into a large polyprotein, which is cotranslationally proteolytically processed into the individual viral proteins (Palmenberg, 1990). Under normal conditions the full-length polyprotein is never found. Proteolytic processing of the polyprotein is accomplished by one or two viral proteinases which are themselves part of the polyprotein. The first proteolytic cleavage usually separates the structural proteins (P1 in Picornaviruses) from the remainder of the polyprotein. The RNA genome is also replicated to yield negative-sense RNAs. The negative-sense RNAs then serve as templates for the production of viral genomes. Picornaviral RNA replication is accomplished by a multienzyme complex that includes the viral RNA-dependent RNA polymerase (3D), the putative viral RNA helicase (2C), and the picornaviral 3C proteinase (Porter, 1993; Wimmer et al., 1993). Also part of this complex is the 3AB gene product. The 3A gene product presumably anchors the picornaviral RNA replication complex to the membrane of the smooth ER. Modification of the membrane structure of the host cell is a common feature of picornavirus infection (Bienz et al., 1983, 1990; Teterina et al., 1997a,b). The 3B gene product constitutes VPg (the viral genome-associated protein), which remains covalently bound to the 5'-end of the viral RNA genome (Wimmer, 1982).
There is also evidence that some cellular proteins or their proteolytic cleavage products form part of the viral RNA replicase complex (Andino et al., 1993; Xiang et al., 1995; Gamarnik et al., 1996; Parsley et al., 1996). Picornaviral 3C proteinases possess an RNA binding site and an RNA binding activity that is distinct from their proteolytic activity (Hämmerle et al., 1992; Andino et al., 1993; Porter et al., 1993; Walker et al., 1995; Kusov and Gauss-Müller, 1997). It is common for the limited number of gene products of small RNA viruses to perform multiple, distinct functions. The exact function of the 3C proteinase within the RNA replicase complex is not clear. Apparently its RNA binding activity is required for the initiation of RNA replication. Certain proteolytic cleavages in some picornaviruses could be essential steps during RNA replication within the RNA replicase complex; for example, the 3C-mediated cleavage of the RNA-associated VPg (3B) from the membrane anchor (3A) may be necessary to release the RNA from the membrane-bound replicase complex.
It is also believed that in some picornaviruses the cleavage of 3CD within the replicase complex after binding of the RNA is required to allow 3D to perform RNA replication (Harris et al., 1992; Molla et al., 1995). Initially the structural proteins (P1) are cleaved from the viral polyprotein. Two more 3C-mediated cleavages (1AB|1C and 1C|1D, or VP0|VP3 and VP3|VP1, respectively) are required within the capsid precursor. The resulting protomer then assembles into pentamers, and the pentamers form the provirions by a poorly understood pathway that requires the VPg-linked RNA genome. A final maturation cleavage within the provirion (VP0 → VP2 + VP4) yields the infectious virus particles. This maturation cleavage is believed to be nonenzymatic and to require the presence of the packaged RNA genome (Palmenberg, 1990; Rueckert, 1996). Many details, such as the composition of the RNA replicase complex, the function of its individual components, and the pathway of provirion assembly, are not completely understood even in the best-studied viruses. There are also differences in many aspects of the viral life cycle between the individual genera of the Picornaviridae.
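The order of capsid-protein cleavages described above can be written down as a small data structure. The sketch below is an illustrative simplification only: it tracks which polypeptide chains exist after each cleavage step and ignores the assembly events (protomer, pentamer, provirion) that occur between them. The names CAPSID_STEPS and apply_cleavages are hypothetical.

```python
from typing import List, Tuple

# (precursor, products, description), in the order given in the text.
CAPSID_STEPS: List[Tuple[str, List[str], str]] = [
    ("P1", ["VP0", "VP3", "VP1"],
     "3C-mediated cleavages at VP0|VP3 (1AB|1C) and VP3|VP1 (1C|1D)"),
    ("VP0", ["VP2", "VP4"],
     "maturation cleavage in the provirion, believed to be nonenzymatic"),
]


def apply_cleavages(start: str, steps: List[Tuple[str, List[str], str]]) -> List[str]:
    """Return the chains present after applying each cleavage step in order."""
    pool = [start]
    for precursor, products, _description in steps:
        next_pool: List[str] = []
        for chain in pool:
            if chain == precursor:
                next_pool.extend(products)  # precursor is replaced by its products
            else:
                next_pool.append(chain)     # other chains are left untouched
        pool = next_pool
    return pool


if __name__ == "__main__":
    for n in range(len(CAPSID_STEPS) + 1):
        print(n, apply_cleavages("P1", CAPSID_STEPS[:n]))
```

Keeping the steps as ordered data rather than hard-coded logic makes it easy to substitute a genus-specific cleavage order where the pathways differ.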
B. POLYPROTEIN PROCESSING AND OTHER FUNCTIONS OF THE PICORNAVIRAL PROTEINASES
The genome of all picornaviruses carries at least one, more often two, genes encoding proteolytic enzymes (Ryan and Flint, 1997). The 3C gene product is the major processing proteinase in all picornaviruses. The primary function of the picornaviral proteinases is the cotranslational, specific cleavage of the viral polyprotein into the structural and nonstructural proteins. The individual proteolytic cleavages by the 3C proteinases within the picornaviral polyproteins are sequential; some sites are cleaved faster than others. The cleavage sites are identified by the sequence of the residues immediately preceding and following the scissile bond (approximately P4 to P2' in the nomenclature of Schechter and Berger, 1967). The P1 residue, immediately preceding the scissile peptide bond, is almost always a glutamine. The 3C proteinases of the individual picornaviruses also have sequence preferences for the residues at the P4, P2, P1', and P2' sites of a cleavage site (Nicklin et al., 1988; Long et al., 1989; Pallai et al., 1989; Weidner and Dunn, 1991; Malcolm, 1995). However, what distinguishes the preferred cleavage sites from the ones that are cleaved more slowly is not apparent from the peptide sequence. It is very likely that other factors, such as the accessibility and the local conformation, play a part in determining the order of cleavages. The details of the polyprotein processing are one factor that distinguishes
Ernst M. Bergmann and Michael N. G. James
the six different genera of the Picornaviridae (Ryan and Flint, 1997). Only a single proteinase, the 3C gene product, is present in the cardio-, hepato-, and parechoviruses. In the entero- and rhinoviruses the 2A gene product is a second proteolytic enzyme. An L proteinase at the amino-terminus of the polyprotein is a unique feature of the aphthoviruses. The separation of the structural and nonstructural proteins is usually the primary cleavage event, but this is accomplished quite differently in the individual genera. In entero- and rhinoviruses 2A is a separate proteolytic activity. It performs the primary cleavage at its own amino-terminus, which separates P1 from the nonstructural proteins. In the hepato- and parechoviruses the primary cleavage is a 3C-mediated cleavage at the amino-terminus of the 2B gene product (Jia et al., 1993; Schultheiss et al., 1994; Martin et al., 1995; Schultheiss et al., 1995a). It is not clear if the small 2A gene product has any function. In the aphtho- and cardioviruses the primary cleavage is at the carboxy-terminus of 2A. The 2A gene product is not a proteolytic enzyme in these viruses. The cleavage is presumably nonenzymatic and requires the carboxy-terminal residues of 2A (Palmenberg et al., 1992; Donnelly et al., 1997). The second proteinase present in the aphthoviruses, the L proteinase, only cleaves itself from the amino-terminus of the polyprotein (Strebel and Beck, 1986).
Larger precursors of the 3C proteinase, such as 3CD or 3ABC, are also catalytically active proteinases (Ypma-Wong et al., 1988; Harris et al., 1992; Davis et al., 1997). It has been shown in some systems that the presence of an additional domain can change the efficiency and specificity of the proteolytic activity. In poliovirus, 3CD is a proteinase with a distinct specificity (Ypma-Wong et al., 1988).
It cleaves at least some of the cleavage sites within the viral polyprotein more efficiently and is presumably required for the processing of the cleavage sites within the structural protein. In other picornaviruses other precursors may play a similar role. It has been suggested that 3ABC is a proteolytically active precursor of 3C in HAV (Harmon et al., 1992; Schultheiss et al., 1994).
Structural proteins of viruses in general are designed to form large assemblies, such as viral capsids. Therefore, they have to be synthesized as precursors, which are covalently modified before they can assemble. One very common form of modification, not only in small + RNA viruses, is proteolytic processing of the precursor (Kay and Dunn, 1990). This is one reason why proteolytic enzymes are among the most ubiquitous enzymatic activities expressed by viruses (Dougherty and Semler, 1993). In the picornaviruses the structural proteins are further proteolytically processed after they are separated from the nonstructural proteins. Two sequential, 3C-mediated cleavages are required within the capsid protein before the resulting 5S protomers can assemble into larger (14S) pentamers (Fig. 1). Final capsid assembly then requires the presence of RNA. In poliovirus the cleavages
Picornaviral Proteinases
of the capsid proteins within the protomer require the proteolytic activity of the precursor 3CD (Ypma-Wong et al., 1988).
The picornaviral 3C proteinase cleaves itself out of the polyprotein. In experimental systems it was shown that this can be accomplished either in cis, when 3C is expressed as part of the polyprotein, or in trans, when 3C is expressed separately (Kräusslich and Wimmer, 1988; Harmon et al., 1992). Kinetic evidence obtained with encephalomyocarditis virus (EMCV) also suggests that the cleavage in cis at both the amino and carboxy termini of 3C is intramolecular (Palmenberg and Rueckert, 1982). Another possible interpretation of these data is that the cleavages are performed by another 3C proteinase within a tight dimer or larger polymer. Further evidence for an intramolecular, autocatalytic cleavage of the 3C proteinase was provided by Hanecak et al. (1984). The crystal structures of 3C proteinases have allowed one to deduce a structural model for an intramolecular cleavage of 3C at its own amino-terminus (Matthews et al., 1994; Bergmann et al., 1997). This model proposes that the amino-terminal helix, which is a unique feature of the 3C proteinase, folds out of the active site of 3C after 3C has cleaved its own amino-terminus. Whether and how the 3C proteinase could cleave its own carboxy-terminus in an intramolecular reaction is much less obvious.
An additional function of picornaviral proteinases is the inhibition, or at least down-regulation, of specific host cell functions that compete with the viral replication cycle (Ryan and Flint, 1997). The entero- and rhinoviral 2A proteinases specifically cleave one of the cellular proteins that forms part of the cap-recognition complex (eIF4G) (Lamphear et al., 1993; Sommergruber et al., 1994a; Haghighat et al., 1996). This serves to down-regulate the translation of capped host cell mRNAs, which competes with the translation of the picornaviral RNA genome.
It is remarkable that the L proteinase of aphthoviruses, in spite of being a different proteinase, performs the same function. The entero-/rhinoviral 2A proteinase and the aphthoviral L proteinase cleave eIF4G at different places on the molecule (Kirchweger et al., 1994). The picornaviruses that do not have a second proteolytic activity besides 3C do not cleave eIF4G or inhibit host cell translation by this mechanism. Hepatitis A virus even requires intact eIF4G for the translation of its own genome (Borman and Kean, 1997).
There are also reports of host cell proteins being substrates of the picornaviral 3C proteinases. Most of these cellular substrates of the 3C proteinases are involved in some aspect of cellular translation or replication (Ryan and Flint, 1997; Yalamanchili et al., 1997).
Thus, there are three main functions of picornaviral proteinases: the specific processing of the viral polyprotein, covalent modification of the precursors of the viral capsid, and down-regulation of host cell processes by proteolytic
cleavage of host cell proteins (Gorbalenya and Snijder, 1996; Kay and Dunn, 1990; Kräusslich and Wimmer, 1988; Ryan and Flint, 1997). Cotranslational, specific processing of a viral polyprotein by a specific viral proteinase is an essential part of viral replication in + RNA viruses. This is true even for some families of + RNA viruses which have developed additional strategies to generate individual gene products from a single RNA genome, e.g., subgenomic RNAs or multiple ORFs. Proteolytic cleavage as a covalent modification of the precursors of viral structural proteins is even more common and occurs even in DNA viruses. Down-regulation of the host cell metabolism by specific cleavage of cellular proteins is a mechanism which is not found in all viruses. Given these important functions, it is not surprising that proteolytic enzymes are ubiquitous gene products in all + RNA and many other viruses.
III. PICORNAVIRAL PROTEINASES
A. THE 3C PROTEINASE
1. Structure
The major processing proteinase of the picornaviruses, the 3C proteinase, belongs to a new family of proteolytic enzymes: the chymotrypsin-like cysteine proteinases (Gorbalenya and Snijder, 1996). This had initially been predicted based on analysis of the sequence of the 3C gene product (Gorbalenya et al., 1986, 1989; Bazan and Fletterick, 1988). This prediction was shown to be correct by the first crystal structure of a 3C proteinase (Allaire et al., 1994). Refined crystal structures of 3C proteinases have now been published for the enzymes from hepatitis A virus (HAV), poliovirus (PV), and human rhinovirus (HRV) (Matthews et al., 1994; Bergmann et al., 1997; Mosimann et al., 1997). The three-dimensional structures of the 3C proteinases from HAV and PV are shown in Figs. 2 and 3, respectively. The two enzymes differ in size and belong to two subclasses of the 3C proteinases. The 3C gene product of HAV consists of 219 residues and the molecule from PV consists of 183 residues. In spite of the size difference, the cores of the enzymes superimpose surprisingly well. The rms difference for the Cα atoms of 154 residues which superimpose closely is 1.85 Å. This indicates that the core of the two-domain structure of the 3C proteinase is fairly well conserved. Differences between the various 3C proteinases manifest in the length of the secondary structure elements and in the turns and loops that connect the β-strands and protrude from the core of the β-barrel domains. In spite of being cysteine proteinases, the 3C proteinases belong structurally to the superfamily of chymotrypsin-like proteinases (Gorbalenya and Snijder,
1996). The structures of chymotrypsin-like proteinases are formed by two antiparallel β-barrels with the proteolytic active site at the domain interface. Both domains contribute to the catalytic residues in the active site. Both domains also participate in the binding of peptide substrates. The N-terminal domain is mostly involved in binding the substrate residues following the scissile peptide bond (P1' to P3') whereas the C-terminal domain forms the specific subsites for the substrate residues preceding the scissile bond (P4 to P1) (Perona and Craik, 1995).
The two domains of the chymotrypsin-like proteinases are usually described as six-stranded, antiparallel β-barrels, with the individual β-strands labeled aI-fI and aII-fII (Figs. 2 and 3, see color plates). An alternative description of the β-barrels is that of a sandwich of two orthogonal, four-stranded, antiparallel β-sheets (Chothia, 1984). The β-strands, which form the edge of the sheets, belong to both sheets and continue, sometimes uninterrupted, from one sheet to the other. As a result the two corners of the "β-sandwich," which are formed by the edge-strands, are closed, while the other two corners are splayed (Chothia, 1984). In both β-barrel domains of the HAV 3C proteinase one of the edge strands is interrupted while the other continues from one β-sheet to the other (Fig. 2) (Bergmann et al., 1997). In the N-terminal domain β-strand eI is interrupted by a single helical turn. β-strand bI forms a β-bulge at Val 28, allowing it to bend from one β-sheet to the other. The residue Val 28 is involved in the binding of peptide substrates by HAV 3C. In the C-terminal domain of HAV 3C the bII strand is interrupted by a short stretch of random coil structure, whereas the eII strand continues from one sheet to the other. There are seven defined β-strands in both of the domains of HAV 3C.
In the two smaller β-barrels, which form the domains of the polio 3C, the edge strands continue uninterrupted from one sheet to the other (Fig. 3). There are six defined β-strands in each domain of the polio 3C proteinase (Mosimann et al., 1997). Two of the β-strands, bII and cII, of the C-terminal domain of the 3C proteinases are extended past the C-terminal β-barrel (Bergmann et al., 1997). From the point where the two strands are no longer part of the β-barrel they form an antiparallel, two-stranded β-ribbon (light gray in Figs. 2 and 3). A defined β-bulge introduces a bend into this β-ribbon, which causes it to curl back toward the active site. The longer β-ribbon in the HAV 3C proteinase contributes to the residues involved in the catalytic mechanism and also to the binding of peptide substrates (see below). Because the β-ribbon is shorter in the poliovirus 3C, it only contributes to the P4 binding pocket and the proteolytic active site of polio 3C is much more accessible (Mosimann et al., 1997). This β-ribbon is a unique feature of the 3C proteinase and replaces the "methionine loop" of the chymotrypsin-like serine proteinases. The corresponding topological feature is somewhat similar, but smaller, in some bacterial proteinases (e.g., α-lytic proteinase; Fujinaga et al., 1985).
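The rms difference quoted above for the superimposed Cα atoms of HAV and PV 3C is a root-mean-square distance over paired atoms. The following minimal sketch shows only that final arithmetic step. It assumes the two coordinate sets have already been optimally superimposed and paired residue by residue; the function name `rms_difference` and the toy coordinates are illustrative and are not taken from the published structures or from the authors' procedure.

```python
import math


def rms_difference(coords_a, coords_b):
    """Root-mean-square distance between two paired lists of (x, y, z) points.

    Assumes both sets are already superimposed and listed in matching order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between one pair of equivalent atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example (hypothetical coordinates in angstroms): every atom pair is
# displaced by exactly 1 along z, so the rms difference is 1.0.
set_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(rms_difference(set_a, set_b))
```

Because the deviations are squared before averaging, a few poorly matching loop residues can dominate the value, which is one reason such comparisons are usually restricted to the closely superimposing core.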
There are helices at the N- and C-termini of the 3C proteinases. The N-terminal helix packs against the C-terminal β-barrel and the C-terminal helix packs against the surface of the N-terminal domain. The two helices stabilize the structure like two latches (Bergmann et al., 1997). The N-terminal α-helix is a unique feature of the 3C proteinases among all chymotrypsin-like proteinases (Gorbalenya and Snijder, 1996). It has been speculated that it is important for the mechanism of a proposed intramolecular cleavage at the N-terminus of 3C (Matthews et al., 1994; Bergmann et al., 1997). In the proposed model for the N-terminal, intramolecular proteolytic cleavage, this helix is folded after 3C cleaves its own N-terminus. The favorable free energy of the folding of this stable helix may be required to fold the new N-terminus out of the active site in order to create the active proteinase (Bergmann et al., 1997). The sequence of the residues which form the last turn of this helix is highly conserved throughout the picornaviral 3C genes (K/R-R/K-N-L/I).
It is interesting that the structural and functional details of the proteolytic active site of the 3C proteinases are not the most conserved part of the 3C structure (Gorbalenya et al., 1988). The 3C gene product constitutes one subunit of the picornaviral RNA replicase complex and has a distinct RNA binding site (Hammerle et al., 1992; Andino et al., 1993; Leong et al., 1993; Kusov et al., 1997). The sequence of the residues which have been implicated in this second activity, KFRDI, is located in the domain connection of 3C, on the opposite side of the molecule from the proteolytic active site (Figs. 2b and 3b). It is completely conserved throughout the 3C gene sequences of all picornaviruses (Ryan and Flint, 1997). It was first shown for poliovirus that mutations within this sequence are deleterious for the viral replication and show two different phenotypes (Hammerle et al., 1992).
These results can now be interpreted in light of the structures. The three charged residues within the consensus sequence (K82, R84, and D85 in poliovirus and K95, R97, and D98 in HAV) form part of the surface of the RNA binding site and are probably directly involved in RNA binding. The side-chains of the two highly conserved hydrophobic residues (F83 and I86 in poliovirus 3C and F96 and I99 in HAV) are packed into the interior of the molecule. They are part of the internal hydrophobic interactions that maintain the structure in this region and are important for this reason. The side-chain of the conserved phenylalanine interacts with a conserved glycine at the end of β-strand bI inside the N-terminal β-barrel. The sequence surrounding this glycine, LGVK/HD, is also highly conserved within the 3C genes. The residues in this sequence motif, from His 31 in poliovirus 3C and Lys 35 in HAV 3C on, form a reverse turn and connect β-strands bI and cI (Figs. 2b and 3b). They contribute to the surface of the RNA binding site of 3C. Presumably, the turns which connect β-strands dI and eI and dII and eII also contribute to the molecular surface of the RNA binding site (Figs. 2b and 3b). The latter connection forms a single turn of a helix in HAV 3C (Fig. 2b).
One face of each of the N- and C-terminal helices flanks the conserved residues within the domain connection and probably contributes to the RNA binding site. It appears likely, as was proposed by Ryan and Flint (1997), that the binding of RNA to this site would have an influence on the proteolytic processing of both the N- and C-termini of 3C.
On the other hand, it is not known if binding of RNA to the RNA binding site of 3C affects the proteolytic activity. Only structural work on a complex of 3C and bound RNA could provide a definite answer to this question. Because the RNA binding site is on the opposite side of the molecule from the proteolytic active site, the structures suggest that the two activities could be independent. However, the 3C structures also suggest a possible mechanism whereby binding of RNA in the RNA binding site of 3C could influence the proteolytic activity. The turns which connect β-strands bI and cI and dII and eII are probably involved in the specific binding of RNA. At the other end of each of these strands are residues which play important roles in the proteolytic activity. Slight conformational changes involving these β-strands could have a dramatic effect on the proteolytic activity.
2. Activity and Specificity
The picornaviral 3C proteinases are relatively slow enzymes when compared to some of the mammalian, extracellular serine proteinases. They have evolved to be very specific enzymes (Malcolm, 1995; Gorbalenya and Snijder, 1996; Ryan and Flint, 1997; Bergmann, 1998 and references therein).
The chymotrypsin-like proteinases belong to a large group of proteolytic enzymes in which the nucleophile is the oxygen or sulfur atom of the side-chain of a serine or cysteine residue, respectively. In these enzymes the general acid-base catalyst is a conserved histidine residue.
It is generally accepted that the mechanism of these enzymes involves an acyl-enzyme intermediate formed between the nucleophile and the carbonyl of the P1 residue of the substrate. Additional, so-called tetrahedral intermediates occur both during formation and hydrolysis of the acyl-enzyme intermediate. The tetrahedral intermediates carry a negative charge on the oxygen atom of the scissile peptide bond. In the catalytic reaction of the chymotrypsin-like serine proteinases the transition states leading to the tetrahedral intermediates are rate limiting and structurally resemble the tetrahedral intermediates.
Three chemical groups with distinct functions are typically found in the active sites of proteolytic enzymes (James, 1993; Ryan and Flint, 1997): a nucleophile, which attacks the carbonyl of the scissile peptide bond; a general acid-base catalyst, which assists in the attack and protonates the leaving group; and an electrophilic structure, which stabilizes the developing negative charge on the carbonyl. The latter structure is usually referred to as the oxyanion hole. In
the chymotrypsin-like serine proteinases it consists of a stretch of seven residues with a consensus sequence XGDSGG, where the serine is the nucleophile. The main-chain conformation of this structure orients the first and third peptide bonds so that they donate hydrogen bonds to the carbonyl of the scissile bond. These two hydrogen bonds of the oxyanion hole help to stabilize the developing negative charge on the carbonyl oxygen during the reaction (Whiting and Peticolas, 1994).
There are additional chemical groups in the active site of proteinases, the function of which is less clear. In the chymotrypsin-like serine proteinases the carboxylate of an aspartate residue interacts with the edge of the imidazole of the histidine general acid-base catalyst that is opposite from the nucleophile (Nδ1). Originally it was thought that this "third member of the catalytic triad" participates in a proton transfer, but it is now generally believed that its function is to maintain the orientation of the histidine general acid-base catalyst and possibly to stabilize its developing positive charge. There are chemical groups in similar positions to the carboxylate of the third member of the catalytic triad in the 3C proteinases, but the interactions with the histidine general acid-base catalyst are different (Fig. 4, see color plate).
It is generally accepted that the active sites of cysteine proteinases, such as the enzymes of the papain family, contain a thiolate-imidazolium ion pair that is stabilized over a wide pH range (Storer and Ménard, 1994). The active site of the 3C proteinases features a thiol and an imidazole but in the structural context of a chymotrypsin-like proteinase. There is no direct experimental evidence for the charge and protonation state in the active site of the 3C proteinases. Even though the 3C proteinases belong to the superfamily of the chymotrypsin-like proteinases, they are cysteine proteinases.
Whether the mechanism of 3C proteinases more closely resembles the mechanism of chymotrypsin-like serine proteinases or other cysteine proteinases or is unique is not clear.
Figure 4 shows the details of the active site residues of the 3C proteinases from HAV (Bergmann et al., 1997) and PV (Mosimann et al., 1997). The arrangement of the cysteine-histidine dyad and the oxyanion hole is similar to that observed in the chymotrypsin-like serine proteinases, but due to the size of the sulfur nucleophile the active site is larger and the chemical groups are further apart. The conserved glycine residue in the oxyanion hole of the wild-type 3C proteinases shows a conformation that is similar to the one seen for the corresponding residue in the chymotrypsin-like serine proteinases. This left-handed 3₁₀-helical conformation requires a glycine in this position (φ = 95°, ψ = -5°) (Bergmann et al., 1997). In the chymotrypsin-like serine proteinases this conformation is maintained by interactions of the carbonyl of this peptide bond with other groups in the structure. The carbonyl of the corresponding residue
in the 3C proteinases (Pro 169 in HAV and Ala 144 in PV) does not make any interactions in the crystal structures. In the crystal structures of mutants of the nucleophilic cysteine of the HAV 3C proteinase this peptide bond is indeed flipped and the oxyanion hole has collapsed to a lower energy main-chain conformation (Allaire et al., 1994). It appears that the presence of the nucleophilic sulfur atom itself is required to maintain the proper conformation of the oxyanion hole in the 3C proteinases. We believe that it is a negative charge on the nucleophilic sulfur that orients the peptide bonds of the oxyanion hole and take this as partial evidence for a mechanism involving a thiolate-imidazolium ion pair. Direct experimental evidence for the protonation state of the residues in the active site and the mechanism of the 3C proteinases is, however, lacking.
An aspartate or glutamate residue, which is in an equivalent position to the third member of the catalytic triad of the chymotrypsin-like serine proteinases, is present and conserved throughout the 3C proteinases (Gorbalenya et al., 1988; Ryan and Flint, 1997). However, the interaction typical for the third member of the catalytic triad is not observed. In HAV 3C the side-chain of Asp 84 points away from the imidazole of the general acid-base His 44. It is locked in interactions with other regions of the structure (Bergmann et al., 1997). The position of the carboxylate of the third member of a catalytic triad is taken up by a water molecule in HAV 3C, which forms a hydrogen bond to the Nδ1 of His 44 (Fig. 4a). This water molecule, the imidazole of His 44, and the nucleophilic Sγ atom of Cys 172 of HAV 3C are in a common plane. Perpendicular to this plane and 3.0 Å above it is the side-chain of Tyr 143, which is located in the antiparallel β-ribbon of HAV 3C. Mutational studies have shown that Tyr 143 is important for the catalytic activity of HAV 3C.
The side-chain of Tyr 143 does not form a hydrogen bond in the crystal structure of HAV 3C. It is perpendicular to the plane of the imidazole and it is 3.5 Å away from the water molecule. We believe the side-chain of Tyr 143 is deprotonated and negatively charged in the structure of the HAV 3C proteinase. Presumably, an electrostatic interaction between His 44 and Tyr 143 helps to maintain the side-chain conformation of His 44 with the imidazole in the same plane as the nucleophilic Sγ atom of Cys 172 and helps to stabilize a positive charge on the His 44 imidazole.
The conserved glutamate, which is present in the position that corresponds to the third member of a putative catalytic triad in poliovirus 3C (Glu 71), does interact with the imidazole of the general acid-base catalyst His 40 (Mosimann et al., 1997). The interaction is, however, unusual. The hydrogen bond from the Nδ1 atom of His 40 is accepted by the anti lone electron pair of the carboxylate of Glu 71. This is similar to the structure of the 3C proteinase from rhinovirus (Matthews et al., 1997).
Thus, the conserved features in the proteolytic active site of the picornaviral
3C proteinase are a cysteine-histidine dyad and an oxyanion hole that resembles that of the chymotrypsin-like serine proteinases remarkably well. Additional chemical groups have been shown to be important by mutagenesis experiments and interact mostly with the histidine general acid-base catalyst. Their function is probably to maintain the orientation of the active site residues and to provide a specific electrostatic environment.
Chymotrypsin-like serine proteinases bind specific substrates in a canonical mode and with a specific conformation of the bound peptide substrate (Read and James, 1986; Bode and Huber, 1992). We now have evidence from structures of enzyme-inhibitor complexes that the 3C proteinases bind peptide substrates in a similar conformation (Bergmann and James, manuscript in preparation). Furthermore, the specific recognition of the cleavage sites within the viral polyprotein by the 3C proteinases can be rationalized if one assumes a similar binding mode. Chymotrypsin-like proteinases specifically bind 4-5 residues that precede the scissile bond and 2-3 residues that follow it in the sequence (i.e., P5 to P3'). The residues from P5 to P2 of the substrate are usually in a β-strand conformation. The P1 residue adopts a main-chain conformation that corresponds to a tight 3₁₀ helix. This places the carbonyl of the scissile peptide bond into the oxyanion hole. The P1' and P2' residues are usually also in a β-conformation. This main-chain conformation of a peptide substrate orients at least some of the peptide side-chains into specificity pockets, which are formed by the surface of the enzyme. While interactions between the enzyme and the main-chain of a bound substrate contribute significantly to the binding of a substrate, most of the specificity is provided by the interactions of the peptide side-chains in the specificity pockets of the enzyme (Fig. 5, see color plate).
The minimum size for a good substrate of a 3C proteinase is a hexapeptide with the specific P4 to P2' residues. Sequence preferences for certain residues that distinguish the 3C cleavage sites in the picornaviral polyprotein can also be found for the P4 to P2' residues (Pallai et al., 1989; Weidner and Dunn, 1991; Bergmann, 1998 and references therein).
Figure 5 shows a model of a hexapeptide substrate with the sequence of the primary cleavage site of the HAV polyprotein in the active site of HAV 3C. With the exception of the two hydrogen bonds that the carbonyl of the scissile peptide bond makes in the oxyanion hole, the main-chain interactions between the HAV 3C proteinase and the bound peptide are β-sheet interactions. In HAV 3C the substrate residues from P5 to P2 form an antiparallel β-sheet with β-strand eII. This form of substrate binding is common to all chymotrypsin-like proteinases. In HAV 3C there is also a parallel β-sheet interaction between the P4 to P2 residues of the substrate and enzyme residues from the extension of β-strand bII. This extension is not present in the smaller enteroviral 3C. Third, the P1' and P2' residues of the substrate of HAV 3C form an antiparallel β-interaction with the residues of the
enzyme, which form the β-bulge in strand bI of the enzyme. The conformation of β-strand bI is different in polio 3C and presumably forces a different main-chain conformation on the substrate. This could explain the unique preference of polio 3C for a P1' glycine residue.
All 3C proteinases share the preference for a glutamine residue in the P1 position of a substrate. Sequence preferences for the other residues of a substrate from P4 to P2' differ among the enzymes from the different genera. The major determinant of the primary specificity is a conserved histidine residue which is positioned inside the S1 pocket of the enzymes (Fig. 4). This histidine residue (191 in HAV and 161 in PV) is conserved throughout the 3C genes of all picornaviruses (Ryan and Flint, 1997). Models of substrates bound to 3C proteinases agree that this histidine residue donates a hydrogen bond to the carbonyl oxygen atom of the side-chain of a glutamine residue in the S1 pocket. The environment of this histidine, which must contribute to the specific distinction between glutamine and glutamate, is, however, different in the crystal structures of HAV 3C and PV 3C. In HAV 3C His191 interacts, via buried water, with the side-chain of Glu 132. Bergmann et al. (1997) suggest that the deprotonation of this Glu 132, which is buried in the interior of the C-terminal domain, would be energetically expensive and unfavorable. Because the two residues interact, the protonation of His191 would also be unfavorable. In the entero- and rhinoviral enzymes a buried tyrosine residue performs a similar function (Mosimann et al., 1997).
Comparison of the sequences of the natural cleavage sites of the HAV polyprotein reveals distinct sequence preferences for the residues in the P4, P2, and P1 positions of a 3C substrate (Bergmann, 1998). All natural cleavage sites in the HAV polyprotein have large hydrophobic residues (preferably Leu or Ile) in P4, serine or threonine in P2, and glutamine in P1.
The 3C proteinase from poliovirus has a sequence preference for a small, hydrophobic residue in P4, glutamine in P1, and glycine in the P1' position of a peptide substrate. Bergmann et al. (1997) suggest that His145 in HAV 3C can form a hydrogen bond to a serine or threonine residue in P2 and is therefore responsible for the P2 specificity (Fig. 5). The β-ribbon that contributes His145 in HAV 3C is shorter in the poliovirus enzyme, so there is no equivalent residue in polio 3C. This correlates well with the fact that polio 3C does not show a sequence preference for a P2 residue. The hydrophobic S4 pocket of the 3C proteinases is a cleft formed by β-strands eII and fII and the β-ribbon formed by the extension of β-strands bII and cII (Fig. 5). It is quite large in the HAV 3C proteinase. In polio 3C several of the hydrophobic residues that form this pocket are substituted by larger ones (e.g., Ala141 and Val200 in HAV 3C correspond to Leu125 and Phe170 in polio 3C). Therefore, the S4 pocket in the polio 3C is smaller and polio 3C prefers smaller, hydrophobic side-chains in P4.
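The subsite preferences described above can be written down as simple sequence patterns. The sketch below is only an illustration of the Schechter and Berger numbering and of the stated preferences (HAV 3C: Leu or Ile in P4, Ser or Thr in P2, Gln in P1; polio 3C: a small hydrophobic residue in P4, Gln in P1, Gly in P1'). The choice of Ala and Val as "small hydrophobic" residues, the function names, and the toy sequence are assumptions made for the example. As the text stresses, sequence alone does not explain which sites are actually cleaved, so a match is at most a candidate site.

```python
import re

# Illustrative patterns spanning P4-P3-P2-P1 (plus P1' for polio 3C).
# Lookahead is used so that overlapping candidate sites are all reported.
HAV_3C_RULE = re.compile(r"(?=[LI].[ST]Q)")   # P4 = Leu/Ile, P2 = Ser/Thr, P1 = Gln
PV_3C_RULE = re.compile(r"(?=[AV]..QG)")      # P4 = Ala/Val (assumed), P1 = Gln, P1' = Gly


def candidate_sites(sequence, rule):
    """Return 0-based indices of the P1 residue for every pattern match."""
    # Each match starts at P4, so P1 lies three residues further on.
    return [m.start() + 3 for m in rule.finditer(sequence)]


def label_subsites(sequence, p1_index, n_before=4, n_after=2):
    """Map Schechter-Berger labels (P4..P1, P1'..P2') to residues.

    The scissile bond lies between sequence[p1_index] and sequence[p1_index + 1].
    """
    labels = {}
    for k in range(n_before, 0, -1):           # P4, P3, P2, P1
        i = p1_index - (k - 1)
        if 0 <= i < len(sequence):
            labels[f"P{k}"] = sequence[i]
    for k in range(1, n_after + 1):            # P1', P2'
        i = p1_index + k
        if i < len(sequence):
            labels[f"P{k}'"] = sequence[i]
    return labels


# Hypothetical toy sequence, not taken from any viral polyprotein.
TOY = "MGLRTQSFSAKTQGPE"
hav_sites = candidate_sites(TOY, HAV_3C_RULE)
pv_sites = candidate_sites(TOY, PV_3C_RULE)
print(hav_sites, pv_sites)
print(label_subsites(TOY, hav_sites[0]))
```

In the toy sequence the HAV-style pattern matches LRTQ and the polio-style pattern matches AKTQG, and `label_subsites` returns the hexapeptide P4 to P2' around the first of these, which corresponds to the minimum substrate size mentioned in the text.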
The larger 3C proteinase from HAV thus forms more extensive interactions with peptide substrates, both main-chain and side-chain. However, it is important to keep in mind that in the enteroviruses the 3CD precursor is a more active proteinase and is required for some of the cleavages of the polyprotein (Ypma-Wong et al., 1988). Bergmann et al. (1997) suggested a model for the interactions of 3C with the 3D and 3AB domains in a larger precursor. In this model the 3D part of a 3CD precursor would be in a position to interact with the residues in the P2 and P3 positions of a substrate and could influence the proteolytic activity (top left of the 3C molecule in Figs. 2a and 3a).
While the models of substrate binding to the 3C proteinases allow one to rationalize how the 3C proteinases recognize the specific cleavage sites within the polyprotein, it is not possible to explain the preference of some of the cleavage sites over others during the polyprotein processing. Presumably other factors besides the subsite specificity, such as the accessibility within the folded polyprotein, must play a part in the determination of the sequential polyprotein processing.
3. Inhibition
Inhibitors of the 3C proteinases usually combine a chemical functionality that covalently attaches to the nucleophilic thiol in the active site with other groups that target some of the specific interactions between the proteinase and its substrates. Typical cysteine proteinase inhibitors such as iodoacetamide, N-ethylmaleimide, epoxides, and aldehydes are also effective against the 3C proteinases (Malcolm, 1995). More promising inhibitors are the fluoromethylketones and γ-aminovinylsulfones (Rasnik, 1996). Some of the best inhibitors available to date combine the latter functionalities with a peptidic specificity address which mimics the natural peptide specificity.
A tetrapeptide fluoromethylketone inhibitor with the sequence Acetyl-Leu-Ala-Ala-Gln-FMK has been shown to be an effective inhibitor of the HAV 3C proteinase in vitro and in vivo (Morris et al., 1997). It covalently attaches to the HAV 3C proteinase and is capable of reducing the production of progeny virus in infected cells. Other functionalities that are now being investigated as inhibitors of the chymotrypsin-like cysteine proteinases include α,β-unsaturated carboxylesters, β- and γ-lactones, lactams, isatins (2,3-dioxindoles), and triterpene sulfates (Skiles and McNeil, 1990; Brill et al., 1996). A cocrystal structure of the rhinovirus 3C proteinase with an isatin analog inhibitor in the active site shows that some of these compounds also covalently attach to the active site thiol and mimic the P1 specificity determinant of a natural substrate (Webber et al., 1996). While the details of the specific enzyme-substrate interactions gleaned
from the crystal structures of 3C proteinases provide valuable information for the design of effective inhibitors, there is little experimental evidence for the mechanism of the chymotrypsin-like cysteine proteinases. This kind of information would, however, be of great value in identifying potential chemical functionalities and inhibitors.
B. THE ENTERO- AND RHINOVIRAL 2A PROTEINASE
The primary cleavage that separates the structural and nonstructural proteins in the entero- and rhinoviruses occurs at the N-terminus of the 2A gene product and is performed by 2A itself. Analysis of the sequence, mutational studies, and model-building studies have shown that the entero- and rhinoviral 2A proteinase is also a chymotrypsin-like cysteine proteinase but distinct from and only distantly related to 3C. The 2A proteinase is a smaller enzyme of 142 residues. It has been suggested, based on the results of mutational studies and structural models, that the active site of the 2A proteinases contains a catalytic triad of Cys106, His18, and Asp35, which more closely resembles that of the serine proteinases (Sommergruber et al., 1989; Hellen et al., 1991). The sequence alignments also seem to indicate a closer relationship of 2A to the small, bacterial serine proteinases (Sommergruber et al., 1997). The 2A proteinase has less stringent specificity requirements than the 3C proteinases (Skern et al., 1991). Presumably this reflects the fact that its major function is to perform an intramolecular cleavage at its own N-terminus. Indeed, it has been found that amino acid changes in a substrate affect the trans activity but not the intramolecular cis activity (Hellen et al., 1992). The 2A proteinase is a zinc protein and the tightly bound zinc ion presumably plays a structural role (Sommergruber et al., 1994b; Voss et al., 1995). In the model of Sommergruber et al. (1997) the N-terminal β-barrel has fewer strands and presumably the zinc ion is therefore needed to stabilize the N-terminal domain of the small 2A proteinase. A second function of the entero- and rhinoviral 2A proteinase is the specific cleavage of one of the proteins of the eukaryotic cap-binding complex, eIF4G (Sommergruber et al., 1994a; Haghighat et al., 1996).
This results in inhibition of the translation of capped, cellular mRNAs and preferential translation of the viral RNA. The 2A gene product of entero- and rhinoviruses also has functions in addition to its proteolytic activity (Belsham and Sonnenberg, 1996). There is evidence that 2A forms a complex with other viral proteins and is involved in viral RNA translation and other aspects of viral replication (Molla et al., 1993; Lu et al., 1995; Cuconati et al., 1998).
There is little experimental evidence for the catalytic mechanism of the 2A proteinase. The chemical functionalities which provide good inhibitors of the 3C proteinases, such as the fluoromethylketones, are also effective against other chymotrypsin-like cysteine proteinases. Because the specificity requirements of the 2A proteinases are less stringent, the design of specific inhibitors against this class of enzymes could be more difficult.
C. THE L PROTEINASE OF THE APHTHOVIRUSES
The aphthoviruses have another distinct proteolytic activity besides the 3C proteinase. The gene coding for the L proteinase is located at the N-terminus of the polyprotein and precedes the structural proteins (Ryan and Flint, 1997). The L proteinase cleaves its own C-terminus (Strebel and Beck, 1986). In vitro this cleavage can occur in cis and trans (Medina et al., 1993; Cao et al., 1995). The aphthoviral L proteinase also cleaves the cellular eIF4G and thus causes inhibition of the translation of capped, cellular mRNAs. This function of the aphthoviral L proteinase is similar to the one performed by the entero- and rhinoviral 2A proteinases. However, the cleavage of eIF4G by the L proteinase occurs in a different position (Kirchweger et al., 1994). In spite of these functions the L proteinase is not essential for the replication of the virus (Piccone et al., 1995a). The L proteinase is a cysteine proteinase and analysis of the sequence suggests that it belongs to the family of papain-like proteinases (Gorbalenya et al., 1991). The enzyme is present in two forms which differ by size and originate from two different initiation codons in the viral genome. The Lb proteinase of foot-and-mouth disease virus (FMDV) consists of 173 residues and the Lab proteinase is 28 residues longer. Sequence analysis, site-directed mutagenesis, and modeling studies identified the nucleophile, the general acid-base catalyst, and the third member of the catalytic triad as Cys51, His148, and Asp164, respectively (Gorbalenya et al., 1991; Piccone et al., 1995b; Roberts and Belsham, 1995; Skern et al., 1998). Sequence alignments also suggest that the side-chain of Asn46 contributes to the oxyanion hole (similar to Gln19 of papain) (Ryan and Flint, 1997). The C-terminus of the L proteinase has an extension, compared to the structure of papain, which has been predicted to adopt a helical conformation. Skern et al. 
(1998) suggest that this additional helix plays an important role in the mechanism of the intramolecular cleavage at the C-terminus of the L proteinase. Crystallization of the Lb proteinase from FMDV has been reported but the crystal structure has not been published (Guarné et al., 1996).
IV. CONCLUSIONS AND IMPLICATIONS FOR ANTIVIRAL STRATEGIES
The picornaviral 3C proteinases constitute an ideal target for the rational design of antiviral drugs. There is now a considerable amount of structural information for both enzymes and enzyme-inhibitor complexes. The details of the molecular interactions that are responsible for the specific substrate binding are reasonably well understood. Furthermore, the chymotrypsin-like cysteine proteinases constitute a unique class of enzymes with a distinct substrate specificity and are so far only found in +RNA viruses. Within these viruses the 3C proteinases perform a central and indispensable role during the viral life cycle and 3C proteinase inhibitors have the potential to limit the spread of viral infections (Morris et al., 1997). Neither the 2A nor the L proteinases are as attractive as targets for antiviral strategies. The activity of the L proteinase is apparently not as critical for viral replication. There is not as much structural information available for the entero-/rhinoviral 2A proteinase. The fact that the 2A proteinase activity is also less stringently specific could make the design of inhibitors more difficult. While there are many picornaviruses, and viruses from related families, that cause disease in humans, few of these are considered important targets for the design of antiviral drugs. Rhinoviruses cause at least half of all common colds in humans. But because there are other families of unrelated viruses that cause upper respiratory tract infections, which are essentially indistinguishable, effective drugs against rhinoviruses would only be useful in combination with simple analytical procedures to unambiguously identify rhinoviral infections. Such simple analytical procedures are not available at present (Couch, 1996). There are safe and effective vaccines available against poliovirus and HAV. 
As a result of extensive worldwide vaccination, the incidence of poliomyelitis has been decreasing and currently there are realistic efforts underway to eradicate the disease completely. The introduction of an effective vaccine against HAV was very recent and it is too early to predict its effect. Widespread vaccination against HAV is currently not planned as hepatitis A is usually not a life-threatening disease. Because co-infections of chronic carriers of hepatitis B, C, and G with HAV appear to be dangerous, the observed increase in chronic infections with other forms of hepatitis may have an influence on future strategies to control hepatitis A. It would be desirable to have antiviral drugs available against the more severe enteroviral infections. While most of the enteroviral infections are rare, they can have serious consequences. Because enteroviral infections occur infrequently, this is not considered an economically important target. Several other families of +RNA viruses also carry 3C or 3C-like proteinases
(Wirblich et al., 1995; Martin Alonso et al., 1996; Tibbles et al., 1996). Most notably, the Corona- and Caliciviridae cause upper respiratory tract infections and intestinal infections in humans. The viruses of these families are less well studied than the picornaviruses but are distantly related to them. The design of 3C proteinase inhibitors would in all likelihood also be useful toward the development of antiviral drugs against the 3C-like proteinases of the viruses of these families. At present the mechanisms by which some +RNA viruses, most notably the enteroviruses, can trigger severe autoimmune diseases are not well understood. It is also questionable whether inhibition of viral replication would prevent the disastrous consequences of the immune response at a later stage of an infection. Therefore, it is not clear whether antiviral drugs would be useful in the prevention of these diseases. In conclusion, there is a wealth of experimental information available for the best-studied examples of the viruses of the Picornaviridae. This information provides an opportunity to design inhibitors against the viral 3C proteinase. Effective inhibitors of the picornaviral 3C proteinase have the potential to become effective antiviral drugs against human diseases such as the common cold, hepatitis A, enteroviral infections, and diseases caused by related +RNA viruses.
REFERENCES
Allaire, M., Chernaia, M. M., Malcolm, B. A., and James, M. N. G. (1994). Picornaviral 3C cysteine proteinases have a fold similar to chymotrypsin-like serine proteinases. Nature 369, 72-76. Andino, R., Rieckhof, G. E., Achacoso, P. L., and Baltimore, D. (1993). Poliovirus RNA synthesis utilizes an RNP complex formed around the 5'-end of viral RNA. EMBO J. 12, 3587-3598. Bazan, J. F., and Fletterick, R. J. (1988). Viral cysteine proteinases are homologous to the trypsin-like family of serine proteinases: Structural and functional implications. Proc. Natl. Acad. Sci. USA 85, 7872-7876. Belsham, G. J., and Sonnenberg, N. (1996). RNA-protein interactions in regulation of picornavirus RNA translation. Microbiol. Rev. 60, 499-511. Bergmann, E. M. (1998). Hepatitis A virus picornain 3C. In "Handbook of Proteolytic Enzymes" (A. D. Barrett, N. J. Rawlings, and F. Woesner, Eds.). Academic Press, London. Bergmann, E. M., Mosimann, S. C., Chernaia, M. M., Malcolm, B. A., and James, M. N. G. (1997). The refined crystal structure of the 3C gene product from hepatitis A virus: Specific proteinase activity and RNA recognition. J. Virol. 71, 2436-2448. Bienz, K., Egger, D., Rasser, Y., and Bossart, W. (1983). Intracellular distribution of poliovirus proteins and the induction of virus-specific cytoplasmic structures. Virology 131, 39-48. Bienz, K., Egger, D., Troxler, M., and Pasamontes, L. (1990). Structural organization of poliovirus RNA replication is mediated by viral proteins of the P2 genomic region. J. Virol. 64, 1156-1163. Bode, W., and Huber, R. (1992). Natural protein proteinase inhibitors and their interactions with proteinases. Eur. J. Biochem. 204, 433-451. Borman, A. M., and Kean, K. M. (1997). Intact eukaryotic initiation factor 4G is required for hepatitis A virus internal initiation of translation. Virology 237, 129-136. Brill, B. M., Kati, W. M., Montgomery, D., Karwowski, J. P., Humphrey, P. E., Jackson, M., Clement,
J. J., Kadam, S., Chen, R. H., and McAlpine, J. B. (1996). Novel triterpene sulfates from Fusarium compactum using a rhinovirus 3C protease inhibitor screen. J. Antibiot. 49, 541-546. Cao, X., Bergman, I. E., Füllkrug, R., and Beck, E. (1995). Functional analysis of the two alternative initiation sites of foot-and-mouth disease virus. J. Virol. 69, 560-563. Carthy, C. M., Yang, D., Anderson, D. R., Wilson, J. E., and McManus, B. M. (1997). Myocarditis as systemic disease: New perspectives on pathogenesis. Clin. Exp. Pharmacol. Physiol. 24, 997-1003. Chotia, C. (1984). Principles that determine the structure of proteins. Annu. Rev. Biochem. 53, 537-572. Couch, R. B. (1996). Rhinoviruses. In "Fields Virology" (B. N. Fields, D. M. Knipe, P. M. Howley, R. M. Channock, J. L. Melnick, T. P. Monath, B. Roizmann, and S. E. Straus, Eds.). Lippincott-Raven, Philadelphia. Cuconati, A., Xiang, W., Lahser, F., Pfister, T., and Wimmer, E. (1998). A protein linkage map of the P2 nonstructural proteins of poliovirus. J. Virol. 72, 1297-1307. Davis, G. J., Wang, Q. M., Cox, G. A., Johnson, R. B., Wakulchik, M., Datson, C. A., and Villarreal, E. C. (1997). Expression and purification of recombinant rhinovirus 14 3CD proteinase and its comparison to the 3C proteinase. Arch. Biochem. Biophys. 346, 125-130. Donnelly, M. L. L., Gani, D., Flint, M., Monaghan, S., and Ryan, M. D. (1997). The cleavage activity of aphtho and cardiovirus 2A proteins. J. Gen. Virol. 78, 13-21. Dougherty, W. G., and Semler, B. L. (1993). Expression of virus-encoded proteinases: Functional and structural similarities with cellular enzymes. Microbiol. Rev. 57, 781-822. Fujinaga, M., Delbaere, L. T. J., Brayer, G., and James, M. N. G. (1987). Refined crystal structure of α-lytic protease at 1.7 Å resolution. J. Mol. Biol. 184, 479-502. Gamarnik, A. V., and Andino, R. (1997). Two functional complexes formed by KH domain containing proteins with the 5' noncoding region of poliovirus RNA. RNA 3, 882-892. Gorbalenya, A. 
E., and Snijder, E. J. (1996). Viral cysteine proteinases. Perspect. Drug Disc. Design 6, 64-86. Gorbalenya, A. E., Blinov, V. M., and Donchenko, A. P. (1986). Poliovirus-encoded proteinase 3C: A possible evolutionary link between cellular serine and cysteine proteinase families. FEBS Lett. 194, 253-257. Gorbalenya, A. E., Donchenko, A. P., Blinov, V. M., and Koonin, E. V. (1989). Cysteine proteinases of positive strand RNA viruses and chymotrypsin-like serine proteinases: A distinct protein superfamily with a common structural fold. FEBS Lett. 243, 103-114. Gorbalenya, A. E., Koonin, E. V., and Lai, M. M. C. (1991). Putative papain-related thiol protease of positive strand RNA viruses: Identification of rubi- and aphthovirus proteases and delineation of a novel conserved domain associated with proteases of rubi-, alpha- and coronaviruses. FEBS Lett. 288, 201-205. Guarné, A., Kirchweger, R., Verdaguer, R., Liebig, H. D., Blaas, D., Skern, T., and Fita, I. (1996). Crystallization and preliminary X-ray diffraction studies of the Lb proteinase of foot-and-mouth disease virus. Prot. Sci. 5, 1931-1933. Haghighat, A., Svitkin, Y., Novoa, I., Küchler, E., Skern, T., and Sonnenberg, N. (1996). The eIF4G-eIF4E complex is the target for direct cleavage by the rhinovirus 2A proteinase. J. Virol. 70, 8444-8450. Hämmerle, T., Molla, A., and Wimmer, E. (1992). Mutational analysis of the proposed FG loop of poliovirus proteinase 3C identified amino acids that are necessary for 3CD cleavage and might be determinants of a function distinct from proteolytic activity. J. Virol. 66, 6028-6034. Hanecak, R., Semler, B. L., Ariga, H., Anderson, C. W., and Wimmer, E. (1984). Expression of a cloned gene segment of poliovirus in E. coli: Evidence for autocatalytic production of the viral proteinase. Cell 37, 1063-1073. Harmon, S. A., Updike, W., Xi-Ju, J., Summers, D. F., and Ehrenfeld, E. (1992). Polyprotein
processing in cis and in trans by hepatitis A virus 3C protease cloned and expressed in E. coli. J. Virol. 66, 5242-5247. Harris, K. S., Xiang, W., Alexander, L. S., Lane, W. S., Paul, A. V., and Wimmer, E. (1994). Interactions of poliovirus polypeptide 3CDpro with the 5' and 3' termini of the poliovirus genome. J. Biol. Chem. 269, 27004-27014. Hellen, C. U. T., Fache, M., Kräusslich, H. G., Lee, C., and Wimmer, E. (1991). Characterization of poliovirus 2A proteinase by mutational analysis: Residues required for autocatalytic activity are essential for induction of eukaryotic initiation factor 4F polypeptide p220. J. Virol. 65, 4226-4231. Hellen, C. U. T., Lee, C., and Wimmer, E. (1992). Determinants of substrate recognition by poliovirus 2A proteinase. J. Virol. 66, 3330-3338. Hollinger, F. B., and Ticehurst, J. R. (1996). Hepatitis A virus. In "Fields Virology" (B. N. Fields, D. M. Knipe, P. M. Howley, R. M. Channock, J. L. Melnick, T. P. Monath, B. Roizmann, and S. E. Straus, Eds.). Lippincott-Raven, Philadelphia. James, M. N. G. (1993). Convergence of active-centre geometries among the proteolytic enzymes. In "Proteolysis and Protein Turnover" (J. S. Bond and A. J. Barrett, Eds.). Portland Press, London. Jia, X.-Y., Summers, D. F., and Ehrenfeld, E. (1993). Primary cleavage of the HAV capsid protein precursor in the middle of the proposed 2A coding region. Virology 193, 515-519. Kay, J., and Dunn, B. M. (1990). Viral proteinases: weakness in strength. Biochim. Biophys. Acta 1048, 1-18. Kirchweger, R., Ziegler, E., Lamphear, B. J., Waters, D., Liebig, H. D., Sommergruber, W., Sobrino, F., Hohenadl, C., Blaas, D., Rhoads, R. E., and Skern, T. (1994). Foot-and-mouth disease virus leader proteinase: Purification of the Lb form and determination of its cleavage site on eIF-4γ. J. Virol. 68, 5677-5684. Kräusslich, H.-G., and Wimmer, E. (1988). Viral proteinases. Ann. Rev. Biochem. 57, 701-754. Kusov, Y. Y., and Gauss-Müller, V. (1997). 
In vitro RNA binding of the hepatitis A virus proteinase 3C (HAV 3Cpro) to secondary structure elements within the 5' terminus of the HAV genome. RNA 3, 291-302. Lamphear, B. J., Yan, R., Yang, F., Waters, D., Liebig, H.-D., Klump, H., Küchler, E., Skern, T., and Rhoads, R. E. (1993). Mapping the cleavage site in protein synthesis initiation factor eIF-4γ of the 2A proteases from human coxsackie virus and rhinovirus. J. Biol. Chem. 268, 19200-19203. Leong, L. E. C., Walker, P. A., and Porter, A. G. (1993). Human rhinovirus 14 protease 3C (3Cpro) binds specifically to the 5'-noncoding region of the viral RNA. J. Biol. Chem. 268, 25735-25739. Long, L. A., Orr, D. C., Cameron, J. M., Dunn, B. M., and Kay, J. (1989). A consensus sequence for substrate hydrolysis by rhinovirus 3C proteinase. FEBS Lett. 258, 75-78. Lu, H. H., Li, X., Cuconati, A., and Wimmer, E. (1995). Analysis of picornavirus 2A (pro) proteins: Separation of proteinase from translation and replication functions. J. Virol. 69, 7445-7452. Malcolm, B. A. (1995). The picornaviral 3C proteinases: Cysteine nucleophiles in serine proteinase folds. Prot. Sci. 4, 1439-1445. Martin Alonso, J. M., Casais, R., Boga, J. A., and Parra, F. (1996). Processing of rabbit hemorrhagic disease virus polyprotein. J. Virol. 70, 1261-1265. Martin, A., Escriou, N., Chao, S. F., Girard, M., Lemon, S. M., and Wychowski, C. (1995). Identification and site-directed mutagenesis of the primary (2A/2B) cleavage site of the hepatitis A virus polyprotein: Functional impact on the infectivity of HAV RNA transcripts. Virology 213, 213-222. Matthews, D. A., Smith, W. W., Ferre, R. A., Condon, B., Budahazi, G., Sisson, W., Villafranca, J. E., Janson, C. A., McElroy, H. E., Gribskov, C. L., and Worland, S. (1994). Structure of human rhinovirus 3C protease reveals a trypsin-like polypeptide fold, RNA-binding site and means for cleaving precursor polyprotein. Cell 77, 761-771.
Medina, M., Domingo, E., Brangwyn, J. K., and Belsham, G. J. (1993). The two species of the foot-and-mouth disease virus leader protein, expressed individually, exhibit the same activities. Virology 194, 355-359. Melnick, J. L. (1996). Enteroviruses: Polioviruses, coxsackie viruses, echoviruses, and newer enteroviruses. In "Fields Virology" (B. N. Fields, D. M. Knipe, P. M. Howley, R. M. Channock, J. L. Melnick, T. P. Monath, B. Roizmann, and S. E. Straus, Eds.). Lippincott-Raven, Philadelphia. Miller, S. D., Vanderlugt, C. L., Smith-Begolka, W., Pao, W., Yauch, R. L., Neville, K. L., Katz-Levy, Y., Carrizosa, A., and Kim, B. S. (1997). Persistent infection with Theiler's virus leads to CNS autoimmunity via epitope spreading. Nat. Med. 3, 1133-1136. Molla, A., Paul, A. V., Schmid, M., Jang, S. K., and Wimmer, E. (1993). Studies on dicistronic polioviruses implicate viral proteinase 2Apro in RNA replication. Virology 196, 739-747. Morris, T. S., Frormann, S., Shechosky, S., Lowe, C., Lall, M. S., Gauss-Müller, V., Purcell, R. H., Emerson, S. U., Vederas, J. C., and Malcolm, B. A. (1997). In vitro and ex vivo inhibition of hepatitis A virus 3C proteinase by a peptidyl monofluoromethyl ketone. Bioorg. Med. Chem. 5, 797-807. Mosimann, S. C., Chernaia, M. M., Sia, S., Plotch, S., and James, M. N. G. (1997). Refined X-ray crystallographic structure of the poliovirus 3C gene product. J. Mol. Biol. 273, 1032-1047. Nicklin, M. J., Harris, K. S., Pallai, P. V., and Wimmer, E. (1988). Poliovirus proteinase 3C: Large-scale expression, purification and specific cleavage activity on natural and synthetic substrates in vitro. J. Virol. 62, 4586-4593. Pallai, P. V., Burkhardt, F., Shoog, M., Schreiner, K., Bax, P., Cohen, K. A., Hansen, G., Palladino, D. E., Harris, K. S., Nicklin, M. J., and Wimmer, E. (1989). Cleavage of synthetic peptides by purified poliovirus 3C proteinase. J. Biol. Chem. 264, 9738-9741. Palmenberg, A. C. (1990). 
Proteolytic processing of picornaviral polyprotein. Annu. Rev. Microbiol. 44, 602-623. Palmenberg, A. C., and Rueckert, R. R. (1982). Evidence for intramolecular self-cleavage of picornaviral replicase precursors. J. Virol. 41, 244-249. Palmenberg, A. C., Parks, G. D., Hall, D. J., Ingraham, R. H., Seng, T. W., and Pallai, P. V. (1992). Proteolytic processing of the cardioviral P2 region: Primary 2A/2B cleavage in clone derived precursors. Virology 190, 754-762. Parsley, T. B., Towner, J. S., Blyn, L. B., Ehrenfeld, E., and Semler, B. L. (1997). Poly (rC) binding protein 2 forms a ternary complex with the 5'-terminal sequences of poliovirus RNA and the viral 3CD proteinase. RNA 3, 1124-1134. Perona, J. J., and Craik, C. S. (1995). Structural basis of substrate specificity in the serine proteinases. Prot. Sci. 4, 337-360. Piccone, M. E., Rieder, E., Mason, P. W., and Grubman, M. J. (1995a). The foot-and-mouth disease leader proteinase gene is not required for viral replication. J. Virol. 69, 5376-5382. Piccone, M. E., Zellner, M., Kumosinski, T. F., Mason, P. W., and Grubman, M. J. (1995b). Identification of the active-site residues of the L proteinase of foot-and-mouth disease virus. J. Virol. 69, 4950-4956. Porter, A. G. (1993). Picornavirus nonstructural proteins: Emerging roles in virus replication and inhibition of host cell functions. J. Virol. 67, 6917-6921. Rasnick, D. (1996). Small synthetic inhibitors of cysteine proteinases. Perspect. Drug Disc. Design 6, 47-63. Read, R. J., and James, M. N. G. (1986). Introduction to the Protein Inhibitors: X-ray Crystallography. In "Proteinase Inhibitors" (A. J. Barrett and G. Salvesen, Eds.). Elsevier, Amsterdam. Roberts, P. J., and Belsham, G. J. (1995). Identification of critical amino acids within the foot-and-mouth disease virus leader protein, a cysteine protease. Virology 213, 140-146. Rueckert, R. R. (1996). Picornaviridae: The Viruses and their Replication. In "Fields Virology" (B. N. Fields, D. M. Knipe, P. M. 
Howley, R. M. Channock, J. L. Melnick, T. P. Monath, B. Roizmann, and S. E. Straus, Eds.). Lippincott-Raven, Philadelphia.
Ryan, M. D., and Flint, M. (1997). Virus-encoded proteinases of the picornavirus super-group. J. Gen. Virol. 78, 699-723. Schechter, I., and Berger, A. (1967). On the size of the active site in proteases. I. Papain. Biochem. Biophys. Res. Commun. 27, 157-162. Schultheiss, T., Kusov, Y. Y., and Gauss-Müller, V. (1994). Proteinase 3C of hepatitis A virus (HAV) cleaves the HAV polyprotein P2-P3 at all sites including VP1/2A and 2A/2B. Virology 198, 275-281. Schultheiss, T., Emerson, S. U., Purcell, R. H., and Gauss-Müller, V. (1995a). Polyprotein processing in echovirus 22--A first assessment. Biochem. Biophys. Res. Commun. 219, 1120-1127. Schultheiss, T., Sommergruber, W., Kusov, Y. Y., and Gauss-Müller, V. (1995b). Cleavage specificity of purified recombinant hepatitis A virus 3C proteinase on natural substrates. J. Virol. 69, 1727-1733. Skern, T., Fita, I., and Guarné, A. (1998). A structural model of picornavirus leader proteinase based on papain and bleomycin hydrolase. J. Gen. Virol. 79, 301-307. Skiles, J. W., and McNeil, D. (1990). Spiro indolinone β-lactams, inhibitors of poliovirus and rhinovirus 3C-proteinases. Tetrahedr. Lett. 31, 7277-7280. Sommergruber, W., Zorn, M., Blaas, D., Fessl, F., Volkmann, P., Maurer-Fogy, I., Pallai, P., Merluzzi, V., Matteo, M., Skern, T., and Küchler, E. (1989). Polypeptide 2A of human rhinovirus type 2: Identification as a proteinase and characterization by mutational analysis. Virology 169, 68-77. Sommergruber, W., Ahorn, H., Klump, H., Zoephel, A., Fessl, F., Blaas, D., Küchler, E., Liebig, H.-D., and Skern, T. (1994a). 2A proteinases of coxsackie- and rhinovirus cleave peptides derived from eIF-4γ via a common recognition motif. Virology 198, 741-745. Sommergruber, W., Casari, G., Fessl, F., Seipelt, J., and Skern, T. (1994b). The 2A proteinase of human rhinovirus is a zinc containing enzyme. Virology 204, 815-818. Sommergruber, W., Seipelt, J., Fessl, F., Skern, T., Liebig, H.-D., and Casari, G. (1997). 
Mutational analyses support a model for the HRV2 2A proteinase. Virology 234, 203-214. Steinmann, L., and Conlon, P. (1997). Viral damage and the breakdown of self-tolerance. Nature Med. 3, 1085-1087. Storer, A. C., and Ménard, R. (1994). Catalytic mechanism in papain family of cysteine peptidases. Meth. Enzymol. 244, 486-500. Strebel, K., and Beck, E. (1986). A second proteinase of foot-and-mouth disease virus. J. Virol. 58, 893-899. Teterina, N. L., Bienz, K., Egger, D., Gorbalenya, A. E., and Ehrenfeld, E. (1997a). Induction of intracellular membrane rearrangements by HAV proteins 2C and 2BC. Virology 237, 66-77. Teterina, N. L., Gorbalenya, A. E., Egger, D., Bienz, K., and Ehrenfeld, E. (1997b). Poliovirus 2C protein determinants of membrane binding and rearrangements in mammalian cells. J. Virol. 71, 8962-8972. Tibbles, K. W., Brierley, I., Cavanagh, D., and Brown, T. D. K. (1996). Characterization in vitro of an autocatalytic processing activity associated with the predicted 3C-like proteinase domain of the Coronavirus avian infectious bronchitis virus. J. Virol. 70, 1923-1930. Voss, T., Meyer, R., and Sommergruber, W. (1995). Spectroscopic characterization of rhinoviral protease 2A: Zn is essential for structural integrity. Prot. Sci. 4, 2526-2531. Walker, P. A., Leong, L. E. C., and Porter, A. G. (1995). Sequence and structural determinants of the interaction between the 5'-noncoding region of picornavirus RNA and rhinovirus protease 3C. J. Biol. Chem. 270, 14510-14516. Webber, S. E., Tikhe, J., Worland, S. T., Fuhrmann, S. A., Hendrickson, T. F., Matthews, D. A., Love, R. A., Patick, A. K., Meador, J. W., Ferre, R. A., Brown, E. L., Delisle, D. M., Ford, C. E., and Binford, S. L. (1996). Design, synthesis and evaluation of nonpeptide inhibitors of human rhinovirus 3C proteinase. J. Med. Chem. 39, 5072-5082.
Weidner, J. R., and Dunn, B. M. (1991). Development of synthetic peptide substrates for the poliovirus 3C proteinase. Arch. Biochem. Biophys. 286, 402-408. Whiting, A. K., and Peticolas, W. L. (1994). Details of the acyl-enzyme intermediate and the oxyanion hole in serine protease catalysis. Biochemistry 33, 552-561. Wimmer, E. (1982). Genome linked proteins of viruses. Cell 28, 199-201. Wimmer, E., Hellen, C. U. T., and Cao, X. (1993). Genetics of poliovirus. Ann. Rev. Genet. 27, 353-436. Wirblich, C., Sibilia, M., Boniotti, M. B., Rossi, C., Thiel, H.-J., and Meyers, G. (1995). 3C-like protease of rabbit hemorrhagic disease virus: identification of cleavage sites in the ORF1 polyprotein and analysis of cleavage specificity. J. Virol. 69, 7159-7169. Xiang, W. S., Harris, K. S., Alexander, L., and Wimmer, E. (1995). Interaction between the 5'-terminal cloverleaf and 3AB/3CDpro of poliovirus is essential for RNA replication. J. Virol. 69, 3658-3667. Yalamanchili, D., Weidman, K., and Dasgupta, A. (1997). Cleavage of transcriptional activator Oct-1 by poliovirus encoded protease 3Cpro. Virology 239, 176-185. Ypma-Wong, M. F., Dewalt, P. G., Johnson, V. H., Lamb, J. G., and Semler, B. L. (1988). Protein 3CD is the major poliovirus proteinase responsible for cleavage of the P1 capsid precursor. Virology 166, 265-270.
Proteases as Drug Targets for the Treatment of Malaria
COLIN BERRY
Cardiff School of Biosciences, Cardiff University, Cardiff CF1 3US, Wales, UK
I. Introduction
II. Proteases in Malaria Parasites
III. Current Antimalarial Agents with Effects in Parasite Proteolytic Enzymes
IV. Concluding Remarks
References
I. INTRODUCTION
A. OCCURRENCE OF MALARIA
Every minute, approximately four children die from malaria. Globally, this culminates in 3 million deaths per year from up to 500 million clinical cases and it has been estimated that more than 2 billion people, over 40% of the world's population, are at risk from the disease (Najera and Hempel, 1996). This means that malaria causes almost as many fatalities each year as the total AIDS death toll over the past 15 years. Malaria occurs throughout the tropics (Fig. 1), limited by the distribution of mosquitoes of the genus Anopheles, the intermediate vectors, which spread the parasites from person to person. The incidence of this disease is now increasing owing to several factors including (1) increased resistance of the parasites to current antimalarial drugs, (2) increased resistance of mosquitoes to insecticides, and (3) increased size of endemic regions (e.g., because of deforestation and movement of populations into cities).
Proteases of Infectious Agents. Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.
FIGURE 1 Regions of the world where malaria is endemic.
The problem of malaria can be tackled on two fronts: (1) by attacking the mosquito vector to reduce rates of transmission and (2) by prevention and treatment of infection in the human host. Measures such as insecticide spraying and the use of insecticide-impregnated bednets have produced some reduction in transmission although insect resistance is a developing problem. The prospects for human immunization against malaria have received much attention but suitable vaccines have not been produced as yet. At present, therefore, our defence against the parasite relies on the use of several classes of prophylactic or curative drugs. The mechanism of action of some of these drugs is known (e.g., the inhibitors of folate synthesis, the sulfonamides and sulfones, and the folate antagonists proguanil and pyrimethamine). However, the activity of many agents, including chloroquine, the mainstay of antimalarial chemotherapy for over 40 years, is poorly understood. Unfortunately, the effectiveness of our current arsenal of antimalarial compounds is increasingly compromised by the spread of resistant parasites. This makes it essential that new targets are sought to intervene in crucial biochemical pathways in the parasite. Inhibitors for the specific blockade of such targets may then be designed to develop the next generation of drugs to combat this scourge of human health.
B. MALARIAL LIFE CYCLE
Malaria in humans is caused by four species of protozoan parasites in the genus Plasmodium. Plasmodium ovale and Plasmodium malariae cause relatively uncommon infections. Plasmodium vivax and Plasmodium falciparum are the most common and P. falciparum accounts for by far the greatest number of deaths. The life cycle of the malarial parasites is complex and has many distinct phases. In humans, where antimalarial drug intervention must occur, these stages can be summarized briefly as follows. With a bite from an infected Anopheles mosquito, the parasite in its sporozoite stage is injected into the human host. In this form, the parasite migrates through the blood stream to the liver where it invades hepatocytes to begin the intrahepatic phase of the cycle. All P. falciparum and P. malariae cells and many of the cells of P. vivax and P. ovale then develop into hepatic trophozoites, which grow and divide to release the merozoite stage into the blood stream. However, P. vivax and P. ovale have an extra life cycle stage that may occur in the liver and some of the parasites of these species may enter the dormant hypnozoite stage. The hypnozoites may remain in the liver for months or years after initial infection before they, in turn, develop to release merozoites into the blood. The merozoites invade red blood cells to initiate the intraerythrocytic phase of the life cycle. After invasion, the parasites are termed "ring stage." These grow and develop into trophozoites that divide to form schizonts, burst the red blood cells, and release more daughter merozoites, which can in turn invade further erythrocytes. It is during the erythrocytic cycle of infection that symptoms of malaria first appear. The lysis of red cells to release merozoites tends to become synchronized, with the consequent release of pyrogens giving rise to the characteristic cyclic fevers of this disease. Following release from red blood cells, a few merozoites go on to develop into male and female gametocytes that cannot develop further in the human host and will die if they are not taken up by another Anopheles mosquito to complete the life cycle.
II. PROTEASES IN MALARIA PARASITES
Like other eukaryotes, malarial parasites contain a range of proteolytic enzymes that play important roles in functions such as protein processing. In addition, the complex life cycle of these protozoa gives rise to a variety of specialized functions that may be mediated by endo- and exopeptidases. These functions include processing of major parasite surface antigens, host invasion, morphological changes between the distinct stages of the parasite life cycle, digestion of host-derived proteins to obtain nutrients for growth, and release from host cells (reviewed by Schrevel et al., 1990; McKerrow et al., 1993). All such specific parasite processes are potential targets for new antimalarial interventions and therefore parasite proteases have received much attention. To date, the proteolysis of hemoglobin has been examined most intensively and the findings from such studies are reviewed below.
A. PROTEINASES AND HEMOGLOBIN DEGRADATION
During the intraerythrocytic phase, parasites engulf red blood cell cytoplasm, which is then transported via a double-membrane-enclosed cytosome to the food vacuole (also known as the digestive vacuole), which has a single membrane. (For an excellent review of the metabolic role of the food vacuole, see Olliaro and Goldberg, 1995.) Within the latter lysosome-like acidic organelle, breakdown of the major red blood cell protein hemoglobin occurs to provide nutrients for parasite growth and development. In an established infection, this digestive process occurs on a very large scale such that 20% of red cells may be parasitized with 75% of host cell hemoglobin degraded. This can lead to the destruction of an estimated 100 g of hemoglobin during each cycle of erythrocyte infection (Goldberg et al., 1990). A by-product of hemoglobin destruction, heme, is released. Free heme lyses malaria parasites (Orjih et al., 1981) so the parasite detoxifies the high concentrations accumulated in the food vacuole by polymerizing the heme to form the so-called malaria pigment hemozoin. Three proteinases have been isolated from the food vacuoles of P. falciparum (Goldberg et al., 1991; Gluzman et al., 1994), one cysteine proteinase (falcipain) and two aspartic proteinases (plasmepsin I, EC 3.4.23.38, and plasmepsin II, EC 3.4.23.39). The individual role of each of these enzymes in the pathway is still controversial but inhibition of either cysteine or aspartic proteinase activity has been shown to cause growth inhibition and death of P. falciparum in red blood cells in culture (Rosenthal et al., 1988; Francis et al., 1994; Moon et al., 1997). As a result of the success of these and related studies, the Plasmodium cysteine and aspartic proteinases have been accepted as important drug targets by the World Health Organization (WHO, 1996).
Proteolysis in the food vacuole leads to the accumulation of a series of discrete peptides (Kolakovich et al., 1997). It appears, therefore, that a specific system must exist to transport these peptide products from the food vacuole to the parasite cytoplasm where further degradation to individual amino acids may be mediated (at least in part) by a cytosolic aminopeptidase (Vander Jagt et al., 1984; Curley et al., 1994; Kolakovich et al., 1997). The roles of each of the food vacuole enzymes and the aminopeptidase will be discussed below with reference to their potential as targets for novel antimalarial intervention.
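The scale of hemoglobin digestion quoted earlier in this section (20% of red cells parasitized, 75% of their hemoglobin degraded, roughly 100 g per cycle) can be checked with a short back-of-the-envelope calculation. The sketch below is illustrative only: the blood volume (5 L) and hemoglobin concentration (150 g/L) are typical adult values assumed here and are not taken from the text, and the function name is hypothetical.

```python
def hemoglobin_degraded_per_cycle_g(
    blood_volume_l=5.0,         # ASSUMPTION: typical adult blood volume, not from the text
    hemoglobin_g_per_l=150.0,   # ASSUMPTION: typical adult hemoglobin concentration
    fraction_parasitized=0.20,  # from the text: 20% of red cells parasitized
    fraction_degraded=0.75,     # from the text: 75% of host cell hemoglobin degraded
):
    """Estimate grams of hemoglobin destroyed in one erythrocytic cycle."""
    total_hemoglobin_g = blood_volume_l * hemoglobin_g_per_l
    return total_hemoglobin_g * fraction_parasitized * fraction_degraded


if __name__ == "__main__":
    estimate = hemoglobin_degraded_per_cycle_g()
    print(f"Estimated hemoglobin degraded per cycle: {estimate:.1f} g")
```

Under these assumed host values the estimate comes out at a little over 100 g, the same order of magnitude as the figure attributed to Goldberg et al. (1990); different assumptions about the host would shift the result proportionally.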
B. FALCIPAIN: A MALARIAL CYSTEINE PROTEINASE
1. Effect of Cysteine Proteinase Inhibitors on Parasites
To investigate the role of various proteinases in hemoglobin degradation within trophozoite stage parasites, Rosenthal et al. (1988) studied the effect of inhibitors on P. falciparum growing in red blood cells in culture. Food vacuoles with abnormal morphology were observed in parasites after 6 h incubation with the cysteine proteinase inhibitors leupeptin (20-100 µM) or L-trans-epoxysuccinyl-leucylamido-(4-guanidino)-butane (E64, 140 µM). Parasite differentiation and multiplication were also inhibited so that very few parasites progressed to reinvade further red blood cells to form new ring stages. Further examination of the vacuoles showed that they were completely filled with undigested erythrocyte cytoplasm (Rosenthal et al., 1988), suggesting a crucial role for cysteine proteinase(s) in the digestive pathway. The food vacuole abnormality was a specific consequence of the inhibition of cysteine proteinase activity rather than a general symptom of toxicity in Plasmodium, since other compounds including aspartic proteinase inhibitors (Rosenthal et al., 1988) and antimalarial drugs including chloroquine, mefloquine, and quinine (Rosenthal, 1995), did not produce the same morphological changes.
2. Possible Roles for Falcipain in Hemoglobin Degradation
That falcipain plays a crucial role in hemoglobin degradation and that its inhibition is fatal to malarial parasites is not in doubt. The precise role of the enzyme in the degradative pathway is, however, the subject of some speculation. Rosenthal and his co-workers have proposed that falcipain is involved in the early stages of the catabolic pathway and have suggested a role in initial hemoglobin denaturation and heme release (Gamboa de Dominguez and Rosenthal, 1996). This may be supported by the findings of Asawamahasakda et al. (1994), who showed that E64 inhibited the formation of hemozoin to a greater extent than the aspartic proteinase inhibitor pepstatin.
Subsequent involvement of falcipain in the first stages of globin digestion has also been inferred from the accumulation of undigested globin in the food vacuoles of parasites treated with cysteine proteinase inhibitors (Rosenthal et al., 1991; Rosenthal, 1995). Support for this model is derived from experiments indicating that falcipain is able to degrade native hemoglobin (Salas et al., 1995). These assays were performed in the presence of reducing agents (typically 10 mM dithiothreitol) and this was considered to mimic the reducing effects of glutathione which might be provided by the erythrocyte cytoplasm and which also stimulates the activity of falcipain (Rosenthal et al., 1988). Francis et al. (1996) also showed that naturally occurring falcipain isolated from parasite food vacuoles could degrade native hemoglobin in the presence of reducing agents. In contrast, in the absence of reducing agents, native hemoglobin is not digested by falcipain, although acid-denatured globin is still broken down and the specific peptide fragments produced have been characterized (Gluzman et al., 1994). This suggests that although reducing agents do produce some increase in activity of falcipain (Rosenthal et al., 1988), their major importance in the cleavage of native hemoglobin may be due to their effects on the structure of the substrate (Francis et al., 1996). Indeed, Gamboa de Dominguez and Rosenthal (1996) and Francis et al. (1996) have shown that at the pH of the food vacuole (pH 5.0 to 5.4) hemoglobin is denatured only in the presence of reducing agents. The reducing potential of the food vacuole is therefore an important factor which influences the susceptibility of hemoglobin to attack by falcipain. Salas et al. (1995) proposed that ingested erythrocyte glutathione may provide the reducing environment necessary for hemoglobin to be cleaved. However, Francis et al. (1996) suggested that the levels of catalase present in the food vacuoles may be sufficient to protect hemoglobin from thiol-mediated denaturation, a process which is peroxide mediated. In the latter model, initial cleavage of the substrate is accredited to the action of aspartic proteinases (see below). Accumulation of undegraded hemoglobin in parasites treated in culture with cysteine proteinase inhibitors is then explained as follows. The action of the aspartic proteinases may lead to a build-up of peptide fragments (Kolakovich et al., 1997) that may no longer be broken down into amino acids and peptides for export from the vacuole while the cysteine proteinase is inhibited. As a consequence, these peptides may cause a hyperosmotic potential in the food vacuole which would in turn bring about the influx of water, swelling of the vacuole, and finally lead to a dysfunctional organelle in which catabolism no longer occurred so that native hemoglobin would accumulate. These differing views of the role of falcipain in the hemoglobin catabolic pathway remain to be resolved. However, it is clear from the studies of both Rosenthal and Goldberg that inhibition of falcipain activity is lethal to malarial parasites and thus this enzyme is a potential target for antimalarial drug design.
3. Characterization of Falcipain
Cysteine proteinase activity in trophozoite stage parasites was initially analyzed by nonreducing, gelatin-substrate PAGE, with and without the inhibitors E64 or leupeptin (Rosenthal et al., 1988). These experiments confirmed earlier observations of a trophozoite cysteine proteinase (TCP) of approximately 28 kDa (Rosenthal et al., 1987). The food vacuole was identified as the location of this activity by demonstrating the accumulation of [3H]leupeptin in these organelles. Rosenthal et al. (1988) and Gluzman et al. (1994) have isolated a cysteine proteinase, falcipain, from the food vacuoles of P. falciparum. Data on the localization of falcipain and TCP and their substrate specificities have indicated that they are likely to be the same enzyme (Salas et al., 1995; Francis et al., 1996), although this remains to be proven rigorously. In vitro assays using fluorogenic peptides showed that the activity of falcipain was stimulated by the presence of sulfhydryl agents and inhibited reversibly by leupeptin and irreversibly by E64. The peptide Z-Phe-Arg-AMC was the
best substrate tested in these studies (Table I) and this substrate preference led to the conclusion that falcipain might be a cathepsin L-like proteinase (Rosenthal et al., 1988, 1989). Subsequently, a gene encoding a 569-amino-acid proenzyme in the cathepsin L family was identified in P. falciparum (Rosenthal and Nelson, 1992) and other Plasmodium species (Rosenthal, 1993, 1996; Rosenthal et al., 1993b). This zymogen is predicted to be activated to form a 26.8-kDa enzyme which is believed to be falcipain. Although other genes encoding cysteine proteinases are known in P. falciparum (Knapp et al., 1989, 1991; Li et al., 1989; Berti and Storer, 1995; Francis et al., 1996), Northern blot analysis has shown that only the falcipain gene has an expression pattern consistent with the trophozoite cysteine proteinase, as its mRNA is expressed during the ring stage and, at much lower levels, in the trophozoite stage (Rosenthal and Nelson, 1992; Francis et al., 1996). The availability of the gene encoding the falcipain precursor permitted the production of active recombinant falcipain in a baculovirus expression system (Salas et al., 1995). The recombinant protein had a pH profile of activity similar to trophozoite cysteine proteinase with an optimum in the range pH 5.5 to 6.0 and was shown to be able to degrade hemoglobin in the presence of reducing agents. Nevertheless, the protein produced in baculovirus, as assessed by gelatin PAGE, migrated as two bands (consistent with molecular weights of 55 and 45 kDa respectively) rather than the 28-kDa band characteristic of naturally occurring falcipain from P. falciparum. These higher-molecular-weight forms may be a result of incomplete processing of the zymogen (Salas et al., 1995) and may explain the very different kinetic properties (Table II) of the recombinant protein (Salas et al., 1995) and the naturally occurring enzymes (Rosenthal et al., 1989; Francis et al., 1996).
TABLE I Relative Rates of Cleavage of AMC Peptide Substrates by Trophozoite Extract^a
AMC peptide        Relative activity
Z-Phe-Arg          100
Z-Val-Leu-Arg      37
Z-Arg-Arg          20
Z-Leu              7
Z-Phe-Pro-Arg      7
Z-Phe              4
Z-Ala-Arg-Arg      3
^a Results are normalized to 100 for the most effective substrate. Data from Rosenthal et al. (1988).
TABLE II Kinetic constants for the hydrolysis of peptide substrates by naturally occurring (N.O.) and recombinant (Recomb.) forms of falcipain^a
Substrate            Falcipain source   Km (µM)   kcat (sec^-1)   kcat/Km (M^-1 sec^-1)
Z-Val-Leu-Arg-AMC    Recomb.            4         0.25            62,500
Z-Val-Leu-Arg-AMC    N.O.               5         0.01            2,000
Z-Phe-Arg-AMC        Recomb.            28        0.02            720
Z-Phe-Arg-AMC        N.O.               43        0.09            2,000
^a Adapted from Francis et al. (1996).
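The kcat/Km column of Table II follows directly from the other two columns once Km is converted from micromolar to molar. The short sketch below recomputes it as a consistency check; the dictionary layout and function names are illustrative choices made here, and the pairing of each row of values with its substrate follows the order in which the table lists them.

```python
# Values transcribed from Table II: (Km in micromolar, kcat in 1/sec).
TABLE_II = {
    ("Z-Val-Leu-Arg-AMC", "Recomb."): (4.0, 0.25),
    ("Z-Val-Leu-Arg-AMC", "N.O."): (5.0, 0.01),
    ("Z-Phe-Arg-AMC", "Recomb."): (28.0, 0.02),
    ("Z-Phe-Arg-AMC", "N.O."): (43.0, 0.09),
}


def specificity_constant(km_micromolar, kcat_per_sec):
    """Return kcat/Km in M^-1 sec^-1, converting Km from micromolar to molar."""
    km_molar = km_micromolar * 1e-6
    return kcat_per_sec / km_molar


def specificity_table(table=TABLE_II):
    """Compute kcat/Km for every (substrate, enzyme source) entry."""
    return {key: specificity_constant(km, kcat) for key, (km, kcat) in table.items()}


if __name__ == "__main__":
    for (substrate, source), value in specificity_table().items():
        print(f"{substrate:<20} {source:<8} kcat/Km = {value:,.0f} M^-1 sec^-1")
```

The recomputed values for the first substrate match the table exactly; those for the second come out near 714 and 2,093, which agree with the tabulated 720 and 2,000 to within rounding of the published constants.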
4. Antimalarial Action of Falcipain Inhibitors
Initial studies with the relatively nonspecific inhibitors E64 and leupeptin showed that these compounds could cause the death of malaria parasites in red blood cells in culture (Rosenthal et al., 1988). Subsequently, the activity of peptide fluoromethyl ketone cysteine proteinase inhibitors was assessed against trophozoite extracts and against parasites in culture (Rosenthal et al., 1991). The ability of each of these inhibitors to kill parasites was well correlated with their effectiveness at inhibiting cysteine proteinase activity in the trophozoite extracts. The compound Z-Phe-Arg-CH2F, a potent inhibitor of cathepsin L, was the most effective inhibitor tested. Thus, the identification of falcipain as a cathepsin L family proteinase was further confirmed. With the identification of the falcipain homolog in the murine malarial parasite Plasmodium vinckei (Rosenthal, 1993), an animal model was developed for the testing of cysteine proteinase inhibitors as antimalarial agents (Rosenthal et al., 1993a). Despite the fact that the enzyme from P. vinckei was generally less susceptible to a range of inhibitors than the P. falciparum falcipain, inhibitors such as morpholine urea-Phe-Hphe-CH2F (Mu-Phe-Hphe-CH2F) were still effective against P. vinckei parasite extracts in vitro. As a result, this inhibitor was administered to infected mice to assess its effect on the activity of falcipain in vivo and on parasitemia. Falcipain activity isolated from parasites from treated animals was shown to be reduced by >90%, 2 h posttreatment with this irreversible inhibitor. Furthermore, after 4 days at 100 mg/kg 4 times per day, 80% of mice were cured of parasitemia (Rosenthal et al., 1993a). The inhibitor Mu-Phe-Hphe-CH2F is not active specifically against falcipain (IC50 3 and 5 nM against the P. falciparum and P.
vinckei enzymes, respectively); it also inhibits the host enzymes cathepsin L (IC50 3 nM) and cathepsin B (IC50 3 nM). Nevertheless, the effects of this compound on the murine host appeared to be relatively mild; lethargy was noted and skin ulcers occurred at the site of subcutaneous administration but both of these side-effects resolved quickly when treatment was discontinued (Rosenthal et al., 1993a). Therefore, although Mu-Phe-Hphe-CH2F is clearly not usable as a drug to tackle human malaria, the principle that inhibition of falcipain can lead to a cure for parasitemia in vivo was established by these studies. The lack of correlation between toxicity in the host and inhibition of host cysteine proteinases for the fluoromethyl ketone inhibitors led Rosenthal et al. (1996) to speculate that the side-effects may have resulted from the production of toxic metabolites rather than host proteinase inhibition. Therefore, a new series of peptide-based inhibitors was tested in which the fluoromethyl ketone leaving group was replaced by a vinyl sulfone group (VSPh). The compound Mu-Phe-Hphe-VSPh was a weaker inhibitor (IC50 80 nM) of P. falciparum falcipain than the fluoromethyl ketone analog Mu-Phe-Hphe-CH2F (3 nM). Substitution of Leu for Phe in the compound Mu-Leu-Hphe-VSPh produced an inhibitor with an IC50 of 3 nM for falcipain, which caused hemoglobin accumulation and inhibition of parasite development in culture in the 10-30 nM range (Rosenthal et al., 1996). This vinyl sulfone-containing peptidomimetic showed no apparent toxicity or pathology in rats given up to 30 mg/kg daily for 28 days and thus would appear to be safer for use than the fluoromethyl ketone compounds. The above studies (Rosenthal et al., 1993a, 1996), using the cysteine proteinase from P. falciparum and P. vinckei, raise an important issue for the development of antimalarial inhibitors. Although P. falciparum causes the most deadly form of malaria in humans, the financial investment necessary for drug development would be likely to dictate that any compound produced should have the widest possible application and therefore should be effective not only against P. falciparum but also against the three other parasites that cause human malaria (P. vivax, P. malariae, and P. ovale). Comparison of inhibitor binding to falcipain and its homolog from P.
vinckei shows a variation in IC50 of more than 50-fold in some cases. It remains to be seen whether an inhibitor can be developed with characteristics to allow effective inhibition of falcipains from all four human Plasmodium parasites and still be selective enough to cause no host toxicity. The production and assay of the falcipain homologs from P. vivax, P. malariae, and P. ovale will be essential in future inhibitor development studies.
5. Computer-Aided Design of Novel Falcipain Inhibitors
Characterization of the gene encoding the falcipain precursor and derivation of the amino acid sequence of the protein (Rosenthal and Nelson, 1992) paved the way for generation of a computer model for mature falcipain, based on the X-ray structures of papain and actinidin (Ring et al., 1993). Identification of profalcipain genes from different Plasmodium species (Rosenthal et al., 1993b; Rosenthal, 1996) has allowed the identification of conserved residues that appear to be characteristic of the falcipains and that are not present in other
papain family proteinases. This information facilitates design steps to produce the lead compounds for drug development. The model of falcipain allowed the use of the DOCK 3.0 program to screen a small molecule database for moieties that might fit the active site of the enzyme (Ring et al., 1993). Over 2000 compounds were selected from an initial screening and were then judged manually for those most likely to produce a good interaction. Finally, 31 compounds were chosen for assay to determine their abilities to inhibit cysteine proteinase activity in trophozoite extracts. Four showed IC50 values of