THE COMPLEMENT FactsBook
Other books in the FactsBook Series:
Robin Callard and Andy Gearing, The Cytokine FactsBook
Steve Watson and Steve Arkinstall, The G-Protein Linked Receptor FactsBook
Rod Pigott and Christine Power, The Adhesion Molecule FactsBook
Shirley Ayad, Ray Boot-Handford, Martin J. Humphries, Karl E. Kadler and C. Adrian Shuttleworth, The Extracellular Matrix FactsBook, 2nd edn
Grahame Hardie and Steven Hanks, The Protein Kinase FactsBook; The Protein Kinase FactsBook CD-Rom
Edward C. Conley, The Ion Channel FactsBook I: Extracellular Ligand-Gated Channels
Edward C. Conley, The Ion Channel FactsBook II: Intracellular Ligand-Gated Channels
Edward C. Conley and William J. Brammar, The Ion Channel FactsBook IV: Voltage-gated Channels
Kris Vaddi, Margaret Keller and Robert Newton, The Chemokine FactsBook
Marion E. Reid and Christine Lomas-Francis, The Blood Group Antigen FactsBook
A. Neil Barclay, Marion H. Brown, S.K. Alex Law, Andrew J. McKnight, Michael G. Tomlinson and P. Anton van der Merwe, The Leucocyte Antigen FactsBook, 2nd edn
Robin Hesketh, The Oncogene and Tumour Suppressor Gene FactsBook, 2nd edn
Jeffrey K. Griffith and Clare E. Sansom, The Transporter FactsBook
Tak W. Mak, Josef Penninger, John Rader, Janet Rossant and Mary Saunders, The Gene Knockout FactsBook
Steven G.E. Marsh, Peter Parham and Linda D. Barber, The HLA FactsBook
THE COMPLEMENT FactsBook Bernard J. Morley Mark J. Walport Imperial College School of Medicine Hammersmith Campus, London, UK
ACADEMIC PRESS A Harcourt Science and Technology Company
San Diego San Francisco New York Boston London Sydney Tokyo
This book is printed on acid-free paper. Copyright © 2000 by ACADEMIC PRESS All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Academic Press A division of Harcourt Science and Technology Company 24-28 Oval Road, London NW1 7DX, UK http://www.hbuk.co.uk/ap/ Academic Press 525 B Street, Suite 1900, San Diego, California 92101-4495, USA http://www.apnet.com ISBN 0-12-733360-6 Library of Congress Catalog Card Number: 99-65744 A catalogue for this book is available from the British Library
Typeset by Mackreth Media Services, Hemel Hempstead, UK Printed in Great Britain by Redwood Books, Trowbridge, Wiltshire 00 01 02 03 04 RB 9 8 7 6 5 4 3 2 1
Contents
Abbreviations
Preface
Section I THE INTRODUCTORY CHAPTERS
  Chapter 1 Introduction, Bernard J. Morley and Mark J. Walport
  Chapter 2 The Complement System, Bernard J. Morley and Mark J. Walport
Section II THE COMPLEMENT PROTEINS
Part 1 C1q and the Collectins
  C1q, Franz Petry and Michael Loos
  Mannose-binding lectin, Peter Lawson and K.B.M. Reid
  Bovine conglutinin, Peter Lawson and K.B.M. Reid
  SP-A, Robert B. Sim
  SP-D, Robert B. Sim
Part 2 Serine Proteases
  C1r, Nicole Thielens and Gerard J. Arlaud
  C1s, Nicole Thielens and Gerard J. Arlaud
  MASP-1, Teizo Fujita, Yuichi Endo and Misao Matsushita
  MASP-2, Steen V. Petersen and Jens C. Jensenius
  Factor D, Jurg Schifferli and Sylvie Miot
  C2, Yuanyuan Xu and John E. Volanakis
  Factor B, Antonella Circolo and Harvey R. Colten
  Factor I, Bernard J. Morley
Part 3 C3 Family
  C3, Marina Botto
  C4, David E. Isenman
  C5, Rick A. Wetsel
Part 4 Terminal Pathway Components
  C6, Michael Hobart
  C7, Michael Hobart
  C8, Francesco Tedesco, Mnason E. Plumb and James M. Sodetz
  C9, B. Paul Morgan
Part 5 Regulators of Complement Activation (RCA)
  CR1, Lloyd B. Klickstein and Joann M. Moulds
  CR2, Joel M. Guthridge and V. Michael Holers
  Decay-accelerating factor, L. Kuttner-Kondo, W.G. Brodbeck and M.E. Medof
  Membrane cofactor protein, M. Kathryn Liszewski and John P. Atkinson
  C4b-binding protein, Santiago Rodriguez de Cordoba, Olga Criado Garcia and Pilar Sanchez-Corral
  Factor H, Richard G. DiScipio
Part 6 Cell Surface Receptors
  C1qRp, Andrea J. Tenner
  C3a receptor, Robert S. Ames
  C5a receptor, Andreas Klos and Wilfried Bautsch
  CR3, Yu Xia and Gordon D. Ross
  CR4, Alex Law
Part 7 Miscellaneous Complement Components
  C1 inhibitor, Rana Zahedi and Alvin E. Davis III
  Apolipoprotein J (clusterin), Mark E. Rosenberg
  Properdin, Timothy Farries
  CD59, B. Paul Morgan
Index
Abbreviations
C1INH      C1 inhibitor
C4BP       C4b-binding protein
CRD        carbohydrate-recognition domain
CRP        C-reactive protein
DAF        decay-accelerating factor
EBV        Epstein-Barr virus
EGF        epidermal growth factor
FGF        fibroblast growth factor
fMLP       formyl-methionyl-leucyl-phenylalanine
GPI        glycosylphosphatidylinositol
HIV        human immunodeficiency virus
IFNγ       interferon γ
Ig         immunoglobulin
IL-1       interleukin 1
LAD        leukocyte adhesion deficiency
LPS        lipopolysaccharide
MAC        membrane attack complex
MBL        mannose-binding lectin
MCP        membrane cofactor protein
MHC        major histocompatibility complex
MIDAS      metal ion-dependent adhesion site
Mr (K)     relative molecular mass
NK         natural killer
PDGF       platelet-derived growth factor
PMA        phorbol myristate acetate
PMN        polymorphonuclear leukocyte
PTK        protein tyrosine kinase
RaRF       Ra-reactive factor
RFLP       restriction fragment length polymorphism
SAP        serum amyloid protein
SDS-PAGE   polyacrylamide gel electrophoresis in sodium dodecyl sulfate
SLE        systemic lupus erythematosus
TGFβ       transforming growth factor β
TNFα       tumour necrosis factor α
VNTR       variable number tandem repeat
VWF        von Willebrand factor
Preface The authors wish to thank all those who contributed entries for this volume and for their comments and suggestions. In addition, we are indebted to a number of contributors for additional information they provided: Dr Robert Sim for Figure 2 in Chapter 2, Dr David Isenman for the C3 and C4 catabolism diagrams and Dr Robert Ames for the C3a and C5a receptor diagrams. We would also like to thank Dr James Sodetz for advice on nomenclature, and Dr Alex Law for providing much of the information used in the CR3 chapter on deficiency and polymorphism, including unpublished data. We would like to thank Dr Robert Sim for critical reading of the introduction and Jane Rose for prolific proofreading. Finally, we would like to thank Dr Lilian Leung for her encouragement in the final stages of the preparation of this book. The field of complement is rapidly changing with the constant addition of new data. In light of this, we would be grateful if readers could point out any errors, omissions or indeed new information which could then be incorporated into future editions of this book. Please send these to the Editor, The Complement FactsBook, Academic Press, 24-28 Oval Road, London NW1 7DX, UK.
Bernard J. Morley
Mark J. Walport
Section I
THE INTRODUCTORY CHAPTERS
1 Introduction AIMS AND SCOPE OF THE BOOK The aim of this book is to present concise biochemical information about the proteins of the complement system. A novel aspect of this book compared with others in the FactsBook series is the inclusion of cDNA structure and intron-exon boundary details. This enables the design of primers for DNA amplification by the polymerase chain reaction, facilitating both functional mutation studies and the design of probes for expression work. The focus of the book is on the human system, though accession numbers have been included for other species. In the case of conglutinin, where no human homologue has been identified, the bovine molecule has been described. The complement proteins are largely built up from protein modules and it is therefore quite easy to divide them into families of structurally related molecules. This is the basis for the separate chapters. A few proteins escape such simple classification (C1 inhibitor, apolipoprotein J (clusterin), properdin and CD59) and these have been grouped together in a separate chapter.
ORGANIZATION OF THE DATA Entries are classified into the following sections, each of which is briefly described.
Other names Entries are identified by the accepted nomenclature for the complement system as described [1,2]. More recently characterized components are entered under their most commonly used name. Historically, many of the complement proteins have been known by alternative names, or were identified as members of other protein families. Hence different researchers may know them by different designations. All of these alternatives have been included.
Physicochemical properties This section includes data on the number of amino acids in the mature protein and leader peptide (if present); the pI; the molecular weight, both observed under reduced and non-reduced conditions, and predicted based on amino acid composition; the number and location of putative N-glycosylation sites, and if known, whether the sites are occupied; and the number and location of interchain disulfide bonds. Intrachain disulfide bonds are not listed, nor are O-linked glycosylation sites, though the latter are mentioned in the structure section.
Structure Details of the three-dimensional structure where known are included in this section together with any other significant features.
Function The mechanism of activation of the molecule is detailed in this section, together with a brief description of its role in the complement pathway. Other functional activity outside the complement pathway is also mentioned. The modular structure of each protein is illustrated and the functional importance of each module noted. A key for the common protein modules is provided in Table 1, together with their full names and the abbreviations [3] used throughout the text. Modules which are present in only a single protein in this book are indicated by a white box, and the nature of that module is indicated in the protein modules section of the particular entry. For non-modular proteins such as the C3a and C5a receptors, a diagram has been included only if this helps to illustrate important structural features. In the case of C3 and C4, a diagram has been included to show the degradation pathways of these proteins since this is pertinent to their function.

Table 1. Key to the schematic diagrams. All diagrams show modules to scale, with the key illustrating average sizes. [Symbol column not reproduced.]
PROTEIN MODULE                                                            ABBREVIATION
Complement control protein repeat                                         CCP
Serine protease domain                                                    -
Factor I/membrane attack complex C6/7 module                              FIMAC
Epidermal growth factor-like repeat                                       EGF
Calcium-binding epidermal growth factor-like repeat                       Ca2+ EGF
Von Willebrand factor type A                                              VWFA
Thrombospondin type 1 repeat                                              TSP1
Low density lipoprotein receptor class A repeat                           LDLRA
CUB domain (first identified in C1r/C1s, uEGF, bone morphogenic protein)  CUB
Membrane attack complex proteins/perforin-like segment                    MACPF
Collagen-like domain                                                      -
Carbohydrate-recognition domain                                           CRD
Alpha-helical coiled-coil "neck" region                                   -
Serine, threonine, proline-rich mucin-like domain                         STP
Cytoplasmic domain                                                        -
Transmembrane domain (separate symbol for C3aR and C5aR)                  -
Glycosylphosphatidylinositol anchor                                       GPI anchor
Other domains (see individual sections)
Scale bar: 200 amino acids
Tissue distribution For the secreted proteins, the typical serum concentration is provided and other biological fluids known to contain the protein are indicated. The primary site of synthesis is given, together with secondary sites. These are not meant to be exhaustive lists of cells expressing a given protein. In many cases, C3 for example, a large number of cell types have been assayed for expression. However, the absence of a cell or tissue from the list should not be taken as evidence that there is no expression from that cell type. For cell surface proteins, cell types which have been clearly demonstrated to express the molecule are listed.
Regulation of expression Stimuli which alter protein expression are described. Mechanisms, if known, are detailed.
Protein sequence The sequence is shown in the single letter amino acid code. Numbering starts with the initiator methionine residue. The leader sequence is underlined, as are cleavage sites between chains and any special features of specific molecules, for example the residues which form the thioester bond in C3/C4 and the transmembrane domains of the C3a and C5a receptors. Putative and known N-linked glycosylated sites are indicated by N. Sites known not to be occupied are not indicated.
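As an illustration of how the numbering and glycosylation conventions above can be applied, the following Python sketch lists putative N-linked glycosylation sites in a sequence given in the single letter code. It assumes the commonly used N-X-S/T sequon (X being any residue except proline), which is not spelled out in this chapter; the function name and the toy sequence are hypothetical. A putative site found this way need not be occupied.

```python
import re


def putative_n_glyc_sites(protein: str, first_position: int = 1) -> list[int]:
    """Return positions of Asn residues that start an N-X-S/T sequon (X != Pro).

    Positions are counted from `first_position`, so a full-length sequence
    numbered from the initiator methionine uses the default of 1.
    """
    # The lookahead leaves X and S/T unconsumed, so overlapping sequons
    # such as N-N-T-S are all reported.
    pattern = re.compile(r"N(?=[^P][ST])")
    return [m.start() + first_position for m in pattern.finditer(protein.upper())]


# Hypothetical toy sequence: N3 (NLT), N13 (NNT) and N14 (NTS) qualify,
# while N9 is followed by proline and is therefore skipped.
toy = "MANLTDEFNPSQNNTS"
print(putative_n_glyc_sites(toy))
```

Passing `first_position` allows a fragment to be scanned while still reporting positions in the numbering of the full-length protein.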
Protein modules For the protein modules listed in Table 1, the leader sequence and some important binding regions, the amino acid boundaries and exons are indicated. For C3 and C4, the thioester domain is indicated, while for the serine proteases, the position of the catalytic triad of the active site (H-D-S) is listed.
Chromosomal location The chromosomal location of the gene in both human and mouse, where known, is given. Closely linked genes are also indicated.
cDNA sequence The cDNA sequence is given. Where known, the sequence starts with the 5' end of the message. Otherwise, the most 5' sequence is given. All possible exons are included in the sequence. Where alternative splicing removes an exon from the mature message, this is noted. The initiation codon, termination codon and the putative polyadenylation signals are all indicated. In addition, exon-intron boundaries are shown by underlining the first five nucleotides in each exon. No intronic sequences are included. Where there are discrepancies in published sequences, these are indicated.
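A minimal sketch, assuming the standard genetic code, of how a protein sequence can be read from a cDNA sequence once the position of the initiation codon is known. The function name and the short example sequence are illustrative only. The start position is passed explicitly rather than searched for, since a message may have more than one possible initiation site.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna: str, start: int) -> str:
    """Translate from the 1-based position `start` to the first termination codon.

    Spaces (as used between blocks of ten nucleotides) are ignored.
    A character other than A, C, G or T raises KeyError.
    """
    seq = cdna.upper().replace(" ", "")
    residues = []
    for i in range(start - 1, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":  # termination codon
            break
        residues.append(residue)
    return "".join(residues)


# Illustrative fragment: three untranslated nucleotides, then ATG ... TAA.
print(translate("TAGATGTGCC TTGGGAGATA A", start=4))
```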
Genomic structure Where the structure of the human gene is known (with the exception of conglutinin, for which the bovine gene structure is given), this is drawn to scale. The gene is represented by a single horizontal line while the exons are indicated by vertical bars, also to scale. Only the first and last exons are numbered, together with a central exon for the larger genes.
Accession numbers Only the GenBank/EMBL accession numbers are included. These are listed as cDNA or genomic depending on the sequences they contain.
Deficiency The mode of inheritance of deficiency in humans is stated together with the functional effects of deficiency and any clinical correlates. The molecular basis is stated, for example in factor I:
A1282 to T, H418 to L; three chromosomes/patients/families
where
A is the normal nucleotide
1282 is the position in the presented cDNA sequence
T is the mutant nucleotide
H is the normal amino acid
418 is the position in the presented protein sequence
L is the mutant, non- or aberrantly functional amino acid
and 'three chromosomes/patients/families' represents the number of times this mutation has been described.
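The notation just described is regular enough to be read by a short program. The Python sketch below splits an entry such as the factor I example into its substitutions. The function and class names are hypothetical, and the pattern covers only the "X123 to Y" form shown in this section. It does not interpret the trailing count of chromosomes, patients or families.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    normal: str    # normal nucleotide or amino acid, e.g. "A" or "H"
    position: int  # position in the presented cDNA or protein sequence
    mutant: str    # mutant nucleotide or amino acid, or the word "stop"


# Letters, digits, the word "to", then letters: "A1282 to T", "R115 to stop".
_SUBSTITUTION = re.compile(r"([A-Za-z]+)(\d+)\s+to\s+([A-Za-z]+)")


def parse_substitutions(entry: str) -> list[Substitution]:
    """Extract every 'X123 to Y' substitution from a deficiency entry."""
    return [
        Substitution(normal, int(position), mutant)
        for normal, position, mutant in _SUBSTITUTION.findall(entry)
    ]


entry = "A1282 to T, H418 to L; three chromosomes/patients/families"
for substitution in parse_substitutions(entry):
    print(substitution)
```

Whether a given substitution refers to the cDNA or to the protein sequence is not encoded in the notation itself. In the example the nucleotide change comes first and the amino acid change second.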
Polymorphic variants Polymorphic variants at the protein level, at the level of restriction fragment length polymorphisms (RFLPs) or where the molecular basis is fully described are listed. Alleles are named A/B where A is the nucleotide/amino acid to the left of the numbering.
References A fully comprehensive list of references is not compatible with the format. However, each entry includes the major references, while key references are highlighted in bold. These represent either important work in the field or key reviews which will link to further references.
References
1 World Health Organization (1968) Bull. WHO 39, 935-938.
2 IUIS-WHO Nomenclature Committee (1981) J. Immunol. 127, 1261-1262.
3 Bork, P. and Bairoch, A. (1995) Trends Biochem. Sci. 20, Suppl. March C03.
2 The Complement System HISTORICAL PERSPECTIVE In the late nineteenth century, much scientific interest was focused on the mechanisms involved in protecting the body from attack by microorganisms. Two apparently contradictory theories of bacteriolysis emerged during this time. The first, the "cellular theory", stemmed from the work of Elie Metchnikoff, who demonstrated the existence of blood cells which could ingest invading bacteria. The second, the "humoral theory" of bacteriolysis, was based on work from Fodor, Nuttall and Buchner, who identified a heat-labile component of fresh, cell-free serum which was capable of bacteriolysis^. Buchner termed this activity "alexin", from the Greek "without a name". In 1894, Pfeiffer observed that cholera vibrios injected into the peritoneum of immune guinea pigs were lysed^. Towards the end of the nineteenth century, Bordet, working at the Pasteur Institute, extended this work by demonstrating that serum from immune animals lost its lytic activity after heating but that activity could be fully restored by the addition of non-immune serum. Bordet surmised that two factors were involved, one of which was heat-labile and the other was a stable substance present in immune serum^. The former he assumed was alexin while the latter he termed the "sensitizer". Meanwhile, Ehrlich and Morgenroth, examining erythrocyte haemolysis by immune serum, confirmed the idea that two "principles" were required for lysis. The first principle, which was present in a thermostable form in immune serum, they termed "amboreceptors" or "immune bodies". The second, a heat-labile substance present in the "body juices", they called "complement" due to the fact that it "complemented" the activity of the amboreceptors.
However, it was Bordet and Gengou who described the first complement fixation test, thereby establishing the quantitative role played by complement in cell lysis and dispelling the idea that it was merely an accessory factor as implied by Ehrlich's name. For this reason, Bordet is generally credited with the discovery of the complement system. In the absence of robust biochemical techniques, elucidation of the proteinaceous nature of complement and of the multiple components proceeded fairly slowly over the next 40 years. However, by the late 1920s, due to the work of Ferrata initially, and Coca and Gordon subsequently, four individual components were recognized. By 1941, Pillemer and co-workers had confirmed the proteinaceous nature of complement^. During the 1960s, Nelson characterized at least six components from guinea pig serum that were necessary for haemolytic activity^, while Müller-Eberhard and colleagues focused on the purification and characterization of each of these components^. Also in the 1960s, Ueno and later Mayer used a reconstitution assay, adding partially purified components to antibody-sensitized sheep red blood cells, to unravel the reaction sequence of the classical pathway. The identification of the alternative pathway involved many of the same investigators in another complex challenge. In 1953, Pillemer described the depletion of C3 from serum by zymosan in the absence of any effect on C1, C2 and C4 levels. He also identified properdin as an activating factor in what he termed the properdin pathway^. Nelson offered an alternative explanation for these data in 1958*. He proposed that the properdin system was actually the classical pathway, but activated via antibodies to zymosan. In 1971, Müller-Eberhard purified C3 proactivator and proposed the C3 activator system as an alternative method of complement activation^, thus supporting Pillemer's original hypothesis.
MODULAR STRUCTURE OF COMPONENTS The cloning and sequencing of the complement components in the last 20 years has augmented the extensive protein sequence already in existence and enabled protein structures to be identified. This has revealed the modular nature of the complement proteins and allowed their classification into five functional groups based on common structural motifs.
C1q and the collectins (Figure 1)
Figure 1. Modular structure of C1q and the collectins (SP-D, SP-A, C1q chains, conglutinin, MBL). See Table 1 for key. Additional domains are the globular region for C1q [...]
[...] >20 kb (e.g. intron 1). Note unusual distribution of exons with two clusters of closely spaced exons (2, 3 and 4; 7, 8 and 9) with other exons separated by very large introns. The approximate sizes of these large exons are indicated on the figure.
[Genomic structure diagram: approx. 100 kb in total; interval sizes marked on the figure: >20 kb, >10 kb, >5 kb, >5 kb, >20 kb, >15 kb]
Accession numbers Human Mouse Rat Rabbit Rainbow trout Puffer fish (Fugu)
X02176 K02766 Y08545 X05475 U52948 U20055 X05474 U87241
Deficiency^^'^^ Numerous cases reported. Prevalence of 1:1000 in the Japanese population. Probably much less common in white populations and other races. Causes increased susceptibility to infection with Neisseria, most commonly manifest as meningococcal meningitis.
C350 to T; R115 to stop; predominant in Japanese population
C166 to A; C53 to stop; white populations
C464 to T; R153 to stop^^; white populations
Polymorphic variants^^
An RFLP with the enzyme TaqI has been identified with two alleles (frequency in Spanish), A1 (0.74) = 6.5 kb and A2 (0.26) = 8.0 and 6.0 kb.
References
1 DiScipio, R.G. and Hugli, T.E. (1985) J. Biol. Chem. 260, 14802-14809.
2 DiScipio, R.G. (1993) Mol. Immunol. 30, 1097-1106.
3 Lengweiler, S. and Rickli, E.E. (1996) FEBS Lett. 380, 8-12.
4 Biesecker, G. et al. (1982) J. Biol. Chem. 257, 2584-2590.
5 Thielens, N.M. et al. (1988) J. Biol. Chem. 263, 6665-6670.
6 Dankert, J.R. et al. (1985) Biochemistry 24, 2754-2762.
7 DiScipio, R.G. et al. (1984) Proc. Natl Acad. Sci. USA 81, 7298-7302.
8 Stanley, K.K. et al. (1985) EMBO J. 4, 375-382.
9 Marazziti, D. et al. (1988) Biochemistry 27, 6529-6534.
10 Rogne, S. et al. (1991) J. Med. Genet. 28, 587-590.
11 Hobart, M.J. et al. (1995) J. Immunol. 154, 5188-5194.
12 Witzel-Schlomp, K. et al. (1997) J. Immunol. 158, 5043-5049.
13 Horiuchi, T. et al. (1998) J. Immunol. 160, 1509-1513.
14 Goto, E. et al. (1990) Nucleic Acids Res. 18, 5581.
Part 5 Regulators of Complement Activation (RCA)
CR1
Lloyd B. Klickstein, Brigham and Women's Hospital, Boston, MA, USA. Joann M. Moulds, University of Texas Medical School, Houston, TX, USA
Other names Complement receptor type 1, C3b/C4b receptor, CD35, immune adherence receptor.
Physicochemical properties CR1 is a type 1 integral membrane glycoprotein of 2044 amino acids^-^, of which the leader sequence comprises either 41 or 46 amino acids; there are two possible translation initiation sites. The C-terminal transmembrane region contains 25 amino acids and there are 43 residues in the C-terminal cytoplasmic domain. The N-terminal residue is blocked^'^^, compatible with pyrrolidone amide cyclization or N-terminal alkylation of Gln47. There are at least four major structural allotypes described in humans^^; the most common form is CR1*1 (F or A), and all further descriptions will focus on that human form except where specifically noted. The other forms are CR1*2 (B or S), CR1*3 (C or F') and CR1*4 (D).
pI 7.1
Mr (K) 205-250 (depending on cell source and electrophoresis system^-^^).
Allotype   Approx. Mr (reduced)   Approx. Mr (unreduced)
CR1*1      220-250                190-210
CR1*2      250-280                220-250
CR1*3      190-220                160-190
CR1*4      >280                   >250
CR1 from polymorphonuclear leukocytes migrates Mr (K) 5 larger than that from erythrocytes due to altered N-linked glycosylation^^.
N-linked glycosylation sites 25 (61, 161, 257, 320, 415, 452, 514, 583, 707, 770, 865, 902, 964, 1033, 1157, 1220, 1315, 1486, 1509, 1539, 1545, 1610, 1673, 1768, 1913)
N-linked glycosylation contributes approximately 20-25 K to the molecular weight of CR1*1^'^^-^^. Protein sequence data from erythrocyte CR1^ support occupancy of sites at 514 and 964. Similarly, sites at 320 and 770 are unoccupied. Occupancy of the other sites is unknown. There is no detectable O-linked glycosylation^^.
Structure CR1 has an extracellular region composed of a linear array of 30 CCP units of 59-75 residues each^-^. There are 120 cysteine residues and all are believed to be involved in disulfide links, based on structural homology to β2 glycoprotein I^. An extended linear structure has been confirmed by electron microscopy^^. The N-terminal 28 CCPs are further organized as four tandem, long homologous repeats of seven CCP units each^'^. The predicted transmembrane region was confirmed by deletion mutagenesis, which resulted in a soluble form of the protein^^'^^.
Function CR1 has long been recognized as the receptor for C3b and C4b fragments, and recently as a receptor for C1q^^. CR1 also binds iC3b, but relatively poorly^*. Human erythrocyte CR1 mediates binding of complement-opsonized immune complexes or microorganisms to the cell^. These bound complexes or particles are then carried through the bloodstream to the spleen or liver where they are removed^'^^^. CR1 on neutrophils and monocytes can mediate phagocytosis if the cells are primed or activated^^-^^. CR1 on B cells and dendritic cells participates in localization of antigen for presentation to T cells^^--^^. CR1 on all cell types is a cofactor for factor I-mediated cleavage of C3b to iC3b and C3f, and further cleavage of iC3b to C3c and C3d,g. CR1 is a cofactor for factor I-mediated cleavage of C4b to C4c and C4d. CR1 also accelerates the otherwise spontaneous decay of the C3 and C5 convertases of the classical pathway (C4b2a and C4b2a3b) as well as that of the corresponding alternative pathway convertases (C3bBb and C3bBbC3b)^'2,33. These activities may be either intrinsic or extrinsic (located on the same surface as the CR1 or not)-^^"^^.
Tissue distribution CR1 as a type 1 transmembrane protein is found on all erythrocytes, B cells, polymorphonuclear leukocytes, monocytes, follicular dendritic cells and glomerular podocytes and is also found on a subset of T cells^. CR1 is absent on NK cells^*. A soluble form is found in serum at a concentration reported at 30-60 ng/ml^^''^^; however, this is an overestimate as the monoclonal antibodies used have repeated epitopes in CR1^.
Regulation of expression CR1 is constitutively expressed on the previously mentioned cells. It is slowly lost from the surface of erythrocytes over the normal life of the cells. This loss is greatly accelerated in patients with immune complex diseases such as systemic lupus erythematosus'*^-^^ and is an acquired phenomenon, not an hereditary predisposition to illness"^^. Ninety per cent of neutrophil CR1 is intracellular^'^'^*, located in secretory vesicles distinct from azurophilic or specific granules^^. Upon neutrophil activation with chemotactic peptides or other stimuli, this intracellular CR1 is mobilized to the cell surface^^'^^.
Protein sequence3A50
MCLGRMGASS PRSPEPVGPP APGLPFCCGG SLLAVVVLLA LPVAWGQCNA   50
PEWLPFARPT NLTDEFEFPI GTYLNYECRP GYSGRPFSII CLKNSVWTGA  100
KDRCRRKSCR NPPDPVNGMV HVIKGIQFGS QIKYSCTKGY RLIGSSSATC  150
IISGDTVIWD NETPICDRIP CGLPPTITNG DFISTNRENF HYGSVVTYRC  200
NPGSGGRKVF ELVGEPSIYC TSNDDQVGIW SGPAPQCIIP NKCTPPNVEN  250
GILVSDNRSL FSLNEVVEFR CQPGFVMKGP RRVKCQALNK WEPELPSCSR  300
VCQPPPDVLH AERTQRDKDN FSPGQEVFYS CEPGYDLRGA ASMRCTPQGD  350
WSPAAPTCEV KSCDDFMGQL LNGRVLFPVN LQLGAKVDFV CDEGFQLKGS  400
SASYCVLAGM ESLWNSSVPV CEQIFCPSPP VIPNGRHTGK PLEVFPFGKA  450
VNYTCDPHPD RGTSFDLIGE STIRCTSDPQ GNGVWSSPAP RCGILGHCQA  500
PDHFLFAKLK TQTNASDFPI GTSLKYECRP EYYGRPFSIT CLDNLVWSSP  550
KDVCKRKSCK TPPDPVNGMV HVITDIQVGS RINYSCTTGH RLIGHSSAEC  600
ILSGNAAHWS TKPPICQRIP CGLPPTIANG DFISTNRENF HYGSVVTYRC  650
NPGSGGRKVF ELVGEPSIYC TSNDDQVGIW SGPAPQCIIP NKCTPPNVEN  700
GILVSDNRSL FSLNEVVEFR CQPGFVMKGP RRVKCQALNK WEPELPSCSR  750
VCQPPPDVLH AERTQRDKDN FSPGQEVFYS CEPGYDLRGA ASMRCTPQGD  800
WSPAAPTCEV KSCDDFMGQL LNGRVLFPVN LQLGAKVDFV CDEGFQLKGS  850
SASYCVLAGM ESLWNSSVPV CEQIFCPSPP VIPNGRHTGK PLEVFPFGKA  900
VNYTCDPHPD RGTSFDLIGE STIRCTSDPQ GNGVWSSPAP RCGILGHCQA  950
PDHFLFAKLK TQTNASDFPI GTSLKYECRP EYYGRPFSIT CLDNLVWSSP 1000
KDVCKRKSCK TPPDPVNGMV HVITDIQVGS RINYSCTTGH RLIGHSSAEC 1050
ILSGNTAHWS TKPPICQRIP CGLPPTIANG DFISTNRENF HYGSVVTYRC 1100
NLGSRGRKVF ELVGEPSIYC TSNDDQVGIW SGPAPQCIIP NKCTPPNVEN 1150
GILVSDNRSL FSLNEVVEFR CQPGFVMKGP RRVKCQALNK WEPELPSCSR 1200
VCQPPPEILH GEHTPSHQDN FSPGQEVFYS CEPGYDLRGA ASLHCTPQGD 1250
WSPEAPRCAV KSCDDFLGQL PHGRVLFPLN LQLGAKVSFV CDEGFRLKGS 1300
SVSHCVLVGM RSLWNNSVPV CEHIFCPNPP AILNGRHTGT PSGDIPYGKE 1350
ISYTCDPHPD RGMTFNLIGE STIRCTSDPH GNGVWSSPAP RCELSVRAGH 1400
CKTPEQFPFA SPTIPINDFE FPVGTSLNYE CRPGYFGKMF SISCLENLVW 1450
SSVEDNCRRK SCGPPPEPFN GMVHINTDTQ FGSTVNYSCN EGFRLIGSPS 1500
TTCLVSGNNV TWDKKAPICE IISCEPPPTI SNGDFYSNNR TSFHNGTVVT 1550
YQCHTGPDGE QLFELVGERS IYCTSKDDQV GVWSSPPPRC ISTNKCTAPE 1600
VENAIRVPGN RSFFSLTEII RFRCQPGFVM VGSHTVQCQT NGRWGPKLPH 1650
CSRVCQPPPE ILHGEHTLSH QDNFSPGQEV FYSCEPSYDL RGAASLHCTP 1700
QGDWSPEAPR CTVKSCDDFL GQLPHGRVLL PLNLQLGAKV SFVCDEGFRL 1750
KGRSASHCVL AGMKALWNSS VPVCEQIFCP NPPAILNGRH TGTPFGDIPY 1800
GKEISYACDT HPDRGMTFNL IGESSIRCTS DPQGNGVWSS PAPRCELSVP 1850
AACPHPPKIQ NGHYIGGHVS LYLPGMTISY TCDPGYLLVG KGFIFCTDQG 1900
IWSQLDHYCK EVNCSFPLFM NGISKELEMK KVYHYGDYVT LKCEDGYTLE 1950
GSPWSQCQAD DRWDPPLAKC TSRAHDALIV GTLSGTIFFI LLIIFLSWII 2000
LKHRKGNNAH ENPKEVAIHL HSQGGSSVHP RTLQTNEENS RVLP       2044
The leader sequence is underlined and the potential N-linked glycosylation sites are indicated (N).
Protein modules^'^^^^
Residues     Module                    Exon
1 or 6-46    Leader peptide            exon 1
47-106       CCP1, begin LHR-A         exon 2
107-168      CCP2                      exon 3/4
169-238      CCP3                      exon 5
239-300      CCP4                      exon 5
301-360      CCP5                      exon 6
361-423      CCP6                      exon 7/8
424-496      CCP7, end LHR-A           exon 9
497-556      CCP8, begin LHR-B         exon 10
557-618      CCP9                      exon 11/12
619-688      CCP10                     exon 13
689-750      CCP11                     exon 13
751-810      CCP12                     exon 14
811-873      CCP13                     exon 15/16
874-946      CCP14, end LHR-B          exon 17
947-1006     CCP15, begin LHR-C        exon 18
1007-1068    CCP16                     exon 19/20
1069-1138    CCP17                     exon 21
1139-1200    CCP18                     exon 21
1201-1260    CCP19                     exon 22
1261-1323    CCP20                     exon 23/24
1324-1399    CCP21, end LHR-C          exon 25
1400-1459    CCP22, begin LHR-D        exon 26
1460-1521    CCP23                     exon 27/28
1522-1591    CCP24                     exon 29
1592-1653    CCP25                     exon 29
1654-1713    CCP26                     exon 30
1714-1776    CCP27                     exon 31/32
1777-1851    CCP28, end LHR-D          exon 33
1852-1911    CCP29                     exon 34
1912-1972    CCP30                     exon 35
1977-2001    Transmembrane region      exon 36/37
2002-2044    Cytoplasmic region        exon 38
The ligand-binding sites are"^'^^'^^-^^:
47-300       CCPs 1-4      C4b-binding site (lower affinity for C3b)
497-750      CCPs 8-11     C3b-binding site (lower affinity for C4b)
947-1200     CCPs 15-18    C3b-binding site (lower affinity for C4b)
1400-1851    CCPs 22-28    A C1q-binding site
Chromosomal location Human56,57: 1q32. Telomere ... MCP ... CR1 ... CR2 ... DAF ... C4bp ... Centromere. Factor H (Cfh) maps to 1q32 but has not been physically linked with other members of the RCA. Mouse^*'^^: chromosome 1q, 40 cM. Telomere ... Crry ... CR1/CR2 ... Cfh ... C4bp ... Centromere
cDNA sequence3A50,60
TTTTGTCCCG GAACCCCGCA GCCCTCCCCA CACTCTGGGC GCGGAGCACA ATGATTGGTC   60
AGTCCTATTT TCGCTGAGCT TTTCCTCTTA TTTCAGTTTT CTTCGAGATC AAATCTGGTT  120
TGTAGATGTG CTTGGGGAGA ATGGGGGCCT CTTCTCCAAG AAGCCCGGAG CCTGTCGGGC  180
CGCCGGCGCC CGGTCTCCCC TTCTGCTGCG GAGGATCCCT GCTGGCGGTT GTGGTGCTGC  240
TTGCGCTGCC GGTGGCCTGG GGTCAATGCA ATGCCCCAGA ATGGCTTCCA TTTGCCAGGC  300
CTACCAACCT AACTGATGAG TTTGAGTTTC CCATTGGGAC ATATCTGAAC TATGAATGCC  360
GCCCTGGTTA TTCCGGAAGA CCGTTTTCTA TCATCTGCCT AAAAAACTCA GTCTGGACTG  420
GTGCTAAGGA CAGGTGCAGA CGTAAATCAT GTCGTAATCC TCCAGATCCT GTGAATGGCA  480
TGGTGCATGT GATCAAAGGC ATCCAGTTCG GATCCCAAAT TAAATATTCT TGTACTAAAG  540
GATACCGACT CATTGGTTCC TCGTCTGCCA CATGCATCAT CTCAGGTGAT ACTGTCATTT  600
GGGATAATGA AACACCTATT TGTGACAGAA TTCCTTGTGG GCTACCCCCC ACCATCACCA  660
ATGGAGATTT CATTAGCACC AACAGAGAGA ATTTTCACTA TGGATCAGTG GTGACCTACC  720
GCTGCAATCC TGGAAGCGGA GGGAGAAAGG TGTTTGAGCT TGTGGGTGAG CCCTCCATAT  780
ACTGCACCAG CAATGACGAT CAAGTGGGCA TCTGGAGCGG CCCCGCCCCT CAGTGCATTA  840
TACCTAACAA ATGCACGCCT CCAAATGTGG AAAATGGAAT ATTGGTATCT GACAACAGAA  900
GCTTATTTTC CTTAAATGAA GTTGTGGAGT TTAGGTGTCA GCCTGGCTTT GTCATGAAAG  960
GACCCCGCCG TGTGAAGTGC CAGGCCCTGA ACAAATGGGA GCCGGAGCTA CCAAGCTGCT 1020
CCAGGGTATG TCAGCCACCT CCAGATGTCC TGCATGCTGA GCGTACCCAA AGGGACAAGG 1080
ACAACTTTTC ACCTGGGCAG GAAGTGTTCT ACAGCTGTGA GCCCGGCTAC GACCTCAGAG 1140
GGGCTGCGTC TATGCGCTGC ACACCCCAGG GAGACTGGAG CCCTGCAGCC CCCACATGTG 1200
AAGTGAAATC CTGTGATGAC TTCATGGGCC AACTTCTTAA TGGCCGTGTG CTATTTCCAG 1260
TAAATCTCCA GCTTGGAGCA AAAGTGGATT TTGTTTGTGA TGAAGGATTT CAATTAAAAG 1320
GCAGCTCTGC TAGTTACTGT GTCTTGGCTG GAATGGAAAG CCTTTGGAAT AGCAGTGTTC 1380
CAGTGTGTGA ACAAATCTTT TGTCCAAGTC CTCCAGTTAT TCCTAATGGG AGACACACAG 1440
GAAAACCTCT GGAAGTCTTT CCCTTTGGAA AAGCAGTAAA TTACACATGC GACCCCCACC 1500
CAGACAGAGG GACGAGCTTC GACCTCATTG GAGAGAGCAC CATCCGCTGC ACAAGTGACC 1560
CTCAAGGGAA TGGGGTTTGG AGCAGCCCTG CCCCTCGCTG TGGAATTCTG GGTCACTGTC 1620
AAGCCCCAGA TCATTTTCTG TTTGCCAAGT TGAAAACCCA AACCAATGCA TCTGACTTTC 1680
CCATTGGGAC ATCTTTAAAG TACGAATGCC GTCCTGAGTA CTACGGGAGG CCATTCTCTA 1740
TCACATGTCT AGATAACCTG GTCTGGTCAA GTCCCAAAGA TGTCTGTAAA CGTAAATCAT 1800
GTAAAACTCC TCCAGATCCA GTGAATGGCA TGGTGCATGT GATCACAGAC ATCCAGGTTG 1860
GATCCAGAAT CAACTATTCT TGTACTACAG GGCACCGACT CATTGGTCAC TCATCTGCTG 1920
AATGTATCCT CTCGGGCAAT GCTGCCCATT GGAGCACGAA GCCGCCAATT TGTCAACGAA 1980
TTCCTTGTGG GCTACCCCCC ACCATCGCCA ATGGAGATTT CATTAGCACC AACAGAGAGA 2040
ATTTTCACTA TGGATCAGTG GTGACCTACC GCTGCAATCC TGGAAGCGGA GGGAGAAAGG 2100
TGTTTGAGCT TGTGGGTGAG CCCTCCATAT ACTGCACCAG CAATGACGAT CAAGTGGGCA 2160
TCTGGAGCGG CCCGGCCCCT CAGTGCATTA TACCTAACAA ATGCACGCCT CCAAATGTGG 2220
AAAATGGAAT ATTGGTATCT GACAACAGAA GCTTATTTTC CTTAAATGAA GTTGTGGAGT 2280
TTAGGTGTCA GCCTGGCTTT GTCATGAAAG GACCCCGCCG TGTGAAGTGC CAGGCCCTGA 2340
ACAAATGGGA GCCGGAGCTA CCAAGCTGCT CCAGGGTATG TCAGCCACCT CCAGATGTCC 2400
TGCATGCTGA GCGTACCCAA AGGGACAAGG ACAACTTTTC ACCCGGGCAG GAAGTGTTCT 2460
ACAGCTGTGA GCCCGGCTAT GACCTCAGAG GGGCTGCGTC TATGCGCTGC ACACCCCAGG 2520
GAGACTGGAG CCCTGCAGCC CCCACATGTG AAGTGAAATC CTGTGATGAC TTCATGGGCC 2580
AACTTCTTAA TGGCCGTGTG CTATTTCCAG TAAATCTCCA GCTTGGAGCA AAAGTGGATT 2640
TTGTTTGTGA TGAAGGATTT CAATTAAAAG GCAGCTCTGC TAGTTATTGT GTCTTGGCTG 2700
GAATGGAAAG CCTTTGGAAT AGCAGTGTTC CAGTGTGTGA ACAAATCTTT TGTCCAAGTC 2760
CTCCAGTTAT TCCTAATGGG AGACACACAG GAAAACCTCT GGAAGTCTTT CCCTTTGGAA 2820
AAGCAGTAAA TTACACATGC GACCCCCACC CAGACAGAGG GACGAGCTTC GACCTCATTG 2880
GAGAGAGCAC CATCCGCTGC ACAAGTGACC CTCAAGGGAA TGGGGTTTGG AGCAGCCCTG 2940
CCCCTCGCTG TGGAATTCTG GGTCACTGTC AAGCCCCAGA TCATTTTCTG TTTGCCAAGT 3000
TGAAAACCCA AACCAATGCA TCTGACTTTC CCATTGGGAC ATCTTTAAAG TACGAATGCC 3060
GTCCTGAGTA CTACGGGAGG CCATTCTCTA TCACATGTCT AGATAACCTG GTCTGGTCAA 3120
GTCCCAAAGA TGTCTGTAAA CGTAAATCAT GTAAAACTCC TCCAGATCCA GTGAATGGCA 3180
TGGTGCATGT GATCACAGAC ATCCAGGTTG GATCCAGAAT CAACTATTCT TGTACTACAG 3240
cDNA sequence (continued)
GGCACCGACT CATTGGTCAC GGAGCACGAA GCCGCCAATT ATGGAGATTT CATTAGCACC GCTGCAATCT TGGAAGCAGA ACTGCACCAG CAATGACGAT TACCTAACAA ATGCACGCCT GCTTATTTTC CTTAAATGAA GACCCCGCCG TGTGAAGTGC CCAGGGTGTG ^CAGCCGCCT ACAACTTTTC ACCTGGGCAG GGGCTGCGTC TCTGCACTGC CAGTGAAATC CTGTGATGAC TTAATCTCCA GCTTGGGGCA GCAGTTCCGT TAGTCATTGT CTGTGTGTGA ACATATCTTT GAACTCCCTC TGGAGATATT CAGACAGAGG GATGACCTTC CTCATGGGAA TGGGGTTTGG GTCACTGTAA AACCCCAGAG TTGAGTTTCC AGTCGGGACA TGTTCTCTAT CTCCTGCCTA GAAAATCATG TGGACCTCCA CACAGTTTGG ATCAACAGTT CATCTACTAC TTGTCTCGTC GTGAGATCAT ATCTTGTGAG ATAGAACATC TTTTCACAAT GAGAACAGCT GTTTGAGCTT AAGTTGGTGT TTGGAGCAGC CAGAAGTTGA AAATGCAATT TCATCAGATT TAGATGTCAG AGACCAATGG CAGATGGGGG CAGAAATCCT GCATGGTGAG AAGTGTTCTA CAGCTGTGAG CGCCCCAGGG AGACTGGAGC TCCTGGGCCA ACTCCCTCAT AGGTGTCCTT TGTTTGCGAT TCTTGGCTGG AATGAAAGCC GTCCAAATCC TCCAGCTATC CCTATGGAAA AGAAATATCT ACCTCATTGG GGAGAGCTCC GCAGCCCTGC CCCTCGCTGT TCCAAAACGG GCATTACATT GCTACACTTG TGACCCCGGC AGGGAATCTG GAGCCAATTG TTATGAATGG AATCTCGAAG TGACTTTGAA GTGTGAAGAT CGGATGACAG ATGGGACCCT TAGTTGGCAC TTTATCTGGT TAATTCTAAA GCACAGAAAA ATTTACATTC TCAAGGAGGC ATAGCAGGGT CCTTCCTTGA TGGTGGGAAA GGAGCCAATT AAGTGACTTC ACAGAGACGC TAGCAAAGCT CCTGCCTCTT
TCATCTGCTG AATGTATCCT TGTCAACGAA TTCCTTGTGG AACAGAGAGA ATTTTCACTA GGGAGAAAGG TGTTTGAGCT CAAGTGGGCA TCTGGAGCGG CCAAATGTGG AAAATGGAAT GTTGTGGAGT TTAGGTGTCA CAGGCCCTGA ACAAATGGGA CCAGAAATCC TGCATGGTGA GAAGTGTTCT ACAGCTGTGA ACACCCCAGG GAGACTGGAG TTCTTGGGTC AACTCCCTCA AAGGTGTCCT TTGTCTGTGA GTCTTGGTTG GAATGAGAAG TGTCCAAATC CTCCAGCTAT CCCTATGGAA AAGAAATATC AACCTCATTG GGGAGAGCAC AGCAGCCCTG CCCCTCGCTG CAGTTTCCAT TTGCCAGTCC TCTTTGAATT ATGAATGCCG GAAAACTTGG TCTGGTCAAG CCAGAACCCT TCAATGGAAT AATTATTCTT GTAATGAAGG TCAGGCAATA ATGTCACATG CCACCTCCAA CCATATCCAA GGAACGGTGG TAACTTACCA GTGGGAGAAC GGTCAATATA CCTCCCCCTC GGTGTATTTC AGAGTACCAG GAAACAGGAG CCCGGGTTTG TCATGGTAGG CCCAAGCTGC CACACTGCTC CATACCCTAA GCCATCAGGA CCCAGCTATG ACCTCAGAGG CCTGAAGCCC CTAGATGTAC GGCCGTGTGC TACTTCCACT GAAGGGTTCC _GATTAAAAGG CTTTGGAATA GCAGTGTTCC CTTAATGGGA GACACACAGG TACGCATGCG ACACCCACCC ATCCGCTGCA CAAGTGACCC GAACTTTCTG TTCCTGCTGC GGAGGACACG TATCTCTATA TACCTGTTAG TGGGAAAGGG GATCATTATT GCAAAGAAGT GAGTTAGAAA TGAAAAAAGT GGGTATACTC TGGAAGGCAG CCTCTGGCCA AATGTACCTC ACGATCTTCT TTATTTTACT GGCAATAATG CACATGAAAA AGCAGCGTTC ATCCCCGAAC CAAAGTACTA TACAGCTGAA GATTTCAACA GAATCAGATC AGACATGTGC ACTTGAAGAT TGTGTGCGTC ACTGTGAAAC
CTCAGGCAAT GCTACCCCCA TGGATCAGTG TGTGGGTGAG CCCCGCCCCT ATTGGTATCT GCCTGGCTTT GCCAGAGTTA GCATACCCCA GCCTGGCTAT CCCTGAAGCC TGGCCGTGTG TGAAGGGTTT CCTTTGGAAT CCTTAATGGG TTACACATGT CATCCGCTGC TGAACTTTCT TACGATCCCA TCCTGGGTAT TGTTGAAGAC GGTGCATATA GTTTCGACTC GGATAAGAAG TGGAGACTTC GTGCCACACT TTGCACCAGC TACTAATAAA TTTCTTTTCC GTCCCACACT CAGGGTGTGT CAACTTTTCA GGCTGCGTCT AGTGAAATCC TAATCTCCAG CAGGTCTGCT AGTGTGTGAA AACTCCCTTT AGACAGAGGG TCAAGGGAAT CTGCCCACAT TCTTCCTGGG CTTCATTTTC AAATTGTAGC ATATCACTAT TCCCTGGAGC TCGTGCACAT CATCATTTTC CCCTAAAGAA TCTGCAAACA GAACATCTCG TGAGCTTCAT GCTGCCCCTT CCCCACCCTT
ACTGCCCATT ACCATCGCCA GTGACCTACC CCCTCCATAT CAGTGCATTA GACAACAGAA GTCATGAAAG CCAAGCTGCT AGCCATCAGG GACCTCAGAG CCGAGATGTG CTATTTCCAC CGCTTAAAGG AACAGTGTTC AGACACACAG GACCCCCACC ACAAGTGACC GTTCGTGCTG ATTAATGACT TTTGGGAAAA AACTGTAGAC AACACAGATA ATTGGTTCCC GCACCTATTT TACAGCAACA GGACCAGATG AAAGATGATC TGCACAGCTC CTCACTGAGA GTGCAGTGCC CAGCCGCCTC CCTGGGCAGG CTGCACTGCA TGTGATGACT CTTGGGGCAA AGTCATTGTG CAAATCTTTT GGAGATATTC ATGACCTTCA GGGGTTTGGA CCACCCAAGA ATGACAATCA TGTACAGACC TTCCCACTGT GGAGATTATG CAGTGCCAGG GATGCTCTCA CTCTCTTGGA GTGGCTATCC AATGAAGAAA AATACAATTT AAAGTCTTTG CCCTGGTACC CTGCCTCGTG
3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5760 5820 5880 5940 6000 6060 6120 6180 6240 6300 6360 6420 6480
cDNA sequence (continued)
CTAAACGCAC ACAGTATCTA GTCAGGGGAA AAGACTGCAT TTAGGAGATA GAAAATAGTT 6540
TGGATTACTT AAAGGAATAA GGTGTTGCCT GGAATTTCTG GTTTGTAAGG TGGTCACTGT 6600
TCTTTTTTAA AATATTTGTA ATATGGAATG GGCTCAGTAA GAAGAGCTTG GAAAATGCAG 6660
AAAGTTATGA AAAATAAGTC ACTTATAATT ATGCTACCTA CTGATAACCA CTCCTAATAT 6720
TTTGATTCAT TTTCTGCCTA TCTTCTTTCA CATATGTGTT TTTTTACATA CGTACTTTTC 6780
CCCCCTTAGT TTGTTTCCTT TTATTTTATA GAGCAGAACC CTAGTCTTTT AAACAGTTTA 6840
GAGTGAAATA TATGCTATAT CAGTTTTTAC TTTCTCTAGG GAGAAAAATT AATTTACTAG 6900
AAAGGCATGA AATGATCATG GGAAGAGTGG TTAAGACTAC TGAAGAGAAA TATTTGGAAA 6960
ATAAGATTTC GATATCTTCT TTTTTTTTGA GATGGAGTCT GGCTCTGTCT CCCAGGCTGG 7020
AGTGCAGTGG CGTAATCTCG GCTCACTGCA ACGTCCGCCT CCTGGGTTGA CACCATTTTC 7080
CTGCCTCAGC CTCCTGAGTA GTTGGGACTA CCAGTAGATG GGACTACAGG CACCTGCCAA 7140
CACGCCCGGC TAATTTTTTT GTATTTTTAG TAGAGACGGG GTTTCACCAT GTTAGCCAGG 7200
ATGGTCTGGA TCTCCTGACC TCGTGATCCA CCCGCCTCGG CCTCCCAAAG TGCTGCGATT 7260
ACAGGCATGA GCCACCGCGC CTGGCCGCTT TCGATATTTT CTAAACTTTA ATTCAAAAGC 7320
ACTTTGTGCT GTGTTCTATA TAAAAAACAT AATAAAAATT GAAATGAAAG AATAATTGTT 7380
ATTATAAAAG TACTAGCTTA CTTTTGTATG GATTCAGAAT ATACTAAATT AACTTTTTAA 7440
AACACAACTT TTAAAAAATG TATCAAAAAT AATAAACGTG TTCTGATATT TTT
The first five nucleotides in each exon are underlined. There are two transcriptional start sites, T1 and A30; the A is predominant by S1 nuclease analysis. The two possible methionine initiation codons (ATG), the termination codon (TGA) and the known polyadenylation site (AATAAA) are indicated. In this figure, nucleotides 116-7061 are a compilation from references 3 and 4, determined from cDNA clones. Nucleotides 1-115 and 7062-7493 were determined from genomic clones.
Genomic structure The gene for the CR1*1 allotype of CR1 spans approximately 133 kb and is encoded by 39 exons as illustrated below.
[Figure: exon map of the CR1 gene; exons 1, 20 and 39 are numbered and one position is labelled LHR-S.]
The difference between the major allotypes is accounted for by deletion or duplication of a large segment of genomic DNA encoding an LHR-length of peptide sequence. The gene encoding the CR1*2 allotype is approximately 150-160 kb and is encoded by 47 exons, with the additional 8 exons inserted approximately in the location indicated. The gene encoding the CR1*3 allotype contains a deletion somewhere within the LHR-B to LHR-C regions; however, the location has not been determined precisely.
Accession numbers (EMBL/GenBank)
Human CR1^3,4,50,60
Chimpanzee CR1^61,64
Baboon CR1^65
Mouse CR1/CR2^62,63-68
Mouse Crry^69
Rat Crry
cDNA Y00816 L24920-L24922 L39791 M61132 M36470 M29281 M35684 J04153 M33527 U17123-U17128 X98171 M23529 M34164-M34173 L36532 D42115
Genomic L17390-L17430
Deficiency No humans totally lacking CR1 have been identified. The Knops, McCoy, Swain-Langley and York blood group antigens are located on CR1, and some individuals with these antibodies have very low levels of erythrocyte CR1. Acquired low levels of erythrocyte CR1 are seen in patients with systemic lupus erythematosus. These patients have abnormal clearance of immune complexes. Knockout mice have been prepared that lack CR1/CR2 and these animals exhibit profound defects in T cell and B cell function.
Polymorphic variants The structural allotypes below are a consequence of large insertions or deletions in the CR1 gene, and may be detected by Mr difference upon SDS-PAGE, Northern blot analysis of mRNA or Southern blot analysis of genomic DNA. The structural allotype may affect affinity of CR1 for C3b dimers. The quantitative allotype, H or L, regulates CR1 expression level on erythrocytes. Erythrocytes from individuals homozygous for the H allotype bear 4-10-fold more cell surface CR1 than those from individuals homozygous for the L allotype.
Polymorphism frequencies
Structural alleles    CR1*1        CR1*2        CR1*3     CR1*4
White population      0.86-0.93    0.07-0.26    0-0.02

100 000 molecules/cell on endothelial and epithelial cells. Induced by phorbol ester (PMA). Promoter has cAMP response element.
Protein sequence
MTVARPSVPA ALPLLGELPR LLLLVLLCLP AVWGDCGLPP DVPNAQPALE 50
GRTSFPEDTV ITYKCEESFV KIPGEKDSVI CLKGSQWSDI EEFCNRSCEV 100
PTRLNSASLK QPYITQNYFP VGTVVEYECR PGYRREPSLS PKLTCLQNLK 150
WSTAVEFCKK KSCPNPGEIR NGQIDVPGGI LFGATISFSC NTGYKLFGST 200
SSFCLISGSS VQWSDPLPEC REIYCPAPPQ IDNGIIQGER DHYGYRQSVT 250
YACNKGFTMI GEHSIYCTVN NDEGEWSGPP PECRGKSLTS KVPPTVQKPT 300
TVNVPTTEVS PTSQKTTTKT TTPNAQATRS TPVSRTTKHF HETTPNKGSG 350
TTSGTTRLLS GHTCFTLTGL LGTLVTMGLL T
The leader sequence is underlined, N-linked glycosylation site is indicated (N), and the cleavage site for GPI anchor attachment is double underlined. Amino acid differences between the two publications: 80 I/T, 85 S/M.
Decay-accelerating factor
Protein modules
1-34       Leader sequence    exon 1
35-95      CCP                exon 2
97-159     CCP                exon 3
162-221    CCP                exon 4/5
224-284    CCP                exon 6
287-356    STP                exon 7-9
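The module boundaries listed above can be checked with a little arithmetic. The sketch below is only an illustration: the names `DAF_MODULES`, `module_lengths` and `linker_lengths` are hypothetical, and numbering the CCP modules 1 to 4 is an assumption made for readability (the table simply labels each one CCP). Ranges are treated as 1-based and inclusive.

```python
# Module boundaries for DAF as listed in the protein modules table
# (1-based, inclusive). The "CCP 1" ... "CCP 4" labels are an assumption
# added here for readability; the table labels each of them "CCP".
DAF_MODULES = [
    ("Leader sequence", 1, 34),
    ("CCP 1", 35, 95),
    ("CCP 2", 97, 159),
    ("CCP 3", 162, 221),
    ("CCP 4", 224, 284),
    ("STP", 287, 356),
]


def module_lengths(modules):
    """Return {name: residue count} for 1-based inclusive ranges."""
    return {name: end - start + 1 for name, start, end in modules}


def linker_lengths(modules):
    """Return the number of residues lying between consecutive modules."""
    return [nxt[1] - cur[2] - 1 for cur, nxt in zip(modules, modules[1:])]


if __name__ == "__main__":
    for name, count in module_lengths(DAF_MODULES).items():
        print(f"{name}: {count} residues")
    print("linker residues between modules:", linker_lengths(DAF_MODULES))
```

Each CCP module comes out at roughly 60 residues, separated from its neighbour by at most two residues.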
Chromosomal location Human: 1q32. Mouse: chromosome 1.
cDNA sequence^1,2,13
ACTGCAACTC GCTCCGGCCG CTGGGCGTAG CTGCGACTCG GCGGAGTCCC GGCGGCGCGT 60
CCTTGTTCTA ACCCGGCGCG CCATGACCGT CGCGCGGCCG AGCGTGCCCG CGGCGCTGCC 120
CCTCCTCGGG GAGCTGCCCC GGCTGCTGCT GCTGGTGCTG TTGTGCCTGC CGGCCGTGTG 180
GGGTGACTGT GGCCTTCCCC CAGATGTACC TAATGCCCAG CCAGCTTTGG AAGGCCGTAC 240
AAGTTTTCCC GAGGATACTG TAATAACGTA CAAATGTGAA GAAAGCTTTG TGAAAATTCC 300
TGGCGAGAAG GACTCAGTGA TCTGCCTTAA GGGCAGTCAA TGGTCAGATA TTGAAGAGTT 360
CTGCAATCGT AGCTGCGAGG TGCCAACAAG GCTAAATTCT GCATCCCTCA AACAGCCTTA 420
TATCACTCAG AATTATTTTC CAGTCGGTAC TGTTGTGGAA TATGAGTGCC GTCCAGGTTA 480
CAGAAGAGAA CCTTCTCTAT CACCAAAACT AACTTGCCTT CAGAATTTAA AATGGTCCAC 540
AGCAGTCGAA TTTTGTAAAA AGAAATCATG CCCTAATCCG GGAGAAATAC GAAATGGTCA 600
GATTGATGTA CCAGGTGGCA TATTATTTGG TGCAACCATC TCCTTCTCAT GTAACACAGG 660
GTACAAATTA TTTGGCTCGA CTTCTAGTTT TTGTCTTATT TCAGGCAGCT CTGTCCAGTG 720
GAGTGACCCG TTGCCAGAGT GCAGAGAAAT TTATTGTCCA GCACCACCAC AAATTGACAA 780
TGGAATAATT CAAGGGGAAC GTGACCATTA TGGATATAGA CAGTCTGTAA CGTATGCATG 840
TAATAAAGGA TTCACCATGA TTGGAGAGCA CTCTATTTAT TGTACTGTGA ATAATGATGA 900
AGGAGAGTGG AGTGGCCCAC CACCTGAATG CAGAGGAAAA TCTCTAACTT CCAAGGTCCC 960
ACCAACAGTT CAGAAACCTA CCACAGTAAA TGTTCCAACT ACAGAAGTCT CACCAACTTC 1020
TCAGAAAACC ACCACAAAAA CCACCACACC AAATGCTCAA GCAACACGGA GTACACCTGT 1080
TTCCAGGACA ACCAAGCATT TTCATGAAAC AACCCCAAAT AAAGGAAGTG GAACCACTTC 1140
AGGTACTACC CGTCTTCTAT CTGGGCACAC GTGTTTCACG TTGACAGGTT TGCTTGGGAC 1200
GCTAGTAACC ATGGGCTTGC TGACTTAGCC AAAGAAGAGT TAAGAAGAAA ATACACACAA 1260
GTATACAGAC TGTTCCTAGT TTCTTAGACT TATCTGCATA TTGGATAAAA TAAATGCAAT 1320
TGTGCTCTTC ATTTAGGATG CTTTCATTGT CTTTAAGATG TGTTAGGAAT GTCAACAGAG 1380
CAAGGAGAAA AAAGGCAGTC CTGGAATCAC ATTCTTAGCA CACCTACACC TCTTGAAAAT 1440
AGAACAACTT GCAGAATTGA GAGTGATTCC TTTCCTAAAA GTGTAAGAAA GCATAGAGAT 1500
TTGTTCGTAT TTAGAATGGG ATCACGAGGA AAAGAGAAGG AAAGTGATTT TTTTCCACAA 1560
GATCTGTAAT GTTATTTCCA CTTATAAAGG AAATAAAAAA TGAAAAACAT TATTTGGATA 1620
TCAAAAGCAA ATAAAAACCC AATTCAGTCT CTTCTAAGCA AAATTGCTAA AGAGAGATGA 1680
ACCACATTAT AAAGTAATCT TTGGCTGTAA GGCATTTTCA TCTTTCCTTC GGGTTGGCAA 1740
AATATTTTAA AGGTAAAACA TGCTGGTGAA CCAGGGGTGT TGATGGTGAT AAGGGAGGAA 1800
TATAGAATGA AAGACTGAAT CTTCCTTTGT TGCACAAATA GAGTTTGGAA AAAGCCTGTG 1860
AAAGGTGTCT TCTTTGACTT AATGTCTTTA AAAGTATCCA GAGATACTAC AATATTAACA 1920
TAAGAAAAGA TTATATATTA TTTCTGAATC GAGATGTCCA TAGTCAAATT TGTAAATCTT 1980
ATTCTTTTGT AATATTTATT TATATTTATT TATGACAGTG AACATTCTGA TTTTACATGT 2040
AAAACAAGAA AAGTTGAAGA AGATATGTGA AGAAAAATGT ATTTTTCCTA AATAGAAATA 2100
AATGATCCCA TTTTTTGGT
Position 1 is the transcriptional start site. The first five nucleotides in each exon are underlined. Exon 10 (not illustrated in the cDNA sequence but depicted below), an Alu family sequence, has not been reported in DAF mRNA and is not used in surface DAF protein^. The initiation codon (ATG), the termination codon (TAG), and the four polyadenylation signals (AATAAA) are indicated. Nucleotide differences between the published sequences: 321 T/C, 336 G/T, 337 T/G.
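The landmarks named in the legend (initiation codon, in-frame termination codon, polyadenylation signals) can be located in any nucleotide string by simple pattern search. The sketch below is a minimal illustration, not the method used by the authors: `find_motif` and `first_stop_in_frame` are hypothetical helper names, and `TOY` is an invented 39-nucleotide sequence, not a fragment of the DAF cDNA.

```python
def find_motif(seq, motif):
    """Return the 1-based start of every (possibly overlapping) occurrence."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(motif, start + 1)
    return positions


def first_stop_in_frame(seq, atg_pos, stops=("TAA", "TAG", "TGA")):
    """Return the 1-based position of the first stop codon in the reading
    frame set by an ATG at 1-based position atg_pos, or None if absent."""
    seq = seq.upper()
    for i in range(atg_pos - 1, len(seq) - 2, 3):
        if seq[i:i + 3] in stops:
            return i + 1
    return None


# Invented toy sequence used only to exercise the helpers (an assumption,
# not taken from the source): one ATG, one in-frame TAG, two AATAAA motifs.
TOY = "GGCCATGACCGTCGCGTAGTTTAATAAAGGAAATAAACC"

if __name__ == "__main__":
    atg = find_motif(TOY, "ATG")[0]
    print("initiation codon at", atg)
    print("first in-frame stop at", first_stop_in_frame(TOY, atg))
    print("polyadenylation signals at", find_motif(TOY, "AATAAA"))
```

Positions are reported 1-based to match the numbering convention of the sequence figure.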
Exon 10^ 1164 GTTCTCGTCC TGTCACCCAG GCTGGTATGC GGTGGTGTGA TCGTAGCTCA CTGCAGTCTC GAACTCCTGG GTTCAAGCGA TCCTTCCACT TCAGCCTCCC AAGTAGCTGG TACTACAG
Genomic structure The gene spans ~40 kb and is encoded by 11 exons. The introns vary in size from 0.5 to 19.8 kb (last intron). Exon 10 has not been reported in DAF mRNA.
[Figure: exon map of the DAF gene; scale bar 2 kb.]
Accession numbers (EMBL/GenBank)
Human, Mouse^18,19, Orang-utan^20, Guinea-pig^21, Rat^22
M31516 M15799 M64653 S72858 L41365-6 D63679 S67775 D49416-D49421 AF039583-4
M64356 (promoter)
Deficiency Paroxysmal nocturnal haemoglobinuria results from the absence of DAF (as well as CD59 and all other GPI-anchored proteins) on peripheral blood elements^23,24. The failure to express these proteins is due to a defect in the first step of GPI assembly (GlcNAc-PI synthesis^25-27) in a bone marrow stem cell, arising from a mutation of the PIG-A gene^28. Deficient expression of DAF gives rise to heightened uptake of C3b^29. The Cromer blood group antigen system resides on the DAF molecule; the Inab phenotype represents the absence of DAF.
Polymorphic variants
G237T; R52L
G237C; R52P
T327G; L82R
C678T; S199L
G761C; A227P
T321A; I80N
C831T; T250M
References
1 Caras, I.W. et al. (1987) Nature 325, 545-549.
2 Medof, M.E. et al. (1987) Proc. Natl Acad. Sci. USA 84, 2007-2011.
3 Nicholson-Weller, A. et al. (1982) J. Immunol. 129, 184-189.
4 Medof, M.E. et al. (1984) J. Exp. Med. 160, 1558-1578.
5 Lublin, D.M. et al. (1986) J. Immunol. 137, 1629-1635.
6 Medof, M.E. et al. (1986) Biochemistry 25, 6740-6747.
7 Pangburn, M.K. (1986) J. Immunol. 136, 2216-2221.
8 Fujita, T. et al. (1987) J. Exp. Med. 166, 1221-1228.
9 Kinoshita, T. et al. (1985) J. Exp. Med. 162, 75-92.
10 Nicholson-Weller, A. et al. (1985) Blood 65, 1237-1244.
11 Medof, M.E. et al. (1987) J. Exp. Med. 165, 848-864.
12 Lass, J.H. et al. (1990) Invest. Ophthalmol. Vis. Sci. 31, 1136-1148.
13 Ewulonu, U.K. et al. (1991) Proc. Natl Acad. Sci. USA 88, 4675-4679.
14 Thomas, D.J. and Lublin, D.M. (1993) J. Immunol. 150, 151-160.
15 Bryant, R.W. et al. (1990) J. Immunol. 144, 593-598.
16 Post, T.W. et al. (1990) J. Immunol. 144, 740-744.
17 Lublin, D.M. et al. (1987) J. Exp. Med. 165, 1731-1736.
18 Spicer, A.P. et al. (1995) J. Immunol. 155, 3079-3091.
19 Fukuoka, Y. et al. (1996) Int. Immunol. 8, 379-385.
20 Nickells, M.W. et al. (1994) J. Immunol. 152, 676-685.
21 Nonaka, M. et al. (1995) J. Immunol. 155, 3037-3048.
22 Hinchliffe, S.J. et al. (1998) J. Immunol. 161, 5695-5703.
23 Pangburn, M.K. et al. (1983) Proc. Natl Acad. Sci. USA 80, 5430-5434.
24 Nicholson-Weller, A. et al. (1983) Proc. Natl Acad. Sci. USA 80, 5066-5070.
25 Armstrong, C. et al. (1992) J. Biol. Chem. 267, 25347-25351.
26 Takahashi, M. et al. (1993) J. Exp. Med. 177, 517-521.
27 Hidaka, M. et al. (1993) Biochim. Biophys. Acta 191, 571-579.
28 Takeda, J. et al. (1993) Cell 73, 703-711.
29 Medof, M.E. et al. (1985) Proc. Natl Acad. Sci. USA 82, 2980-2984.
30 Telen, M.J. et al. (1988) J. Exp. Med. 167, 1993-1998.
31 Parsons, S.F. et al. (1988) Proceedings of the 20th Congress of the International Society of Blood Transfusion, London, UK, p. 116 (abstr.).
32 Stafford, H.A. et al. (1988) Proc. Natl Acad. Sci. USA 85, 880-884.
33 Telen, M.J. and Green, A.M. (1989) Blood 74, 437-441.
34 Lublin, D.M. et al. (1991) J. Clin. Invest. 87, 1945-1952.
35 Telen, M.J. et al. (1994) Blood 84, 3205-3211.
36 Lublin, D.M. et al. (1997) Transfusion 37, 102S (abstr.).
Membrane cofactor protein M. Kathryn Liszewski and John P. Atkinson, Department of Medicine, Washington University School of Medicine, St Louis, MO, USA
Other names MCP, CD46, gp45-70, measles virus receptor.
Physicochemical properties MCP is a type 1 membrane glycoprotein expressed as four common isoforms (C1, C2, BC1 and BC2) that arise by alternative splicing. Each possesses a 34 amino acid signal peptide followed by 328-350 amino acids (C1, 328 aa; C2, 335 aa; BC1, 343 aa; BC2, 350 aa).
pI 3.9-5.8. Higher molecular weight species possess more O-linked sugars, which correlate with their more acidic pI.
Mr (K): predicted ~39; observed 51-58 (C isoforms), 59-68 (BC isoforms)
N-linked glycosylation sites: 3 (83, 114, 273) (all occupied)
Structure The N-terminal portion of all isoforms consists of four contiguous CCP repeats. Following the four CCP is an alternatively spliced segment enriched in serines, threonines and prolines (STP domain) that is O-glycosylated at positions 286-314 (BC isoforms) and 286-299 (C isoforms). The gene contains three STP exons termed A, B and C. Two of the regularly expressed isoforms consist of B+C exons (29 amino acids) and two consist of the C (14 amino acids) exon only. STP exon A is rarely used. Flanking the STP domain and common to all of the isoforms is a sequence of 12 amino acids of unknown function, which is followed by the transmembrane domain and intracytoplasmic anchor. Alternative splicing produces two distinct cytoplasmic tails of 16 or 23 amino acids termed Cyt-1 or Cyt-2. Designations for the four commonly occurring isoforms are MCP-BC1, MCP-BC2, MCP-C1 and MCP-C2 (specifying the alternatively spliced STP and tail regions). Other rarer isoforms of unclear significance have been described. The upper band on SDS-PAGE consists of BC1 and BC2 while the lower band consists of C1 and C2. Population studies indicate that the upper band predominant pattern is present in 65% of the population, an approximately equal distribution of upper and lower bands occurs in 29%, and 6% express the lower band (C isoforms) predominantly.
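The isoform lengths quoted under Physicochemical properties follow from the segment sizes given here: a shared portion plus one STP choice plus one cytoplasmic tail. The sketch below is a consistency check only. The variable and function names are hypothetical, and the shared length is not stated in the text; it is derived by subtracting the C-only STP segment and the Cyt-1 tail from the C1 isoform.

```python
# Segment sizes (amino acids) as given in the Structure paragraph.
STP_LENGTH = {"BC": 29, "C": 14}
TAIL_LENGTH = {"1": 16, "2": 23}

# Mature isoform lengths as given under Physicochemical properties.
REPORTED = {"C1": 328, "C2": 335, "BC1": 343, "BC2": 350}


def shared_length():
    """Length common to all isoforms, derived from C1 (a derived value,
    not a figure quoted in the text)."""
    return REPORTED["C1"] - STP_LENGTH["C"] - TAIL_LENGTH["1"]


def predicted_length(stp, tail):
    """Predicted mature length for an STP choice ('BC' or 'C') and tail ('1' or '2')."""
    return shared_length() + STP_LENGTH[stp] + TAIL_LENGTH[tail]


if __name__ == "__main__":
    for stp in ("C", "BC"):
        for tail in ("1", "2"):
            name = f"{stp}{tail}"
            print(name, predicted_length(stp, tail), "reported", REPORTED[name])
```

With these inputs the three isoforms not used for the derivation (C2, BC1 and BC2) reproduce their reported lengths, which suggests the segment sizes and isoform lengths in the text are mutually consistent.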
Function MCP is a ubiquitously expressed complement regulatory protein. It is a cofactor for the factor I-mediated cleavage of C3b and C4b that deposit on self-tissue. This regulation is only performed intrinsically, in that MCP protects the cell on which it is anchored, not neighbouring cells^8. MCP is expressed on placental trophoblast and on the inner acrosomal membrane of human spermatozoa. Its role in these locations is likely to protect against complement activation, but other possibilities have been suggested. Crosslinking of MCP downregulates IL-12 production, a finding of potential significance for the immunosuppressive sequelae of measles virus infection^12. MCP is a receptor for several pathogens including measles virus^13-15, group A Streptococcus pyogenes^16, and pathogenic Neisseria^17. Therapeutic uses of MCP include production of transgenic animals expressing MCP in order to prevent the hyperacute graft rejection that accompanies xenotransplantation^18 and a recombinantly produced soluble form in which MCP is linked with a second complement regulatory protein (decay-accelerating factor) for therapeutic use as an inhibitor of complement activation^19.
Tissue distribution Most cells express each of the four isoforms of MCP. Human erythrocytes lack MCP. Tissue-specific isoform expression has been found in kidney and salivary gland (BC isoforms) as well as brain and fetal heart (C isoforms)^20,21.
Regulation of expression MCP levels are increased in certain haematologic malignancies, on most solid tumour cell lines, and following SV40 transformation (reviewed in ref. 22). MCP expression is upregulated in glomerular capillary walls and in mesangial regions of diseased kidney tissues and in astrocytes following cytomegalovirus infection. IFNγ and phorbol ester (PMA) enhanced expression in an oligodendrocyte cell line^23.
Protein sequence (BC1)
MEPPGRRECP FPSWRFPGLL LAAMVLLLYS FSDACEEPPT FEAMELIGKP 50
KPYYEIGERV DYKCKKGYFY IPPLATHTIC DRNHTWLPVS DDACYRETCP 100
YIRDPLNGQA VPANGTYEFG YQMHFICNEG YYLIGEEILY CELKGSVAIW 150
SGKPPICEKV LCTPPPKIKN GKHTFSEVEV FEYLDAVTYS CDPAPGPDPF 200
SLIGESTIYC GDNSVWSRAA PECKVVKCRF PVVENGKQIS GFGKKFYYKA 250
TVMFECDKGF YLDGSDTIVC DSNSTWDPPV PKCLKVSTSS TTKSPASSAS 300
GPRPTYKPPV SNYPGYPKPE EGILDSLDVW VIAVIVIAIV VGVAVICVVP 350
YRYLQRRKKK GKADGGAEYA TYQTKSTTPA EQRG
The leader sequence is underlined, N-linked glycosylation sites (all occupied) are indicated (N) and segments alternatively spliced are double underlined (see Structure).
Protein modules
1-34       Leader peptide                                   exon 1
35-95      CCP                                              exon 2
96-158     CCP                                              exon 3/4
159-224    CCP                                              exon 5
225-285    CCP                                              exon 6
286-314    STP                                              exon 7-9
           B domain: VSTSSTTKSPASSAS                        exon 8
           C domain: GPRPTYKPPVSNYP                         exon 9
315-327    Undefined segment                                exon 10
328-351    Transmembrane domain                             exon 11/12
352-361    Intracytoplasmic anchor                          exon 12
362-377    Cytoplasmic tail one: TYLTDETHREVKFTSL           exon 13
362-384    Cytoplasmic tail two: KADGGAEYATYQTKSTTPAEQRG    exon 14
Chromosomal location Human: 1q3.2. MCP is located along with four other closely related genes on a 900 kb fragment within the RCA locus at 1q3.2. An MCP-like genomic element includes sequences 93% homologous at the nucleotide level with the MCP 5' terminus (i.e. signal peptide and CCP1-3). This element is located within 60 kb of MCP; it is unknown if this partial duplication produces a protein.
cDNA sequence (BC2)^2
TCTGCTTTCC TCCGGAGAAA TAACAGCGTC TTCCGCGCCG CGCATGGAGC CTCCCGGCCG 60
CCGCGAGTGT CCCTTTCCTT CCTGGCGCTT TCCTGGGTTG CTTCTGGCGG CCATGGTGTT 120
GCTGCTGTAC TCCTTCTCCG ATGCCTGTGA GGAGCCACCA ACATTTGAAG CTATGGAGCT 180
CATTGGTAAA CCAAAACCCT ACTATGAGAT TGGTGAACGA GTAGATTATA AGTGTAAAAA 240
AGGATACTTC TATATACCTC CTCTTGCCAC CCATACTATT TGTGATCGGA ATCATACATG 300
GCTACCTGTC TCAGATGACG CCTGTTATAG AGAAACATGT CCATATATAC GGGATCCTTT 360
AAATGGCCAA GCAGTCCCTG CAAATGGGAC TTACGAGTTT GGTTATCAGA TGCACTTTAT 420
TTGTAATGAG GGTTATTACT TAATTGGTGA AGAAATTCTA TATTGTGAAC TTAAAGGATC 480
AGTAGCAATT TGGAGCGGTA AGCCCCCAAT ATGTGAAAAG GTTTTGTGTA CACCACCTCC 540
AAAAATAAAA AATGGAAAAC ACACCTTTAG TGAAGTAGAA GTATTTGAGT ATCTTGATGC 600
AGTAACTTAT AGTTGTGATC CTGCACCTGG ACCAGATCCA TTTTCACTTA TTGGAGAGAG 660
CACGATTTAT TGTGGTGACA ATTCAGTGTG GAGTCGTGCT GCTCCAGAGT GTAAAGTGGT 720
CAAATGTCGA TTTCCAGTAG TCGAAAATGG AAAACAGATA TCAGGATTTG GAAAAAAATT 780
TTACTACAAA GCAACAGTTA TGTTTGAATG CGATAAGGGT TTTTACCTCG ATGGCAGCGA 840
CACAATTGTC TGTGACAGTA ACAGTACTTG GGATCCCCCA GTTCCAAAGT GTCTTAAAGT 900
GTCGACTTCT TCCACTACAA AATCTCCAGC GTCCAGTGCC TCAGGTCCTA GGCCTACTTA 960
CAAGCCTCCA GTCTCAAATT ATCCAGGATA TCCTAAACCT GAGGAAGGAA TACTTGACAG 1020
TTTGGATGTT TGGGTCATTG CTGTGATTGT TATTGCCATA GTTGTTGGAG TTGCAGTAAT 1080
TTGTGTTGTC CCGTACAGAT ATCTTCAAAG GAGGAAGAAG AAAGGGAAAG CAGATGGTGG 1140
AGCTGAATAT GCCACTTACC AGACTAAATC AACCACTCCA GCAGAGCAGA GAGGCTGAAT 1200
AGATTCCACA ACCTGGTTTG CCAGTTCATC TTTTGACTCT ATTAAAATCT TCAATAGTTG 1260
TTATTCTGTA GTTTCACTCT CATGAGTGCA ACTGTGGCTT AGCTAATATT GCAATGTGGC 1320
TTGAATGTAG GTAGCATCCT TTGATGCTTC TTTGAAACTT GTATGAATTT GGGTATGAAC 1380
AGATTGCCTG CTTTCCCTTA AATAACACTT AGATTTATTG GACCAGTCAG CACAGCATGC 1440
CTGGTTGTAT TAAAGCAGGG ATATGCTGTA TTTTATAAAA TTGGCAAAAT TAGAGAAATA 1500
TAGTTCACAA TGAAATTATA TTTTCTTTGT
The first five nucleotides in each exon are underlined to indicate the intron-exon boundaries. The methionine initiation codon (ATG), the termination codon (TGA) and the probable polyadenylation signals (AATATA or AATGAA) are indicated.
Genomic structure The gene spans a minimum of 43 kb and is encoded by 14 exons. There are two sites for alternative splicing: exons 7, 8 and 9 encode the STP domains commonly expressed as isoforms with B+C (8 + 9) or C (9) alone; exons 13 and 14 encode the cytoplasmic tails, CYT-1 and CYT-2. Since exon 13 contains an in-frame stop codon, its expression as CYT-1 converts exon 14 into the 3' untranslated region of MCP.
[Figure: exon map of the MCP gene, with exon 14 labelled; scale bar 5 kb.]
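The two alternative splicing sites described under Genomic structure combine to give the four common isoforms. The sketch below lists the protein-coding exons for each combination. It is an illustration under stated assumptions: exons 1-6 and 10-12 are taken as constant on the basis of the protein modules table, STP exon A (exon 7) is left out because it is rarely used, and all names in the code are hypothetical.

```python
# Exons assumed to be present in every common isoform (from the protein
# modules table: leader and CCP modules, then the undefined segment,
# transmembrane domain and intracytoplasmic anchor).
CONSTANT_5PRIME = [1, 2, 3, 4, 5, 6]
CONSTANT_MIDDLE = [10, 11, 12]

# Alternatively spliced regions as described in the text.
STP_EXONS = {"BC": [8, 9], "C": [9]}
# With CYT-1, exon 13 supplies the tail and its in-frame stop codon, so
# exon 14 is untranslated; with CYT-2, exon 14 supplies the tail.
TAIL_EXON = {"CYT-1": [13], "CYT-2": [14]}


def coding_exons(stp, tail):
    """Return the protein-coding exons for an STP choice and a tail choice."""
    return CONSTANT_5PRIME + STP_EXONS[stp] + CONSTANT_MIDDLE + TAIL_EXON[tail]


def isoform_name(stp, tail):
    """Build the isoform designation, e.g. ('BC', 'CYT-2') -> 'MCP-BC2'."""
    return f"MCP-{stp}{tail[-1]}"


if __name__ == "__main__":
    for stp in STP_EXONS:
        for tail in TAIL_EXON:
            print(isoform_name(stp, tail), coding_exons(stp, tail))
```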
Accession numbers
Human^2,3
  MCP-BC2                   Y00651
  MCP-BC1                   X59405
  MCP-C1                    X59406
  MCP-C2                    X59407
  MCP-ABC2                  X59409
  MCP-ABC1                  X59410
Owl monkey^26               U87914
Baboon^26                   U87915
Goeldii marmoset^26         U87916
Common marmoset^26          U87917
Tamarin^26                  U87918
Squirrel monkey^26          U87919
African green monkey^26     U87920
Cynomolgus monkey^26        U87921
Rhesus monkey^26            U87922
White-faced saki^26         U87923
Guinea-pig^27               D84130-3
Pig^28,29                   D70897
Mouse^30                    AB001566
Deficiency None known.
Polymorphic variants A HindIII RFLP has been found that correlates with the phenotypic polymorphism of MCP. This size polymorphism results from variable splicing of exon 8. PvuII and BglII RFLPs have also been described.
References
1 Liszewski, M.K. et al. (1996) Adv. Immunol. 61, 201-283.
2 Lublin, D.M. et al. (1988) J. Exp. Med. 168, 181-194.
3 Post, T.W. et al. (1991) J. Exp. Med. 174, 93-102.
4 Russell, S.M. et al. (1992) Eur. J. Immunol. 22, 1513-1518.
5 Ballard, L.L. et al. (1988) J. Immunol. 141, 3923-3929.
6 Ballard, L. et al. (1987) J. Immunol. 138, 3850-3855.
7 Seya, T. and Atkinson, J.P. (1989) Biochem. J. 264, 581-588.
8 Oglesby, T.J. et al. (1992) J. Exp. Med. 175, 1547-1551.
9 Cervoni, F. et al. (1992) J. Immunol. 148, 1431-1437.
10 Anderson, D.J. et al. (1989) Biol. Reprod. 41, 285-293.
11 Anderson, D.J. et al. (1993) Proc. Natl Acad. Sci. USA 90, 10051-10055.
12 Karp, C.L. et al. (1996) Science 273, 228-231.
13 Naniche, D. et al. (1993) J. Virol. 67, 6025-6032.
14 Dorig, R.E. et al. (1993) Cell 75, 295-305.
15 Manchester, M. et al. (1994) Proc. Natl Acad. Sci. USA 91, 2161-2165.
16 Okada, N. et al. (1995) Proc. Natl Acad. Sci. USA 92, 2489-2493.
17 Kallstrom, H. et al. (1997) Mol. Microbiol. 25, 639-647.
18 Cozzi, E. and White, D.J.G. (1995) Nature Med. 1, 964-966.
19 Higgins, P.J. et al. (1997) J. Immunol. 158, 2872-2881.
20 Johnstone, R.W. et al. (1993) Mol. Immunol. 30, 1231-1241.
21 Gorelick, A. et al. (1995) Lupus 4, 293-296.
22 Liszewski, M.K. et al. (1998) In (Rother, K., Till, G.O. and Hansch, G.M. eds). 2nd ed. Berlin, Springer-Verlag, pp. 146-162.
23 Gasque, P. and Morgan, B.P. (1996) Immunology 89, 338-347.
24 Bora, N.S. et al. (1989) J. Exp. Med. 169, 597-602.
25 Hourcade, D. et al. (1992) Genomics 12, 289-300.
26 Hsu, E.G. et al. (1997) J. Virol. 71, 6144-6154.
27 Hosokawa, M. et al. (1996) J. Immunol. 157, 4946-4952.
28 van den Berg, C.W. et al. (1997) J. Immunol. 158, 1703-1709.
29 Toyomura, K. et al. (1997) Int. Immunol. 9, 869-876.
30 Tsujimura, A. et al. (1998) Biochem. J. 330, 163-168.
31 Bora, N.S. et al. (1991) J. Immunol. 146, 2821-2825.
32 Wilton, A.N. et al. (1992) Immunogenetics 36, 79-85.
C4b-binding protein Santiago Rodriguez de Cordoba, Olga Criado Garcia and Pilar Sanchez-Corral, Department of Immunology, CIB/CSIC, Madrid, Spain
Other names Proline-rich protein (PRP), Ss(C4)-binding protein, C4-bp, C4b-bp, C4BP.
Physicochemical properties Human C4BP is a heterogeneous oligomeric protein present in plasma in three isoforms with different subunit composition. The major isoform, α7β1 (Mr (K) 540-570), is a complex of seven identical α chains (C4BPα) and one β chain (C4BPβ). The other isoforms in plasma are α7β0 and α6β1. The proportion in which the three isoforms are synthesized is determined genetically, but can be modified by factors with a differential effect on the expression of the C4BPA and C4BPB genes. The C4BPα and C4BPβ chains are disulfide-linked by their C-terminal regions.
                                      α chain                          β chain
pI (after neuraminidase treatment)    6.60, 6.65 or 6.75               not known
                                      depending on allele
Amino acids: leader sequence          48                               17
Amino acids: mature                   549                              235
Mr (K): predicted                     61.5                             26.4
Mr (K): observed                      70                               45
N-linked glycosylation sites          3 (221, 506, 528)                5 (64, 71, 98, 117, 15…)
(potential)
Interchain disulfide bonds located in the oligomerization domain. Precise number and positions are not known.
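The subunit values in this table can be combined to give a rough mass for each plasma isoform. The sketch below simply sums subunit Mr values for the α7β1, α7β0 and α6β1 compositions named in the text. It is a naive estimate that assumes subunit masses are additive in the assembled complex; the function and variable names are hypothetical.

```python
# Subunit Mr values (K) from the table: observed and sequence-predicted.
ALPHA_OBSERVED, BETA_OBSERVED = 70.0, 45.0
ALPHA_PREDICTED, BETA_PREDICTED = 61.5, 26.4

# Subunit composition (number of alpha chains, number of beta chains)
# of the three plasma isoforms named in the text.
ISOFORMS = {"a7b1": (7, 1), "a7b0": (7, 0), "a6b1": (6, 1)}


def isoform_mass(n_alpha, n_beta, alpha_mr, beta_mr):
    """Sum of subunit Mr values (K); assumes simple additivity."""
    return n_alpha * alpha_mr + n_beta * beta_mr


if __name__ == "__main__":
    for name, (n_alpha, n_beta) in ISOFORMS.items():
        observed = isoform_mass(n_alpha, n_beta, ALPHA_OBSERVED, BETA_OBSERVED)
        predicted = isoform_mass(n_alpha, n_beta, ALPHA_PREDICTED, BETA_PREDICTED)
        print(f"{name}: {observed:.1f} K from observed subunits, "
              f"{predicted:.1f} K from predicted subunits")
```

For the major isoform the observed subunit values sum to 535 K, just under the 540-570 K range quoted for the intact complex.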
Structure The C4b-binding protein molecule has a spider-like structure as observed by electron microscopy. Synchrotron X-ray scattering and hydrodynamic analyses suggest that human C4BP has a more compact structure in solution. Both C4BPα and C4BPβ chains are members of the RCA family.
Function C4BP is a regulator of complement activation. It binds to C4b, accelerates the decay of the classical pathway C3/C5 convertase, and functions as a cofactor in the factor I-mediated inactivation of C4b. Each C4BPα chain has a binding site for C4b, spanning from CCP-1 to CCP-3. The C4BPα chain also carries a binding site for the serum amyloid P component (SAP). The C4BPβ chain binds and inactivates the anticoagulant protein S, suggesting that C4BPβ plays a role in the control of the coagulation system. However, it is unclear how this regulatory mechanism would operate. Similarly, the functional significance of the C4BPα/C4BPβ association remains uncertain.
[Figure: schematic of the C4BPα and C4BPβ chains.]
Tissue distribution Serum protein: 150-300 µg/ml. Primary site of synthesis: liver (hepatocytes).
Regulation of expression The characteristics of the C4BPA and C4BPB promoters provide an explanation for the hepatic-specific expression of the C4BP polypeptides. The C4BPA promoter contains HNF1- and HNF3-binding sites, and the C4BPB promoter contains binding sites for the HNF3 and NFI/CTF transcription factors. C4BP is an acute-phase protein. However, acute-phase mediators like IL-6, IL-1β, TNFα and IFNγ differentially regulate the C4BPA and C4BPB genes in Hep3B cells, and have a dramatic effect on the proportions of the C4BP isoforms secreted by these cells. Sequence analyses of the C4BPA and C4BPB promoters have revealed potential target elements for different classes of cytokine response factors.
Protein sequences
C4BPα
MHPPKTPSGA LHRKRKMAAW PFSRLWKVSD PILFQMTLIA ALLPAVLGNC 50
GPPPTLSFAA PMDITLTETR FKTGTTLKYT CLPGYVRSHS TQTLTCNSDG 100
EWVYNTFCIY KRCRHPGELR NGQVEIKTDL SFGSQIEFSC SEGFFLIGST 150
TSRCEVQDRG VGWSHPLPQC EIVKCKPPPD IRNGRHSGEE NFYAYGFSVT 200
YSCDPRFSLL GHASISCTVE NETIGVWRPS PPTCEKITCR KPDVSHGEMV 250
SGFGPIYNYK DTIVFKCQKG FVLRGSSVIH CDADSKWNPS PPACEPNSCI 300
NLPDIPHASW ETYPRPTKED VYVVGTVLRY RCHPGYKPTT DEPTTVICQK 350
NLRWTPYQGC EALCCPEPKL NNGEITQHRK SRPANHCVYF YGDEISFSCH 400
ETSRFSAICQ GDGTWSPRTP SCGDICNFPP KIAHGHYKQS SSYSFFKEEI 450
IYECDKGYIL VGQAKLSCSY SHWSAPAPQC KALCRKPELV NGRLSVDKDQ 500
YVEPENVTIQ CDSGYGVVGP QSITCSGNRT WYPEVPKCEW ETPEGCEQVL 550
TGKRLMQCLP NPEDVKMALE VYKLSLEIEQ LELQRDSARQ STLDKEL
C4BPβ
MFFWCACCLM VAWRVSASDA EHCPELPPVD NSIFVAKEVE GQILGTYVCI 50
KGYHLVGKKT LFCNASKEWD NTTTECRLGH CPDPVLVNGE FSSSGPVNVS 100
DKITFMCNDH YILKGSNRSQ CLEDHTWAPP FPICKSRDCD PPGNPVHGYF 150
EGNNFTLGST ISYYCEDRYY LVGVQEQQCV DGEWSSALPV CKLIQEAPKP 200
ECEKALLAFQ ESKNLCEAME NFMQQLKESG MTMEELKYSL ELKKAELKAK 250
LL
The leader sequence is underlined and potential N-linked glycosylation sites are indicated (N).
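Potential N-linked glycosylation sites of the kind marked in these listings conventionally follow the sequon N-X-S/T, where X is any residue except proline. The following is a minimal Python sketch of such a scan, not the method used to annotate the sequences here. The function name `find_sequons` is hypothetical, and the input is the single ten-residue block YVEPENVTIQ from the C4BPα listing, so the reported position is relative to that fragment rather than to the full chain.

```python
import re

# Consensus sequon for potential N-linked glycosylation: N-X-S/T, X != P.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(seq):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


if __name__ == "__main__":
    fragment = "YVEPENVTIQ"  # one block taken from the C4BPα listing
    print(find_sequons(fragment))
```

A match only marks a potential site; whether a given asparagine is actually glycosylated has to be established experimentally.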
Protein modules
C4BPα
1-48      Leader sequence                      exon 2
49-109    CCP                                  exon 3
110-171   CCP                                  exon 4/5
172-235   CCP                                  exon 6
236-296   CCP                                  exon 7
297-361   CCP                                  exon 8
362-424   CCP                                  exon 9
425-481   CCP                                  exon 10
482-540   CCP                                  exon 11
541-597   C-terminal oligomerization domain    exon 12

C4BPβ
1-17      Leader sequence                      exon 3
18-77     CCP                                  exon 4
78-135    CCP                                  exon 5
136-192   CCP                                  exon 6/7
193-252   C-terminal oligomerization domain    exon 7/8
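The module boundaries listed above can be checked for internal consistency. The sketch below is illustrative only: it assumes the residue ranges are inclusive, and the names `module_lengths` and `is_contiguous` are hypothetical helpers rather than part of any published tool. It derives the length of each module and confirms that the ranges tile each chain without gaps.

```python
# Module boundaries as listed for C4BPα and C4BPβ (inclusive residue ranges).
C4BPA_MODULES = [
    ("Leader sequence", 1, 48),
    ("CCP", 49, 109),
    ("CCP", 110, 171),
    ("CCP", 172, 235),
    ("CCP", 236, 296),
    ("CCP", 297, 361),
    ("CCP", 362, 424),
    ("CCP", 425, 481),
    ("CCP", 482, 540),
    ("C-terminal oligomerization domain", 541, 597),
]

C4BPB_MODULES = [
    ("Leader sequence", 1, 17),
    ("CCP", 18, 77),
    ("CCP", 78, 135),
    ("CCP", 136, 192),
    ("C-terminal oligomerization domain", 193, 252),
]


def module_lengths(modules):
    """Return (name, length) pairs for inclusive residue ranges."""
    return [(name, end - start + 1) for name, start, end in modules]


def is_contiguous(modules):
    """True if every module starts one residue after the previous one ends."""
    return all(nxt[1] == cur[2] + 1 for cur, nxt in zip(modules, modules[1:]))


if __name__ == "__main__":
    for label, modules in (("C4BPα", C4BPA_MODULES), ("C4BPβ", C4BPB_MODULES)):
        total = sum(length for _, length in module_lengths(modules))
        print(label, total, is_contiguous(modules))
```

The summed lengths correspond to the precursor chains, leader sequence included.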
Chromosomal location The C4BPA and C4BPB genes are closely linked within the RCA gene cluster. Human¹⁸: 1q32. Mouse¹⁹: chromosome 1, 67.6 cM. Rat²⁰: 13q24-q25.
[Figure: map of the RCA gene cluster between centromere (Cen) and telomere (Tel). Loci labelled in the figure: FHR2, HF1, PFKFB2, C4BPA, C4BPAL1, CR2, MCPL1, MCP, F13B, Hs.8688, C4BPB, SRP72, C4BPAL2, DAF, CR1, CR1L1.]
cDNA sequences C4BPA AACCGTCCTT ACCAGTCAAC GATCAAGGCA CTTCCTCAAC TCCATCTGGG GAAAGTCTCT TCTTGGCAAT GACTGAGACA CAGATCCCAT CTTCTGTATC TAAGACAGAT AATTGGCTCA TCTCCCACAA
GACCAGCCAA TTCAGGGTAT GTTTTCTTCT TACCAAAGAA GCTCTTCATA GATCCAATTC TGTGGTCCTC CGCTTCAAAA TCAACTCAGA TACAAACGAT TTATCTTTTG ACCACTAGTC TGTGAAATTG
CCACATGGCT GAAATTCAGG GACTCTTTGG TGGAGCAATT
60
TATGATAAAC TCTGATCTGG GGAGGAACCA GGACTAGATA
120
TTGAGAAACT ATCCCAGATA TCATCATAGA GTGTTCTGCT
180
AAACATCAGC GAAGCAGCAG GCCATGCACC CCGCAAAAAC
240
GAAAAAGGAA AATGGCAGCC TGGCCCTTCT CCAGGGTGTG
300
TCTTCCAAAT GACCTTGATC GCTGCTCTGT
TGCGTGGTGT
360
CACCCACTTT ATCATTTGCT GCCCCGATGG ATATTACGTT
420
CTGGAACTAC TCTGAAATAC ACCTGCCTCC
GTGGCTAGGT
480
CGCTTACCTG TAATTCTGAT GGCGAATGGG TGTATAACAC
540
GCAGACACCC AGGAGAGTTA CGTAATGGGC AAGTAGAGAT
600
GATCACAAAT AGAATTCAGC TGTTCAGAAG G A T T T T T C T T
660
GTTGTGAAGT CCAAGATAGA GGAGTTGGCT GGAGTCATCC
720
TCAAGTGTAA GCCTCCTCCA GACATCAGGA ATGGAAGGCA
780
cDNA sequence CAGCGGTGAA CTTCTCACTC TTGGAGACCA TGGGGAAATG GTGCCAAAAA ATGGAATCCT ACATGCTTCC TGTGTTAAGG GATTTGTCAG TGAACCAAAG CTGTGTTTAT AGCTATATGC CAATTTTCCT CAAAGAAGAG CTCCTGCAGT ACCAGAATTA TGTCACCATC TGGGAACAGA TGAACAAGTG AATGGCCCTG CAGCGCAAGA AGGTGTCTTG CAATTTGGCA GTGCTTTGAG CACACAAAGC
continued
GAAAATTTCT TTGGGCCATG AGCCCTCCTA GTCTCTGGAT GGTTTTGTTC TCTCCTCCTG TGGGAAACAT TACCGCTGTC AAAAATTTGA CTAAATAATG TTCTATGGAG CAAGGAGATG CCTAAAATTG ATTATATATG TATTCACACT GTGAATGGAA CAATGTGATT ACCTGGTACC CTCACAGGCA GAGGTATATA CAATCCACTT CTGGCTTGCC GTGATATTCA ATTGTGAAAT ACAAATTTTT
ACGCATACGG CCTCCATTTC CCTGTGAAAA TTGGACCCAT TCAGAGGCAG CTTGTGAGCC ATCCTAGGCC ATCCTGGCTA GATGGACCCC GTGAAATCAC ATGAGATTTC GCACGTGGAG CCCATGGGCA AATGTGATAA GGTCAGCTCC GGTTGTCTGT CTGGCTATGG CAGAGGTGCC AAAGACTCAT AGCTGTCTCT TGGATAAAGA TCTTGCAATT TCATAATAAA TATTAATCAT TTTCGATTAA
CTTTTCTGTC TTGCACTGTG AATCACCTGT CTATAATTAC CAGTGTAATT CAATAGTTGT GACAAAAGAG CAAACCCACT ATACCAAGGA TCAACACAGG ATTTTCATGT TCCCCGAACA TTATAAACAA AGGCTACATT AGCCCCTCAA GGATAAGGAT TGTGGTTGGT CAAGTGTGAG GCAGTGTCTC GGAAATTGAA ACTATAATTT CAATACAGAT TATCTAGAAA CCTCTGTGTG AAATGTATGT
ACCTACAGCT GAGAATGAAA CGCAAGCCAG AAAGACACTA CATTGTGATG ATTAATTTAC GATGTGTATG ACAGATGAGC TGTGAGGCGT AAAAGTCGTC CATGAGACCA CCATCATGTG TCTAGTTCAT CTGGTCGGAC TGTAAAGCTC CAGTATGTTG CCCCAAAGTA TGGGAGACCC CCAAACCCAG CAACTGGAAC TTCTCAAAAG CAGTTTAGCA TGATAATTTG CTCATGTTTT AT
GTGACCCCCG CAATAGGCGT ATGTTTCACA TTGTGTTTAA CTGATAGCAA CAGACATTCC TTGTTGGGAC CTACGACTGT TATGTTGCCC CTGCCAATCA GTAGGTTTTC GAGACATTTG ACAGCTTTTT AGGCGAAACT TGTGTCGGAA AGCCTGAAAA TCACTTGCTC CCGAAGGCTG AGGATGTGAA TACAGAGAGA AAGGAGGAAA AATCTACTGT CTAAAGTTTA TGCTTTTCAA
TCACATACAT TGACAATTAC CCTTAGAGAT AAATAAAAAA CTGGGAAGCC TGGGGAGAGG TGGCGAGTTT ATATTTGTCG TACCACCTGG ACTACTGAGT TCTTCAGGGC CTCAAGGGCA ATCTGCAAAA AATAACTTCA GGCGTGCAGG TTGATCCAGG AAGAACCTCT ATGGAGGAGC TAACACTACA TCTGAAA
TGAGACCAAA GCTGTTGCTT ACATAAAAGA GCAGGCCTTT CTAACTCTGG ACTTTGATCA CTGCTTCAGA CAAAGGAGGT TAGGAAAGAA GCCGCTTGGG CTGTGAATGT GCAATCGGAG GTAGGGACTG CCTTAGGATC AGCAGCAATG AAGCTCCCAA GCGAAGCCAT TAAAATATTC GCTGAGCAGA
AAGACCAAGT CTGAGTGAGA GACAAGCAAT GGAGCTCTCA AGGGACAGAG CCAGATGTTT TGCAGAGCAC GGAAGGACAG GACCCTTTTT CCACTGTCCT AAGTGACAAA CCAGTGTCTA TGACCCTCCT CACCATTAGT CGTTGATGGG ACCAGAGTGT GGAGAACTTT TCTGGAGCTG TGTAATAGAA
ACCTATAAGA AGTTACAGGC TTCCAAAACA GCTTTGGAGT ACAGGTGTCT TTTTGGTGTG TGTCCAGAGC ATTCTGGGGA TGCAATGCCT GATCCTGTGC ATCACGTTTA GAGGACCACA GGGAATCCAG TATTACTGTG GAGTGGAGCA GAGAAGGCAC ATGCAACAAT AAGAAAGCTG ATAAACCTAT
GGACCAACCC 60 CCAAGAAAGG 120 AAAAGCAAAG 180 CAGTTAAGAC 240 GAGCTGGGTG 300 CGTGCTGTCT 360 TTCCTCCAGT 420 CTTACGTTTG 480 CTAAGGAGTG 540 TGGTGAATGG 600 TGTGCAATGA 660 CCTGGGCACC 720 TTCATGGCTA 780 AAGACAGGTA 840 GTGCACTTCC 900 TTCTTGCCTT 960 TAAAGGAAAG 1020 AGTTGAAGGC 1080 GAATAAATTT 1140
TTTGTTGGGG AGAAACCAGG GTATTTTTGC AGATTTTTTC GAGGTGGTTA
AAGGATAAGC AAATTTCTGA AGTTCCTCCC TTTTCATTTT GGTTGGTCTT
CTGCGTTTCA AACTGAGTTT TATACCAGCT CTCCAGGGAG AAGCAGTGTT
AAAAGACTGG TAAACCTGGC TCCTTACAGC GGTAAATCAA AGAAGATCTA
TCAACTGGAT AACTCTTTTA CGTTCTGATT TTAACCTCTT TTTTTTTTCA
840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220
C4BPB A12 ATTCTGTCTT AGACGGGCTG GTAATGACAG GCAAAAAGAA CAGTTCCTTG AATTCCAGCC TATGGTTGCG GGACAATAGC TATCAAGGGC GGATAACACC AGAGTTCAGT CCACTACATC TCCCTTTCCC TTTTGAAGGA CTACTTAGTG AGTCTGCAAG TCAGGAGAGT TGGCATGACA AAAATTGTTG TCTTCTTGGT
A19 TTGGCCCATG TTGCATTCCC CCATCCCTTT TCCTGTCGTA CTCTAGAGAG
60 120 180 240 300
AACCAGGTGT CTGAGCTGGG T T T T T T G G T G TGCGTGCTGT ACTGTCCAGA GCTTCCTCCA AGATTCTGGG GACTTACGTT TTTGCAATGC CTCTAAGGAG CTGATCCTGT GCTGGTGAAT AAATCACGTT TATGTGCAAT TAGAGGACCA CACCTGGGCA CTGGGAATCC AGTTCATGGC GTTATTACTG TGAAGACAGG GGGAGTGGAG CAGTGCACTT GTGAGAAGGC ACTTCTTGCC TTATGCAACA ATTAAAGGAA TGAAGAAAGC TGAGTTGAAG AAATAAACCT ATGAATAAAT
TGAATTCCAG CTTATGGTTG GTGGACAATA TGTATCAAGG TGGGATAACA GGAGAGTTCA GACCACTACA CCTCCCTTTC TATTTTGAAG TACTACTTAG CCAGTCTGCA TTTCAGGAGA AGTGGCATGA GCAAAATTGT TTTCTTCTTG
CCTGGGGAGA CGTGGCGAGT GCATATTTGT GCTACCACCT CCACTACTGA GTTCTTCAGG TCCTCAAGGG CCATCTGCAA GAAATAACTT TGGGCGTGCA AGTTGATCCA GTAAGAACCT CAATGGAGGA TGTAACACTA GTTCTGAAA
GGACTTTGAT TTCTGCTTCA CGCAAAGGAG GGTAGGAAAG GTGCCGCTTG GCCTGTGAAT CAGCAATCGG AAGTAGGGAC CACCTTAGGA GGAGCAGCAA GGAAGCTCCC CTGCGAAGCC GCTAAAATAT CAGCTGAGCA
CACCAGATGT 360 GATGCAGAGC 420 GTGGAAGGAC 480 AAGACCCTTT 540 GGCCACTGTC 600 GTAAGTGACA 660 AGCCAGTGTC 720 TGTGACCCTC 780 TCCACCATTA 840 TGCGTTGATG 900 AAACCAGAGT 960 ATGGAGAACT 1020 TCTCTGGAGC 1080 GATGTAATAG 1140
The first five nucleotides in each exon are underlined to indicate the intron-exon boundaries. The methionine initiation codon (ATG), the termination codons (TAA) and the putative polyadenylation signals (AATAAA) are indicated. Multiple transcription start sites are indicated in bold and underlined.
Genomic structure The human C4BPA and C4BPB genes are arranged in tandem, with the 5' end of the C4BPA gene located 4172 bp downstream of the 3' end of the C4BPB gene. The C4BPA gene spans over 40 kb of DNA and is composed of 12 exons, ranging from 186 to 425 bp. C4BPA introns vary from 167 bp to approximately 9 kb. The C4BPB gene spans more than 10 kb of DNA and is composed of 8 exons. The C4BPB gene is transcribed in human liver from two different promoters, producing two transcripts of similar size, denoted A12 and A19, that differ in their 5' untranslated sequences.
[Figure: exon maps of the C4BPB and C4BPA genes.]
Accession numbers Human Bovine24 Rabbit25 Rat26
Mouse Guinea-pig AM67^^ Pig ApoR^o
C4BPA M31452 L05546-54 [C4BPAL1^^) X81360-62 (C45PAL223) Z31693 Z35490 Z50051 Ml 71222^ U75654 L06820 J50773
C4BPB LI1244-46 M29964 Z31694 Z50052 Z2194428
Deficiency C4BP deficiency should favour C3 consumption through uncontrolled activation of the classical pathway. Only one case of primary C4BP deficiency has been reported; the patient presented with atypical Behçet's disease complicated by angioedema.
Polymorphic variants
C4BPα: Isoelectric focusing after neuraminidase treatment resolves three variants: C4-bp 1 (pI = 6.65), C4-bp 2 (pI = 6.60) and C4-bp 3 (pI = 6.75), with allele frequencies of C4BP*1 (0.986), C4BP*2 (0.010) and C4BP*3 (0.004). T1292C; Y357H. Associated genetic markers: D1S3704 (CA repeat).
C4BPβ: C818T (A19 numbering); the allele frequencies are T (0.16) and C (0.84). G to A at intron 4, position +3; the allele frequencies are G (0.47) and A (0.53). Associated genetic markers: C4BPB (CA repeat).
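Allele frequencies such as those quoted for the C818T variant, T (0.16) and C (0.84), can be turned into expected genotype frequencies if Hardy-Weinberg equilibrium is assumed. That assumption is not made in the text and is introduced here purely for illustration; the function name `genotype_frequencies` is likewise hypothetical.

```python
def genotype_frequencies(p, q):
    """Expected genotype frequencies for a biallelic locus.

    Assumes Hardy-Weinberg equilibrium (an assumption of this sketch,
    not a statement from the source): p^2, 2pq and q^2.
    """
    if abs(p + q - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return {"hom_p": p * p, "het": 2 * p * q, "hom_q": q * q}


if __name__ == "__main__":
    # C818T (A19 numbering): T = 0.16, C = 0.84
    expected = genotype_frequencies(0.16, 0.84)
    for genotype, freq in expected.items():
        print(genotype, round(freq, 4))
```

Under that assumption roughly a quarter of individuals would be expected to be C/T heterozygotes; observed genotype counts would be needed to test whether the assumption holds.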
References
1 Hillarp, A. et al. (1989) FEBS Lett. 259, 53-56.
2 Perkins, J. et al. (1986) Biochem. J. 233, 799-807.
3 Barlow, P.N. et al. (1993) J. Mol. Biol. 232, 268-284.
4 Scharfstein, J. et al. (1978) J. Exp. Med. 148, 207-222.
5 Dahlback, B. et al. (1983) Proc. Natl Acad. Sci. USA 80, 3461-3465.
6 Sanchez-Corral, P. et al. (1995) J. Immunol. 155, 4030-4036.
7 Criado Garcia, O. et al. (1995) J. Immunol. 155, 4037-4043.
8 Fujita, T. et al. (1978) J. Exp. Med. 148, 1044-1051.
9 Accardo, P. et al. (1996) J. Immunol. 157, 4935-4939.
10 Garcia de Frutos, P. et al. (1995) J. Biol. Chem. 270, 26950-26955.
11 Hardig, Y. et al. (1993) J. Biol. Chem. 268, 3033-3036.
12 Arenzana, N. et al. (1995) Biochem. J. 308, 613-621.
13 Arenzana, N. et al. (1996) J. Immunol. 156, 168-175.
14 Chung, L.P. et al. (1985) Biochem. J. 230, 133-141.
15 Hillarp, A. et al. (1990) Proc. Natl Acad. Sci. USA 87, 1183-1187.
16 Rodriguez de Cordoba, S. et al. (1991) J. Exp. Med. 173, 1073-1082.
17 Hillarp, A. et al. (1993) J. Biol. Chem. 268, 15017-15023.
18 Pardo-Manuel, F. et al. (1990) Proc. Natl Acad. Sci. USA 87, 4529-4532.
19 Seldin, M.F. et al. (1988) J. Exp. Med. 167, 688-693.
20 Andersson, A. et al. (1990) Somat. Cell. Mol. Genet. 16, 493-500.
21 Aso, T. et al. (1991) Biochem. Biophys. Res. Commun. 174, 222-227.
22 Sanchez-Corral, P. et al. (1993) Genomics 17, 185-193.
23 Pardo-Manuel de Villena, F. et al. (1995) Immunogenetics 41, 139-143.
24 Hillarp, A. et al. (1994) J. Immunol. 153, 4190-4199.
25 Garcia de Frutos, P. et al. (1995) Biochim. Biophys. Acta 1261, 285-289.
26 Hillarp, A. et al. (1997) J. Immunol. 158, 1315-1323.
27 Kristensen, T. et al. (1987) Biochemistry 26, 4668-4674.
28 Rodriguez de Cordoba, S. et al. (1994) Genomics 21, 501-509.
29 Foster, J.A. et al. (1997) J. Biol. Chem. 272, 12714-12722.
30 Cooper, S.T. et al. (1992) Biochemistry 31, 12328-12336.
31 Trapp, R.G. et al. (1987) J. Rheumatol. 14, 135-138.
32 Rodriguez de Cordoba, S. et al. (1987) Immunogenetics 25, 267-268.
33 Morboeuf, O. et al. (1998) Br. J. Haematol. 101, 10-15.
Factor H
Other names β1H, FH
Richard G. DiScipio, La Jolla Institute for Experimental Medicine, La Jolla, CA, USA
Physicochemical properties Factor H is synthesized as a single-chain molecule of 1231 amino acids including an 18 amino acid leader sequence.
Mature protein:
pI (predicted) 5.7-6.2 (observed) 6.5-6.75 (after neuraminidase treatment)
Mr (K) 155
N-linked glycosylation sites
Potential                9 (217, 529, 718, 802, 822, 882, 911, 1029, 1095)
Known to be occupied     5 (529, 802, 822, 882, 911)
Known to be unoccupied   1 (217)
Structure The tandem array of CCP modules gives rise to an elongated, flexible molecule which is creased at least once. The contour length is 49.5 nm and the cross-section thickness is 3.4 nm. The 5th and 16th CCP modules have been solved by NMR. The general structure of a CCP module is an ellipsoid with a maximal length of 3.8 nm, consisting of five β strands linked by two overlapping disulfide bonds. A pair of CCP modules (15th and 16th) has also been solved, and the data show that a wide range of twist angles between these modules is possible, but only a much more limited range of tilt angles.
Function Factor H controls the activity of the alternative pathway C3/C5 convertase by competing with factor B for C3b binding, by displacing the Bb subunit from the convertase, and by serving as a cofactor for factor I to mediate the cleavage of C3b to iC3b. Factor H and thrombin-treated factor H are reported to be chemotactic for monocytes. Factor H also serves as an adhesion protein for neutrophils and a secretagogue of IL-1β from monocytes. A truncated form of factor H, consisting of the first 7 of 20 modules, supports adhesion of epithelial and fibroblast cell lines by displaying the tripeptide sequence RGD found in the fourth module.
Tissue distribution Serum protein: ~550 µg/ml. Primary site of synthesis: liver. Secondary sites: monocytes, endothelial cells, fibroblasts and myoblasts.
Regulation of expression The synthesis of factor H by fibroblast, monocyte, endothelial and cultured liver cells is augmented by IFNγ.
Protein sequence
MRLLAKIICL MLWAICVAED CNELPPRRNT EILTGSWSDQ TYPEGTQAIY    50
KCRPGYRSLG NVIMVCRKGE WVALNPLRKC QKRPCGHPGD TPFGTFTLTG   100
GNVFEYGVKA VYTCNEGYQL LGEINYRECD TDGWTNDIPI CEVVKCLPVT   150
APENGKIVSS AMEPDREYHF GQAVRFVCNS GYKIEGDEEM HCSDDGFWSK   200
EKPKCVEISC KSPDVINGSP ISQKIIYKEN ERFQYKCNMG YEYSERGDAV   250
CTESGWRPLP SCEEKSCDNP YIPNGDYSPL RIKHRTGDEI TYQCRNGFYP   300
ATRGNTAKCT STGWIPAPRC TLKPCDYPDI KHGGLYHENM RRPYFPVAVG   350
KYYSYYCDEH FETPSGSYWD HIHCTQDGWS PAVPCLRKCY FPYLENGYNQ   400
NHGRKFVQGK SIDVACHPGY ALPKAQTTVT CMENGWSPTP RCIRVKTCSK   450
SSIDIENGFI SESQYTYALK EKAKYQCKLG YVTADGETSG SIRCGKDGWS   500
AQPTCIKSCD IPVFMNARTK NDFTWFKLND TLDYECHDGY ESNTGSTTGS   550
IVCGYNGWSD LPICYERECE LPKIDVHLVP DRKKDQYKVG EVLKFSCKPG   600
FTIVGPNSVQ CYHFGLSPDL PICKEQVQSC GPPPELLNGN VKEKTKEEYG   650
HSEVVEYYCN PRFLMKGPNK IQCVDGEWTT LPVCIVEEST CGDIPELEHG   700
WAQLSSPPYY YGDSVEFNCS ESFTMIGHRS ITCIHGVWTQ LPQCVAIDKL   750
KKCKSSNLII LEEHLKNKKE FDHNSNIRYR CRGKEGWIHT VCINGRWDPE   800
VNCSMAQIQL CPPPPQIPNS HNMTTTLNYR DGEKVSVLCQ ENYLIQEGEE   850
ITCKDGRWQS IPLCVEKIPC SQPPQIEHGT INSSRSSQES YAHGTKLSYT   900
CEGGFRISEE NETTCYMGKW SSPPQCEGLP CKSPPEISHG VVAHMSDSYQ   950
YGEEVTYKCF EGFGIDGPAI AKCLGEKWSH PPSCIKTDCL SLPSFENAIP  1000
MGEKKDVYKA GEQVTYTCAT YYKMDGASNV TCINSRWTGR PTCRDTSCVN  1050
PPTVQNAYIV SRQMSKYPSG ERVRYQCRSP YEMFGDEEVM CLNGNWTEPP  1100
QCKDSTGKCG PPPPIDNGDI TSFPLSVYAP ASSVEYQCQN LYQLEGNKRI  1150
TCRNGQWSEP PKCLHPCVIS REIMENYNIA LRWTAKQKLY SRTGESVEFV  1200
CKRGYRLSSR SHTLRTTCWD GKLEYPTCAK R

The signal sequence is underlined and the N-linked glycosylation sites are indicated (N). N217, though a potential N-linked glycosylation site, is known not to be glycosylated and is therefore not indicated.
Protein modules
1-18        leader peptide
1-62        CCP1
63-123      CCP2
124-187     CCP3
188-244     CCP4
245-302     CCP5
303-367     CCP6
368-425     CCP7
426-487     CCP8
488-547     CCP9
548-606     CCP10
607-668     CCP11
669-731     CCP12
732-790     CCP13
791-847     CCP14
848-908     CCP15
909-967     CCP16
968-1026    CCP17
1027-1085   CCP18
1086-1146   CCP19
1147-1213   CCP20

CCP1, 2 and 3 are required for factor I cofactor and Bb decay acceleration activities²¹,²². CCP4 augments factor I cofactor activity and is required for Bb decay acceleration²¹,²². CCP7 is involved in heparin and streptococcus M protein binding²³⁻²⁵, and CCP20 plays a role in heparin binding²⁶. Three C3b-binding sites are created by CCP clusters: 1-4, 6-10 and 16-20²⁷.
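The CCP boundaries listed above can be used to translate the three C3b-binding clusters (CCP 1-4, 6-10 and 16-20) into residue spans. The sketch below is illustrative only. It assumes the module coordinates are inclusive and use mature-protein numbering, an inference from the fact that CCP20 ends at residue 1213, which equals 1231 minus the 18-residue leader; the text does not state this explicitly. The helper names `module_for_residue` and `cluster_span` are hypothetical.

```python
# Factor H CCP module boundaries as listed above (inclusive ranges).
CCP_RANGES = [
    (1, 62), (63, 123), (124, 187), (188, 244), (245, 302),
    (303, 367), (368, 425), (426, 487), (488, 547), (548, 606),
    (607, 668), (669, 731), (732, 790), (791, 847), (848, 908),
    (909, 967), (968, 1026), (1027, 1085), (1086, 1146), (1147, 1213),
]


def module_for_residue(pos):
    """Return the CCP module label containing a residue position."""
    for index, (start, end) in enumerate(CCP_RANGES, start=1):
        if start <= pos <= end:
            return "CCP%d" % index
    raise ValueError("position %d lies outside the listed modules" % pos)


def cluster_span(first, last):
    """Return the residue span covered by modules CCP<first>..CCP<last>."""
    return CCP_RANGES[first - 1][0], CCP_RANGES[last - 1][1]


if __name__ == "__main__":
    for first, last in [(1, 4), (6, 10), (16, 20)]:
        start, end = cluster_span(first, last)
        print("CCP%d-%d: residues %d-%d" % (first, last, start, end))
```

Positions quoted elsewhere in the entry, such as glycosylation sites, should only be passed to `module_for_residue` once it is confirmed that they use the same numbering.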
Chromosomal location Human²⁸,²⁹: chromosome 1q32. The gene for factor H is within the RCA cluster of genes.
Telomere ... MCP CR1.CR2.DAF.C4BP .FH ... Centromere
Mouse: chromosome 1, 74.1 cM.
cDNA sequence AATTCTTGGA AAAGATCCAA TGTAGCAGAA CTGGTCTGAC TAGATCTCTT ATTAAGGAAA TACCCTTACA GGGGTATCAA TGATATTCCT AATTGTCAGT TGTATGTAAC TTTTTGGAGT AAATGGATCT ATGTAACATG GCGTCCGTTG CTACTCACCT TGGTTTTTAT TGCTCCGAGA TCATGAGAAT CTGTGATGAA AGATGGATGG TGGATATAAT CCATCCTGGC GTCTCCTACT GAATGGGTTT ATGCAAACTA
AGAGGAGAAC AAAATGAGAC GATTGCAATG CAAACATATC GGAAATGTAA TGTCAGAAAA GGAGGAAATG TTGCTAGGTG ATATGTGAAG AGTGCAATGG TCAGGCTACA AAAGAGAAAC CCTATATCTC GGTTATGAAT CCTTCATGTG TTAAGGATTA CCTGCAACCC TGTACCTTGA ATGCGTAGAC CATTTTGAGA TCGCCAGCAG CAAAATCATG TACGCTCTTC CCCAGATGCA ATTTCTGAAT GGATATGTAA
TGGACGTTGT TTCTAGCAAA AACTTCCTCC CAGAAGGCAC TAATGGTATG GGCCCTGTGG TGTTTGAATA AGATTAATTA TTGTGAAGTG AACCAGATCG AGATTGAAGG CAAAGTGTGT AGAAGATTAT ACAGTGAAAG AAGAAAAATC AACACAGAAC GGGGAAATAC AACCTTGTGA CATACTTTCC CTCCGTCAGG TACCATGCCT GAAGAAAGTT CAAAAGCGCA TCCGTGTCAA CTCAGTATAC CAGCAGATGG
GAACAGAGTT GATTATTTGC AAGAAGAAAT CCAGGCTATC CAGGAAGGGA ACATCCTGGA TGGTGTAAAA CCGTGAATGT TTTACCAGTG GGAATACCAT AGATGAAGAA GGAAATTTCA TTATAAGGAG AGGAGATGCT ATGTGATAAT TGGAGATGAA AGCCAAATGC TTATCCAGAC AGTAGCTGTA AAGTTACTGG CAGAAAATGT TGTACAGGGT GACCACAGTT AACATGTTCC ATATGCCTTA TGAAACATCA
AGCTGGTAAA CTTATGTTAT ACAGAAATTC TATAAATGCC GAATGGGTTG GATACTCCTT GCTGTGTATA GACACAGATG ACAGCACCAG TTTGGACAAG ATGCATTGTT TGCAAATCCC AATGAACGAT GTATGCACTG CCTTATATTC ATCACGTACC ACAAGTACTG ATTAAACATG GGAAAATATT GATCACATTC TATTTTCCTT AAATCTATAG ACATGTATGG AAATCAAGTA AAAGAAAAAG GGATCAATTA
TGTCCTCTTA GGGCTATTTG TGACAGGTTC GCCCTGGATA CTCTTAATCC TTGGTACTTT CATGTAATGA GATGGACCAA AGAATGGAAA CAGTACGGTT CAGACGATGG CAGATGTTAT TTCAATATAA AATCTGGATG CAAATGGTGA AGTGTAGAAA GCTGGATACC GAGGTCTATA ACTCCTATTA ATTGCACACA ATTTGGAAAA ACGTTGCCTG AGAATGGCTG TAGATATTGA CGAAATATCA GATGTGGGAA
60 12 0 180 240 3 00 3 60 42 0 480 54 0 600 6 60 72 0 7 80 840 900 960 102 0 1080 1140 12 0 0 12 60 13 2 0 13 80 1440 150 0 1560
cDNA sequence AGATGGATGG TGCCAGAACT CCATGATGGT TGGTTGGTCT ACACTTAGTT CTGCAAACCA GTCTCCTGAC CCTCAATGGG ATATTATTGC AGAGTGGACA ACTTGAACAT ATTCAATTGC AGTATGGACC AAATTTAATT CATAAGGTAC ATGGGATCCA GATTCCCAAT TGTTCTTTGC AAGATGGCAG AGAACACGGA ATTGAGTTAT CATGGGAAAA GATTTCTCAT GTACAAATGT AAAATGGTCT AAATGCCATA CACTTGTGCA ATGGACAGGA TGCTTATATA ATGTAGGAGC GACGGAACCA CAATGGGGAC CCAATGCCAG ATGGTCAGAA TTATAACATA AGTTGAATTT AACATGTTGG AAGTGCACAC TATTGTTTTA TATAAGCTGA
TCAGCTCAAC AAAAATGACT TATGAAAGCA GATTTACCCA CCTGATCGCA GGATTTACAA CTCCCAATAT AATGTTAAGG AATCCTAGAT ACTTTACCAG GGCTGGGCCC TCAGAATCAT CAACTTCCCC ATACTTGAGG AGATGTAGAG GAAGTGAACT TCTCACAATA CAAGAAAATT TCAATACCAC ACCATTAATT ACTTGTGAGG TGGAGTTCTC GGTGTTGTAG TTTGAAGGTT CACCCTCCAT CCCATGGGAG ACATATTACA AGGCCAACAT GTGTCGAGAC CCTTATGAAA CCTCAATGCA ATTACTTCAT AACTTGTATC CCACCAAAAT GCATTAAGGT GTGTGTAAAC GATGGGAAAC CTTTATTCAG CTCCTTTTTA GACCGGTGGC
continued CCACGTGCAT TCACATGGTT ATACTGGAAG TATGTTATGA AGAAAGACCA TAGTTGGACC GTAAAGAGCA AAAAAACGAA TTCTAATGAA TGTGTATTGT AGCTTTCTTC TTACAATGAT AGTGTGTGGC AACATTTAAA GAAAAGAAGG GCTCAATGGC TGACAACCAC ATCTAATTCA TCTGTGTTGA CATCCAGGTC GTGGTTTCAG CACCTCAGTG CTCACATGTC TTGGAATTGA CATGCATAAA AGAAGAAGGA AAATGGATGG GCAGAGACAC AGATGAGTAA TGTTTGGGGA AAGATTCTAC TCCCGTTGTC AACTTGAGGG GCTTACATCC GGACAGCCAA GGGGATATCG TGGAGTATCC AACTTTAGTA TTCATACGTA TCTCTT
TAAATCTTGT TAAGCTGAAT CACCACTGGT AAGAGAATGC GTATAAAGTT TAATTCCGTT AGTACAATCA AGAAGAATAT GGGACCTAAT GGAGGAGAGT CCCTCCTTAT TGGACACAGA AATAGATAAA AAACAAGAAG ATGGATACAC ACAAATACAA ACTGAATTAT GGAAGGAGAA AAAAATTCCA TTCACAAGAA GATATCTGAA TGAAGGCCTT AGACAGTTAT TGGGCCTGCA AACAGATTGT TGTGTATAAG AGCCAGTAAT CTCCTGTGTG ATATCCATCT TGAAGAAGTG AGGAAAATGT AGTATATGCT TAACAAGCGA GTGTGTAATA ACAGAAGCTT TCTTTCATCA AACTTGTGCA TTAAATCAGT AAATTTTGGA
GATATCCCAG GACACATTGG TCCATAGTGT GAACTTCCTA GGAGAGGTGT CAGTGCTACC TGTGGTCCAC GGACACAGTG AAAATTCAAT ACCTGTGGAG TACTATGGAG TCAATTACGT CTTAAGAAGT GAATTCGATC ACAGTCTGCA TTATGCCCAC CGGGATGGAG GAAATTACAT TGTTCACAAC AGTTATGCAC GAAAATGAAA CCTTGTAAAT CAGTATGGAG ATTGCAAAAT CTCAGTTTAC GCGGGTGAGC GTAACATGCA AATCCGCCCA GGTGAGAGAG ATGTGTTTAA GGGCCCCCTC CCAGCTTCAT ATAACATGTA TCCCGAGAAA TATTCGAGAA CGTTCTCACA AAAAGATAGA TCTCAATTTC TTAATTTGTG
TATTTATGAA ACTATGAATG GTGGTTACAA AAATAGATGT TGAAATTCTC ACTTTGGATT CTCCTGAACT AAGTGGTGGA GTGTTGATGG ATATACCTGA ATTCAGTGGA GTATTCATGG GCAAATCATC ATAATTCTAA TAAATGGAAG CTCCACCTCA AAAAAGTATC GCAAAGATGG CACCTCAGAT ATGGGACTAA CAACATGCTA CTCCACCTGA AAGAAGTTAC GCTTAGGAGA CTAGCTTTGA AAGTGACTTA TTAATAGCAG CAGTACAAAA TACGTTATCA ATGGAAACTG CACCTATTGA CAGTTGAGTA GAAATGGACA TTATGGAAAA CAGGTGAATC CATTGCGAAC ATCAATCATA ATTTTTTATG AAAATGTAAT
162 0 168 0 17 4 0 180 0 18 60 192 0 19 80 2 040 2100 2160 2 22 0 22 80 2340 2 4 00 2 4 60 2 52 0 2 5 80 2 640 27 00 27 60 2 82 0 2 880 2 94 0 3 000 3 06 0 312 0 3180 32 4 0 3 3 00 3 3 60 3 42 0 3 480 3 540 3 6 00 3 660 3 72 0 3780 3 84 0 3 9 00
The initiation methionine (ATG) and termination codon (TAG) are indicated.
Genomic structure The structure of the human gene is unknown. The murine gene spans 120 kb and is composed of 22 exons. The 5' untranslated region and the leader peptide are encoded by the first exon. Of the 20 CCP modules comprising factor H, 19 are each encoded by a single exon; only CCP module 2 is encoded by two exons. Exon sizes vary between 77 and 210 bp, whereas introns show a larger range of sizes, from 86 bp to 26 kb.
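The exon count quoted for the murine gene follows from the exon-to-module assignments given in the same paragraph. A short Python check of that arithmetic is sketched below; the constant and function names are hypothetical and serve only to label the three contributions.

```python
# Exon assignments for the murine factor H gene, as described in the text.
LEADER_EXONS = 1          # 5' untranslated region plus leader peptide
SINGLE_EXON_MODULES = 19  # CCP modules each encoded by a single exon
SPLIT_MODULE_EXONS = 2    # CCP module 2 is encoded by two exons


def total_exons():
    """Sum the exon contributions described for the murine gene."""
    return LEADER_EXONS + SINGLE_EXON_MODULES + SPLIT_MODULE_EXONS


if __name__ == "__main__":
    print(total_exons())
```

The total agrees with the 22 exons stated for the gene.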
Accession numbers
Human   Y00716
Mouse   M12660
Deficiency Autosomal recessive. Uncontrolled activation of the alternative pathway results in reduced levels of C3, accompanied by meningococcal disease, glomerulonephritis, chronic hypocomplementemic renal disease and systemic lupus erythematosus.
Polymorphic variants Five variant gene frequencies have been described in white populations, only two of which are common: FH*1 (0.6-0.69) and FH*2 (0.30-0.4). Among Japanese donors there are two common and two rare alleles, along with a null allele. Two forms, φ1 and φ2, differ in affinity for phenyl-Sepharose, with φ1 being more hydrophilic and φ2 more hydrophobic. The molecular difference between these two forms is unknown; therefore, these may or may not be genetic variants.
References
1 Ripoche, J. et al. (1988) Biochem. J. 249, 593-602.
2 Kristensen, T. and Tack, B.F. (1986) Proc. Natl Acad. Sci. USA 83, 3963-3967.
3 Sim, R.B. and DiScipio, R.G. (1982) Biochem. J. 205, 285-293.
4 DiScipio, R.G. (1992) J. Immunol. 149, 2592-2599.
5 Norman, D.G. et al. (1991) J. Mol. Biol. 219, 717-725.
6 Barlow, P.N. et al. (1992) Biochemistry 31, 3626-3634.
7 Barlow, P.N. et al. (1993) J. Mol. Biol. 232, 268-284.
8 Whaley, K. and Ruddy, S. (1976) J. Exp. Med. 144, 1147-1163.
9 Kazatchkine, M.D. et al. (1979) J. Immunol. 122, 75-81.
10 Weiler, J.M. et al. (1976) Proc. Natl Acad. Sci. USA 73, 3268-3272.
11 Nabil, K. et al. (1997) Biochem. J. 326, 377-381.
12 Ohtsuka, H. et al. (1993) Immunology 80, 140-145.
13 Iferroudjene, D. et al. (1991) Eur. J. Immunol. 21, 967-972.
14 Hellwage, J. et al. (1997) Biochem. J. 326, 321-327.
15 Vik, D.P. et al. (1990) J. Biol. Chem. 265, 3193-3201.
16 Lappin, D.F. et al. (1992) Biochem. J. 281, 437-442.
17 Legoedec, J. et al. (1995) Eur. J. Immunol. 25, 3460-3466.
18 Guc, D. et al. (1993) Rheum. Int. 13, 139-146.
19 Vik, D.P. (1996) Scand. J. Immunol. 44, 215-222.
20 Brooimans, R.A. et al. (1989) J. Immunol. 142, 2024-2030.
21 Gordon, D.L. et al. (1995) J. Immunol. 155, 348-356.
22 Kuhn, S. and Zipfel, P.F. (1996) Eur. J. Immunol. 26, 2383-2387.
23 Blackmore, T.K. et al. (1998) Infect. Immun. 66, 1427-1431.
24 Blackmore, T.K. et al. (1992) J. Immunol. 157, 5422-5427.
25 Kotarsky, H. et al. (1998) J. Immunol. 160, 3349-3354.
26 Blackmore, T.K. et al. (1998) J. Immunol. 160, 3342-3348.
27 Sharma, J.K. and Pangburn, M.K. (1996) Proc. Natl Acad. Sci. USA 93, 10996-11001.
28 Rodriguez de Cordoba, S. and Rubinstein, P. (1987) Immunogenetics 25, 267-268.
29 Rodriguez de Cordoba, S. and Rubinstein, P. (1984) J. Immunol. 132, 1906-1908.
30 Vik, D.P. et al. (1988) J. Biol. Chem. 263, 16720-16724.
31 Fijen, C.A. et al. (1996) Clin. Exp. Immunol. 105, 511-516.
32 Nielsen, H.E. et al. (1989) Scand. J. Immunol. 30, 711-718.
33 Ault, B.H. et al. (1997) J. Biol. Chem. 272, 25168-25175.
34 Zhou, M. and Larsen, B. (1990) Hum. Hered. 40, 55-57.
35 Day, A.J. et al. (1988) Immunogenetics 27, 211-214.
36 Nakamura, S. et al. (1990) Hum. Hered. 40, 121-126.
37 Ripoche, J. et al. (1984) Biochem. J. 221, 89-96.
Part 6 Cell Surface Receptors
C1qRp
Andrea J. Tenner, Department of Molecular Biology and Biochemistry, University of California, Irvine, CA, USA
Other names C1q receptor that enhances phagocytosis, human C1q/MBL/SPA receptor.
Physicochemical properties C1qRp is synthesized as a single-chain molecule which in humans is a 652 amino acid precursor protein that includes a 21 amino acid leader sequence.
pI (predicted) 5.24
Mr (K) predicted 66.5; observed 126 (reduced), 100 (unreduced)
N-linked glycosylation sites: potential 1 (325)
Structure Not determined.
Function Multivalent interaction of this receptor with its known ligands, C1q, MBL and SP-A, results in the enhancement of phagocytosis of suboptimally opsonized particles and/or cellular debris.
Tissue distribution Myeloid cells, endothelial cells, platelets.
Regulation of expression Unknown.
Protein sequence (human)
MATSMGLLLL LLLLLTQPGA GTGADTEAVV CVGTACYTAH SGKLSAAEAQ   50
NHCNQNGGNL ATVKSKEEAQ HVQRVLAQLL RREAALTARM SKFWIGLQRE  100
KGKCLDPSLP LKGFSWVGGG EDTPYSNWHK ELRNSCISKR CVSLLLDLSQ  150
PLLPNRLPKW SEGPCGSPGS PGSNIEGFVC KFSFKGMCRP LALGGPGQVT  200
YTTPFQTTSS SLEAVPFASA ANVACGEGDK DETQSHYFLC KEKAPDVFDW  250
GSSGPLCVSP KYGCNFNNGG CHQDCFEGGD GSFLCGCRPG FRLLDDLVTC  300
ASRNPCSSSP CRGGATCVLG PHGKNYTCRC PQGYQLDSSQ LDCVDVDECQ  350
DSPCAQECVN TPGGFRCECW VGYEPGGPGE GACQDVDECA LGRSPCAQGC  400
TNTDGSFHCS CEEGYVLAGE DGTQCQDVDE CVGPGGPLCD SLCFNTQGSF  450
HCGCLPGWVL APNGVSCTMG PVSLGPPSGP PDEEDKGEKE GSTVPRAATA  500
SPTRGPEGTP KATPTTSRPS LSSDAPITSA PLKMLAPSGS SGVWREPSIH  550
HATAASGPQE PAGGDSSVAT QNNDGTDGQK LLLFYILGTV VAILLLLALA  600
LGLLVYRKRR AKREEKKEKK PQNAADSYSW VPERAESRAM ENQYSPTPGT  650
DC

The leader sequence is underlined and the potential N-linked glycosylation site is indicated (N).
Protein modules
1-21      Leader peptide
29-184    CRD
260-301   EGF
302-344   EGF
345-384   EGF-Ca2+
385-426   EGF-Ca2+
427-468   EGF-Ca2+
469-578   STP
581-605   Transmembrane domain
606-652   Intracellular domain
Chromosomal location Unknown.
cDNA sequence^ AAAGCCCTCA CCCCTTGGGG TCCCGCAGAG GCTGCTGCTC CGTGGGGACC CCACTGCAAC CGTCCAGCGA CAAGTTCTGG GAAGGGCTTC GCTCCGGAAC GCTCCTTCCC CGGAAGTAAC GGCCCTGGGG CTTGGAGGCT CGAGACTCAG CAGCTCGGGC CCACCAGGAC CCGGCTGCTG TCGTGGGGGG CCAAGGGTAC CTCCCCCTGT TGGCTATGAG
GCCTTTGTGT CCCAGCTGGG GGCCACACAG CTGACCCAGC GCCTGCTACA CAGAACGGGG GTACTGGCCC ATTGGGCTCC AGCTGGGTGG TCGTGCATCT AACCGCCTGC ATTGAGGGCT GGCCCAGGTC GTGCCCTTTG AGTCATTATT CCCCTCTGTG TGCTTTGAAG GATGACCTGG GCCACGTGCG CAGCTGGACT GCCCAGGAGT CCGGGCGGTC
CCTTCTCTGC AGCCGAGATA AGACCGGGAT CCGGGGCGGG CGGCCCACTC GCAACCTGGC AGCTCCTGAG AGCGAGAGAA GCGGGGGGGA CCAAGCGCTG CCAAGTGGTC TCGTGTGCAA AGGTGACCTA CCTCTGCGGC TCGTGTGCAA TCAGCCCCAA GGGGGGATGG TGACCTGTGC TCCTGGGACC CGAGTCAGCT GTGTCAACAC CTGGAGAGGG
GCCGGAGTGG GAAGCTCCTG GGCCACCTCC GACGGGAGCT GGGCAAGCTG CACTGTGAAG GCGGGAGGCA GGGCAAGTGC GGACACGCCT TGTGTCTCTG TGAGGGCCCC GTTCAGCTTC CACCACCCCC CAATGTAGCC GGAGAAGGCC GTATGGCTGC CTCCTTCCTC CTCTCGAAAC CCATGGGAAA GGACTGTGTG CCCTGGGGGC GGCCTGTCAG
CTGCAGCTCA TCGCGGCTGG ATGGGCCTGC GACACGGAGG AGCGCTGCCG AGCAAGGAGG GCCCTGACGG CTGGACCCTA TACTCTAACT CTGCTGGACC TGTGGGAGCC AAAGGCATGT TTCCAGACCA TGTGGGGAAG CCCGATGTGT AACTTCAACA TGCGGCTGCC CCTTGCAGGT AACTACACGT GACGTGGATG TTCCGCTGCG GATGTGGATG
CCCCTCAGCT GCTTCTCGCC TGCTGCTGCT CGGTGGTCTG AGGGCCAGAA AGGCCCAGCA GGAGGATGAG GTCTGCCGCT GGCACAAGGA TGTCCCAGCC CAGGCTCCCC GCCGGCCTCT CCAGTTCCTC GTGACAAGGA TCGACTGGGG ATGGGGGCTG GACCAGGATT CCAGCCCATG GCCGCTGCCC AATGCCAGGA AATGGTGGGT AGTGTGCTCT
60 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320
cDNA sequence GGGTCGCTCG TGAGGAGGGC TGTGGGCCCG CTGTGGCTGC TGTGTCTCTG GAGCACCGTG GGCTACACCC ACTCAAGATG CGCCACAGCT AAACAACGAT GGCCATCCTA GAAGAGGGAG TCCAGAGCGA CTGCTGAAAG TGAACTCCCC CAAACAATTG TGTTTGATGT TCTATAATGA GGGTGTGAGG ATCTAAGAGG CCTAGGATGA TCAAAGGGAA AGCACAAGTC TAACCTCTTA CTTGGGTTTA CAGGTGTTTG CACAGATACT TGTGATCAAC CCTCAGACAC CGAGCTCAGA TGAACGGGAG TCATAGTCCA CACACCAAGT TTCCTTAAAA TTTTTACAGC TTGCAAATAT
continued
CCTTGCGCCC TACGTCCTGG GGGGGCCCCC CTGCCAGGCT GGACCACCAT CCCCGCGCTG ACCACAAGTA CTGGCCCCCA GCCTCTGGCC GGCACTGACG CTCCTGCTGG GAGAAGAAGG GCTGAGAGCA TGAGGTGGCC ATTCCAAAGG TAAGTCTCCT TCCTGAAGTG TTGTTACTCC AGGCTGGGGC AAAAGGTGAG AAACTAAATC CATGTTCGGA TTGCTAAATG GGTGGCAAGG TTTGCAAAGG TGAAGTCACA TGAATTAATT ACTAACAAGG CCTGCCTGTG CAGAGGAAGC ATGATGCACT CAGTTGATGC AGGGAGCTAG TTGGGGGTAA AAAAACTGCT TTCTCCCTAT
AGGGCTGCAC CCGGGGAGGA TCTGCGACAG GGGTGCTGGC CTGGGCCCCC CAACAGCCAG GACCTTCGCT GTGGGTCCTC CCCAGGAGCC GGCAAAAGCT CCCTGGCTCT AGAAGAAGCC GGGCCATGGA CTAGAGACAC GGCACCCACA CCTTAAAGGC GAAGCTGTGT CCCTCCCTTT TAAGGGGCTC TTGCTCATGC AATTAATTAT CTGGAAACAT TGATACTGTT AGGCAGGAAG AAGCTTGAAA TAATCTACGG CATCCAAATG AAACAAATTC GCCCCGCCTC CCTGCAGAAA GTGTTTTGAA AGCATCCTGA TCAGGCAGTT GGAGGGAAGG CAAAGCCATT GATAATGCAG
CAACACAGAT CGGGACTCAG CTTGTGCTTC CCCAAATGGG CGATGAGGAG TCCCACAAGG GTCATCTGAC AGGCGTCTGG TGCAGGTGGG GCTTTTATTC GGGGCTACTG CCAGAATGCG GAACCAGTAC TAGAGTCACC TTTTTTTGAA CCCTTGGAAC GTTGGCGTGC TCAAATTCCA CCCTGAATAT TGATTAGGAT TCAATTAGGT TTCTTTACAT GACATCCTCC TGCCTCTTTA AATATGAGAA GGCTAGGGCG TACTGAGGTT AAGGACAACC CACTTCATCC GTTCCATCAG AGTTGTCATT GATTTTAAAT TGCTTAAGGA AAGAGGGAAA TAAATTATAT TCGATAGTGT
GGCTCATTTC TGCCAGGACG AACACACAAG GTCTCTTGCA GACAAAGGAG GGCCCCGAGG GCCCCCATCA AGGGAGCCCA GACTCCTCCG TACATCCTAG GTCTATCGCA GCAGACAGTT AGTCCGACAC AGCCACCATC AGACTGGACT ATGCAGGTAT CACGGTGGGG ATGTGACCAA CTTCTCTGCT TGAAATGATT AAGAAGATCT TTGCATTCCT AGAATGGCCA GTTCTTACAT AAGTTGCTTG AGAGAGGCCA ACCACACACT TGTCTTTGAG TGCCCGGAAT GCTGTTTCCT TTAAAGCATT CCTGAAGTGT ACTTTTGTTC GAGATGACTA CCTCATTTTA
ACTGCTCCTG TGGATGAGTG GGTCCTTCCA CCATGGGGCC AGAAAGAAGG GCACCCCCAA CATCTGCCCC GCATCCATCA TGGCCACACA GCACCGTGGT AGCGGAGAGC ACTCCTGGGT CTGGGACAGA CTCAGAGCTT GGAATCTTAG TTTCTACGGG ATTTCGTGAC TTCCGGATCA CACTTCCACC TGTTTCTCTT GGTTTTTTGG CCATTTCGCC GAAGTGCAAT TTCTAATAGC AAGTGCATTA GGGATTTGTT TGACTACGGA CCAGGGCAGG GCCAGTGCTC AAAGGATGTG TTAGCACAGT GGGTGGCGCA TCTGTCTCTT ACTAAAATCA AAAGTTACAT
1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420
The probable methionine initiation codon (ATG) and the termination codon (TGA) are indicated.
Genomic structure The mouse gene has two exons and one intron⁸. PCR sizing of human genomic DNA indicates that the human gene has the same structure (unpublished observations).
Accession numbers Human^ Mouse
cDNA U94333 AF081789
Genomic AF074856
Deficiency None known.
Polymorphic variants None known.
References
1 Nepomuceno, R.R. et al. (1999) J. Immunol. 162, 3583-3589.
2 Nepomuceno, R.R. et al. (1997) Immunity 6, 119-129.
3 Guan, E. et al. (1991) J. Biol. Chem. 266, 20345-20355.
4 Guan, E. et al. (1994) J. Immunol. 152, 4005-4016.
5 Tenner, A.J. et al. (1995) Immunity 3, 485-493.
6 Nepomuceno, R.R. and Tenner, A.J. (1998) J. Immunol. 160, 1929-1935.
7 Lozada, C. et al. (1995) Proc. Natl Acad. Sci. USA 92, 8378-8382.
8 Kim, T.S. et al. (1999) In preparation.
C3a receptor
Robert S. Ames, Department of Molecular Biology, SmithKline Beecham Pharmaceuticals, King of Prussia, PA, USA
Other names C3a anaphylatoxin receptor, C3aR, AZ3B.
Physicochemical properties C3a receptor is a G protein-coupled receptor of 482 amino acids characterized by the presence of an unusually large second extracellular domain.
Mr (K) predicted 53.9
N-linked glycosylation sites 2 (9, 194)
Structure Transmembrane, G protein-coupled receptor protein with seven transmembrane domains.
Function C3a receptor functions as the cell surface receptor for the anaphylatoxin C3a, the N-terminal 77 amino acid cleavage product of the α chain of C3, but not for C3a-desArg². C3a and C4a have been reported to act through the same receptor; however, this appears to be a unique property of the guinea-pig C3aR, as C4a is not an agonist of the human or mouse C3aR.
Tissue distribution Transcript for the C3aR is widely distributed in peripheral tissues and the central nervous system. Using antibodies reactive with the second extracellular domain of the C3aR, expression has been demonstrated on neutrophils, monocytes, eosinophils, astrocytes, neurons and glial cells.
Regulation of expression Unknown.
Protein sequence^'^^^
MASFSAETNS TDLLSQPWNE PPVILSMVIL SLTFLLGLPG NGLVLWVAGL   50   TMR1
KMQRTVNTIW FLHLTLADLL CCLSLPFSLA HLALQGQWPY GRFLCKLIPS  100   TMR2
IIVLNMFASV FLLTAISLDR CLVVFKPIWC QNHRNVGMAC SICGCIWVVA  150   TMR3, TMR4
CVMCIPVFVY REIFTTDNHN RCGYKFGLSS SLDYPDFYGD PLENRSLENI  200
VQPPGEMNDR LDPSSFQTND HPWTVPTVFQ PQTFQRPSAD SLPRGSARLT  250
SQNLYSNVFK PADVVSPKIP SGFPIEDHET SPLDNSDAFL STHLKLFPSA  300
SSNSFYESEL PQGFQDYYNL GQFTDDDQVP TPLVAITITR LVVGFLLPSV  350   TMR5
IMIACYSFIV FRMQRGRFAK SQSKTFRVAV VVVAVFLVCW TPYHIFGVLS  400   TMR6
LLTDPETPLG KTLMSWDHVC IALASANSCF NPFLYALLGK DFRKKARQSI  450   TMR7
QGILEAAFSE ELTRSTHCPS NNVISERNST TV
The seven transmembrane domains (TMR1-7) are underlined, N-linked glycosylation sites are indicated (N).
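Potential N-linked glycosylation sites such as positions 9 and 194 can be located by scanning a sequence for the consensus sequon N-X-S/T, where X is any residue except proline. The sketch below is illustrative only: `find_sequons` is a hypothetical helper, and a sequon is necessary but not sufficient for glycosylation (for example, a sequon in a cytoplasmic tail is not used), so its output is a list of candidates rather than of occupied sites.

```python
import re

# N followed by any residue except P, then S or T. The lookahead keeps
# overlapping sequons (e.g. N-N-S-T) from hiding one another.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of asparagines in N-X-S/T sequons.

    Whitespace between the 10-residue blocks of a sequence table is ignored.
    """
    residues = "".join(sequence.split()).upper()
    return [match.start() + 1 for match in SEQUON.finditer(residues)]


# First 20 residues of the C3a receptor: the only sequon is N9 (N-S-T).
print(find_sequons("MASFSAETNS TDLLSQPWNE"))
```

Run over a complete receptor sequence, the function also reports sequons in intracellular regions, so candidates should be filtered by membrane topology before being compared with the sites listed in an entry.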
Chromosomal location Human^12: 12p13.
cDNA sequence^'^'^
CACGAGGAGA ACAGAAGAAG AGAAAGCTCA GCAAATTTTC TTGCCATACT TCATGACTTC   60
ACTGTGGCTA AGTGTGGGGA CCAGACAGGA CTCGTGGAGA CATCCAGGTG CTGAAGCCTT  120
CAGCTACTGT CTCAGTTTTT TGAAGTTTAG CAATGGCGTC TTTCTCTGCT GAGACCAATT  180
CAACTGACCT ACTCTCACAG CCATGGAATG AGCCCCCAGT AATTCTCTCC ATGGTCATTC  240
TCAGCCTTAC TTTTTTACTG GGATTGCCAG GCAATGGGCT GGTGCTGTGG GTGGCTGGCC  300
TGAAGATGCA GCGGACAGTG AACACAATTT GGTTCCTCCA CCTCACCTTG GCGGACCTCC  360
TCTGCTGCCT CTCCTTGCCC TTCTCGCTGG CTCACTTGGC TCTCCAGGGA CAGTGGCCCT  420
ACGGCAGGTT CCTATGCAAG CTCATCCCCT CCATCATTGT CCTCAACATG TTTGCCAGTG  480
TCTTCCTGCT TACTGCCATT AGCCTGGATC GCTGTCTTGT GGTATTCAAG CCAATCTGGT  540
GTCAGAATCA TCGCAATGTA GGGATGGCCT GCTCTATCTG TGGATGTATC TGGGTGGTGG  600
CTTGTGTGAT GTGCATTCCT GTGTTCGTGT ACCGGGAAAT CTTCACTACA GACAACCATA  660
ATAGATGTGG CTACAAATTT GGTCTCTCCA GCTCATTAGA TTATCCAGAC TTTTATGGAG  720
ATCCACTAGA AAACAGGTCT CTTGAAAACA TTGTTCAGCC GCCTGGAGAA ATGAATGATA  780
GGTTAGATCC TTCCTCTTTC CAAACAAATG ATCATCCTTG GACAGTCCCC ACTGTCTTCC  840
AACCTCAAAC ATTTCAAAGA CCTTCTGCAG ATTCACTCCC TAGGGGTTCT GCTAGGTTAA  900
CAAGTCAAAA TCTGTATTCT AATGTATTTA AACCTGCTGA TGTGGTCTCA CCTAAAATCC  960
CCAGTGGGTT TCCTATTGAA GATCACGAAA CCAGCCCACT GGATAACTCT GATGCTTTTC 1020
TCTCTACTCA TTTAAAGCTG TTCCCTAGCG CTTCTAGCAA TTCCTTCTAC GAGTCTGAGC 1080
TACCACAAGG TTTCCAGGAT TATTACAATT TAGGCCAATT CACAGATGAC GATCAAGTGC 1140
CAACACCCCT CGTGGCAATA ACGATCACTA GGCTAGTGGT GGGTTTCCTG CTGCCCTCTG 1200
TTATCATGAT AGCCTGTTAC AGCTTCATTG TCTTCCGAAT GCAAAGGGGC CGCTTCGCCA 1260
AGTCTCAGAG CAAAACCTTT CGAGTGGCCG TGGTGGTGGT GGCTGTCTTT CTTGTCTGCT 1320
GGACTCCATA CCACATTTTT GGAGTCCTGT CATTGCTTAC TGACCCAGAA ACTCCCTTGG 1380
GGAAAACTCT GATGTCCTGG GATCATGTAT GCATTGCTCT AGCATCTGCC AATAGTTGCT 1440
TTAATCCCTT CCTTTATGCC CTCTTGGGGA AAGATTTTAG GAAGAAAGCA AGGCAGTCCA 1500
TTCAGGGAAT TCTGGAGGCA GCCTTCAGTG AGGAGCTCAC ACGTTCCACC CACTGTCCCT 1560
CAAACAATGT CATTTCAGAA AGAAATAGTA CAACTGTGTG AAAATGTGGA GCAGCCAACA 1620
AGCAGGGGCT CTTAGGCAAT CACATAGTGA AAGTTTATAA GAGGATGAAG TGATATGGTG 1680
AGCAGCGGAC TTCAAAAACT GTCAAAGAAT CAATCCAGCG GTTCTCAAAC GGTACACAGA 1740
CTATTGACAT CAGCATCACC TAGAAACTTG TTAGAAATGC AAATTCTCAA GCCGCATCCC 1800
AGACTTGCTG AATCGGAATC TCTGGGGGTT GGGACCCAGC AAGGGCACTT AACAAACCCC 1860
CGTTTCTGAT TAATGCTAAA TGTAAGAATC ATTGTAAACA TTAGTTCTAT TTCTATCCCA 1920
AACTAAGCTA TGTGAAATAA GAGAAGCTAC TTTGTTTTTA AATGATGTTG AATATTTGTC 1980
GATATTTCCA TCATTAAATT TTTCCTTAGC ATTGTCTAAG TCAAAAAAAA AAAAAAAAAA 2040
The methionine initiation codon (ATG) and the termination codon (TGA) are indicated, and the first five nucleotides of exon 2 are underlined.
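The relationship between the cDNA and protein sequences can be checked by translating the open reading frame from the indicated initiation codon. The following is a minimal sketch using the standard genetic code; `translate_from` is a hypothetical helper, and the start position is supplied by the caller because the first ATG in a cDNA is not necessarily the initiation codon (the C3a receptor 5' untranslated region itself contains an upstream ATG).

```python
# Standard genetic code, built in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from(cdna: str, start: int) -> str:
    """Translate a cDNA from a 1-based start position to the first stop codon.

    Whitespace between 10-nucleotide blocks is ignored. Translation also
    ends when fewer than three nucleotides remain.
    """
    nucleotides = "".join(cdna.split()).upper()
    protein = []
    for i in range(start - 1, len(nucleotides) - 2, 3):
        residue = CODON_TABLE[nucleotides[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Nucleotides 121-180 of the C3a receptor cDNA; the initiation codon
# begins at position 33 of this fragment.
fragment = "CAGCTACTGT CTCAGTTTTT TGAAGTTTAG CAATGGCGTC TTTCTCTGCT GAGACCAATT"
print(translate_from(fragment, 33))  # MASFSAETN
```

The nine residues returned match the start of the protein sequence (MASFSAETNS...), which is a quick way to confirm that the rows of a sequence table have been read in the correct order.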
Genomic structure The C3aR gene is encoded on 2 exons and contains a single ~6.0 kb intron located 11 bp upstream of the ATG initiation codon^^.
Accession numbers Human^'^'7 Mouse^^'^^ Rat^5
Guinea-pig^
U62027 Z73157 U28488 U97357 U77460 U77461 U86379 AJ006402
Deficiency None known.
Polymorphic variants None known.
References
1 Roglic, A. et al. (1996) Biochim. Biophys. Acta 1305, 39-43.
2 Ember, J.A. et al. (1998) In The Human Complement System in Health and Disease (Volanakis, J.E. and Frank, M.M. eds). Marcel Dekker, New York, pp. 241-284.
3 Gorski, J.P. et al. (1979) Proc. Natl Acad. Sci. USA 76, 5299-5302.
4 Ames, R.S. et al. (1997) Immunopharmacology 38, 87-92.
5 Lienenklaus, S. et al. (1998) J. Immunol. 161, 2089-2093.
6 Ames, R.S. et al. (1996) J. Biol. Chem. 271, 20231-20234.
7 Crass, T. et al. (1996) Eur. J. Immunol. 26, 1944-1950.
8 Martin, U. et al. (1997) J. Exp. Med. 186, 199-207.
9 Hawlisch, H. et al. (1998) J. Immunol. 160, 2947-2958.
10 Gasque, P. et al. (1998) J. Immunol. 160, 3543-3554.
11 Davoust, N. et al. (1998) Glia 26, 201-211.
12 Paral, D. et al. (1998) Eur. J. Immunol. 28, 2417-2423.
13 Tornetta, M.A. et al. (1997) J. Immunol. 158, 5277-5282.
14 Hsu, M.H. et al. (1997) Immunogenetics 47, 64-72.
15 Fukuoka, Y. (1998) Biochem. Biophys. Res. Commun. 242, 663-668.
C5a receptor
CD88
Andreas Klos and Wilfried Bautsch, Medizinische Hochschule Hannover, Hannover, Germany
Physicochemical properties C5a receptor is synthesized as a single-chain molecule of 350 amino acids.
Mr (K) predicted: 39.3^1,2
Mr (K) observed: 43-48^ (HL-60 cells), 50-55^ (eosinophils)
N-linked glycosylation site (occupied): 1 (5)^
Structure Integral membrane G protein-coupled receptor with seven transmembrane α helices and an extracellular N-terminus. Probable intramolecular cystine bridge between C109 (TMR3) and C188 (extracellular loop 2).
Function Cellular receptor for the complement-derived anaphylatoxins C5a and C5a-desArg74. Signals through intracellular activation of G proteins such as Giα2 and Giα3 (in vitro also Gα16^^), mediating chemotaxis, O2− generation, granule release (histamine, interleukins, leukotrienes and enzymes) and upregulation of adhesion molecules (for a review see ref. 10).
Tissue distribution Expressed on myeloid-derived cells and cell lines (granulocytes, monocytes and monocyte-derived cell lines U-937, HL-60 and THP-1, mast cells, dendritic cells)^'^^-^^ and non-myeloid cells (vascular smooth muscle, endothelia, epithelia, glial cells)^^'^^. CAVE: Immunological cross-reactivity of monoclonal antibodies with keratinocytes^^.
Regulation of expression Upregulated in vitro by dibutyryl-cAMP, dimethylsulfoxide, 1,25-dihydroxy-vitamin D in combination with prostaglandin E2, phorbol esters and IFNγ in U-937 and HL-60 cell lines^^'^'^; upregulated in vivo in inflamed brain tissue^*.
Protein sequence (human)^'^
MNSFNYTTPD YGHYDDKDTL DLNTPVDKTS NTLRVPDILA LVIFAVVFLV   50   TMR1
GVLGNALVVW VTAFEAKRTI NAIWFLNLAV ADFLSCLALP ILFTSIVQHH  100   TMR2
HWPFGGAACS ILPSLILLNM YASILLLATI SADRFLLVFK PIWCQNFRGA  150   TMR3
GLAWIACAVA WGLALLLTIP SFLYRVVREE YFPPKVLCGV DYSHDKRRER  200   TMR4
AVAIVRLVLG FLWPLLTLTI CYTFILLRTW SRRATRSTKT LKVVVAVVAS  250   TMR5
FFIFWLPYQV TGIMMSFLEP SSPTFLLLNK LDSLCVSFAY INCCINPIIY  300   TMR6, TMR7
VVAGQGFQGR LRKSLPSLLR NVLTEESVVR ESKSFTRSTV DTMAQKTQAV  350
The N-linked glycosylation site (occupied^) is indicated (N) and the seven putative transmembrane regions (TMR1-7) are underlined. The ligand-binding sites are at the N-terminus (21-30)^22-24, E199^25 and R206^26. The serine phosphorylation sites^^-^^ are marked (*) and italicized.
Chromosomal location Human: 19q13.3-13.4^«.
cDNA sequence (human)^
AGGGACCTTC GATCCTCGGG GAGCCCAGGA GACCAGAACA TGAACTCCTT CAATTATACC   60
ACCCCTGATT ATGGGCACTA TGATGACAAG GATACCCTGG ACCTCAACAC CCCTGTGGAT  120
AAAACTTCTA ACACGCTGCG TGTTCCAGAC ATCCTGGCCT TGGTCATCTT TGCAGTCGTC  180
TTCCTGGTGG GAGTGCTGGG CAATGCCCTG GTGGTCTGGG TGACGGCATT CGAGGCCAAG  240
CGGACCATCA ATGCCATCTG GTTCCTCAAC TTGGCGGTAG CCGACTTCCT CTCCTGCCTG  300
GCGCTGCCCA TCTTGTTCAC GTCCATTGTA CAGCATCACC ACTGGCCCTT TGGCGGGGCC  360
GCCTGCAGCA TCCTGCCCTC CCTCATCCTG CTCAACATGT ACGCCAGCAT CCTGCTCCTG  420
GCCACCATCA GCGCCGACCG CTTTCTGCTG GTGTTTAAAC CCATCTGGTG CCAGAACTTC  480
CGAGGGGCCG GCTTGGCCTG GATCGCCTGT GCCGTGGCTT GGGGTTTAGC CCTGCTGCTG  540
ACCATACCCT CCTTCCTGTA CCGGGTGGTC CGGGAGGAGT ACTTTCCACC AAAGGTGTTG  600
TGTGGCGTGG ACTACAGCCA CGACAAACGG CGGGAGCGAG CCGTGGCCAT CGTCCGGCTG  660
GTCCTGGGCT TCCTGTGGCC TCTACTCACG CTCACGATTT GTTACACTTT CATCCTGCTC  720
CGGACGTGGA GCCGCAGGGC CACGCGGTCC ACCAAGACAC TCAAGGTGGT GGTGGCAGTG  780
GTGGCCAGTT TCTTTATCTT CTGGTTGCCC TACCAGGTGA CGGGGATAAT GATGTCCTTC  840
CTGGAGCCAT CGTCACCCAC CTTCCTGCTG CTGAATAAGC TGGACTCCCT GTGTGTCTCC  900
TTTGCCTACA TCAACTGCTG CATCAACCCC ATCATCTACG TGGTGGCCGG CCAGGGCTTC  960
CAGGGCCGAC TGCGGAAATC CCTCCCCAGC CTCCTCCGGA ACGTGTTGAC TGAAGAGTCC 1020
GTGGTTAGGG AGAGCAAGTC ATTCACGCGC TCCACAGTGG ACACTATGGC CCAGAAGACC 1080
CAGGCAGTGT AGGCGACAGC CTCATGGGCC ACTGTGGCCC GATGTCCCCT TCCTTCCCGG 1140
CCATTCTCCC TCTTGTTTTC ACTTCACTTT TCGTGGGATG GTGTTACCTT AGCTAACTAA 1200
CTCTCCTCCA TGTTGCCTGT CTTTCCCAGA CTTGTCCCTC CTTTTCCAGC GGGACTCTTC 1260
TCATCCTTCC TCATTTGCAA GGTGAACACT TCCTTCTAGG GAGCACCCTC CCACCCCCCA 1320
CCCCCCCCCA CACACCATCT TTCCATCCCA GGCTTTTGAA AAACAAACAG AAACCCGTGT 1380
ATCTGGGATA TTTCCATATG GCAATAGGTG TGAACAGGGA ACTCAGAATA CAGACT^GTA 1440
GAAAGATTCT CGCTTAAAAA AATGTATTTA TTTTATGGCA AGTTGGAAAA TATGTAACTG 1500
GAATCTCAAA AGTTCTTTGG GACAAAACAG AAGTCCATGG AGTTATCTAA GCTCTTGTAA 1560
GTGAGTTAAT TTAAAAAAGA AAATTAGGCT GAGAGCAGTG GCTCACGCCT GTAATCCCAG 1620
AACTTTGGGA GGCTAAGGTG GGTGGATCAC CTGAGGTCAA GAGTTCCAGA CCAGGCTGGC 1680
CAGCATGGTG AAACCCCGTC TGTACTAAAA ATACAAAAAA TTAACTGGGC ATGGTAGTGG 1740
GTGCCTGTAA TCCCAGCTAC TTGGGAGGCT GAGGTGGGAG AATTGCTCGA ACCTTGGAGG 1800
TGGAGGTTGT GGTGAGCCAT GATCGCACCA CTGCACTCTA GCCTGGGTGA CCGAGGGAGG 1860
CTCTGTCTCA AAAGCAAAGC AAAAACAAAA ACAAAAACAC CTAAAAAACC TGCAGTTTTG 1920
TTTGTACTTT GTTTTTAAAT TATGCTTTCT ATTTTGAGAT CATTGCAAAC TCAACACAAT 1980
TGTAAGTAAT GATACAGAGG GATCTTGTGT ACCCTTCACC CAGCCTCCCC CAATGGCAAC 2040
ATCTTGCAAA ACTACAATGT AGTCTCATAA CCAGGATATT GACATTGATA CAGTGAAGAT 2100
ACAGGACATT CTCATCACCA CAGGGATCCC CAGGATGCCC ACTTCCCTCC ACCCCCACAC 2160
CCCAGCCGTG TCCCTAACCC CTGGCAACCA GGAATCCACT CTCCATTTCT ATAATGTTGT 2220
CATTTCAAGA ATGTTATTCA ATGGAATCAT ATAGTATGTA ACCTGTTTTG AGCTTAAAAA 2280
AAAAAGTATA CATGACTTTA ATGAGGAAAA TAAAAATGAA TATTGAAAAA AAAAACTTTA 2340
GAG
The initiation codon (ATG), termination codon (TAG) and polyadenylation signal (AATAAA) are indicated. The first five nucleotides of each exon are underlined (exon 2 starts at nucleotide 43, immediately after the initiation codon). Two nucleotides in the 5' UTR of the cDNA sequence differ from the reported sequence of exon 1: AG at positions -23/-24 in the cDNA sequence versus TC in the genomic sequence^^.
Genomic structure Two exons separated by a ~9 kb intron sequence located between codons 1 and 2^«.
Accession numbers
Human^1,2: X57250, M62505
Rat^31,32: Y09613, AB003042
Mouse^^: S46665
Guinea-pig^^: U86103
Canine^^: X65860
Chimpanzee^^: X97730
Orang-utan^^: X97732
Rhesus monkey^^: X97731
Lowland gorilla^^: X97733
The bovine sequence is also published^^.
Deficiency Knockout mice show defects in mucosal defence and altered Arthus reaction^^-^^.
Polymorphic variants None known.
References
1 Boulay, F. et al. (1991) Biochemistry 30, 2993-2999.
2 Gerard, N.P. and Gerard, C. (1991) Nature 349, 614-617.
3 Tardif, M. et al. (1993) J. Immunol. 150, 3534-3545.
4 Gerard, N.P. et al. (1989) J. Biol. Chem. 264, 1760-1766.
5 Pease, J.E. and Barker, M.D. (1993) Biochem. Mol. Biol. Int. 31, 719-726.
6 Rollins, T.E. et al. (1991) Proc. Natl Acad. Sci. USA 88, 971-975.
7 Offermans, S. et al. (1990) FEBS Lett. 260, 14-18.
8 Amatruda, T.T. et al. (1993) J. Biol. Chem. 268, 10139-10144.
9 Buhl, A.M. et al. (1993) FEBS Lett. 323, 132-134.
10 Ember, J.A. et al. (1998) In The Human Complement System in Health and Disease (Volanakis, J.E. and Frank, M.M. eds). Marcel Dekker, New York, pp. 241-284.
11 Chenoweth, D.E. and Hugli, T.E. (1978) Proc. Natl Acad. Sci. USA 75, 3943-3947.
12 Van-Epps, D.E. and Chenoweth, D.E. (1984) J. Immunol. 132, 2862-2867.
13 Dahinden, C.A. et al. (1991) Int. Arch. Allergy Appl. Immunol. 94, 161-164.
14 Hartmann, K. et al. (1997) Blood 89, 2863-2870.
15 Burg, M. et al. (1996) J. Immunol. 157, 5574-5581.
16 Sozzani, S. et al. (1995) J. Immunol. 155, 3292-3295.
17 Haviland, D.L. et al. (1995) J. Immunol. 154, 1861-1869.
18 Gasque, P. et al. (1997) Am. J. Pathol. 150, 31-41.
19 Werfel, T. et al. (1996) J. Immunol. 157, 1729-1735.
20 Rubin, J. et al. (1988) Endocrinology 123, 2424-2431.
21 Oppermann, M. et al. (1993) J. Immunol. 151, 3785-3794.
22 Mery, L. and Boulay, F. (1994) J. Biol. Chem. 269, 3457-3463.
23 Siciliano, S.J. et al. (1994) Proc. Natl Acad. Sci. USA 91, 1214-1218.
24 Chen, Z. et al. (1998) J. Biol. Chem. 273, 10411-10419.
25 Monk, P.N. et al. (1995) J. Biol. Chem. 270, 16625-16629.
26 DeMartino, J.A. et al. (1995) J. Biol. Chem. 270, 15966-15969.
27 Giannini, E. et al. (1995) J. Biol. Chem. 270, 19166-19172.
28 Giannini, E. and Boulay, F. (1995) J. Immunol. 154, 4055-4064.
29 Bock, D. et al. (1997) Eur. J. Immunol. 27, 1522-1529.
30 Gerard, N.P. et al. (1993) Biochemistry 32, 1243-1250.
31 Rothermel, E. et al. (1997) Mol. Immunol. 34, 877-886.
32 Akatsu, H. et al. (1997) Microbiol. Immunol. 41, 575-580.
33 Gerard, C. et al. (1992) J. Immunol. 149, 2600-2606.
34 Fukuoka, Y. et al. (1998) Int. Immunol. 10, 275-283.
35 Perret, J.J. et al. (1992) Biochem. J. 288, 911-917.
36 Alvarez, V. et al. (1996) Immunogenetics 44, 446-452.
37 Hopken, U.E. et al. (1996) Nature 383, 86-89.
38 Bozic, C.R. et al. (1996) Science 273, 1722-1725.
39 Hopken, U.E. et al. (1997) J. Exp. Med. 186, 749-756.
CR3
Yu Xia and Gordon D. Ross, Department of Pathology, University of Louisville, Louisville, KY, USA
Other names Complement receptor type 3, Mac-1, Mo1, OKM-1, C3bi-receptor, CD11b/CD18, αMβ2-integrin, LeuCAM, leukocyte integrin.
Physicochemical properties CR3 is made up of two non-covalently associated subunits encoded by two genes. Both α and β subunits are type I membrane glycoproteins with leader sequences of 16 and 22 amino acids respectively.
                                α subunit (CD11b)    β subunit (CD18)
Amino acids                     17-1153              23-769
Mr (K) predicted                125.6                82.6
Mr (K) observed                 160                  95
N-linked glycosylation sites    19                   6
N-linked glycosylation sites, α subunit: 86, 240, 391, 469, 693, 697, 735, 802, 881, 901, 912, 941, 947, 979, 994, 1022, 1045, 1051, 1076
N-linked glycosylation sites, β subunit: 50, 116, 212, 254, 501, 642
Structure CR3 is one of four members of the β2-integrin family that share a common β subunit (β2-integrin or CD18) linked non-covalently to one of four α subunits, forming a membrane surface glycoprotein heterodimer. There are no disulfide links between α and β subunits. The I-domain of CD11b adopts a classic Rossmann α/β fold, with seven hydrophilic α helices surrounding five parallel and one short antiparallel hydrophobic β sheet in the middle, and contains an unusual Mg2+/Mn2+ coordination site on its surface at the top of the β sheet. The metal-binding site, named the metal ion-dependent adhesion site (MIDAS), plays a direct role in protein ligand binding. Some studies have suggested that an I-domain-like structure in CD18 contributes part of the MIDAS site and increases its affinity for protein ligands^'^. The C-terminal domain of CD11b contains an unusual lectin site responsible both for cytotoxic recognition of exogenous polysaccharides on microbial pathogen cell walls (e.g. β-glucan, β-oligomannan, N-acetyl-D-glucosamine)^ and for CR3-complex formation with various endogenous membrane glycoproteins (e.g. CD14, CD16, CD59, CD87)^'^. Molecular mapping studies have shown that the lectin site is contained somewhere in a broad region of CD11b located C-terminal to the I-domain and the divalent cation-binding repeat sequence. The lectin site is unusual because this region of CD11b contains no C-type lectin consensus sequences and lectin activity does not require divalent cations^'^.
Function CR3 has several seemingly unrelated functions that can be divided into intercellular adhesion-related events and cytotoxic receptor functions. CR3 and LFA-1 are the major adhesion molecules used by phagocytes for directed migration through the vascular endothelium into sites of inflammation*'^. In its cytotoxic receptor function, phagocyte CR3 stimulates phagocytosis, a respiratory burst and degranulation when it binds to iC3b-opsonized bacteria or yeast. However, when CR3 binds instead to iC3b-opsonized erythrocytes or tumour cells, no response is stimulated and only adhesion occurs. Such iC3b-mediated adhesion occurs with native CR3 on resting cells through a low-affinity MIDAS within the I-domain of CD11b. Stimulation of CR3 for ingestion, cytotoxic degranulation or cytokine secretion requires additional ligation of CR3 via its lectin site to polysaccharides present on microbial surfaces. Natural killer (NK) cell CR3 functions similarly to phagocyte CR3 in its requirement for dual ligation of the lectin site and I-domain for triggering extracellular cytotoxicity^. NK cells use CR3 for recognition and cytotoxicity of yeast hyphae^^. The lectin site of CR3 also functions to promote cell surface transmembrane signalling complexes between CR3 and endogenous membrane glycoproteins that are attached only via glycosylphosphatidylinositol anchors, and thus have no other mechanism for signalling^. This allows CD14 and CD16 to prime and trigger neutrophil functions in a manner similar to exogenous microbial polysaccharides. In addition, the lectin site-dependent complex between CR3 and CD87 has been shown to be essential for development of the high-affinity MIDAS and neutrophil adhesion to endothelial cell ICAM-1^. This suggests that signalling through the lectin site may be essential for the full range of both adhesion and cytotoxic functions mediated by CR3.
Protein tyrosine kinases (PTKs) play an important role in CR3 priming and activation for cytotoxicity, phagocytosis, respiratory burst or firm adhesion. Four different PTKs (fyn, lyn, hck and fgr) have been implicated in CR3 signalling^12-14, and these have been found in association with CR3 in large complexes that also included LFA-1 and the urokinase plasminogen activator receptor (uPA-R or CD87)^^. Several targets that are tyrosine phosphorylated have been identified, including paxillin^^, Vav and Vavp21(ras)^^. CR3 signalling also involves activation of phospholipase A2^^'^^ and phosphatidylinositol 3-kinase2^.
[Figure: domain organization of CR3. α subunit (CD11b): I-domain and C-terminal domain. β subunit (CD18): I-domain-like region and cysteine-rich repeat region (repeats 1-4).]
Tissue distribution CR3 is expressed on all myeloid lineage cells, but the amount of CR3 is diminished as monocytes mature into macrophages or dendritic cells, and may be undetectable on terminally differentiated cells. Among lymphoid cells, CR3 is expressed on the majority of NK cells but is restricted on B cells to the CD5+ subset. CR3 is undetectable on the majority of resting T cells.
Regulation of expression In monocytes, neutrophils, eosinophils and NK cells, most CR3 is stored in cytoplasmic granules^21,22, and various cellular activation events cause a rapid mobilization of the granule pool of CR3 to the membrane surface and a 3-10-fold increase in external membrane-bound CR3. Two major transcriptional start sites exist in the CD11b gene, located 90 bp and 54 bp upstream from the translational initiation methionine. The CD18 gene has multiple transcriptional start sites spread out over a region of ~45 nucleotides, with one or two major initiation sites contributing >50% of the transcripts. Transcription of CD11b and CD18 is regulated coordinately and is induced hormonally by retinoic acid.
Protein sequences
α subunit^^
MALRVLLLTA LTLCHGFNLD TENAMTFQEN ARGFGQSVVQ LQGSRVVVGA   50
PQEIVAANQR GSLYQCDYST GSCEPIRLQV PVEAVNMSLG LSLAATTSPP  100
QLLACGPTVH QTCSENTYVK GLCFLFGSNL RQQPQKFPEA LRGCPQEDSD  150
IAFLIDGSGS IIPHDFRRMK EFVSTVMEQL KKSKTLFSLM QYSEEFRIHF  200
TFKEFQNNPN PRSLVKPITQ LLGRTHTATG IRKVVRELFN ITNGARKNAF  250
KILVVITDGE KFGDPLGYED VIPEADREGV IRYVIGVGDA FRSEKSRQEL  300
NTIASKPPRD HVFQVNNFEA LKTIQNQLRE KIFAIEGTQT GSSSSFEHEM  350
SQEGFSAAIT SNGPLLSTVG SYDWAGGVFL YTSKEKSTFI NMTRVDSDMN  400
DAYLGYAAAI ILRNRVQSLV LGAPRYQHIG LVAMFRQNTG MWESNANVKG  450
TQIGAYFGAS LCSVDVDSNG STDLVLIGAP HYYEQTRGGQ VSVCPLPRGQ  500
RARWQCDAVL YGEQGQPWGR FGAALTVLGD VNGDKLTDVA IGAPGEEDNR  550
GAVYLFHGTS GSGISPSHSQ RIAGSKLSPR LQYFGQSLSG GQDLTMDGLV  600
DLTVGAQGHV LLLRSQPVLR VKAIMEFNPR EVARNVFECN DQVVKGKEAG  650
EVRVCLHVQK STRDRLREGQ IQSVVTYDLA LDSGRPHSRA VFNETKNSTR  700
RQTQVLGLTQ TCETLKLQLP NCIEDPVSPI VLRLNFSLVG TPLSAFGNLR  750
PVLAEDAQRL FTALFPFEKN CGNDNICQDD LSITFSFMSL DCLVVGGPRE  800
FNVTVTVRND GEDSYRTQVT FFFPLDLSYR KVSTLQNQRS QRSWRLACES  850
ASSTEVSGAL KSTSCSINHP IFPENSEVTF NITFDVDSKA SLGNKLLLKA  900
NVTSENNMPR TNKTEFQLEL PVKYAVYMVV TSHGVSTKYL NFTASENTSR  950
VMQHQYQVSN LGQRSLPISL VFLVPVRLNQ TVIWDRPQVT FSENLSSTCH 1000
TKERLPSHSD FLAELRKAPV VNCSIAVCQR IQCDIPFFGI QEEFNATLKG 1050
NLSFDWYIKT SHNHLLIVST AEILFNDSVF TLLPGQGAFV RSQTETKVEP 1100
FEVPNPLPLI VGSSVGGLLL LALITAALYK LGFFKRQYKD MMSEGGPPGA 1150
EPQ
β subunit^^
MLGLRPPLLA LVGLLSLGCV LSQECTKFKV SSCRECIESG PGCTWCQKLN   50
FTGPGDPDSI RCDTRPQLLM RGCAADDIMD PTSLAETQED HNGGQKQLSP  100
QKVTLYLRPG QAAAFNVTFR RAKGYPIDLY YLMDLSYSML DDLRNVKKLG  150
GDLLRALNEI TESGRIGFGS FVDKTVLPFV NTHPDKLRNP CPNKEKECQP  200
PFAFRHVLKL TNNSNQFQTE VGKQLISGNL DAPEGGLDAM MQVAACPEEI  250
GWRNVTRLLV FATDDGFHFA GDGKLGAILT PNDGRCHLED NLYKRSNEFD  300
YPSVGQLAHK LAENNIQPIF AVTSRMVKTY EKLTEIIPKS AVGELSEDSS  350
NVVHLIKNAY NKLSSRVFLD HNALPDTLKV TYDSFCSNGV THRNQPRGDC  400
DGVQINVPIT FQVKVTATEC IQEQSFVIRA LGFTDIVTVQ VLPQCECRCR  450
DQSRDRSLCH GKGFLECGIC RCDTGYIGKN CECQTQGRSS QELEGSCRKD  500
NNSIICSGLG DCVCGQCLCH TSDVPGKLIY GQYCECDTIN CERYNGQVCG  550
GPGRGLCFCG KCRCHPGFEG SACQCERTTE GCLNPRRVEC SGRGRCRCNV  600
CECHSGYQLP LCQECPGCPS PCGKYISCAE CLKFEKGPFG KNCSAACPGL  650
QLSNNPVKGR TCKERDSEGC WVAYTLEQQD GMDRYLIYVD ESRECVAGPN  700
IAAIVGGTVA GIVLIGILLL VIWKALIHLS DLREYRRFEK EKLKSQWNND  750
NPLFKSATTT VMNPKFAES
The leader sequences are underlined and the putative N-linked glycosylation sites are indicated (N).
Protein modules
α subunit
1-16        Leader peptide                    exon 1/2
167-353     I-domain                          exon 6-9
453-614     Divalent cation-binding region    exon 13-15
1109-1134   Transmembrane domain              exon 29/30
1135-1153   Cytoplasmic domain                exon 30
β subunit^24,26
1-22        Leader peptide                    exon 2/3
23-700      Extracellular domain              exon 3-15
701-723     Transmembrane domain              exon 15
724-769     Cytoplasmic domain                exon 15/16
The region from C445 to C631 is cysteine rich, with four tandem repeats of an eight-cysteine motif.
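A module table of this kind is an interval map from residue numbers to named regions. As a small illustration, the sketch below encodes the β subunit (CD18) boundaries listed above and looks up the module containing a given residue; the names `CD18_MODULES` and `module_for` are hypothetical and introduced only for this example.

```python
# Residue ranges (1-based, inclusive) for the CD18 modules listed in this entry.
CD18_MODULES = [
    (1, 22, "Leader peptide"),
    (23, 700, "Extracellular domain"),
    (701, 723, "Transmembrane domain"),
    (724, 769, "Cytoplasmic domain"),
]


def module_for(position, modules=CD18_MODULES):
    """Return the name of the module containing a residue, or None.

    None is returned for positions outside every listed range, for example
    beyond the C-terminus or in a gap between annotated modules.
    """
    for start, end, name in modules:
        if start <= position <= end:
            return name
    return None


# N-linked glycosylation sites listed for the beta subunit.
for site in (50, 116, 212, 254, 501, 642):
    print(site, module_for(site))
```

All six glycosylation sites listed for the β subunit fall within residues 23-700, as expected for sites on the extracellular domain.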
Chromosomal location
α subunit^^'^^'^*: 16p11-p13.1 (CD11a, CD11b and CD11c are clustered between bands p11 and p13.1 on chromosome 16; CD11d is arranged in tandem with CD11c and separated by more than 11.5 kb).
β subunit: Human^29,30: 21q22.3. Telomere ... PFKL ... CD18 ... CRYA1 ... Centromere. Mouse^^: chromosome 10.
cDNA sequences a subunit^^ GAATTCCGTG CTCCTTCCAG TTCAACTTGG AGCGTGGTCC GCCAACCAAA CGCCTGCAGG ACCAGCCCCC ACGTATGTGA TTCCCAGAGG GGCTCTGGTA ATGGAGCAAT CGGATTCACT CCAATAACGC GAGCTGTTTA ACGGATGGAG AGAGAGGGAG CGCCAAGAGC AACTTTGAGG GGTACTCAGA GCTGCCATCA GGAGTCTTTC TCAGACATGA CAAAGCCTGG CAGAACACTG TTCGGGGGCT ATCGGGGCCC CCCAGGGGGC CCCTGGGGCC ACGGACGTGG CACGGAACCT CTCTCTCCCA GATGGACTGG CCAGTACTGA TTTGAGTGTA CATGTCCAGA TATGACCTGG AACAGCACAC CTACAGTTGC TCTCTGGTGG GCTCAGAGAC TGCCAGGATG GGGCCCCGGG ACACAGGTCA AACCAGCGCT TCTGGGGCCT GAGGTCACCT CTCCTCAAGG CAACTGGAGC ACTAAATATC CAGGTCAGCA CGGCTGAACC AGTACGTGCC AAGGCCCCCG
GTTCCTCAGT CCATGGCTCT ACACTGAAAA AGCTTCAGGG GGGGCAGCCT TCCCCGTGGA CTCAGCTGCT AAGGGCTCTG CCCTCCGAGG GCATCATCCC TAAAAAAGTC TTACCTTCAA AGCTGCTTGG ACATCACCAA AAAAGTTTGG TCATTCGCTA TTAATACCAT CTCTGAAGAC CAGGAAGTAG CCTCTAATGG TATATACATC ATGATGCTTA TTCTGGGGGC GCATGTGGGA CCCTCTGCTC CCCATTACTA AGAGGGCTCG GCTTTGGGGC CCATTGGGGC CAGGATCTGG GGCTCCAGTA TAGACCTGAC GAGTCAAGGC ATGATCAGGT AGAGCACACG CTCTGGACTC GCAGACAGAC CGAATTGCAT GAACGCCATT TCTTCACAGC ACCTCAGCAT AGTTCAACGT CCTTCTTCTT CACAGCGATC TGAAGAGCAC TTAATATCAC CCAATGTGAC TGCCGGTGAA TCAACTTCAC ACCTGGGGCA AGACTGTCAT ACACCAAGGA TGGTGAACTG
GGTGCCTGCA CAGAGTCCTT CGCAATGACC ATCCAGGGTG CTACCAGTGC GGCCGTGAAC GGCCTGTGGT CTTCCTGTTT GTGTCCTCAA ACATGACTTT CAAAACCTTG AGAGTTCCAG GCGGACACAC CGGAGCCCGA CGATCCCTTG CGTCATTGGG CGCATCCAAG CATTCAGAAC CAGCTCCTTT CCCCTTGCTG AAAGGAGAAA CTTGGGTTAT ACCTCGATAT GTCCAACGCT CGTGGACGTG CGAGCAGACC GTGGCAGTGT AGCCCTAACA CCCAGGAGAG CATCAGCCCC TTTTGGTCAG TGTAGGAGCC AATCATGGAG GGTGAAAGGC GGATCGGCTA CGGCCGCCCA ACAGGTCTTG CGAGGACCCA GTCTGCTTTC CTTGTTTCCC CACCTTCAGT GACAGTGACT CCCGCTTGAC CTGGCGCCTG CAGCTGCAGC GTTTGATGTA CAGTGAGAAC ATATGCTGTC GGCCTCAGAG GAGGAGCCTC ATGGGACCGC GCGCTTGCCC CTCCATCGCT
ACCCCTGGTT CTGTTAACAG TTCCAAGAGA GTGGTTGGAG GACTACAGCA ATGTCCCTGG CCCACCGTGC GGATCCAACC GAGGATAGTG CGGCGGATGA TTCTCTTTGA AACAACCCTA ACGGCCACGG AAGAATGCCT GGATATGAGG GTGGGAGATG CCGCCTCGTG CAGCTTCGGG GAGCATGAGA AGCACTGTGG AGCACCTTCA GCTGCCGCCA CAGCACATCG AATGTCAAGG GACAGCAACG CGAGGGGGCC GATGCTGTTC GTGCTGGGGG GAGGACAACC TCCCATAGCC TCACTGAGTG CAGGGGCACG TTCAATCCCA AAGGAAGCCG AGAGAAGGAC CATTCCCGCG GGGCTGACCC GTGAGCCCCA GGGAACCTCC TTTGAGAAGA TTCATGAGCC GTGAGAAATG CTGTCCTACC GCCTGTGAGT ATAAACCACC GACTCTAAGG AACATGCCCA TACATGGTGG AATACCAGTC CCCATCAGCC CCCCAGGTCA TCTCACTCCG GTCTGCCAGA
CACCTCCTTC CCTTGACCTT ACGCAAGGGG CCCCCCAGGA CAGGCTCATG GCCTGTCCCT ACCAGACTTG TACGGCAGCA ACATTGCCTT AGGAGTTTGT TGCAGTACTC ACCCAAGATC GCATCCGCAA TTAAGATCCT ATGTCATCCC CCTTCCGCAG ATCACGTGTT AGAAGATCTT TGTCTCAGGA GGAGCTATGA TCAACATGAC TCATCTTACG GCCTGGTAGC GCACCCAGAT GCAGCACCGA AGGTGTCCGT TCTACGGGGA ACGTAAATGG GGGGTGCTGT AGCGGATAGC GGGGCCAGGA TGCTGCTGCT GGGAAGTGGC GAGAGGTCAG AGATCCAGAG CCGTCTTCAA AGACTTGTGA TTGTGCTGCG GGCCAGTGCT ATTGTGGCAA TGGACTGCCT ATGGTGAGGA GGAAGGTGTC CTGCCTCCTC CCATCTTCCC CTTCCCTTGG GAACCAACAA TCACCAGCCA GGGTCATGCA TGGTGTTCTT CCTTCTCCGA ACTTTCTGGC GAATCCAGTG
CAGGTTCTGG ATGTCATGGG CTTCGGGCAG GATAGTGGCT CGAGCCCATC GGCAGCCACC CAGTGAGAAC GCCCCAGAAG CTTGATTGAT CTCAACTGTG TGAAGAATTC ACTGGTGAAG AGTGGTACGA AGTTGTCATC TGAGGCAGAC TGAGAAATCC CCAGGTGAAT TGCGATCGAG AGGCTTCAGC CTGGGCTGGT CAGAGTGGAT GAACCGGGTG GATGTTCAGG CGGCGCCTAC CCTGGTCCTC GTGCCCCTTG GCAGGGCCAA GGACAAGCTG TTACCTGTTT AGGCTCCAAG CCTCACAATG CAGGTCCCAG AAGGAATGTA AGTCTGCCTC TGTTGTGACT TGAGACAAAG GACCCTGAAA CCTGAACTTC GGCGGAGGAT TGACAACATC CGTGGTGGGT CTCCTACAGG CACACTCCAG CACCGAAGTG GGAAAACTCA AAACAAACTG AACCGAATTC TGGGGTCTCC GCATCAATAT GGTGCCCGTC GAACCTCTCG TGAGCTTCGG TGACATCCCG
60 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180
cDNA sequences TTCTTTGGCA TACATCAAGA GATTCCGTGT AAAGTGGAGC GGACTGCTGC CAATACAAGG CTTCCCGACA CCAGGCTGCT TTTGTGTGTG AGTGTGTGCA CAAGTATGTG TGTGCGAGTG TCTCTGGCGT GCTCCCTTGT CCTGTGGGTG CCCGTCGCCT AGAAAAGCCG TGCCCACTGA TAATTTTTTG AGTGAAAAGT TTTTGAGGTT ACCACACACA CTGTATCTTG TACTTTTTCA ATTTAACCAG AATAAATCAA
TCCAGGAAGA CCTCGCATAA TCACCCTGCT CGTTCGAGGT TCCTGGCCCT ACATGATGAG GAGCTGCCTC GGACACGTCG TGCAAGTGTG CGTGTGCGTG AGTGTGTGCA TGTGCATGTG GTGGGTAGGT GCGTGGGTAA AAGAGAGAGG GCGAGCCTGC TGGGTGGAAC GGAATCATGA GATGGATAAG CTCCCTTTCC TCCTTCAGAC TACACACACA CTTTTTTTCA TTCTTTTATA TCTTCTTTTG ATATATGTCA
continued ATTCAATGCT CCACCTCCTG GCCGGGACAG CCCCAACCCC CATCACCGCC TGAAGGGGGT TCGGTGGCCA GACAGCGAAG TATGTGCGTG TGCGTGCATG GTGTGTGTGC TGTGCTCAGG GACGGCAGCG GCCGCTGCTG GAAACACAGC GGCCTGCTGG CAGGAGCCTC AGCTTCCTTT CCTGTCTATG AGATATTCAA AGATTCCAGG CAAGCTTTTT CCAATATTTC CCGCTGCATA ATATACTATT AAAAAAAAAA
ACCCTCAAAG ATCGTGAGCA GGGGCGTTTG CTGCCGCTCA GCGCTGTACA CCCCCGGGGG GCAGGACTCT TATCCCCGAC TGTGCGAGTG TGCACTCGCA GTGTGTCCAT GGCTGTGGCT TAGCCTCTCC GGTTTTCCTC AGCATCTCTC AGCCTGCGCA CTCCACACCA CTGGATTCAT GTACAAAAAT GTCACCTCCT CGATGTGCAA TACACAAATG TCAGACATCG GTATTCCATT TTCATCTCTT AAAA
GCAACCTCTC CAGCTGAGAT TGAGGTCCCA TCGTGGGCAG AGCTCGGCTT CCGAACCCCA GCCCAGACCA AGGACGGGCT TGTGCAAGTG CGCCCATGTG GTGTGTGCAG CACGTGTGTG GGCAGAAGGG CGGGAGAGGG CACTGAAAGA GCTTGGATGG GCGCTGATGC TTATTATTTC CACAAGGCAT TAAAGGTAGT GTGTATGCAC GTAGCATACT GTTCATATTA GTGTGAGTGT GTTATTGCAT
GTTTGACTGG CTTGTTTAAC GACGGAGACC CTCTGTCGGG CTTCAAGCGG GTAGCGGCTC CACGTAGCCC TGGGCTTCCA TCTGTGTGCA TGAGTGTGTG TGTGTGCATG ACTCAGAGTG AACTGCCTGG GACGGTCAAT AGTGGGACTT ATACTCCATG CCAATAAAGA AATGTGACTT TCAAGTGTAC CAAGATTGTG GTGTGCACAC TTATATTGGT AGACATAAAT ACCATAATGT CTGCTGAGTT
3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680
p subunit24'^2 ACAGGAAGTG GGGCAGACTG ACCGAGGGAC CGGGTGCGTC CGAGTCGGGG TGACTCCATT CATCATGGAC GCTGTCCCCA GACCTTCCGG CTCCATGCTT CAACGAGATC GCCGTTCGTG GTGCCAGCCC TCAGACCGAG GGACGCCATG GCTGCTGGTG CATCCTGACC CGAATTCGAC GCCCATCTTC CCCCAAGTCA GAATGCTTAC CCTGAAAGTC AGGTGACTGT CACAGAGTGC GACCGTGCAG CCTCTGCCAT
TCAGGACTTT GTAGCAAAGC ATGCTGGGCC CTCTCTCAGG CCCGGCTGCA CGCTGCGACA CCCACAAGCC CAAAAAGTGA CGGGCCAAGG GATGACCTCA ACCGAGTCCG AACACGCACC CCGTTTGCCT GTCGGGAAGC ATGCAGGTCG TTTGCCACTG CCCAACGACG TACCCATCGG GCGGTGACCA GCCGTGGGGG AATAAACTCT ACCTACGACT GATGGCGTGC ATCCAGGAGC GTCCTTCCCC GGCAAGGGCT
ACGACCCGCG CCTCCAGCTG AGGTTTCTAG ACGTGACCCA
60
CCCCACGCCC AGCCAGGAGC ACCGCCGAGG ACTCCAGCAC
12 0
TGCGCCCCCC ACTGCTCGCC CTGGTGGGGC TGCTCTCCCT
ATGACGGCTT CCATTTCGCG GGCGACGGAA AGCTGGGCGC
180 240 300 360 420 480 540 600 660 720 780 840 900 960
GCCGCTGTCA CCTGGAGGAC AACTTGTACA AGAGGAGCAA
1020
TGGGCCAGCT GGCGCACAAG CTGGCTGAAA ACAACATCCA
1080
GTAGGATGGT GAAGACCTAC GAGAAACTCA CCGAGATCAT
1140
AGCTGTCTGA GGACTCCAGC AATGTGGTCC ATCTCATTAA
1200
CCTCCAGGGT CTTCCTGGAT CACAACGCCC TCCCCGACAC
1260
CCTTCTGCAG CAATGGAGTG ACGCACAGGA ACCAGCCCAG
1320
AGATCAATGT CCCGATCACC TTCCAGGTGA AGGTCACGGC
1380
AGTCGTTTGT CATCCGGGCG CTGGGCTTCA CGGACATAGT
1440
AGTGTGAGTG CCGGTGCCGG GACCAGAGCA GAGACCGCAG
1500
TCTTGGAGTG CGGCATCTGC AGGTGTGACA CTGGCTACAT
1560
AGTGCACGAA GTTCAAGGTC AGCAGCTGCC GGGAATGCAT CCTGGTGCCA GAAGCTGAAC TTCACAGGGC CGGGGGATCC CCCGGCCACA GCTGCTCATG AGGGGCTGTG CGGCTGACGA TCGCTGAAAC CCAGGAAGAC CACAATGGGG GCCAGAAGCA CGCTTTACCT GCGACCAGGC CAGGCAGCAG CGTTCAACGT GCTACCCCAT CGACCTGTAC TATCTGATGG ACCTCTCCTA GGAATGTCAA GAAGCTAGGT GGCGACCTGC TCCGGGCCCT GCCGCATTGG CTTCGGGTCC TTCGTGGACA AGACCGTGCT CTGATAAGCT GCGAAACCCA TGCCCCAACA AGGAGAAAGA TCAGGCACGT GCTGAAGCTG ACCAACAACT CCAACCAGTT AGCTGATTTC CGGAAACCTG GATGCACCCG AGGGTGGGCT CCGCCTGCCC GGAGGAAATC GGCTGGCGCA ACGTCACGCG
c D N A sequences
continued
TGGGAAAAAC CCGGAAGGAC CCTGTGCCAC CACCATCAAC CTTCTGCGGG GACCACTGAG CTGCAACGTA CTGCCCCTCA CCCCTTTGGG GAAGGGCAGG GCAGCAGGAC AGGCCCCAAC TCTCCTGCTG CTTTGAGAAG CACCACGACG GCCGTCAGGA TTGAGGATGT TGGCCGGCCG TCTTTGCATG CTGTGCAAGT TGTCAGGGTA AAAAATAAAA
AGACACAGGG TCATCTGCTC TCCCCGGCAA ACAACGGCCA GCCACCCGGG ACCCGCGGCG ATTCAGGCTA AGTACATCTC GCGCGGCGTG AGAGGGACTC GCTACCTCAT TCGTCGGGGG AGGCTCTGAT AGTCCCAGTG CCAAGTTTGC CTGCCCCATC CCAGAAATCC GGGGCTCGTC GAGGGAGGGC GTCTGATTAA TCCCATTAAT GGCTGTCCAT
TGTGAGTGCC AACAACTCCA ACCAGCGACG TGTGAGCGCT AAGTGCCGCT GGCTGCCTGA TGCGAGTGCC CCCTGTGGCA AAGAACTGCA ACCTGCAAGG GGGATGGACC ATCGCCGCCA GTCATCTGGA GAGAAGCTCA GTCATGAACC CCCACCATGT CACCAATTAA GGTGCTTCTG GAGACTTGAG CAGGACATCA TAAAATGACA CTTCAATACA
CCGGAGCAGC AGGGCTGGGG GCTGATATAC GGTCTGCGGC CTTTGAGGGC TGTTGAGTGT CCAGCTGCCT CTGCGCCGAG TCCGGGCCTG AGAGGGCTGC CTATGTGGAT CACCGTGGCA CCACCTGAGC GAACAATGAT TGAGAGTTAG ACGCGGCCGA AGTTATTTTC GGGGGGACAG TTGAGGTTGG AGGTGGTGCC TATATTGTTA GGAAAAAAAA
CAGGAGCTGG GACTGTGTCT GGGCAGTACT GGCCCGGGGA TCAGCGTGCC AGTGGTCGTG CTGTGCCAGG TGCCTGAAGT CAGCTGTCGA TGGGTGGCCT GAGAGCCGAG GGCATCGTGC GACCTCCGGG AATCCCCTTT GAGCACTTGG GACATGGCTT CGCCCTCAAA CTCCACTCTG TGAGGTTAGG AATTTATTTA ATCAATCACG AAAAAAAAAA
AAGGAAGCTG GCGGGCAGTG GCGAGTGTGA GGGGGCTCTG AGTGCGAGAG GCCGGTGCCG AGTGCCCCGG TCGAAAAGGG ACAACCCCGT ACACGCTGGA AGTGTGTGGC TGATCGGCAT AGTACAGGCG TCAAGAGCGC TGAAGACAAG GCCACAGCTC ATGACAGCCA ACTGGCACAG TGCGTGTTTC CATTTAAACT TGTATAGAAA
1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820
The methionine initiation codons (ATG), the termination codons (TAG) and the probable polyadenylation signals (AATAAA) are indicated. The first five nucleotides in each exon are underlined to indicate intron-exon boundaries. The β subunit has multiple initiation sites for transcription as well as multiple polyadenylation sites^^. The sequence shown is derived from a cDNA with the longest 5' untranslated region and the most frequent polyadenylation site.
Genomic structure
α subunit^^: The gene (CD11b) spans 55 kb with 30 exons.
β subunit^^: The gene (CD18) spans approximately 40 kb and is organized into 16 exons.
[Figure: exon-intron maps of the CD11b and CD18 genes, each drawn with a 2 kb scale bar.]
Accession numbers
Human
α subunit^23,33: J03925, X07421, M18044
β subunit^^'^'^
Mouse
α subunit^^ β subunit^
C3b.B > C3b complexes^. The result is (1) to inhibit cleavage of C3b by factor I, (2) to increase the affinity for factor B, and most significantly (3) to increase the stability of C3bBb, inhibiting its dissociation into C3b + Bb^^^. Consequently properdin promotes positive amplification of C3b deposition onto an activating surface.
Tissue distribution Serum protein: 4.3-5.7 µg/ml in plasma. Sites of synthesis: monocytes, T cells and granulocytes.
Regulation of expression Properdin synthesis by monocytic cell lines is upregulated by phorbol esters, bacterial LPS, IL-1β and TNFα. TNFα, C5a, IL-8 and fMLP also stimulate release of properdin stored in neutrophil granules.
Properdin
Protein sequence 13,14
MITEGAQAPR LLLPPLLLLL TLPATGSDPV LCFTQYEESS GKCKGLLGGG 50
VSVEDCCLNT AFAYQKRSGG LCQPCRSPRW SLWSTWAPCS VTCSEGSQLR 100
YRRCVGWNGQ CSGKVAPGTL EWQLQACEDQ QCCPEMGGWS GWGPWEPCSV 150
TCSKGTRTRR RACNHPAPKC GGHCPGQAQE SEACDTQQVC PTHGAWATWG 200
PWTPCSASCH GGPHEPKETR SRKCSAPEPS QKPPGKPCPG LAYEQRRCTG 250
LPPCPVAGGW GPWGPVSPCP VTCGLGQTME QRTCNHPVPQ HGGPFCAGDA 300
TRTHICNTAV PCPVDGEWDS WGEWSPCIRR NMKSISCQEI PGQQSRGRTC 350
RGRKFDGHRC AGQQQDIRHC YSIQHCPLKG SWSEWSTWGL CMPPCGPNPT 400
RARQRLCTPL LPKYPPTVSM VEGQGEKNVT FWGRPLPRCE ELQGQKLVVE 450
EKRPCLHVPA CKDPEEEEL
The leader sequence is underlined, and the single N-linked glycosylation site (occupied) is indicated (N).
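The annotation above can be cross-checked with a short script. The following is a minimal sketch, not part of the original entry: it strips the position numbers from the printed precursor sequence and scans for the N-X-S/T consensus for N-linked glycosylation (X being any residue except proline). The helper names `clean` and `find_sequons` are illustrative, and the block at positions 441-450 is assumed to read ELQGQKLVVE.

```python
import re

# Properdin precursor as printed above (numbering includes the leader peptide).
# Assumption: the block at positions 441-450 reads ELQGQKLVVE.
PROPERDIN = """
MITEGAQAPR LLLPPLLLLL TLPATGSDPV LCFTQYEESS GKCKGLLGGG 50
VSVEDCCLNT AFAYQKRSGG LCQPCRSPRW SLWSTWAPCS VTCSEGSQLR 100
YRRCVGWNGQ CSGKVAPGTL EWQLQACEDQ QCCPEMGGWS GWGPWEPCSV 150
TCSKGTRTRR RACNHPAPKC GGHCPGQAQE SEACDTQQVC PTHGAWATWG 200
PWTPCSASCH GGPHEPKETR SRKCSAPEPS QKPPGKPCPG LAYEQRRCTG 250
LPPCPVAGGW GPWGPVSPCP VTCGLGQTME QRTCNHPVPQ HGGPFCAGDA 300
TRTHICNTAV PCPVDGEWDS WGEWSPCIRR NMKSISCQEI PGQQSRGRTC 350
RGRKFDGHRC AGQQQDIRHC YSIQHCPLKG SWSEWSTWGL CMPPCGPNPT 400
RARQRLCTPL LPKYPPTVSM VEGQGEKNVT FWGRPLPRCE ELQGQKLVVE 450
EKRPCLHVPA CKDPEEEEL
"""


def clean(seq_text):
    """Drop whitespace and position numbers, keeping residue letters only."""
    return "".join(ch for ch in seq_text if ch.isalpha()).upper()


def find_sequons(seq):
    """Return 1-based positions of Asn in N-X-S/T motifs where X is not Pro."""
    # The lookahead keeps overlapping candidate motifs from being skipped.
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", seq)]


seq = clean(PROPERDIN)
print(len(seq), find_sequons(seq))  # prints: 469 [428]
```

Under these assumptions the scan finds a single consensus site, at N428, which agrees with the statement that the sequence carries one N-linked glycosylation site.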
Protein modules
Leader peptide 1-27 exon 2
TSP1 (1) 77-134 exon 4
TSP1 (2) 135-191 exon 5
TSP1 (3) 192-255 exon 6
TSP1 (4) 256-313 exon 7
TSP1 (5) 314-377 exon 8
TSP1 (6) 378-437 exon 9/10
Polymerization and ability to stabilize C3bBb are impaired by deletion of TSP1 (4), (5) or (6), but not by deletion of (3).
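The module boundaries listed above lend themselves to a simple consistency check. The sketch below is illustrative only: it computes the length of each listed module and reports the stretches of the precursor that are not assigned to any module. The names `MODULES`, `module_lengths` and `unassigned` are hypothetical, and the precursor length of 469 residues is an assumption taken from the printed protein sequence.

```python
# Module boundaries as tabulated above (precursor numbering, inclusive).
MODULES = [
    ("Leader peptide", 1, 27),
    ("TSP1 (1)", 77, 134),
    ("TSP1 (2)", 135, 191),
    ("TSP1 (3)", 192, 255),
    ("TSP1 (4)", 256, 313),
    ("TSP1 (5)", 314, 377),
    ("TSP1 (6)", 378, 437),
]
PRECURSOR_LENGTH = 469  # assumption: length of the printed protein sequence


def module_lengths(modules):
    """Map each module name to its length in residues."""
    return {name: end - start + 1 for name, start, end in modules}


def unassigned(modules, total):
    """Return inclusive residue ranges not covered by any listed module."""
    gaps, pos = [], 1
    for _, start, end in sorted(modules, key=lambda m: m[1]):
        if start > pos:
            gaps.append((pos, start - 1))
        pos = max(pos, end + 1)
    if pos <= total:
        gaps.append((pos, total))
    return gaps


print(module_lengths(MODULES))
print(unassigned(MODULES, PRECURSOR_LENGTH))  # prints: [(28, 76), (438, 469)]
```

The six TSP1 modules tile residues 77-437 without gaps; only the N-terminal region following the leader and the C-terminal tail fall outside the listed modules.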
Chromosomal location Human: short arm of X chromosome, Xp11.3-Xp11.23.
cDNA sequence 13,14,17 GAGCCTATCA TCAACATGAT TGCTCACCCT CCTCCGGCAA ACACTGCCTT GATGGTCCCT TGCGGTACCG CCCTGGAGTG GGTCTGGCTG GCAGGCGAGC AGGAATCAGA GGGGCCCCTG CACGAAGCCG CGGGGCTAGC GCTGGGGGCC TGGAACAACG ATGCCACCCG
ACCCAGATAA CACAGAGGGA GCCAGCCACA GTGCAAGGGC TGCCTACCAG GTGGTCCACA GCGCTGTGTG GCAGCTCCAG GGGGCCCTGG CTGTAATCAC GGCCTGTGAC GACCCCCTGC CAAGTGTTCT CTAGGAGOAG TTGGGGCCCT GACGTGCAAT GACCCACATC
AGCGGGACCT GCGCAGGCCC GGCTCAGACC CTCCTGGGGG AAACGTAGTG TGGGCCCCCT GGCTGGAATG GCCTGTGAGG GAGCCTTGCT CCTGCTCCCA ACCCAGCAGG TCAGCCTCCT GCACCTGAGC CGGAGGTGCA GTGAGCCCCT CACCCTGTGC TGCAACACAG
CCTCTCTGGT CTCGATTGTT CCGTGCTCTG GTGGTGTCAG GTGGGCTCTG GTTCGGTGAC GGCAGTGCTC ACCAGCAGTG CTGTCACCTG AGTGTGGGGG TCTGCCCCAC GCCACGGTGG CCTCCCAGAA CCGGCCTGCC GGCCTGTGAC CCCAGCATGG CTGTGCCCTG
AGAGGTGCAG GCTGCCGCCG CTTCACCCAG CGTGGAAGAC TCAGCCTTGC GTGCTCTGAG TGGAAAGGTG CTGTCCTGAG CTCCAAAGGG CCACTGCCCA ACACGGGGCC ACCCCACGAA ACCTCCTGGG ACCCTGCCCA CTGTGCCCTG GGGCCCCTTC CCCTGTGGAT
GGGGCAGTAC 60 CTGCTCCTGC 120 TATGAAGAAT 180 TGCTGTCTCA 240 AGGTCCCCAC 300 360 GGCTCCCAGC GCACCTGGGA 420 ATGGGCGGCT 480 540 ACCCGGACCC GGACAGGCAC 600 TGGGCCACCT 660 CCTAAGGAGA 720 780 AAGCCCTGCC GTGGCTGGGG 840 GGCCAGACCA 900 TGTGCTGGCG 960 GGGGAGTGGG 1020
ACTCGTGGGG AAATCCCGGG GATGTGCCGG AAGGATCATG CTACCCGTGC CCATGGTCGA GTGAGGAGCT CTGCTTGCAA TGACCTTCCA
GGAGTGGAGC CCAGCAGTCA GCAACAGCAG GTCAGAGTGG CCGCCAGCGC AGGTCAGGGC ACAAGGGCAG AGACCCTGAG AACCTCAATA
CCCTGTATCC GACGGAACAT CGCGGGAGGA CCTGCAGGGG GATATCCGGC ACTGCTACAG AGTACCTGGG GGCTGTGCAT CTCTGCACAC CCTTGCTCCC GAGAAGAACG TGACCTTCTG AAGCTGGTGG TGGAGGAGAA GAAGAGGAACT CTAACACTT AACTAGCCTCT TCGAAAAAA
GAAGTCCATC CCGCAAGTTT CATCCAGCAC GCCCCCCTGT CAAGTACCCG GGGGAGACCG ACGACCATGT CTCTCCTCCA AAAAAAAAAA
AGCTGTCAAG GACGGACATC TGCCCCTTGA GGACCTAATC CCCACCGTTT CTGCCACGGT CTACACGTGC CTCTGAGCCC AAA
1080 1140 1200 1260 1320 1380 1440 1500
The first five nucleotides in each exon are underlined to indicate the intron-exon boundaries. The methionine initiation codon (ATG), the termination codon (TAA) and the probable polyadenylation signal (AATAAA) are indicated.
Genomic structure 18 The gene spans 6 kb and is encoded by 10 exons, illustrated below. The introns vary from 0.1 to 1.6 kb.
[Figure: exon map of the gene, exons 1-10; scale bar 1 kb.]
Accession numbers Human Mouse Guinea-pig 20
X57748 M83652 S49355 X12905 S81116
Deficiency X-linked. Defective alternative pathway function, resulting in highly impaired bactericidal activity. Patients are highly susceptible to fulminant meningococcal infections. Mutations identified:
Type I (complete deficiency): C546 to T; R161 to stop
Type II (partial deficiency): C363 to T; R100 to W
Type III (dysfunctional protein): T1305 to G; Y414 to D
Polymorphic variants None known.
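Each mutation listed above pairs a single-nucleotide change with an amino acid change. As a hedged illustration, the sketch below enumerates, under the standard genetic code, which codons are compatible with each stated pair. It does not read the cDNA sequence, and the actual codons present in the properdin gene are not stated in this entry; the function name `codon_routes` is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_routes(ref_aa, alt_aa, ref_base, alt_base):
    """List (codon, mutant) pairs in which one ref_base -> alt_base change
    converts a codon for ref_aa into a codon for alt_aa."""
    routes = []
    for codon, aa in CODON_TABLE.items():
        if aa != ref_aa:
            continue
        for i, base in enumerate(codon):
            if base == ref_base:
                mutant = codon[:i] + alt_base + codon[i + 1:]
                if CODON_TABLE[mutant] == alt_aa:
                    routes.append((codon, mutant))
    return sorted(routes)


print(codon_routes("R", "*", "C", "T"))  # C546 to T; R161 to stop
print(codon_routes("R", "W", "C", "T"))  # C363 to T; R100 to W
print(codon_routes("Y", "D", "T", "G"))  # T1305 to G; Y414 to D
```

Under the standard code, a C to T change can turn an arginine codon into a stop only via CGA to TGA, and into tryptophan only via CGG to TGG; a T to G change turns tyrosine into aspartate via TAT to GAT or TAC to GAC.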
References
1 Farries, T.C. et al. (1987) Biochem. J. 243, 507-517.
2 Fearon, D.T. and Austen, K.F. (1975) J. Exp. Med. 142, 856-863.
3 Nolan, K.F. and Reid, K.B.M. (1990) Biochem. Soc. Trans. 18, 1161-1162.
4 Smith, C.A. et al. (1984) J. Biol. Chem. 259, 4582-4588.
5 Pangburn, M.K. (1989) J. Immunol. 142, 202-207.
6 Smith, K.F. et al. (1991) Biochemistry 30, 8000-8008.
7 Farries, T.C. et al. (1988) Biochem. J. 252, 47-54.
8 Nolan, K.F. and Reid, K.B.M. (1993) Methods Enzymol. 223, 35-46.
9 Whaley, K. (1980) J. Exp. Med. 151, 501-516.
10 Schwaeble, W. et al. (1993) J. Immunol. 151, 2521-2528.
11 Wirthmueller, U. et al. (1997) J. Immunol. 158, 4444-4451.
12 Schwaeble, W. et al. (1994) Eur. J. Biochem. 219, 759-764.
13 Nolan, K.F. et al. (1992) Biochem. J. 287, 291-297.
14 Nolan, K.F. et al. (1991) Eur. J. Immunol. 21, 771-776.
15 Higgins, J.M. et al. (1995) J. Immunol. 155, 5777-5785.
16 Goundis, D. et al. (1989) Genomics 5, 56-60.
17 Maves, K.K. and Weiler, J.M. (1992) J. Lab. Clin. Med. 120, 761-766.
18 Fredrikson, G.N. et al. (1996) J. Immunol. 157, 3666-3671.
19 Goundis, D. and Reid, K.B.M. (1988) Nature 335, 82-85.
20 Maves, K.K. et al. (1995) Immunology 86, 475-479.
21 Westberg, J. et al. (1995) Genomics 29, 1-8.
CD59
B. Paul Morgan, Department of Medical Biochemistry, University of Wales, Cardiff, UK
Other names P-18, membrane inhibitor of reactive lysis (MIRL), homologous restriction factor 20 (HRF-20), membrane attack complex inhibitory factor (MACIF), protectin.
Physicochemical properties CD59 is synthesized as a 128 amino acid precursor, including a 25 amino acid leader peptide at the N-terminus and a 26 amino acid signal for glycosylphosphatidylinositol (GPI) anchor addition at the C-terminus. The mature protein consists of 77 amino acids, the GPI anchor attachment site being at N102.
Mr (K): predicted 11.5, observed 18-23
N-linked glycosylation sites: 1 (43)
Structure CD59 is a compact, disc-shaped structure with four loops created by intramolecular disulfide bonds projecting from the disc. There is a large N-linked carbohydrate group placed laterally, containing variable structures in the Mr range 4000-6000. A flexible seven amino acid stalk extends from C94 to N102, the site of GPI anchor addition. The C8/C9-binding site has been putatively localized to a hydrophobic groove on the concave upper face of the disc, centred around W65. CD59 shows sequence and structural similarities with the murine Ly-6 antigens and a number of other molecules now grouped in the 'Ly-6 multigene family'.
Function CD59 binds C8 in the forming membrane attack complex (MAC) and blocks the recruitment of multiple C9 molecules necessary for assembly of the MAC pore. It may also bind C9 in partially assembled MACs. It has a proposed role in signalling cell activation upon assembly of the MAC. Suggested roles as a ligand for CD2 and as an inhibitor of perforin have not been supported.
Tissue distribution Widely expressed, present on all circulating cells, vascular endothelium, epithelia and in most tissues. Weakly expressed in the central nervous system. Fluid-phase forms in urine and some other biological fluids.
Regulation of expression Little studied. Expression in vitro is enhanced by incubation of cells with phorbol esters.
Protein sequence 10
MGIQGGSVLF GLLLVLAVFC HSGHSLQCYN CPNPTADCKT AVNCSSDFDA 50
CLITKAGLQV YNKCWKFEHC NFNDVTTRLR ENELTYYCCK KDLCNFNEQL 100
ENGGTSLSEK TVLLLVTPFL AAAWSLHP
The N-terminal leader peptide is underlined and the single N-linked glycosylation site is indicated (N). The site of GPI anchor addition is indicated by N. This sequence does not include the translated product of exon 2.
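The processing described for CD59 (a 25 residue leader, a 77 residue mature protein ending at the GPI attachment site N102, and a 26 residue GPI signal) can be checked against the printed precursor with simple slicing. This is a minimal sketch with illustrative variable names; it assumes that the residue at position 102, marked in the printed sequence as the GPI attachment site, is asparagine, as stated in the text.

```python
# CD59 precursor as printed above (exon 2 product not included).
# Assumption: position 102, marked as the GPI attachment site, is N.
CD59_PRECURSOR = (
    "MGIQGGSVLF" "GLLLVLAVFC" "HSGHSLQCYN" "CPNPTADCKT" "AVNCSSDFDA"
    "CLITKAGLQV" "YNKCWKFEHC" "NFNDVTTRLR" "ENELTYYCCK" "KDLCNFNEQL"
    "ENGGTSLSEK" "TVLLLVTPFL" "AAAWSLHP"
)

LEADER_END = 25   # last residue of the N-terminal leader peptide
GPI_SITE = 102    # residue to which the GPI anchor is attached

leader = CD59_PRECURSOR[:LEADER_END]
mature = CD59_PRECURSOR[LEADER_END:GPI_SITE]
gpi_signal = CD59_PRECURSOR[GPI_SITE:]


def residue(seq, position):
    """Return the residue at a 1-based position in precursor numbering."""
    return seq[position - 1]


print(len(CD59_PRECURSOR), len(leader), len(mature), len(gpi_signal))
# prints: 128 25 77 26
print(residue(CD59_PRECURSOR, 43), residue(CD59_PRECURSOR, 65),
      residue(CD59_PRECURSOR, 94), residue(CD59_PRECURSOR, 102))
# prints: N W C N
```

The lengths add up as described (25 + 77 + 26 = 128), and the residues quoted elsewhere in the entry (N43, W65, C94, N102) fall at the stated positions when the precursor numbering is used.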
Chromosomal location Human: 11p14-p13. Mouse: chromosome 2, E2-E4.
cDNA sequence1,10,13 CGCAGAAGCG GGTGTAGGAG GGAGGGTCTG AGCCTGCAGT TCATCTGATT TGGAAGTTTG ACGTACTACT ACATCCTTAT AGCCTTCATC TCCGCTTTCT GAAAGAATAA GACCAGTCCT GTGACTTGAA ACAGCTTGAG GTCAGTTAGC CTCACATGGA TGTTCCATAT TCTGGCAGGG AGGTACAAGT TATCTTCCAC
GCTCGAGGCT TTGAGACCTA TCCTGTTCGG GCTACAACTG TTGATGCGTG AGCATTGCAA GCTGCAAGAA CAGAGAAAAC CCTAAGTCAA CTTGCTGCCA AATTAGCTTG GCCCGCAGGG CTAGATTGCA TGGGTTCTCT ATCATTAGTA ACGCTTTCAT GTGGGTGTCA AAGTGGGGAA GGCTGAAAAT TGGAAAAGTG
GGAAGAGGAT CTTCACAGTA GCTGCTGCTC TCCTAACCCA TCTCATTACC TTTCAACGAC GGACCTGTGT AGTTCTTCTG CACCAGGAGA CATTCTAAAG AGCAACCTGG AAGCCCCACT TGCTTCCTCC GCAGCCCTCA CATCTTTGGA AAACTTCAGG GTCAGGGACA GTGTTCCAGA CGAGTTTTTC TAATAGCATA
CCTGGGCGCC GTTCTGTGGA GTCCTGGCTG ACTGCTGACT AAAGCTGGGT GTCACAACCC AACTTTAACG CTGGTGACTC GCTTCTCCCA GCTTGATATT CTAAGATAGA TGAAGGAAGA TTTGCTCTTG GATTATTTTT GGGTGGGGCA GATCCCGTGT ACAAGATCCT TTCCAGATAG CTCTGTCTTT CATCAATGGT
GCCAGTCTTT CAATCACAAT TCTTCTGCCA GCAAAACAGC TACAAGTGTA GCTTGAGGGA AACAGCTTGA CATTTCTGGC AACTCCCCGT TTCCAAATGG GGGGTCTGGG AGTCTAAGAG GGAAGACCAG CCTCTGGCTC GGAGTATATG TGCCATGGAG TAATGCAGAG CAGGGCATGA AAATTTTATA GTGTT
AGCACCAGTT 60 GGGAATCCAA 12 0 TTCAGGTCAT 180 CGTCAATTGT 2 40 TAACAAGTGT 3 00 AAATGAGCTA 3 60 AAATGGTGGG 42 0 AGCAGCCTGG 480 TCCTGCGTAG 540 ATCCTGTTGG 600 AGACTTTGAA 660 TGAAGTAGGT 72 0 CTTTGCAGTG 7 80 CTTGGATGTA 84 0 AGCATCCTCT 900 GCATGCCAAA 9 60 CTAGAGGACT 102 0 AAACTTAGAG 108 0 TGGGCTTTGT 112 0
The first five nucleotides in each exon are underlined to indicate the intron-exon boundaries. The methionine initiation codon (ATG), the termination codon (TAA) and the first polyadenylation signal (AATAAA) are indicated. Exon 2 is alternatively spliced and only a minority of total mRNA (10-20%) contains this exon. At least four species of mRNA have been identified with mobilities of 0.6 kb, 1.2 kb, 1.9 kb and 2.2 kb, differing only in the degree of polyadenylation.
Genomic structure The gene spans approximately 26 kb and contains 5 exons, including the alternatively spliced exon 2.
[Figure: exon map of the gene, exons 1-5; scale bar 2 kb.]
Accession numbers (EMBL/GenBank) Human Primate Rat Mouse Pig Rabbit
X15861 X16447 L22860 U48255 U60473 AF020302 AF040387
Deficiency Single reported case of complete deficiency presenting with nocturnal haemoglobinuria and multiple thrombotic episodes. Defect caused by deletion of C231 in the coding region, leading to premature termination. On the same allele, G469 is also deleted. Deficiency of CD59 (and other GPI-anchored proteins) is found on a clone of circulating cells in paroxysmal nocturnal haemoglobinuria.
Polymorphic variants None reported in coding region.
References
1 Davies, A. et al. (1989) J. Exp. Med. 170, 637-654.
2 Rudd, P.M. et al. (1997) J. Biol. Chem. 272, 7229-7244.
3 Fletcher, C.M. et al. (1994) Structure 2, 185-199.
4 Kieffer, B. et al. (1994) Biochemistry 33, 4471-4482.
5 Meri, S. et al. (1990) Immunology 71, 1-9.
6 Rollins, S.A. and Sims, P.J. (1990) J. Immunol. 144, 3478-3483.
7 Morgan, B.P. et al. (1993) Eur. J. Immunol. 23, 2841-2850.
8 Nose, M. et al. (1990) Immunology 70, 145-149.
9 Meri, S. et al. (1991) Lab. Invest. 65, 532-537.
10 Okada, H. et al. (1989) Biochem. Biophys. Res. Commun. 162, 1553-1559.
11 Bickmore, W. et al. (1993) Genomics 17, 129-135.
12 Powell, M.B. et al. (1997) J. Immunol. 158, 1692-1702.
13 Holguin, M.H. et al. (1996) J. Immunol. 157, 1659-1668.
14 Tone, M. et al. (1992) J. Mol. Biol. 227, 971-976.
15 Petranka, J.G. et al. (1992) Proc. Natl Acad. Sci. USA 89, 7876-7879.
16 Motoyama, N. et al. (1992) Eur. J. Immunol. 22, 2669-2673.
17 Yamashina, M. et al. (1990) N. Engl. J. Med. 323, 1184-1189.