THE TRANSPORTER
FactsBook
Other books in the FactsBook Series:
A. Neil Barclay, Albertus D. Beyers, Marian L. Birkeland, Marion H. Brown, Simon J. Davis, Chamorro Somoza and Alan F. Williams. The Leucocyte Antigen FactsBook, 1st edn
Robin Callard and Andy Gearing. The Cytokine FactsBook
Steve Watson and Steve Arkinstall. The G-Protein Linked Receptor FactsBook
Rod Pigott and Christine Power. The Adhesion Molecule FactsBook
Shirley Ayad, Ray Boot-Handford, Martin J. Humphries, Karl E. Kadler and C. Adrian Shuttleworth. The Extracellular Matrix FactsBook
Grahame Hardie and Steven Hanks. The Protein Kinase FactsBook; The Protein Kinase FactsBook CD-Rom
Edward C. Conley. The Ion Channel FactsBook I: Extracellular Ligand-Gated Channels
Edward C. Conley. The Ion Channel FactsBook II: Intracellular Ligand-Gated Channels
Kris Vaddi, Margaret Keller and Robert Newton. The Chemokine FactsBook
Marion E. Reid and Christine Lomas-Francis. The Blood Group Antigen FactsBook
A. Neil Barclay, Marion H. Brown, S.K. Alex Law, Andrew J. McKnight, Michael G. Tomlinson and P. Anton van der Merwe. The Leucocyte Antigen FactsBook, 2nd edn
Robin Hesketh. The Oncogene and Tumour Suppressor Gene FactsBook, 2nd edn
THE TRANSPORTER FactsBook
Jeffrey Griffith
Department of Biochemistry and Molecular Biology, University of New Mexico School of Medicine
Clare Sansom
Department of Crystallography, Birkbeck College, University of London
Academic Press
Harcourt Brace & Company, Publishers
SAN DIEGO, LONDON, SYDNEY, BOSTON, TOKYO, NEW YORK, TORONTO
This book is printed on acid-free paper
Copyright © 1998 by ACADEMIC PRESS
All Rights Reserved
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.
Academic Press, 525 B Street, Suite 1900, San Diego, California 92101-4495, USA
http://www.apnet.com
Academic Press Limited, 24-28 Oval Road, London NW1 7DX, UK
http://www.hbuk.co.uk/ap/
ISBN 0-12-303965-7
Library of Congress Cataloging-in-Publication Data
Griffith, Jeffrey. The transporter factsbook / by Jeffrey Griffith, Clare Sansom. p. cm. Includes index. ISBN 0-12-303965-7 (alk. paper) 1. Carrier proteins. I. Sansom, Clare. II. Title. QP552.C34G75 1997 97-44438 572'.69-dc21 CIP
A catalogue record for this book is available from the British Library
Typeset in Great Britain by Alden Group, Oxford. Printed in Great Britain by WBC, Bridgend, Mid Glamorgan
98 99 00 01 02 03 EB 9 8 7 6 5 4 3 2 1
Contents
Preface
Abbreviations
Chapter 1 Function and Structure of Membrane Transport Proteins (Peter J.F. Henderson)
Chapter 2 Amino Acid Sequence Comparisons
Chapter 3 Organization of the Data
Part 1 P-Type ATPases
  Calcium-transporting ATPase family
  Plasma membrane cation-transporting ATPase family
  Heavy metal-transporting ATPase family
Part 2 Vacuolar ATPases
  Vacuolar ATPase family
Part 3 ABC Multidrug Resistance Proteins
  White transporter family
  ABC 1 & 2 transporter family
  Yeast multidrug resistance family
  Cystic fibrosis transmembrane conductance regulator family
  P-Glycoprotein transporter family
  Peroxisomal membrane transporter family
Part 4 ABC-2 Transporters
  ABC-2 nodulation protein family
  ABC-2 polysaccharide exporter family
  ABC-2 associated (cytoplasmic) protein family
Part 5 ABC Binding Protein-Dependent Transporters: Transmembrane Elements
  ABC-associated binding protein-dependent maltose transporter family
  ABC-associated binding protein-dependent peptide transporter family
  ABC-associated binding protein-dependent iron transporter family
Part 6 ABC Binding Protein-Dependent Transporters: Cytoplasmic Elements
  Binding protein-dependent monosaccharide transporter family
  Binding protein-dependent peptide transporter family
Part 7 Other ABC-Associated (Cytoplasmic) Proteins
  Heme exporter family
  Macrolide-streptogramin-tylosin resistance family
Part 8 H+-Dependent Symporters
  H+/sugar symporter-uniporter family
  H+/rhamnose symporter family
  H+/amino acid symporter family
  H+/lactose-sucrose-nucleoside symporter family
  H+/galactoside-pentose-hexuronide symporter family
  H+/oligopeptide symporter family
  H+/fucose symporter family
  H+/carboxylate symporter family
  H+/nucleotide symporter family
  Sugar phosphate transporter family
Part 9 H+-Dependent Antiporters
  H+/vesicular amine antiporter family
  14-Helix H+/multidrug antiporter family
  4-Helix H+/multidrug antiporter family
  12-Helix H+/multidrug antiporter family
  Acriflavin-cation resistance family
  Yeast multidrug resistance family
Part 10 Na+-Dependent Symporters
  Na+/Ca2+ exchanger family
  Na+/proline symporter family
  Na+/glucose symporter family
  Na+/dicarboxylate symporter family
  Na+/PO4 symporter family
  Na+/branched amino acid symporter family
  Na+/citrate symporter family
  Na+/alanine-glycine symporter family
  Na+/neurotransmitter symporter family
Part 11 Na+-Dependent Antiporters
  Na+/H+ antiporter family
Part 12 PEP-Dependent Phosphotransferase Family
  Phosphoenolpyruvate-dependent sugar phosphotransferase family
Part 13 Other Transporters
  Anion exchanger family
  Mitochondrial adenine nucleotide translocator family
  Mitochondrial phosphate carrier family
  Nitrate transporter I family
  Nitrate transporter II family
  Spore germination transporter family
  Vacuolar membrane pyrophosphatase family
  Gluconate transporter family
Index
Preface
The Transporter FactsBook had its inception early in 1996 when Tessa Picknett at Academic Press approached the authors with the idea of preparing a volume on transport proteins. Recognizing that the book would contain several different types of transporters, and that additional transporter species were being described almost daily, it was decided that the only way to make the volume comprehensive would be to base the chapters on families of related transporters, rather than individual proteins. Using this method, we have been able to include nearly 800 transport proteins in this volume. More important, this comparative approach, which stresses the structural, mechanistic and biological properties that are common to closely related proteins, provides an objective basis for identifying potential evolutionary relationships between distantly related groups of proteins and establishes a system for classifying and characterizing newly described transporters. The authors hope that this basis for identification and classification will continue to make the volume a valuable resource even after the compilation of transporters it contains is no longer comprehensive.
An undertaking of this scope and complexity would not have been possible without the help and advice of many people. In particular, the authors would like to thank Jennifer Bryant and Peggy Moran at the University of New Mexico School of Medicine for cheery and able assistance in establishing the relationships between the nearly 800 transporters described in The Transporter FactsBook, Dr Mark Platt, now at the Louisiana State University School of Medicine, for wizardry in editing and modifying the phylogenetic trees, and Tessa Picknett and her staff at Academic Press for encouragement, support and patience in getting the manuscript into press.
There will undoubtedly be omissions and errors in this volume, although we hope that they will be infrequent. We would greatly appreciate being informed of any inaccuracies by writing to the Editor, The Transporter FactsBook, Academic Press, 24-28 Oval Road, London NW1 7DX, UK, so that these can be rectified in future editions.
Jeff Griffith
Clare Sansom
Abbreviations
Å         angstrom unit
ABC       ATP binding cassette
ADP       adenosine diphosphate
Asn       asparagine
Asp       aspartic acid
ATP       adenosine triphosphate
C-        carboxyl-
C-4       4-carbon
CFTR      cystic fibrosis transmembrane conductance regulator
CNS       central nervous system
3-D       three-dimensional
DNA       deoxyribonucleic acid
EB        ethidium bromide
FAD       flavin adenine dinucleotide
GABA      4-amino butyric acid
Gln       glutamine
Glu       glutamic acid
HMA       heavy metal binding sequence
kDa       kilodalton
M         molar
mol. wt   molecular weight
MDR       multidrug resistance
MFS       major facilitator superfamily
mV        millivolts
N-        amino-
NADH      nicotinamide adenine dinucleotide (reduced form)
Δp        proton-motive force
ΔpH       transmembrane pH gradient
Δψ        transmembrane charge gradient
NMR       nuclear magnetic resonance
PEP       phosphoenolpyruvate
PIR       protein identification resource
PTS       phosphoenolpyruvate-dependent sugar phosphotransferase system
QUAC      quaternary ammonium compound
SD        standard deviation
USA       uniporter-symporter-antiporter
THE INTRODUCTORY CHAPTERS
Chapter 1 Function and Structure of Membrane Transport Proteins
Peter J. F. Henderson (Department of Biochemistry and Molecular Biology, University of Leeds, Leeds LS2 9JT, UK)
INTRODUCTION
The hydrophobic bilayer membrane that bounds cells is inherently impermeable to the great majority of hydrophilic solutes required for cell nutrition and to many of the waste products and/or toxins that must be excreted. Accordingly, the membrane contains proteins, the sole function of which is to catalyze the translocation of substrates through the membrane. As the substrates for many membrane processes can be obtained in radioisotope-labeled form, it has been technically feasible to characterize the functions of many of these transport proteins. The structures of the proteins themselves, however, have proved to be difficult to elucidate: they are of low natural abundance in the membrane; they are very hydrophobic and refractory to isolation methods in aqueous solutions; and, even when purified, usually in nondenaturing detergents, they are very difficult to crystallize. Where the proteins happen to be abundant - bacteriorhodopsin from Halobacterium halobium, K+/Na+ ATPases in nerve and Ca2+ ATPase from muscle, cytochrome oxidases in bacteria and mitochondria, glucose transporter from human erythrocytes, for example - progress has been made in elucidating the structure-function relationship. Yet, of these proteins the three-dimensional structure has only been determined for bacteriorhodopsin and the oxidases [1-3], and this is just the beginning of determining their molecular mechanisms of operation.
Free-living microorganisms (bacteria, algae, yeasts, parasitic protozoa) often inhabit environments where nutrients are in short supply, and different species must compete with each other for the available metabolites. Accordingly, they couple expenditure of metabolic energy to inward transport of essential nutrients (K+, NH4+, Pi, SO4(2-), sugars, vitamins, etc.) to achieve intracellular concentrations sufficient for optimal growth rates. This expenditure can amount to 20-30% of the organism's available energy when a carbohydrate is fermented under anaerobic conditions to yield only 2-3 moles ATP per mole sugar [4,5]. Since the efficiencies of the transport steps may therefore influence cell yield and growth rate [4,6,7], an understanding of the transport processes is important to both the academic researcher seeking to understand bacterial cell physiology, and the industrial manager trying to maintain the profitability of a fermentation process. Furthermore, the process of eliminating metabolic wastes and/or toxins such as antibiotics is often coupled to the expenditure of metabolic energy, an indication of its importance for survival. Motility also appears to be driven by transport processes, although this may not consume so much energy [8].
In higher organisms, where survival functions are distributed between different organs, the energization of nutrient capture and waste efflux may be confined to specific tissues, e.g. the gut and the kidney. As a result of their activities, cells in other tissues enjoy an unchallenging environment in which their energy reserves can be channeled into other functions. Thus, their transport processes more often occur by facilitated diffusion.
As approximately 5-15% of all proteins revealed by the current efforts in genome sequencing are membrane transport proteins [9], we anticipate the need for a huge effort in the new millennium to determine the structures of these proteins, which are vital for the capture of nutrients and hence the first stage in cell growth. Their additional roles in antibiotic resistance, toxin secretion, ATP synthesis, ion balance, generation of action potentials, synaptic neurotransmission, kidney function, intestinal absorption, tumor growth and other diverse cell functions in organisms from microbe to man presage a major investigative effort to elucidate their molecular mechanisms of action. This effort to elucidate vectorial processes can be compared to the continuing efforts to understand enzyme-mediated catalysis, though there is the possibility of an underlying uniformity of translocation mechanism despite the huge numbers of independent transport proteins that exist.
The advent of recombinant DNA technology has enabled the study of membrane transport proteins to be furthered in at least four major directions. The first is the burgeoning appearance of an enormous number of amino acid sequences of the proteins predicted from the DNA sequences of their genes in the genome mapping projects. This sequence information has enabled a second advance: the unambiguous exposure of the evolutionary relationships between proteins not thought hitherto to be related. The third is the manipulation of the genes to expedite amplified expression and purification of the proteins. Finally, the ability to mutagenize individual amino acids and to make chimeric proteins is being used to elucidate the relationship of function to structure.
A number of transport proteins play a role in human health and disease. The study of "ABC" transport systems (see later) in mammalian cells was intensified with the discovery that cystic fibrosis, the commonest inherited disease in the western world, was caused by a defect in the Cl- transport protein [10]. The significance of a multidrug resistance protein, "Mdr", which catalyzes secretion of cytotoxins, in the failure of anti-tumor chemotherapy similarly focused attention on a different ABC system. In both cases their similarity as ABC-type systems would have been completely obscured without the amino acid sequence information derived from the cloning and sequencing of their genes. Other transport proteins are involved in glucose/galactose malabsorption, albinism and adrenoleukodystrophy.
This FactsBook is intended to catalyze this new age of exploration of membrane transport protein structure. It is our major goal to arrive at a sensible classification of transport systems based upon both evolutionary and mechanistic considerations. The number of protein sequences now known is too large to include them all, and the expected appearance of legions more from the genome sequencing programs makes it timely to formulate a systematic approach to their classification. First it is important to describe current concepts of their functions. The treatment below is necessarily brief, and the reader is referred to the appropriate chapters in standard biochemistry textbooks [11,12] for a fuller introduction.
A watershed in the field occurred when Peter Mitchell [13-15] showed that transport processes were intimately associated with the mechanism of oxidative and photosynthetic ATP synthesis, a process which is central to energy metabolism in almost all organisms. However, because of the difficulties in studying the hydrophobic membrane proteins involved, we know very little about the molecular mechanism of such vectorial events; this contrasts with the wealth of information on the molecular mechanisms of chemical events catalyzed by water-soluble enzymes. It is quite possible that there is an underlying unity in the molecular mechanism of the translocation process, even when the direction of solute movement and any energization steps are completely different. This question is likely to be illuminated only when we elucidate the 3D structures and determine the structure-activity relationships of the transport proteins. By far the most central question in the transport field is precisely this - what are the 3D structures of the proteins involved? Before reaching this question it is useful to define some terms often used in the characterization of transport processes.
USEFUL CONCEPTS
Passive diffusion
Passive diffusion is the translocation of a solute across a membrane down its electrochemical gradient without the participation of a transport protein. The process follows Fick's law, and so obeys the relationship below, in which the velocity has a linear relationship to the [solute]:
$$v = P A c$$
where v is velocity, P is the permeability coefficient for the particular solute, A is the area, and c is the difference in solute concentration across the cell membrane. Diffusion has a low temperature coefficient (v ∝ absolute temperature) and is non-specific. Typical biologically important compounds that follow this mechanism are O2, CO2, NH3, HCO2H, CH3CO2H, CH2OH.CHOH.CH2OH - small, neutral molecules that are soluble in lipid membranes.
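The linear dependence of passive diffusion on the concentration difference can be sketched numerically. The following is a minimal illustration only: the function name and the values chosen for P, A and c are assumptions made for the example, not data from the text.

```python
def passive_diffusion_rate(permeability, area, conc_difference):
    """Rate v = P * A * c: linear in the concentration difference c."""
    return permeability * area * conc_difference


# Assumed, purely illustrative values (not taken from the text).
P = 1.0e-6  # permeability coefficient, cm/s
A = 1.0e-7  # membrane area, cm^2

for c in (1.0e-6, 2.0e-6, 4.0e-6):  # concentration difference, mol/cm^3
    v = passive_diffusion_rate(P, A, c)
    print(f"c = {c:.1e} mol/cm^3 -> v = {v:.1e} mol/s")
```

Doubling c doubles v at every step; unlike protein-mediated transport, there is no saturation.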
Facilitated diffusion
Facilitated diffusion is the translocation of a solute across a membrane down its electrochemical gradient catalyzed by a transport protein. The Michaelis-Menten relationship [11,12] often adequately relates the initial rate of transport (v) to initial substrate concentration ([S] = c at zero time):
$$v = \frac{V_{max}\,[S]}{K_m + [S]}$$
($V_{max}$ is maximum velocity; $K_m$ = [S] where v is $V_{max}/2$). As with enzyme reactions, there is a high temperature coefficient and, usually, strong substrate specificity. Biological substrates that follow this mechanism are typically charged and/or larger than about the size of glycerol, with a very low inherent solubility in biological membranes. Mitchell classified such transport of a single substrate as "uniport", and glycerol transport is an example of such facilitated diffusion in E. coli [16,17].
However, in free-living single-cell organisms, e.g. bacteria, yeasts, algae, the rate of capture of nutrient from the environment by this mechanism is probably too slow at the dilute concentrations that prevail in their normal environments to support competitive growth. Therefore, we usually find that transport of their vital nutrients is coupled to consumption of metabolic energy by active transport (see below) rather than facilitated diffusion. Presumably, during the course of evolution of such organisms the expenditure of precious energy reserves on transport has been a very significant survival factor. In contrast, transport of solutes between intracellular organelles in eukaryotes, or into tissue cells from the blood, often occurs by facilitated diffusion since high concentrations of solute are already established, for example by the Na+ symport system (below), so that facilitated diffusion by the tissue glucose uniporters is sufficient to support cell metabolism. The seminal example of such a facilitated transport was the GLUT1 glucose transport protein in human erythrocytes [18,19].
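The saturable kinetics of facilitated diffusion can be contrasted with the linear kinetics of passive diffusion in a short sketch. The function names and all numerical values below are assumptions chosen for illustration; they are not measurements for any particular transporter.

```python
def facilitated_rate(s, v_max, k_m):
    """Michaelis-Menten initial rate: v = Vmax * [S] / (Km + [S])."""
    return v_max * s / (k_m + s)


def passive_rate(s, p_times_a):
    """Passive diffusion rate: v = (P * A) * c, linear in concentration."""
    return p_times_a * s


# Assumed, illustrative parameters in arbitrary units.
V_MAX = 100.0  # maximum velocity
K_M = 0.5      # substrate concentration giving half-maximal velocity
PA = 20.0      # permeability coefficient multiplied by area

for s in (0.05, 0.5, 5.0, 50.0):
    print(f"[S] = {s:5.2f}  facilitated = {facilitated_rate(s, V_MAX, K_M):6.2f}"
          f"  passive = {passive_rate(s, PA):7.2f}")
```

At [S] = Km the facilitated rate is half of Vmax, and it approaches Vmax as [S] increases, whereas the passive rate keeps rising in proportion to [S].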
Active transport
The term active transport is used to describe the net transport of a solute across a biological membrane from a low to a high electrochemical potential. Active transport shows the following characteristics.
• Accumulation of solute occurs against a concentration gradient.
• The solute is not chemically modified during translocation.
• Saturable steady-state kinetics are observed, often following the Michaelis-Menten relationship (above).
• There is a high temperature coefficient typical of enzyme-catalyzed reactions.
• Substrate specificity is restricted.
• An input of metabolic energy is required.
Active transport processes embrace a variety of molecular mechanisms, in which energy may be derived from light, oxidoreduction, ATP hydrolysis, or pre-existing solute gradients. It is conceptually helpful to classify them further into "primary" and "secondary" mechanisms [20] (Fig. 1). Secondary transport can be subdivided into "symport" or "antiport", terms introduced by Mitchell [13,14] (Fig. 1).
Primary active transport
Primary transport involves the direct conversion of chemical or photosynthetic energy into an electrochemical potential of solute across the membrane barrier. Thus, translocation of protons driven by oxidation of respiratory substrates [15,21,22], by hydrolysis of ATP [15,22-24] or by light energy absorbed by bacteriorhodopsin [1] all fall into this category. Many nutrient transport systems involving binding proteins in bacteria are of the primary type, directly energized by ATP (see below). All these examples transport one substrate in one direction and so are described as "uniport" [14].
Secondary active transport
Secondary transport involves the conversion of a pre-existing electrochemical gradient, usually of H+ or Na+ ions, into a new electrochemical gradient of the transported species. Thus the ultimate energy source for secondary transport systems is a primary chemical or photochemical conversion. In E. coli primary proton ejection by respiration or ATPase powers secondary sugar-H+ symport (obligatory coupling of H+ and solute movement in the same direction [14]; see Fig. 1) or secondary Na+/H+ antiport (the obligatory coupling of H+ and solute movement in opposite directions [13]; Fig. 2). For example, the resulting Na+ gradient can be further coupled to melibiose transport by a melibiose-Na+ symport, so that net melibiose accumulation is driven by respiration (or ATPase) via H+ and Na+ gradients (Fig. 2). In E. coli the transmembrane H+ gradient would appear to be the "common currency" of many energized transport reactions, and the Na+ gradient of relatively few. However, in other organisms living in salt environments the Na+ gradient is the dominant factor, maintained by a primary Na+ pump [25-27], as it is in multicellular eukaryotes.
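How large a solute gradient a pre-existing proton gradient could sustain can be estimated with a short calculation. This is a sketch under stated assumptions that go beyond the text: the solute is uncharged, n protons move with each solute molecule, coupling is perfectly tight, and the system is at thermodynamic equilibrium, so that the limiting ratio is exp(nFΔp/RT), with Δp taken as the magnitude of the proton-motive force. The function name and the example values are hypothetical.

```python
import math

R = 8.314    # gas constant, J mol^-1 K^-1
F = 96485.0  # Faraday constant, C mol^-1


def max_accumulation_ratio(pmf_magnitude_mV, n_protons=1, temp_K=298.15):
    """Limiting [S]in/[S]out for a neutral solute in tightly coupled H+ symport.

    Assumes equilibrium between the solute gradient and the proton-motive
    force: RT * ln([S]in/[S]out) = n * F * |delta p|.
    """
    pmf_V = pmf_magnitude_mV / 1000.0
    return math.exp(n_protons * F * pmf_V / (R * temp_K))


# Assumed example magnitudes of the proton-motive force, in mV.
for pmf in (60.0, 120.0, 180.0):
    print(f"|delta p| = {pmf:5.1f} mV -> ratio up to "
          f"{max_accumulation_ratio(pmf):.3g}")
```

Under these assumptions each additional 59 mV or so of proton-motive force at 25 °C raises the attainable accumulation ratio roughly tenfold; real systems fall short of this limit when coupling is incomplete.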
[Figure 1: diagram; panel labels include Secondary active transport (Symport, Antiport), Primary active transport, Uniport and Group translocation.]
Figure 1 Energization of sugar transport in E. coli. The large oval represents the cytoplasmic membrane of the microorganism. A transmembrane electrochemical gradient of protons is generated by respiration or ATP hydrolysis, depicted on the left. This can be utilized by the proton-nutrient symport or proton-substrate antiport systems shown along the top. Some sugars can be accumulated by an alternative mechanism involving ATP, a binding protein, and two or three other proteins shown along the bottom, with other ATP-dependent primary transport systems for uptake of K+ or efflux of toxin. A phosphotransferase mechanism involving PEP and two or three proteins for sugar accumulation is shown on the upper right, and facilitated transport of glycerol on the lower right.
Group translocation
All the above mechanisms operate without chemical modification of the solute. Group translocation systems catalyze both the translocation and concomitant chemical modification of the solute. For a range of carbohydrates in many species of bacteria, phosphoenolpyruvate (PEP) is the donor to produce internal sugar phosphate from external free sugar [28] (Fig. 1); the glucose phosphotransferase system is particularly widespread amongst anaerobic organisms.
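The definitions given so far can be gathered into a small decision procedure. The sketch below is only one way of encoding them: the class, its field names and the energy-source labels are hypothetical choices made for illustration, and the example processes are those named in the text.

```python
from dataclasses import dataclass
from typing import Optional

# Energy sources that are converted directly (primary active transport).
PRIMARY_SOURCES = {"ATP", "light", "oxidoreduction"}


@dataclass
class TransportProcess:
    protein_mediated: bool
    solute_modified: bool = False
    energy_source: Optional[str] = None  # None, a primary source, or "ion gradient"
    same_direction_as_ion: Optional[bool] = None  # only for "ion gradient"


def classify(p: TransportProcess) -> str:
    """Assign a process to one of the mechanistic classes described above."""
    if not p.protein_mediated:
        return "passive diffusion"
    if p.solute_modified:
        return "group translocation"
    if p.energy_source is None:
        return "facilitated diffusion (uniport)"
    if p.energy_source in PRIMARY_SOURCES:
        return "primary active transport"
    if p.energy_source == "ion gradient" and p.same_direction_as_ion is not None:
        kind = "symport" if p.same_direction_as_ion else "antiport"
        return f"secondary active transport ({kind})"
    raise ValueError("unrecognized or incomplete description")


examples = {
    "O2 across the bilayer": TransportProcess(protein_mediated=False),
    "glycerol in E. coli": TransportProcess(protein_mediated=True),
    "bacteriorhodopsin H+ pumping": TransportProcess(True, energy_source="light"),
    "sugar-H+ symport": TransportProcess(True, energy_source="ion gradient",
                                         same_direction_as_ion=True),
    "Na+/H+ antiport": TransportProcess(True, energy_source="ion gradient",
                                        same_direction_as_ion=False),
    "PEP-dependent glucose uptake": TransportProcess(True, solute_modified=True),
}

for name, process in examples.items():
    print(f"{name}: {classify(process)}")
```

The order of the tests mirrors the text: chemical modification of the solute distinguishes group translocation first, and the source of energy then separates facilitated diffusion, primary and secondary active transport.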
CLASSIFICATION OF MEMBRANE TRANSPORT SYSTEMS ACCORDING TO THEIR ENERGETICS
Although thousands of transport processes, each catalyzed by its own protein, have been identified, the strategies found for coupling metabolic energy to the translocation process are relatively few in number. These are now described to provide a formal basis for a preliminary classification of all the processes. While recent work indicates that a single transport system might employ more than one energization mechanism [29,30], or even that at least one novel mechanism may exist (TonB), the vast majority of biological transport systems so far fall conveniently into one of these classes. Their operation is illustrated in Fig. 1.
While previous investigators made many fundamental contributions to understanding transport processes (see review [31]), it was Peter Mitchell who showed how vital vectorial processes are to the totality of energy metabolism in living organisms. Accordingly, we will now sketch in the chemiosmotic approach before focusing on individual mechanisms of solute translocation.
[Figure 2: diagram; panel labels include Symport, Antiport, Respiration and Primary active transport.]
Figure 2 Sodium-linked transport systems. In halotolerant bacteria respiration may pump sodium ions from the inside to the outside, and the ATP synthase then utilizes the gradient of sodium ions to make ATP (depicted on the left). Similarly, instead of being driven by the proton gradient, rotation of the flagella is driven by the sodium gradient (right). Nutrients may be accumulated by a Na+-substrate symport system (top left) and toxins excreted by a Na+-substrate antiport (top right). In many bacteria, sodium ions are excreted by a Na+/H+ antiport (bottom left). In a few species there are sodium-secreting active transport systems driven by decarboxylation reactions (bottom right).
The chemiosmotic theory of oxidative and photosynthetic phosphorylation
In 1961, Peter Mitchell proposed his Chemiosmotic Theory of Oxidative and Photosynthetic Phosphorylation [32]. This sought to explain how ATP synthesis is coupled to oxidative or photosynthetic electron transfer by the use of an electrochemical gradient of protons across the membrane as a high-energy intermediate between the processes. This brilliant concept generated a wealth of productive experimental investigations that, not without some controversy, arrived at an acceptance that proton transport is a fundamental feature of ATP synthesis in virtually all organisms. The molecular mechanism of these processes is just beginning to be understood, with the very recent elucidations of the structures of proton-translocating proteins and electron transfer proteins [2,3]. There has also been the realization that rotation of the proteins in the membrane is a key feature of energy transmission for flagella [8,33] and ATP synthase [23,34].
The four basic parts of the chemiosmotic system, corresponding to the four postulates of the Chemiosmotic Hypothesis, can be paraphrased as follows [31,32]:
1. The proton-translocating reversible ATPase system.
2. The proton-translocating oxido-reduction or light-driven electron transfer chain.
3. The exchange diffusion systems, coupling proton translocation to that of anions and cations.
4. The ion-impermeable coupling membrane, in which systems 1, 2 and 3 reside.
The chemiosmotic view of substrate transport mechanism
It is postulate 3, which predicts the involvement of transport systems in the process of balancing charge and osmolarity across the membrane, that led Peter Mitchell to consider the energetics of solute uptake into bacteria. In 1963 he suggested that the uptake of sugars into microbial cells might be energized by a transmembrane proton gradient [13]. The idea required that an individual transport system catalyze the simultaneous translocation of protons with a substrate molecule, "symport", or the experimentally indistinguishable "antiport" of hydroxyl ions [13,31]. In this hypothesis energy released by respiration or ATP hydrolysis, and "stored" as the electrochemical gradient of protons, could drive accumulation of the nutrient [13-15,31]. The principle is illustrated in Fig. 1.
However, this brilliant prediction remained untested until 1970, when Ian West devised experimental conditions in which the movement of lactose or substrate analogs into cells of Escherichia coli containing the lactose transport protein (LacY) evoked an alkaline pH change, showing proton movement in the same direction [35-37]. Since then the structure-activity relationship of the LacY protein has been explored by every practicable method of modern molecular biology [38-41]. Several other sugar-H+ systems have been characterized but, most importantly, the principles enunciated by Mitchell [13,14] have been shown to apply to diverse bacterial transport systems responsible not just for the capture of nutrients like sugars, amino acids, vitamins and ions, but also for the extrusion of wastes and toxins including lactate, Na+ or antibiotics [31].
Many transport systems are not ion-linked
Although the Chemiosmotic Theory formed a framework to unify ideas on mechanisms of transport, it became evident that not all transport systems were linked to ion translocation. In bacteria, the seminal experiments of Berger and Heppel [42,43] showed that transport systems associated with periplasmic binding proteins were energized "directly", probably by ATP. These early ideas have been reinforced by the subsequent discovery of numerous ATP binding cassette, "ABC", transport systems in all types of organism that function to transport substrates into, or out of, whole cells or subcellular compartments. They are reviewed most recently by Higgins [10] and Boos and Lucht [44].
Furthermore, the uptake of some carbohydrates into bacteria, including, most importantly, glucose, was accompanied by simultaneous phosphorylation [28]. This chemical conversion occurred at the expense of phosphoenolpyruvate (PEP) via a cascade of phosphate transfer reactions [28]. The operation of such vectorial "group translocation" reactions was considered in detail by Mitchell [14,45], and the subsequent elucidation of these interesting systems has been reviewed most recently by Postma et al. [28].
CLASSIFICATION OF TRANSPORT SYSTEMS ACCORDING TO THE AMINO ACID SEQUENCES OF THEIR PROTEINS
Proteins catalyzing a single type of transport function and/or energization mechanism do not necessarily exhibit homology at the primary sequence level. Note that they might nevertheless have similar secondary and tertiary structures. Thus the sequences of the rhamnose-H+ and fucose-H+ symport proteins of E. coli are not homologous to that of the arabinose-H+, xylose-H+ or galactose-H+ symport protein of the same organism [46,47], and none of the sugar-H+ symporters are homologous to the sugar-Na+ symporters [48]. In addition, some phosphotransferase enzymes II are homologous while others are not [28].
More important, perhaps, is that some proteins catalyzing different types of transport according to the above classifications exhibit a high degree of primary sequence homology. One example is the similarity of E. coli sugar-H+ symport proteins for arabinose, xylose, or galactose to the mammalian non-energized glucose uniporter, GLUT1 (Fig. 3 [47]). Another example is the similarity of bacterial K+ ATPase uniport to mammalian Na+/K+ ATPase antiport and Ca2+ ATPase uniport proteins. The mitochondrial H+-Pi symport, ADP/ATP antiport, and oxoglutarate/malate antiport proteins [49] also show homology to one another.
It seems likely that our understanding of the molecular mechanisms of transport processes will be much enhanced by this rapid proliferation of information about the amino acid sequences of membrane transport proteins. In this book the transport proteins are arranged according to such evolutionary families. At least 28 families can be identified already (Table 1), and there are likely to be many more as the sequence databases grow.
TRANSPORT ACROSS PROKARYOTIC CELL MEMBRANES
Penetration of the cell wall by solutes
The cell walls of gram-negative bacteria have a complex multilayered structure that includes lipopolysaccharide, an outer lipid membrane, peptidoglycan, the periplasm and an inner phospholipid bilayer membrane (Fig. 4 50). This wall can be regarded as having at least two global functions that are to an extent antagonistic. First, the wall has to protect the cell against external toxins and environmental changes inimical to life; secondly, it has to permit the uptake of vital nutrients. The wall must also confer mechanical strength to maintain the integrity of the cell, for example when there are changes in osmotic pressure. In E. coli and a number of other species the evidence suggests that compounds of molecular weight less than about 900 penetrate to the inner membrane at rates that do not limit cell growth 50,51. This is achieved by at least three factors. First, the lipopolysaccharide layer is permeable to hydrophilic solutes, though it may be impermeable to more hydrophobic molecules including antibiotics 50,52. Secondly, the outer membrane contains channel-forming trimeric proteins ("porins" 52), acting as molecular sieves that permit simple diffusion of solutes of Mr up to 900, including di- and trisaccharides 50-52. Thirdly, the outer membrane also contains other porin-like proteins which exhibit some specificity for the permeant molecule, and pass the substrate (we presume) to high-affinity binding proteins in the periplasm 50,52. In general the porins can be regarded as forming a "pore" or "channel" that enables passive diffusion of solute into the periplasm at a rate sufficient for growth. However, not all porins are non-specific. A clear example of this is the maltoporin, LamB, which aids the entry into the cell of oligosaccharides containing up to six glucose units. The molecular basis of this specificity has recently been elucidated with the characterization of a "greasy slide" in the pore that interacts with the hydrophobic face of the sugar molecules 53. Similarly, the preference of one porin protein for anions, of which a most important nutrient is inorganic phosphate, is explained by a positively charged region in the molecule. Thus, the porins may reflect an evolutionary bridge between passive and facilitated modes of diffusion of nutrients into the cell.

Figure 3 Mammalian homologues of bacterial transport proteins. The bacterial transporters are depicted as in Figs 1 and 2 with their mammalian homologues indicated in bold type around them.

Importantly, the inner and outer membranes also have to function as conduits for secretion. Included amongst their substrates are: protein, carbohydrate and lipid components of outer layers of the cell wall; proteins and toxins secreted by
Function and Structure of Membrane Transport Proteins
Table 1 Families of Transport Proteins 1

Family | Example | Species | Code
Calcium-transporting ATPase | Probable calcium-transporting ATPase 4 | Saccharomyces cerevisiae | Atc4sacce
Peroxisomal Membrane | Adrenoleukodystrophy protein | Homo sapiens | Aldhomsa
ABC-2 Nodulation Protein | NodJ nodulation protein | Azorhizobium caulinodans | Nodjazoca
ABC-2 Polysaccharide Exporter | BexB capsular polysaccharide exporter | Haemophilus influenzae | Bexbhaein
ABC-2 Associated (Cytoplasmic) | ATP-binding protein NodI | Azorhizobium caulinodans | Nodiazoca
ABC-Associated Binding Protein Dependent Maltose Transporter | MalG maltose permease | Escherichia coli | Malgescco
ABC-Associated Binding Protein Dependent Peptide Transporter | DppC dipeptide transporter | Escherichia coli | Dppcescco
ABC-Associated Binding Protein Dependent Iron Transporter | BtuC vitamin B12 transport protein | Escherichia coli | Btucescco
Binding Protein Dependent Monosaccharide Transporter | L-Arabinose transport ATP binding protein | Escherichia coli | Aragescco
Binding Protein Dependent Peptide Transporter | Oligopeptide transport ATP binding protein | Escherichia coli | Oppdescco
Heme Exporter | Heme exporter CycV | Bradyrhizobium japonicum | Cycvbraja
Plasma Membrane Cation-Transporting ATPase | Calcium-transporting ATPase | Homo sapiens | Atchomsa
Macrolide-Streptogramin-Tylosin Resistance | Erythromycin resistance protein MsrA | Staphylococcus epidermidis | Msrastaep
H+-Sugar Symporter or Sugar Uniporter | Glut1 facilitative glucose transporter | Homo sapiens | Glut1homsa
H+-[...] Symporter | RhaT rhamnose-H+ symporter | Escherichia coli | Rhatescco
H+-[...] Acid Symporter | PheP phenylalanine transporter | Escherichia coli | Phepescco
H+-[...] Sucrose-Nucleoside Symporter | LacY lactose-H+ symporter | Escherichia coli | Lacyescco
H+-[...] Pentose-Hexuronide Symporter | MelB melibiose-H+ symporter | Escherichia coli | Melbescco
H+-[...] Symporter | Pet1 oligopeptide-H+ symporter | Homo sapiens | Pet1homsa
H+-[...] Symporter | FucP fucose-H+ symporter | Escherichia coli | Fucpescco
H+-Carboxylate Symporter | KgtP α-ketoglutarate-H+ symporter | Escherichia coli | Kgtpescco
H+-[...] Symporter | NupC pyrimidine nucleoside-H+ symporter | Escherichia coli | Nupcescco
Heavy Metal-Transporting ATPase | Copper-transporting ATPase 1 | Homo sapiens | At7ahomsa
Sugar Phosphate Transporter | UhpT hexose phosphate transporter | Escherichia coli | Uhptescco
H+-[...] Vesicular Antiporter | Vesicular amine transporter 2 (VAT2) | Homo sapiens | Vat2homsa
14-Helix H+/Multidrug Antiporter | QacA multidrug resistance protein | Staphylococcus aureus | Qacastaau
4-Helix H+/Multidrug Antiporter | QacC multidrug resistance protein | Staphylococcus aureus | Ebrstaau
12-Helix H+/[...] Antiporter | TetA(C) tetracycline antiporter | Escherichia coli | Tcr2escco
Acriflavin-Cation Resistance | AcrB acriflavin resistance protein | Escherichia coli | Acrbescco
Yeast Multidrug Resistance | Bmr benomyl-methotrexate resistance | Candida albicans | Bmrpcanal
Na+/Ca2+ Exchanger | Cardiac sodium/calcium exchanger | Homo sapiens | Naclhomsa
Na+-Proline Symporter | PutP proline-Na+ symporter | Escherichia coli | Putpescco
Na+-Glucose Symporter | Sglt1 glucose-Na+ symporter | Homo sapiens | Nagchomsa
Vacuolar ATPase | Vacuolar ATPase subunit | Homo sapiens | Vphlhomsa
Na+-Dicarboxylate Symporter | DctA dicarboxylate-Na+ symporter | Escherichia coli | Dctaescco
Na+-PO4 Symporter | Npt1 phosphate-Na+ cotransporter | Homo sapiens | Npt1homsa
Na+-[...] Amino Acid Symporter | BrnQ branched chain amino acid transporter | Salmonella typhimurium | Brnqsalty
Na+-[...] Symporter | CitN citrate transporter | Klebsiella pneumoniae | Citnklepn
Na+-Alanine-Glycine Symporter | ACP alanine transporter | Thermophilic bacterium PS-3 | Alcpthep3
Na+-[...] Symporter | Net1 noradrenalin-Na+ symporter | Homo sapiens | Ntnohomsa
Na+/H+ Antiporter | Nhe1 Na+/H+ antiporter | Homo sapiens | Nhe1homsa
Phosphoenolpyruvate-Dependent Sugar Phosphotransferase System (PTS) | PtaA N-acetyl glucosamine permease II | Escherichia coli | Ptaaescco
Anion Exchanger | AE1 anion exchange protein 1 | Homo sapiens | B3athomsa
Mitochondrial Adenine Nucleotide Translocator | Ant1 ADP/ATP carrier protein | Homo sapiens | Ant1homsa
White | White protein | Drosophila melanogaster | Whitdrome
Mitochondrial Phosphate Carrier | PHC phosphate carrier protein | Homo sapiens | Mpcphomsa
Nitrate Transporter I | NarK nitrate-nitrite facilitator protein | Escherichia coli | Narkescco
Nitrate Transporter II | CrnA nitrate transporter | Emericella nidulans | Crnaemeni
Spore Germination | Spore germination protein GraII | Bacillus subtilis | Gra2bacsu
Vacuolar Membrane Pyrophosphatase | Pyrophosphate-energized vacuolar proton pump | Arabidopsis thaliana | Avp3arath
Gluconate Transporter | GntP gluconate transporter | Bacillus subtilis | Gntpbacsu
ABC 1 & 2 | ATP binding protein ABC1 | Mus musculus | Abc1musmu
Yeast Multidrug Resistance | Multidrug resistance protein Cdr1 | Candida albicans | Cdr1canal
Cystic Fibrosis Transmembrane Conductance Regulator | Cystic fibrosis transmembrane conductance regulator | Homo sapiens | Cftrhomsa
P-Glycoprotein | Multidrug resistance protein Mdr1 | Homo sapiens | Mdr1homsa

1 Data kindly provided by J.K. Griffith and C.E. Sansom.
pathogenic organisms that aid their infection of host cells; enzymes required for the digestion of extracellular macromolecules such as cellulose, proteins, nucleic acids and lipids present as the result of the death of other organisms; and the active secretion of "assault" agents such as antibiotics.
Penetration of the inner cell membrane by solutes
The inner cell membrane, a protein-containing phospholipid bilayer (Fig. 4 54), is the barrier preventing the entry of most ambient solutes into the bacterial cell. Nutrient uptake is therefore effected by integral membrane transport proteins, either singly or in complexes, the majority of which are synthesized only in the presence of their substrate (see below). Energization of transport is effected at this inner membrane. Amongst its many other functions are the processes of respiration, ATP synthesis, maintenance of the K+ gradient, motility and osmoregulation, which are themselves transport processes 15,55,56. The membrane is therefore a dynamic assembly of transport proteins, some of which are dependent on others (Figs 1 and 2). For example, only one inducible protein is required for lactose transport 40, but the energization of its accumulation requires the respiratory chain or ATPase activity (see Figs 1 and 2), which are more permanent features of the membrane 56,57.
The importance of proton transport across the inner membranes
The Chemiosmotic Theory of Mitchell proposed that the respiratory enzymes pump protons across the inner bacterial membrane so that energy released by substrate oxidation is conserved as an electrochemical proton gradient 58-60. This "proton-motive force" could then be used as an energy "currency" for expenditure on ATP synthesis, nutrient transport, chemotaxis, osmoregulation, etc. (Fig. 1). In organisms without respiratory enzymes an H+-ATPase could maintain the proton-motive force utilizing ATP generated by fermentative metabolism.

Figure 4 Schematic drawing of the gram-negative bacterial cell envelope. The outer membrane (om) consists of lipopolysaccharide, phospholipid and proteins, most of which are porins. Inside the outer membrane is a peptidoglycan layer (pg), which is noncovalently bonded to the outer membrane via murein lipoproteins, themselves covalently attached to the peptidoglycan. The cell membrane (cm) is composed of phospholipid and protein, and is the location of the integral membrane proteins involved in transport. The region between the outer membrane and the cell membrane is called the periplasm. The wavy lines are fatty acid residues that anchor the phospholipids and lipid A into the membrane. LPS, lipopolysaccharide; O, oligosaccharide; C, core; A, lipid A; P, porin; PL, phospholipid; MLP, murein lipoprotein; Pr, protein; om, outer membrane; pg, peptidoglycan; cm, cell membrane. [Copied, with permission, from White, D. (1995) The Physiology and Biochemistry of Prokaryotes. Oxford University Press, New York.]
The existence of the proton-motive force (Δp) across the inner membrane has been conclusively established in a diversity of bacterial species. Its magnitude is usually equivalent to 200-300 mV, made up of both electrical (Δψ) and osmotic (ΔpH) components:

$$\Delta p = \Delta\psi - Z\,\Delta\mathrm{pH}$$

where Z is 2.3RT/F, the factor that converts pH units to millivolts, usually calculated at 25 °C. When the proton-motive force is used to energize solute transport by proton-coupled mechanisms (Figs 1 and 2), the gradients of solute that can be achieved are related to Δp by the following equation:

$$\log\frac{[S]_i}{[S]_o} = \frac{(n+m)\,\Delta\psi - nZ\,\Delta\mathrm{pH}}{Z}$$

where m is the substrate charge and n is the proton/substrate ratio. As already described, the Chemiosmotic Theory has been an invaluable guide for the elucidation of transport mechanisms. It is important to note that in some organisms living in alkaline and/or high salt environments, the Na+ ion has replaced H+ as the coupling cation 61,62. While in most examples the "conventional" oxidases and ATP synthase components seem simply to have adapted to pump Na+ instead of H+, in some organisms Na+ decarboxylase enzymes generate an electrochemical gradient of Na+ 62 (Fig. 2). The diagram in Fig. 5 illustrates the following mechanisms by which bacteria are known to effect the transport of some nutrients into their cells, and some solutes out.
1. Facilitated diffusion.
2. The "ATP-Binding-Cassette" (ABC) systems ("uniport") utilizing ATP to capture nutrient or drive efflux (Figs 1 and 5).
3. The group translocation mechanism utilizing PEP as energy source (Figs 1 and 5).
4. The H+-nutrient coupled ("symport") systems utilizing the transmembrane electrochemical gradient of protons generated by respiration or ATPase (Figs 1 and 5).
5. Coupled transport of similarly charged compounds - anions or cations - in opposite directions ("antiport", Figs 1, 2 and 5), which may effect either accumulation of desired substrate or efflux of enzyme, waste or toxin.

Figure 5 Mechanisms of transport across the bacterial cell membrane. The different types of transport activity are described in the text. [Panels: facilitated diffusion (glycerol); primary active transport (maltose); group translocation (mannitol); secondary active transport (H+-sugar symport, lactose; H+-antibiotic antiport, tetracycline).]

The best-understood membrane transport processes have been studied in the gram-negative organisms Escherichia coli and Salmonella typhimurium, which are convenient because of their unicellular nature. Furthermore, most of their transport mechanisms appear to occur in many other microorganisms and even in humans.
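The two proton-motive force relations given earlier can be evaluated numerically. The following is a minimal sketch, not taken from the book: it assumes Z = 2.303RT/F (about 59 mV per pH unit at 25 °C), takes Δψ as outside minus inside and ΔpH as pH(out) minus pH(in) so that a working gradient gives a positive Δp, and uses hypothetical function names and input values.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def z_factor_mv(temp_c=25.0):
    """Return Z = 2.303*R*T/F in mV per pH unit (assumed form of Z)."""
    return 1000.0 * math.log(10) * R * (temp_c + 273.15) / F


def proton_motive_force(delta_psi_mv, delta_ph, temp_c=25.0):
    """Delta p = Delta psi - Z * Delta pH, all in millivolts."""
    return delta_psi_mv - z_factor_mv(temp_c) * delta_ph


def log_accumulation_ratio(delta_psi_mv, delta_ph, n, m=0, temp_c=25.0):
    """log10([S]i/[S]o) = ((n + m) * Delta psi - n * Z * Delta pH) / Z.

    n is the proton/substrate ratio and m is the substrate charge.
    """
    z = z_factor_mv(temp_c)
    return ((n + m) * delta_psi_mv - n * z * delta_ph) / z


# Hypothetical inputs: 120 mV membrane potential, interior 1 pH unit more alkaline.
dp = proton_motive_force(120.0, -1.0)
ratio = 10 ** log_accumulation_ratio(120.0, -1.0, n=1, m=0)
print(f"Z = {z_factor_mv():.1f} mV, delta p = {dp:.0f} mV, [S]i/[S]o = {ratio:.0f}")
```

With these illustrative inputs an uncharged substrate moving with one proton could, at equilibrium, be concentrated roughly a thousandfold; a higher proton/substrate ratio or a positively charged substrate raises the limit further.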
TRANSPORT ACROSS EUKARYOTIC CELL MEMBRANES
The considerations that apply to understanding transport in prokaryotes extend to eukaryotes, with important exceptions. Transport is more difficult to study in multicellular organisms, where cells occur in tissues. Also, eukaryotic cells have subcellular compartments bounded by membranes. The transport reactions involved in ATP synthesis are localized in mitochondria and chloroplasts, which use an H+ electrochemical gradient for energy coupling. In order to accommodate solute-H+ symporters or antiporters in the cell membrane, therefore, organisms like yeast have an H+-ATPase located there 63. Mammalian cells, however, utilize a transmembrane Na+ gradient generated by an Na+/K+-ATPase to accommodate solute-Na+ symporters or antiporters 64. Quite often the maintenance of high concentrations of nutrient in the extracellular fluid (from the blood in mammals or the vascular system in plants) obviates the need for energized transport into the cell, so higher organisms can utilize facilitated diffusion systems in their cell membranes rather than active transport. Translocation of substrates between intracellular and extracellular compartments can have sophisticated functional implications. Examples are the release and recapture of neurotransmitter substances in nerve 65; sucrose mobilization in plants 64; antigen peptide presentation in lymphocytes 10; and protein targeting in plant and animal cells 67. Since little is known about each of the individual proteins that contributes to these processes, our understanding remains superficial at the present time.
THE NUMBER OF MEMBRANE PROTEIN COMPONENTS AND/OR DOMAINS INVOLVED IN A TRANSPORT SYSTEM
Facilitated diffusion transport systems usually contain a single protein. Similarly, secondary active transport systems usually contain one protein, if we discount those that generate the driving ion gradient. Primary active transport systems may occasionally contain one protein, for example bacteriorhodopsin. However, most appear to comprise a protein complex, involving from as few as two polypeptides (X+-ATPase 68) through six (histidine transport system) to 20 (F1Fo-ATPase) and more in, for example, NADH dehydrogenase 69. Both the ABC and phosphotransferase systems illustrate how transport systems that contained several separate polypeptides in primitive organisms may become fused together during the course of evolution so that one polypeptide with functionally distinguishable domains effects translocation. This has been particularly well illustrated by Higgins 70 and by Postma et al. 2s (see Fig. 6).
Figure 6 Proteins of multicomponent transport systems may become fused during evolution. The transport systems illustrated are discussed in the text. The upper part of the figure shows schematically various ABC primary active transport systems and the organisms in which they are found; the different polypeptides are unfused in the example on the left, and the shading indicates different types of fusion between functionally discrete domains that has occurred in other examples. The lower part of the figure shows different group translocation transport systems, all of which are phosphotransferases found in Escherichia coli; the different polypeptides, E IIA and E IIB, associated with the membrane component, E IIC, are unfused in the example on the left and the shading indicates different types of fusion in other examples. The figures are derived from information in refs. 2s, 44, 70.
MANY MEMBRANE TRANSPORT PROTEINS ARE PREDICTED TO CONTAIN 12 TRANSMEMBRANE DOMAINS
Hydropathy plots are widely used to predict whether regions of a protein might span the membrane as an α helix 71. This method is particularly applicable when a protein is predicted to contain a high proportion of hydrophobic amino acids. Some examples are shown in Fig. 7. The only authentication of their validity is the reasonable correspondence of predicted α helices with those actually observed in bacteriorhodopsin and membrane proteins of the photosynthetic reaction centre and light-harvesting complex 72,73, and more recently cytochrome oxidases 2,3. There is discussion over which algorithm, if any, is satisfactory 74-77. Despite these uncertainties, the majority of the transport protein sequences in Table 1 are predicted to contain 12 hydrophobic regions of sufficient length (19+ amino acids) to span the membrane as α helices 78,79. The possible exceptions are the transporters for methylenomycin and quaternary ammonium compounds, which may have 14 80,81, and the rhamnose-H+ transporter, predicted to have 10 82. Many of the sugar transport proteins have an extensive central (i.e. between transmembrane domains 6 and 7) hydrophilic region of about 65 amino acids which is predicted to contain a substantial proportion of α helix. Most of the other transport proteins also have a central hydrophilic region, although it is usually shorter than that of the sugar transport proteins. Taken with the evidence of some sequence duplication in the two halves of many of the proteins 4s, it seems reasonable to propose the existence of internal dimerization, originally resulting, perhaps, from gene duplication. This also accords with the same proposal by Lancaster 83, based on kinetic and inhibitor studies of the LacY porter.
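The sliding-window averaging behind a hydropathy plot can be sketched as follows. This is an illustrative sketch rather than the procedure used for the book's figures: it uses the Kyte and Doolittle hydropathy scale that the figure captions in this chapter name, while the function names, the default window of 19 residues and the cut-off of 1.6 are assumptions chosen for the example.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence.

    Entry i of the result covers residues i .. i + window - 1 (0-based).
    """
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]  # KeyError on non-standard residues
    total = sum(scores[:window])
    profile = [total / window]
    for i in range(window, len(scores)):
        # Slide by one residue: add the new one, drop the oldest.
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile


def hydrophobic_windows(profile, threshold=1.6):
    """Start indices of windows whose mean hydropathy reaches the cut-off."""
    return [i for i, value in enumerate(profile) if value >= threshold]


# Hypothetical toy sequence: a hydrophobic stretch followed by a charged one.
profile = hydropathy_profile("LLLLRRRR", window=4)
print(hydrophobic_windows(profile))
```

Runs of consecutive windows above the cut-off mark candidate membrane-spanning segments; varying the window size, as in Fig. 7a, shows how sensitive the prediction is to that choice.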
Despite the differences in individual sequences, an underlying similarity between transport proteins from otherwise dissimilar groups seems to exist 79, even though some catalyze mechanistically rather different types of transport reaction - uniport, antiport, or symport (influx or efflux). One example of a 12-helix arrangement is shown in Fig. 8. In this context, it is interesting that many other groups of membrane transport proteins are predicted to have 12 membrane-spanning α helices. One is the series of phosphate antiporters in prokaryotes 79. Another is the "ABC" group typified by the Mdr, multiple drug resistance factor 10,70; some individual members of this group catalyze influx of substrate and others catalyze efflux. Yet another group is the family of mitochondrial transporters, which are thought to function as a dimer, each subunit having six α helices 84,85 (see discussion below); here again transporters of similar sequence catalyze different types of transport reaction - uniport, antiport, or symport, influx or efflux. A fourth group contains the homologous transporters for noradrenaline and gamma-aminobutyric acid 86. Within the family of mitochondrial transport proteins each is predicted to contain six hydrophobic regions and transmembrane helices 84,85. However, in several examples there is evidence for dimerization to form a functional unit with 12 predicted helices. Interestingly, this family has strong evidence of internal triplication in each polypeptide 84, implying that there are six equivalent domains in the functional dimer.

It is important to consider the possible arrangements of 12 helices in the membrane, and several groups have obtained evidence for the nearest-neighbor relationships of predicted transmembrane helices in individual membrane transport proteins, using fluorescence energy transfer, second-site revertants of mutants, cysteine mutagenesis and other techniques. In addition, many reviewers have hypothesized as to how the arrangement might be. However, until we determine the actual 3D structures of some of these proteins, such models should perhaps be regarded with caution.

Figure 7a Hydropathy plots of the L-fucose-H+ symport protein, FucP, of E. coli. The algorithm of Kyte and Doolittle (1982, see text) was used with window sizes of 7-19 amino acid residues to generate a series of hydropathy plots of FucP; the putative positions of 12 helices are indicated in the plot with a window of nine residues. [Copied, with permission, from Gunn et al. (1995) Molec. Microbiol. 15, 771-783.]
Figure 7b Hydropathic profiles of membrane transport proteins. The amino acid sequence of each of the indicated transport proteins was analyzed for hydropathy using the algorithm of Kyte and Doolittle with a window of 11 residues. The majority can be interpreted in terms of 12 putative transmembrane helices, but the L-rhamnose-H+ symport protein appears to have 10. [Panels, each plotting hydropathy against sequence number: galactose-H+ transport protein (GalP); arabinose-H+ transport protein (AraE); xylose-H+ transport protein (XylE); rhamnose-H+ transport protein (RhaT); fucose-H+ transport protein (FucP); lactose-H+ transport protein (LacY).] [Copied, with permission, from Henderson (1991) Bioscience Reports 11, 477-538, ref. 31.]
Figure 8a Model of the orientation of the L-fucose-H+ symport protein, FucP, in the membrane, based upon the hydropathy plot (Fig. 7a). Note the predominance of positive residues inside the membrane, which follows the rule of von Heijne (see text). [Labels: cytoplasmic side; periplasmic side; negatively charged and positively charged residues.] [Copied, with permission, from Gunn et al. (1995) Molec. Microbiol. 15, 771-783.]
[...] 3 SD) between many members of different families. When the amino acid sequences of multiple families are significantly similar, the families are presumed to be derived from a common ancestor and are considered subgroups of a superfamily of related transporters 8,9,11. One of the most functionally diverse superfamilies, the uniporter-symporter-antiporter (USA) or major facilitator (MFS) superfamily, contains uniporters, symporters and antiporters of structurally dissimilar sugars, sugar phosphate esters, antibiotics, antiseptics, disinfectants, carboxylated compounds, catecholamines and indolamines 8,9,11. The significance of about 40% of the pairwise comparisons between families of the superfamily exceeds 3 SD, and the ALIGN scores for certain pairwise comparisons between families are as high as 8.7 SD, reflecting their presumed common ancestry 8. This predicts that they also have similar three-dimensional structures, suggests fundamentally similar molecular mechanisms, and implies that relatively subtle structural differences account for the differences in the functional properties of the proteins, such as the recognition of structurally dissimilar substrates, or the vectorial mechanism. As pointed out previously, a perceived profound difference in function need not be a consequence of a profound difference in structure. Multiple sequence alignments generated in this manner often reveal highly conserved "signature motifs" 8,9. These may be unique to either the family or a subgroup of the family, or common to a group of families which share a functional attribute. Signature motifs of the first category can have great utility in assessing the potential relatedness of transporters which are not homologous by the criterion of the alignment score. Signature motifs of the second and third categories, i.e.
those that are highly conserved in proteins with a common functional attribute, for example substrate specificity, mode of energization or vectorial mechanism, are predicted to be necessary for that attribute. These predictions can then be tested with site-directed mutagenesis and other molecular-genetic approaches. In only a few instances has it been possible to crystallize integral membrane proteins for molecular structural analysis. Therefore, most investigations of the structure-activity relationships of transport proteins have been founded on amino acid sequence comparisons of this sort. Signature motifs that are conserved in all transporters of a superfamily may dictate structural or functional attributes that are common to all members of the superfamily. For example, alignment of the consensus sequences of the several families comprising the USA/MFS superfamily identifies several amino acid sequence motifs which are highly conserved in all or some of these diverse transporters. A "G-X-X-X-D-R/K-X-G-R-R/K" motif, which is strongly predicted to form a β-turn in most cases, is highly conserved between the second and third predicted helices of transporters in all families of the USA/MFS superfamily 2,8,9. The "G-X-X-X-D-R/K-X-G-R-R/K"
motif has been proposed to act as a cytoplasmic gate that limits the flow of substrate into and out of the cytoplasm. Site-directed and insertional mutagenesis of the TETA(B) tetracycline/H+ antiporter and LACY lactose/H+ symporter have demonstrated that several of the conserved residues of the motif are necessary for function. Similarly, an "R-X-X-X-G-X-X-X-G/A" motif is conserved in the fourth predicted helix and the preceding predicted extracellular hydrophilic loop of transporters in all families of the USA/MFS superfamily 2,8,9. The "R-X-X-X-G-X-X-X-G/A" motif has been proposed to function in energy coupling. In the ATP binding cassette (ABC) superfamily, the "G-H-S-G-A-G-K-S-T" and "I-L-L-D-E" motifs, the so-called Walker A and B motifs, define the superfamily. These motifs, the first of which is known to be involved in phosphoryl transfer, are shared by many nucleotide binding proteins 4. Although overall amino acid sequence relatedness and the conservation of signature motifs provide strong presumptive evidence that two proteins have related functions, this is not always the case. For example, signature motifs corresponding to the ATP binding domains define the ATP binding cassette (ABC) transporter superfamily 4. However, these domains are also found in at least two families of the ABC superfamily that are neither associated with the membrane nor implicated in transport. These are the UVRA family of DNA excision repair proteins and the EF3 family of translational elongation factors. Thus, the conservation of an extended functional domain in two proteins, in this case the ATP binding cassette, does not by itself indicate that the two proteins have related functions, although in most instances this is true. The second category of signature motif is conserved in, and thereby can define, subgroups of a superfamily.
These motifs may dictate the shared structural or functional properties of the subset, such as substrate specificity or vectorial mechanism, predictions that also can be tested by site-specific mutagenesis. For example, a "G-X-X-X-G-P-X-X-G" motif is highly conserved in the fifth predicted membrane-spanning region of transporters of all families of the USA/MFS superfamily which direct substrate export, but not in any of the transporter families which direct substrate uptake 8,9,12. Molecular modeling of the so-called "antiporter motif" predicts that a "kink" at approximately the position of the GP dipeptide, resulting in a change in helix axis direction of approximately 20 degrees, would be more stable than a regular helical conformation. The repeating pattern of glycine residues in the antiporter motif also forms a pocket, devoid of side-chains, on the surfaces of the fifth predicted helices. Site-directed mutagenesis experiments indicate that even very slight alterations in the structure of this motif, for example replacement of the hydrogen of glycine with either the small methyl side-chain of alanine or the methylol side-chain of serine, have profound and specific effects on resistance to tetracycline 12. Intramolecular amino acid sequence comparisons are also useful in investigating structure-activity relationships. For example, there are significant similarities between the amino acid sequences of the N- and C-terminal halves of transporters in many families and superfamilies, including the acriflavin-cation resistance family and the USA/MFS superfamily 8. This implies that these proteins arose by the duplication of a half-sized ancestor, suggesting that the N- and C-terminal halves of the transporters might have evolved to contain independent functional domains. This prediction was confirmed for the USA/MFS superfamily by demonstrating that paired in-frame deletion constructs of the E. coli LACY lactose/H+ symporter
complement each other functionally 13. Using similar methods, two functional complementation groups also have been defined in the TETA(B) tetracycline/H+ antiporter, which belongs to a different family from LACY 14,15. Intramolecular amino acid sequence comparisons have also shown that the N-terminal halves of distantly related transporters of the USA/MFS superfamily are generally much more similar than the C-terminal halves, provided the proteins being compared have structurally dissimilar substrates 8. Thus, the greater conservation of the N-terminal halves of transporters that recognize structurally dissimilar substrates has been interpreted to reflect the conservation of structures which confer the substrate binding-induced conformational change that is proposed to be common to these transporters' mechanism of action. The C-terminal halves of transporters that recognize structurally dissimilar substrates are much less conserved than their N-terminal halves, a situation frequently reversed when transporters that recognize structurally similar substrates are considered. These observations support the interpretation that substrate specificity is determined by sequence motifs contained in the C-terminal halves of these transporters. Consistent with this possibility, inhibitor, photo-affinity labeling and domain exchange studies suggest that the substrate binding sites for the USA/MFS superfamily's sugar transporters are located in their C-terminal halves 16. Likewise, mutations resulting in altered substrate specificities in various antibiotic antiporters have been found primarily in the C-terminal halves of the proteins 9.
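Signature motifs written in the notation used above (one-letter residue codes joined by hyphens, X for any residue, a slash between alternatives) translate directly into regular expressions, which gives a quick way to screen a new sequence for them. The sketch below is an illustration only: the function names and the test sequence are hypothetical, and a match says nothing about whether the motif is functional in that protein.

```python
import re


def motif_to_regex(motif):
    """Convert a motif such as 'G-X-X-X-D-R/K-X-G-R-R/K' to a compiled regex."""
    parts = []
    for token in motif.split("-"):
        if token == "X":
            parts.append("[A-Z]")                 # any residue
        elif "/" in token:
            parts.append("[" + token.replace("/", "") + "]")  # alternatives
        else:
            parts.append(token)                    # fixed residue
    # A lookahead with a capture group reports overlapping matches too.
    return re.compile("(?=(" + "".join(parts) + "))")


def find_motif(sequence, motif):
    """Return (0-based start, matched segment) for every occurrence."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Motifs quoted in the text; the sequence is a made-up example.
LOOP_2_3_MOTIF = "G-X-X-X-D-R/K-X-G-R-R/K"
ANTIPORTER_MOTIF = "G-X-X-X-G-P-X-X-G"

print(find_motif("MAGLLLDRFGRRAA", LOOP_2_3_MOTIF))
print(find_motif("MAGLLLDRFGRRAA", ANTIPORTER_MOTIF))
```

Because real family members tolerate substitutions at some conserved positions, an exact pattern like this is best treated as a first screen before a full alignment against the family consensus.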
References
1 Reizer, J. et al. (1994) Biochim. Biophys. Acta 1197, 133-166.
2 Henderson, P.J.F. (1993) Curr. Opin. Cell Biol. 5, 708-721.
3 Kuan, J. and Saier, M. (1993) CRC Crit. Rev. Biochem. Mol. Biol. 28, 209-233.
4 Higgins, C.F. (1992) Annu. Rev. Cell Biol. 8, 67-113.
5 Lipman, D. and Pearson, W. (1985) Science 227, 1435-1441.
6 Altschul, S. et al. (1990) J. Mol. Biol. 215, 403-410.
7 Dayhoff, M. et al. (1983) Methods Enzymol. 91, 524-545.
8 Griffith, J. et al. (1992) Curr. Opin. Cell Biol. 4, 684-695.
9 Paulsen, I. et al. (1996) Microbiol. Rev. 60, 575-608.
10 Devereaux, J. et al. (1984) Nucleic Acids Res. 12, 387-395.
11 Marger, M.D. and Saier, M. (1993) Trends Biochem. Sci. 18, 13-20.
12 Varela, M. et al. (1995) Mol. Memb. Biol. 12, 313-319.
13 Bibi, E. and Kaback, H.R. (1990) Proc. Natl Acad. Sci. USA 87, 4325-4329.
14 Rubin, R.A. and Levy, S.B. (1991) J. Bacteriol. 173, 4503-4509.
15 Yamaguchi, A. et al. (1993) FEBS Lett. 324, 131-135.
16 Carruthers, A. (1990) Physiol. Rev. 70, 1135-1176.
3 Organization of the Data
INTRODUCTION
Two kinds of information are provided in The Transporter FactsBook. The first is a compilation of the physical and biological properties of nearly 800 transport proteins. Although every attempt was made to make this compilation comprehensive, some sequences were not included, either by design (see below) or by unintentional omission. Moreover, new transporter sequences are being added to the databases on a near daily basis. Thus, this information is best viewed as a representative, rather than an exhaustive, overview of the characteristics of membrane transport proteins. The second kind of information is a comparison of the physical and biological properties of more than 50 families of transport proteins defined by the relatedness of their amino acid sequences. These data provide rational bases for grouping proteins and identifying relationships between their structures and functions. A key feature of these data is the consensus amino acid sequence that has been provided for each transporter family or group of families. These are displayed in the multiple amino acid sequence alignments and also in the plots of the predicted topologies. The former indicate what kinds of substitutions are permitted at a conserved residue, while the latter present the conserved residues in the context of predicted structure. The consensus sequences provide a means to classify newly identified transporters, particularly when they are not closely related to known proteins. They also define sequence elements that are conserved in multiple families with a common functional characteristic, and therefore may be necessary for the expression of that characteristic. These data are useful in predicting the locations of individual structural or functional domains, and in designing experiments to test these predictions with site-directed mutagenesis or other techniques.
Because the predictive value of the correlation between a signature sequence and a specific functional characteristic increases with the addition of each new sequence to the family, this information, rather than becoming outdated, will in fact become even more valuable as it is refined by the addition of new transporter sequences.
DEFINITION OF FAMILY
The FASTA and BLASTP algorithms 1,2 were used with default parameters to search the SwissProt, Protein Identification Resource (PIR) and GenBank/EMBL GenPept protein sequence databases for transport proteins that share local similarity with any of several query sequences representative of known classes of transport proteins. The overall (versus local) similarities of the proteins identified in each search were then quantified by pairwise comparisons using the ALIGN 3 algorithm. ALIGN calculates a score for the best alignment between any pair of sequences using an empirically derived scoring matrix and two types of penalties for breaking a sequence. The first, the gap penalty, is applied every time a gap is inserted, regardless of the length of the gap. The second, the bias, is applied according to the length of the gap. The ALIGN program utilized the normalized Dayhoff 250 PAM mutational matrix, a gap penalty of 6.0 and a bias of 6.0.
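The two penalties described above can be read as a fixed charge for opening a gap plus a charge that depends on its length. The Python sketch below follows that reading only; it assumes that the length-dependent term grows linearly, which the text does not state, and the function name is hypothetical rather than part of the ALIGN program.

```python
def gap_cost(gap_length, gap_penalty=6.0, bias=6.0):
    """Cost of one gap: a fixed penalty plus a length-dependent term.

    Assumption (not stated in the text): the length-dependent term is
    bias * gap_length. The defaults are the values quoted for ALIGN.
    """
    if gap_length < 0:
        raise ValueError("gap_length must be non-negative")
    if gap_length == 0:
        return 0.0  # no gap, so neither penalty applies
    return gap_penalty + bias * gap_length


if __name__ == "__main__":
    for length in (1, 3, 10):
        print(length, gap_cost(length))
```

Under this reading, one long gap is cheaper than several short gaps of the same total length, because the fixed penalty is charged only once.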
The statistical significance of each alignment score was evaluated by comparing it to the mean score obtained from comparison of each sequence to 100 random permutations of the other sequence, and is expressed as the number of standard deviations (SD) by which the maximum score for the real comparison exceeds the mean of the scores for the randomized sequences. Pairs of proteins with ALIGN scores in excess of 9 SD were considered homologous, i.e. having a common evolutionary origin 4, and together constituted a "family" 5. Hypothetical proteins, the open reading frames of unidentified genes, and partial sequences are not included. Proteins identified in each FASTA or BLASTP search that had ALIGN scores of less than 9 SD with the query sequence were used as query sequences for succeeding FASTA and BLASTP searches. Additional families of homologous sequences were again identified by pairwise comparisons using ALIGN. This process was repeated until all transport proteins identified by the successive FASTA and BLASTP searches were assigned to families. "Orphan transporters", proteins which are not homologous to any other transporter in the database, were not included.
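The significance test described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the ALIGN program: the scoring function is a toy global alignment with match, mismatch and gap values chosen for brevity (not the Dayhoff matrix), only one of the two sequences is shuffled, and all function names are hypothetical.

```python
import random
import statistics


def global_score(a, b, match=2, mismatch=-1, gap=-2):
    """Toy Needleman-Wunsch global alignment score (stand-in for ALIGN)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def sd_score(a, b, n_shuffles=100, seed=0, score=global_score):
    """Number of SDs by which the real score exceeds the shuffled mean."""
    if n_shuffles < 2:
        raise ValueError("need at least two shuffles to estimate an SD")
    rng = random.Random(seed)
    real = score(a, b)
    residues = list(b)
    shuffled = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # random permutation of sequence b
        shuffled.append(score(a, "".join(residues)))
    mean = statistics.mean(shuffled)
    sd = statistics.stdev(shuffled)
    if sd == 0:
        return float("inf") if real > mean else 0.0
    return (real - mean) / sd


def is_homologous(a, b, threshold=9.0):
    """Apply the 9 SD criterion quoted in the text."""
    return sd_score(a, b) > threshold
```

Shuffling preserves the composition and length of the sequence, so the resulting score reflects the order of the residues rather than their overall abundance.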
GROUPING OF FAMILIES
Families with seemingly similar activities, e.g. "H+ symporters" or "P-type ATPases", were grouped together in a section. However, the reader should bear in mind that transporters with similar functions do not necessarily have related amino acid sequences and vice versa.
ORGANIZATION OF THE DATA
Summary
The summary provides an overview of the physical and biological properties of the family, its distribution in nature, its relationship to other families, and known disease associations.
Nomenclature, biological sources and substrates
Each sequence in a family was assigned an eight- or nine-character alphanumeric code. This code was derived from three or four characters taken from the protein name, the first three characters of the genus name and the first two characters of the species name. For example, the code for the XYLE transporter of Escherichia coli is Xyleescco. In a few cases, where the species is unknown, the last two characters are "sp". For many sequences found in the SwissProt database (the main exceptions being sequences from very common higher eukaryotes, e.g. human, rat, cattle), the sequence code is equivalent to the SwissProt code without the underscore separating the parts describing the protein and its source. Tabulated information for sequences currently present only in the EMBL/GENBANK databases refers to the GenPept translations of the gene sequences. The "Description" of each protein, taken directly from the sequence database, is listed in the second column. All known synonyms, including gene names, are included within square brackets below the description in the second column. "Organism", listed in the third column, refers to the Latin name of the species; the common name of the species, or (for most unicellular organisms) a classification such as "gram-negative bacterium" or "yeast", is included within square brackets in the third column. Substances listed in the "Substrate" column are known to be transported across the membrane. Where a protein is only known to confer resistance to a toxic compound, the compound's name is given in this column in square brackets. Where the mechanism of transporter action is known to be symport or antiport, the coupled ions are also listed here.
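The coding rule described above is easy to express programmatically. The Python sketch below is an illustration only: the function name is hypothetical, and it assumes that the caller supplies the three or four protein characters, since the text does not say how they are chosen from the protein name.

```python
def sequence_code(protein, genus, species=None):
    """Build a FactsBook-style sequence code.

    protein: the three or four characters chosen from the protein name
             (assumed to be supplied by the caller)
    genus:   Latin genus name; its first three characters are used
    species: Latin species name; its first two characters are used,
             or "sp" when the species is unknown
    """
    if not 3 <= len(protein) <= 4:
        raise ValueError("protein part must be three or four characters")
    species_part = species[:2] if species else "sp"
    # capitalize() upper-cases the first character and lower-cases the rest
    return (protein + genus[:3] + species_part).capitalize()


if __name__ == "__main__":
    print(sequence_code("XYLE", "Escherichia", "coli"))
    print(sequence_code("ATCS", "Synechococcus"))
```

The first call reproduces the Xyleescco example given in the text; the second shows the "sp" ending used when the species is unknown.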
Phylogenetic trees
Phylogenetic trees were constructed for all families containing more than two members using the PILEUP algorithm 7 with default parameters. Proteins more than 90% identical to at least one other member of the family are indicated in the text by italics and are not included in the phylogenetic trees.
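The 90% identity filter can be sketched as follows. This Python example is an illustration under stated assumptions: percent identity is computed over aligned columns in which neither sequence has a gap, the first member encountered is kept as the representative, and the names and sequences are hypothetical. The text does not specify how identity was computed or how representatives were chosen.

```python
def percent_identity(a, b, gap="."):
    """Percent identity over aligned columns where neither sequence is gapped."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def split_redundant(aligned, threshold=90.0):
    """Separate representatives from near-duplicates.

    aligned: dict mapping sequence code -> aligned sequence
    Returns (kept, dropped), where dropped maps each excluded code to the
    representative it resembles (assumed convention: first one seen is kept).
    """
    kept, dropped = [], {}
    for name, seq in aligned.items():
        for rep in kept:
            if percent_identity(seq, aligned[rep]) > threshold:
                dropped[name] = rep
                break
        else:
            kept.append(name)
    return kept, dropped


if __name__ == "__main__":
    demo = {  # hypothetical sequences, for illustration only
        "seq_a": "ACDEFGHIKLMNPQRSTVWY",
        "seq_b": "ACDEFGHIKLMNPQRSTVWA",
        "seq_c": "YWVTSRQPNMLKIHGFEDCA",
    }
    print(split_redundant(demo))
```

The dropped mapping corresponds to the lists that pair each italicized protein with the transporter that represents it.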
Topology plots
Each topology plot is derived from a single, typical member of a transporter family. In most cases, the predicted membrane-spanning regions, indicated in the figures by the shaded rectangles, and the interhelical loops, indicated in the figures by thin solid lines, are identified from hydropathy plots and analysis of α-helix-forming propensity; in a few cases, these predictions are supported by experimental evidence derived from reporter fusions, susceptibility to proteolytic cleavage, reactivity with peptide-specific antibodies or scanning glycosylation mutagenesis. The number of the first and last residue of each predicted membrane helix is boxed. In families with more than two members, and unless there is a very high percentage identity between all family members (more than 50% of the sequence is identical in at least 75% of the proteins), the locations and identities of residues conserved in more than 75% of family members are indicated on the topology plots. Not all residues that are conserved in a family are conserved in the representative transporter shown in the topology plot. In these instances, the residue is indicated with an asterisk.
In the ABC transporter superfamily, the active transporters consist of four domains: two ATP binding domains and two transmembrane domains. These four domains may be expressed as separate chains or fused to form multidomain proteins: almost every conceivable type of domain fusion has been found 6. The sequence motifs characteristic of this superfamily are found in the ATP binding domains. In families in which the ATP binding domains are expressed separately from the transmembrane domains, the tables and alignments describe the cytoplasmic ATP binding domains associated with the transmembrane domains. Since the former chains do not cross the membrane, no topology plots are included for these families. There is great variability in the relatedness of the separately expressed transmembrane domains. Some of the chains containing these transmembrane domains constitute discrete families of homologous proteins, for example the ABC-associated binding protein-dependent maltose, peptide and iron transporter families. Other chains are no more similar to one another than would be expected for non-related transmembrane proteins which contain many highly hydrophobic regions. These are not included.
Physical and genetic characteristics
Molecular weights and sequence lengths (in amino acids) are listed for all proteins. When available, the proteins' principal expression sites (tissue or organ specificity), Michaelis constants (Km) and chromosomal loci are listed. Where a bacterial sequence is known to be plasmid-encoded, this is also indicated. The chromosomal loci for humans, Escherichia coli, Haemophilus influenzae, Saccharomyces cerevisiae and Bacillus subtilis are taken from the Online Mendelian Inheritance in Man, Encyclopedia of E. coli Genes and Metabolism, Encyclopedia of Haemophilus influenzae Genes and Metabolism, Saccharomyces Genomic Information Resource and the Bacillus subtilis Genomic Databases, respectively.
Multiple amino acid sequence alignments
Multiple amino acid sequence alignments were calculated using the PILEUP algorithm 7 with default parameters. The consensus sequences list residues present in at least 75% of the aligned sequences. Conservative substitutions were not taken into account. To ensure that the consensus sequences are not biased by the contribution of very closely related sequences, proteins more than 90% identical to at least one other member of the family (indicated in the text in italics) were not included in the alignments. Residues within the consensus sequence that are also conserved in at least one other family are indicated in bold type.
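The 75% consensus rule can be illustrated with a short Python sketch. The function name is hypothetical, gaps and non-conserved positions are both written as ".", and, as in the text, conservative substitutions are not taken into account. The example rows are ten-residue fragments of four heavy metal-transporting ATPases (At7acrigr, At7ahomsa, At7bhomsa, Ctppromi) taken from the alignment of that family; because only four of the aligned sequences are used, the output is not the consensus reported for the family.

```python
from collections import Counter


def consensus(aligned_seqs, threshold=0.75, gap="."):
    """Return the residues present in at least `threshold` of the sequences.

    Positions that fall below the threshold, or where the most common
    symbol is a gap, are written as the gap character.
    """
    if not aligned_seqs:
        return ""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all sequences must have the same aligned length")
    out = []
    for column in zip(*aligned_seqs):
        residue, count = Counter(column).most_common(1)[0]
        if residue != gap and count / len(column) >= threshold:
            out.append(residue)
        else:
            out.append(gap)
    return "".join(out)


if __name__ == "__main__":
    rows = ["IQVSGMTCAS", "IQVTGMTCAS", "LQIKGMTCAS", "LPVEGMTCAS"]
    print(consensus(rows))
```

Lowering the threshold admits more positions into the consensus at the cost of specificity, which is one reason the value used has to be reported alongside the consensus itself.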
Database accession numbers
Information for each transporter was abstracted from the files in the SwissProt, PIR and EMBL/GENBANK databases identified by the accession numbers. No more than two accession numbers for each database are included. SwissProt was used as the primary data source as it is an extremely well annotated database.
References
Supplemental references cited in the summary and recent reviews, when available, are listed at the end of each chapter. Reviews are shown in bold type.
References
1 Lipman, D. and Pearson, W. (1985) Science 227, 1435-1441.
2 Altschul, S. et al. (1990) J. Mol. Biol. 215, 403-410.
3 Dayhoff, M. et al. (1983) Methods Enzymol. 91, 524-545.
4 Reeck, G. et al. (1987) Cell 40, 667.
5 Griffith, J. et al. (1992) Curr. Opin. Cell Biol. 4, 684-695.
6 Higgins, C.F. (1992) Annu. Rev. Cell Biol. 8, 67-113.
7 Devereaux, J. et al. (1984) Nucleic Acids Res. 12, 387-395.
THE MEMBRANE TRANSPORT PROTEINS
P-Type ATPases
Summary
1601 1650 Atxaleido KPAEKSTEKA LNSSVSSASH KALEGLREDT HSPIEEASPV NVYVSRDQK.
Atxbleido Pmallyces Pmalnicpl Pmalarath Pma4nicpl
LFEETALAAF LSYCPGMGVA LRMYPL ................... KPTWW LFEETALAAF LSYCPGMDVA LRMYPL ................... KPNWW LFEETALAAF LSYTPGTDIA LRMYPL ................... KPSWW LVFETCLAAF LSYTPGMDKG LRMYPL ................... KINWW LVFETVLAAF LSYCPGMEKG LRMYPL KLVWW LVFETCVAAF LSYTPGMDKG LRMYPL ................... KIWWW LFFETALAAF LQYTPGVNTG LRLRPM ................... NFTWW IVFQVCIGCF LCYCPGMPNI FNFMPI ................... RFQWW ICLSMTLHFVILYVEILSTVFQICPL ................... TLTEW MALSFTLHFV ILYVDVLSTV FQVTPL SAEEW ICLSMSLHFL ILYVDPLPMI FKLKAL ................... DLTQW ICLSMSLHFL ILYVEPLPLI FQITPL ................... NVTQW VVMSMALHFL ILLVPPLPLI FQVTPL ................... SGRQW IFSSLSLHLI IMYVPFFAKL FNIVPLGVDP HVVQQAQPWS ILTPTNFDDW TIGSLLLHVL ILYIPPLARI FGVVPL ................... SAYDW IAVAIALQIG FSQLPFMNVL FKTAPM ................... DWQQW VIVTALLQLA LVYVSPLQKF FGTHSL ................... SQLDL VGLSLLGQMC AIYIPFFQSI FKTEKL ................... GISDI LA.SILVFLL IIFINPLGLV FNVLQ .................... DLTNH ..................................................
Pma3arath Pmalschpo Pma2schpo Pmalajeca Pmalneucr Pmalsacce Pma2sacce Pmalklula Pmalcanal Pmalzygr o Atcphomsa
KPAEKSTEKA AQRTLHGLQV AQRTLHGLQV AQRTLHGLQP AQRTLHGLQP
LNLSVSSGPH PD.PKIFSET PD.TKLFSEA KEDVNIFPEK PEATNLFNEK
KALEGLREDT TNFNELNQLA TNFNELNQLA GSYRELSEIA NSYRELSEIA
HVLNESTSPV EEAKRRAEIA EEAKRRAEIA EQAKRRAEIA EQAKRRAEMA
NAFSPKVKK. RLRELHTLKG RLRELHTLKG RLRELHTLKG RLRELHTLKG
AQRTLHGLQN TETANVVPER GGYRELSEIA NQAKRRAEIA RLRELHTLKG TRHEKGDA .......................................... THHEAEGKVT $ ....................................... TQHEKSS ........................................... TQHEKSQ ........................................... TQHEKET ........................................... TQHEKSS ........................................... TQHEKEN ........................................... TQHEKST ........................................... TQHEKGN ........................................... T S R L K F L K E A G H G T Q K E E I P EEELAEDVEE IDHAERELRR G Q I L W F R G L N
Atcqhomsa TSRLKFLKEA GRLTQKEEIP EEELNEDVEE IDHAERELRR GQILWFRGLN Atcrhomsa Atc3sacce Atc3schpo Atnlsacce Atn3homsa Atn3ratno Atn3galga Atnlhomsa Atn2homsa Atn3sussc Atnacatco Atnatorca Atnaartsf Atnadrome
TRSLKFLKEA GHGTTKEEIT ................ DEVA VAVMFYFFYV EIWKSIRRSL AFTIAFWIGA ELYKCGKRRY FCAFPYSFLI FVYDEIRKLI FCAFPYSFLI FVYDEIRKLI FCAFPYSFLI FVYDEIRKLI FCAFPYSLLI FVYDEVRKLI FCAFPYSLLI FIYDEVRKLI FCAFPYSLLI FVYDEVRKLI FCAFPYSLLI FIYDEIRKLI FCAFPYSLII FLYDEARRFI FPALPFSFLI FVYDEARKFI FPAIPFALAI FIYDETRRFY
KD..AEGLDE VKVFPAAFVQ TNPQKKGKFR FKTQRAHNPE LRRNPGGWVE LRRNPGGWVE LRRNPGGWVE IRRRPGGWVE LRRYPGGWVE IRRRPGGWVE LRRNPGGWME LRRNPGGWVE LRRNPGGWVE LRRNPGGWLE
IDHAEMELRR GQILWFRGLN ........... RFKYVFGLE RTL . . . . . . . . . . . . . . SNT NDLESNNKRD PFEAYSTSTT KETYY ............... KETYY ............... KETYY ............... KETYY ............... KETYY ............... KETYY ............... RETYY ............... QETYY ............... QETYY ............... QETYY ...............
Plasma membrane cation-transporting ATPase family
Atnaartsa Atnahydat Athahomsa Atcartsf Atcbdrome Atcborycu Atcdhomsa Atcfratno Atctrybr Atcplafa Atalsynsp Atclsynsp Atclsacce Atclmycge
FPPMPFSLLI
LVYDECRKFL
LVPLPYGILI
FVYDEIRKLG
LPGLPFSLLI
FVYDEIRRYL
IVVLKISFPVLLL
MRRNPGGFLE
RETYY ...............
VRCCPGSWWD
QELYY ...............
LRKNPGGWVE
KETYY ...............
.... D E V L K F V A R K Y T D E F S F I K
..............
ITVMKFSIPV
V L L .... D E T L K F V A R K I A D
VPDVVVDRM
LMVLKISLPV
I L M .... D E T L K F V A R N Y L E
PAILE ...............
KAVIVFSVPV
I F L .... D E L L K F I T R R M E K
AQEKKKD
LMVLKISLPV
GVVLQMSLPV
I G L .... D E I L K F I A R N Y L E
I L L .... D E A L K Y L S R H H V D
FLVFLWSFPVIIL
V P V .... R I L A N R L D P
LLLLLISSSV
F I V .... D E L R K L W T R K K N E
AIC.LGFSLL
EKKDLK ..............
.............
.... D E I I K F Y A K R K L K E E Q R T K K I K I D
AICLLPMIPM
...........
G ...................
.........
........................
L F V .... Y L E A E K W V R H G R Y
....................
EDSTYFSNV
...........
Consensus
PVLISYSFGG VILYMGMNEV VKLIRLGYGN I ................... ..................................................
Pmallyces
HVESVVKLKG
LDIETIQQSY
TV ............................
Pmalarath
HVESVAKLKG
LDIDTAGHHY
TV ............................
HVESVVKLKG
LDIETAG.HY
Pmalnicpl Pma4nicpl
Pma3arath Atcphomsa Atcqhomsa Atcrhomsa Atc3sacce Atc3schpo Atnlsacce Atceorycu Atceratno Atcehomsa Atcesussc Consensus At cphomsa Atcqhomsa At crhomsa Consensus At cphomsa Atcqhomsa Atcrhomsa Consensus
1651
HVESVVKLKG HVESVVKLKG
LDIETIQQAY LDIETIQQHY
1700
TV ............................ TV ............................ TV ............................
RIQTQIRVVNAFRSSLYEGL
EKPESRSSIH
NFMTHPEFRI
EDSEPHIPLI
RIQTQIKVVK
AFHSSLHESI
QKPYNQKSIH
SFMTHPEFAI
EEELPRTPLL
ITTESKLSEK IHTEVNIGIK
DLEHRLFLQS RRA ........................... Q .......................................
DGISWPFVLL
IMPLVVWVYS
RIQTQIRVVK
FLRKNHTGKH
EGVSWPFVLL DGISWPFVLL
AFRSSLYEGL DDEEALLEES
EKPESRTSIH DSPESTAFY
NFMAHPEFRI
EDSQPHIPLI
.....................
IVPLVMWVYS
TDTNFSDLLW
S ...................
IMPLVIWVYS
TDTNFSDMFW
S ...................
TDTNFSDMFW
S ...................
DGISWPFVLL IMPLVIWVYS TDTNFSDMFW S ................... .................................................. 1701
DDTDAEDDAP
DDTDLEEDAA
TKR ........
LKQ ........
NSSPPPSPN
NSSPPSSLN
KNNNAVDSGI
KNNSSIDSGI
1750
HLTIEMNKSA
NLTTDTSKSA
DEEEEENPDK ASKFGTRVLL LDGEVTPYAN TNNNAVDCN... QVQLPQS. .................................................. 1751
TSSSPGSPLH
TSSSPGSPIH
1766
SLETSL
SLETSL
..... D S S L Q S L E T S V ................
Proteins listed subsequently in italics are at least 90% identical to the paired transporters listed in parentheses and are therefore not included in the alignment: Atcbgalga, Atcaorycu (Atcborycu); Atcdfelca, Atcdsussc, Atcdorycu, Atceorycu, Atcdratno, Atceratno, Atcehomsa, Atcesussc (Atcdhomsa); Atcporycu, Atcpratno, Atcpsussc (Atcphomsa); Atcqratno (Atcqhomsa); Athasussc, Athaorycu, Atharatno (Athahomsa); Atmasalty (Atmaescco); Atn1bufma, Atn1galga, Atn1oviar, Atn1equca, Atn1sussc, Atn1ratno (Atn1homsa); Atn2sacce (Atn1sacce); Atn2galga, Atn2ratno (Atn2homsa); Pma2arath (Pma1arath); Pma3nicpl (Pma1nicpl). Residues listed in the consensus sequence are present in at least 75% of the aligned transporter sequences. Residues indicated in boldface type are also conserved in at least one other family of the P-type ATPase superfamily.
Database accession numbers
CODE | SWISSPROT | EMBL/GENBANK
Ata1synsp | P37367 | X71022; G296568
Atcartsf | P35316 | X51674; G665604
Atcplafa | Q08853 | X71765; G402222
Atctrybr | P35315 | M73769; G162201
Atc1sacce | P13586 | M25488; G172199
Atc3schpo | P22189 | J05634; G173355
Atc3sacce | P38929 | U03060; G454003
Atcaorycu | P04191 | M12898; G164779
Atcbdrome | P22700 | M62892; G158416
Atcbgalga | P13585 | M26064; G211224
Atcborycu | P11719 | M12898
Atcdfelca | Q00779 | Z11500; G1081
Atcdhomsa | P16614 | M23115; G306851
Atcdorycu | P04192 | X02814; G1469
Atcdratno | P11508 | J04023; G203059
Atcdsussc | P11606 | X15073; G1921
Atcehomsa | P16615 | M23114; G306850
Atcesussc | P11607 | X15074; G1923
Atceorycu | P20647 | J04703; G164739
Atceratno | P11507 | J04022; G203057
Atcfratno | P18596 | M30581; G206899
Atclmycge | P47317 | U39687; G1045747
Atclsynsp | P37278 | D16436; G435123
Atcphomsa | P20020 | J04027; G190133
Atcporycu | Q00804 | X59069; G1675
Atcpratno | P11505 | J03753; G203047
Atcpsussc | P23220 | X53456; G2061
Atcqhomsa | Q01814 | L20977; G404702
Atcqratno | P11506 | J03754; G203049
Atcrhomsa | P23634 | M25874; G179163
Athahomsa | P20648 | J05451; G561634
Athaorycu | P27112 | X64694; G1471
Atharatno | P09626 | J02649; G203037
Athasussc | P19156 | M22724; G164384
Atmaescco | P39168 | U14003; G537084
Atmasalty | P36640 | U07843; G468207
Atmbsalty | P22036 | M57715; G397973
Atn1bufma | P30714 | Z11798; G62492
Atn1equca | P18907 | X16773; G871026
Atn1galga | P09572 | J03230; G211220
Atn1ratno | P06685 | D10359; G220824
Atn1sussc | P05024 | X03938; G1898
Atn1oviar | P04074 | X02813; G1206
Atn1homsa | P05023 | D00099; G219942
Atn1sacce | P13587 | U24069; G790261
Atn2galga | P24797 | M59959; G212406
Atn2homsa | P50993 | J05096; G179165
Atn2ratno | P06686 | M14512; G203029
Atn2sacce | Q01896 | X67136; G5513
Atn3galga | P24798 | M59960; G212408
Atn3homsa | P13637 | M37457; G497763
Atn3sussc | P18874 | M38445; G164382
Atn3ratno | P06687 | M14513; G203031
Atnaartsa | P17326 | Y07513; G5670
Atnaartsf | P28774 | X56650; G10934
Atnacatco | P25489 | X58629; G62642
Atnadrome | P13607 | X14476; G732656
Atnahydat | P35317 | M75140; G159258
Atnatorca | P05025 | X02810; G64400
Atxaleido | P11718 | M17889; G159294
Atxbleido | P12522 | J04004; G159295
Pma1ajeca | Q07421 | L07305; G409249
Pma1arath | P20649 | M24107; G166746
Pma1canal | P28877 | M74075; G170818
Pma1klula | P49380 | L37875; G598435
Pma1lyces | P22180 | M60166; G170464
Pma1neucr | P07038 | M14085; G168761
Pma1nicpl | Q08435 | M80489; G170289
Pma1schpo | P09627 | J03498; G173429
Pma1sacce | P05030 | X03534; G4187
Pma1zygro | P24545 | D10764; G218531
Pma2arath | P19456 | J05570; G166629
Pma2sacce | P19657 | J04421; G295644
Pma2schpo | P28876 | M60471; G173431
Pma3arath | P20431 | J04737; G166625
Pma3nicpl | Q08436 | M80490; G170295
Pma4nicpl | Q03194 | X66737; G19704
PIR accession numbers, in source order (not every protein has a PIR entry, and the row assignments are not recoverable from this copy): S40440; S33207, S07526, A45598, S05787; PWBYR1, A36096, A01075; PWRBFC, A36691; S07050, A32792, S23444, B31981, A01076; PWRBSC, B31982; S04269, S04651, A31981, S04652, S10335; PWRBMC, A31982, A34307, S36742, A30802, S17179, A28065, S13057, A38871, B28065, A35547, A35292; A36558, S23406, A25344, A31671; A24228, B39083, S24650; A43451, S04630, A28199, A24639; S00460, B24862, A01074; PWSHNA, A24414, S05788; PWBYR2, B24639, S25007, B37227, S00801, C24639, S06635, JH0470, S14740; PWCCNM, S03632, S00503, A27124; PXLNPD, A32326; PXMUP1, A41336; PXCKP, A45506, A26497; PXNCP, A41779, A28454; PXZP1P, A25823; PXBY1P, JX0181; PXKZP, A37116; PXMUP2, A32023; PXBY2P, A40945; PXZP2P, A33698; PXMUP3, S24959; S33548
References
1 Lytton, J. and MacLennan, D.H. (1988) J. Biol. Chem. 263, 15024-15031.
2 Harper, J.F. et al. (1989) Proc. Natl Acad. Sci. USA 86, 1234-1238.
3 Maeda, M. et al. (1990) J. Biol. Chem. 265, 9027-9032.
4 Sussman, M.R. (1994) Annu. Rev. Plant Physiol. Plant Mol. Biol. 45, 211-234.
5 Assmann, S.M. and Haubrick, L.L. (1996) Curr. Opin. Cell Biol. 8, 458-467.
6 Green, N.M. and MacLennan, D.H. (1989) Biochem. Soc. Trans. 17, 819-822; Green, N.M. (1989) Biochem. Soc. Trans. 17, 970-972.
7 Fagan, M.J. and Saier, M.H. Jr. (1994) J. Mol. Evol. 38, 57-99.
8 Rudolph, H.K. et al. (1989) Cell 58, 133-145.
Heavy metal-transporting ATPase family
Summary
Transporters of the heavy metal-transporting ATPase family, examples of which are the heavy metal-transporting P-type ATPases from bacteria such as Enterococcus (Atkaentfa) 1 and human copper-transporting ATPase 1 (At7ahomsa) 2, mediate active transport of heavy metal ions driven by ATPase activity. Where the natural substrate is known, it is usually divalent copper or cadmium. The nitrogen fixation protein FIXI from Rhizobium meliloti 3 is also a member of this family. In humans, mutations in copper-transporting ATPases cause hereditary Menkes' disease (Cu-transporting ATPase 1) 2 and Wilson's disease (Cu-transporting ATPase 2) 4. Members of the heavy metal-transporting ATPase family have a broad biological distribution that includes gram-positive and gram-negative bacteria, yeast and humans. Heavy metal-transporting ATPases from bacteria may be chromosomal or plasmid-encoded.
Statistical analysis of multiple amino acid sequence comparisons places the heavy metal-transporting ATPase family in the P-type ATPase superfamily (also known as E1-E2 ATPases) 5,6. Proteins in this superfamily use the energy of ATP hydrolysis to pump ions across cell membranes. P-type ATPases are all predicted to contain at least six transmembrane helices by the hydropathy of their amino acid sequences. They have two large cytoplasmic loops separating three pairs of transmembrane helices; the larger of these loops contains the ATP binding domain. The sequences are usually extended by one or two more pairs of helices 5. Members of the heavy metal-transporting ATPase family are predicted to contain eight transmembrane helices 7. They also have an N-terminal cytoplasmic domain which contains one or more repeats of a sequence associated with heavy metal binding, the HMA sequence 7. In the human copper-transporting proteins 2,4 this domain contains six tandem HMA sequences. Eukaryotic proteins may be glycosylated. A few short sequence motifs are very highly conserved within the heavy metal-transporting ATPase family of transporters, including motifs unique to the family and signature motifs of the P-type ATPase superfamily.
Nomenclature, biological sources and substrates
CODE | DESCRIPTION [SYNONYMS] | ORGANISM [COMMON NAMES] | SUBSTRATE(S)
At7ahomsa | Copper-transporting ATPase 1 [Copper pump 1, Menkes' disease-associated protein, ATP7A, MNK, MC1] | Homo sapiens [human] | Cu2+
At7acrigr | Copper-transporting ATPase 1 | Cricetulus griseus [hamster] | Cu2+
At7bhomsa | Copper-transporting ATPase 2 [Copper pump 2, Wilson's disease-associated protein, ATP7B, WND, PWD, WC1] | Homo sapiens [human] | Cu2+
Atc2sacce | Probable calcium transporting ATPase [PCA1, YBR295W, YBR2112] | Saccharomyces cerevisiae [yeast] | Ca2+
Atcssynsp | Cation-transporting ATPase [PACS] | Synechococcus sp. [cyanobacterium] | Metal ions
Atkaentfa | Potassium/copper-transporting ATPase A [ATKA] | Enterococcus faecalis [gram-positive bacterium] | Cu2+, K+
Atkbentfa | Potassium/copper-transporting ATPase A [ATKB] | Enterococcus faecalis [gram-positive bacterium] | Cu2+, K+
Atsyescco | Probable copper-transporting ATPase | Escherichia coli [gram-negative bacterium] | Cu2+
Atsysynsp | Probable copper-transporting ATPase [SYNA] | Synechococcus sp. [cyanobacterium] | Cu2+
Atu1sacce | Probable copper-transporting ATPase [Cu2+-ATPase, CCC2] | Saccharomyces cerevisiae [yeast] | Cu2+
Cadabacfi | Probable cadmium-transporting ATPase [Cadmium efflux ATPase, CADA] | Bacillus firmus [gram-positive bacterium] | Cd2+
Cadastaau | Probable cadmium-transporting ATPase [Cadmium efflux ATPase, CADA] | Staphylococcus aureus [gram-positive bacterium] | Cd2+
Caddstaau | Probable cadmium-transporting ATPase [Cadmium efflux ATPase, CADA] | Staphylococcus aureus [gram-positive bacterium] | Cd2+
Ctpbraja | P-Type ATPase | Bradyrhizobium japonicum [gram-negative bacterium] | Metal ions
Ctppromi | Heavy metal-transporting P-type ATPase | Proteus mirabilis [gram-negative bacterium] | Metal ions
Ctpamycle | Cation-transporting P-type ATPase A [CTPB] | Mycobacterium leprae [gram-negative bacterium] | Mg2+
Ctpbmycle | Cation-transporting P-type ATPase A [CTPB] | Mycobacterium leprae [gram-negative bacterium] | Mg2+
Fixirhime | Nitrogen fixation protein [FIXI] | Rhizobium meliloti [gram-negative bacterium] | Metal ions
Phylogenetic tree
[Tree figure; branch structure not recoverable. Leaves, in order: Ctpbbraja, Fixirhime, Cadastaau, Caddstaau, Cadabacfi, Ctpbmycle, Ctpamycle, Atsysynsp, At7acrigr, At7ahomsa, At7bhomsa, Atcssynsp, Ctppromi, Atkaentfa, Atsyescco, Atu1sacce, Atkbentfa, Atc2sacce.]
Proposed orientation of AT7A 2 in the membrane
The model is based on predictions of membrane-spanning regions and α-helical content. The N-terminus of the protein is illustrated on the inside and is folded eight times through the membrane. The predicted membrane-spanning helices are portrayed as rectangles. The numbers corresponding to the first and last residue of each membrane-spanning helix are boxed. Residues that are conserved in more than 75% of the aligned transporters (see below) are shown.
[Topology plot; labels OUTSIDE, INSIDE, NH2 and COOH. The positions of the conserved residues are not recoverable.]
Physical and genetic characteristics
CODE | AMINO ACIDS | MOL. WT | EXPRESSION SITES | CHROMOSOMAL LOCUS
At7ahomsa | 1500 | 163334 | endothelial cells | Xq13.3
At7acrigr | 1476 | 160335 | |
At7bhomsa | 1443 | 154776 | liver, kidneys | 13q14.3
Atc2sacce | 1216 | 131838 | |
Atcssynsp | 747 | 79732 | |
Atkaentfa | 727 | 78388 | | copAB operon
Atkbentfa | 745 | 81522 | | copAB operon
Atsyescco | 834 | 87782 | |
Atsysynsp | 790 | 83694 | |
Atu1sacce | 1004 | 109828 | | Chromosome 4
Cadabacfi | 723 | 78207 | |
Cadastaau | 727 | 78811 | |
Caddstaau | 804 | 86882 | |
Ctpbraja | 730 | 77337 | |
Ctppromi | 829 | 87859 | |
Ctpamycle | 780 | 82384 | |
Ctpbmycle | 750 | 78224 | |
Fixirhime | 757 | 79559 | |
91
Multiple amino acid sequence afignments
92
At7acrigr At7ahomsa At7bhomsa Consensus
1 50 MEPSMDVNSV TISVEGMTCI SCVRTIEQKI GKENGIHHIK VSLEEKSATI MDPSMGVNSV TISVEGMTCN SCVWTIEQQI GKVNGVHHIK VSLEEKNATI ...................... MPEQERQI TAREGASRKI LS.KLSLPTR ..................................................
At7acrigr At7ahomsa At 7bhomsa Consensus
51 i00 IYDPKLQTPK TLQEAIDDMG FDALLHNANP LPVLTDTLFL TVTASLTLPW IYDPKLQTPK TLQEAIDDMG FDAVIHNPDP LPVLTDTLFL TVTASLTLPW AWEPAMKKSF AFDNVGYEGG LDGLGPSSQV ATSTVRILGM TCQSCV .... ..................................................
At 7acrigr At 7ahomsa At 7bhomsa Consensus
i01 150 DHIQSTLLKT KGVTDIKIFP QKRTLAVTII PSIVNANQIK ELVPELSLET DHIQSTLLKT KGVTDIKIYP QKRTVAVTII PSIVNANQIK ELVPELSLDT KSIEDRISNL KGIISMKVSL EQDSATVKYV PSVVCLQQVC HQIGDMGFEA ..................................................
At7acrigr At7ahomsa At7bhomsa Atc2sacce Consensus
151 200 GTLEKRSGAC EDHSMAQAGEVVLKIKVEGM TCHSCTSTTE GKIGKLQGVQ GTLEKKSGAC EDHSMAQAGEVVLKMKVEGM TCHSCTSTIE GKIGKLQGVQ SIAEGKAASW PSRSLP.AQE AVVKLRVEGM TCQSCVSSIE GKVRKLQGVV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . MKPEKLFSGL ..................................................
At7acrigr At7ahomsa At7bhomsa Atc2sacce Consensus
201 250 RIKVSLDNQE ATIVYQPHLI SVEEIKKQIEAMGFPAFVKK QPKYLKLGAI RIKVSLDNQE ATIVYQPHLI SVEEMKKQIEAMGFPAFVKK QPKYLKLGAI RVKVSLSNQE AVITYQPYLI QPEDLRDHVNDMGFEAAIKS KVAPLSLGPI G T S D G E Y G V V N S E N I S I D A M Q D N R G E C H R R SIEMHANDNL GLVSQRDCTN ..................................................
           251                                                300
At7acrigr  DVERLKNT.. ..PVKSLEGS QQR.PSYPSD S....TATFI IEGMHCKSCV
At7ahomsa  DVERLKNT.. ..PVKSSEGS QQRSPSYTND S....TATFI IDGMHCKSCV
At7bhomsa  DIERLQSTNP KRPLSSANQN FNNSETLGHQ GSHVVTLQLR IDGMHCKSCV
Atc2sacce  RPKITPQECL SETEQICHHG ENRTKAGLDV DDAETGGDHT NESRVDECCA
Consensus  .......... .......... .......... .......... ..........
           301                                                350
At7acrigr  SNIESALPTL QYVSSIAVSL ENRSAIVKYN ASSVTPEMLI KAIEAVSPGQ
At7ahomsa  SNIESTLSAL QYVSSIVVSL ENRSAIVKYN ASSVTPESLR KAIEAVSPGL
At7bhomsa  LNIEENIGQL LGVQSIQVSL ENKTAQVKYD PSCTSPVALQ RAIEALPPGN
Atc2sacce  EKVNDTETGL DVDSCCGDAQ TGGDHTNESC VDGCCVRDSS VMVEEVTGSC
Consensus  .......... .......... .......... .......... ..........
           351                                                400
At7acrigr  YRVSIANEVE STSS...SPS SSSLQKMPLN VVSQPLTQET VINISGMTCN
At7ahomsa  YRVSITSEVE STSN...SPS SSSLQKIPLN VVSQPLTQET VINIDGMTCN
At7bhomsa  FKVSLPDGAE GSGTDHRSSS SHSPGSPPRN QV.QGTCSTT LIAIAGMTCA
Atc2sacce  EAVSSKEQLL TSFEVVPSKS EGLQSIHDIR ETTRCNTNSN QHTGKGRLCI
Consensus  .......... .......... .......... .......... ..........
           401                                                450
At7acrigr  SCVQSIEGVV SKKPGVKSIH VSLANSFGTV EYDPLLTAPE TLREVIVDMG
At7ahomsa  SCVQSIEGVI SKKPGVKSIR VSLANSNGTV EYDPLLTSPE TLRGAIEDMG
At7bhomsa  SCVHSIEGMI SQLEGVQQIS VSLAEGTATV LYNPSVISPE ELRAAIEDMG
Atc2sacce  ESSDSTLKKR SCKVSRQKIE VSSKPECCNI SCVERIASRS CEKRTFKGST
Consensus  .......... .......... .......... .......... ..........
           451                                                500
At7acrigr  FDAVLPDMSE PLVVIAQPSL ETPLLPSTND .......... ..........
At7ahomsa  FDATLSDTNE PLVVIAQPSS EMPLLTSTNE FYTKG..... ..........
At7bhomsa  FEASVVSESC STNPLGNHSA GNSMVQTTDG TPTSVQEVAP HTGRLPANHA
Atsyescco  .......... .......... .......... .....MSQTI DLTLDGLSCG
Atc2sacce  NVGISGSSST DSLSEKFFSE QYSRMYNRYS SILKNLGCIC NYLRTLGKES
Consensus  .......... .......... .......... .......... ..........
           501                                                550
Caddstaau  ......MDSS TKTLTEDKQV YRVEGFSCAN CAGKFEKNVK ELSGVHDAKV
At7acrigr  .......QDN MMTAVHSKCY IQVSGMTCAS CVANIERNLR REEGIYSVLV
At7ahomsa  ...MTPVQDK EEGKNSSKCY IQVTGMTCAS CVANIERNLR REEGIYSILV
At7bhomsa  PDILAKSPQS TRAVAPQKCF LQIKGMTCAS CVSNIERNLQ KEAGVLSVLV
Ctppromi   ......MNTP TTLSSANRLS LPVEGMTCAS CVGRVERALK AVPEIKDAVV
Atsyescco  HCVKRVKESL EQRPDVEQAD VSITEAHVTG TASAEQLIET IKQAGYDASV
Atulsacce  .......... .....MREVI LAVHGMTCSA CTNTINTQLR ALKGVTKCDI
Atc2sacce  CCLPKVRFCS GEGASKKTKY SYRNSSGCLT KKKTHGDKER LSNDNGHADF
Consensus  .......... .......... .......... .......... ..........
Ctpbraja Fixirhime Cadastaau Caddstaau Cadabacfi Ctpbmycle Ctpamycle Atsysynsp At7acrigr At7ahomsa At7bhomsa Atcssynsp Ctppromi Atkaentfa Atsyescco Atulsacce Atkbentfa Atc2sacce Consensus
551 600 .......................... MHVT RDFSHY ..... VRTAGEGIK ........ MS C C A S S A A I M V A E G G Q A S P A S E E L W L A . . . . . S R D L G G G L R . . . . . . . . . . . . . . . . . . . . . . . . . . . . MS E Q K V K . . . . . . . . . . L M E E E NFGASKIDVF GSATVEDLEK AGAFENLKVA PEKARR ..... RVEPVVTED . . . . . . . . . . . . . . . . . . . . . . . . . . . . MS D Q K A . . . . . . . . . . ITSEQE ............................ MT A S L V E D . . . . . T N N N H E S V R ................................................ MQ ............................... MPAAI ..... VHSADPSST ALMAGKAEVR YNPAVIQ..P PVIAEFIREL GFGATV ..... MENADEGDG ALMAGKAEVR YNPAVIQ..P PMIAEFIREL GFGATV ..... IENADEGDG ALMAGKAEIK YDPEVIQ..P LEIAQFIQDL GFEAAV ..... MEDYAGSDG ............................................. MVNQQ N L A T E R A D I T F S S T P N P . . V ......... L A V S A I E . . . . . S S G Y K V P E E ............................................ MATNTK S H P K A K P L A E S S I P S E A . L .......... T A V S E A L . . . . . P A A T A D D D D SLVTNECQVT YDNEVTADSI KEIIEDCGFD CEILRD ..... SEITAISTK ................................... M ..... NNGIDPENE VCSKSCCTKM KDCAVTSTIS GTSSSEISRI VSMEPIENHL NLEAGSTGTE ..................................................
           601                                                650
Ctpbraja   HIDLAVEGVH CAGCMAKIER GLSAIPDVTL ARVNLTDRRV ALEWKAGT..
Fixirhime  QTELSVPNAY CGTCIATIEG ALRAKPEVER ARVNLSSRRV SIVWKEEVGG
Cadastaau  MNVYRVQGFT CANCAGKFEK NVKKIPGVQD AKVNFGASKI DVYGNASVEE
Caddstaau  KNVYRVEGFS CANCAGKFEK NVKQLAGVQD AKVNFGASKI DVYGNASVEE
Cadabacfi  MKAYRVQGFT CANCAGKFEK NVKQLSGVED AKVNFGASKI AVYGNATIEE
Ctpbmycle  RIQLDVAGML CAACASRVET KL.NKIPGVR ASVNFATRVA TI...DAVDV
Ctpamycle Atsysynsp At7acrigr At7ahomsa At7bhomsa Atcssynsp Ctppromi Atkaentfa Atsyescco Atulsacce Atkbentfa Atc2sacce Consensus
Ctpbraja Fixirhime Cadastaau Caddstaau Cadabacfi Ctpbmycle Ctpamycle Atsysynsp At7acrigr At7ahomsa At7bhomsa Atcssynsp Ctppromi Atkaentfa Atsyescco Atulsacce Atkbentfa Atc2sacce Consensus
RIQLNITGMS SILVEVEGMK ILKLVVRGMT VLELVVRGMT NIELTITGMT ..TLTLRGMG ITELAIEEMT METFVITGMT SQQLLLSGMS EGLLSVQGMT TNKKGAIGKN HIVLSVSGMS ....... G..
TLVNSATRVA RLQQTAGVEA VSVNLITRLA TLTKHKGIFY CSVALATNKA SLTKHRGILY CSVALATNKA KLTRTNGITY ASVALATSKA LIQALPGVQE CSVNFGAEQA ALAQIPGVLE ATVNLATERA ELNEQPGVMS ATVNLATEKA ALQSVPGVTQ ARVNLAERTA QVEGIEGVESVVVSLVTEEC NTKNNLQEHG KMENMDQHHT SFGALKCVHG LKTSLILSQA
CSCCAPNGWNNLPNKLSDFS
CAGCVAAVER CASCVHKIES CASCVHKIES CASCVHNIES CAACAGRIEA CASCVGRVEK CANCSARIEK CASCVTRVQN CGSCVSTVTK PEEKITVEQT CTGCESKLKK C..C
..................
RL...TSAR. KVDYDAALIE HIKYDPEIIG HIKYDPEIIG LVKFDPEIIG QVCYDPALTQ RVRHLSGVVS SVKYTDTTTE LVM...GSAS HVIYEPSKT. HGHMERHQQM EFNLDLAQGS
V .................
651 700 ..LDPGRFIDRLEELGYKAYPFETESAEVAEVAES . . . . . . . . . . . . . RF RRTNPCDFLH AIAERGYQTH LFSPGEEEGD DLLKQ ...... _ _ _ _ - _ _ _ _ ...LEK . . . . . . . A G A F E N L K V S P E K L A N Q T I Q R V K D D T K A H K E E K T P F Y ...LEK . . . . . . . A G A F E N L K V I P E K L A N P S I Q A V K E D T K A P K E E K I P F Y LEK AGAFENL KVTPEKSARQ ASQEVKEDT...KEDKVPFY ...AVDELRQ VIEQAGYRAT ........... AHAESAVEE IDPDADYARN ...SPRPLRY VKAVRRAALC ........... TDGGEALQR RQADADNARY . . . D P T V L T T E I T G L G F R A Q L R Q D D N P L T L P I A E I P P L Q Q QR ........ . . . P R D I I H T . I G S L G F E A S L V K K D R S A S H L D H K R E I K Q W RSS ....... . . . P R D I I H T .IESLGFEAS L V K K D R S A S H L D H K R E I R Q W RRS ....... ...PRDIIKI .IEEIGFHAS L A Q R N P N A H H L D H K M E I K Q W KKS ....... ...VAAIQAA.IEAAGYHAF PLQDPWDN.. EVEAQERHRR ARSQRQLAQR ...ITDLEVA .WHAGYKPRRLSDNPANTRDLSEERREKEARS ....... ..... RLIKS .VENIGYGAI L Y D E A H K Q K I A E E K Q T Y L R K M K F D ...... . . . P Q D L V Q A . V E K A G Y G A K R L K M T L N A A S A S K K P P S L A M K R ........ ...TLETARE M I E D C G F D S N I I M D G N G N A D M T E K T V I L K V T K A F E D E S P L . . . D H G H M S G .MDHSHMDHE D M S G M N H S H M G H E N M S G M D H S M H M G N F K Q K VKDVIKHLSK TTEFKYEQIS NHGSTIDVVVPYAAKDFINE EWPQGVTELK ..................................................
701 Ctpbraja LLRCLGVAAF ATMNVMMLSI Fixirhime LILAVAVSGFAATNIMLLSV Cadastaau KKHSTLLFAT LLIAFGYLSH Caddstaau KKHSTLLFAT LLIAFGYLSH Cadabacfi KKHSTLLYAS LLITFGYLSS Ctpbmycle LLRRLIVAAL LFVPLADLST Ctpamycle LLIRLAVAAALFVPLAHLSV A t s y s y n s p .......... L Q L A I A A F L L At7acrigr FLVSLFFCTP VMGLMMYMMA At7ahomsa FLVSLFFCIP VMGLMTYMMV At7bhomsa FLCSLVFGIP VMALMIYMLI AtcssynspVWVSGLIASL LVIGSLPMML Ctppromi LRRALLIATI FTLPVFVIEM Atkaentfa LIFSAILTLP LMLAMIAMML Atsyescco FRWQAIVALA VGIPVMVWGM Atulsacce ILSSVSERFQ FLLDLGVKSI Atkbentfa FWLSLILAIP IILFSPMMGM
750 PVWSGNVSDM LPEQRDFF ............ S V W S G A D .... A A T R D L F . . . . . . . . . . . . FVNGE . . . . . . . . . . . . . . . . . . . . . . . . . FVNGE . . . . . . . . . . . . . . . . . . . . . . . . . YVNGE . . . . . . . . . . . . . . . . . . . . . . . . . .............................. .............................. .............................. .............................. .............................. .............................. .............................. .............................. .............................. .............................. EISDDMHTLT IKYCCNELGI RDLLRHLERT SF . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Atc2sacce IVERNIIRIY FDPKVIGARD LVNEGWSVPV SIAPFSCHPT IEVGRKHLVR Consensus .................................................. 751 800 Ctpbraja .................................................. Fixirhime .................................................. Cadastaau .................................................. Caddstaau .................................................. Cadabacfi .................................................. C t p b m y c l e ..... M F A I V P T N R . . . . . . . . . . . . . . . . . . . . . . . . . . FPGWGYLL.. C t p a m y c l e ..... M F A V L P S T H . . . . . . . . . . . . . . . . . . . . . . . . . . FPGWEWML.. A t s y s y n s p ..... I V S S W G H L G H W L D H P L P G T D Q L . . . . . . . . . . . . . . . . . . WFH.. A t 7 a c r i g r ..... M E H H F A T I H H N Q S M S N E E M I K N H S S M F L E R Q I L P G L S I M N L L S . . A t 7 a h o m s a ..... M D H H F A T L H H N Q N M S K E E M I N L H S S M F L E R Q I L P G L S V M N L L S . . A t 7 b h o m s a . . . . . . . . . . . . . . . . . . PS N E P .... HQS M V L D H N I I P G L S I L N L I F . . A t c s s y n s p ..... G I S . I P G I P M W L H H P G . . . . . . . . . . . . . . . . . . . . . . . . . LQ.. Ctppromi ..... G S H F I P G V H H W V T Q T L G Q Q . . . . . . . . . . . . . . . . . . LNWYIQ.. A t k a e n t f a ..... G S H . . G P I V S F F H L S L . . . . . . . . . . . . . . . . . . . . . . . . . VQ.. Atsyescco ............ IGDNMMVT ADNR .................. SLWLVI.. Atulsacce GYKFTVFSNL DNTTQLRLLS KEDEIRFWKK NSIKSTLLAI ICMLLYMIVP Atkbentfa ................................... PFQVT FPGSNWVV.. A t c 2 s a c c e V G C T T A L S I I L T I P I L V M A W A P Q L R E K I S T IS . . . . . . . . . . . . . . . . . . Consensus .................................................. Ctpbraja Fixirhime Cadastaau Caddstaau Cadabacfi Ctpbmycle Ctpamycle Atsysynsp At7acrigr At7ahomsa At7bhomsa Atcssynsp Ctppromi Atkaentfa Atsyescco Atulsacce Atkbentfa Atc2sacce Consensus
801 850 .......................... HWLSALIALPAAAY AGQPFFRSAW .......................... HWIS ALIAGPALIY AGRFFYKSAW ......................... DNLVT SMLFVGSIVI GGYSLFKVGF ......................... DNLVT SMLFVSSIVI GGYSLFKVGF ......................... ENIVT TLLFLASMFI GGLSLFKVGL .............................. TALAAPIVTWAAWPFHRVAL .............................. TALAIPVVTWAAWPFHRVAI .............................. ALLATWALLG PGRSILQAGW .............................. LLLCLPVQFF GGWYFYIQAY .............................. FLLCVPVQFF GGWYFYIQAY .............................. FILCTFVQLL GGWYFYVQAY .............................. LGLTLPVLWA .GRSFFINAW .............................. FVLATIVMFG PGLRFFKKGI .............................. LLFALPVQFY VGWRFYKGAY .............................. GLITLAVMVF AGGHFYRSAW MMWPTIVQDR IFPYKETSFV RGLFYRDILG VILASYIQFS VGFYFYKAAW .............................. LVLATILFIY GGQPFLSGAK ............................ AS M V L A T I I Q F V I A G P F Y L N A L ................................ L . . . . . . . . G . . F .....
Ctpbraja Fixirhime Cadastaau Caddstaau Cadabacfi Ctpbmycle Ctpamycle Atsysynsp
851 RALS.AKTTN NAIR.HGRTN QNLI.RFDFD QNLI.RFDFD QNLL.RFEFD RNAR.YRAAS HNAR.YHGAS QGLR.CGAPN
900 MDVPISIGVI LALGMSVVET I ............... HHAE MDVPIALAVS LSYGMSLHET I ............... GHGE MKTLMTVAVI GATIIGK ....................... MKTLMTVAVI GAAIIGE ....................... MKTLMTVAVI GGAIIGE ....................... METLISAGILAATGWSLSTI FVDKEPRQTH GIWQAILHSD METLISTGITAATIWSLYTV FGHHQSTEHR GVWRALLGSD M N S L V L L G T G S A Y L A S L V A L L W ....... P Q L ...... G W
At7acrigr  KALK.HKTAN MDVLIVLATT IAFAYSLII. LL.......V AMYERAKVNP
At7ahomsa  KALK.HKTAN MDVLIVLATT IAFAYSLII. LL.......V AMYERAKVNP
At7bhomsa Atcssynsp Ctppromi Atkaentfa Atsyescco Atulsacce Atkbentfa Atc2sacce Consensus
Ctpbraja Fixirhime Cadastaau Caddstaau Cadabacfi Ctpbmycle Ctpamycle Atsysynsp At7acrigr At7ahomsa At7bhomsa Atcssynsp Ctppromi Atkaentfa Atsyescco Atulsacce Atkbentfa Atc2sacce Consensus
M D V L I V L A T S IAYVYSLVI. LV ....... V A V A E K A E R S P M D T L V A V G T G A A F L Y S L A V T LF ....... P Q W L T R Q G L P P M N S L V S V G T V A A Y G Y S V V S T FI ....... P Q V L . . P A G T A M D V L V A I G T S A A F A L S I Y N G FF ....... P ...... SHSH M D T L V A L G T G V A W L Y S M S V N L W ....... P Q W F P M E A . . R M D T L V C V S T T C A Y T F S V F S L V H N M F H P S S T G K L P R ..... M M T L I A M G I T V A Y V Y S V Y S F I ........ A N L I N P H T H V M M D L L I V L S T S A A Y I F S I V S F GY . . . . . . . . . F V V G R P L S T M.L . . . . . . . . A...S . . . . . . . . . . . . . . . . . . . . . . . .
901 HAYFDAAIML LTFLLVGRFL HAWFDASVTL LFFLLIGRTL ...WAEASIV VILFAISEAL ...WAEASIV VILFAISEAL ...WAEVAIV VILFAISEAL SIYFEVAAGV TVFVLAGRFF AIYFEVAAGI TVFVLAGKYY VCFFDEPVML LGFILLGRTL ITSFDTPPML FVFIALGRWL ITFFDTPPML FVFIALGRWL VTFFDTPPML FVFIALGRWL DVYYEAIAVI IALLLLGRSL NIYFEAAVVI VTLILLGRNL DLYFESSSMI ITLILLGKYL HLYYEASAMI IGLINLGHML .IVFDTSIMI I S Y I S I G K Y L D F F W E L A T L I .VIMLLGHWI Atc2sacce E Q F F E T S S L L V T L I M V G R F V C o n s e n s u s . . . . . . . . . . . . . . . . G..L
Ctpbraja Fixirhime Cadastaau Caddstaau Cadabacfi Ctpbmycle Ctpamycle Atsysynsp At7acrigr At7ahomsa At7bhomsa Atcssynsp Ctppromi Atkaentfa Atsyescco Atulsacce Atkbentfa
iiiiiii!i!i!i i::?:-: :::.::: :~!:.i'! ~
KSLR.HRSAN KAFR.QNTAT PALL.RGAPD HALK.TKAPN KSLL.NGAAT ASLK.HGSGT MELK.QKSPA KSLIFSRLIE ..........
950 DQNMRRRTRA VAGNLAALKA ETAAKFVGPD DHMMRGRART AISGLARLSP RGATVVHPDG E R F S M D R S R Q S I R S L M D I A P KEALVRRNG. E R F S M D R A R Q S I R S L M D I A P KEALVRRNG. E R F S M D R A R Q S I R S L M D I A P KEALVKRNG. EARAKSKAGS ALRALAARGA KNVEVLLPNG TARAKSHASI ALLALAALSA KDAAVLQPDG EEQARFRSQA ALQNLLALQP ETTQLLTAPS E H I A K G K T S E A L A K L I S L Q A TEATIVT... E H I A K G K T S E A L A K L I S L Q A TEATIVT... E H L A K S K T S E A L A K L M S L Q A TEATVVT... E E R A K G Q T S A A I R Q L I G L Q A KTARVLR... EAKAKGNTSQ AIKRLVGLQA KTARVSR E H T A K S K T G D A I K Q M M S L Q T KTAQVLR... E A R A R Q R S S K A L E K L L D L T P PTARLVT... E T L A K S Q T S T A L S K L I Q L T P SVCSII .... EMNAVSNASD ALQKLAELLP ESVKRLKKDG SELARHRAVK SI.SVRSLQA SSAILVDKTG E . . . . . . . . . . . . . L..L .... A .......
951 i000 .......... E I S Q V P V A A I S P G D I V L L R P G E R C A V D G T V I E G R S E I D Q S .......... S R E Y R A V D E I N P G D R L I V A A G E R V P V D G R V L S G T S D L D R S .......... Q E I I I H V D D I A V G D I M I V K P G E K I A M D G I I V N G L S A V N Q A .......... Q E I M I H V D D I A V G D I M I V K P G E K I A M D G I I I N G V S A V N Q A .......... Q E I M I H V D D I A V G D I M I V K P G Q K I A M D G V V V S G Y S A V N Q T .......... A E L T I P A G E L K K Q Q H F L V R P G E T I T A D G V V I D G T A T I D M S .......... S E M V I P A N E L N E Q Q R F V V R P G Q T I A A D G L V I D G S A T V S M S SIAPQDLLEA PAQIWPVAQL RAGDYVQVLP GDRIPVDGCI VAGQSTLDTA ..LDSDNILL S E E Q V D V E L V Q R G D I I K V V P G G K F P V D G R V I E G H S M V D E S ..LDSDNILL S E E Q V D V E L V Q R G D I I K V V P G G K F P V D G R V I E G H S M V D E S ..LGEDNLII R E E Q V P M E L V Q R G D I V K V V P G G K F P V D G K V L E G N T M A D E S ..QGQ ...... E L T L P I T E V Q V E D W V R V R P G E K V P V D G E V I D G R S T V D E S ..HGE ...... I L E I P L D Q V M M G D I V V V R P G E K I P V D G E V V E G H S Y V D E S ..DGK ...... E E T I A I D E V M I D D I L V I R P G E Q V P T D G R I I A G T S A L D E S ..DEG ...... E K S V P L A E V Q P G M L L R L T T G D R V P V D G E I T Q G E A W L D E A .... SDVERN E T K E I P I E L L Q V N D I V E I K P G M K I P A D G I I T R G E S E I D E S .......... T E E T V S L K E V H E G D R L I V R A G D K M P T D G T I D K G H T I V D E S .......... K E T E I N I R L L Q Y G D I F K V L P D S R I P T D G T V I S G S S E V D E A . . . . . . . . . . . . . . . . . . . . . . . D . . . V . P G ..... DG .... G .... D..
i001 Ctpbraja LITGETLYVT A E Q G T P V Y A G FixirhimeVVNGESSPTV VTTGDTVQAG Cadastaau AITGESVPVS KAVDDEVFAG Caddstaau A I T G E S V P V A KTVDDEVFAG Cadabacfi AITGESVPVE KTVDNEVFAG Ctpbmycle A I T G E A R P V H A S P A S T V V G G Ctpamycle P I T G E A K P V R V N P G A Q V I G G Atsysynsp MLTGEPLPQP CQVGDRVCAG At7acrigr L I T G E A M P V A KKPGSTVIAG A t 7 a h o m s a L I T G E A M P V A KKPGSTVIAG A t 7 b h o m s a LITGEAMPVT KKPGSTVIAR Atcssynsp MVTGESLPVQ KQVGDEVIGA Ctppromi M I T G E P V P V A KEIGAEVVGG A t k a e n t f a MLTGESVPVE KKEKDMVFGG Atsyescco MLTGEPIPQQ KGEGDSVHAG Atulsacce LMTGESILVP KKTGFPVIAG A t k b e n t f a A V T G E S K G V K KQVGDSVIGG Atc2sacce LITGESMPVP KKCQSIVVAG Consensus ..TGE..PV ....... V..G
1050 SMNISGTLRV RVSAASEATL L A E I A R L L D N TLNLTGPLTL EATAAARDSF IAEIIGLMEA T L N E E G L I E V KITKYVEDTT ITKIIHLVEE T L N E E G L L E V KITKYVEDTT ISKIIHLVEE T L N E E G L L E V E I T K L V E D T T ISKIIHLVEE TTVLDGRLVI E A T A V G G D T Q FAAMVRLVED TVVLNGRLIV EAAAVGDETQ LAGMVRLVEQ TLNLSHRLVI R A E Q T G S Q T R LAAIVRCVAE SINQNGSLLI CATHVGADTT LSQIVKLVEE SINQNGSLLI CATHVGADTT LSQIVKLVEE SINAHGSVLI KATHVGNDTT LAQIVKLVEE TLNKTGSLTI RATRVGRETF L A Q I V Q L V Q Q TINKTGTFSF KVTKVGANTI LAQIIRLVEE TINTNGLIQI Q V S Q I G K D T V L A Q I I Q M V E D TVVQDGSVLF R A S A V G S H T T L S R I I R M V R Q SVNGPGHFYF R T T T V G E E T K LANIIKVMKE SINGDGTIEI TVTGTGENGY LAKVMEMVRK SVNGTGTLFV KLSKLPGNNT ISTIATMVDE ..N..G ............ T .... I...V..
1051 Ctpbraja A L Q A R S R Y M R L A D R A S R L Y A Fixirhime A E G G R A R Y R R IADRAARYYS Cadastaau A Q G E R A P A Q A FVDKFAKYYT Caddstaau A Q G E R A P A Q A FVDKFAKYYT Cadabacfi A Q G E R A P S Q A FVDKFAKYYT Ctpbmycle A Q V Q K A R V Q H L A D R I A A V F V Ctpamycle A Q Q Q N A N A Q R L A D R I A S V F V Atsysynsp A Q Q R K A P V Q R FADAIAGRFV At7acrigr A Q T S K A P I Q Q FADKLGGYFV A t 7 a h o m s a A Q T S K A P I Q Q FADKLSGYFV A t 7 b h o m s a A Q M S K A P I Q Q LADRFSGYFV Atcssynsp A Q A S K A P I Q R LADQVTGWFV Ctppromi A Q G S K L P I Q A LVDKVTMWFV A t k a e n t f a A Q G S K A P I Q Q IADKISGIFV Atsyescco A Q S S K P E I G Q L A D K I S A V F V Atulsacce A Q L S K A P I Q G YADYLASIFV A t k b e n t f a AQGEKSKLEF LSDKVAKWLF Atc2sacce A K L T K P K I Q N IADKIASYFV C o n s e n s u s A Q ...... Q . . . D .......
ii00 PVVHATALIT ILGWVIA ............. PAVHLLALLT FVGWMLV ............. P I I M V I A A L V A V V P P L F F G G SWDTW ..... P I I M V I A A L V A V V P P L F F G G SWDTW ..... P I I M I I A T L V A I V P P L F F D G SWETW ..... P M V F V I A G L A GASWLLAG ............ PCVFAVAALD ...RCWMA ............ YGVCAIAALT F G F W A T L G S R W W P Q V L Q Q P L PFIVLVSIAT L L V W I I I G F Q NFT ....... PFIVFVSIAT LLVWIVIGFL NFE ....... PFIIIMSTLT LVVWIVIGFI DFG ....... PAVIAIAILT FLLWFNWI ............ PAVMIGATIT FFIWLAFG ............ PIVLFLALVT LLVTGWLT ............ P V V V V I A L V S A A I W Y F F G ............ PGILILAVLT FFIWCFI ............. Y V A L V V G I I A FIAWLFLA ............ P T I I G I T W T FCVWIAVG ............ P .............................
Ctpbraja Fixirhime Cadastaau Caddstaau Cadabacfi Ctpbmycle Ctpamycle Atsysynsp At7acrigr At7ahomsa At7bhomsa
ii01 . . . . . . . . . . . . . . . . . . . G A S W H D A I V T G VAVLIITCPC . . . . . . . . . . . . . . . . . . . E G D V R H A M L V A VAVLIITCPC . . . . . . . . . . . . . . . . . . . . . . . . . . VYQG LAVLVVGCPC . . . . . . . . . . . . . . . . . . . . . . . . . . VYQG LAVLVVGCPC . . . . . . . . . . . . . . . . . . . . . . . . . . IYQG LAVLVVGCPC . . . . . . . . . . . . . . . . . ASP D R A F S V V L G . . . V L V I A C P C . . . . . . . . . . . . . . . . . DRR E R T R P S V L G A IAVLVIACPC PGLLIHAPHH GMEMAHPHSH SPLLLALTLA ISVLVVACPC ...IVETYFP GYSRSISRTE TIIRFAFQAS ITVLCIACPC ...IVETYFP GYNRSISRTE TIIRFAFQAS ITVLCIACPC ...VVQKYFP NPNKHISQTE VIIRFAFQTS ITVLCIACPC
1150 ALGLAIPTVQ ALGLAVPVVQ ALVISTPISI ALVITTPISI ALVISTPISI TLGLATPTAM ALGLATPTAM ALGLATPTAI SLGLATPTAV SLGLATPTAV SLGLATPTAV
Atcssynsp Ctppromi Atkaentfa Atsyescco Atulsacce Atkbentfa Atc2sacce Consensus
. . . . . . . . . . . . . . . . . . GN ..VTLALITA V G V M I I A C P C A L G L A T P T S I . . . . . . . . . . . . . . . . . . PE P A L T F A L I N A V A V L I I A C P C A M G L A T P T S I . . . . . . . . . . . . . . . . . . KD WQ..LALLHS V S V L V I A C P C A L G L A T P T A I . . . . . . . . . . . . . . . . . . PA P Q I V Y T L V I A TTVLIIACPC ALGLATPMSI ...LNISANP P V A F T A N T K A D N F F I C L Q T A T S V V I V A C P C A L G L A T P T A I ..................... NLPDALERM VTVFIIACPH ALGLAIPLVV ........... IRVEKQSRS D A V I Q A I I Y A ITVLIVSCPC VIGLAVPIVF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . VL...CPC .LGLATP...
1151 Ctpbraja TVASGAMFKS FixirhimeVVAAGRLFQG Cadastaau V S A I G N A A K K Caddstaau V S A I G N A A K K Cadabacfi V S A I G N A A K K Ctpbmycle M V A S G R G A Q L Ctpamycle M V A S G R G A Q L Atsysynsp L V A T G L A A E Q At7acrigr M V G T G V G A Q N At7ahomsa MVGTGVGAQN At7bhomsa MVGTGVAAQN Atcssynsp M V G T G K G A E Y Ctppromi M V G T G R A A E L Atkaentfa MVGTGVGAHN Atsyescco ISGVGRAAEF Atulsacce M V G T G V G A Q N Atkbentfa ARSTSIAAKN Atc2sacce V I A S G V A A K R Consensus .... G..A.. Ctpbraja Fixirhime Cadastaau Caddstaau Cadabacfi Ctpbmycle Ctpamycle Atsysynsp At7acrigr At7ahomsa At7bhomsa Atcssynsp Ctppromi Atkaentfa Atsyescco Atulsacce Atkbentfa Atc2sacce Consensus
GVLLNSGDAI ERLAEADHVI FDKTGTLTLP G V M V K D G S A M ERLAEIDTVL L D K T G T L T I G G V L V K G G V Y L E K L G A I K T V A FDKTGTLTKG G V L I K G G V Y L E E L G A I K A I A FDKTGTLTKG G V L V K G G V Y L E E M G A L K A I A FDKTGTLTKG GIFIKGYRAL E T I N A I D T V V F D K T G T L T L G GILLKGHESF E A T R A V D T V V F D K T G T L T T G G I L V R G G D V L E Q L A R I K H F V FDKTGTLTQG GILIKGGEPL EMAHKVKVVVFDKTGTITHG GILIKGGEPL EMAHKVKVVVFDKTGTITHG G I L I K G G K P L EMAHKIKTVM FDKTGTITHG G I L I K S A E S L ELAQTIQTVI L D K T G T L T Q G GILFRKGEAL Q A L R D V S V V A L D K T G T L T K G GILIKGGEAL EGAAHLNSII L D K T G T I T Q G GVLVRDRDAL QRASTLDTVVFDKTGTLTEG G V L I K G G E V L E K F N S I T T F V FDKTGTLTTG GLLLKNRNAMEQANDLDVIM LDKTGTLTQG GVIFKSAESI E V A H N T S H V V F D K T G T L T E G G.L.K ..... E .......... DKTGTLT.G
1200 DLEVMNAADI KPRLVNAHEI VPVVTDFEVL VPVVTDFKVL VPAVTDYNVL QLSVSTVTST QLKVSAVTAA QFELIEIQPL TPVVNQVKVL TPVVNQVKVL VPRVMRVLLL QPSVTDFLAI RPELTDLIP. RPEVTDVIGP KPQVVAVKTF FMVVKKFLKD KFTVTGIEIL KLTWHETVR ...V ......
1201 1250 PA ........ D I F E L A G R L A L S S H H P V A A A V A Q A A G A R S P IV ........ SP ........ G R L A T A A A I A V H S R H P I A V A IQNSAGAASP IA ........ N D . . . Q V E E K ELFSIITALE Y R S Q H P L A S A IMKKAEQDNI PYSNVQV... N D . . . Q V E E K ELFSIITALE Y R S Q H P L A S A IMKKAEQDNI TYSDVRV... N K . . . Q I N E K ELLSIITALE Y R S Q H P L A S A IMKKAEEENI TYSDVQV... G G W . C S G E . . . V L A L A S A V E A A S E H S V A T A IV ...... A A Y A D P R P V . . . P G W . Q A N E . . . V L Q M A A T V E SASEHAVALA IA ...... AS TTHREPV... AD .... VDPD RLLQWAAALE A D S R H P L A T A L Q T . . A A Q A A N L A P I A A . . . VES.NKIPRS KILAIVGTAE S N S E H P L G A A V T K Y C K Q E L D TETLGTC... TES.NRISHH KILAIVGTAE SNSEHPLGTA ITKYCKQELD TETLGTC... GDV.ATLPLR KVLAVVGTAE A S S E H P L G V A V T K Y C K E E L G TETLGYC... G D . . . R D Q Q Q TLLGWAASLE N Y S E H P L A E A IVRY..GEAQ GITLSTV... A E . . . K F E Y N E I L S L V A S I E TYSEHPIAQS I V N A . . A N E A K L T L A S V . . . KE ......... IISLFYSLE H A S E H P L G K A IVAY..GAKV GAKTQPI... A D . . . V D E A Q A.LRLAAALE Q G S S H P L A R A IL .... DKAG DMRLPQV... SNWVGNVDED EVLACIKATE SISDHPVSKA IIRYCDGLNC N K A L N A W L E DE...AYQEE EILKYIGALE A H A N H P L A I G IMNYLKEKKI TPYQAQ .... GDRHNSQ ...... SLLLGLT E G I K H P V S M A IASYLKEKGV SAQNVSNTKA .................... E .S.HP..AI ....................
1251 1300 Ctpbraja G A V E E . A G Q G VRADVDGAE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Fixirhime G D I R E I P G A G IEVKTEDGV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Cadastaau Caddstaau Cadabacfi Ctpbmycle Ctpamycle Atsysynsp At7acrigr At7ahomsa At7bhomsa Atcssynsp Ctppromi Atkaentfa Atsyescco Atulsacce Atkbentfa Atc2sacce Consensus
EEFTSITGRG KDFTSITGRG EDFSSITGKG ADFVAFAGCG ANFRAVPGHG SDRQQVPGLG TDFQVVPGCG IDFQVVPGCG TDFQAVPGCG TDFEAIPGSG DNFEAIPGFG TDFVAHPGAG NGFRTLRGLG SEYVLGKGIV .EQKNLAGVG VTGKRVEGTS ....... G.G
Ctpbraja Fixirhime Cadastaau Caddstaau Cadabacfi Ctpbmycle Ctpamycle Atsysynsp At7acrigr At7ahomsa At7bhomsa Atcssynsp Ctppromi Atkaentfa Atsyescco Atulsacce Atkbentfa Atc2sacce Consensus
1301 1350 . . . . . . . . . . . . . . . . . . . . . . . I R L G R P S F C G A E A L V G D G T R L D P .... . . . . . . . . . . . . . . . . . . . . . . . Y R L G S R D F .... A V G G S G P D G R Q .... ....................... YYIGSPK LFKELNVSDF SLGFENNVKI ....................... YYIGSPR LFKELNVSDF SLEFENKVKV ....................... YYIGSPK LFKELLTNDF DKDLEQNVTT ....................... VKIGKPSWVTRNA..PC DWLESARRR ....................... VRVGKPS WIASRC..NS TTLV.TARRN ...................... SLRLGNPTWV .......... QVATAKLP TSSSMIIDAP LSNAVDT..Q QYKVLIGNRE WMIRNGL.VI SNDVDDSMID TSSSMIIDAQ ISNALNA..Q QHKVLIGNRE WMIRNGL.VI NNDVNDFMTE PASHLNEAGS LPAEKDAVPQ TFSVLIGNRE WLRRNGL.TI SSDVSDAMTD ...................... WLQIGTQR WLGELGI.ET S.ALQNQWED ...................... SVSVGADR FMKQLGL.DV S.QFASSAQK ...................... HYFAGTRK RLAEMNL.SF D.EFQEQALE ...................... ALLLGNQA LLNEQQV.GT K.AIEAEITA ................... N TYDICIGNEA LILEDAL.KK SGFINSNVDQ ....................... VKIINEK EAKRLGL.KI D...PERLKN . . . . . . . . . . . . . . . . . . . . . L K L Q G G N C R W L G H N N D P D V R K A L E ..... .......................... G .......................
Ctpbraja Fixirhime Cadastaau Caddstaau Cadabacfi Ctpbmycle Ctpamycle Atsysynsp At7acrigr At7ahomsa At7bhomsa Atcssynsp Ctppromi Atkaentfa
1351 ..... E A S I V ..... S E A I L LQNQGKTAMI LQNQGKTAMI LQNQGKTAMI RRITGETWF AELRGETAVF TGSAAATSIW HGRKGRPAVL HERKGRTAVL HEMKGQTAIL WEAAGKTVVG LGEQGKTPLY LEQAGKTVMF
IKGIVNGTT ............................... IQGNIDGTT ............................... IKGIVNGTT ............................... VSGWAEHH ............................... VSGTVAERA VSGTCDGR ................................ ISCKVTNIEG LLHKSNLKIE ENNTKNASLV QIDAINEQSS ISCKVTNIEG LLHKNNWNIE DNNIKNASLV QIDASNEQSS I G C K V S N V E G I L A H S E R P L . . . . . . . . . . . . . . . . . . . SA VQGQVEGI ................................ VSATVDGR ................................ ISGTINGV ................................ VSGEAEGH ................................ SKCQVNG ................................. LEATVEDKD ............................... YSG ...... ........................................
AFSKGAEKFI SL.DFRELAC IGTEKTILGV IGTDQTILGV IGTEKEILAV VSVDGVACGA VEIDGEQCGV LADDQQLLAC VTIDDELCGL VAVDDELCGL VAIDGVLCGM VAADGHLQAI TAIDGRLAAI LANEEQVLGM
LWVRQGLRPD AQAVIAALKA FRFEDQPRPA SRESIEALGR IAVADEVRET SKNVIQKLHQ IAVADEVRET SKNVILKLHQ IAVADEVRES SKEILQKLHQ VAIADTVKDSAADAISALCS IAVADAVKASAADAVAALHD FWLQDQPRPEAAEVVQALRS IAIADTVKPE AELAVHILKS IAIADTVKPE AELAIHILKS IAIADAVKQEAALAVHTLQS LSIADQLKPS SVAWRSLQR IAVADPIKET TPEAIKALHA IAVADQIKED AKQAIEQLQQ
1400 RNI.GIEILS LGI.ATGILS LGIKQTIMLT LGIKQTIMLT LGIKKTIMLT RGL.HTILLT RGF.RTALLT RGA.TVQILS MGL.EVVLMT MGL.EVVLMT MGV.DVVLIT LGL.QWMLT LGL.KVAMIT KGV.DVFMVT
Atsyescco  QASQGATPVL LAVDGKAVAL LAVRDPLRSD SVAALQRLHK AGY.RLVMLT
Atulsacce  ....GNTVSY VSVNGHVFGL FEINDEVKHD SYATVQYLQR NGY.ETYMIT
Atkbentfa  YEAQGNTVSF LVVSDKLVAV IALGDVIKPE AKEFIQAIKE KNI.IPVMLT
Atc2sacce  ...QGYSVFC FSVNGSVTAV YALEDSLRAD AVSTINLLRQ RGI.SLHILS
Consensus  ....G.T... .......... ....D..... .......L.. .G.......T
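The Consensus rows in the alignment blocks above mark the columns where one residue predominates. The source does not state the rule used to derive them, so the following is only a minimal sketch under assumptions: the function name `consensus`, the `min_fraction` threshold and the treatment of `.` as a gap are all hypothetical choices, not the authors' method.

```python
from collections import Counter


def consensus(rows, min_fraction=1.0, gap_chars=".-"):
    """Return a consensus string for equal-length aligned rows.

    A column contributes its most common residue when that residue is
    not a gap and occurs in at least `min_fraction` of the rows;
    otherwise the column is written as '.'.
    The threshold is an illustrative assumption, not the book's rule.
    """
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("aligned rows must have equal length")
    out = []
    for column in zip(*rows):
        residue, count = Counter(column).most_common(1)[0]
        if residue not in gap_chars and count / len(rows) >= min_fraction:
            out.append(residue)
        else:
            out.append(".")
    return "".join(out)


# Three short fragments of the conserved DKTGT region, taken from
# different rows of the alignment.
fragments = ["DKTGTLTLP", "DKTGTLTIG", "DKTGTITHG"]
strict = consensus(fragments)                     # every row must agree
relaxed = consensus(fragments, min_fraction=0.6)  # simple majority-style rule
```

With the strict setting only fully conserved columns survive; lowering `min_fraction` admits columns where most, but not all, rows agree.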
1 Odermatt, A. et al. (1993) J. Biol. Chem. 268, 12775-12779.
2 Vulpe, C. et al. (1993) Nature Genet. 3, 7-13.
3 Kahn, D. et al. (1989) J. Bacteriol. 171, 929-939.
4 Petrukhin, K. et al. (1993) Nature Genet. 5, 338-343.
5 Green, N.M. and MacLennan, D.H. (1989) Biochem. Soc. Trans. 17, 819-822; Green, N.M. (1989) Biochem. Soc. Trans. 17, 970-972.
6 Fagan, M.J. and Saier, M.H. Jr. (1994) J. Mol. Evol. 38, 57-99.
7 Bull, P.C. and Cox, D.W. (1994) Trends Genet. 10, 246-252.
Vacuolar ATPases
Vacuolar ATPase family
Summary
Transporters of the vacuolar ATPase family, examples of which are vacuolar ATPase and vacuolar proton pump subunits from humans¹ (Vph1homsa), rodents² (Vpp1ratno) and yeast³ (Stv1sacce), mediate proton transport by ATPase (H+-transporting ATP synthase; EC 3.6.1.34) activity. Other members of the vacuolar ATPase family include the mouse immune suppression factor TJ6⁴ (Tj6musmu). This ATPase subunit is required for assembly as well as for ATPase activity³. Members of the vacuolar ATPase family have been found only in eukaryotes.

Statistical analysis of multiple amino acid sequence comparisons reveals no apparent relationship between these transporters and any other ATPase or transporter family.

Members of the vacuolar ATPase family contain two domains: a hydrophilic N-terminal domain containing many charged residues, and a hydrophobic C-terminal domain. Hydropathy analysis of the amino acid sequences predicts that the hydrophobic domain contains six transmembrane helices. Unusually for a transporter family, both the N-terminal domain and the C-terminus are predicted to be extracellular. These proteins are also known to be glycosylated.

Many amino acids, and several long sequence motifs, are conserved throughout this family. The conserved sequence motifs are more prevalent in the hydrophobic, membrane-spanning C-terminal domain of the proteins.
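The transmembrane helix prediction mentioned in this summary rests on hydropathy analysis. The source does not say which scale or window was used, so the sketch below is an illustration only: it assumes the Kyte-Doolittle scale with the commonly used 19-residue window and 1.6 cut-off, and the function names `hydropathy_profile` and `candidate_helices` are hypothetical.

```python
# Kyte-Doolittle hydropathy values per residue (assumed scale; the source
# does not name the scale behind its predictions).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    values = [KD[aa] for aa in seq]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def candidate_helices(seq, window=19, threshold=1.6):
    """Merge overlapping high-scoring windows into (start, end) spans.

    Positions are 1-based and inclusive. A span is only a candidate
    membrane-spanning segment, not a confirmed helix.
    """
    spans = []
    for i, score in enumerate(hydropathy_profile(seq, window)):
        if score >= threshold:
            start, end = i + 1, i + window
            if spans and start <= spans[-1][1]:
                spans[-1] = (spans[-1][0], end)  # extend the previous span
            else:
                spans.append((start, end))
    return spans


# Synthetic test sequence: a hydrophobic stretch (residues 31-55)
# flanked by charged residues.
toy = "D" * 30 + "L" * 25 + "K" * 30
toy_spans = candidate_helices(toy)
```

Because whole windows are merged, a reported span extends a few residues beyond the hydrophobic core; counting the spans found along a full-length sequence gives a rough estimate of the number of membrane-spanning segments.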
Nomenclature, biological sources and substrates

CODE | DESCRIPTION [SYNONYMS] | ORGANISM [COMMON NAMES] | SUBSTRATE(S)
Stv1sacce | Vacuolar ATP synthase 101 kDa subunit [V-ATPase subunit AC115, STV1, YMR054W, YM9796.07] | Saccharomyces cerevisiae [yeast] | H+
Tj6musmu | Immune suppressor factor J6B7 | Mus musculus [mouse] | H+
Vph1homsa | Vacuolar proton pump subunit [OC-116 kDa, VPP1] | Homo sapiens [human] | H+
Vph1neucr | Vacuolar ATPase 98 kDa subunit [VPH1] | Neurospora crassa [mold] | H+
Vph1sacce | Vacuolar ATP synthase 95.5 kDa subunit [VPH1, YOR270C] | Saccharomyces cerevisiae [yeast] | H+
Vpp1caeel | Putative clathrin-coated vesicle/synaptic vesicle proton pump subunit [ZK637.8] | Caenorhabditis elegans [nematode] | H+
Vpp1ratno | Clathrin-coated vesicle/synaptic vesicle proton pump 116 kDa subunit | Rattus norvegicus [rat] | H+
Phylogenetic tree
[Figure not reproduced. Recoverable labels: Vpp1ratno, Tj6musmu, Vph1homsa, Stv1sacce; INSIDE.]
Physical and genetic characteristics
CODE | AMINO ACIDS | MOL. WT | EXPRESSION SITES | CHROMOSOMAL LOCUS
Stv1sacce | 890 | 101 660 | | ADH3 to centromere
Tj6musmu | 855 | 98 048 | thymus |
Vph1homsa | 829 | 93 011 | | 17q21
Vph1neucr | 856 | 97 992 | |
Vph1sacce | 840 | 95 528 | | Chromosome 15
Vpp1caeel | 1030 | 117 544 | | ZK637.8
Vpp1ratno | 838 | 96 327 | brain |
Multiple amino acid sequence alignments 1
50
Vpplcaeel MGDYVTPGEE PPQPGIYRSE QMCLAQLYLQ SDASYQCVAE Vpplratno ............ MGELFRSE EMTLAQLFLQ SEAAYCCVSE Tj6musmu ............ MGSLFRSE SMCLAQLFLQ SGTAYECLSA Vphlhomsa ............ MGSMFRSE EVALVQLFLP TAAAYTCVSR S t v l s a c c e ......... M N Q E E A I F R S A D M T Y V Q L Y I P L E V I R E V T F L V p h l s a c c e ........ MA E K E E A I F R S A E M A L V Q F Y I P Q E I S R D S A Y T V p h l n e u c r ........ MA P K Q D T P F R S A D M S M V Q L Y I S N E I G R E V C N A Consensus ................ FRS..M...QL ............. Vpplcaeel Vpplratno Tj6musmu Vphlhomsa Stvlsacce Vphlsacce Vphlneucr Consensus
51 DLNPDVSSFQ DLNPDVNVFQ DLNQNVSSFQ DLNASVSAFQ DLNKDLTAFQ DLNSKVRAFQ DLNSELSAFQ DLN ..... FQ
LGELGLVQFR LEELGKVQFR LGEKGLVQFR LGELGLVEFR LGKMSVFMVM LGQLGLVQFR LGELGLVHFR LG..G.V.FR
i00 R K Y V N E V R R C D E M E R K L R Y L E R E I K K D Q I P M ......... R K F V N E V R R C E E M D R K L R F V E K E I R K A N I P I ......... R K F V G E V K R C E E L E R I L V Y L V Q E I T R A D I P L ......... R R F V V D V W R C E E L E K T F T F L Q E E V R R A G L V L ......... RGYVNQLRRF DEVERMVGFL NEWEKHAAE TWKYILHIDD R T F V N E I R R L D N V E R Q Y R Y F Y S L L K K H D I K LY ....... E R A F T Q D I R R L D N V E R Q L R Y F H S Q M E K A G I P L R K F ..... D R . . V .... R .... ER . . . . . . . . . . . . . . . . . . . . . . . . .
i01 V p p l c a e e l .......... L D T G E N P D A P V p p l r a t n o .......... M D T G E N P E V P Tj6musmu .......... P E G E A S P P A P V p h l h o m s a .......... P P P K G R L P A P Stvlsacce EGNDIAQPDM ADLINTMEPL Vphlsacce GDTDKYLDGS GELY...VPP Vphlneucr PDVDI ........... LTPP Consensus ................... p
150 LPREMIDLEA TFEKLENELR EVNKNEETLK FPRDMIDLEA NFEKIENELK EINTNQEALK PLKHVLEMQE QLQKLEVELR EVTKNKEKLR PPRDLLRIQE ETERLAQELR DVRGNQQALR SLENVNDMVK EITDCESRAR QLDESLDSLR SGSVIDDYVR NASYLEERLI QMEDATDQIE TTTEIDELAE RAQTLEQRVS SLNESYETLK ............... E ..............
Vpplcaeel Vpplratno Tj6musmu Vphlhomsa Stvlsacce Vphlsacce Vphlneucr Consensus
151 200 KNFSELTELK HILRKTQTFF EEVDHDRWRI LEGGSGRRGR STEREETRPL RNFLELTELK FILRKTQQFF DEMADP..DL LEESSS ............. L KNLLELVEYT HMLRVTKTFL KRNVEFEPTY EEFPAL ......... ENDSL AQLHQLQLHA AVLRQG ..... HEPQLAAAH TD.GAS ......... ERTPL SKLNDLLEQR QVIFECSKFI EVNPGIAGRA TNPEIEQEER DVDEFRMTPD VQKND.LEQY RFILQSG ......................... DEFFLKGD KREVELTEWRWVLREAGGFF DRAHG ............... NVEEIRASTD ..... L.E . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Vpplcaeel Vpplratno Tj6musmu Vphlhomsa Stvlsacce Vphlsacce Vphlneucr Consensus
201 250 IDIGDMDDDSAARMSAQAAMLRLGYVVLGK MDRPESATIA KRDLVYVVLF LEPNEM ......... GRGAP LRLG .......................... LDYSCMQ ........................................... LQAPGGP ........................................... DISETLSDAF SFDDETPQDR GALG .......................... N T D S T ..... S Y M D E D M I D A N G E N . . . . . . . . . . . . . . . . . . . . . . . . . . NDDAPL .......... LQDV EQHN .......................... ..................................................
          251                                                  300
Vpp1caeel VSFSFCIPLV FFPDSFLHED MIASSAESSG IGEVLSADEE ELSGRFSDAM
Vpp1ratno .......... .......... .......... .......... ..........
Tj6musmu  .......... .......... .......... .......... ..........
Vph1homsa .......... .......... .......... .......... ..........
Stv1sacce .......... .......... .......... .....NDLTR NQSVEDLSFL
Vph1sacce .......... .......... .......... .....IAAAI GASVN.....
Vph1neucr .......... .......... .......... .....TAADV ERSFSGMNIG
Consensus .......... .......... .......... .......... ..........
Vpplcaeel Vpplratno Tj6musmu Vphlhomsa Stvlsacce Vphlsacce Vphlneucr Consensus
301 SPLKLQLRFV ........ FV .RLGAKLGFV .HQDLRVNFV EQGYQHRYMI ........ YV ........ FV ......... V
AGVIQRERLP AFERLLWRAC AGVINRERIP TFERMLWRVC SGLIQQGRVE AFERMLWRAC A G A V E P H K A P ALERLLWRAC T G S I R R T K V D ILNRILWRLL T G V I A R D K V A TLEQILWRVL AGVIGRDRVD AFERILWRTL .G.I ........ ER.LWR..
9 Vpplcaeel Vpplratno Tj6musmu Vphlhomsa Stvlsacce Vphlsacce Vphlneucr Consensus
351 GDPVNKCVFI GDYVHKSVFI GEVIKWYVFL GEPATWMTFL K..VEKDCFI REYKHKNAFI NEPVLKNVFV ........ F.
400 IFFQGDHLKT KVKKICEGFR ATLYPCPDTP Q E R R E M S I G V IFFQGDQLKN R V K K I C E G F R A S L Y P C P E T P Q E R K E M A S G V ISFWGEQIGH KVKKICDCYH CHIYPYPNTA EERREIQEGL ISYWGEQIGQ KIRKITDCFH CHVFPFLQQE EARLGALQQL IFTHGETLLK KVKRVIDSLN G K . . . I V S L N TRSSELVDTL VFSHGDLIIK RIRKIAESLDANLYDVDSSN EGRSQQLAKV IFAHGKEILA KIRRISESMG AEVYNVDEHS D L R R D Q V H E V I...G ......... I . . . . . . . . . . . . . . . . . R .......
RGNVFLRTSE RGNVFLRQAE KGYTIVTYAE RGFLIASFRE RGNLIFQNFP RGNLFFKTVE RGNLYMNQAE RG ....... E
401
350 IDDVLNDTVT IENPLEDPVT LDECLEDPET LEQPLEHPVT IEEPLLEGKE IEQPVYDVKT IPEPLIDPTI .... L .....
450
Vpp1caeel MTRIEDLKTV LGQTQDHRHR VLVAASKNVR MWLTKVRKIK SIYHTLNLFN
Vpplratno Tj6musmu Vphlhomsa Stvlsacce Vphlsacce Vphlneucr Consensus
NTRIDDLQMV LNQTEDHRQR VLQAAAKNIRVWFIKVRKMK N T R I Q D L Y T V L H K T E D Y L R Q VLCKAAESVC SRVVQVRKMK Q Q Q S Q E L Q E V L G E T E R F L S Q V L G R V L Q L L P PGQVQVHKMK NRQIDDLQRI L D T T E Q T L H T E L L V I H D Q L P V W S A M T K R E K N K N L S D L Y T V LKTTSTTLES ELYAIAKELD SWFQDVTREK N A R L E D V Q N V L R N T Q Q T L E A ELAQISQSLS A W M I T I S K E K ..... D L . . V L . . T ....... L . . . . . . . . . . . . . . . . . K
Vpplcaeel Vpplratno Tj6musmu Vphlhomsa Stvlsacce Vphlsacce Vphlneucr Consensus
451 IDVTQKCLIA IDVTQKCLIA FDVTNKCLIA VSTTHKCLIA FQQESQGLIA YDTNRKILIA YDRARRTLIA ....... LIA
500 EVWCPIAELD R I K M A L K R G T DESGSQVPSI LNRMETNEAP E V W C P V T D L D SIQFALRRGT EHSGSTVPSI LNRMQTNQTP EVWCPEVDLP GLRRALEEGS RESGATIPSF MNTIPTKETP EAWCSVRDLP ALQEALRDSS M E E G . . V S A V AHRIPCRDMP EGWVPSTELI HLQDSLKDYI E T L G S E Y S T V FNVILTNKLP E G W I P R D E L A TLQARLGEMI ARLGIDVPSI IQVLDTNHTP EGWCPTNDLP L I R S T L Q D V N N R A G L S V P S I INEIRTNKTP E.W.P...L ...... L ....... G ........... T...P
Vpplcaeel Vpplratno Tj6musmu Vphlhomsa Stvlsacce Vphlsacce
501 PTYNKTNKFT PTYNKTNKFT PTLIRTNKFT PTLIRTNRFT PTYHRTNKFT PTFHRTNKFT
KGFQNIVDAY HGFQNIVDAY EGFQNIVDAY ASFQGIVDRY QAFQSIVDAY AGFQSICDCY
AIYHTLNLCN AIYHMLNMCS AVYLALNQCS YVYTTLNK.. AIFEILNKSN AVYNTLNLFS ..Y..LN...
GIATYREINP GIGTYREINP GVGSYREVNP GVGRYQEVNP GIATYKEINA GIAQYREINA
APYTMISFPF APYTVITFPF ALFTIITFPF APYTIITFPF GLATVVTFPF GLPTIVTFPF
550 LFAVMFGDMG LFAVMFGDFG LFAVMFGDFG LFAVMFGDVG MFAIMFGDMG MFAIMFGDMG
Vphlneucr PTYLKTNKFT EAFQTIVNAY GTATYQEVNP AIPVIVTFPF LFAVMFGDFG ConsensusPT...TNKFT . . F Q . I V D . Y G . . . Y . E . N .... T . . T F P F . F A . M F G D . G 551 600 Vpplcaeel HGAIMLLAAL FFILKEKQLEAARIKDEIFQ TFFGGRYVIF LMGAFSIYTG Vpplratno HGILMTLFAV WMVLRESRIL SQKNENEMFS MVFSGRYIIL LMGLFSIYTG Tj6musmu HGFVMFLFAL LLVLNENHPR LSQSQ.EILR MFFDGRYILL LMGLFSVYTG Vphlhomsa HGLLMFLFALAMVLAENRPA VKAAQNEIWQ TFFRGRYLLL LMGLFSIYTG Stvlsacce HGFILFLMAL FLVLNERKFG .AMHRDEIFD MAFTGRYVLL LMGAFSVYTG Vphlsacce HGFLMTLAAL SLVLNEKKIN .KMKRGEIFD MAFTGRYIIL LMGVFSMYTG Vphlneucr HALIMLCAAL AMIYWEKPLK .KVTF.ELFA MVFYGRYIVL VMAVFSVYTG ConsensusHG..M.L.AL ...L.E . . . . . . . . . . E . . . . . F . G R Y . . L L M G . F S . Y T G Vpplcaeel Vpplratno Tj6musmu Vphlhomsa Stvlsacce Vphlsacce Vphlneucr Consensus
601 650 F M Y N D V F S K S I N T F G S S W . . . . . . . . . QNT I P E S V I D Y Y L D D E K R S E S Q L L I Y N D C F S K S L N I F G S S W . . . . . . . . . . SV R P M F T I G N W T E E T L L G S S V L LIYNDCFSKS VNLFGSGWNV CAMYSSSHSP EEQRKMVLWNDSTIRHSRTL F I Y N E C F S R A T S I F P S G W S V A A M A N Q S G . . . . . . . . . . WS D A F L A Q H T M L LLYNDIFSKS MTIFKSGWQW ..PSTFRKG .................... E FLYNDIFSKT MTIFKSGWKW ..PDHWKKG .................... E LIYNDVFSKS MTLFDSQWKWVVPENFKEG .................... M . . Y N D . F S K .... F . S . W . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Vpplcaeel Vpplratno Tj6musmu Vphlhomsa Stvlsacce Vphlsacce Vphlneucr Consensus
651 IL.PPETAFD GNPYPIGVDPVWNLAEGNKL QLNPAIPGVF GGPYPFGIDP IWNIA.TNKL QLDPNIPGVF RGPYPFGIDP IWNLA.TNRL TLDPNVTGVF LGPYPFGIDP IWSLA.ANHL S I E A K K T G V .... Y P F G L D F A W H . G T D N G L S I T A T S V G T .... Y P I G L D W A W H . G T E N A L TVKAVLREPN GYRYPFGLDW RWH.GTENEL . . . . . . . . . . . . . Y P . G . D . . W ..... N.L
SFLNSMKMKM TFLNSFKMKM TFLNSFKMKM SFLNSFKMKM LFSNSYKMKL LFSNSYKMKL LFINSYKMKM .F.NS.KMK.
700 SVLFGIAQMT SVILGIIHML SVILGIFHMT SVILGVVHMA SILMGYAHMT SILMGFIHMT AIILGWAHMT S...G..HM.
Vpplcaeel Vpplratno Tj6musmu Vphlhomsa Stvlsacce Vphlsacce Vphlneucr Consensus
701 FGVLLSYQNF IYFKSDLDIK YMFIPQMIFL SSIFIYLCIQ FGVSLSLFNH IYFKKPLNIY FGFIPEIIFM SSLFGYLVIL FGWLGIFNH LHFRKKFNVY LVSVPEILFM LCIFGYLIFM FGVVLGVFNH VHFGQRHRLL LETLPELTFL LGLFGYLVFL YSFMFSYINY RAKNSKVDII GNFIPGLVFM QSIFGYLSWA YSYFFSLANH LYFNSMIDII GNFIPGLLFM QGIFGYLSVC YSLCFSYINA RHFKRPIDIW GNFVPGMIFF QSIFGYLVLC ........ N . . . F . . . . . . . . . . . P . . . F . . . . FGYL...
750 ILSKWLFFGA IFYKWTAYDA IIYKWLAYSA VIYKWLCVWA I V Y K W ..... I V Y K W ..... I I Y K W ..... I . Y K W .....
Vpplcaeel Vpplratno Tj6musmu Vphlhomsa Stvlsacce Vphlsacce Vphlneucr Consensus
751 VGGTVLGYKY PGSNCAPSLL .......... H S S R N A P S L L .......... E T S R E A P S I L .......... A R A . A S P S I L ..... SKDWI K D D K P A P G L L ..... A V D W V K D G K P A P G L L ..... SVDWF G T G R Q P P G L L ................ P..L
800 IGLINMFMMK SRNAGFVDDS GETYPQCYLS IHFINMFLF ............. SYPESGNA IEFINMFLFP .............. TSKTHG IHFINMFLFS .............. HSPSNR NMLINMFLAP GTIDD..Q ............ NMLINMFLSP GTIDD..E ............ NMLIYMFLQP GTLDGGVE ............ ...INMFL ......................
801 850 Vpplcaeel TWYPGQSFFE TIFVLVAIAC VPVMLFGKPY FLWKEEKERR EGGHRQLATI Vpplratno MLYSGQKGIQ CFLIVVAMLC VPWMLLFKPL ILRHQYLRKK HLGTLNFGGI
Tj6musmu Vphlhomsa Stvlsacce Vphlsacce Vphlneucr Consensus
.LYPGQAHVQ R V L V A L T V L A LLYPRQEVVQ ATLVVLALAM .LYSGQAKLQVVLLLAALVC .LYPHQAKVQ V F L L L M A L V C .LYPGQATVQ V I L L L L A V I Q .LY..Q...Q ..L...A...
V P V L F L G K P L F L L W L H N G R N CFGMSRSG.. V P I L L L G T P L H L L H R H R R R .... LRRRP.. VPWLLLYKPL TLRRLNKNGG GGRPHGYQSV I P W L L L V K P L H F K F T H K K ...... KSHEPL VPILLFLKPF YLRWENNRAR AKGYRGIGER VP..L..KP..L ..................
Vpplcaeel Vpplratno Tj6musmu Vphlhomsa Stvlsacce Vphlsacce Vphlneucr Consensus
851 900 E I I L W L A L V Q V P I M L F A K P Y F L Y R R D K Q Q SRYSTLTAES N Q H Q S V R A D I ............................................ RVGNGP ........................................... YTLVRKD .............................................. ADRQ GNI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . EH .EEQIAQQRH PST . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . EA .DA ....... SRV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . SA L D E D D E E D P S ..................................................
Vpplcaeel Vpplratno Tj6musmu Vphlhomsa Stvlsacce Vphlsacce Vphlneucr Consensus
901 950 N Q D D A E V V H A P E Q T P K P S G H G H G H G D G P . . . . . . . LEMGD V M V Y Q A I H T I T E E D A E I I Q H D Q L S T H S E D A E E P T E D E V . . . . . . . FDFGD T M V H Q A I H T I SEEEVSLLGNQDIE.EGNSRMEEGCREVTCE...EFNFGE ILMTQAIHSI EENKAGLLDL PDASVNGWSS DEEKAGGLDD EEEAELVPSE VLMHQAIHTI SAEGFQGMII S D V A S V A D S I N E S V G G G .... E Q G P F N F G D V M I H Q V I H T I S S E D L E A Q Q L ISAMDADDAE E E E V G S G .... S H G E . D F G D IMIHQVIHTI N G D D Y E G A A M LT ........ H D E H G D G .... E H E E F E F G E V M I H Q V I H T I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . G ..... Q.IHTI
Vpplcaeel Vpplratno Tj6musmu Vphlhomsa Stvlsacce Vphlsacce Vphlneucr Consensus
951 EFVLGCVSHT EYCLGCISNT EYCLGCISNT EFCLGCVSNT EFCLNCISHT EFCLNCVSHT EFCLNSVSHT E.CL.C.S.T
ASYLRLWALS ASYLRLWALS ASYLRLWALS ASYLRLWALS ASYLRLWALS ASYLRLWALS ASYLRLWALS ASYLRLWALS
LAHAQLSDVL LAHAQLSEVL LAHAQLSDVL LAHAQLSEVL LAHAQLSSVL LAHAQLSSVL LAHQQLSAVL LAHAQLS.VL
i000 WTMVFRNAFV LDGYTGAIAT WTMVIHIGLH VRSLAGGLGL WAMLMRVGLR VDTTYG...V WAMVMRIGLG LGREVGVAAV WDMTISNAFS SKNSGSPLAV WTMTIQIAFG FRGF...VGV WSMTMAKALE SKGLGG..AI W.M . . . . . . . . . . . . . . . . .
Vpplcaeel Vpp ir atno Tj6musmu Vphlhomsa Stvlsacce Vphlsacce Vphlneucr Consensus
i001 YI...LFFIF GSLSVFILVL FF... IFAAF A T L T V A I L L I LL.LPVMAFF AVLTIFILLV VL.VPIFAAF AVMTVAILLV MKVVFLFAMW FVLTVCILVF FMTVALFAMW FALTCAVLVL FLVVA.FAMF FVLSVIILII ...... FA . . . . . . . . IL..
MEGLSAFLHA MEGLSAFLHA MEGLSAFLHA MEGLSAFLHA MEGTSAMLHA MEGTSAMLHS MEGVSAMLHS MEG.SA.LH.
LRLHWVEFQS LRLHWVEFQN IRLHWVEFQN LRLHWVEFQN LRLHWVEAMS LRLHWVESMS LRLAWVESFS LRLHWVE...
Vpplcaeel Vpp ir atno Tj6musmu Vphlhomsa Stvlsacce
1051 1073 A P F S F E K I L A E E R E A E E N L .... L P F S F E H I R E GKFDE . . . . . . . . V P F S F S L L S S K F S N D D S I A .... SPFTFAATDD ............. E P F S F R ...... AIIE .......
1050 KFYGGLGYEF KFYTGTGFKF KFYVGAGTKF KFYSGTGYKL KFFEGEGYAY KFFVGEGLPY KFAEFGGWPF KF..G.G...
Vphlsacce EPFAFEYKDM EVAVASASSS ASS Vphlneucr TPFSFKQQLE ESEELKEYIG . . . Consensus .PF.F . . . . . . . . . . . . . . . . . .
Residues listed in the consensus sequence are present in at least 75 % of the aligned transporter sequences.
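The 75% rule for the consensus line can be checked with a short sketch. The ten-column block below is copied from the alignment above (the LAHAQLS region near the C-terminus). The function name and the choice to count the fraction against all aligned sequences, gapped rows included, are assumptions, since the text does not say how gaps were treated.

```python
from collections import Counter

# One ten-column block taken from the alignment above, in the order used there.
block = {
    "Vpp1caeel": "LAHAQLSDVL",
    "Vpp1ratno": "LAHAQLSEVL",
    "Tj6musmu":  "LAHAQLSDVL",
    "Vph1homsa": "LAHAQLSEVL",
    "Stv1sacce": "LAHAQLSSVL",
    "Vph1sacce": "LAHAQLSSVL",
    "Vph1neucr": "LAHQQLSAVL",
}


def consensus(aligned, threshold=0.75, gap="."):
    """Emit a residue where it occurs in at least `threshold` of the rows,
    otherwise emit the gap character."""
    rows = list(aligned)
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned sequences must have equal length")
    out = []
    for column in zip(*rows):
        counts = Counter(residue for residue in column if residue != gap)
        if counts:
            residue, count = counts.most_common(1)[0]
            out.append(residue if count / len(rows) >= threshold else gap)
        else:
            out.append(gap)
    return "".join(out)


print(consensus(block.values()))  # prints LAHAQLS.VL
```

With seven sequences, a residue must appear in at least six rows to pass the 75% cutoff, which is why the fourth column (A in six rows, Q in one) is reported and the eighth column is not.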
Database accession numbers
CODE | SWISSPROT | PIR | EMBL/GENBANK
Stv1sacce | P37296 | A54081 | U06465; G460160
Tj6musmu | P15920 | JH0287 | M31226; G293678
Vph1homsa | | | U45285
Vph1neucr | | | U36396
Vph1sacce | P32563 | A42970 | M89778; G173173
Vpp1caeel | P30628 | S15795 | Z11115; G1067097
Vpp1ratno | P25286 | B38656 | M58758; G206430
References
1 Li, Y.P. et al. (1996) Biochem. Biophys. Res. Commun. 218, 813-821.
2 Perin, M.S. et al. (1991) J. Biol. Chem. 266, 3877-3881.
3 Manolson, M.F. et al. (1992) J. Biol. Chem. 267, 14294-14303.
4 Lee, C.-K. and Ghoshal, K.K.D. (1990) Mol. Immunol. 27, 1137-1144.
ABC Multidrug Resistance Proteins

White transporter family

Summary
Typical transporters of the white family, the example of which is the white 1 protein of Drosophila melanogaster (Whitdrome), mediate the import of pigment precursors into cells in the compound eye by acting as ATP-dependent efflux pumps. In Drosophila the white protein dimerizes with the brown protein (Browdrome) to import guanine 2 and with the scarlet protein (Scrtdrome) to import tryptophan 3. Members of the white transporter family are also found in mammals and a few other eukaryotes. The human homolog of the white protein 4 is located on chromosome 21 and may be implicated in Down's syndrome (trisomy 21).
Statistical analysis of multiple amino acid sequence comparisons places the white transporter family in the multidrug resistance subdivision of the ATP binding cassette (ABC) superfamily 5. Proteins in this superfamily use the energy of ATP hydrolysis to pump substrates across cell membranes.
Transporters of the white family consist of a single ATP binding domain (containing the sequence patterns characteristic of the ABC transporter superfamily) fused to a transmembrane domain, with the ATP binding domain towards the N-terminus 2. The functional transporter complex is formed from a dimer; in the case of the Drosophila pigment proteins, this is a heterodimer of white with either brown or scarlet. Hydropathy analysis of the amino acid sequences predicts that each transmembrane domain contains six membrane-spanning helices.
Several amino acids are conserved within the white transporter family, including motifs unique to the family, signature motifs of the ABC superfamily, and motifs necessary for function by the criterion of site-directed mutagenesis.
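One of the sequence patterns associated with ATP binding domains can be located with a simple pattern search. The sketch below assumes the Walker A (P-loop) motif, written PROSITE-style as [AG]-x(4)-G-K-[ST]; the summary does not name the patterns it refers to, so this choice is an illustration only. The ten-residue fragments are copied from the alignment block at positions 401-450 below, and the function name is hypothetical.

```python
import re

# Walker A (P-loop) motif, [AG]-x(4)-G-K-[ST], as a regular expression.
# Assumed here as one example of an ATP-binding sequence pattern.
WALKER_A = re.compile(r"[AG].{4}GK[ST]")

# Ten-residue fragments taken from the alignment of the white family below.
fragments = {
    "Whithomsa": "IMGPSGAGKS",
    "Whitanoal": "VMGSSGAGKT",
    "Whitdrome": "VMGSSGAGKT",
    "Scrtdrome": "LMGSSGSGKT",
    "Adp1sacce": "IMGGSGAGKT",
    "Browdrome": "ILGGSGAGKT",
}


def find_walker_a(sequence):
    """Return (0-based offset, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in WALKER_A.finditer(sequence)]


for code, fragment in fragments.items():
    print(code, find_walker_a(fragment))
```

Each fragment yields a single match starting at offset 2, which corresponds to the fully conserved G.SG.GK[ST] stretch visible in the consensus line of that block.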
Nomenclature, biological sources and substrates
CODE | DESCRIPTION [SYNONYMS] | ORGANISM [COMMON NAMES] | SUBSTRATE(S)
Adp1sacce | Probable ATP-dependent permease precursor [ADP1, YCR11C, YCR105] | Saccharomyces cerevisiae [yeast] |
Browdrome | Brown protein [BW] | Drosophila melanogaster [fruit fly] | Guanine
Scrtdrome | Scarlet protein [ST] | Drosophila melanogaster [fruit fly] | Tryptophan
Whitanoal | Eye pigment protein [White] | Anopheles albimanus [mosquito] | Pigment precursors?
Whitdrome | White protein [W] | Drosophila melanogaster [fruit fly] | Guanine, tryptophan
Whithomsa | White protein homolog [WHIT1] | Homo sapiens [human] | Pigment precursors?
Whitmusmu | White protein homolog | Mus musculus [mouse] | Pigment precursors?
Phylogenetic tree
[Figure not reproduced. Recoverable labels: Whithomsa, Whitanoal, Whitdrome, Scrtdrome, Adp1sacce, Browdrome.]
Proteins listed subsequently in italics are at least 90% identical to the paired transporters listed in parentheses and are therefore not included in the phylogenetic tree: Whitmusmu (Whithomsa).

Proposed orientation of white protein in the membrane
The model is based on predictions of membrane-spanning regions and α-helical content. The N-terminus of the protein is illustrated on the inside and is folded six times through the membrane. The predicted membrane-spanning helices are portrayed as rectangles. The numbers corresponding to the first and last residue of each membrane-spanning helix are boxed. Residues that are conserved in more than 75% of the aligned transporters (see below) are shown.
[Figure not reproduced: membrane topology model of the white protein, with the OUTSIDE and INSIDE faces of the membrane and the NH2 and COOH termini labelled.]
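The 90% identity criterion used above to leave Whitmusmu out of the phylogenetic tree depends on a pairwise identity calculation. The text does not define how identity was computed, so the sketch below assumes one common convention: identical residues divided by the number of aligned columns in which neither sequence has a gap. The function name is hypothetical, and the two ten-residue fragments (taken from the alignment further below) are far too short to say anything about the full-length proteins.

```python
def percent_identity(seq_a, seq_b, gap="."):
    """Percent identity over aligned columns where neither row has a gap.

    This definition is an assumption; other conventions divide by the
    alignment length or by the length of the shorter sequence.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Fragments of Whithomsa and Whitanoal from the alignment, for illustration only.
print(percent_identity("IMGPSGAGKS", "VMGSSGAGKT"))  # prints 70.0
```

Under this convention a pair would be treated as redundant when the value computed over the full alignment reaches 90 or more.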
Physical and genetic characteristics
CODE | AMINO ACIDS | MOL. WT | CHROMOSOMAL LOCUS
Adp1sacce | 1049 | 117 231 | Chromosome 3
Browdrome | 675 | 75 943 |
Scrtdrome | 666 | 74 506 |
Whitanoal | 709 | 79 052 |
Whitdrome | 687 | 75 672 |
Whithomsa | 674 | 75 169 | 21q22.3
Whitmusmu | 666 | 74 032 |
EXPRESSION SITES: head; retina (the rows to which these two entries belong cannot be recovered from the layout)
Multiple amino acid sequence alignments
          1                                                     50
Adp1sacce MGSHRRYLYY SILSFLLLSC SVVLAKQDET PFFEGTSSKN SRLTAQDKGN
          51                                                   100
Adp1sacce DTCPPCFNCM LPIFECKQFS ECNSYTGRCE CIEGFAGDDC SLPLCGGLSP
          101                                                  150
Adp1sacce DESGNKDRPI RAQNDTCHCD NGWGGINCDV CQEDFVCDAF MPDPSIKGTC
          151                                                  200
Adp1sacce YKNGMIVDKV FSGCNVTNEK ILQILNGKIP QITFACDKPN QECNFQFWID
Whithomsa Whitanoal Whitdrome Scrtdrome Adplsacce Browdrome Consensus
201 250 ................................................ MA ........................................... MTINTDD ........................................... MGQEDQE .................................................. QLESFYCGLS DCAFEYDLEQ NTSHYKCNDV QCKCVPDTVL CGAKGSIDIS .................................................. ..................................................
Whithomsa Whitanoal Whitdrome Scrtdrome Adplsacce Browdrome Consensus
251 300 A F S V G T A M N A S S Y S A E M T E P KS . . . . . . . . . . . . . . VCVS V D E V V S S N M E Q Y A D G E S K T T I S S N R R Y S T S SF . . . . . . . . . . . . . . QDQS M E D D G I N A T L L L I R G G S K H P S A E H L N N G D S GA . . . . . . . . . . . . . . ASQS CINQGFGQ.. .MSDSDSKRI D V E A P E R V E Q HE . . . . . . . . . . . . . . LQVM P V G S T I E V P S DFLTETIKGP GDFSCDLETR QCKFSEPSMN DLILTVFGDP YITLKCESGE .................................................. ..................................................
Whithomsa Whitanoal Whitdrome Scrtdrome Adplsacce Browdrome Consensus
301 350 ATETDLLNGH LKKVDNNLTE AQRFSSLPRRAAVNIEFRDL S.YSVPEGPW TNDKATL.IQVWRPKSY... GSVKGQIPAQ DRLTYTWREI DVFGQAAIDG A K N Y G T L . L P PSPPEDS... GSGSGQLA.. E N L T Y A W H N M D I F G A V N Q P G LDSTPKL.SK RNSSERSLPL RSYSKWSPTE QGATLVWRDL CVYTNVGGSG CVHYSEIPGY KSPSKDPTVS WQGKLVLALT AVMVLALFTF ATFYISKSPL . . . . . . . . . . . . . . . . . . MQ E S G G S S G Q G G P S L C L E W K Q L N Y Y V P D Q E Q S ..................................................
Whithomsa Whitanoal Whitdrome Scrtdrome Adplsacce Browdrome Consensus
351 400 WRKKGY ............................................ KSREPLCSRL RHCFTRQRLV KDFNPR ........................ SGWRQLVNRT RGLFCNERHI PA..PR ........................ . . . . . . . . . . . . . . . . . . . . . . . QRM . . . . . . . . . . . . . . . . . . . . . . . . FRNGLGSSKS PIRLPDEDAV NNFLQNEDDT LATLSFENIT YSVPSINSDG ....................................... N YSFWNECRKK ..................................................
Whithomsa Whitanoal Whitdrome Scrtdrome Adplsacce Browdrome Consensus
401 ..KTLLKGIS ..KHLLKNVT ..KHLLKNVC ..KRIINNST VEETVLNEIS RELRILQDAS ..... L ....
Whithomsa Whitanoal Whitdrome Scrtdrome Adplsacce Browdrome Consensus
451 500 ..LINGLPRD L R C F R K V S C Y I M Q D D M L L P H L T V Q E A M M V S A H L K L Q . . . E IRTLNGVPVT AEQMRARCAY VQQDDLFIPS LTTKEHLMFQ AMLRMGRDVP MRLLNGQPVD AKEMQARCAY VQQDDLFIGS LTAREHLIFQAMVRMPRHLT L..INGRRIG PF.MHRNHGY VYQDDLFLGS VSVLEHLNFM AHLRLDRRVS SIKVNGISMD RKSFSKIIGF VDQDDFLLPT LTVFETVLNS ALLRLPKALS ..VLNGMAME R H Q M T R I S S F L P Q F E I N V K T F T A Y E H L Y F M S H F K M H R R T T .... NG . . . . . . . . . . . . . . . . QDD ...... T..E ..... A .........
GKFNSGELVA GVARSGELLA GVAYPGELLA GAIQPGTLMA GIVKPGQILA GHMKTGDLIA G .... G . L . A
IMGPSGAGKS VMGSSGAGKT VMGSSGAGKT LMGSSGSGKT IMGGSGAGKT ILGGSGAGKT .MG.SGAGKT
450 TLMNILAGYR ETGMKGAV.. TLLNELAFRS PPGVKISPNA TLLNALAFRS PQGIQVSPSG TLMSTLAFRQ PAGTVVQGDI TLLDILAMKR KTG...HVSG TLLAAISQRL RGNLTGDV.. T L . . . L A ..... G .......
Whithomsa Whitanoal Whitdrome Scrtdrome Adplsacce Browdrome Consensus
501 550 KDEGRREMVK EILTALGLLS CANTRTGS ...... LSGGQR KRLAIALELV ATPIKMHRVD EVLQELSLVK CADTIIGVAG RVKGLSGGER KRTAFRSETL YRQ.RVARVD QVIQELSLSK CQHTIIGVPG RVKGLSGGER KRLAFASEAL KEERRLI.IK ELLERTGLLSAAQTRIGSGD DKKVLSGGER KRLAFAVELL F.EAKKARVY KVLEELRIID IKDRIIG.NE FDRGISGGEK RRVSIACELV KAE.KRQRVA DLLLAVGLRDAAHTRI ...... QQLSGGER KRLSLAEELI ........ V . . . L . . . . L . . . . . T . I G . . . . . . . L S G G E R K R . . . A . E . .
Whithomsa Whitanoal Whitdrome Scrtdrome Adplsacce Browdrome Consensus
551 NNPPVMFFDE TDPHLLLCDE TDPPLLICDE NNPVILFCDE TSPLVLFLDE TDPIFLFCDE ..P..L..DE
PTSGLDSASC PTSSLDSFMA PTSGLDSFTA PTTGLDSYSA PTSGLDASNA PTTGLDSFSA PT.GLDS..A
FQVVSLMKGL QSVLQVLKGM HSVVQVLKKL QQLVATLYEL NNVIECLVRL YSVIKTLRHL ..V...L..L
600 A ................... A ................... S ................... A ................... S ................... CTRRRIAKHS LNQVYGEDSF ....................
601 650 Whithomsa ............................................... QG. Whitanoal ............................................... MK. Whitdrome ............................................... QK. Scrtdrome ............................................... QK. Adplsacce ............................................... SDY Browdrome ETPSGESSAS GSGSKSIEMEVVAESHESLL QTMRELPALG VLSNSPNGTH Consensus ..................................................
651 GRSIICTIHQ GKTIILTIHQ GKTVILTIHQ GTTILCTIHQ NRTLVLSIHQ KKAAICSIHQ ....... IHQ
700 PSAKLFELFD QLYVLSQGQC VYRGKVCNLV PYLRDLGLNC PSSELYCLFD RILLVAEG.V AFLGSPYQSA DFFSQLGIPC PSSELFELFD KILLMAEGRV AFLGTPSEAV DFFSYVGAQC PSSQLFDNFN NVMLLADGRV AFTGSPQHAL SFFANHGYYC PRSNIFYLFD KLVLLSKGEM VYSGNAKKVS EFLRNEGYIC PTSDIFELFT HIILMDGGRI VYQGRTEQAAKFFTDLGYEL P . S . . F . L F .... L . . . G . . . . . G . . . . . . . F .... G..C
701 750 W h i t h o m s a P T Y H N P A D F V M E V . . A S G E Y G D Q N . . . S R L V R A V R E G M C D SDH ....... W h i t a n o a l P P N Y N P A D F Y V Q M L A I A P N K E T E C . . . R E T I K K I C D S F A V SPI ....... W h i t d r o m e P T N Y N P A D F Y V Q V L A V V P G R E I E S . . . R D R I A K I C D N F A I SKV ....... S c r t d r o m e P E A Y N P A D F L I G V L A T D P G Y E Q A S . . . Q R S A Q H L C D Q F A V SSA ....... Adplsacce PDNYNIADYL IDITFEAG.. PQGK...RRR IRNISDLEAG TDTNDIDNTI B r o w d r o m e P L N C N P A D F Y L K T L A D K E G K E N A G A V L R A K Y E H E T D G L Y S GS ........ ConsensusP...NPADF .......................... D ..............
Whithomsa Whitanoal Whitdrome Scrtdrome Adplsacce Browdrome Consensus
Whithomsa Whitanoal W h i t d r ome Scrtdrome Adplsacce Br owdr ome Consensus
751 ........................................ ........................................ ........................................ ........................................ HQTTFTSSDG TTQREWAHLA AHRDEIRSLL RDEEDVEGTD ..................................... WLL .........................................
800 KRDLGGDAEV ARDI.. I E T A ARDM..EQLL AKQR..DMLV GRRGATEIDL ARSYSGDYLK R ........
801 85O N P F L W H R P S E E V K Q T K R L K G L R K D S S S M E G CHSF . . . . . . . . . . SASC.L S Q V N G D G G I E L T R T K H T T D P Y F L Q P M E G V D STGY . . . . . . . . . . R A . S W W A T K N L E K P L E . . . . . . . . . . . . . . . . QPEN GYTY KA TWF N ....... LE I H M A Q S G N F P F ...... DTE VESF . . . . . . . . . . R G V A W Y NTKLLHDKYK DSVYYAELSQ EIEEVLSEGD EESNVLNGDL PTGQQSAGFL HVQNFKK ....................................... IRWI ..................................................
Whithomsa Whitanoal Whitdrome Scrtdrome Adplsacce Browdrome Consensus
851 TQFCILFKRT FLSIMRDSVL THLRITSHIG IGLLIGLLYL TQFYCILWRS WLSVLKDPML VKVRLLQTAM VASLIGSIYF MQFRAVLWRS WLSVLKEPLL VKVRLIQTTM VAILIGLIFL KRFHVVWLRA IVTLLRDPTI QWLRFIQKIA MAFIIGACFA QQLSILNSRS FKNMYRNPKL LLGNYLLTIL LSLFLGTLYY YQVYLLMVRF MTEDLRNIRS GLIAFGFFMI TAVTLSLMYS .Q ...... R . . . . . . . . . . . . . . . . . . . . . . . . . . G ....
Whithomsa Whitanoal Whitdrome Scrtdrome Adplsacce Browdrome Consensus
901 950 LSNSGFLFFS MLFLMFAALM PTVLTFPLEM GVFLREHLNY WYSLKAYYLA MNINGSLFLF LTNMTFQNVF AVINVFSAEL PVFLREKRSR LYRVDTYFLG MNINGAIFLF LTNMTFQNVF ATINVFTSEL PVFMREARSR LYRCDTYFLG QAVQGALFIM ISENTYHPMY SVLNLFPQGF PLFMRETRSG LYSTGQYYAA QNRMGLFFFI LTYFGFVT.F TGLSSFALER IIFIKERSNN YYSPLAYYIS QDVGGSIFML SNEMIFTFSY GVTYIFPAAL PIIRREVGEG TYSLSAYYVA .... G..F . . . . . . . F . . . . . . . . . F . . . . . . F.RE ..... Y .... Y...
Whithomsa Whitanoal Whitdrome Scrtdrome Adplsacce Browdrome Consensus
951 KTMAD VPFQ I M F P V A Y C S I KTIAE.LPLF IAVPFVFTSI KTIAE.LPLF LTVPLVFTAI NILAL.LPGM IIEPLIFVII KIMSEVVPLRWPPILLSLI LVLS.FVPVA FFKGYVFLSV ....... P ..... P ..... I
Whithomsa Whitanoal Whitdrome Scrtdrome Adplsacce Browdrome Consensus
i001 LGLL.IGAAS TSLQVATFVG PVTAIPVLLF FGYL.ISCAS SSISMALSVG PPVVIPFLIF FGYL.ISCAS SSTSMALSVG PPVIIPFLLF CGCF F S T A F N S V P L A M A Y L V P L D Y I F M I T LEILTIGIIF EDLNNSIILS VLVLLGSLLF YGVF.LSSLF ESDKMASECA APFDLIFLIF .G . . . . . . . . . S...A . . . . . . . . . . . L.F
Whithomsa Whitanoal Whitdrome Scrtdrome Adplsacce Browdrome Consensus
1051 ISYVRYGFEG VILSIY.GLD REDLHCDIDE TCHFQKSEAI LSWFRYANEA LLINQWADHR DGEIGCTRAN VTCPASGEII LSWFRYANEG LLINQWADVE PGEISCTSSN TTCPSSGKVI LSWMLYANEA MTAAQWSGVQ NITCFQESAD LPCFHTGQDV FSVFYYAYES LLINEVKTLM LKERKYGLNI EV...PGATI LSLFFYSNEA LMYKFWIDID NIDCPVN.ED HPCIKTGVEV .S...Y..E . . . . . . . . . . . . . . . . . . . . . . . . . . . G...
Whithomsa Whitanoal Whitdrome Scrtdrome Adplsacce Browdrome Consensus
900 GIGNETKK.V G.QVLDQDGV G.QQLTQVGV GTTEPSQLGV NVSNDI.SGF GIGGLTQRTV G ........ V
i000 VYWMTSQPSD AVRFVLFAAL GTMTSLVAQS TYPMIGLKAAISHYLTTLFI VTLVANVSTS AYPMIGLRAG VLHFFNCLAL VTLVANVSTS CYWLTGLRST FYAFGVTAMCVVLVMNVATA VYPMTGLNMK DNAFFKCIGI LILF.NLGIS IYASIYYTRG FLLYLSMGFL MSLSAVAAVG .Y . . . . . . . . . . . . . . . . . . . . L ....... 1050 SGFFVSFDTI P.TYLQWMSY GGFFLNSASV P.AYFKYLSY GGFFLNSGSV P.VYLKWLSY SGIFIQVNSLP VAFWWTQF SGLFINTKNI TNVAFKYLKN G G T Y M N V D T V PG ..... LKY .G.F ...... P .........
ii00 LRELDVENAK LETFNFRVED LETLNFSAAD LDKYTFNESN LSTFGFVVQN LQQGSYRNAD L .........
          1101                              1135
Whithomsa ..LYLDFIVL GIFFISLRLI AYLVLRYKIR AER..
Whitanoal ..FALDIGCL FALIVLFRLG ALFCLWLRSR SKE..
Whitdrome ..LPLDYVGL AILIVSFRVL AYLALRLRAR RKE..
Scrtdrome ..VYRNLLAM VGLYFGFHLL GYYCLWRRAR KL...
Adp1sacce ..LVFDIKIL ALFNVVFLIM GYLALKWIVV EQK..
Browdrome YTYWLDCFSL VVVAVIFHIV SFGLVRRYIH RSGYY
Consensus .....D...L ......F... ....L..... .....
Residues listed in the consensus sequence are present in at least 75% of the aligned transporter sequences. Proteins listed subsequently in italics are at least 90% identical to the paired transporters listed in parentheses and are therefore not included in the alignment: Whitmusmu (Whithomsa). Residues indicated in boldface type are also conserved in at least one other family of the ABC transporter superfamily.

Database accession numbers
CODE | SWISSPROT | PIR | EMBL/GENBANK
Adp1sacce | P25371 | S19421; S40914 | X59720; G5381
Browdrome | P12428 | A31399; FYFFB | M20630; G157014
Scrtdrome | P45843 | | U39739; G1079665
Whitanoal | | | L76302
Whitdrome | P10090 | S07263; FYFFW | X51749; G8826
Whithomsa | P45844 | | X91249; E218444
Whitmusmu | | | U34920
References
1 Pepling, M. and Mount, S.M. (1990) Nucleic Acids Res. 18, 1633.
2 Dreesen, T.D. et al. (1988) Mol. Cell Biol. 8, 5206-5215.
3 Tearle, R.G. et al. (1989) Genetics 122, 595-606.
4 Chen, H. et al. (1996) Am. J. Hum. Genet. 59, 66-75.
5 Higgins, C.F. (1992) Annu. Rev. Cell Biol. 8, 67-113.
ABC 1 & 2 transporter family
Summary
Transporters of the ABC 1 & 2 family, the example of which is the novel mouse ATP binding protein ABC1 1 (Abc1musmu), are believed to act as transporters, although their natural substrate is unknown. The two known members of this family are found only in mammals.
Statistical analysis of multiple amino acid sequence comparisons places the ABC 1 & 2 transporter family in the multidrug resistance subdivision of the ATP binding cassette (ABC) superfamily 2. Proteins in this superfamily use the energy of ATP hydrolysis to pump substrates across cell membranes.
Transporters of the ABC 1 & 2 transporter family consist of a single polypeptide chain made up of four domains. The N- and C-terminal halves of the protein are homologous, and each half is made up of a transmembrane domain followed by an ATP binding domain. Hydropathy analysis of the amino acid sequences predicts that each transmembrane domain contains six membrane-spanning helices; the transmembrane domains may also be glycosylated.
Nomenclature, biological sources and substrates
CODE | DESCRIPTION [SYNONYMS] | ORGANISM [COMMON NAMES] | SUBSTRATE(S)
Abc1musmu | ATP binding cassette transporter 1 | Mus musculus [mouse] | Unknown
Abc2musmu | ATP binding cassette transporter 2 | Mus musculus [mouse] | Unknown
Proposed orientation of ABC1 in the membrane
The model is based on predictions of membrane-spanning regions and α-helical content. The N-terminus of the protein is illustrated on the inside and is folded twelve times through the membrane. The predicted membrane-spanning helices are portrayed as rectangles. The numbers corresponding to the first and last residue of each membrane-spanning helix are boxed.
[Figure not reproduced: membrane topology model of ABC1, with the OUTSIDE face of the membrane labelled.]