Molecular Systematics of Fishes
Edited by

Thomas D. Kocher
Department of Zoology, University of New Hampshire, Durham, New Hampshire

Carol A. Stepien
Department of Biology, Case Western Reserve University, Cleveland, Ohio

Academic Press
San Diego London Boston New York Sydney Tokyo Toronto
This book is printed on acid-free paper.

Copyright © 1997 by ACADEMIC PRESS
All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.

Academic Press
a division of Harcourt Brace & Company
525 B Street, Suite 1900, San Diego, California 92101-4495, USA
http://www.apnet.com

Academic Press Limited
24-28 Oval Road, London NW1 7DX, UK
http://www.hbuk.co.uk/ap/

Library of Congress Cataloging-in-Publication Data
Molecular systematics of fishes / edited by Thomas D. Kocher, Carol A. Stepien.
p. cm.
Includes bibliographical references and index.
ISBN 0-12-417540-6 (alk. paper)
1. Fishes--Phylogeny. 2. Fishes--Molecular aspects. I. Kocher, Thomas D. II. Stepien, Carol A.
QL618.2.M65 1997
597.13'8--dc21
96-49199 CIP

PRINTED IN THE UNITED STATES OF AMERICA
97 98 99 00 01 02 EB 9 8 7 6 5 4 3 2 1
Contents

Contributors ix
Preface xi

CHAPTER 1
Molecules and Morphology in Studies of Fish Evolution
Carol A. Stepien and Thomas D. Kocher
I. Introduction 1
II. History of Molecular Techniques 2
III. Controversy over Analytical Methods 5
IV. Achievements and Failures of Molecular Systematics 7
V. Eight Promising Directions for Future Research 8
VI. A New Age of Synthesis 9
References 9

CHAPTER 2
Base Substitution in Fish Mitochondrial DNA: Patterns and Rates
Thomas D. Kocher and Karen L. Carleton
I. Introduction 13
II. Simple Models of Substitution 13
III. Evolution of Real Sequences 15
IV. Implications for Phylogenetic Reconstruction 19
V. Conclusions 23
References 24

CHAPTER 3
Molecular Systematics of a Rapidly Evolving Species Flock: The mbuna of Lake Malawi and the Search for Phylogenetic Signal
Irv Kornfield and Alex Parker
I. Introduction 25
II. Molecular Investigations 26
III. Mitochondrial DNA and Ancestral Polymorphisms 26
IV. Alternate Molecular Approaches 27
V. Microsatellite Loci 28
VI. A Test of the Phylogenetic Potential of Microsatellites 29
VII. Materials and Methods 31
VIII. Results 32
IX. Discussion 33
X. Summary 35
References 35

CHAPTER 4
Reconstruction of Cichlid Phylogeny Using Nuclear DNA Markers
Holger Sültmann and Werner E. Mayer
I. Introduction 39
II. Methods Used for Reconstructing Cichlid Phylogeny 40
III. Random Amplification of Polymorphic DNA (RAPD) 41
IV. Allele Size Frequencies at Dinucleotide Microsatellite Loci 45
V. Critical Evaluation Using RAPD and Microsatellite Allele Frequencies for the Reconstruction of Cichlid Fish Phylogeny 47
References 49

CHAPTER 5
Biogeographic Analysis of Pacific Trout (Oncorhynchus mykiss) in California and Mexico Based on Mitochondrial DNA and Nuclear Microsatellites
Jennifer L. Nielsen, Monique C. Fountain and Jonathan M. Wright
I. Introduction 53
II. Materials and Methods 55
III. Results 57
IV. Discussion 64
References 66
Appendices 70

CHAPTER 6
Mitochondrial DNA Sequence Variation among the Sand Darters (Percidae: Teleostei)
E. O. Wiley and Robert H. Hagen
I. Introduction 75
II. Systematics of Sand Darters 76
III. Methods and Materials 78
IV. Results 78
V. Discussion 91
VI. Summary 94
References 94
Appendices 96

CHAPTER 7
Phylogeographic Patterns in Populations of Cichlid Fishes from Rocky Habitats in Lake Tanganyika
Christian Sturmbauer, Erik Verheyen, Lukas Rüber and Axel Meyer
I. Lake Tanganyika and Its Cichlid Species Flock 97
II. Speciation and DNA 98
III. From Patterns toward an Understanding of Processes 105
IV. Conclusions 109
References 109

CHAPTER 8
Fish Biogeography and Molecular Clocks: Perspectives from the Panamanian Isthmus
Eldredge Bermingham, S. Shawn McCafferty and Andrew P. Martin
I. Introduction 113
II. Temporal Scaling: The Panama Isthmus and Molecular Clocks 114
III. Geographic Scaling: The Panama Isthmus and Caribbean Fish 119
IV. Geographic Scaling: The Panama Isthmus and the Circumtropical Abudefduf (Teleostei: Pomacentridae) Species Group 121
V. Geographic Scaling: The Panama Isthmus and Neotropical Freshwater Fishes 123
VI. Concluding Remarks 125
References 126

CHAPTER 9
The Utility of Mitochondrial DNA Control Region Sequences for Analyzing Phylogenetic Relationships among Populations, Species, and Genera of the Percidae
Joseph E. Faber and Carol A. Stepien
I. Introduction 129
II. Materials and Methods 131
III. Results 133
IV. Discussion 137
V. Material Examined 140
References 140

CHAPTER 10
Phylogenetic Relationships among the Salmoninae Based on Nuclear and Mitochondrial DNA Sequences
Ruth B. Phillips and Todd H. Oakley
I. Introduction 145
II. Conclusions 158
References 159

CHAPTER 11
Combining Molecular and Morphological Data in Fish Systematics: Examples from the Cyprinodontiformes
Alex Parker
I. Introduction 163
II. Analysis of Combined Data: Justification 164
III. Analysis of Combined Data: Methods 165
IV. Consensus Approaches: Justification 166
V. Consensus Methods 166
VI. Analysis of Cyprinodontiform Data 167
VII. Methods 167
VIII. Results and Discussion 170
IX. Conclusions 181
References 182
Appendices 184

CHAPTER 12
Molecular Phylogeny of the Fundulidae (Teleostei, Cyprinodontiformes) Based on the Cytochrome b Gene
Giacomo Bernardi
I. Introduction 189
II. Morphology 190
III. Allozymes and DNA 191
IV. Fish Samples 191
V. DNA Sequences 191
VI. Phylogenetic Relationships 193
VII. Conclusion 195
References 195

CHAPTER 13
Interrelationships of Lamniform Sharks: Testing Phylogenetic Hypotheses with Sequence Data
Gavin J. P. Naylor, Andrew P. Martin, Erik G. Mattison and Wesley M. Brown
I. Introduction 199
II. Materials and Methods 202
III. Results and Discussion 204
References 216
Appendix 218

CHAPTER 14
Radiation of Characiform Fishes: Evidence from Mitochondrial and Nuclear DNA Sequences
Guillermo Ortí
I. Introduction 219
II. Materials and Methods 222
III. Results and Discussion 222
References 240
Appendix 242

CHAPTER 15
The Evolution of Blennioid Fishes Based on an Analysis of Mitochondrial 12S rDNA
Carol A. Stepien, Alison K. Dillon, Meriel J. Brooks, Kristen L. Chase and Allyson N. Hubers
I. Introduction 245
II. Materials and Methods 250
III. Results 253
IV. Discussion 258
V. Summary 267
References 268

CHAPTER 16
Major Histocompatibility Complex Genes in the Study of Fish Phylogeny
Jan Klein, Dagmar Klein, Felipe Figueroa, Akie Sato and Colm O'hUigin
I. Introduction 271
II. Major Histocompatibility Complex (Mhc) Structure and Function 271
III. Mhc as a Source of Systematic Information 273
IV. Sequences as a Source of Phylogenetic and Systematic Information 273
V. Cladistic Analysis with Macromutations 275
VI. Mhc Gene Frequencies in Populations Undergoing Adaptive Radiation 276
VII. Conclusion 279
References 281

CHAPTER 17
The Phylogenetic Utility of the Mitochondrial Cytochrome b Gene for Inferring Relationships among Actinopterygian Fishes
Charles Lydeard and Kevin J. Roe
I. Introduction 285
II. Materials and Methods 288
III. Results and Discussion 289
References 300

Taxonomic Index 305
Subject Index 311
Contributors
Numbers in parentheses indicate the pages on which the authors' contributions begin.
Eldredge Bermingham (113) Smithsonian Tropical Research Institute, Balboa, Republic of Panama.

Giacomo Bernardi (189) Department of Biology, University of California, Santa Cruz, Santa Cruz, California 95064.

Meriel J. Brooks (245) Department of Science, Notre Dame College, South Euclid, Ohio 44121.

Wesley M. Brown (199) Department of Biology, University of Michigan, Ann Arbor, Michigan 48109.

Karen L. Carleton (13) Department of Zoology, University of New Hampshire, Durham, New Hampshire 03824.

Kristen L. Chase (245) Department of Biology, Case Western Reserve University, Cleveland, Ohio 44106.

Alison K. Dillon (245) Department of Biology, Case Western Reserve University, Cleveland, Ohio 44106.

Joseph E. Faber (129) Department of Biology, Case Western Reserve University, Cleveland, Ohio 44106.

Felipe Figueroa (271) Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany.

Monique C. Fountain (53) USDA Forest Service, Pacific Southwest Research Station and Hopkins Marine Station, Department of Biology, Stanford University, Pacific Grove, California 93950.

Robert H. Hagen (75) Department of Entomology, University of Kansas, Lawrence, Kansas 66045.

Allyson N. Hubers (245) Department of Biology, Case Western Reserve University, Cleveland, Ohio 44106.

Dagmar Klein (271) Department of Microbiology and Immunology, University of Miami School of Medicine, Miami, Florida 33136.

Jan Klein (271) Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany and Department of Microbiology and Immunology, University of Miami School of Medicine, Miami, Florida 33136.

Thomas D. Kocher (1, 13) Department of Zoology, University of New Hampshire, Durham, New Hampshire 03824.

Irv Kornfield (25) Department of Zoology and School of Marine Sciences, University of Maine, Orono, Maine 04469.

Charles Lydeard (285) Aquatic Biology Program, University of Alabama, Department of Biological Sciences, Tuscaloosa, Alabama 35487.

Andrew P. Martin (113, 199) Smithsonian Tropical Research Institute, Balboa, Republic of Panama and Department of Biological Sciences, University of Nevada Las Vegas, Las Vegas, Nevada 89154.

Erik G. Mattison (199) Department of Biology, University of Michigan, Ann Arbor, Michigan 48109.

Werner E. Mayer (39) Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany.

S. Shawn McCafferty (113) Smithsonian Tropical Research Institute, Balboa, Republic of Panama.

Axel Meyer (97) Department of Ecology and Evolution, State University of New York at Stony Brook, Stony Brook, New York 11794.

Gavin J. P. Naylor (199) Department of Biology, Osborn Memorial Laboratory, Yale University, New Haven, Connecticut 06520.

Jennifer L. Nielsen (53) USDA Forest Service, Pacific Southwest Research Station and Hopkins Marine Station, Department of Biology, Stanford University, Pacific Grove, California 93950.

Todd H. Oakley (145) Department of Biological Sciences, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin 53201.

Colm O'hUigin (271) Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany.

Guillermo Ortí (219) Department of Genetics, University of Georgia, Athens, Georgia 30602.

Alex Parker (25, 163) Department of Zoology and School of Marine Sciences, University of Maine, Orono, Maine 04469.

Ruth B. Phillips (145) Department of Biological Sciences, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin 53201.

Kevin J. Roe (285) Aquatic Biology Program, University of Alabama, Department of Biological Sciences, Tuscaloosa, Alabama 35487.

Lukas Rüber (97) Zoological Museum of the University of Zürich, Switzerland.

Akie Sato (271) Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany.

Carol A. Stepien (1, 129, 245) Department of Biology, Case Western Reserve University, Cleveland, Ohio 44106.

Christian Sturmbauer (97) Department of Zoology, University of Innsbruck, A-6020 Innsbruck, Austria.

Holger Sültmann (39) Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany.

Erik Verheyen (97) Royal Belgium Institute of Natural Sciences, B-1000 Brussels, Belgium.

E. O. Wiley (75) Museum of Natural History and Department of Systematics and Ecology, University of Kansas, Lawrence, Kansas 66045.

Jonathan M. Wright (53) Marine Gene Probe Laboratory, Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4J1, Canada.
Preface

Fishes are the most diverse group of extant vertebrates, and yet our knowledge of the evolutionary relationships among them is largely incomplete. Over the past few years, molecular genetic methods, particularly PCR amplification and DNA sequencing, have become widely used to study the evolutionary history of fishes. Because of the strong tradition of morphological systematics of fishes, this group is uniquely suitable for testing and evaluating the efficacy of different approaches to elucidating the relationships among taxa.

This book surveys the use of these new methods at many taxonomic levels, from the structure of local populations to the relationships among the deepest branches of the piscine family tree. The authors bring a diversity of experience and approaches to their analyses, and the result is a collective evaluation of the utility of these techniques for understanding evolutionary patterns and processes. Although this book focuses on fishes, the conclusions should be broadly applicable to the molecular systematics of other groups.

We thank the authors for seeing this project through to completion. We are indebted to a host of anonymous individuals for constructive critical reviews of each chapter in manuscript form. In an increasingly busy world, it was a delight to see that many careful reviewers are still willing to take the time to coax a higher quality manuscript from their colleagues. In addition to these reviewers, Raymond R. Wilson, Joseph E. Faber, Allyson N. Hubers, Mark D. Chandler, Rachel A. Bartholomew, Rachael A. Callcut, and Gary R. Kutsikovich reviewed the entire volume at various stages. We owe special thanks to Rachel A. Bartholomew and Rachael A. Callcut for helping to prepare the indices, Karen L. Carleton for work on the references, and Craig Albertson for the artwork on the cover jacket.

Our work on the molecular systematics of fishes has been generously funded by grants from the National Science Foundation, the Alfred Sloan Foundation, the National Geographic Society, the National Research Council, and the NOAA Sea Grant Program. We especially thank our families and students for their patience and understanding during the many periods that our work has required us to be elsewhere--in body or in thought.

This volume is dedicated to our mentors (especially Richard Rosenblatt, David Hillis, Allan Wilson, and Jeff Mitton) who encouraged, critiqued, and shaped our ideas in molecular systematics. We hope that this volume will contribute to the preservation of fish species so that future generations will be able to wonder at the beauty and diversity of fishes in their natural habitats.

Thomas D. Kocher, University of New Hampshire
Carol A. Stepien, Case Western Reserve University
CHAPTER 1
Molecules and Morphology in Studies of Fish Evolution

CAROL A. STEPIEN
Department of Biology, Case Western Reserve University, Cleveland, Ohio 44106

THOMAS D. KOCHER
Department of Zoology, University of New Hampshire, Durham, New Hampshire 03824

I. Introduction
Fishes are the most diverse group of living vertebrates, with more than 24,600 extant species currently known (Nelson, 1994). For more than a century, systematists have sought to organize this diversity by studying aspects of their external and internal morphology. Their patient counting and dissection have achieved remarkable success in identifying groups of evolutionarily related species and provide the foundation and starting point for all current work on the systematics of fishes (for summaries of the present status of morphological systematics of fishes see Nelson, 1994; Stiassny et al., 1996).

The development of molecular techniques has helped invigorate studies of fish systematics. The methods developed for molecular systematics (Hillis et al., 1996; Ferraris and Palumbi, 1996) offer new suites of characters for analyzing relationships among fishes (Carvalho and Pitcher, 1995) and have been effectively applied from the level of populations to orders. It is hoped that this book illustrates the broad utility of molecular approaches for addressing fish systematic questions.

Morphological studies have been especially successful in defining species and in organizing these species into genera. These groupings have usually been confirmed when examined with molecular approaches. Molecular characters have revealed some cryptic species (reviewed by Avise, 1994) and identified some incorrectly split groups (e.g., species in the clinid kelpfish genus Gibbonsia by Stepien and Rosenblatt, 1991; Stepien et al., Chapter 15). In general, the overall concordance between morphological and molecular studies has been good. Testing for congruence of relationships derived from independent data sets is a particularly robust approach to systematic problems (Miyamoto and Fitch, 1995).

Although morphological studies have generally been successful in defining genera, it is rare to find studies which present a hypothesis of relationship above the level of the species comprising a genus, primarily due to a lack of congruence of characters. Fortunately, this is one of the strengths of molecular data, and inter- and intrageneric relationships are now being rapidly tested and elucidated. Molecular data are also the primary means used to assess the phylogeographic relationships among populations, examining questions of zoogeographic subdivision and relationships among areas (see Chapter 5 by Nielsen et al., Chapter 8 by Bermingham et al., and Chapter 9 by Faber and Stepien). Studies at these lower systematic levels are shedding more light on the mechanisms underlying the diversity of fishes.

Both morphological and molecular studies have had particular difficulty discerning higher-level relationships. In both types of data, the central problems are identifying homologous characters and finding a sufficient number of synapomorphies to identify lineages with statistical confidence. Although great strides have
been made in identifying appropriate molecules and refining analytical techniques, interpreting relationships among the deepest clades of the piscine phylogeny is still problematic.

This book is arranged in approximate order of the primary phylogenetic problems addressed, ranging from lower-level (relationships among populations and closely related species) to higher-level systematic questions. The first set of chapters focuses primarily on population- and species-level problems in relation to phylogeography and includes Chapter 3 by Kornfield and Parker (mbuna species flock), Chapter 4 by Sültmann and Mayer (cichlid adaptive radiation), Chapter 5 by Nielsen et al. (Pacific trout Oncorhynchus), Chapter 6 by Wiley and Hagen (sand darters Ammocrypta), Chapter 7 by Sturmbauer et al. (cichlids), and Chapter 8 by Bermingham et al. (biogeographic patterns involving fishes of the Panamanian Isthmus).

The next set of chapters addresses the resolution of DNA for testing middle-level systematic problems (species- through family-level questions) and for discriminating among morphology-based hypotheses, including Chapter 9 by Faber and Stepien (Percidae), Chapter 10 by Phillips and Oakley (Salmoninae), Chapter 11 by Parker (Cyprinodontiformes), and Chapter 12 by Bernardi (Fundulidae, Cyprinodontiformes).

The final set of chapters focuses on the resolving power of genes for higher-level systematic questions and on evaluating the level of maximum phylogenetic utility. These include Chapter 13 by Naylor et al. (lamniform sharks), Chapter 14 by Ortí (Characiformes), Chapter 15 by Stepien et al. (Blennioidei), Chapter 16 by Klein et al. (Cichlidae), and Chapter 17 by Lydeard and Roe (Actinopterygii).
II. History of Molecular Techniques

An increasingly sophisticated realm of techniques has been developed since the mid-1970s to study the molecular similarities of organisms. Although preceded by protein sequencing and immunology, the widespread use of molecular techniques in fish systematics really began with the discovery of allozyme polymorphisms.

A. Allozyme Studies

Allozyme/isozyme studies involve identifying protein polymorphisms by comparing their similarities and differences in net electric charge. Allozyme and isozyme studies have been one of the most popular approaches in examining population genetic and stock divergence questions in fishes. They have also been especially useful in identifying cryptic species and in testing biogeographic hypotheses. Allozyme/isozyme electrophoresis has the advantage of being relatively rapid, cost effective, and efficient. Another advantage is that the sampling is spread over a variety of presumably independent gene loci.

The chief disadvantage of using an allozyme approach is that bands (alleles) that have the same electric charge and migrate to the same point in the gel may not be homologous (i.e., evolutionary convergence). The scoring of gels is often somewhat subjective and bands are difficult to interpret when weak or close together. Variants have traditionally been assumed to be selectively neutral, enabling hypotheses of separation time to be tested. However, several studies have shown that some allozyme variants are not neutral markers and are under selection (Avise, 1994; Pogson et al., 1995; Powers and Shulte, 1996). Our view is that increasing evidence shows that most (if not all) "neutral" genetic markers, including allozymes, mtDNA, and microsatellites, are indeed subject to varying amounts of selective constraint. The possibility that loci are under selection does not eliminate their utility in systematics, however. For example, morphologists regularly utilize characters that are the products of selection.

In this volume, Nielsen et al. (Chapter 5; Salmonidae) and Stepien et al. (Chapter 15; Blennioidei) examine the congruence of hypotheses derived from allozyme data with other molecular data sets.

B. Mitochondrial DNA
The mitochondrial (mt) genome has many properties that make it useful for reconstructing recent phylogenetic history (reviewed by Wilson et al., 1985; Avise, 1994; Simon et al., 1994). The most important feature is its clonal inheritance. Fish mitochondrial genomes are haploid and apparently nonrecombining. The evolution of the molecule therefore corresponds exactly to the model of bifurcating evolutionary trees. Second, mtDNA evolves more quickly than most nuclear genes, allowing the identification of informative phylogenetic characters among even closely related species and populations.

Two other features of mtDNA are typically listed as advantages for phylogenetic analysis, but both need qualification. First, mtDNA is said to be maternally inherited. Although it is true that mtDNA is predominantly maternally inherited, several instances of heteroplasmy of distinct mitochondrial lineages suggest that this is not strictly or universally correct (Magoulas and Zouros, 1993). Second, it may no longer be appropriate to consider that substitutions in mtDNA accumulate according to a strictly neutral process.
Patterns of sequence differentiation suggest that selective sweeps may be common (Ballard and Kreitman, 1994), and laboratory experiments have suggested competitive differences among mitochondrial haplotypes (Hutter and Rand, 1995). Whether these departures from neutral evolution invalidate the concept of molecular clocks remains to be seen. Many studies of mtDNA have analyzed restriction fragment length polymorphisms (RFLPs). Whole mtDNA can be digested with specific endonucleases, and the products are then separated by size using gel electrophoresis. In the most comprehensive studies, restriction sites are mapped and their presence or absence (rather than mere sharing of fragment lengths) is scored (Dowling et al., 1990). RFLP studies have been a popular approach in quantifying the degree of divergence within and among populations. In applying this approach to species and higher-level systematic questions, the homology of restriction site characters becomes less certain. A better approach for these comparisons involves direct analysis of DNA sequences.
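The RFLP procedure described above can be illustrated with a small computational sketch. The Python example below is not taken from the studies cited. It assumes toy circular sequences with hypothetical names (hap_a, hap_b) that are the same length and differ only by base substitutions, and it uses the EcoRI recognition sequence GAATTC. It reports fragment lengths from a complete digest and a site presence/absence matrix of the kind scored in mapped-site studies. Because the offset of the cut within the recognition sequence is the same at every site, it does not change fragment lengths and is ignored here.

```python
def cut_positions(seq, site):
    """0-based start positions of every occurrence of `site` in a circular sequence."""
    n = len(seq)
    doubled = seq + seq[:len(site) - 1]  # wrap so sites spanning the origin are found
    return [i for i in range(n) if doubled.startswith(site, i)]


def fragment_lengths(seq, site):
    """Fragment sizes from a complete digest of a circular molecule, largest first."""
    cuts = cut_positions(seq, site)
    n = len(seq)
    if not cuts:
        return [n]  # an uncut circle stays as one molecule
    # Distance from each cut to the next one, wrapping around the circle.
    # A single cut gives distance 0 modulo n, which means one full-length fragment.
    sizes = [(cuts[(k + 1) % len(cuts)] - c) % n or n for k, c in enumerate(cuts)]
    return sorted(sizes, reverse=True)


def site_matrix(haplotypes, site):
    """Score each mapped site as present (1) or absent (0) in each haplotype."""
    cuts = {name: set(cut_positions(seq, site)) for name, seq in haplotypes.items()}
    positions = sorted(set().union(*cuts.values()))
    matrix = {name: [int(p in found) for p in positions] for name, found in cuts.items()}
    return positions, matrix


# Hypothetical 30-bp "genomes": hap_b has lost the second site by one substitution.
hap_a = "GAATTC" + "A" * 10 + "GAATTC" + "A" * 8
hap_b = "GAATTC" + "A" * 10 + "GAATTT" + "A" * 8

print(fragment_lengths(hap_a, "GAATTC"))  # [16, 14]
print(fragment_lengths(hap_b, "GAATTC"))  # [30]
print(site_matrix({"A": hap_a, "B": hap_b}, "GAATTC"))
```

The two haplotypes share no fragment lengths, yet the site matrix shows that they share one of two sites. This is the sense in which scoring mapped sites is more informative than comparing fragment patterns alone.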
C. Polymerase Chain Reaction and DNA Sequencing

Until the development of the polymerase chain reaction (PCR) (Saiki et al., 1988), sequencing of genes for phylogenetic analysis was rarely performed because of the huge investment required to clone homologous genes from multiple samples. The introduction of primer sequences with wide phylogenetic utility ("universal primers"; e.g., Kocher et al., 1989) allowed the rapid amplification of particular sequences from a large number of samples and helped create an explosion of studies using DNA sequences to examine phylogenetic questions.

DNA sequence data have a number of inherent advantages over other kinds of systematic data. First, an essentially unlimited number of sequence characters are potentially available. Fish genomes typically contain on the order of a billion nucleotide pairs, each of which is potentially informative for phylogenetic analysis. Second, these characters are useful for studying relationships among both close and distant relatives. Each gene, as well as individual sites within a gene, evolves at a unique rate because of variation in the level of functional constraint. Slowly evolving genes such as nuclear 18S rDNA may be useful for discerning relationships among highly divergent groups (Hillis and Dixon, 1991). More rapidly evolving areas, such as the mtDNA control region, may be useful for discerning lower-level systematic relationships, such as among populations and species, as shown for percid relationships in the study by Faber and Stepien (Chapter 9).

In coding regions, the variation in DNA sequences may be evaluated among first, second, and third codon positions and at the amino acid level in order to increase potential phylogenetic utility at higher systematic levels. The relative strength of the phylogenetic signal among codon positions and between the nucleotide and amino acid levels is critically evaluated by Naylor et al. (Chapter 13) and Lydeard and Roe (Chapter 17).
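Partitioning variation by codon position, as described above, amounts to tallying variable alignment columns according to their position within the reading frame. The following Python sketch is illustrative only. It assumes a gap-free alignment that starts at the first base of a codon, and the three short sequences are invented for the example rather than drawn from any chapter.

```python
def variable_sites_by_codon_position(aligned):
    """Count variable alignment columns at codon positions 1, 2, and 3.

    `aligned` is a list of equal-length coding sequences, in frame and without gaps.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    counts = {1: 0, 2: 0, 3: 0}
    for i in range(length):
        column = {seq[i] for seq in aligned}  # distinct bases observed at this site
        if len(column) > 1:
            counts[i % 3 + 1] += 1  # index 0 is codon position 1, and so on
    return counts


# Invented four-codon alignment for three taxa.
toy_alignment = [
    "ATGGCCCTAAAA",
    "ATGGCTCTGAAA",
    "ATGGCACTAAGA",
]
print(variable_sites_by_codon_position(toy_alignment))  # {1: 0, 2: 1, 3: 2}
```

In this toy alignment most of the variation falls at third positions, which is the pattern that motivates treating codon positions separately when comparing distantly related taxa.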
D. Mitochondrial DNA Sequence Regions

Mitochondrial DNA regions have been well studied in fishes, and knowledge of universal primer sequences (e.g., Kocher et al., 1989; Meyer et al., 1990; Simon et al., 1994; Palumbi, 1996) for amplification by PCR and sequencing has made them very accessible. As illustrated in this volume, they can be effectively used to address many different levels of taxonomic questions, depending on the region sequenced and the use of various correction factors for types and positions of substitutions. Silent sites of mitochondrial protein-coding genes and the nontranscribed control region are shown to be particularly useful for analyzing relationships of recently diverged taxa, such as among populations, species, and genera. In the case of higher-level systematic questions, silent sites and rapidly evolving regions may have experienced multiple substitutions, obscuring phylogenetic signal. At higher taxonomic levels, more slowly evolving regions, such as the 12S and 16S ribosomal RNA genes, may be useful. Alternatively, because substitutions in nonsynonymous nucleotide sites (which alter the encoded amino acids) occur more rarely, these changes may provide a higher signal/noise ratio for deep comparisons.

The sequence evolution of mtDNA has been relatively well studied in fishes. Base substitution events occur relatively rapidly. MtDNA structure, gene order, and secondary structure are largely conserved in fishes, as well as in other vertebrates. It is inherited as a single unit and thus has been characterized as sampling a single gene, which is a possible disadvantage that may particularly affect population genetic studies. Because the evolutionary history of a single gene can be different from the average history of an entire genome (discussed by Avise, 1994), caution must be used in interpreting mitochondrial gene trees as reflecting the history of populations.
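The distinction drawn above between silent and amino acid-altering substitutions can be made concrete by translating aligned codons and comparing the results. The Python sketch below is a simplified illustration, not a method used in this volume. It counts whole codons that differ silently or with a replacement, which is cruder than the per-site estimators used in formal analyses. It uses the vertebrate mitochondrial genetic code (NCBI translation table 2), and the two sequences are invented.

```python
BASES = "TCAG"
# Vertebrate mitochondrial code (NCBI translation table 2), with codons in the
# order TTT, TTC, TTA, TTG, TCT, ... ; "*" marks a stop codon.
VERT_MT = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODE = {
    a + b + c: VERT_MT[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_differences(seq1, seq2):
    """Return (silent, replacement) counts of differing codons in two aligned genes."""
    if len(seq1) != len(seq2) or len(seq1) % 3:
        raise ValueError("sequences must be in frame and of equal length")
    silent = replacement = 0
    for i in range(0, len(seq1), 3):
        codon1, codon2 = seq1[i:i + 3], seq2[i:i + 3]
        if codon1 == codon2:
            continue
        if CODE[codon1] == CODE[codon2]:
            silent += 1       # same amino acid: a silent (synonymous) difference
        else:
            replacement += 1  # different amino acid: a nonsynonymous difference
    return silent, replacement


# Invented example: three silent codon differences and one replacement (Gly to Glu).
seq_x = "ATGCTAGGATGA"
seq_y = "ATACTGGAATGG"
print(classify_codon_differences(seq_x, seq_y))  # (3, 1)
```

The choice of genetic code matters. Under the vertebrate mitochondrial code, ATA encodes methionine and TGA encodes tryptophan, so the first and last codon differences in this example are silent, whereas the standard nuclear code would score them differently.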
The cytochrome b gene is probably the best-studied mitochondrial gene in fishes (e.g., Kocher et al., 1989; Meyer et al., 1990; Carr and Marshall, 1991; Block et al., 1993; Zhu et al., 1994; Carr et al., 1995). Like most mitochondrially encoded proteins, it is a transmembrane protein important in the respiratory chain of cellular metabolism. Although it has been widely used, some have questioned the ability of this sequence (especially short subsets of the gene) to resolve phylogenies (Martin et al., 1990; Graybeal, 1993; Meyer, 1994).

In this volume, mtDNA sequences from the cytochrome b gene are used to analyze a variety of levels of relationships ranging from population genetics to higher-level systematics. For example, Bermingham et al. (Chapter 8) use cytochrome b data to assess population genetic and phylogeographic questions in tropical damselfishes of the Abudefduf saxatilis species group. Cytochrome b sequences are used to analyze relationships among species and groups of sand darters (family Percidae) (Wiley and Hagen, Chapter 6), among species of salmonids (Phillips and Oakley, Chapter 10), among members of the family Fundulidae (Cyprinodontiformes) (Bernardi, Chapter 12), and among lamniform sharks (Naylor et al., Chapter 13). At higher taxonomic levels, Lydeard and Roe (Chapter 17) test the use of cytochrome b to analyze relationships among actinopterygian fishes, revealing strong phylogenetic signal. By examining their data using different codon positions, Lydeard and Roe achieve greater utility at higher taxonomic levels than does Bernardi (Chapter 12).

Mitochondrial ribosomal genes (12S and 16S rDNA subunits) are often used to study more distantly related taxa. Substitutions in the small subunit (12S) accumulate relatively slowly, approximating the average for the entire mitochondrial genome, whereas those in the large subunit (16S) evolve even more slowly (Simon et al., 1994). The 12S rDNA gene is used by Stepien et al. (Chapter 15) to examine relationships among species, genera, tribes, families, and suborders of blenniiform fishes, showing strong utility at these different levels and congruence with morphology-based hypotheses.
Stepien (12S; Chapter 15), Orti (12S and 16S, characiform fishes; Chapter 14), and Parker (16S, Cyprinodontiformes; Chapter 11) evaluate differences in the amount of phylogenetic signal among stem and loop regions of the ribosomal genes, reporting a greater retention of the phylogenetic signal at higher taxonomic levels in the more slowly evolving stem regions and more useful characters at lower taxonomic levels in the more rapidly changing loop regions.
The mtDNA control region is involved in the control of mtDNA replication and RNA transcription. It is also called the displacement loop (D-loop) because one of the two strands of the helix is displaced by the synthesis of a new strand during replication. The highly variable left domain region has been thought to be largely selectively neutral, which may account for its very rapid rate of change. In fishes, the control region is usually long (e.g., 888 to 1223 bp in percids; Faber and Stepien, Chapter 9) and often contains tandemly repeated segments. There is a set of conserved sequence blocks that are probably involved in controlling mtDNA replication and transcription, which may be useful for some systematic studies (see Attardi, 1985; Lee et al., 1995; Faber and Stepien, Chapter 9). The highly variable control region has thus been a popular sequence for examining population structure and relationships among closely related species of fishes (e.g., Meyer et al., 1990; Arnason and Rand, 1992; Sturmbauer and Meyer, 1992, 1993; Brown et al., 1993; Stepien, 1995; Lee et al., 1995). In this volume, Sturmbauer et al. (Chapter 7) employ sequence data from the control region to address phylogenetic questions and models of adaptive radiation and biogeography of cichlid fishes in Lake Tanganyika, Africa. Nielsen et al. (Chapter 5) use control region variation to discern patterns of geographic structure in the Pacific trout Oncorhynchus mykiss. The utility of control region sequences for discerning higher-level relationships is critically evaluated by Phillips and Oakley (Chapter 10) and by Faber and Stepien (Chapter 9). Although some areas of this rapidly evolving sequence are alignable even among distantly related fishes (see Lee et al., 1995), the high rate of evolution of this sequence appears to preclude analyses beyond the level of closely related species and perhaps genera.
E. Nuclear DNA Sequences
Several nuclear DNA regions have been used to address systematic questions among fishes. One of these is the major histocompatibility complex (MHC) used by Klein et al. (Chapter 16) to examine evolutionary hypotheses of the haplochromine flock of cichlids in Lake Victoria, East Africa. MHC molecules are believed to play a central role in the vertebrate immune system by presenting peptides to T lymphocytes, thereby initiating immune response cascades. Because MHC molecules are well known due to their role in the immune system and are highly variable, they also offer a wealth of potential systematic information. There are two classes of MHC molecules (I and II), which each consist of two polypeptide chains (a and b), but differ in structure and function (Bjorkman and Parham, 1990). Klein et al. (Chapter 16) use examples from classes I and II to test phylogenetic utility among recently diverged fish species as well as at higher phylogenetic levels. They also address whether selection causes sequence and allele frequency convergence in MHC genes. Stepien et al. (Chapter 15) compare sequence-based trees of blennioid fishes derived from the nuclear internal transcribed spacer (ITS)-1 region of the ribosomal array (Stepien et al., 1993) with trees produced from mitochondrial 12S rDNA gene sequences. A much greater number of variable characters is obtained using mtDNA 12S gene than was found from the nuclear
ITS-1 region (Stepien et al., 1993), suggesting that nuclear ITS sequences are best used for studying deeper divergences. In contrast, Phillips and Oakley (Chapter 10) find nuclear rDNA spacers to be most useful at lower taxonomic levels (interspecific and subspecific levels). These results suggest that the ITS-1 region may evolve at different rates in different fish groups. Other chapters explore the utility of new genes for phylogenetic analysis. Parker (Chapter 11) tests the relative degree of phylogenetic signal among first, second, and third codon positions of the nuclear tyrosine kinase gene X-src sequences for resolving relationships among the cyprinodontid killifishes. Orti (Chapter 14) compares nuclear DNA sequences from the protein-coding gene ependymin (a major glycoprotein component of the extracellular fluid in the brain of fishes) with mitochondrial 12S and 16S rDNA sequences to test the evolution of characiform fishes at various hierarchical levels. Much work remains in identifying a standard set of nuclear genes for phylogenetic analysis of fishes.
F. Other Nuclear Techniques
The introduction of PCR opened other avenues for the analysis of genome sequences. We touch here on two popular methods: randomly amplified polymorphic DNAs (RAPDs) and microsatellite polymorphisms. The RAPD method primarily detects sequence changes within the annealing sites of PCR primers, resulting in the presence or absence of amplification products from a particular locus. RAPD polymorphisms usually have a pattern of dominant inheritance (Williams et al., 1990) and can be used to screen for differences among individuals, populations, and species. Sultmann and Mayer (Chapter 4) employ RAPDs to identify polymorphic loci in cichlid groups, followed by locus-specific DNA amplification and sequence determination of the fragments. In this way, they avoid problems with determining homology of fragments among species. They find a large number of insertions and deletions (some of which are species specific) that can be treated as characters along with nucleotide substitutions. Their phylogenies show considerable congruence with morphological hypotheses and other molecular studies. They conclude that RAPDs are able to detect polymorphisms among closely related taxonomic groups, ranging from populations to genera.
Microsatellite DNAs are highly variable, tandemly repeated DNA sequences with unit repeats one to six bases in length. Length polymorphisms arising from variation in the number of repeats are quantified by sizing PCR-amplified copies of the locus on a polyacrylamide gel. Microsatellites are abundantly distributed throughout the nuclear genome and are highly polymorphic. They follow a Mendelian codominant inheritance pattern. Microsatellites have been widely used to analyze mating systems and population genetic structure (Queller et al., 1993), although their pattern of mutation is still poorly understood (Jarne and Lagoda, 1996). In Chapter 5, Nielsen et al. examine the biogeographic variation of nuclear microsatellite repeats in Pacific trout, O. mykiss, in comparison with mtDNA control region sequences. Although their mtDNA data show significant latitudinal and longitudinal correlations, microsatellite data are only weakly associated with longitude (and not at all with latitude). These differences suggest that the evolutionary processes resulting in phylogeographic patterns of genetic variation differentially affect the mitochondrial and nuclear genomes. Kornfield and Parker (Chapter 3) test the utility of microsatellite loci for examining relationships within a rapidly evolving species flock (the mbuna of Lake Malawi), in comparison with results from allozyme, mtDNA RFLP, mtDNA sequence, nuclear DNA sequence, and RAPD data sets. They conclude that microsatellites are the first class of molecular markers to possess sufficient power to elucidate that level of evolutionary history. Sultmann and Mayer (Chapter 4) compare microsatellite allele size frequencies among cichlid species from Lake Victoria. In total, these results suggest that microsatellite loci are applicable to species- and population-level work in rapidly evolving groups, as exemplified by the adaptive radiations of the Cichlidae.
G. A Look to the Future
Although new kinds of polymorphisms will be identified as we come to understand the structure of genomes, there is some hope that the techniques used to study these polymorphisms have stabilized. Most investigators are now directly examining DNA sequence polymorphisms, the most fundamental unit of molecular variation. PCR and DNA sequencing will likely be the primary tools of molecular systematics in the foreseeable future. We anticipate that the major differences will be increases in length of sequence examined and the number of genetic loci scored.
III. Controversy over Analytical Methods
Systematic biology is well known for its vigorous and highly polarized methodological debates. Although much of the acrimony has subsided, strong proponents of distance and cladistic approaches remain. This polarization is strongly correlated with the type of data sets studied by individual scientists. Morphologists have generally rejected distance approaches. Molecular systematists appear relatively flexible in the approaches taken to recover phylogenetic relationships from their data and have found that the evolution of sequences is often most easily modeled with distance methods. Still, character-state analyses of molecular data abound, and we should be careful not to equate molecular studies with distance analyses or morphological studies with cladistic analyses.
A. Cladistic Approaches
The rise of cladistic methodology, as proposed by Hennig (1950, 1966) and popularized by Wiley (1981), has greatly contributed to the development of systematics from a collection of ad hoc procedures to a respectable science. Cladistics has markedly increased objectivity for interpreting the evolutionary history of characters and testing the relative strength of competing systematic hypotheses. This standard methodology has facilitated the comparison of hypotheses proposed by various investigators and support for different types of data sets. Examples of such comparisons occur in almost every chapter of this volume.
B. Distance Approaches
Along with the development of molecular techniques, such as allozyme-isozyme electrophoresis, emerged the use of genetic distances and clustering algorithms which describe the degree of similarity or genetic relatedness among pairs of taxa and summarize this information in a "tree." Distance methods differ from cladistics in that they reduce the difference among each pair of taxa to a single number. Some workers argue that distance methods lose information inherent in the character-state matrix. Others argue that distance methods allow the evolution of the sequence to be more easily modeled. This allows accurate correction for unobserved multiple substitutions (homoplasy) in sequence data that is not possible with other methods. Like character-state methods, distance methods can be bootstrapped to evaluate the internal consistency of data. Recent theoretical work has focused on the calculation of standard errors of distances and branch lengths. Most types of distance trees are constructed with branch lengths that are proportional to the amount of divergence, making it possible to estimate relative times of separation.
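The idea that a pair of taxa is reduced to a single number, and that this number can be bootstrapped, may be easier to see in a small sketch. The Python example below is only an illustration under stated assumptions: it uses the uncorrected proportion of differing sites as the distance, resamples alignment columns with replacement, and reports the spread of the resampled distances as a rough standard error. The function names and the two short sequences are hypothetical and do not come from any chapter in this volume; a full analysis would bootstrap the whole tree rather than one pairwise distance.

```python
import random


def p_distance(seq_a, seq_b):
    """Uncorrected distance: proportion of aligned sites that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def bootstrap_distances(seq_a, seq_b, replicates=1000, seed=0):
    """Resample alignment columns with replacement; recompute the distance."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(seq_a)
    results = []
    for _ in range(replicates):
        cols = [rng.randrange(n) for _ in range(n)]
        a = "".join(seq_a[i] for i in cols)
        b = "".join(seq_b[i] for i in cols)
        results.append(p_distance(a, b))
    return results


def standard_error(values):
    """Sample standard deviation of the bootstrap replicates."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return var ** 0.5


if __name__ == "__main__":
    # Hypothetical aligned sequences, for illustration only.
    taxon_1 = "ACGTACGTACGTACGTACGT"
    taxon_2 = "ACGTACGAACGTACTTACGT"
    reps = bootstrap_distances(taxon_1, taxon_2, replicates=500)
    print(p_distance(taxon_1, taxon_2), standard_error(reps))
```

With sequences as short as these the bootstrap spread is large, which is consistent with the concern cited earlier about estimating distances from short stretches of sequence.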
C. Distance Corrections, Weighting, and Clustering
Genetic distances may be corrected for the effects of multiple substitutions per site. Methods for correcting these include the Jukes-Cantor equation (Jukes and Cantor, 1969), which uses a Poisson model to calculate the probabilities of multiple substitutions, assuming equal probability of the type of substitution, no nucleotide bias (same proportions of G, A, T, and C), and that all sites along a sequence have an equal probability of change. Because some or all of these assumptions are violated by most DNA sequence data sets, additional correction factors are often used. The Kimura two-parameter method (Kimura, 1980) allows differential weighting of transition and transversion probabilities. Tamura and Nei's (1993) distance correction is based on the gamma distribution and corrects for nucleotide frequency differences, transition:transversion biases, and variation of substitution rate among different sites. Gamma distances are discussed at length by Kocher and Carleton (Chapter 2). Kumar et al. (1993) suggest that if various distance correction methods give similar results, then the simplest possible model should be used in order to minimize variance of the estimates. They suggest using the Jukes-Cantor or simple pairwise distances in cases when genetic distances are low, as long as substitution rates do not vary among lineages.
Differential weighting of characters has been widely discussed (Wheeler, 1986; Swofford et al., 1996). It is clear that data for different nucleotide positions in coding regions, i.e., first, second, and third codon positions, should be analyzed separately because of their distinct patterns of selective constraint. Weighting is a relatively crude way to correct for the variation in rate among sites in noncoding sequences, especially as the pattern of selective constraint for these sequences is poorly understood. Weighting has also been used to model the relative frequency of different types of nucleotide substitution in parsimony analyses (Fitch and Ye, 1991). 
The advantage of this approach relative to the use of an appropriate distance method is not clear. Clustering algorithms have greatly improved in recent years. Neighbor joining (Saitou and Nei, 1987) is a widely used distance clustering algorithm that allows unequal rates of divergences among lineages. It is no longer necessary (or desirable) to assume that rates of sequence change are constant throughout a phylogeny.
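The two simplest corrections named above can be written out in a few lines. The following Python sketch is offered as an illustration rather than as the procedure used in any chapter: it applies the standard Jukes-Cantor formula, d = -(3/4) ln(1 - 4p/3), and the standard Kimura two-parameter formula, d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where p is the overall proportion of differing sites, P the proportion of transitions, and Q the proportion of transversions. The function names are hypothetical, and the code assumes gap-free aligned sequences containing only A, C, G, and T.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def transition_transversion(seq_a, seq_b):
    """Return (P, Q): proportions of transitions and transversions.

    Assumes aligned, gap-free sequences over A, C, G, T only.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            continue
        pair = {a, b}
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    n = len(seq_a)
    return transitions / n, transversions / n


def jukes_cantor(p):
    """Jukes-Cantor corrected distance from the proportion of differences p."""
    if p >= 0.75:
        raise ValueError("p >= 0.75: sequences are saturated under this model")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_2p(P, Q):
    """Kimura two-parameter distance from transition (P) and transversion (Q)."""
    a = 1.0 - 2.0 * P - Q
    b = 1.0 - 2.0 * Q
    if a <= 0.0 or b <= 0.0:
        raise ValueError("distance undefined: too many differences")
    return -0.5 * math.log(a) - 0.25 * math.log(b)


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    P, Q = transition_transversion("AACCGGTTAACC", "GACTGGTAAACC")
    print(P, Q, jukes_cantor(P + Q), kimura_2p(P, Q))
```

When transitions make up exactly one-third of the observed differences, as the Jukes-Cantor model expects, the two corrections give the same value; they diverge as the transition bias grows.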
D. Molecular Clocks
Use of molecular characters has also been associated with the assumption of a "molecular clock," i.e., that mutations arise at relatively regular, predictable rates (Zuckerkandl and Pauling, 1962, 1965). Today, it is unlikely that any proponents of a universal clock that ticks at a regular rate across all taxa remain. Still, most workers accept the idea of local clocks, that is, that rates of evolution within a particular group are relatively similar. Clocks may be calibrated based on comparisons with taxa having known divergences, using well-corroborated geological events (such as the emergence of the Isthmus of Panama as a barrier between the Atlantic and Pacific aquatic fauna; see Vawter et al., 1980; Grant, 1987; Stepien and Rosenblatt, 1996; Chapter 8 by Bermingham et al.), or with the fossil record. Dating divergences to the fossil record is complicated by the fact that the actual divergence usually predates its first fossil appearance by an unknown amount of time. Problems with clock calibration are discussed by Bermingham et al. (Chapter 8) and by Stepien et al. (Chapter 15).
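The arithmetic behind a local clock calibration is simple enough to sketch. The Python example below assumes a strict clock within the group, so that a corrected pairwise distance d between two lineages separated for T million years implies a per-lineage rate of d / (2T). All numbers are placeholders chosen for illustration, including the separation date used for the calibration pair; none are taken from the chapters cited above, and the function names are hypothetical.

```python
def calibrate_rate(distance, divergence_time_my):
    """Per-lineage rate (substitutions per site per million years).

    The factor of 2 reflects change accumulating along both lineages
    since their separation.
    """
    if divergence_time_my <= 0:
        raise ValueError("divergence time must be positive")
    return distance / (2.0 * divergence_time_my)


def estimate_divergence_time(distance, rate):
    """Divergence time (million years) implied by a distance and a rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate)


if __name__ == "__main__":
    # Placeholder calibration: a species pair assumed to have been split
    # by a dated geological barrier (both values are illustrative only).
    calibration_distance = 0.06
    assumed_barrier_age_my = 3.0
    rate = calibrate_rate(calibration_distance, assumed_barrier_age_my)

    # Apply the local clock to another pair within the same group.
    other_distance = 0.10
    print(rate, estimate_divergence_time(other_distance, rate))
```

Because the estimate scales directly with the assumed calibration date, any uncertainty in that date (or a fossil that postdates the true divergence) carries straight through to every time inferred from the clock.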
E. Combining Data and Testing for Congruence
There are two primary schools of thought among systematic biologists regarding combining morphological and molecular data. The first is the "total evidence" approach (Mickevich and Johnson, 1976; Kluge and Wolf, 1993), which states that phylogenetic analysis should be performed on a combined data set using all possible evidence. The null hypothesis for this approach is that there are no significant differences or partitions within the data set, i.e., that there is only one evolutionary history for the clade in question. Huelsenbeck et al. (1996) raise the point that estimates from total evidence have less sampling error because separate analyses of data partitions are based on fewer characters. It is advocated that total evidence tests should examine whether different sets of data have significantly different signals and that these possible partitions should be tested against the combined data set (de Queiroz, 1993; Bull et al., 1993; Ballard, 1996).
The other school of thought states that data sets should be analyzed separately (see Bull et al., 1993; Miyamoto and Fitch, 1995). Relationships among taxa that are congruent in separate analyses are regarded as strongly supported. In other words, the congruence of data from separate sources (such as separate analyses using different genes, or between morphological and molecular data sets) indicates increased support that the relationships are likely to be true. Miyamoto and Fitch (1995) suggest that relationships among taxa that are supported by different independent data sets are particularly robust, equivalent to obtaining independent verification of an experimental hypothesis from a different experimental source. This independent type of verification may be lost in combining data sets. An explicit assessment of congruence versus total evidence approaches is presented in Chapter 11 by Parker. 
Parker analyzes problems in systematics of the Cyprinodontiformes by combining morphological characters from Parenti (1981, 1984) with molecular data, including the nuclear tyrosine kinase gene X-src (Meyer and Lydeard, 1993) and mitochondrial 16S rDNA sequences (Parker and Kornfield, 1995). He evaluates the methodology for combining data sets and comparing trees, including T-PTP (Faith, 1991) and bootstrap tests (Rodrigo et al., 1993). His conclusions argue for the utility of both combination and congruence approaches. Many of the authors in this volume compare taxonomic congruence between molecular-based and morphological-based hypotheses (e.g., Chapter 9 by Faber and Stepien, Chapter 10 by Phillips and Oakley, Chapter 12 by Bernardi, and Chapter 17 by Lydeard and Roe). Phillips and Oakley (Chapter 10) compare results from morphological and molecular studies of salmonid relationships and conclude that morphological traits suggesting one clade are unreliable. Bernardi (Chapter 12) discerns considerable concordance between molecular data and the definition of subgenera, but is unable to resolve higher-level relationships within the family. Lydeard and Roe (Chapter 17) also find greatest concordance of the two types of data at the lowest levels of the taxonomic hierarchy.
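The congruence approach described in this section amounts to asking which groupings appear in trees obtained from separate data sets. The Python sketch below shows one simple way to do that for rooted trees written as nested tuples: it lists the clades in each tree and reports those found in both. This is an illustration only, not the T-PTP or bootstrap tests cited above; the taxon labels, the two trees, and the function names are hypothetical.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree.

    The tree is a nested tuple whose leaves are taxon names (strings).
    Each clade is a frozenset of taxon names; the clade containing all
    taxa is omitted because every tree shares it trivially.
    """
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset()
        for child in node:
            members |= visit(child)
        found.add(members)
        return members

    all_taxa = visit(tree)
    found.discard(all_taxa)
    return found


def congruent_clades(tree_a, tree_b):
    """Clades recovered by both trees (e.g., from two independent data sets)."""
    return clades(tree_a) & clades(tree_b)


if __name__ == "__main__":
    # Hypothetical trees from a morphological and a molecular data set.
    morphological = ((("A", "B"), "C"), ("D", "E"))
    molecular = ((("A", "B"), ("C", "D")), "E")
    for clade in congruent_clades(morphological, molecular):
        print(sorted(clade))
```

In this toy case only the pairing of A with B is recovered by both trees, which is the kind of independently corroborated grouping that Miyamoto and Fitch (1995) regard as particularly robust.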
IV. Achievements and Failures of Molecular Systematics
The greatest achievement of molecular systematics is the consistent and large set of characters generated for the analysis of phylogenies. The availability of these data has allowed the resolution of many intrageneric phylogenies that had not been previously addressed. Molecular studies have been spectacularly successful at the lowest taxonomic levels, particularly the analysis of relationships among populations or intraspecific phylogeography (Avise et al., 1987; see Chapters 3 through 9 of this volume). Molecular data offer an abundance of characters for studies at this level.
Molecular studies have not yet fulfilled their promise for resolving deep relationships. There are two problems holding up progress in this area. First, it can become difficult to identify homology in highly diverged sequences. Alignment of characters becomes more difficult as the sequences diverge, particularly for hypervariable regions of rDNA genes. Hillis and Dixon (1991) have suggested that rDNA sequences beyond about 30% sequence difference should be discarded as unalignable. A better understanding of the relationship between rRNA structure and function would help in the identification of homologous sites.
The second problem is "saturation": the equilibrium value of sequence difference that is reached when multiple substitutions erase the record of previous substitutions at a site. For DNA sequence data there are only four nucleotide character states, G, A, T, and C; thus base substitutions at single nucleotide sites are often obscured by multiple substitutions at the same site (multiple hits). As with morphological data sets, apparent synapomorphies may be the result of homoplastic convergence rather than shared common ancestry. Saturation is apparent in many molecular systematic studies. Claims that a group of taxa radiated rapidly at some time in the past should be scrutinized. It may be that molecular data are saturated and therefore uninformative as to the timing of particular branching events. This problem may be lessened either by examining more slowly evolving sites or by considering the codon as the character (rather than the individual nucleotides; Goldman and Yang, 1994; see Chapter 13 by Naylor et al. and Chapter 17 by Lydeard and Roe). Further studies of mutational processes, and the selective forces underlying variation in rate among sites, are needed. Alternatively, new kinds of data, such as the analysis of positional data, may be needed. Patterns of SINE insertion (Murata et al., 1993) or the order of homologous loci (Boore et al., 1995) provide another approach for resolving deep relationships.
Molecular studies have also failed to resolve the phylogeny of some rapidly speciating groups. Even an accurate phylogeny of a gene may not be informative as to the relationships of the species under study. If the gene pools are isolated more rapidly than polymorphisms can be fixed in a lineage, then the reconstructed gene trees may not parallel the evolution of the species (Moran and Kornfield, 1993; Parker and Kornfield, 1997; Chapter 3 by Kornfield and Parker; Chapter 7 by Sturmbauer et al.). Instead, the polymorphisms may be carried through the speciation event and be randomly fixed in the descendant populations (see discussion by Avise, 1994). 
The solution of this problem may require brute force; the construction of many independent gene trees may uncover the relationships among populations.
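The saturation problem described earlier in this section can be made concrete with the Jukes-Cantor model, used here purely as a convenient assumption. Under that model the expected proportion of sites that differ between two sequences is p = (3/4)(1 - exp(-4d/3)), where d is the true number of substitutions per site separating them. The Python sketch below tabulates p for a range of d; the function names and the chosen values of d are illustrative only.

```python
import math


def expected_p_distance(d):
    """Expected observed proportion of differing sites under Jukes-Cantor.

    d is the true number of substitutions per site between two sequences.
    """
    if d < 0:
        raise ValueError("d must be non-negative")
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def saturation_table(true_distances):
    """Pair each true distance with the difference expected to be observed."""
    return [(d, expected_p_distance(d)) for d in true_distances]


if __name__ == "__main__":
    for d, p in saturation_table([0.05, 0.1, 0.5, 1.0, 2.0, 5.0]):
        print(f"true d = {d:4.2f}   expected observed p = {p:.3f}")
```

At small d the observed difference tracks the true amount of change closely, but as d grows the observed value levels off below 0.75, so very different amounts of evolution become indistinguishable. This is the plateau that makes saturated data uninformative about the timing of deep branching events.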
V. Eight Promising Directions for Future Research
Molecular systematists have been working with DNA sequences for most of the last decade. The basic techniques of PCR and DNA sequencing are firmly established, but how will they be applied in the future? The following areas of molecular systematics may prove especially rewarding.
1. Integration of Intraspecific Biogeographic Patterns with Studies of Speciation
The study of the phylogenetic histories of populations in relation to biogeography has been termed "intraspecific phylogeography" (Avise et al., 1987). Several chapters in this volume specifically address testing these types of phylogeographic questions using fishes. Specifically, Wiley and Hagen (Chapter 6) test geographic distribution and likely histories of vicariance in a southeastern United States percid group, the sand darters. Faber and Stepien (Chapter 9) test for geographic relationship among spawning populations of walleye, Stizostedion vitreum, addressing whether gene flow is decreased due to natal homing. The evolution of species flocks, models of adaptive radiation, and biogeographic barriers are tested by Sturmbauer et al. (Chapter 7) for the cichlids of Lake Tanganyika, Africa. In studies of Panamanian freshwater fishes, Bermingham et al. (Chapter 8) describe very high levels of genetic divergence among populations, postulating that very high levels of phylogeographic structuring may be common in species exhibiting distributions that span large distances across physically isolated drainages. These studies are beginning to shed light on the role of geographic processes in speciation.
2. Reconstruction of Phylogenies among Congeners
The now standard methodology of sequencing short stretches of the mitochondrial genome will continue to bear fruit in the analysis of relationships within genera. As outlined by Kocher and Carleton in Chapter 2, these efforts will be most successful for divergences within the last 5 million years. The steady accumulation of these sequences will allow the construction of intrageneric phylogenies for many groups of fishes and will lay the groundwork for studies attempting to understand relationships further back in time.
3. Reconstruction of Higher-Level Relationships Using Longer Sequences
Continuing advances in DNA sequencing technology suggest that it will be practical to analyze increasingly longer segments of DNA. Up to a point, longer sequences will allow the resolution of more ancient divergences. Hillis (1996) has suggested that sequences only 5000 bp long may be sufficient to accurately reconstruct even complex phylogenies. This seems a good intermediate goal, although additional complete mitochondrial sequences and many more nuclear sequences would be useful for some questions.
4. Analysis of Developmental Homologies at the Molecular Level
Developmental biologists are beginning to focus on the analysis of fish development. A recent mutant hunt resulted in the isolation of more than 1500 mutations affecting development of the zebrafish (Haffter et al., 1996; Driever et al., 1996). We suspect that the genetic basis for many morphological differences will be revealed in the near future. Although the impact on the systematics of fishes is difficult to predict, the elucidation of molecular mechanisms generating morphological differences is sure to have an impact on the analysis of such characters. Where it is possible to cross species, it may be possible to identify the number of genes responsible for morphological differences (e.g., Doebley, 1992), quantifying for the first time the number of characters scored in morphological analyses.
5. Interpretation of Hybridization and Species Boundaries Using Abundant Nuclear Markers
Habitat disturbance and continued introductions of exotic species will create new opportunities for the hybridization of species. The analysis of introgression in such hybrid swarms will be facilitated by the abundance of new genetic markers now available. Where the taxonomy of natural species has been in debate, these markers will provide new data on the extent of differentiation across the whole genome. The analysis of hybrids may also shed light on selective constraints and the interaction of genes (Kilpatrick and Rand, 1995; Rieseberg et al., 1996).
6. Analysis of the Evolution of Repetitive DNA Families
Although most systematic analyses have focused on sequence variation in single-copy genes, there is some indication that repetitive DNA families offer new and useful tools for identifying relationships (Franck et al., 1994; Elder and Turner, 1994). Sequence variation in tandem and dispersed repetitive DNA may provide new insights in some groups.
7. Studies of the Molecular Clock in Fishes
The mechanisms governing the speed and regularity of molecular clocks are poorly understood. The great diversity of habitat and life history among fishes, coupled with their excellent fossil record, makes this an excellent group with which to study molecular clocks. New insights will arise as rigorous accountings of substitution rate are made in groups of fishes varying in population size, environment, and life history.
8. Genomic Organization
The increasing availability of genome maps, and even complete DNA sequences, is creating opportunities for the analysis of new characters. For example, Boore et al. (1995) used the pattern of gene arrangements in arthropod mtDNA to study arthropod relationships. O'Brien et al. (1993) proposed the use of a standard set of reference loci in the analysis of genomes, which would make it easy to identify such rearrangements in the nuclear genome. These types of characters may offer the best hope for resolving relationships among ancient lineages and need to be comprehensively addressed in fishes.
VI. A New Age of Synthesis
Although morphological and molecular traditions have frequently collided in the past, we argue for a more synergistic approach that recognizes the peculiarities and limitations of each kind of data and in which there is an interplay between morphological and molecular studies. All inherited morphological characters have their origin in molecular characters. A record of the history of evolutionary change can be found in both the structure and the genes of organisms. At this point, analytical methods are rapidly increasing in sophistication, enabling us to better quantify rates of evolution and constraints on molecular changes through time. This understanding will lead to more accurate and consistent phylogenetic analyses. When combined with traditional approaches, these data promise to reveal much about the evolutionary forces that have produced the great diversity of modern fishes. This volume illustrates the beginning stages of this process, which is sweeping the field of fish systematics and paving the way to a new understanding of the interplay of genes, development, and selection. This new age of synthesis promises to continue to revolutionize systematics in the 21st century.
References
Arnason, E., and Rand, D. M. 1992. Heteroplasmy of short tandem repeats in mitochondrial DNA of Atlantic cod, Gadus morhua. Genetics 132:211-220.
Attardi, G. 1985. Animal mitochondrial DNA: An extreme example of genetic economy. Int. Rev. Cytol. 93:93-145.
Avise, J. C. 1994. "Molecular Markers, Natural History, and Evolution." Chapman and Hall, New York.
Avise, J. C., Arnold, J., Ball, R. M., Bermingham, E., Lamb, T., Neigel, J. E., Reeb, C. A., and Saunders, N. C. 1987. Intraspecific phylogeography: The mitochondrial DNA bridge between population genetics and systematics. Annu. Rev. Ecol. Syst. 18:489-522.
Ballard, J. W. O., and Kreitman, M. 1994. Unraveling selection in the mitochondrial genome of Drosophila. Genetics 138: 757-772. Ballard, J. W. O. 1996. Combining data in phylogenetic analysis. Trends Ecol. Evol. 11:334. Bjorkman, P.J., and Parham, P. 1990. Structure, function, and diversity of class I major histocompatibility complex molecules. Annu. Rev. Biochem. 59:253-288. Block, B. B., Finnerty, J. R., Stewart, A. F. R., and Kidd, J. 1993. Evolution of endothermy in fish: Mapping physiological traits on a molecular phylogeny. Science 260:210- 214. Boore, J. L., Collins, T. M., Stanton, D., Daehler, L. L., and Brown, W. M. 1995. Deducing the pattern of arthropod phylogeny from mitochondrial DNA rearrangements. Nature 376:163-165. Brown, J. R., Beckenbach, A. T., and Smith, M. J. 1993. Intraspecific DNA sequence variation of the mitochondrial control region of white sturgeon (Acipenser transmontanus). Mol. Biol. Evol. 10: 326-341. Bull, J. J., Huelsenbeck, J. P., Cunningham, C. W., Swofford, D. L., and Waddell, P. J. 1993. Partitioning and combining data in phylogenetic analysis. Syst. Biol. 42:384-397. Carr, S. M., and Marshall, H. D. 1991. Detection of intraspecific DNA sequence variation in the mitochondrial cytochrome b gene of Atlantic cod (Gadus morhua) by the polymerase chain reaction. Can. J. Fish. Aquat. Sci. 48:48-52. Carr, S. M., Snellen, A. J., Howse, K. A., and Wroblewski, J.S. 1995. Mitochondrial DNA sequence variation and genetic stock structure of Atlantic cod (Gadus morhua) from bay and ofshore locations on the Newfoundland continental shelf. Mol. Ecol. 4:79-88. Carvalho, G. R., and Pitcher, T. J. (eds.) 1995. "Molecular Genetics in Fisheries." Chapman and Hall, New York. de Queiroz, A. 1993. For consensus (sometimes). Syst. Biol. 42: 368-372. Doebley, J. 1992. Mapping the genes that made maize. Trends Genet. 8: 302- 307. Dowling, T. E., Moritz, C., and Palmer, J.D. 1990. Nucleic acids. II. Restriction site analysis. 
In "Molecular Systematics" (D. M. Hillis and C. Moritz, eds.), pp. 250-317. Sinauer Associates, Sunderland, MA. Driever, W., Solnica-Krezel, L., Schier, A. F., Neuhauss, S. C. E, Malicki, J., Stemple, D. L., Stainier, D. Y. R., Zwartkruis, F., Abdelilah, S., Rangini, Z., Belak, J. and Boggs, C. 1996. A genetic screen for mutations affecting embryogenesis in zebrafish. Development 123: 37-46. Elder, J. F., Jr., and Turner, B. J. 1994. Concerted evolution at the population level: Pupfish HindIII satellite DNA sequences. Proc. Nat. Acad. Science USA 91:994-998. Faith, D. P. 1991. Cladistic permutation tests for monophyly and nonmonophyly. Syst. Zool. 40:366-375. Ferraris, J. D., and Palumbi, S. R. (eds.) 1996. "Molecular Zoology." Wiley-Liss, New York. Fitch, W. M., and Ye, J. 1991. Weighted parsimony: Does it work? In "Phylogenetic Analysis of DNA Sequences" (M. M. Miyamoto and J. Cracraft, eds.), pp. 147-154. Oxford University Press, New York. Franck, J. P. C., Kornfield, I., and Wright, J. M. 1994. The utility of SATA Satellite DNA sequences for inferring phylogenetic relationships among the three major genera of tilapiine cichlid fishes. Molec. Phyl. Evol. 3:10-16. Goldman, N., and Yang, Z. 1994. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 11: 725- 736. Grant, W. S. 1987. Genetic divergence between congeneric Atlantic and Pacific Ocean fishes, In "Population Genetics and Fishery Management" (N. Ryman and F. Utter, eds.), pp. 225-246. Univ. Washington Press, Seattle, WA.
Graybeal, A. 1993. The phylogenetic utility of cytochrome b: Lessons from bufonid frogs. Mol. Phylogenet. Evol. 2:256-269. Haffter, P., Granato, M., Brand, M., Mullins, M. C., Hammerschmidt, M., Kane, D. A., Odenthal, J., Van Eeden, F. J. M., Jiang, Y.-J., Heisenberg, C.-P., Kelsh, R. N., Furutani-Seiki, M., Vogelsang, E., Beuchle, D., Schach, U., Fabian, C., and Nüsslein-Volhard, C. 1996. The identification of genes with unique and essential function in the development of the zebrafish, Danio rerio. Development 123:1-36. Hennig, W. 1950. "Grundzuege einer Theorie der phylogenetischen Systematik." Deutscher Zentralverlag, Berlin. Hennig, W. 1966. "Phylogenetic Systematics." University of Illinois Press, Urbana, IL. Hillis, D. M., and Dixon, M. T. 1991. Ribosomal DNA: Molecular evolution and phylogenetic inference. Quart. Rev. Biol. 66:411-453. Hillis, D. M. 1996. Inferring complex phylogenies. Nature 383:130-131. Hillis, D. M., Moritz, C., and Mable, B. K. (eds.) "Molecular Systematics," 2nd ed. Sinauer Assoc., Sunderland, Massachusetts. Huelsenbeck, J. P., Bull, J. J., and Cunningham, C. W. 1996. Combining data in phylogenetic analysis. Trends Ecol. Evol. 11(4):152-158. Hutter, C. M., and Rand, D. M. 1995. Competition between mitochondrial haplotypes in distinct nuclear genetic environments: Drosophila pseudoobscura vs. D. persimilis. Genetics 140(2):537-548. Jarne, P., and Lagoda, P. J. L. 1996. Microsatellites, from molecules to populations and back. Trends Ecol. Evol. 11(10):424-429. Jukes, T. H., and Cantor, C. R. 1969. Evolution of protein molecules. In "Mammalian Protein Metabolism" (H. N. Munro, ed.), pp. 21-132. Academic Press, New York. Kilpatrick, S. T., and Rand, D. M. 1995. Conditional hitchhiking of mitochondrial DNA: Frequency shifts of Drosophila melanogaster mtDNA variants depend on nuclear genetic background. Genetics 141:1113-1124. Kimura, M. 1980. 
A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120. Kluge, A. G., and Wolf, A. J. 1993. Cladistics: What's in a word? Cladistics 9:183-199. Kocher, T. D., Thomas, W. K., Meyer, A., Edwards, S. V., Pääbo, S., Villablanca, F. X., and Wilson, A. C. 1989. Dynamics of mtDNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86:6196-6200. Kumar, S., Tamura, K., and Nei, M. 1993. "MEGA: Molecular Evolutionary Genetics Analysis, Version 1.0." Pennsylvania State University, University Park, PA. Lee, W., Conroy, J., Howell, W. H., and Kocher, T. D. 1995. Structure and evolution of teleost mitochondrial control regions. J. Mol. Evol. 41:54-66. Magoulas, A., and Zouros, E. 1993. Restriction-site heteroplasmy in anchovy (Engraulis encrasicolus) indicates incidental biparental inheritance of mitochondrial DNA. Mol. Biol. Evol. 10(2):319-325. Martin, A. P., Kessing, B. D., and Palumbi, S. R. 1990. Accuracy of estimating genetic distance between species from short sequences of mitochondrial DNA. Mol. Biol. Evol. 7:485-488. Meyer, A. 1994. Shortcomings of the cytochrome b gene as a molecular marker. Trends Ecol. Evol. 9:278-280. Meyer, A., Kocher, T. D., Basasibwaki, P., and Wilson, A. C. 1990. Monophyletic origin of Lake Victoria cichlid fishes suggested by mitochondrial DNA sequences. Nature 347:550-553. Meyer, A., and Lydeard, C. 1993. The evolution of copulatory organs, internal fertilization, placentas, and viviparity in killifishes (Cyprinodontiformes), as inferred from a DNA phylogeny of the tyrosine kinase gene X-src. Proc. Royal Soc. Lond. B 254:153-162.
Mickevich, M. F., and Johnson, M. S. 1976. Congruence between morphological and allozyme data in evolutionary inference and character evolution. Syst. Zool. 25:260-270. Miyamoto, M. M., and Fitch, W. M. 1995. Testing species phylogenies and phylogenetic methods with congruence. Syst. Biol. 44: 64-76. Moran, P., and Kornfield, I. 1993. Were population bottlenecks associated with the radiation of the Mbuna species flock (Teleostei: Cichlidae) of Lake Malawi. Mol. Biol. Evol. 10:1015-1029. Murata, S., Takasaki, N., Saitoh, M., and Okada, N. 1993. Determination of the phylogenetic relationships among Pacific salmonids by using short interspersed elements (SINEs) as temporal landmarks of evolution. Proc. Natl. Acad. Sci. USA 90:6995-6999. Nelson, J. S. 1994. "Fishes of the World," 3rd. ed. Wiley, New York. O'Brien, S. J., Womack, J. E., Lyons, L. A., Moore, K. J., Jenkins, N. A., and Copeland, N. G. 1993. Anchored reference loci for comparative genome mapping in mammals. Nat. Genet. 3:103-112. Palumbi, S. R. 1996. Nucleic acids II. The polymerase chain reaction. In "Molecular Systematics" (D. M. Hillis, C. Moritz, and B. K. Mable, eds.), pp. 205-221. Sinauer Assoc., Sunderland, MA. Parenti, L. R. 1981. A phylogenetic and biogeographic analysis of cyprinodontiform fishes. Bull. Am. Mus. Nat. Hist. 1658:341-557. Parenti, L. R. 1984. A taxonomic revision of the Andean killifish genus Orestias. Bull. Am. Mus. Nat. Hist. 178:110-214. Parker, A., and Kornfield, I. 1995. A molecular perspective on evolution and zoogeography of cyprinodontid killifishes. Copeia 1995:8-21. Parker, A. and Kornfield, I. 1997. Evolution of the mitochondrial DNA control region in the mbuna (Cichlidae) species flock of Lake Malawi, East Africa. J. Mol. Evol. in press. Pogson, G. H., Mesa, K. A., and Boutilier, R. G. 1995. Genetic population structure and gene flow in the Atlantic cod Gadus morhua: A comparison of allozyme and nuclear RFLP loci. Genetics 139: 375-385. Powers, D. A., and Schulte, P. M. 
1996. A molecular approach to the selectionist/neutralist controversy. In "Molecular Zoology" (J. D. Ferraris and S. R. Palumbi eds.), pp. 327-352. Wiley-Liss, New York. Queller, D. C., Strassmann, J. E., and Hughes, C. R. 1993. Microsatellites and kinship. Trends Ecol. Evol. 8:285-288. Rieseberg, L. H., Sinervo, B., Linder, C. R., Ungerer, M. C., and Arias, D. M. 1996. Role of gene interactions in hybrid speciation: Evidence from ancient and experimental hybrids. Science 272: 741-745. Rodrigo, A. G., Kelly-Borges, M., Bergquist, P. R., and Bergquist, P. L. 1993. A randomisation test of the null hypothesis that two cladograms are sample estimates of a parametric phylogenetic tree. New Zeal. J. Bot. 31:257-268. Saiki, R. K., Gelfand, D. H., Stoffel, S., Scharf, S. J., Higuchi, R., Horn, G. T., Mullis, K. B., and Erlich, H. A. 1988. Primer-directed enzymatic amplification of DNA with a thermostabile DNA polymerase. Science 239: 487-491. Saitou, N., and Nei, M. 1987. The neighbor-joining method: A new method reconstructing phylogenetic trees. Mol. Biol. Evol. 4: 406-425. Simon, C., Frati, F., Beckenbach, A., Crespi, B., Liu, H., and Flook, P. 1994. Evolution, weighting, and phylogenetic utility of mitochondrial gene sequences and a compilation of conserved polymerase chain reaction primers. Ann. Entomol. Soc. Am. 87(6): 651-701.
Stepien, C. A. 1995. Population genetic divergence and geographic patterns from DNA sequences: Examples from marine and freshwater fishes. In "Evolution and the Aquatic Ecosystem: Defining Unique Units in Population Conservation" (J. Nielsen, ed.), pp. 263-287. American Fisheries Soc. Symposium, Bethesda, MD. Stepien, C. A., and Rosenblatt, R. H. 1991. Patterns of gene flow and genetic divergence in the Northeastern Pacific Clinidae (Teleostei: Blennioidei), based on allozyme and morphological data. Copeia 1991(4):873-896. Stepien, C. A., Dixon, M. T., and Hillis, D. M. 1993. Evolutionary relationships of the blennioid fish families Clinidae, Labrisomidae, and Chaenopsidae: Congruence between DNA sequence and allozyme data. Bull. Mar. Sci. 52(1): 873-513. Stepien, C. A., and Rosenblatt, R. H. 1996. Genetic divergence in antitropical pelagic marine fishes (Trachurus, Merluccius, and Scomber) between North and South America. Copeia 1996(3):586-598. Stiassny, M. L. J., Parenti, L. R., and Johnson, G. D. (eds.) 1996. "Interrelationships of Fishes." Academic Press, San Diego. Sturmbauer, C., and Meyer, A. 1992. Genetic divergence, speciation and morphological stasis in a lineage of African cichlid fishes. Nature 358:578-581. Sturmbauer, C., and Meyer, A. 1993. Mitochondrial phylogeny of the endemic mouthbrooding lineages of cichlid fishes of Lake Tanganyika, East Africa. Mol. Biol. Evol. 10:751-768. Swofford, D. L., Olsen, G. J., Waddell, P. J., and Hillis, D. M. 1996. In "Molecular Systematics" (D. M. Hillis, C. Moritz, and B. K. Mable, eds.), 2nd ed., pp. 407-514. Sinauer Assoc., Sunderland, MA. Tamura, K., and Nei, M. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512-526. Vawter, A. T., Rosenblatt, R. H., and Gorman, G. C. 1980. Genetic divergence among fishes of the Eastern Pacific and the Caribbean: Support for the molecular clock. Evolution 34:705-711. 
Wheeler, W. C. 1986. Character weighting and cladistic analysis. Syst. Zool. 35:102-109. Wiley, E. O. 1981. "Phylogenetics: The Theory and Practice of Phylogenetic Systematics." Wiley Interscience, New York. Williams, J. G. K., Kubelik, A. R., Livak, K. J., Rafalski, J. A., and Tingey, S. V. 1990. DNA polymorphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Res. 18: 6531-6535. Wilson, A. C., Cann, R. L., Carr, S. M., George, M., Jr., Gyllensten, B., Helm-Bychowski, K., Higuchi, R. C., Palumbi, S. R., Prager, E. M., Sage, R. D., and Stoneking, M. 1985. Mitochondrial DNA and two perspectives on evolutionary genetics. Biol. J. Linnean Soc. 26: 375-400. Zhu, D., Jamieson, B. G. M., Hugall, A., and Moritz, C. 1994. Sequence evolution and phylogenetic signal in control region and cytochrome b sequences of rainbowfishes (Melanotaeniidae). Mol. Biol. Evol. 11:672-683. Zuckerkandl, E. and Pauling, L. 1962. Molecular disease, evolution and genic heterogeneity. In "Horizons in Biochemistry" (M. Kasha and B. Pullman, eds.), pp. 189-225. Academic Press, New York. Zuckerkandl, E. and Pauling, L. 1965. Evolutionary divergence and convergence in proteins. In "Evolving Genes and Proteins" (V. Bryson and H. J. Vogel, eds.), pp. 97-166. Academic Press, New York.
CHAPTER
2 Base Substitution in Fish Mitochondrial DNA: Patterns and Rates
THOMAS D. KOCHER and KAREN L. CARLETON
Department of Zoology, University of New Hampshire, Durham, New Hampshire 03824
I. Introduction
Many of the authors in this volume use mitochondrial DNA (mtDNA) sequences because they are easily accessible, have high rates of evolution, and generally follow a clonal pattern of inheritance well suited to phylogenetic reconstruction (Wilson et al., 1985). This chapter is about the natural history of these sequences. Just as morphological systematists strive to analyze characters for which the pattern of development and effects of the environment are well known, so molecular systematists should begin by understanding the biology underlying the characters they use for inferring phylogenies. By understanding how changes accumulate in sequences, accurate models of substitution can be developed for use in phylogenetic inference.
Molecular sequences are deceptively simple in structure. There are just four bases common in DNA. These bases appear to be freely interchangeable, but in fact, mutation interconverts some nucleotides more frequently than others. Selection and drift then act on this spectrum of mutations in such a way as to prevent most substitutions from becoming fixed in the population. Neither mutation nor selection is homogeneous along a sequence of nucleotides; close examination reveals important differences in the pattern of mutation and selective constraint among nucleotide sites. Additional differences can be observed in comparisons among species. Probably more is known about evolutionary patterns in animal mitochondrial genomes than in any other DNA sequence. Although some aspects of the substitutional pattern (e.g., the high proportion of transitions) are unique to animal mtDNA, this molecule is still an excellent model system to illustrate the analytic methods needed to reconstruct phylogenies from DNA sequence data.
This chapter focuses on patterns of mtDNA evolution in cichlid fishes. Examples are drawn from continuing studies of the gene encoding NADH dehydrogenase subunit 2 (ND2) in East African cichlids (Kocher et al., 1995). This data set is particularly useful because it includes a large number of closely related molecules which provide insights into the pattern of substitution usually obscured in comparisons among more highly diverged sequences.
II. Simple Models of Substitution
A. Mutational Models
At the core of most phylogenetic reconstruction algorithms is a simplified mutational model of the substitution process. The simplest model (Jukes and Cantor, 1969) assumes an equal probability of interconversion among all four nucleotides. A consequence of this model is a twofold excess of transversional change (purine ↔ pyrimidine) because there are twice as many paths for transversions as for transitions. This model is not adequate for animal mtDNA because a much larger excess of transitions, relative to transversions, is typically observed. Kimura (1980) introduced a two-parameter model to accommodate the higher rate of transitions (Fig. 1). This model is also inadequate, as it predicts that sequences at equilibrium will contain equal frequencies of all four nucleotides. A modified Kimura model (Felsenstein, 1986) adjusts the relative rates of transitions or transversions to accommodate the unequal frequency of bases seen in real sequences. More complex models are possible, but the need for a fully elaborated model, with a separate rate parameter for each of the 12 possible kinds of substitution, has not yet been demonstrated (but see Rzhetsky and Nei, 1995).
FIGURE 1 The substitution model of Kimura (1980) in which the rate of transitions (α) is usually higher than the rate of transversions (β). [Diagram: the purines A and G and the pyrimidines C and T, connected by transition and transversion paths.]
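
The two models described above can be written out as a small table of rates. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes equal base frequencies, as the Jukes-Cantor and Kimura models do. It shows why equal rates give twice as many transversion paths as transition paths, and how the ratio changes when transitions are given a higher rate.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = "ACGT"


def is_transition(x, y):
    """True if x -> y is a purine-purine or pyrimidine-pyrimidine change."""
    return x != y and ({x, y} <= PURINES or {x, y} <= PYRIMIDINES)


def rate_table(alpha, beta):
    """Rates for the 12 possible substitutions under a two-parameter model.

    alpha is the transition rate and beta the transversion rate.
    Setting alpha == beta gives the one-parameter (Jukes-Cantor) case.
    """
    return {
        (x, y): (alpha if is_transition(x, y) else beta)
        for x in BASES
        for y in BASES
        if x != y
    }


def initial_ts_tv_ratio(alpha, beta):
    """Expected transition:transversion ratio before any multiple hits,
    assuming equal base frequencies."""
    rates = rate_table(alpha, beta)
    ts = sum(r for (x, y), r in rates.items() if is_transition(x, y))
    tv = sum(r for (x, y), r in rates.items() if not is_transition(x, y))
    return ts / tv


# Equal rates: 4 transition paths against 8 transversion paths.
print(initial_ts_tv_ratio(1.0, 1.0))
# An arbitrary illustrative bias: transitions ten times faster.
print(initial_ts_tv_ratio(10.0, 1.0))
```

Because there are four transition paths and eight transversion paths, the initial ratio in this sketch is simply α/(2β).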
B. Multiple Hits and Saturation
As mutations occur over time, a pair of homologous sequences will become increasingly different. The observed number of differences between these sequences accumulates almost linearly at first. Gradually, however, as some nucleotide sites experience more than one substitution, the observed sequence difference becomes a poor indicator of the actual divergence which has occurred. Eventually, the rate at which new differences arise is equal to the rate at which identical nucleotides arise by multiple substitution. At this point the sequences cannot display greater sequence difference (the sequences have reached "saturation"), even though additional substitutions continue to occur. The true evolutionary rate is hidden by the occurrence of multiple substitutions at a site.
Appropriate statistical corrections can be applied to transform the observed differences into a measure of the total number of changes that have occurred (total divergence, or evolutionary distance). These corrections can be derived for any of the mutational models, but are accurate only in the early stages of differentiation, before saturation has been closely approached. Furthermore, these corrections are accurate only if all of the nucleotide sites are evolving according to the same substitutional model.
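
As an illustration of such corrections, the sketch below implements the standard textbook distance formulas for the Jukes-Cantor and Kimura two-parameter models. The function names are hypothetical. Here `p` is the observed proportion of differing sites, and `P` and `Q` are the observed proportions of transition and transversion differences. Both functions refuse to return a value once the observed difference reaches the saturation limit of the model.

```python
import math


def jc69_distance(p):
    """Jukes-Cantor corrected distance from the observed proportion p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75); beyond that the model is saturated")
    return -0.75 * math.log(1 - 4 * p / 3)


def k2p_distance(P, Q):
    """Kimura two-parameter corrected distance.

    P: observed proportion of transition differences
    Q: observed proportion of transversion differences
    """
    a = 1 - 2 * P - Q
    b = 1 - 2 * Q
    if a <= 0 or b <= 0:
        raise ValueError("observed differences are at or beyond saturation")
    return -0.5 * math.log(a) - 0.25 * math.log(b)


# The correction grows faster than the observed difference.
for p in (0.01, 0.10, 0.30, 0.60):
    print(p, round(jc69_distance(p), 4))
```

When transitions make up exactly one-third of the observed differences, as the one-parameter model predicts, the two formulas give the same distance.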
C. Selectional Filter
Although mutational models have been widely used to describe the process of substitution, they ignore the influence of selection, which may be the dominant force regulating change in real sequences. It is easy to show, by comparison of nucleotide substitution rates at silent and amino acid replacement sites, that selection filters out more than 90% of all mutations which occur in mtDNA. Any concordance between the predictions of mutational models and the evolution of real sequences is therefore fortuitous.
Most simple models assume that substitutions occur randomly among sites following the Poisson distribution. Numerous demonstrations of the inadequacy of this model have been published (Fitch and Markowitz, 1970; Uzzell and Corbin, 1971; Kocher and Wilson, 1991). Substitutions do not occur with equal probability at each site. Instead, selection resists substitution at some sites, while allowing mutations at other sites to become fixed. A better model of this process uses a gamma distribution (Bliss and Fisher, 1953; Tamura and Nei, 1993), or a covarion model (Fitch and Markowitz, 1970; Miyamoto and Fitch, 1995), to allow rates of substitution to vary among nucleotide sites. The gamma distribution models have been mathematically formulated so that it is straightforward to correct distances for multiple hits (Tamura and Nei, 1993), but this is not yet possible for the covarion model. Few studies have attempted to estimate either the gamma parameter or the size and exchange rate of the covarion. It is important to remember that estimates of these parameters must be made from close relatives, as they provide the best information to quantify the process of substitution, free from the effects of multiple substitution.
For protein-coding sequences, it is possible to classify sites a priori according to the known selective constraints of the coding function. At the very least, it is recognized that first, second, and third positions of codons evolve at different rates, because of the redundant structure of the genetic code, and the grouping of functionally similar amino acids according to the second base of the codon. Because the functional constraints on rRNA sequences are poorly understood, it is more difficult to assign sites to particular rate classes a priori. Models of evolution for these genes typically resort to a purely statistical representation of the sites.
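
To give a sense of how rate variation among sites changes a multiple-hit correction, the sketch below uses the widely used gamma-corrected form of the Jukes-Cantor distance. This is a simpler stand-in chosen for illustration, not the Tamura and Nei (1993) formulation cited above, and the function names and the shape values are assumptions. The shape parameter `a` controls the spread of rates: small values mean strong rate variation, and very large values approach the equal-rates case.

```python
import math


def jc69_distance(p):
    """Equal-rates Jukes-Cantor distance, for comparison."""
    return -0.75 * math.log(1 - 4 * p / 3)


def jc69_gamma_distance(p, a):
    """Jukes-Cantor distance with gamma-distributed rates among sites.

    p: observed proportion of differing sites (must be below 0.75)
    a: gamma shape parameter (must be positive; illustrative values only)
    """
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    if a <= 0:
        raise ValueError("shape parameter must be positive")
    return 0.75 * a * ((1 - 4 * p / 3) ** (-1 / a) - 1)


p = 0.20
for a in (0.5, 1.0, 5.0, 1e6):
    print(a, round(jc69_gamma_distance(p, a), 4))
print("equal rates", round(jc69_distance(p), 4))
```

For a given observed difference, the corrected distance in this sketch is larger when rates vary strongly among sites, because the fast sites have absorbed more hidden substitutions.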
III. Evolution of Real Sequences
To evaluate which theoretical models provide the most appropriate basis for phylogenetic reconstruction, the evolution of real sequences must be quantified. Here we examine a set of 56 mitochondrially encoded NADH dehydrogenase subunit 2 (ND2) sequences (348 codons) obtained from 45 species of cichlid fish, mostly from East Africa. The most divergent comparisons involve New World species which presumably diverged from the African lineages more than 60 million years (MY) ago. The most closely related sequences are intraspecific polymorphisms differing by just a few nucleotides. Those sequences not already reported in Kocher et al. (1995) are deposited in GenBank.
Ideally, we would plot the divergence of molecules with respect to geologic times of divergence. For these fishes, however, few reliable divergence times are available. Instead we will use the proportion of third position sites which have experienced a transversion as a measure of divergence. Transversions occur relatively rarely and in a nearly Poisson fashion (Irwin et al., 1991). These divergences are corrected for multiple substitution using a two-state model [d = -0.5 ln(1 - 2Q), where Q is the observed proportion of transversions].
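
The divergence measure just described can be sketched in a few lines of Python. The function names and the two short sequences are hypothetical and serve only to show the steps: take the third position of each codon in a pair of aligned, in-frame sequences, count the sites that differ by a transversion, and apply the two-state correction d = -0.5 ln(1 - 2Q).

```python
import math

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def third_position_transversions(seq1, seq2):
    """Observed proportion Q of third codon positions differing by a transversion.

    Both sequences must be aligned, in frame, and of equal non-zero length.
    """
    if len(seq1) != len(seq2) or len(seq1) == 0 or len(seq1) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and non-empty")
    pairs = list(zip(seq1[2::3].upper(), seq2[2::3].upper()))
    transversions = sum(
        1
        for x, y in pairs
        if (x in PURINES and y in PYRIMIDINES) or (x in PYRIMIDINES and y in PURINES)
    )
    return transversions / len(pairs)


def corrected_transversion_divergence(q):
    """Two-state correction for multiple hits: d = -0.5 ln(1 - 2Q)."""
    if not 0 <= q < 0.5:
        raise ValueError("Q must lie in [0, 0.5)")
    return -0.5 * math.log(1 - 2 * q)


# Made-up four-codon example: one transition and one transversion
# at third positions, so Q = 1/4.
q = third_position_transversions("ATAGCAACCTTA", "ATGGCTACCTTA")
print(q, corrected_transversion_divergence(q))
```

The correction is small at low divergence and grows without bound as Q approaches 0.5, the two-state saturation limit.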
A. Changes in the Third Position of Codons
Many substitutions at the third positions of codons are synonymous (i.e., do not change the amino acid sequence of the encoded protein) and thus escape selection on protein structure. These sites therefore provide the most direct view of the mutational process. Although these sites are often thought to evolve according to a purely mutational model, some selective constraint does exist (Perna, 1996; Xia et al., 1996). While it would be inappropriate to equate substitutions at third positions with mutation, these sites approximate the underlying mutational spectrum more closely than the first or second positions.
The dominant feature of mtDNA evolution is the high rate of transition substitutions relative to transversions. At third positions the ratio of transition:transversion differences is at least 5:1 initially (Fig. 2), consistent with a strong transition bias in the underlying mutation process. As transitions begin to occur repeatedly at the same sites, the ratio of transitions:transversions observed in pairwise comparisons drops. At a 10% transversion difference, the ratio is only 2.5:1, and in the deepest comparisons it drops to 1:1. At a 10% transversion difference, the actual number of transition substitutions that have occurred is at least twice as great as the observed number of differences. The transition:transversion ratio is thus one way to quantify the degree of multiple substitution that has occurred since the common ancestor of two sequences.
Base composition influences the maximum observed difference. Figure 3 shows the accumulation of the two kinds of transitions possible: those involving the purines (A and G) and those involving the pyrimidines (C and T). It is interesting to note that the initial rate of transitions is the same for the two types of nucleotides. The purines, however, show saturation at a lower level of divergence than the pyrimidines. This pattern arises because the frequencies of A and G are much more unequal than the frequencies of C and T. At third positions the proportions are A, G, C, T: 0.32, 0.05, 0.38, 0.26. The maximum divergence of two sequences is calculated as 1 - probability of chance identity. For the purines described earlier, where only two states are possible (e.g., A or G), this is calculated as
$$d_{\max} = 1 - \left(\frac{f_G}{f_G + f_A}\right)^{2} - \left(\frac{f_A}{f_G + f_A}\right)^{2} \qquad (1)$$
The very unequal frequencies of A and G allow a maximum difference of just 23% instead of the 50% that would be expected given equal frequencies of the two nucleotides. For C and T, the maximum difference is higher, about 48% (Kocher et al., 1995). These differences explain why the purine transitions reach saturation before the pyrimidine transitions.
These mitochondrial sequences approach saturation rapidly. Evidence of multiple substitutions is quite apparent at only 2% transversion difference. The mammalian fossil record suggests that this corresponds to about 2 MY of divergence (Irwin et al., 1991). The fossil record of cichlids is more difficult to interpret, but a similar rate does correlate well with the geologic history of East Africa (Kocher et al., 1995). The fact that saturation effects begin to arise after just 2 MY of divergence underscores the importance of corrections for multiple substitution when constructing phylogenies of more distantly related taxa.
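
The maximum two-state difference of Eq. (1) can be checked directly from the third position base frequencies given above. This is a minimal sketch with a hypothetical function name. It normalizes the two frequencies so that they sum to one within the purines (or within the pyrimidines) and subtracts the probability of chance identity from one.

```python
def max_two_state_difference(f1, f2):
    """Maximum expected difference when only two states are possible.

    f1, f2: frequencies of the two bases (e.g., A and G); they are
    normalized here so that they sum to one within the pair.
    """
    total = f1 + f2
    p1, p2 = f1 / total, f2 / total
    return 1 - p1 ** 2 - p2 ** 2


# Third position frequencies quoted in the text.
purines = max_two_state_difference(0.32, 0.05)      # A and G
pyrimidines = max_two_state_difference(0.38, 0.26)  # C and T
print(round(purines, 2), round(pyrimidines, 2))     # about 0.23 and 0.48
```

With equal frequencies the same function returns 0.5, the value expected for two equally common states.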
FIGURE 2 The accumulation of transition differences at the third position of codons in the ND2 gene. The pairwise differences among 56 sequences representing 45 species of cichlid fish are plotted. The x axis (corrected third position transversions) is the observed proportion of third position differences corrected for multiple hits according to a Poisson model [x = -0.5 ln(1 - 2(proportion of differences))].
B. Changes in First and Second Positions and Amino Acid Substitution
At the first and second position of codons, selection dominates the substitution process. This is apparent from the rate of transition substitution, which is 6- and 15-fold slower at first and second positions, respectively, than the rate at third positions (Fig. 4). Because there is no reason to suspect a slower mutation rate at these sites, the difference must arise because selection prevents fixation of most mutations. Selection also constrains the maximum amount of difference that is observed between two sequences. Second positions plateau at approximately 3% transition difference, while first positions plateau at about 8%. The comparable value at third positions is 25%.
Selective constraint has a strong effect on base composition, which differs among the three codon positions. First positions are relatively rich in GC because of the high leucine and alanine content of the ND2 protein. Second positions show a high proportion of T and C (37.9 and 30.4%, respectively), probably because hydrophobic amino acids required for this membrane-spanning protein are encoded by either C or T at the second position (Naylor et al., 1995).
Probably the most important characteristic of selective constraint is that it varies from site to site along the molecule according to the structural function of the
0.97 [Nei's unbiased genetic identity; Nei (1987)] to actual population allele frequencies (A. Parker, unpublished data). The empirical relationship derived from these simulations is: Sample Size = 1.5(N alleles)135. FIGURE 4
IRV KORNFIELD AND ALEX PARKER
preted to represent the point beyond which additional sampling effort is no longer worthwhile. In the present case, however, only 25 to 47 individuals were sampled per taxon. Thus, the resolution of relationships based on these two loci may be incomplete.
B. Evolutionary Signals in Microsatellites
There are three independent classes of phylogenetic information that can potentially be gleaned from microsatellite loci. The authors anticipate that these classes will be appropriate for examining relationships at different taxonomic levels. First, allele frequency distributions may be compared using genetic distance metrics based on the stepwise mutation model of microsatellite evolution (Slatkin, 1995; Goldstein et al., 1995). A second class of phylogenetic information, however, may be present in microsatellite allele frequency distributions. Major gaps in allele size distributions may signify unique mutational events. For example, in M. parallelus, alleles of size 150-180 bp are recognized at locus UME002 as representing expansion, via stepwise mutation, from a single allele produced by a distinct, saltatory mutational process (Fig. 3A). This class of alleles is separated from the next smallest allele by 75 bp; it thus conforms to the two-phase mutation model presented by DiRienzo et al. (1994), wherein divergent repeat classes are generated by infrequent large jumps. Machado-Joseph disease also conforms to this model; this pathology appears when a trinucleotide repeat increases by at least 75 bp to form a new allelic class (Maciel et al., 1995). As Maciel et al. (1995) noted, "clustering of expanded repeat sizes is also suggestive of a unique ancient founder mutation." A cladistic perspective, recognizing such novel classes of alleles as discrete characters, is adopted here; in light of the saltatory nature of the mutational events hypothesized to generate them, such characters are called saltines to distinguish them from standard patterns of microsatellite allele variation. Thus, the allelic class centered at 178 bp for UME002 in M. parallelus constitutes an autapomorphic saltine; if shared among independent lineages, saltines may be treated as synapomorphies. 
In this manner, some aspects of microsatellite allele distributions can be analyzed by standard cladistic methods (Swofford, 1990) rather than by distance approaches. Indeed, if a large number of loci were examined, it would be anticipated that saltines would permit construction of robust phylogenetic trees.
However, like ancestral mtDNA polymorphisms, saltines can be retained or lost in multiple lineages. For example, in the human microsatellite data analyzed by Bowcock et al. (1994), locus ms164 has two allelic classes separated by 16 bp which are present in diverse human lineages (Fig. 5). Inspection of allele distributions at UME003 (Fig. 3B) reveals the presence of two major expansion classes centered around 149 and 201 bp; the smaller expansion class has probably been lost from both M. auratus and P. zebra.
Genetic drift may play a major role in molding the distribution of rare expansion classes. If drift were to eliminate relatively infrequent alleles associated with major allelic clusters, e.g., the class centered around 300 bp at locus UME002 (Fig. 3A), such alleles could be regenerated rapidly by mutation. In contrast, if eliminated by drift, variation embodied in saltines would not be regenerated. For example, the absence of the saltatory class centered at 178 bp at UME002 from M. auratus could be due to drift. Indeed, mtDNA diversity is observed to be relatively low in this taxon (Bowers et al., 1994), consistent with the possibility of a recent population bottleneck. Note that it is critical that sample sizes be large enough to reliably detect the presence of saltines that occur at low absolute frequencies in some populations. To date, no one has exploited this class of information to construct phylogenetic trees.
Finally, similar to saltines, the ability (or inability) of a given microsatellite primer pair to amplify DNAs from certain taxa can be treated as a cladistically informative binary character and forms a third potential information class. Again, such characters may constitute synapomorphies and can thus be used to infer relationships from a cladistic perspective, although no empirical information about the prevalence of these characters in cichlid fishes can be found. If null alleles are to be employed in this fashion, it is imperative that new flanking primers be designed and used to demonstrate, by sequencing, that all observed null alleles are due to homologous changes in the original priming sites.
FIGURE 5 Distribution of alleles at human microsatellite locus ms164 (E. Minch, personal communication); this locus was included in the study of Bowcock et al. (1994). The two divergent allelic classes depicted are shared by a number of diverse human lineages. Aggregate sample size is 250. The x axis is allele size (bp).
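
Two of the information classes discussed in this section lend themselves to a short sketch. The first function computes a stepwise-mutation distance of the kind introduced by Goldstein et al. (1995), taken here as the squared difference in mean allele size between two populations. The second splits a set of allele sizes into clusters wherever a large gap occurs, as a first pass at flagging candidate saltatory classes. All names, the allele sizes, and the gap threshold are hypothetical, and the code is a sketch under those assumptions rather than the authors' procedure.

```python
def mean_allele_size(freqs):
    """Mean allele size for a dict mapping allele size -> frequency (or count)."""
    total = sum(freqs.values())
    if total <= 0:
        raise ValueError("allele frequencies must sum to a positive value")
    return sum(size * f for size, f in freqs.items()) / total


def delta_mu_squared(pop_a, pop_b):
    """Squared difference in mean allele size between two populations.

    Sizes should be in repeat units for the stepwise mutation model.
    """
    return (mean_allele_size(pop_a) - mean_allele_size(pop_b)) ** 2


def size_classes(allele_sizes, min_gap):
    """Group allele sizes (bp) into clusters separated by gaps >= min_gap."""
    sizes = sorted(set(allele_sizes))
    if not sizes:
        return []
    clusters = [[sizes[0]]]
    for size in sizes[1:]:
        if size - clusters[-1][-1] >= min_gap:
            clusters.append([size])   # large jump: start a new allelic class
        else:
            clusters[-1].append(size)
    return clusters


# Made-up example data.
print(delta_mu_squared({10: 0.5, 12: 0.5}, {14: 1.0}))
print(size_classes([150, 154, 178, 253, 257], min_gap=50))
```

A cluster found this way is only a candidate: as noted above, sample sizes must be large enough to detect rare classes, and shared classes may reflect retained ancestral variation.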
X. Summary
The classical methods of molecular phylogenetic investigation, allozyme electrophoresis and mtDNA restriction or sequence analysis, have failed to resolve relationships among members of rapidly evolving species flocks such as the mbuna (Cichlidae) of Lake Malawi. Several classes of nuclear DNA markers may, however, provide greater resolution; most promising are microsatellite markers. The extremely high mutation rates at these loci render them fundamentally different from other nuclear DNA polymorphisms, as changes in allele frequency are influenced by mutation as well as genetic drift. Analysis of two microsatellite loci in three congeneric pairs of mbuna species strongly suggests that these markers can provide phylogenetic information relevant to these recently diverged taxa.
Acknowledgments
We are exceedingly grateful to S. Grant, Salima, Malawi, for providing specimens and supporting our research. L. DeMason also supplied critical logistical support. E. Minch kindly shared unpublished human microsatellite data and provided a copy of his program to calculate delta-μ. A. Konings generously permitted reproduction of his mbuna photographs. M. Stiassny is inspirational. We are grateful to the editors and two anonymous referees who provided comments which helped improve this manuscript. This work was supported by NSF EHR91-08766 and NOAA Sea Grant NA36RG0110.
References Avise, J. C. 1994. "Molecular Markers, Natural History and Evolution." Chapman and Hall, New York. Avise, J. C., Helfman, G. S., Saunders, N. C., and Hales, L. S. 1986. Mitochondrial DNA differentiation in North Atlantic eels: Population genetic consequences of an unusual life history pattern. Proc. Natl. Acad. Sci. USA 83:4350-4354. Avise, J. C., Neigel, J. E., and Arnold, J. 1984. Demographic influences on mitochondrial DNA lineage survivorship in animal populations. J. Mol. Evol. 20:99-105. Bardakci, F., and Skibinski, D. O. F. 1994. Application of the RAPD technique in tilapia fish: Species and subspecies identification. Heredity 73:117-123. Bowcock, A. M., Ruiz-Linares, A., Tomfohrde, J., Minch, E., Kidd, J. R., and Cavalli-Sforza, L. L. 1994. High resolution of human
evoutionary trees with polymorphic microsatellies. Nature 368: 455 -458. Bowers, N., Stauffer, J. R., Jr., and Kocher, T. D. 1994. Intra- and interspecific mitochondrial DNA sequence variation within two species of rock-dwelling cichlids (Teleostei: Cichlidae) from Lake Malawi, Africa. Mol. Phylogenet. Evol. 3:75-82. Brown, W. M., George, M. Jr., and Wilson, A. C. 1979. Rapid evolution of mitochondrial DNA. Proc. Natl. Acad. Sci. USA 76:19671971. Bruford, M. W., and Wayne, R. K. 1993. Microsatellites and their application to population genetic studies. Curr. Opin. Genet. Dev. 3: 939-943. Charlesworth, B., Sniegowski, P., and Stephan, W. 1994. The evolutionary dynamics of reptitive DNA in eukaryotes. Nature 371: 215-220. Crother, B. I. 1990. Is "some better than none" or do allele frequencies contain phylogenetically useful information? Cladistics 6:277-281. Dallas, J. F. 1992. Estimation of microsatellite mutation rates in recombinant inbred strains of mouse. Mamm. Genome 5: 32- 38. DeMason, L. 1993. Into Africa: Exporting the Tanzanian coast of Lake Malawi. Cichlid News 2: 22- 23. DiRienzo, A., Peterson, A. C., Garza, J. C., Valdes, A. M., Slatkin, M., and Freimer, N. B. 1994. Mutational processes of simple-sequence repeat loci in human populations. Proc. Natl. Acad. Sci. USA 91: 3166-3170. Dominey, W. J. 1984. Effects of sexual selection and life history on speciation: Species flocks in African cichlids and Hawaiian Drosophila. In "Evolution of Fish Species Flocks," (A. A. Echelle and I. L. Kornfield, eds.), pp. 231-249. University of Maine Press, Orono, ME. Dowling, T. E., Moritz, C., and Palmer, J. D. 1990. Nucleic acids. II. Restriction site analysis. In "Molecular Systematics" (D. M. Hillis and C. Moritz, eds.), pp. 250-317. Sinauer, Sunderland, MA. Eccles, D. H., and Trewavas, E. 1989. "Malawian Cichlid Fishes: The Classification of Some Haplochromine Genera." Lake Fish Movies, Herten, West Germany. Edwards, A., Hammond, H. A., Jin, L., Caskey, C. 
T., and Chakraborty, R. 1992. Genetic variation at five trimeric and tetrameric tandem repeat loci in four human population groups. Genomics 12:241-253. Ellegren, H., Primmer, C. R., and Sheldon, B. C. 1995. Microsatellite "evolution": Directionality or bias? Nat. Genet. 11:360-362. Ellsworth, D. L., Rittenhouse, K. D., and Honeycutt, R. L. 1993. Artifactual variation in randomly amplified polymorphic DNA banding patterns. Biotech. 14:214-217. Estoup, A., Garnery, L., Solignac, M., and Cornuet, J. M. 1995. Microsatellite variation in honeybee (Apis mellifera L.) populations: Hierarchical genetic structure and test of the infinite allele and stepwise mutation models. Genetics 140:679-695. Felsenstein, J. 1993. "PHYLIP v3.5 (Phylogenetic Inference Package, computer software) Ver. 3.2." University of Washington, Seattle, WA. Franck, J. P. C., Wright, J. M., and McAndrew, B. 1992. Genetic variability of a family of satellite DNAs from tilapia (Pisces: Cichlidae). Genome 35:719-725. Franck, J. P. C., Kornfield, I., and Wright, J. M. 1994. The utility of SATA satellite DNA sequences for inferring phylogenetic relationships among the three major genera of tilapinne cichlid fishes. Mol. Phylogenet. Evol. 3:10-16. Fryer, G. 1959a. Some aspects of evolution in Lake Nyasa. Evolution 13: 440-451. Fryer, G. 1959b. The trophic interrelationships and ecology of some littoral communities in Lake Nyasa with special references to
the fishes, and a discussion of the evolution of a group of rockfrequenting Cichlidae. Proc. Zool. Soc. Lond. 132:153-281. Fryer, G., and Iles, T. D. 1972. "The Cichlid Fishes of the Great Lakes of Africa." Oliver Boyd, Edinborough. Gasse, F., Ledee, V., Massault, M., and Fontes, J.-C. 1989. Water level fluctuations of Lake Tanganyika in phase with oceanic changes during the last glaciation and deglaciation. Nature 342:57-59. Goldstein, D. B., Linares, A. R., Cavalli-Sforza, L. L., and Feldman, M. W. 1995. Genetic absolute dating based on microsatellites and the origin of modern humans. Proc. Natl. Acad. Sci. USA 92:67236727. Greenwood, P. H. 1984. African cichlids and evolutionary theories. In "Evolution of Fish Species Flocks." (A. A. Echelle and I. L. Kornfield, eds.), pp. 141-154. University of Maine Press, Orono, ME. Hare, M. P., Karl, S. A., and Avise, J. C. 1996. Anonymous nuclear DNA markers in the American oyster and their implications for the heterozygote deficiency phenomenon in marine bivalves. Mol. Biol. Evol. 13:334-345. Hughes, A. L., and Nei, M. 1989. Nucleotide substitution at major histocompatibility complex class II loci: Evidence for overdominant selection. Proc. Natl. Acad. Sci. USA 86:958-962. Karl, S. A., Bowen, B. W., and Avise, J. C. 1992. Global population structure and male-mediated gene flow in the green turtle (CheIonia mydas): RFLP analyses of anonymous nuclear loci. Genetics 131:163-173. Karl, S. A., and Avise, J. C. 1993. PCR-based assays of mendelian polymorphisms from anonymous single-copy nuclear DNA: Techniques and applications for population genetics. Mol. Biol. Evol. 10:342-361. Kellogg, K. A., Markert, J. A., Stauffer, J. R., Jr., and Kocher, T. D. 1995. Microsatellite variation demonstrates multiple paternity in lekking cichlid fishes from Lake Malawi, Africa. Proc. R. Soc. Lond. B 260:79-84. Klein, J. 1986. "Natural History of the Major Histocompatibility Complex." Wiley, New York. Klein, D. 
H., Ono, H., O'Huigin, C., Vincek, V., Goldschmidt, T., and Klein, J. 1993. Extensive Mhc variability in cichlid fishes of Lake Malawi. Nature 364: 330-332. Konings, A. 1990. "Koning's Book of Cichlids and All the Other Fishes of Lake Malawi." TFH Publications, Inc., Neptune City, NJ. Kornfield, I. 1978. Evidence for rapid speciation in African cichlid fishes. Experientia 34: 335-336. Kornfield, I. 1991. Genetics. In "Cichlid Fishes: Behavior, Ecology and Evolution." (M. Keenleyside, ed.), pp. 103-128. Chapman and Hall, London. Lazzaro, X. 1991. Feeding convergence in South American and African zooplanktivorous cichlids Geophagus brasilensis and Tilapia rendalli. Environ. Biol. Fishes 31:283-293. Levin, I., Cheng, H. H., Baxter-Jones, C., and Hillel, J. 1995. Turkey microsatellite DNA loci amplified by chicken-specific primers. Anim. Genet. 26:107-110. Lewis, D. S. C. 1981. "Problems of Species Definition in Lake Malawi Cichlid Fishes (Pisces, Cichlidae)." J. L. B. Smith Inst. Ichthy. Spec. Publ. 23:1-5. Lewis, D. S. C. 1982. A revision of the genus Labidochromis (Teleostei: Cichlidae) from Lake Malawi. Zool J. Linn. Soc. 75:189-265. Liem, K. F. 1980. Adaptive significance of intra- and interspecific differences in the feeding repertoires of cichlid fishes. Am. Zool. 20: 295-314. Maciel, P., et al. 1995. Correlation between CAG repeat length and clinical features in Machado-Joseph disease. Am. J. Hum. Genet. 57:54-61. Marsh, A. C., Ribbink, A. J., and Marsh, B. A. 1981. Sibling species complexes in sympatric populations of Petrotilapia Trewavas (Cichlidae, Lake Malawi). Zool. J. Linn Soc. 71:253-264.
Mayr, E. 1984. Evolution of fish species flocks: A commentary. In "Evolution of Fish Species Flocks." (A. A. Echelle and I. Kornfield, eds.), pp. 3-11. University of Maine Press, Orono, ME. McElroy, D. M., Kornfield, I., and Everett, J. 1991. Coloration in African cichlids: Diversity and constraints in Lake Malawi endemics. Neth. J. Zool. 41:250-268. McKaye, K. R., Kocher, T., Reinthal, P., Harrison, R., and Kornfield, I. 1984. Genetic evidence for allopatric and sympatric differentiation among morphs of a Lake Malawi cichlid fish. Evolution 36: 658-664. McKaye, K. R., Kocher, T., Reinthal, P., and Kornfield, I. 1982. Sympatric sibling species complex of Petrotilapia Trewavas analyzed by enzyme electrophoresis (Pisces: Cichlidae). J. Linn. Soc. 76:9196. McMillan, W. O., and Palumbi, S. R. 1995. Concordant evolutionary patterns among Indo-West Pacific butterflyfishes. Proc. R. Soc. Lond. B 260: 229- 239. Meyer, A. 1993. Phylogenetic relationships and evolutionary processes in East African cichlid fishes. Trends Ecol. Evol. 8:279-284. Meyer, A., Kocher, T. D., Basasibwaki, P., and Wilson, A. C. 1990. Monophyletic origin of Lake Victoria cichlid fishes suggested by mitochondrial DNA sequences. Nature 347:550-553. Minch, E. 1995. "MICROSAT vl.4 (computer software)." Stanford University, Stanford, CA. Moran, P., and Kornfield, I. 1993. Retention of an ancestral polymorphism in the mbuna species flock (Pisces: Cichlidae) of Lake Malawi. Mol. Biol. Evol. 10:1015-1029. Moran, P., and Kornfield, I. 1995. Were population bottlenecks associated with radiation of the mbuna species flock (Teleostei: Cichlidae) of Lake Malawi? Mol. Biol. Evol. 12:1085-1093. Moran, P., Kornfield, I., and Reinthal, P. 1994. Molecular systematics and radiation of the haplochromine cichlids (Teleostei: Perciformes) of Lake Malawi. Copeia 1994:274-288. Nei, M. 1978. Estimation of avaerage heterozygosity and genetic distance from a small number of individuals. Genetics 89:583-590. Nei, M. 1987. 
"Molecular Evolutionary Genetics." Columbia University Press, New York. Niki, Y., Chigusa, S. I., and Matsuura, E. T. 1989. Complete replacement of mitochondrial DNA in Drosophila. Nature 341:551-552. Oliver, M. K. 1984. "Systematics of African Cichlid Fishes: Determination of the Most Primitive Taxon, and Studies on the Haplochromines of Lake Malawi (Teleostei: Cichlidae). Unpublished Ph.D. dissertation, Yale University, New Haven, CT. Ono, H., O'Huigin, C., Tichy, H., and Klein, J. 1993. Major histocompatibility complex variation in two species of cichlid fishes from Lake Malawi. Mol. Biol. Evol. 10:1060-1072. Owen, R. B., Crossley, R., Johnson, T. C., Tweddle, D., Kornfield, I., Davison, S., Eccles, D. H., and Engstrom, D. E. 1990. Major low levels of Lake Malawi and implication for speciation rates in cichlid fishes. Proc. R. Soc. Lond. B 240:519-553. Paetkau, D., Calvert, W., Stirling, I., and Strobeck, C. 1995. Microsatellite analysis of population structure in Canadian polar bears. Mol. Ecol. 4: 347-354. Palumbi, S. R. and Baker, C. S. 1994. Contrasting population structures for nuclear intron sequences and mtDNA of humpback whales. Mol. Biol. Evol. 11:426-435. Pamilo, P., and Nei, M. 1988. Relationships between gene trees and species trees. Mol. Biol. Evol. 5:568-583. Parker, A., and Kornfield, I. 1996. Polygynandry in Pseudotropheus zebra, a cichlid fish from Lake Malawi. Environ. Biol. Fish., 47:345352. Parker, A., and Kornfield, I. 1997. Evolution of the mitochondrial DNA control region in the mbuna (Cichlidae) species flock of Lake Malawi, East Africa. J. Mol. Evol., in press. Pemberton, J. M., Slate, J., Bancroft, D. R., and Barrett, J. A. 1995. Non-
3. Molecular Systematics of mbuna amplifying alleles at microsatellite loci: A caution for parentage and population studies. Mol. Ecol. 4:249-252. Penny, D., Steel, M., Waddell, P. J., and Hendy, M. D. 1995. Improved analyses of human mtDNA sequences support a recent african origin for Homo sapiens. Mol. Biol. Evol. 12:863-882. P6pin, L., Amigues, Y., Le'Pringle, A., Berthier, J.-L., Bensaid, A., and Vaiman, D. 1995. Sequence conservation of microsatellites between Bos taurus (cattle), Capra hircus (goat) and related species: Examples of use in parentage testing and phylogenetic analysis. Heredity 74:53-61. Queller, D. C., and Goodnight, K. F. 1989. Estimating relatedness using genetic markers. Evolution 43:258-275. Queller, D. C., Strassmann, J. E., and Hughes, C. R. 1993. Microsatellites and kinship. Trends Ecol. Evol. 8:285-288. Rand, D. M., Dorfsman, M., and Kan, L. M. 1994. Neutral and nonneutral evolution of Drosophila mitochondrial DNA. Genetics 138: 741-756. Raymond, M., and Rousset, F. 1995. GENEPOP ver. 1.2 a population genetics software for exact tests and ecumenicism. J. Hered. 86: 248-249. Regan, C. T. 1921. The cichlid fishes of Lake Nyasa. Proc. Zool. Soc. Lond. 1921: 675- 727. Reinthal, P. N. 1987. "Morphology, Ecology, and Behavior of a Group of the Rock-Dwelling Fishes (Cichlidae: Perciformes) from Lake Malawi, Africa. Unpublished Ph.D dissertation, Duke University, Durham, NC. Reinthal, P. N. 1990a. Morphological analysis of the neurocranium of a group of rock-dwelling cichlid fishes (Cichlidae: Perciformes) from Lake Malawi, Africa. Zool. J. Linn. Soc. 98:123-139. Reinthal, P. N. 1990b. The feeding habits of a group of herbivorous rock-dwelling cichlid fishes (Cichlidae: Perciformes) from Lake Malawi, Africa. Environ. Biol. Fishes. 27:215-233. Ribbink, A. J., Marsh, A. C., Marsh, B. A., and Sharp, B. J. 1983a. The zoogeography, ecology and taxonomy of the genus Labeotropheus Ahl, 1927, of Lake Malawi (Pisces: Cichlidae). Zool. J. Linn. Soc. 
79: 223- 243. Ribbink, A. J., Marsh, A. C., Ribbink, C. C., and Sharp, B. J. 1983b. A preliminary survey of the cichlid fishes of rocky habitats in Lake Malawi. S. Afr. J. Zool. 18:149-310. Rice, W. R. 1989. Analyzing tables of statistical tests. Evolution 43: 223-225. Rubensztein, D. C., Amos, W., Leggo, J., Goodburn, S., Jain, S., Li, S. H., Margolis, R. L., Ross, C. A., and Ferguson-Smith, M. 1995. Microsatellite evolution: Evidence for directionality and variation in rate between species. Nat. Genet. 10:337-343. Saitou, N., and Nei, M. 1987. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4: 406 -425. Schl6tterer, C., Amos, B., and Tautz, D. 1991. Conservation of polymorphic simple sequence loci in cetacean species. Nature 354: 63-65.
Scholz, C. A., and Rosendahl, B. R. 1988. Low lake stands in Lakes Malawi and Tanganyika, East Africa, delineated with multifold seismic data. Science 240:1645-1648. Seyoum, S., and Kornfield, I. 1992a. Taxonomic notes on the Oreochromis niloticus subspecies complex (Pisces: Cichidae), with a description of a new subspecies. Can. J. Zool. 70:2161-2165. Seyoum, S., and Kornfield, I. 1992b. Identification of the subspecies of Oreochromis niloticus (Pisces: Cichlidae) using restriction endonuclease analysis of mitochondrial DNA. Aquaculture 102:29-42. Shriver, M. D., Jin, L., Boerwinkle, E., Deka, R., Ferrell, R. E., and Chakraborty, R. 1995. A novel measure of genetic distance for highly polymorphic tandem repeat loci. Mol. Biol. Evol. 12:914920. Shriver, M. D., Jin, L., Chakraborty, R., and Boerwinkle, E. 1993. VNTR allele frequency distributions under the stepwise mutation model: A computer simulations approach. Genetics 134:983-993. Slatkin, M. 1995. A measure of population subdivision based on microsatellite alleles. Genetics 139:457-462. Sultmann, H., Mayer, W. E., Figueroa, F., Tichy, H., and Klein, J. 1995. Phylogenetic analysis of cichlid fishes using nucler DNA markers. Mol. Phylogenet. Evol. 12:1033-1047. Swofford, D. L. 1990. "PAUP: Phylogenetic Analysis Using Parsimony, ver. 3.1.1." Computer program distributed by the Illinois Natural History Survey, Champaign, IL. Swofford, D. L., and Berlocher, S. H. 1987. Inferring evolutionary trees from gene frequency data under the principle of maximum parsimony. Syst. Zool. 36:293-325. Swofford, D. L., and Selander, R. B. 1981. BIOSYS-I: a FORTRAN program for the comprehensive analysis of electrophoretic data in population genetics and systematics. J. Hered. 72:281-283. Trewavas, E. 1935. A synopsis of the cichlid fishes of Lake Nyasa. Ann. Mag. Nat. Hist. 10:65-118. Turner, G. F. 1994. Speciation mechanisms in Lake Malawi cichlids: A critical review. Arch. Hydrobiol. 44:139-160. Valdes, A. 
M., Slatkin, M., and Freimer, N. B. 1993. Allele frequencies at microsatellite loci: The stepwise mutation model revisited. Genetics 133:737-749. Van Dongen, S. 1995. How should we bootstrap allozyme data? Heredity 74: 445-447. Weir, B. S. 1990. Sampling strategies for distances between DNA sequences. Biometrics 46:551-560. Williams, J. G. K., Kubelik, A. R., Livak, K. J., Rafalski, J. A., and Tingey, S. V. 1990. DNA polymorphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Res. 18:65316535. Wright, J. M. 1989. Nucleotide sequence, genomic organization and evolution of a major repetitive DNA family in tilapia Oreochromis mossambicus/hornorum. Nucleic Acids Res. 17:5071-5079. Zaykin, D. V., and Pudovkin, A. I. 1993. Two programs to estimate /~,2 values using pseudo-probability tests. J. Hered. 84:152-153.
CHAPTER
4 Reconstruction of Cichlid Fish Phylogeny Using Nuclear DNA Markers
HOLGER SÜLTMANN and WERNER E. MAYER
Max-Planck-Institut für Biologie, Abteilung Immungenetik, D-72076 Tübingen, Germany
I. Introduction
The family Cichlidae constitutes a monophyletic group in the order Perciformes (Kaufman and Liem, 1982). Monophyly of the cichlid family is indicated by the presence of at least nine synapomorphic morphological characters (Stiassny, 1991; Zihler, 1982; Gaemers, 1984). Since the distribution of cichlids ranges from South and Central America and Mexico to tropical Africa, Madagascar, southern India, and Sri Lanka (Ribbink, 1991), the cichlid family must have arisen before the separation of Africa, South America, and India by continental drift more than 100 million years (MY) ago. The morphology of cichlid species has been studied for almost 100 years and various classification schemes have been proposed (e.g., Pellegrin, 1904; Regan, 1906; Vandewalle, 1971; Trewavas, 1973, 1983; Poll, 1986; Greenwood, 1987; Cichocki, 1976; Stiassny, 1987, 1991; Oliver, 1984).
The cichlid taxa in the Great Lakes of East Africa (Lakes Victoria, Malawi, and Tanganyika) are of special interest, having undergone recent explosive adaptive radiations leading to hundreds of different species. Lake Tanganyika, which is approximately 12 MY old (Cohen et al., 1993), provides an ancient reservoir of polyphyletic taxa (Nishida, 1991), some of which gave rise to more recent groups in Lakes Malawi (Kocher et al., 1993) and Victoria. A comparatively large genetic divergence between species of the genus Tropheus from Lake Tanganyika was found to be accompanied by small morphological changes (Sturmbauer and Meyer, 1992). However, high morphological plasticity was found within the single New World species Cichlasoma managuense (Meyer, 1987). In addition, although some cichlid species from different lakes resemble each other morphologically, molecular data indicate that this similarity is due to convergent evolution (Kocher et al., 1993).
Most of the cichlid species of Lakes Malawi and Victoria are endemic (Kornfield, 1978; Meyer et al., 1990; Greenwood, 1991) and monophyletic (Meyer et al., 1990; Meyer, 1993). Taking into account the estimated ages of 2 MY for Lake Malawi and less than 1 MY for Lake Victoria (Fryer and Iles, 1972), questions arise as to the speed and mode of speciation leading to hundreds of different species. It has been shown by allozyme variation that speciation in Lake Malawi occurred rapidly (Kornfield, 1978). Allopatric speciation might have been promoted by considerable fluctuation in the water levels of Lakes Malawi and Victoria (Livingstone, 1980; Owen et al., 1990). However, microallopatric or even sympatric speciation cannot be ruled out, particularly because habitats and niches are quite restricted for most of the species (Ribbink, 1991; Meyer, 1993).
Before hypotheses regarding the speciation process can be postulated, the phylogenetic relationships among the various cichlid taxa must be elucidated. Two main difficulties have, however, hampered the reconstruction of cichlid phylogenies from morphological characters: paucity of synapomorphic characters, which hinders the recognition of taxonomic groups, and abundant parallelism, which makes it difficult to ascertain whether shared characters are synapomorphies or homoplasies. To circumvent these problems, molecular analyses have been initiated and used for the construction of phylogenies for cichlid species and species flocks (i.e., monophyletic groups of closely related species coexisting in the same area; Greenwood, 1984; Nishida, 1991; Sage et al., 1984).
II. Methods Used for Reconstructing Cichlid Phylogeny
The present taxonomy of cichlids in the East African lakes is largely based on morphological characters, particularly the shape of the jaws and teeth as well as the trophic behavior (Greenwood, 1979, 1980). For a variety of allozymes, allelic frequencies have been estimated from the electrophoretic mobility patterns (Sage and Selander, 1975; Kornfield, 1978; Kornfield et al., 1979; McKaye et al., 1982; McAndrew and Majumdar, 1983, 1984; Sage et al., 1984; Nishida, 1991) and used to calculate genetic distances between sister groups of cichlids. These data allowed the subdivision of cichlids into genera and species flocks. For the Lake Victoria cichlids, however, despite their considerable morphological differences, genetic distances were too small (0.006 substitutions per locus; Sage et al., 1984) to evaluate interspecies relationships.
The substitution rate at the mitochondrial control region and adjacent loci (cytochrome b and tRNA genes) has been shown to be higher than that of most nuclear DNA loci (Brown et al., 1979). Sequence analyses of mitochondrial DNA (mtDNA) (Meyer et al., 1990; Sturmbauer and Meyer, 1992, 1993; Kocher et al., 1993, 1995; Sturmbauer et al., 1994; Moran and Kornfield, 1993; Schliewen et al., 1994; Bowers et al., 1994) have extended the phylogenetic trees of the Lake Tanganyika and Malawi lineages and confirmed the monophyly of the Lake Victoria species flock. Discrepancies between the restriction fragment length polymorphism (RFLP) pattern of mtDNA and the species tree based on morphological characters, however, led Moran and Kornfield (1993) to suggest an ancestral polymorphism in the founding populations of the Lake Malawi flocks, which hinders an accurate determination of their phylogenetic relationships. In addition to this problem of polymorphism predating species divergence, the low number of mtDNA markers available is a limiting factor.
The low genetic diversity among cichlids revealed by allozyme data contrasts with the high polymorphism found at the Mhc (major histocompatibility complex) loci (Klein et al., 1993; Ono et al., 1993). Although some of this polymorphism is ancient (predating species divergence), the high number of different Mhc groups (loci) and alleles in cichlids might make the Mhc a useful genetic tool for studying cichlid phylogeny (Klein et al., 1997). The detection of a family of tandemly repeated satellite DNA elements in tilapia (Wright, 1989; Franck et al., 1992) has enabled Franck and co-workers (1994) to provide evidence for a close relationship of the mouthbrooding tilapiine genera Oreochromis and Sarotherodon in contrast to the substrate-spawning genus Tilapia. In that report, nucleotide differences between the satellite consensus sequences for each genus were used for the construction of a phylogenetic tree.
Using the molecular methods just described, the aim of elucidating the evolutionary history of cichlid species within the monophyletic groups of Lakes Malawi and Victoria has been achieved only partially, either because of the poor resolution achievable by the methods or because of the low number of polymorphic loci found. Thus, using more polymorphic nuclear DNA markers is the only means of making further progress in this field of research. The search for such new markers was greatly facilitated by the development of the polymerase chain reaction (PCR) (Saiki et al., 1988).
This chapter describes and discusses the application of two PCR-based methods. First, Sültmann et al. (1995) used the random amplification of polymorphic DNA (RAPD) technique (Williams et al., 1990; Welsh and McClelland, 1990) to identify polymorphic genomic loci, followed by locus-specific DNA amplification and sequence determination of the fragments. In a second (unpublished) approach, locus-specific PCR primers were used to amplify microsatellite repetitive elements to determine allele size frequencies among cichlid species from Lake Victoria. Nucleotide substitutions and allele frequency differences between species were then used to calculate genetic distance matrices and to construct phylogenetic trees.
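As a minimal sketch of the distance-matrix step, pairwise distances can be computed directly from aligned sequences. The Python example below uses the uncorrected proportion of differing sites (p-distance) and skips any position where either sequence has a gap. Both choices are assumptions made for illustration and are not necessarily the distance measure used in this chapter; the toy alignment and all function names are invented.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites among positions without a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # gapped positions are excluded from the comparison
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(alignment):
    """Symmetric nested-dict matrix of pairwise p-distances."""
    names = list(alignment)
    matrix = {name: {name: 0.0} for name in names}
    for x, y in combinations(names, 2):
        matrix[x][y] = matrix[y][x] = p_distance(alignment[x], alignment[y])
    return matrix


# Invented toy alignment; taxon_3 carries a one-base gap.
toy = {
    "taxon_1": "ACGTACGTAC",
    "taxon_2": "ACGTACGTAT",
    "taxon_3": "ACG-ACCTAT",
}

for name, row in distance_matrix(toy).items():
    print(name, {other: round(d, 3) for other, d in row.items()})
```

A matrix of this kind is the usual input to a distance-based tree-building method such as neighbor joining (Saitou and Nei, 1987).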
III. Random Amplification of Polymorphic DNA (RAPD)
The RAPD procedure (Welsh and McClelland, 1990; Williams et al., 1990) was originally developed as a method for fingerprinting genomes. PCR amplification is performed using a single oligonucleotide, typically a 10-mer primer, at low annealing temperatures (35-40°C; Fig. 1A). Depending on its sequence, the primer randomly anneals to an unknown segment on one of the DNA strands. In some cases, another annealing site will be present on the complementary strand not too distant from the first site and amplification will occur. When two species, strains, or individuals are compared, polymorphism between them will be revealed on agarose or polyacrylamide electrophoresis gels by the presence or absence of an amplification product. This method has been applied to the discovery of genetic markers for mapping studies (Serikawa et al., 1992; Postlethwait et al., 1994) and to elucidate phylogenetic relationships between bacterial species and strains (Welsh and McClelland, 1990; Smith et al., 1994) and tilapiine cichlid species (Bardakci and Skibinski, 1994). In the latter case, three species of the genus Oreochromis and four subspecies of Oreochromis niloticus could be distinguished. However, analyses of the reaction conditions (Ellsworth et al., 1993; Muralidharan and Wakeland, 1993; Smith et al., 1994; Bowditch et al., 1994) have shown that RAPD is highly sensitive to a wide range of factors: the quality of the template DNA, minute contaminations of RNA, the primer/template ratio, and small changes of the magnesium concentration. In addition, it is prone to producing spurious fragment variation (as shown, for example, by comparison of F1 hybrid DNA with parental DNA; Ayliffe et al., 1994) and other artifacts. Therefore, the procedure has been supplemented by sequencing the differential RAPD fragment and designing primers for locus-specific amplification in standard PCR.
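The annealing logic described above can be illustrated in silico. The Python sketch below is a simplification under stated assumptions: it requires exact primer matches (real RAPD reactions at low stringency tolerate mismatches), the two toy templates are invented, and the function names are hypothetical. The primer is TU984, the 10-mer quoted in the legend of Fig. 2.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def find_all(template, motif):
    """Start indices of every (possibly overlapping) exact match."""
    hits, start = [], template.find(motif)
    while start != -1:
        hits.append(start)
        start = template.find(motif, start + 1)
    return hits


def rapd_products(template, primer, max_len=2000):
    """Predicted product lengths for a single primer (exact matching only).

    A product needs the primer sequence on the top strand and, downstream
    of it, the primer's reverse complement (i.e., a site for the same
    primer on the opposite strand), no further apart than max_len.
    """
    template, primer = template.upper(), primer.upper()
    forward_sites = find_all(template, primer)
    reverse_sites = find_all(template, reverse_complement(primer))
    products = []
    for f in forward_sites:
        for r in reverse_sites:
            length = r + len(primer) - f
            if r >= f + len(primer) and length <= max_len:
                products.append(length)
    return sorted(products)


PRIMER = "GTGTGCCCCA"  # TU984, as given in the legend of Fig. 2
site_rc = reverse_complement(PRIMER)

# Invented templates: the second differs by one substitution in the
# downstream annealing site, so it lacks the opposing primer site.
template_with_band = "AT" * 5 + PRIMER + "ACGT" * 10 + site_rc + "TA" * 5
template_without_band = template_with_band.replace(site_rc, "TGGGACACAC")

print(rapd_products(template_with_band, PRIMER))     # [60]
print(rapd_products(template_without_band, PRIMER))  # []
```

The presence of a 60-bp product in one template and its absence in the other mimics the band presence/absence polymorphism scored on a gel.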
Although the RAPD polymorphism is presumed to be located at the annealing site of the 10-mer primer, it has been shown that the primer-binding sites are often identical between two samples showing polymorphic bands (Bowditch et al., 1994). The most likely explanation for this is that the formation of different secondary structures of the DNA templates, due to nucleotide substitutions outside the annealing sites, affects the accessibility of the annealing sites. To examine variation at the RAPD primer annealing site, the "vectorette" technique described by Riley and co-workers (1990) was also applied. Genomic DNA was digested with restriction endonucleases, and so-called vectorette linkers were ligated to all fragments (Fig. 1B). The vectorette linkers consisted of two oligonucleotides that were complementary to each other at their 5' and 3' ends, but contained a central mismatched region. In the subsequent PCR, the first-strand DNA synthesis primed by a locus-specific oligonucleotide was essential for the generation of the binding site of the so-called vectorette primer that specifically annealed to the complementary strand of the mismatched region of the vectorette linker. Thus, specific exponential amplification of the flanking region occurred. The PCR products were then cloned and sequenced by standard methods.
Using two DNA samples (shown to be devoid of RNA by ethidium bromide staining) from Pseudotropheus zebra and Melanochromis auratus, the RAPD conditions that yielded the most reproducible results were determined and then kept constant in subsequent experiments. The conditions were as follows: 50-60% G+C content for the 10-mer primer (see Sültmann et al., 1995), which was used at a concentration of 4 µM in the PCR (a combination of two 10-mer primers can also be used for the amplification); 100 µM each of dATP, dCTP, dGTP, and dTTP; 2.5 units of Taq polymerase; and 100 ng of template DNA in a total reaction volume of 25 µl in 1× reaction buffer containing 1.5 mM magnesium chloride. The PCR program consisted of 45 sec at 93°C, 15 sec annealing at 35-42°C, and 10 min primer extension at 72°C, followed by 35 to 45 cycles, each of 15 sec at 93°C, 15 sec annealing at 35-42°C, and 3 min primer extension at 72°C. The reaction was completed by a final primer extension step for 10 min at 72°C. Only those cases which gave concordant banding patterns for two individuals of each species were examined further. Figure 2 shows an example of a typical result of the RAPD reaction where a polymorphic band of about 400 bp in size is present in P. zebra but absent from M. auratus.
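The reagent amounts and cycling times quoted above can be tallied with a few lines of arithmetic. The following Python sketch is only an illustration: the step durations and concentrations are copied from the text, the class and function names are hypothetical, and thermocycler ramp times (not stated in the text) are ignored, so the totals are lower bounds on the actual run time.

```python
from dataclasses import dataclass


@dataclass
class Step:
    label: str
    seconds: int


# Durations as read from the text; temperatures appear only in the labels.
initial_cycle = [
    Step("denature, 93 C", 45),
    Step("anneal, 35-42 C", 15),
    Step("extend, 72 C", 600),
]
main_cycle = [
    Step("denature, 93 C", 15),
    Step("anneal, 35-42 C", 15),
    Step("extend, 72 C", 180),
]
final_extension = Step("final extension, 72 C", 600)


def hold_time_minutes(n_cycles):
    """Total programmed hold time in minutes, excluding ramping."""
    total = sum(step.seconds for step in initial_cycle)
    total += n_cycles * sum(step.seconds for step in main_cycle)
    total += final_extension.seconds
    return total / 60


def picomoles(conc_micromolar, volume_microlitres):
    """Amount of solute: 1 uM in 1 ul corresponds to 1 pmol."""
    return conc_micromolar * volume_microlitres


print(hold_time_minutes(35), hold_time_minutes(45))  # 143.5 178.5
print(picomoles(4, 25))    # 100 pmol of primer per 25-ul reaction
print(picomoles(100, 25))  # 2500 pmol of each dNTP per reaction
```

Under these assumptions the program occupies roughly 2.5 to 3 hours of hold time, most of it spent in the 3-min extension steps.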
FIGURE 1 (A) Schematic outline of the RAPD method. See text for details. (B) Schematic outline of the vectorette approach. Genomic DNA is digested with a restriction enzyme (R). A vectorette linker, composed of two oligonucleotides that are complementary only at the ends and leave a central mismatched part, is ligated to the DNA fragments. In a PCR the synthesis of first-strand DNA is primed by an oligonucleotide specific for the target segment of a known sequence (shown as black box) and extended into the flanking parts and vectorette linker. This strand is used as a template in subsequent PCR cycles by the vectorette primer, which is located in the mismatched segment of the linker.
[Figure 1 flowchart labels. Panel A: genomic DNA and 10-mer primer (X); denaturing and annealing at low stringency; RAPD PCR under low-stringency conditions (no exponential amplification from species A, exponentially amplified DNA fragment from species B); subcloning, sequencing, and construction of specific primers Y and Z; standard PCR using DNA of many species as templates; subcloning, sequencing, and phylogenetic analysis. Panel B: restriction digest of genomic DNA; ligation of vectorette linker containing a central mismatched segment; heat denaturation; first-strand synthesis with target primer; PCR with target and vectorette primers.]
The fragments were subcloned in pUC18 or M13 vectors and sequenced (Sanger et al., 1977). From three of these sequences, specific primers for amplification were constructed. The corresponding loci were called DXTU1, DXTU2, and DXTU3 (for details, see Sültmann et al., 1995). In the specific PCR, the following observations were made:
1. Polymorphism of locus-specific PCR products was frequently observed. The proportion of polymorphic versus monomorphic loci obtained by this procedure was estimated to be higher than 50%.
2. The applicability of the specific primers varied depending on the locus examined. At DXTU1, products were obtained from neotropical as well as from west and east African cichlid species, whereas at DXTU2, no products were found in cichlid species outside the Lake Victoria and Lake Malawi regions. These results are most likely due to different extents of conservation at the primer-binding sites.
3. Another notable feature of the specific PCR was the appearance of several by-products in addition to the band of the expected size. Since the possibility of amplification from multiple related loci (e.g., diversified repeats) could not be excluded in some cases, a third primer was used to prove the singularity of the amplified region in the cichlid genome.
Sequence variability in the DNA fragments resulting from specific PCR can also be examined using single-stranded conformation polymorphism (SSCP;
Orita et al., 1989a,b) analysis. In this approach, distinct banding patterns of PCR products from different species indicate sequence differences between species at a single locus.
FIGURE 2 Products obtained by RAPD PCR with the 10-mer primer TU984 (5' GTGTGCCCCA 3'). Products from Pseudotropheus zebra (lanes 1 and 2) and Melanochromis auratus (lanes 3 and 4) were separated on a 2% agarose gel. The left arrow indicates a 400-bp band present only in lanes 1 and 2. Lane 5 contains DNA size marker. The arrows on the right denote marker sizes (1358, 1078, 872, and 603 bp).
The polymorphic locus DXTU1 was selected for a detailed sequence analysis using the GCG software package (Devereux and Haeberli, 1991) or the Clustal V program (Higgins et al., 1992). The following analysis of representative sequences at the DXTU1 locus is shown (contact author for raw data):
1. It is remarkable that insertions or deletions (indels) constitute about one-quarter of the total number of polymorphic sites found at the DXTU1 and the other genomic loci. Although nucleotide substitutions are commonly used in phylogenetic tree construction, indels are normally excluded, yet the number of possible ways by which nucleotides can be inserted, deleted, or rearranged is nearly unlimited, in contrast to the three possibilities by which nucleotides can be substituted at a single site. Thus, data could be analyzed by two different methods: the standard tree construction method based on genetic distances and the neighbor-joining algorithm of Saitou and Nei (1987), and the cladistic analysis with the PAUP program version 3.1.1 (Swofford, 1993), in which shared indels were treated as synapomorphies.
2. There is considerable agreement between the distance tree (Fig. 3) and the cladogram (Fig. 4) based on the DXTU1 sequences. Although the evolutionary forces acting on the single loci may vary, the topologies of the neighbor-joining trees constructed for other loci than DXTU1 were congruent with the DXTU1 tree (see Sültmann et al., 1995). However, low bootstrap values with respect to certain branching patterns in the neighbor-joining tree (e.g., the position of haplochromines in Lake Malawi, Fig. 3) suggest that longer sequences or more loci are required for a more precise
FIGURE 3 Neighbor-joining tree (Saitou and Nei, 1987) of the sequences at the DXTU1 locus. Genetic distances were calculated using Kimura's (1980) two-parameter method. The numbers at each node represent percentage recovery of the particular node in 1000 bootstrap replications. The scale gives genetic distance (0 to 0.07).
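As a minimal sketch of the distance calculation named in the legend to Fig. 3, the following Python function computes Kimura's (1980) two-parameter distance for one pair of aligned sequences. It is an illustration only and not the GCG or Clustal V implementation used for the analysis; the function name and the decision to skip gapped or ambiguous sites are assumptions made here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an assumption of this sketch, not a statement about the original analysis).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code: ignore this site
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    term = (1 - 2 * p - q) * math.sqrt(1 - 2 * q) if 1 - 2 * q > 0 else 0.0
    if term <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term)
```

A full distance matrix for tree construction would be obtained by applying the function to every pair of sequences in the alignment.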
HOLGER SÜLTMANN AND WERNER E. MAYER
FIGURE 4 Cladogram of 20 representative taxa based on presence or absence of indels and substitutions at the DXTU1 locus. The tree resulted from 500 bootstrap replications using the heuristic search option of the PAUP program version 3.1.1 (Swofford, 1993). The numbers at each node represent percentage recovery of the particular branching order. Cichla No. 15 was used as an outgroup. Taxa, from top to bottom: Lake Malawi Haplochromis (Melanochromis auratus-1, Pseudotropheus zebra-1, Tyrannochromis macrostoma-1, Cyrtocara moorii-ZR216); Lake Victoria Haplochromis (Haplochromis xenognathus-114, H. velifer-602, H. nigricans-268); non-endemic species (Astatoreochromis alluaudi-771); Lake Tanganyika Cyphotilapia (Cyphotilapia frontosa-143, C. frontosa-144); Lake Tanganyika Julidochromis and Neolamprologus (Julidochromis regani-60, Neolamprologus leleupi-135, N. tretocephalus-140, N. brevis-63); tilapiines (Alcolapia alcalicus-462, Oreochromis niloticus-LS7, O. urolepis-LS10); West African species (Tylochromis leonensis-PR2); Neotropical species (Thorichthys meeki-#55, Cichla-#15).
determination of species relationships. The trees (Figs. 3 and 4) of the DXTU1 sequences led to the following conclusions.
First, the neotropical cichlid species Cichla sp. and Thorichthys meeki form a sister group to the African cichlids. The position of neotropical cichlids indicated by the molecular analysis is consistent with the results of morphological analysis (Cichocki, 1976; Oliver, 1984; Stiassny, 1991), which has revealed a set of derived characters uniting African cichlids (with the exception of Heterochromis) into a monophyletic group.
Second, in the phylogram, the west African species Tylochromis leonensis is in a sister-group relationship with the east African species (the tilapiines, represented here by the genus Oreochromis from east African rivers and Alcolapia alcalicus from Lake Natron; the Lake Tanganyika genera Neolamprologus, Julidochromis, and Cyphotilapia; and the Lake Malawi and Lake Victoria species). In contrast, in the cladogram, Tylochromis appears as a sister group to the tilapiines.
Third, the monophyly of the east African cichlids considered here (tilapiines, haplochromines, and the Neolamprologus and Cyphotilapia genera of Lake Tanganyika) is indicated both by nucleotide substitution and by indel patterns. This branching order is also supported by mitochondrial DNA data (Meyer, 1993).
Fourth, the tilapiines form a monophyletic sister group to the remaining east African Great Lake species and genera (haplochromines, Cyphotilapia, Astatoreochromis, and lamprologines). This result is concordant with morphological analyses (Regan, 1920; Trewavas, 1983) and other molecular studies (Kornfield et al., 1979; McAndrew and Majumdar, 1984; Seyoum, 1989; Sodsuk and McAndrew, 1991; Franck et al., 1994).
Fifth, the species Astatoreochromis alluaudi, which is not endemic to Lake Victoria but is also found in other east African lakes and rivers, is a sister group of the included east African lake genera. This result, as well as the sister-group placement of Julidochromis and Neolamprologus with respect to the Lake Malawi and Lake Victoria flocks, is also supported by mitochondrial DNA sequence data (Meyer et al., 1990).
Sixth, the sister-group relationship of the Tanganyikan species Cyphotilapia frontosa to Lake Malawi haplochromines suggested by the NJ tree (Fig. 3) is consistent with allozyme data (Kornfield, 1991), according to which Cyphotilapia is more closely related to the haplochromines of Lake Malawi than to those of Lake Tanganyika. This result further supports the polyphyletic structure of the Lake Tanganyika flocks. However, the cladogram (Fig. 4) places Cyphotilapia in a sister-group position to the Lake Malawi and Lake Victoria species on the one hand and the Lake Tanganyika Julidochromis and Neolamprologus genera on the other hand, thus supporting the sister-group relationship of Cyphotilapia with other Lake Tanganyika cichlid flocks inferred from mtDNA control region data (Kocher et al., 1993).
Finally, the monophyly of the endemic Lake Victoria haplochromines, as suggested by both trees, is consistent with the results of morphological studies (see Greenwood, 1978; Trewavas, 1983) and mitochondrial DNA analyses (Meyer et al., 1990; Sturmbauer and Meyer, 1992; Meyer, 1993; Moran and Kornfield, 1993). The finding of several indels, which are probably species specific, suggests that it may be possible to elucidate the relationships within species flocks using RAPD markers.
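The treatment of shared indels as synapomorphies can be illustrated with a simple presence/absence coding. The Python sketch below is not the procedure that was applied with PAUP. It assumes that an indel event is identified by the exact start and end of a gap run in the alignment, and the function names and the toy alignment are hypothetical.

```python
from typing import Dict, List, Tuple

GapRun = Tuple[int, int]  # (start, end) of a gap run, end exclusive


def gap_runs(seq: str) -> List[GapRun]:
    """Return all maximal runs of '-' in an aligned sequence."""
    runs: List[GapRun] = []
    start = None
    for i, ch in enumerate(seq):
        if ch == "-":
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(seq)))
    return runs


def code_indels(alignment: Dict[str, str]) -> Tuple[List[GapRun], Dict[str, List[int]]]:
    """Code every distinct gap run as a binary character (1 = present)."""
    per_taxon = {name: set(gap_runs(seq)) for name, seq in alignment.items()}
    events = sorted(set().union(*per_taxon.values())) if per_taxon else []
    matrix = {
        name: [1 if event in runs else 0 for event in events]
        for name, runs in per_taxon.items()
    }
    return events, matrix


# Toy alignment (hypothetical data): taxa A and B share a 2-bp deletion.
toy = {
    "A": "ACG--TAC",
    "B": "ACG--TAC",
    "C": "ACGTTTAC",
    "D": "A-GTTTAC",
}
events, matrix = code_indels(toy)
for name, row in matrix.items():
    print(name, row)
```

In this coding a gap shared by two taxa yields an identical character state for both, which is the property that a parsimony program can then evaluate as a potential synapomorphy.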
IV. Allele Size Frequencies at Dinucleotide Microsatellite Loci

Microsatellites are tandemly repeated DNA sequences with repeat units of 1 to 6 bp in length and 10 to 100 units per locus (Charlesworth et al., 1994). They have been used for the construction of genetic maps in humans (Hearne et al., 1992; LeBlanc-Straceski et al., 1994) and other species. Variable repeat numbers have also been implicated in disease and cancer susceptibility (Wooster et al., 1994). The rate of mutations generating microsatellite repeat number variation is highest among all nuclear DNA markers; estimations for dinucleotide repeats range from 10⁻² to 10⁻⁴ per generation (Jeffreys et al., 1988; Weber and Wong, 1993). This high mutation rate makes microsatellites a promising tool for population genetic analyses. Consequently, a number of studies have made use of microsatellites for determining relationships among populations of humans (Bowcock et al., 1994; Deka
et al., 1995), wolves (Roy et al., 1994), sheep (Buchanan et al., 1994), and toads (Scribner et al., 1994). Variability is believed to occur by the stepwise addition or subtraction of single repeat units after mispairing of the two DNA strands during the replication process (stepwise mutation model, SMM; Levinson and Gutman, 1987; Schlötterer and Tautz, 1992). It has been shown, however (DiRienzo et al., 1994), that the SMM does not fully explain observed allele frequency distributions within populations: although allelic variation at dinucleotide repeat loci is predominantly due to single step mutations, rare changes of more than one repeat unit may occur as well. Furthermore, unequal crossing-over during meiosis may also contribute to the generation of polymorphism at the microsatellite loci.
In cichlids, microsatellites have been used to study the mating behavior of Lake Malawi species (Kellogg et al., 1995). Cichlid fish phylogenies based on microsatellite data have not yet been published. However, the determination of allele size frequency distributions in distinct species from Lake Malawi and Lake Victoria, followed by the calculation of distance matrices, may provide the most promising means for reconstructing their phylogenies.
In order to generate allele size data, subgenomic libraries with small insert sizes (200-1000 bp) from P. zebra (Lake Malawi) and Haplochromis nigricans (Lake Victoria) were constructed in the λgt10 phage vector. The libraries were screened with the CA dinucleotide repeat-specific oligonucleotide (CA)15C, and hybridizing clones were isolated and sequenced (Sambrook et al., 1989). The clones contained stretches of CA(GT)-repeated DNA with repeat numbers ranging from 8 to 90. Sequence-specific primers flanking the entire repetitive element at the particular locus were then used for PCR amplification with genomic DNA from Lake Victoria cichlids as templates. One of the primers was labeled with fluorescein at its 5' end.
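The stepwise mutation model described above, together with the rare multistep changes reported by DiRienzo et al. (1994), can be made concrete with a short simulation. The following Python sketch follows a single allele lineage. The function name, the lower bound of one repeat, and the 2 to 5 unit size of multistep changes are assumptions chosen for illustration and are not taken from the cited studies.

```python
import random
from typing import List, Optional


def simulate_smm(
    start_repeats: int,
    generations: int,
    mutation_rate: float,
    multi_step_prob: float = 0.0,
    rng: Optional[random.Random] = None,
    min_repeats: int = 1,
) -> List[int]:
    """Follow the repeat number of one allele lineage through time.

    Each generation the allele mutates with probability `mutation_rate`.
    A mutation adds or removes one repeat unit; with probability
    `multi_step_prob` it instead changes by 2 to 5 units (hypothetical range).
    """
    if rng is None:
        rng = random.Random()
    repeats = start_repeats
    history = [repeats]
    for _ in range(generations):
        if rng.random() < mutation_rate:
            step = 1
            if rng.random() < multi_step_prob:
                step = rng.randint(2, 5)
            direction = rng.choice((-1, 1))
            # The repeat number is not allowed to fall below min_repeats.
            repeats = max(min_repeats, repeats + direction * step)
        history.append(repeats)
    return history


# Example: 10,000 generations at a rate within the 1e-2 to 1e-4 range
# quoted in the text for dinucleotide repeats.
lineage = simulate_smm(12, 10_000, 1e-3, multi_step_prob=0.05, rng=random.Random(0))
print(lineage[0], lineage[-1])
```

Running many such lineages from a common ancestor gives an impression of how quickly allele size distributions in separated populations can drift apart, and of how they may later overlap again by chance.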
The PCR products obtained from polymorphic loci were separated on a denaturing polyacrylamide gel in an automated sequencing apparatus. Bands were detected as fluorescence intensities of the labeled DNA strands, and their sizes were automatically determined by comparison with a size standard. Of the several microsatellite loci typed, the DXTUCA15 locus is shown here as an example. This locus was amplified from haplochromine genomic DNA with the primers MS16 (5' GCTGTGTAATCCCAAACTCC 3') and MS17 (5' GTATTTAGcTTTCCTCTGTGCT 3') by PCR with one 45-sec cycle at 93°C, 15 sec at 55°C, and 10 min at 72°C, followed by 35 cycles, each 15 sec at 93°C, 15 sec at 55°C, and 1 min at 72°C. The reaction was completed by a final primer extension
step for 10 min at 72°C. As templates, genomic DNA samples from the Lake Victoria (and minor adjacent lakes like Lake Nabugabo, Kayugi, and Kayania) Haplochromis species H. beadlei (number of individuals, n = 12), H. cinctus (n = 19), H. laparogramma (n = 12),
H. nigricans (n = 15), H. nyererei (n = 40), H. plagiodon (n = 19), H. pyrrhocephalus (n = 43), H. sauvagei (n = 17), H. velifer (n = 81), and H. xenognathus (n = 29) were used. Individuals from each species were captured at two to six different locations in the wild. In the PCR, the primers amplified DNA fragments with sizes ranging from 75 to 93 bp. Size differences were due to variation of the number of CA repeat units, as determined by subcloning and sequencing random clones (data not shown). Although data are still preliminary, some interesting results have already been obtained from the specific amplification of cichlid microsatellite loci. First, in most of the amplifications, one or two products were visible (corresponding to homo- or heterozygosity of the individual at the particular locus). However, additional artifactual bands often appeared
due to amplification at other loci. These by-products sometimes interfered with the precise determination of microsatellite size. Second, the size determination was also hampered by the occurrence of so-called "shadow bands" flanking the highest peak in a cluster of products. In the case of dinucleotide repeats, shadow bands usually differ by 2 bp in size. This observation suggests that they may have been generated by the insertion or deletion of repeat units during PCR amplification (Litt et al., 1993). Assuming that a similar mechanism generated shadow bands in different reactions, the peak with the largest area within a peak cluster was used to determine the allele size. Third, inhomogeneities within the polyacrylamide gel may lead to incorrect measurement of product sizes. To assess this possibility, the allele sizes at certain loci determined by gel electrophoresis were compared with those obtained by subcloning and sequencing of the same PCR products. From these data it was concluded that the error was no larger than one repeat unit. The summary of the allele size determination for each species is shown in Fig. 5. Allele frequencies (y
FIGURE 5 Allele frequency distributions for 10 Haplochromis species from the Lakes Victoria, Kayugi, Kayania, and Nabugabo at the microsatellite locus DXTUCA15. Frequencies (y axis) are plotted against the number of repeat units (x axis) found in the fragment analysis (see text for details). The number of individuals included in each sample is given by n.
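The conversion from fragment length to repeat number that underlies the x axis of Fig. 5 can be sketched in a few lines of Python. The flank length of 59 bp used in the example is an assumption inferred here from the reported fragment sizes (75 to 93 bp) and the plotted range of 8 to 17 dinucleotide repeats; it is not a value stated by the authors. The function names and the sample genotypes are likewise hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def repeat_units(fragment_bp: int, flank_bp: int, unit_bp: int = 2) -> int:
    """Number of repeat units in a PCR product of the given length.

    `flank_bp` is the number of unique (non-repeat) nucleotides in the
    fragment, including both primers.
    """
    repeat_bp = fragment_bp - flank_bp
    if repeat_bp < 0 or repeat_bp % unit_bp != 0:
        raise ValueError(f"fragment of {fragment_bp} bp does not fit the repeat model")
    return repeat_bp // unit_bp


def allele_frequencies(
    genotypes: Iterable[Tuple[int, int]], flank_bp: int, unit_bp: int = 2
) -> Dict[int, float]:
    """Allele frequencies by repeat number from diploid genotypes.

    Each genotype is a pair of fragment sizes in bp; a homozygote is
    entered as two identical sizes.
    """
    counts = Counter(
        repeat_units(size, flank_bp, unit_bp)
        for genotype in genotypes
        for size in genotype
    )
    total = sum(counts.values())
    return {allele: n / total for allele, n in sorted(counts.items())}


# Hypothetical sample of three individuals, assuming a 59-bp flank.
sample = [(75, 75), (75, 77), (79, 93)]
print(allele_frequencies(sample, flank_bp=59))
```

Fragment sizes that do not differ from the flank length by a whole number of repeat units raise an error in this sketch, which is one simple way of flagging products that may be shadow bands or amplification by-products.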
axis) are plotted against the number of repeat units (x axis) calculated from the PCR product size by subtraction of the number of unique nucleotides in the fragment. Differences in frequency distributions between species are indicated by shape variations between the individual plots. Frequency data were used as the input for the microsat 1.4 computer program (written by Eric Minch; Goldstein et al., 1995), which calculates various distance measurements on the basis of allele frequencies (e.g., average squared difference in repeat numbers, Nei's identity, proportion of shared alleles). The basic assumption of the program is the validity of the stepwise mutation model (see also Valdes et al., 1993; Slatkin, 1995). It is important to note, however, that the algorithm is not dependent on the distribution of allele sizes within the species. Nei's identity method (Nei, 1972) was used for the calculation of a distance matrix (Fig. 6), which was then used to construct a phylogenetic tree with the PHYLIP software package (Felsenstein, 1986-1993). The tree is shown in Fig. 7. It can be divided into two major branches, one of which comprises Haplochromis nyererei, H. nigricans, H. plagiodon, H. pyrrhocephalus, and H. laparogramma, whereas H. beadlei, H. cinctus, H. sauvagei, and H. xenognathus appear on the second major branch. Haplochromis velifer is located at an intermediate position.
Testing whether microsatellite allele frequencies reconstruct the true phylogeny of the closely related cichlid species from Lake Victoria is hampered by the scarcity of synapomorphic morphological characters. Most of the available studies have focused on the feeding habits, jaw morphology, and dentition (Greenwood, 1974, 1979, 1980; Witte and van Oijen, 1990). On the basis of these data, the species included in the phylogenetic tree (Fig. 7) can be subdivided into two major trophic groups (Witte and van Oijen, 1990), one of which is the planktivorous/algae-eating group Haplochromis cinctus (phytoplankton), H. laparogramma, H. pyrrhocephalus, H. nyererei (zooplankton), and H. nigricans (epilithic algae grazer). The other trophic group consists of the oral shell/mollusc crushers H. plagiodon, H. sauvagei, and H. xenognathus as well as H. beadlei, which is considered to be a sister species of Haplochromis plagiodon (Greenwood, 1980). In the phylogenetic tree generated by microsatellite data, this grouping is roughly reflected in the major branching pattern. The exceptions to this are H. plagiodon and H. cinctus, which are unexpectedly located on the opposite branches. Thus, the results suggest that microsatellite data can be used to make a rough subdivision of some Lake Victoria cichlid species which corresponds to their feeding habits. Whether the congruence between phylogenetic position and trophic grouping is a rule for haplochromines in general remains to be examined. Certainly, multiple microsatellite loci will have to be analyzed in order to generate more reliable and independent data sets.

        Habe    Haci    Hala    Hani    Hany    Hapl    Hapy    Hasa    Havl
Haci  -0.086
Hala  -0.030   0.051
Hani  -0.016   0.088  -0.078
Hany   0.030   0.158  -0.013  -0.046
Hapl  -0.050   0.024  -0.079  -0.078  -0.029
Hapy   0.088   0.117  -0.023   0.040   0.157   0.036
Hasa  -0.095  -0.015   0.017   0.009   0.023  -0.024   0.168
Havl  -0.008   0.124  -0.018  -0.034  -0.006  -0.022   0.141  -0.001
Haxe   0.072   0.234   0.248   0.152   0.083   0.141   0.547   0.033   0.101

FIGURE 6 Distance matrix obtained with microsatellite allele frequency data for the 10 Haplochromis species shown in Fig. 5. Nei's identity method (Nei, 1972) was used for the generation of the matrix with the program microsat 1.4 (Goldstein et al., 1995).

FIGURE 7 Neighbor-joining tree (Saitou and Nei, 1987) constructed using the distance matrix from Fig. 6 as input data for the PHYLIP distance algorithm (Felsenstein, 1986-1993). Taxa, from top to bottom: Haplochromis laparogramma, H. pyrrhocephalus, H. plagiodon, H. nigricans, H. nyererei, H. velifer, H. beadlei, H. cinctus, H. sauvagei, and H. xenognathus. The scale gives relative branch length (0 to 0.15).

V. Critical Evaluation Using RAPD and Microsatellite Allele Frequencies for the Reconstruction of Cichlid Fish Phylogeny

The recent adaptive radiation of cichlid fishes in Lake Malawi and Lake Victoria has produced closely related species flocks. The reconstruction of their phylogeny requires new methods capable of resolving genetic distances generated within short time spans. Because the available markers (mtDNA, allozymes, morphology) have achieved the goal of clarifying the Lake Victoria and Lake Malawi cichlid phylogeny only marginally, two different approaches that are both based on nuclear DNA markers were studied. The goal was to test the validity of current hypotheses on cichlid fish relationships.
The RAPD-based sequence comparison requires relatively few samples from the species under consideration, and data collection and analysis are comparatively easy to carry out. The chance of finding interspecies variation in the set of random sequences is high. A prerequisite of the method as described here, however, is complete lineage sorting of the particular RAPD marker. To distinguish young species, therefore, frequency data are necessary. Subcloning and sequencing can be performed by established methods, but are time-consuming. Sequence data provide two types of characters, substitutions and indels, both of which can be used in separate phylogenetic analyses. The results obtained thus far agree well with previously reported molecular data and support the use of this method for molecular evolutionary studies of cichlid fishes. Yet, because some conflicting hypotheses (e.g., the position of Cyphotilapia with respect to the other east African Lake cichlids) could not be clearly resolved, the number of RAPD markers has to be increased in order to obtain phylogenetic trees with higher bootstrap support values. The likelihood of detecting synapomorphic characters between related species increases with the time that has passed since species separation. In this respect, the evolutionarily young (less than 2 MY) species flocks from Lake Malawi and Lake Victoria, which are the interesting ones regarding the speciation process, will require many more nuclear DNA loci and sequencing. For the Lake Victoria haplochromines, the feasibility of obtaining such markers has been shown with the RAPD approach.
The microsatellite approach makes use of the high mutation rate of short tandemly repeated sequences. Therefore, this method is more suitable for the determination of relationships between closely related species like the haplochromines of Lake Malawi and Lake Victoria. Once the polymorphic loci have been identified and the variation has been shown to be due to the repeat numbers, allele typing can be performed much more quickly than with sequencing approaches. However, because the method employs frequency data, the requirement for large sample sizes per species may be a major obstacle to generating reliable data sets. It has been shown for di- to hexanucleotide microsatellite loci that the variance of the stepwise-weighted genetic distance does not change significantly when more than 25 individuals per species are used (Shriver et al., 1995). This number thus defines the preferred sample size. Two limiting factors hinder the use of microsatellite
allele size typing: First, at oligomorphic loci the allele frequency distributions are similar in all species because the time for generation of variability has been too short or polymorphism has predated the speciation process. Second, in the case of convergence the allele frequency distributions of the species are similar because the process of generation of variability has reached an equilibrium state in all the species.
In addition, to use the microsatellite approach effectively, several theoretical considerations have to be resolved: First, there is still uncertainty concerning the mechanism generating the variability (Schlötterer and Tautz, 1992; DiRienzo et al., 1994). Thus, the available models might have to be refined once the mechanism of generation of repeat number variation has been elucidated. Second, the possibility of selection acting on microsatellite repeat number cannot be excluded and may lead to inconsistent results when loci are compared. Third, ignorance of interspecific hybridization events introduces a high degree of uncertainty concerning the topology of the phylogenetic tree. Fourth, errors due to inadequate sample size and possible kinships between cichlid fish taxa are difficult to evaluate and may bias the observed genetic distances between species.
In general, RAPD and the microsatellite approach are both able to detect polymorphism between closely related taxonomic groups. With respect to cichlid phylogeny, RAPD can be primarily applied to genera under comparison. In contrast, the microsatellite method should be applied to the species and population level. Despite the unresolved problems with microsatellites, it is the authors' opinion that they are the best tool so far among all the available methods for studying cichlid phylogeny. Nonetheless, the search for additional polymorphic nuclear DNA markers should be continued because these will provide excellent markers for testing the validity of phylogenetic hypotheses.
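Because the microsatellite approach rests on distances computed from allele frequencies, a worked form of one such measure may be helpful. The Python sketch below implements the uncorrected single-locus form of Nei's (1972) identity and standard distance. It is not the microsat 1.4 code, the function names are hypothetical, and no sample-size correction is applied, so unlike some entries in Fig. 6 it cannot return negative values.

```python
import math
from typing import Dict

Frequencies = Dict[int, float]  # repeat number -> allele frequency


def nei_identity(freqs_x: Frequencies, freqs_y: Frequencies) -> float:
    """Nei's (1972) normalized identity I for one locus."""
    alleles = set(freqs_x) | set(freqs_y)
    j_xy = sum(freqs_x.get(a, 0.0) * freqs_y.get(a, 0.0) for a in alleles)
    j_x = sum(p * p for p in freqs_x.values())
    j_y = sum(p * p for p in freqs_y.values())
    if j_x == 0 or j_y == 0:
        raise ValueError("each population needs at least one allele")
    return j_xy / math.sqrt(j_x * j_y)


def nei_distance(freqs_x: Frequencies, freqs_y: Frequencies) -> float:
    """Nei's standard genetic distance D = -ln(I)."""
    identity = nei_identity(freqs_x, freqs_y)
    if identity == 0:
        return math.inf  # no shared alleles at this locus
    return -math.log(identity)


# Hypothetical allele frequencies (keys are repeat numbers).
species_a = {8: 0.5, 9: 0.5}
species_b = {9: 0.25, 10: 0.75}
print(nei_distance(species_a, species_b))
```

For several loci, Nei's formulation averages the three sums over loci before forming the ratio; the resulting pairwise values would then fill a matrix of the kind shown in Fig. 6.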
Acknowledgments

We thank Jan Klein for critical reading of the manuscript and helpful suggestions, Herbert Tichy for discussions on the cichlid species and providing the samples, Eric Minch from the Department of Genetics, Stanford University, for the microsat 1.4 program and help in getting it started, and Lynne Yakes for editorial assistance.
References

Ayliffe, M. A., Lawrence, G. J., Ellis, J. G., and Pryor, A. J. 1994. Heteroduplex molecules formed between allelic sequences cause nonparental RAPD bands. Nucleic Acids Res. 22:1632-1636.
Bardakci, F., and Skibinski, D. O. F. 1994. Application of the RAPD technique in tilapia fish: Species and subspecies identification. Heredity 73:117-123.
Bowcock, A. M., Ruiz-Linares, A., Tomfohrde, J., Minch, E., Kidd, J. R., and Cavalli-Sforza, L. L. 1994. High resolution of human evolutionary trees with polymorphic microsatellites. Nature 368:455-457.
Bowditch, B. M., Albright, D. G., Williams, J. G. K., and Braun, M. J. 1994. Use of randomly amplified polymorphic DNA markers in comparative genome studies. Meth. Enzymol. 224:294-309.
Bowers, N., Stauffer, J. R., and Kocher, T. D. 1994. Intra- and interspecific mitochondrial DNA sequence variation within two species of rock-dwelling cichlids (Teleostei: Cichlidae) from Lake Malawi, Africa. Mol. Phylogenet. Evol. 3(1):75-82.
Brown, W. M., George, M., Jr., and Wilson, A. C. 1979. Rapid evolution of animal mitochondrial DNA. Proc. Natl. Acad. Sci. USA 76:1967-1971.
Buchanan, F. C., Adams, L. J., Littlejohn, R. P., Maddox, J. F., and Crawford, A. M. 1994. Determination of evolutionary relationships among sheep breeds using microsatellites. Genomics 22:397-403.
Charlesworth, B., Sniegowski, P., and Stephan, W. 1994. The evolutionary dynamics of repetitive DNA in eukaryotes. Nature 371:215-220.
Cichocki, F. P. 1976. "Cladistic History of Cichlid Fishes and Reproductive Strategies of the American Genera Acarichthys, Biotodoma and Geophagus," Vol. 1. Ph.D. thesis, University of Michigan, Ann Arbor, MI.
Cohen, A. S., Soreghan, M. J., and Scholz, C. A. 1993. Estimating the age of formation of lakes: An example from Lake Tanganyika, East African rift system. Geology 21:511-514.
Deka, R., Jin, L., Shriver, M. D., Yu, L. M., Decroo, S., Hundrieser, J., Bunker, C. H., Ferrell, R. E., and Chakraborty, R. 1995. Population genetics of dinucleotide (dC-dA)n·(dG-dT)n polymorphisms in world populations. Am. J. Hum. Genet. 56:461-474.
Devereux, J., and Haeberli, P. 1991. "Genetics Computer Group, Program manual for the GCG package, Version 7," April 1991, Madison, WI.
DiRienzo, A., Peterson, A. C., Garza, J. C., Valdes, A. M., Slatkin, M., and Freimer, N. B. 1994. Mutational processes of simple-sequence repeat loci in human populations. Proc. Natl. Acad. Sci. USA 91:3166-3170.
Ellsworth, D. L., Rittenhouse, K. D., and Honeycutt, R. L. 1993. Artifactual variation in randomly amplified polymorphic DNA banding patterns. BioTechniques 14:214-217.
Felsenstein, J. 1986-1993. "PHYLIP: Phylogenetic Inference Package Version 3.5c." University of Washington.
Franck, J. P. C., Kornfield, I., and Wright, J. M. 1994. The utility of SATA satellite DNA sequences for inferring phylogenetic relationships among the three major genera of tilapiine cichlid fishes. Mol. Phylogenet. Evol. 3:10-16.
Franck, J. P. C., Wright, J. M., and McAndrew, B. J. 1992. Genetic variability in a family of satellite DNAs from tilapia (Pisces: Cichlidae). Genome 35:719-725.
Fryer, G., and Iles, T. D. 1972. "The Cichlid Fishes of the Great Lakes of Africa." Oliver and Boyd, Edinburgh.
Gaemers, P. A. M. 1984. Taxonomic position of the Cichlidae as demonstrated by the morphology of their otoliths. Neth. J. Zool. 34:566-595.
Goldstein, D. B., Linares, A. R., Cavalli-Sforza, L. L., and Feldman, M. W. 1995. An evaluation of genetic distances for use with microsatellite loci. Genetics 139:463-471.
Greenwood, P. H. 1974. Cichlid fishes of Lake Victoria, East Africa: The biology and evolution of a species flock. Bull. Br. Mus. Nat. Hist. (Zool.) Suppl. 6:1-134.
Greenwood, P. H. 1978. A review of the pharyngeal apophysis and its significance in the classification of African cichlid fishes. Bull. Br. Mus. Nat. Hist. (Zool.) 33:297-323.
Greenwood, P. H. 1979. Towards a phyletic classification of the 'genus' Haplochromis (Pisces, Cichlidae) and related taxa. Bull. Br. Mus. Nat. Hist. (Zool.) 35:265-322.
Greenwood, P. H. 1980. Towards a phyletic classification of the 'genus' Haplochromis (Pisces, Cichlidae) and related taxa. II. The species from Lakes Victoria, Nabugabo, Edward, George, and Kivu. Bull. Br. Mus. Nat. Hist. (Zool.) 39:1-101.
Greenwood, P. H. 1984. What is a species flock? In "Evolution of Fish Species Flocks" (A. A. Echelle and I. Kornfield, eds.), pp. 13-20. University of Maine at Orono Press, Maine.
Greenwood, P. H. 1987. The genera of pelmatochromine fishes (Teleostei, Cichlidae). A phylogenetic review. Bull. Br. Mus. Nat. Hist. (Zool.) 53:139-203.
Greenwood, P. H. 1991. Speciation. In "Cichlid Fishes: Behavior, Ecology and Evolution" (M. H. A. Keenleyside, ed.), pp. 86-102. Chapman and Hall, London.
Hearne, C. M., Ghosh, S., and Todd, J. A. 1992. Microsatellites for linkage analysis of genetic traits. Trends Genet. 8:288-294.
Higgins, D. G., Bleasby, A. J., and Fuchs, R. 1992. CLUSTAL V: Improved software for multiple sequence alignment. Cabios 8:189-191.
Jeffreys, A. J., Royle, N. J., Wilson, V., and Wong, Z. 1988. Spontaneous mutation rates to new length alleles at tandem-repetitive hypervariable loci in human DNA. Nature 332:278-281.
Kaufman, L., and Liem, K. F. 1982. Fishes of the suborder Labroidei (Pisces: Perciformes): Phylogeny, ecology and evolutionary significance. Breviora 472:1-19.
Kellogg, K. A., Markert, J. A., Stauffer, J. R., Jr., and Kocher, T. D. 1995. Microsatellite variation demonstrates multiple paternity in lekking cichlid fishes from Lake Malawi, Africa. Proc. R. Soc. Lond. B 260:79-84.
Kimura, M. 1980. A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.
Klein, D., Ono, H., O'Huigin, C., Vincek, V., Goldschmidt, T., and Klein, J. 1993. Extensive MHC variability in cichlid fishes of Lake Malawi. Nature 364:330-334.
Klein, J., Klein, D., Figueroa, F., and O'Huigin, C. 1997. Major histocompatibility complex genes in the study of fish phylogeny. In "Molecular Systematics of Fishes" (T. D. Kocher and C. A. Stepien, eds.). Academic Press, San Diego.
Kocher, T. D., Conroy, J. A., McKaye, K. R., and Stauffer, J. R. 1993. Similar morphologies of cichlid fish in Lakes Tanganyika and Malawi are due to convergence. Mol. Phylogenet. Evol. 2:158-165.
Kocher, T. D., Conroy, J. A., McKaye, K. R., Stauffer, J. R., and Lockwood, S. F. 1995. Evolution of NADH dehydrogenase subunit 2 in east African cichlid fish. Mol. Phylogenet. Evol. 4(4):420-432.
Kornfield, I. L. 1978. Evidence for rapid speciation in African cichlid fishes. Experientia 34:335-336.
Kornfield, I. L. 1991. Genetics. In "Cichlid Fishes: Behavior, Ecology and Evolution" (M. H. A. Keenleyside, ed.), pp. 103-150. Chapman and Hall, London.
Kornfield, I. L., Ritte, U., Richler, C., and Wahrman, J. 1979. Biochemical and cytological differentiation among cichlid fishes of the Sea of Galilee. Evolution 33:1-14.
LeBlanc-Straceski, J. M., Montgomery, K. T., Kissel, H., Murtaugh, L., Tsai, P., Ward, D. C., Krauter, K. S., and Kucherlapati, R. 1994. Twenty-one polymorphic markers from human chromosome 12 for integration of genetic and physical maps. Genomics 19:341-349.
Levinson, G., and Gutman, G. 1987. Slipped-strand mispairing: A major mechanism for DNA sequence evolution. Mol. Biol. Evol. 4(3):203-221.
Litt, M., Hauge, X., and Sharma, V. 1993. Shadow bands seen when typing polymorphic dinucleotide repeats: Some causes and cures. BioTechniques 15(2):280-284.
Livingstone, D. A. 1980. Environmental changes in the Nile headwaters. In "The Sahara and the Nile" (M. A. J. Williams and H. Faure, eds.), pp. 339-359. Balkema, Rotterdam.
McAndrew, B. J., and Majumdar, K. C. 1983. Tilapia stock identification using electrophoretic markers. Aquaculture 30:249-261.
McAndrew, B. J., and Majumdar, K. C. 1984. Evolutionary relationships within three Tilapiine genera (Pisces: Cichlidae). Zool. J. Linn. Soc. 80:421-435.
McKaye, K. R., Kocher, T., Reinthal, P., and Kornfield, I. 1982. Genetic analysis of a sympatric sibling species complex of Petrotilapia Trewavas (Cichlidae, Lake Malawi). Zool. J. Linn. Soc. 76:91-96.
Meyer, A. 1987. Phenotypic plasticity and heterochrony in Cichlasoma managuense (Pisces, Cichlidae) and their implications for speciation in cichlid fishes. Evolution 41(6):1357-1369.
Meyer, A. 1993. Phylogenetic relationships and evolutionary processes in East African cichlid fishes. Trends Ecol. Evol. 8:279-284.
Meyer, A., Kocher, T. D., Basasibwaki, P., and Wilson, A. C. 1990. Monophyletic origin of Lake Victoria cichlid fishes suggested by mitochondrial DNA sequences. Nature 347:550-553.
Moran, P., and Kornfield, I. 1993. Retention of an ancestral polymorphism in the Mbuna species flock (Teleostei: Cichlidae) of Lake Malawi. Mol. Biol. Evol. 10(5):1015-1029.
Muralidharan, K., and Wakeland, E. K. 1993. Concentration of primer and template qualitatively affects products in random-amplified polymorphic DNA PCR. BioTechniques 14(3):362-364.
Nei, M. 1972. Genetic distance between populations. Am. Nat. 949:283-292.
Nishida, M. 1991. Lake Tanganyika as an evolutionary reservoir of old lineages of East African cichlid fishes: Inferences from allozyme data. Experientia 47:974-979.
Oliver, M. K. 1984. "Systematics of African Cichlid Fishes; Determination of the Most Primitive Taxon, and Studies on the Haplochromines of Lake Malawi (Teleostei: Cichlidae)." Ph.D. thesis, Yale University, New Haven, CT.
Ono, H., O'Huigin, C., Tichy, H., and Klein, J. 1993. Major-histocompatibility-complex variation in two species of cichlid fishes from Lake Malawi. Mol. Biol. Evol. 10:1060-1072.
Orita, M., Iwahana, H., Kanazawa, H., Hayashi, K., and Sekiya, T. 1989a. Detection of polymorphisms of human DNA by gel electrophoresis as single-strand conformation polymorphism. Proc. Natl. Acad. Sci. USA 86:2766-2770.
Orita, M., Suzuki, Y., Sekiya, T., and Hayashi, K. 1989b. Rapid and sensitive detection of point mutations and DNA polymorphisms using the polymerase chain reaction. Genomics 5:874-879.
Owen, R. B., Crossley, R., Johnson, T. C., Tweddle, D., Kornfield, I., Davison, S., Eccles, D. H., and Engstrom, D. E. 1990. Major low levels of Lake Malawi and their implications for speciation rates in cichlid fishes. Proc. R. Soc. Lond. B 240:519-553.
Pellegrin, J. 1904. Contribution à l'étude anatomique, biologique et taxonomique des poissons de la famille des cichlidés. Mém. Soc. Zool. Fr. 16:41-402.
Poll, M. 1986. Classification des Cichlidae du lac Tanganyika: Tribus, genres et espèces. Mém. Acad. R. Belg. Cl. Sci. 45:5-163.
Postlethwait, J. H., Johnson, S. L., Midson, C. N., Talbot, W. S., Gates, M., Ballinger, E. W., Africa, D., Andrews, R., Carl, T., Eisen, J. S., Horne, S., Kimmel, C. B., Hutchinson, M., Johnson, M., and Rodriguez, A. 1994. A genetic linkage map for the zebrafish. Science 264:699-703.
Regan, C. T. 1906. A revision of the fishes of the South American cichlid genera Cichla, Chaetobranchus and Chaetobranchopsis, with notes on the genera of the American Cichlidae. Ann. Mag. Nat. Hist. 7:230-239.
Regan, C. T. 1920. The classification of the fishes of the family Cichlidae. I. The Tanganyikan genera. Ann. Mag. Nat. Hist. 9:33-53.
Ribbink, A. J. 1991. Distribution and ecology of the cichlids of the
4. Reconstruction of Cichlid Phylogeny
African Great Lakes. In "Cichlid Fishes: Behavior, Ecology and Evolution" (M. H. A. Keenleyside, ed.), pp. 36-59. Chapman and Hall, London. Riley, J., Butler, R., Ogilvie, D., Finniear, R., Jenner, D., Powell, S., Anand, R., Smith, J. C., and Markham, A. F. 1990. A novel rapid method for the isolation of terminal sequence from yeast artificial chromosome (YAC) clones. Nucleic Acids Res. 18:2887-2890. Roy, M. S., Geffen, E., Smith, D., Ostrander, E. A., and Wayne, R. K. 1994. Patterns of differentiation and hybridization in North American wolflike canids, revealed by analysis of microsatellite loci. Mol. Biol. Evol. 11(4):553-570. Sage, R. D., Loiselle, P. V., Basasibwaki, P., and Wilson, A. C. 1984. Molecular versus morphological change among cichlid fishes of Lake Victoria. In "Evolution of Fish Species Flocks" (A. A. Echelle and I. Kornfield, eds.), pp. 185-197. University of Maine at Orono Press. Maine. Sage, R. D., and Selander, R. K. 1975. Trophic radiation through polymorphism in cichlid fishes. Proc. Natl. Acad. Sci. USA 72: 46694673. Saiki, R. K., Gelfland, D. H., Stoffel, S., Scharf, S. J., Higuchi, I. G., Horn, G. T., Mullis, K. B., and Erlich, H. A. 1988. Primer directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239: 487-491. Saitou, N., and Nei, M. 1987. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4: 406-425. Sambrook, J., Fritsch, E. F., and Maniatis, T. 1989. "Molecular Cloning: A Laboratory Manual." Cold Spring Harbor Press, Cold Spring Harbor, NY. Sanger, F., Nicklen, S., and Coulson, A. R. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74: 5463-5467. Schliewen, U. K., Tautz, D., and P~i~ibo, S. 1994. Sympatric speciation suggested by monophyly of crater lake cichlids. Nature 368:629632. Schl6tterer, C., and Tautz, D. 1992. Slippage synthesis of simple sequence DNA. Nucleic Acids Res. 20(2):211-215. Scribner, K. T., Arntzen, J. 
W., and Burke, T. 1994. Comparative analysis of intra- and interpopulation genetic diversity in Bufo bufo, using allozyme, single-locus microsatellite, minisatellite, and multilocus data. Mol. Biol. Evol. 11(5):737-748. Serikawa, T., Montagutelli, X., Simon-Chazottes, D., and Gu6net, J.-L. 1992. Polymorphisms revealed by PCR with single, shortsized, arbitrary primers are reliable markers for mouse and rat gene mapping. Mamm. Genome 3:65- 72. Seyoum, S. 1989. "Stock Identification and the Evolutionary Relationship of the Genera Oreochromis, Sarotherodon and Tilapia (Pisces: Cichlidae) Using Allozyme Analysis and Restriction Endonuclease Analysis of Mitochondrial DNA." Ph.D. thesis, University of Waterloo, Waterloo, Ontario, Canada. Shriver, M. D., Jin, L., Boerwinkle, E., Deka, R., Ferrell, R. E., and Chakraborty, R. 1995. A novel measure of genetic distances for highly polymorphic tandem repeat loci. Mol. Biol. Evol. 12(5): 914-920. Slatkin, M. 1995. A measure of population subdivision based on microsatellite allele frequencies. Genetics 139: 457-462. Smith, J. J., Scott-Craig, J. S., Ledbetter, J. R., Bush, G. L., Roberts, D. L., and Fulbright, D. W. 1994. Characterization of random amplified polymorphic DNA (RAPD) products from Xanthomonas campestris and some comments on the use of RAPD products in phylogenetic analysis. Mol. Phylogenet. Evol. 3(2):135-145.
51
Sodsuk, P., and McAndrew, B. J. 1991. Molecular systematics of three tilapiine genera Tilapia, Sarotherodon and Oreochromis using allozyme data. J. Fish Biol. 39:301-308. Stiassny, M. L. J. 1987. Cichlid familial interrelationships and the placement of the neotropical genus Cichla (Perciformes, Labroidei). J. Nat. Hist. 21:1311-1331. Stiassny, M. L. J. 1991. Phylogenetic intrarelationships of the family Cichlidae: An Overview. In "Cichlid Fishes: Behavior, Ecology and Evolution" (M. H. A. Keenleyside, ed.), pp. 1-35. Chapman and Hall, London. Sturmbauer, C., and Meyer, A. 1992. Genetic divergence, speciation and morphological stasis in a lineage of African cichlid fishes. Nature 358:578-581. Sturmbauer, C., and Meyer, A. 1993. Mitochondrial phylogeny of the endemic mouthbrooding lineages of cichlid fishes from Lake Tanganyika in Eastern Africa. Mol. Biol. Evol. 10:751-768. Sturmbauer, C., Verheyen, E., and Meyer, A. 1994. Mitochondrial phylogeny of the Lamprologini, the major substrate spawning lineage of cichlid fishes from Lake Tanganyika in Eastern Africa. Mol. Biol. Evol. 11:691-703. S(iltmann, H., Mayer, W. E., Figueroa, F., Tichy, H., and Klein, J. 1995. Phylogenetic analysis of cichlid fishes using nuclear DNA markers. Mol. Biol. Evol. 12(6): 1033-1047. Swofford, D. L. 1993. "PAUP: Phylogenetic Analysis Using Parsimony, Version 3.1.1." Computer program distributed by the Illinois Natural History Survey, Champaign, Ill. Trewavas, E. 1973. On the cichlid fishes of the genus Pelmatochromis with a proposal of a new genus for P. congicus; on the relationship between Pelmatochromis and Tilapia and the recognition of Sarotherodon as a distinct genus. Bull. Br. Mus. Nat. Hist. (Zool.) 26: 331-419. Trewavas, E. 1983. Tilapiine fishes of the genera Sarotherodon, Oreochromis and Danakilia. Br. Mus. (Nat. Hist.) Lond. Valdes, A. M., Slatkin, M., and Freimer, N. B. 1993. Allele frequencies at microsatellite loci: The stepwise mutation model revisited. 
Genetics 133: 737- 749. Vandewalle, P. 1971. Comparaison ost6ologique et myologique de cinq Cichlidae Africains et Sud-Americains. Ann. Soc. R. Zool. Belg. 101:259-292. Weber, J. L., and Wong, C. 1993. Mutation of human short tandem repeats. Hum. Mol. Genet. 2:1123-1128. Welsh, J., and McClelland, M. 1990. Fingerprinting genomes using PCR with arbitrary primers. Nucleic Acids Res. 18:7213-7218. Williams, J. G. K., Kubelik, A. R., Livak, K. J., Rafalski, J. A., and Tingey, S. V. 1990. DNA polymorphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Res. 18: 6531-6535. Witte, F., and van Oijen, M. J. P. 1990. Taxonomy, ecology and fishery of Lake Victoria haplochromine trophic groups. Zool. Verh. Leiden 262:1-47. Wooster, R., Cleton-Jansen, A.-M., Collins, N., Mangion, J., Cornelis, R. S., Cooper, C. S., Gusterson, B. A., Ponder, B. A. J., Von Deimling, A., Wiestler, O. D., Cornelisse, C. J., Devilee, P., and Stratton, M. R. 1994. Instability of short tandem repeats (microsatellites) in human cancers. Nat. Genet. 6:152-156. Wright, J. M. 1989. Nucleotide sequence, genomic organization and evolution of a major repetitive DNA family in tilapia (Oreochromis mossambicus/hornorum). Nucleic Acids Res. 17:5071-5079. Zihler, F. 1982. Gross morphology and configuration of digestive tracts of Cichlidae (Teleostei, Perciformes): Phylogenetic and functional significance. Neth. J. Zool. 32:544-571.
CHAPTER 5

Biogeographic Analysis of Pacific Trout (Oncorhynchus mykiss) in California and Mexico Based on Mitochondrial DNA and Nuclear Microsatellites

JENNIFER L. NIELSEN and MONIQUE C. FOUNTAIN
USDA Forest Service, Pacific Southwest Research Station, and Hopkins Marine Station, Department of Biology, Stanford University, Pacific Grove, California 93950

JONATHAN M. WRIGHT
Marine Gene Probe Laboratory, Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada B3H 4J1

I. Introduction

At the turn of the century, the Pacific basin trout were traditionally classified as members of the Atlantic lineage Salmo, based on analyses of morphology, life history characteristics, and iteroparity in the Pacific trout that were lacking in other Pacific salmon (Oncorhynchus spp.). The current reclassification of Pacific steelhead, cutthroat, and rainbow trout into the genus Oncorhynchus was based on new morphological characters and associations drawn from molecular genetic data (Smith and Stearley, 1989). As early as 1914, Regan had suggested that the Pacific trout were more closely related to the Pacific salmon (Oncorhynchus) than to the European Salmo. Based on osteological characters, Vladykov (1963) recognized that Pacific basin trout were separable from Atlantic basin Salmo, and Behnke (1965) first reported the near morphological identity of O. mykiss (Asiatic trout) and S. gairdnerii (North American rainbow trout). Genetic and morphological characters reported in many studies have confirmed the Pacific trout as true members of Oncorhynchus (Behnke, 1968; Utter et al., 1973; Kendall and Behnke, 1984; Thomas et al., 1986; Stearley and Smith, 1993; Utter and Allendorf, 1994; see also Phillips and Oakley, 1997). The popular terms "salmon" and "trout" are now generally thought to refer to a flexibility in life history pattern that has evolved independently among separate monophyletic groups, the Pacific Oncorhynchus [i.e., anadromous steelhead and freshwater rainbow trout O. mykiss; anadromous sockeye salmon (O. nerka) and resident kokanee; sea-run and resident cutthroat trout O. clarki], and the Atlantic Salmo (i.e., anadromous and landlocked Atlantic salmon, S. salar; anadromous and resident brown trout, S. trutta). Similar trade-offs in life history traits are also found within Salvelinus (i.e., lacustrine and anadromous char, S. alpinus), suggesting that this flexibility in life history may
be a characteristic with roots ancestral to the split between Salmo and Oncorhynchus (Stearley and Smith, 1993; Foote et al., 1994). Genetic studies have revealed cryptic population structure due to behavior or life history variation that was not obvious from other types of analyses (Bowen et al., 1993; Bowcock et al., 1994). The reclassification of all Pacific anadromous steelhead and resident rainbow trout as O. mykiss has, therefore, led to significant controversy over the taxonomic status and genetic identity of the many subgroups of trout found throughout western North America (Behnke, 1992). Specific interest has evolved around the position of the California golden trout, the McCloud rainbow trout, Baja rainbow trout, the Eagle Lake rainbow trout, and the interior "redband" trout in the lineage of O. mykiss. The first genetic data used to support biogeographic separation of western trout into two major subgroups came from a study of allozymes via electrophoresis analyses conducted by Allendorf (1975). This study documented the geographical separation of western trout around the Cascade Crest (Pacific Crest), dividing O. mykiss into "inland" and "coastal" populations. Allendorf (1975) showed that allozyme allelic frequency differences separated inland and coastal groups of O. mykiss longitudinally over a broad geographic area throughout the western United States. Subsequent molecular studies conducted on the North American coastal distributions of O. mykiss supported genetic similarities between both resident (rainbow trout) and anadromous (steelhead) forms of coastal Pacific trout within geographically proximate locations (Utter et al., 1973; Okazaki, 1984; Parkinson, 1984; Currens et al., 1990; Gall et al., 1990; Reisenbichler et al., 1992). DNA analyses of the intraspecific genetic diversity in coastal O. 
mykiss confirmed the genetic similarity of resident and anadromous life history forms of trout from proximate geographic areas (Wilson et al., 1985; Thomas and Beckenbach, 1989) and have shown significant biogeographic structure at the southern extent of the range (Nielsen et al., 1994b). The latter study used mitochondrial DNA (mtDNA) and nuclear microsatellites to demonstrate a high degree of population differentiation and levels of genetic diversity that were unprecedented for this species. This unique level of genetic diversity found in southern steelhead has been confirmed by allozyme analyses of California coastal stocks by the National Marine Fisheries Service for their scientific status review resulting from a petition for Federal listing of the Pacific steelhead under the Endangered Species Act (Dr. R. Waples, personal communications, National Marine Fisheries Service, Seattle, WA).
DNA studies of Pacific salmonids initially concentrated on mitochondrial DNA markers due to the relatively rapid rate of evolution in this maternally inherited molecule, the ease of extraction and amplification of mtDNA, and a significant literature on the theory and application of mtDNA sequence analyses available to researchers by the end of the 1980s (Avise et al., 1987, and literature therein). Controversy has evolved over the degree and level of phylogenetic resolution available with mtDNA markers due to demonstrated variability in mutation rates for individual parts of the molecule among different taxa and possible saturation of base-point mutations in highly polymorphic regions (Avise et al., 1987, 1994b; Hillis, 1995). Despite such arguments, this molecule has played an important role in high-resolution analyses of population structure in closely related vertebrate groups (Moritz et al., 1987; Stoneking et al., 1991; Avise, 1994 and references therein; Avise et al., 1994a; Moritz et al., 1995). The development of simple protocols for the detection and amplification of short repetitive DNA sequences (i.e., microsatellites; Miklos, 1985; Tautz, 1989; Weber and May, 1989; Moore et al., 1991) provides access to new molecular tools derived from the nuclear genome with unusually high levels of intraspecific polymorphism. Short repetitive DNAs are common throughout the eukaryotic genome, have exceptionally high mutation rates, and generally provide large numbers of alleles useful for the reconstruction of closely related phylogenetic groups (Kelly et al., 1991; Henderson and Petes, 1992; Queller et al., 1993; Estoup et al., 1993; Bowcock et al., 1994). Polymerase chain reaction (PCR) amplification of microsatellites has provided an alternative molecular approach for the analysis of groups sharing recent evolutionary divergence (Burke et al., 1989; Bruford and Wayne, 1993; Queller et al., 1993; Ellegren, 1995). 
Nuclear microsatellite loci have, in general, provided a level of resolution not previously available at the intraspecific level from mtDNA or allozymes (Bowcock et al., 1994; Goldstein et al., 1995; FitzSimmons et al., 1995; Nielsen, 1996). The function and biochemical mechanisms underlying mutation of simple sequence repeat loci, however, remain unknown and controversial (Long and David, 1980; Di Rienzo et al., 1994). One theoretical mechanism of mutation has been proposed for the microsatellite class of tandem repeats: a stepwise mutation process in which an allele mutates up or down by a small number of nucleotide repeat units (Schlötterer and Tautz, 1992). Variations on the stepwise mutation model underlie two recently developed genetic distance measures designed specifically for microsatellite loci (Goldstein et al., 1995; Slatkin, 1995). These distance measures are closely related in their analytical techniques, but are based on different conceptual interpretations of the stepwise mechanisms leading to repeat polymorphisms. Goldstein et al. (1995) used a strict (single-step) stepwise mutation model to analyze variation in the number of repeats found within a simple DNA sequence. Slatkin (1995), however, built on the two-phase mutation model introduced by Di Rienzo et al. (1994), which allows replication or deletion of more than one repeat unit as a single mutation event. Under the two-phase model, single-step mutations (involving only one repeat unit) are thought to be the most common elements of change, but events involving larger groups of repeat units, inserted or deleted as a single mutational element, are possible (Di Rienzo et al., 1994). Despite the fact that mutational mechanisms in repetitive DNA remain an open question, microsatellite markers have proven useful in many vertebrate population studies (Bruford and Wayne, 1993; Wright, 1993; Bowcock et al., 1994; Morin et al., 1994a,b; Nielsen et al., 1994b; Wright and Bentzen, 1994; Spencer et al., 1995; Gerloff et al., 1995). To date, however, no empirical studies have looked at the implications of the different analytical approaches to microsatellite distance data.

Phylogenies based on single genes or short sequence loci, especially among closely related taxa, can be discordant with organismal phylogenies (Weller et al., 1994). Discrepancies between an individual gene tree and the true phylogeny of an organism can arise from lineage-sorting processes or allelic introgression between closely related groups (Neigel and Avise, 1986; Pamilo and Nei, 1988). The degree of phylogenetic congruence available among independent genetic markers has become an important issue in the interpretation of gene trees in relationship to organismal phylogenies (Birky et al., 1989; Bernatchez and Danzmann, 1993; Avise et al., 1994b; Bernatchez, 1995; Moritz et al., 1995).
Phylogenetic results derived from several independent DNA regions provide a more robust perspective on the genetic history of an individual group or population than any one gene or nucleotide sequence alone (Avise, 1994; Cummings et al., 1995). It is important, however, that the gene or sequence data chosen to test congruence among phylogenetic information are appropriately matched to the level or degree of phylogenetic divergence in question (Graybeal, 1994).

This chapter compares genetic diversity for mtDNA and three independent, highly polymorphic nuclear microsatellite markers in putative wild trout and steelhead populations from California and Mexico. DNA data on trout populations from interior as well as coastal locations are presented, and the intraspecific biogeographic resolution available for O. mykiss in
California and Mexico using both mtDNA and nuclear markers is addressed. Inferences available from these molecular data concerning the status of various populations of trout and steelhead are discussed.
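The strict stepwise and two-phase mutation models discussed in this introduction can be illustrated with a short simulation. The following Python sketch is not drawn from any of the cited studies; the function names, the probability `p_single`, and the jump limit `max_jump` are hypothetical choices made only for illustration.

```python
import random


def mutate_repeat_count(repeats, rng, p_single=1.0, max_jump=5):
    """Return a new repeat count after one mutation event.

    With probability p_single the allele gains or loses exactly one repeat
    unit (the strict stepwise case). Otherwise it gains or loses between
    2 and max_jump units in a single event (the second phase of a
    two-phase model). Both parameters are illustrative assumptions.
    """
    direction = rng.choice((-1, 1))
    if rng.random() < p_single:
        step = 1
    else:
        step = rng.randint(2, max_jump)
    # An allele is not allowed to drop below one repeat unit.
    return max(1, repeats + direction * step)


def simulate_lineage(start, n_events, rng, p_single=1.0):
    """Apply n_events successive mutations and record every repeat count."""
    repeats = start
    history = [repeats]
    for _ in range(n_events):
        repeats = mutate_repeat_count(repeats, rng, p_single)
        history.append(repeats)
    return history


rng = random.Random(1)
strict = simulate_lineage(20, 50, rng, p_single=1.0)     # single steps only
two_phase = simulate_lineage(20, 50, rng, p_single=0.9)  # occasional larger jumps
```

Setting `p_single=1.0` reproduces the single-step assumption, whereas any smaller value lets an occasional event move the allele by several repeat units at once, which is the distinction the two distance measures interpret differently.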
II. Material and Methods

A. Sampling Protocol
Coastal steelhead and interior trout (O. mykiss) were sampled noninvasively by taking fin clips (2 mm²) from 354 live fish captured within riverine habitats in California and Mexico (Fig. 1). Tissues were sent as frozen or dried samples to the authors' laboratory from 1990 to 1995 and stored at -70°C until DNA extraction and amplifications were performed. O. mykiss were sampled from stream locations where wild stocks of steelhead and trout have been reported to have received a minimum of hatchery introductions since the mid-1930s [California Department of Fish and Game (CDFG) unpublished records and personal communications; Swift et al., 1993; Gall, 1995; Titus et al., in press]. Streams and rivers included in these analyses were divided into six general geographic localities to aid in the graphic depiction of data (see Appendix I). The northern and southern regions of California were separated at the San Francisco Bay and the interior and coastal populations were separated by the western boundary of the Klamath mountains and the great valley region in the north, the Sierra Nevada range throughout central California, and the transverse range in the south. All coastal steelhead and trout in California are currently classified as O. mykiss irideus (after Behnke, 1992). The north interior collection of California trout included two putative subspecies of trout, the Eagle Lake trout (O. mykiss aquilarum) and the McCloud River redband trout (O. mykiss stonei), but probably contained a diverse mixture of populations with redband and coastal rainbow ancestry (Behnke, 1992). The south interior California trout collection was made up of three reported subspecies: the Kern River rainbow trout (O. mykiss gilberti), Little Kern River rainbow trout (O. mykiss whitei), and California golden trout from the South Fork Kern River (O. mykiss aquabonita). Mexican trout from Baja California Norte (O. mykiss nelsoni) were collected by G. Ruiz-Campos (Facultad de Ciencias, Universidad Autónoma de Baja California). Fin clips taken from trout from the Rio Yaqui basin (an undefined subspecies of O. mykiss, R. R. Miller, personal communications) were sent to the authors' laboratory by B. L. Jensen (U.S. Fish and Wildlife Service, Dexter National Fish Hatchery and Technology Center, Dexter, NM) and by Jose Campoy Favela (Centro Ecologico de Sonora, Hermosillo, Sonora, Mexico). These samples were collected from a headwaters tributary of the Rio La Cueva, a tributary of the Rio Bavispe, which is a tributary of the Rio Yaqui.

FIGURE 1 General location of DNA-sampling sites for steelhead and trout in California and Mexico used in this study.
[Map labels: Steelhead Eel River; Steelhead north coast; Eagle Lake rainbow; Upper Sacramento River rainbow; Steelhead Russian River; McCloud River rainbow; Golden Trout Creek; Steelhead SF Bay; SF Kern River golden trout; Little Kern golden trout; Kern River rainbow; Central coast steelhead; Southern steelhead Big Sur; Southern steelhead Santa Ynez R.; Southern steelhead Malibu Creek; Baja trout; Rio Yaqui trout; Mexico.]

B. Mitochondrial DNA

Total genomic DNA was extracted from fin clips using Chelex-100 resin (BioRad) following the methods of Nielsen et al. (1994a). Primers used in this study (S-phe and P2) were developed by W. K. Thomas (University of Missouri, Kansas City) in the late Allan Wilson's laboratory using the methods given in Kocher et al. (1989). These primers are known to amplify a highly variable segment of the mtDNA control region in salmonids. They permit amplification and sequencing of a segment containing 188 bp of the O. mykiss mtDNA control region and 5 bp of the adjacent phenylalanine tRNA gene.
Primer sequences, amplification and sequencing protocols, and sequence of the entire region amplified by these primers in this species can be found in Nielsen et al. (1994a).
C. Microsatellites

Three microsatellite loci [Omy77, Morris et al. (1996); Omy207, M. O'Connell, Marine Gene Probe Laboratory (MGPL), Dalhousie University, personal communications; and Ssa289, McConnell et al. (1995)] were chosen for this study based on their level of polymorphism in O. mykiss. Omy77 and Omy207 were developed specifically for O. mykiss at MGPL, Dalhousie University. Ssa289 was developed by MGPL for Atlantic salmon. The sequence for primers amplifying these microsatellite loci appears in the respective literature or is available by request from MGPL. For each locus, primer B was labeled according to protocols given in Nielsen et al. (1994b). The methods of Nielsen et al. (1994b) were used except that each PCR reaction contained 3.75 µl double-distilled H2O, 0.625 µl 10× PCR buffer (670 µl 1 M Tris, 67 µl 1 M MgCl2, 83 µl 2 M AmSO4, 7 µl 14 M β-mercaptoethanol, and 173 µl double-distilled H2O), 0.625 µl 10 mM dNTPs, 0.625 µl 10 µM primer A, 0.32 µl 1 µM primer B, 0.32 µl labeled B primer, and 0.03 µl (0.15 units) Taq polymerase. PCR conditions were 30 cycles of 94°C for 40 sec, 50°C for 1 min, and 72°C for 2 min. Microsatellites were run out on a 6% polyacrylamide gel. Prior to loading the gel, 5 µl of loading buffer [94% formamide, 4% 0.5 EDTA, 0.025% (w/v) both bromphenol blue and xylene cyanol FF] was added to each sample. The size of each microsatellite allele was determined by reference to the M13mp18 sequence, known DNA samples that were rerun on each gel, and a double-stranded reference marker showing the common alleles available for each microsatellite locus. Only unambiguous bands were scored, and in the case of multiple (shadow) bands, the darkest band was scored as the allele. The appearance of stutter bands which overlap between alleles was resolved by comparing the intensity and number of stutter bands for each individual at each locus (O'Reilly and Wright, 1995). To ensure consistency in both the PCR reactions and the scoring of microsatellites, 3.5% of all samples were rerun separately on different gels and scored independently by two people.
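The per-reaction volumes listed above lend themselves to a simple master-mix calculation. The Python sketch below is only an illustration: the component volumes are read from the protocol text (with the units interpreted as microliters), template DNA is omitted because its volume is not stated there, and the 10% pipetting overage and the function name `master_mix` are assumptions rather than part of the published method.

```python
# Per-reaction volumes in microliters, as read from the protocol above.
# Template DNA is not listed there, so it is left out of this sketch.
PER_REACTION_UL = {
    "double-distilled H2O": 3.75,
    "10x PCR buffer": 0.625,
    "10 mM dNTPs": 0.625,
    "10 uM primer A": 0.625,
    "1 uM primer B": 0.32,
    "labeled primer B": 0.32,
    "Taq polymerase": 0.03,
}


def master_mix(n_reactions, overage=0.10):
    """Scale every component for n_reactions plus a pipetting overage.

    The overage fraction is an assumed convenience value, not a figure
    taken from the chapter.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 3) for name, vol in PER_REACTION_UL.items()}


# Volume of the listed reagents in a single reaction (about 6.3 microliters).
per_reaction_total = sum(PER_REACTION_UL.values())

# Reagent volumes for a hypothetical batch of 24 reactions.
mix = master_mix(24)
```

Keeping the recipe in one table makes it easy to confirm that the listed reagents add up to roughly 6.3 µl per reaction before template is added, and to rescale the batch size without retyping volumes.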
D. Analytical Approach

A pairwise distance matrix was constructed for sequences from the mtDNA control region segment amplified by S-Phe and P2, based on the two-parameter model of Kimura (1980). Phylogenetic analysis was performed on mtDNA data using the unrooted neighbor-joining (NJ) tree procedure from PHYLIP (Felsenstein, 1991) with 1000 bootstrap replicates (Felsenstein, 1985) to assess reproducibility of the NJ mtDNA branching pattern.

Previous studies have documented the biogeographic concordance associated with the mtDNA haplotypes in coastal steelhead (Nielsen et al., 1994b; Neeley, 1995). To test for differences in biogeographic distribution of genotypes using nuclear microsatellites, microsatellite data were pooled for individual trout by known mtDNA haplotype and capture location, where the parenthetical mtDNA haplotype designation refers to the most common haplotype found in that particular geographic population (Appendix II). These geographic-haplotype groups then served as sample units for microsatellite genetic distance analyses and tree development for comparison with the mtDNA NJ tree, allowing the authors to discuss results available from microsatellite data in individual populations with documented mtDNA phylogeographic structures. The trees depicted in these analyses were not intended to reflect historic evolutionary associations among trout populations, but rather to test for genetic congruence in biogeographic data drawn from two independent molecular markers with potentially different evolutionary histories among these populations.

Observed and expected values of heterozygotes were calculated for microsatellite data, and a test for Hardy-Weinberg (HW) equilibrium was performed for all populations combined according to the Fisher method described by Louis and Dempster (1987), which provided an estimate of the probability of rejecting the null hypothesis, i.e., HW equilibrium. A pairwise genetic distance matrix was calculated for allelic diversity using both the Slatkin (1995) and the Goldstein et al. (1995) methods for the three microsatellite loci combined. For the Goldstein et al. (1995) distance analyses, the authors used the program available from Dr. E. Minch, Department of Genetics, Stanford University. Rst analyses were performed using a Pascal program developed by M. C. F. that implemented Slatkin's stepwise model for distance analyses. Both distance measures assume a linear expectation of the average-squared distance for each locus (assuming no correlation between mutation rate and repeat score) and use the arithmetic average of mutation rates across loci. Statistics in both methods are equivalent to a general analysis of variance.
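The two-parameter distance of Kimura (1980) that underlies the mtDNA distance matrix can be sketched for a single pair of aligned sequences. The Python function below is a minimal illustration, not the PHYLIP implementation used by the authors; the function name and the toy sequences are hypothetical, and sites containing gaps or ambiguous bases are simply skipped, which is an assumption of this sketch.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    d = -1/2 * ln[(1 - 2P - Q) * sqrt(1 - 2Q)], where P and Q are the
    proportions of sites showing transitions and transversions.
    math.log or math.sqrt raises ValueError if the sequences are too
    divergent for the formula to be defined.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (assumption of this sketch)
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: one transition (A<->G) in ten aligned sites.
seq_a = "AAAAAAAAAA"
seq_b = "GAAAAAAAAA"
d_ab = k2p_distance(seq_a, seq_b)
```

Applying such a function to every pair of haplotype sequences would fill the pairwise matrix that a neighbor-joining program takes as input.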
Both methods compute an average sum of squares of the differences in allelic size within each population [SW in Slatkin (1995); D0 in Goldstein et al. (1995)] and the average squared difference between all possible pairs of populations (SB and D1, respectively) to obtain an estimate of variance in allele size in the total population. The basic difference between the two methods involves how they interpret the parameters of the mutation process. Slatkin's Rst [developed under the assumptions of the infinite allele model, Slatkin (1991)] used a ratio of combinations of the mean squared distance which cancels out all parameters of the mutation process [see formula 12 in Slatkin (1995)]. Goldstein et al. (1995) maintain an estimate of the mutation process under the expectation of a strict, single-step (one repeat unit) shift for each mutation event. Distance data from both methods were used to generate a consensus neighbor-joining tree (PHYLIP, Felsenstein, 1991). One thousand bootstrap replicate trees were generated to assess the reproducibility of branching patterns found in each consensus tree.

Analysis of variance (ANOVA) and factor analysis using principal components (PCA) were used to describe biogeographic associations between genotype (mtDNA or microsatellite allelic diversity) and sample location (longitude and latitude). Each factor represented a linear combination of actual mtDNA haplotype or microsatellite allelic frequencies (weighted for sample size) over all genotypes. Factor analyses were based on the variance-covariance matrix for all sampled populations such that the range of components was associated with the proportion of total variance over all locations. The first component was, therefore, associated with the greatest portion of the total variance for all genotypes over all locations, the second component had the second greatest proportion, etc. Least-squares multiple regression analyses were then used to regress the first principal component on latitude or longitude by genotype to graphically depict the correlation between sampling locality and genotype distributions.
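The within- and between-population quantities that both microsatellite distance methods share can be made concrete with a small single-locus sketch. The Python code below is a simplified illustration, not the Pascal program or the Stanford program used by the authors: the function names and the toy allele lists are hypothetical, populations are weighted equally, and the ratio shown is a plain (total minus within) over total form of an Rst-like statistic.

```python
from itertools import combinations, product


def mean_sq_diff_within(alleles):
    """Average squared difference in repeat number over all pairs of
    distinct allele copies sampled from one population."""
    pairs = list(combinations(alleles, 2))
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)


def mean_sq_diff_between(alleles_x, alleles_y):
    """Average squared difference in repeat number over all pairs made of
    one allele copy from each of two populations."""
    pairs = list(product(alleles_x, alleles_y))
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)


def rst_like(populations):
    """Fraction of the total mean squared allele-size difference that is
    not explained by differences within populations."""
    s_within = sum(mean_sq_diff_within(p) for p in populations) / len(populations)
    pooled = [a for p in populations for a in p]
    s_total = mean_sq_diff_within(pooled)
    return (s_total - s_within) / s_total


# Hypothetical repeat counts at one locus in two populations.
pop_a = [10, 10, 11, 11]
pop_b = [14, 14, 15, 15]

within_a = mean_sq_diff_within(pop_a)
between_ab = mean_sq_diff_between(pop_a, pop_b)
rst_ab = rst_like([pop_a, pop_b])
```

In this toy case nearly all of the allele-size variation lies between the two populations, so the ratio is close to one; averaging such quantities over loci would give the multilocus distances described above.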
III. Results
A. Mitochondrial DNA
Three previously unpublished mtDNA control-region haplotypes, containing novel single base mutations, were found in this survey of trout populations from California and Mexico (MYS15, MYS16, and MYS18; Table I). MYS15 was found only in golden trout from Golden Trout Creek in the Kern River basin and in Taylor Creek, a tributary to the South Fork Kern River. MYS16 was found in two tributaries of the South Fork Kern River (Fay Creek, Manter Creek), in Ramshaw Meadows on the South Fork Kern River, in Golden Trout Creek, and in Eagle Lake rainbow trout. MYS18 was unique to the trout of the Rio Yaqui basin of northwestern Mexico. Twenty-seven trout from the San Pedro Mártir basin in Baja California were monomorphic for mtDNA haplotype MYS1. All other mtDNA haplotypes found in freshwater trout samples taken from interior California rivers and streams carried identical control-region haplotypes to those previously reported in coastal anadromous populations (Nielsen et al., 1994b). The frequency distribution of the
JENNIFER L. NIELSEN et al.
TABLE I Mitochondrial Control Region Variable Sites and Nucleotide Changes Found in California Steelhead and Trout (Oncorhynchus mykiss) in 1990-1995 and in Two Populations of Mexican Trout from Baja California and the Rio Yaqui

                     Base pair no.(a)
mtDNA type   No.    1021  1050  1086  1103  1104  1106  1109  1147  1149
MYS1          99     T     T     A     G     A     G     G     C     T
MYS2          17     C     T     A     G     A     A     G     C     T
MYS3         108     T     T     A     G     A     A     G     C     T
MYS5          20     T     C     G     G     C     G     A     C     T
MYS6           7     T     C     G     G     C     G     A     C     C
MYS8          25     T     C     A     G     C     G     A     C     T
MYS12          7     T     C     A     G     C     G     G     C     T
MYS13          8     T     C     G     G     C     G     G     C     T
MYS15          7     T     T     A     A     A     A     G     C     T
MYS16         45     T     T     A     G     A     G     A     T     T
MYS18         11     T     T     A     G     A     G     G     C     C

(a) Base pair numbers follow those published by Digby et al. (1992). The number of fish sequenced for this study is given for each mtDNA type. Mitochondrial haplotypes MYS1-14 are equivalent to haplotypes ST1-14 previously reported in Nielsen et al. (1994b). ST4, ST7, ST9, ST10, ST11, and ST14 were represented by fewer than five confirmed samples each and were, therefore, not included in these analyses.
11 unique mtDNA haplotypes found in this study is given by general geographic location in Fig. 2. An unrooted neighbor-joining tree for control-region mtDNA sequence data summed for haplotype populations is depicted in Fig. 3. This tree divided the trout-steelhead assemblage into four groups supported with bootstrap values > 50%, when considered in relationship to the Rio Yaqui trout. Steelhead mtDNA haplotypes found most frequently at the southern extent of the range, i.e., south of Point Conception (MYS5, MYS6, MYS8, and MYS13), and one northern California haplotype (MYS12) showed significant unity with bootstrap values > 50%. A 68% bootstrap value supported unity between coastal steelhead (MYS3) and resident trout from the Sacramento River (MYS3), the Little Kern River (MYS3), and two golden trout haplotype populations (MYS3 and MYS15). Genetic unity between the Eagle Lake trout and California golden trout, which shared identical mtDNA haplotypes (MYS16), was supported by a bootstrap value of 83%.

FIGURE 2 Frequency distribution of Oncorhynchus mykiss mtDNA haplotypes given for six general geographic locations. Haplotype numbers are given by streams and geographic areas in Appendix I. [Bar chart, mtDNA control region: frequency (0-1) by mtDNA haplotype (1, 2, 3, 5, 6, 8, 12, 13, 15, 16, 18); series: North Coast, North Interior, South Coast, South Interior, Mexico Coast, Mexico Interior.]

5. mtDNA and Nuclear Microsatellites in Trout

FIGURE 3 Unrooted phylogenetic tree for a 188-bp mtDNA control region segment inferred from neighbor-joining analysis (PHYLIP) of pairwise distances calculated for 11 mtDNA haplotypes found in anadromous steelhead and resident trout in California and Mexico (1990-1995). For these analyses, parenthetical mtDNA haplotype designations represented the most common haplotype found in each particular population. Bootstrap (1000 replicates) probability values are given in percentages on the tree branches; values >50% are indicated in bold type. [Tree tips: Sacramento rainbow trout (3), Little Kern R. golden trout (3), Kern River rainbow trout (3), CA steelhead (3), CA golden trout (3), CA golden trout (15), Rio Santo Domingo trout (1), CA steelhead (1), McCloud rainbow trout (1), CA steelhead (2), Eagle Lake trout (16), CA golden trout (16), CA steelhead (6), CA steelhead (5), CA steelhead (13), CA steelhead (8), CA steelhead (12), Rio Yaqui trout (18).]

B. Nuclear Microsatellites
The three microsatellite loci used in this study contained dimeric repeats [Omy207 and Ssa289 poly(CA)-poly(GT), and Omy77 poly(CT)-poly(GA)], found in tracts up to 74 repeat units long, with 10-33 alleles expressed per locus (Appendix II). Frequency distributions for microsatellite alleles are given by locus and geographic area in Fig. 4. The combined allelic distribution for the three loci was found to be in Hardy-Weinberg equilibrium over the total sample population (Fisher's exact p = 0.013). The microsatellites developed specifically for O. mykiss, i.e., Omy77 (27 alleles; range 77-141 bp) and Omy207 (33 alleles; range 76-148 bp), were significantly more polymorphic in California and Mexican trout and steelhead than the Ssa289 locus developed for Atlantic salmon (10 alleles; range 89-109 bp). All three loci conformed to the expectation of a single-step allele model, with gaps in the two-base repeat sequence occurring only in the largest alleles for Omy77 and Omy207. Genetic distance measures for the three microsatellite loci combined as calculated by Slatkin (1995) and Goldstein et al. (1995) are given by the haplotype population in Table II. Derived distance measures showed similar mean values for both models across all populations. Slatkin's mean Rst value was 0.207, whereas the mean Fst of Goldstein et al. was 0.205. The r of Goldstein et al. (expected duration of linearity of distance for the three loci combined) equaled 299,058 ± 14,732 generations. Neighbor-joining trees developed from microsatellite distance data using both methods were
FIGURE 4 Frequency distributions of Oncorhynchus mykiss alleles from Omy77 (A), Omy207 (B), and Ssa289 (C) microsatellite loci given by geographic area (see Appendix I for sample locations). Frequencies have been pooled by size class (each bin includes five sequential alleles) to aid in graphic resolution. [Bar charts: frequency by allele size range (bp) for Omy77 and Omy207, and by allele size (bp) for Ssa289; series: North Coast, North Interior, South Coast, South Interior, Mexico Coast, Mexico Interior.]
TABLE II Genetic Distance Measures for Three Microsatellite Loci (Omy77, Omy207, Ssa289) from California and Mexican Trout Populations(a)

Population (haplotype)                1      2      3      4      5      6      7      8      9     10     11     12     13     14     15     16     17     18
1. Rio Yaqui (18)                     -   0.40   0.42   0.90   0.33   0.63   0.42   0.17   0.59   0.30   0.44   0.00   0.43   0.11   0.41   0.05   0.13   0.19
2. CA steelhead (1)               23.10      -   0.04   0.59   0.03   0.22   0.03   0.21   0.40   0.27   0.05   0.10   0.03   0.07   0.16   0.01   0.14   0.27
3. McCloud (1)                    41.05   4.93      -   0.59   0.04   0.09   0.05   0.28   0.34   0.29   0.05   0.14   0.05   0.01   0.17   0.09   0.18   0.30
4. Rio Santo Domingo (1)          94.92 109.53  91.92      -   0.59   0.71   0.52   0.40   0.34   0.60   0.57   0.63   0.50   0.59   0.37   0.46   0.60   0.54
5. CA steelhead (2)               25.96   0.25   3.19 106.21      -   0.07   0.03   0.24   0.34   0.21   0.04   0.10   0.05   0.03   0.19   0.04   0.12   0.25
6. CA golden (3)                  32.10  15.32  11.36  73.58  12.67      -   0.16   0.02   0.41   0.32   0.17   0.14   0.20   0.31   0.04   0.38   0.08   0.16
7. CA steelhead (3)               35.46   7.79   2.96  69.52   6.67  15.47      -   0.20   0.26   0.28   0.00   0.14   0.00   0.00   0.07   0.02   0.15   0.25
8. Kern rainbow (3)               22.46  45.41  47.06  35.31  44.46  21.32  38.37      -   0.02   0.04   0.27   0.18   0.23   0.23   0.20   0.04   0.06   0.04
9. Little Kern River golden (3)   56.43  64.87  53.29   6.08  62.38  38.83  37.27  15.21      -   0.36   0.31   0.26   0.27   0.22   0.00   0.02   0.25   0.19
10. Sacramento (3)                 7.28  36.89  50.56  79.73  37.98  25.16  46.73   9.56  46.23      -   0.33   0.11   0.30   0.05   0.24   0.12   0.06   0.06
11. CA steelhead (5)              44.09   8.51   3.24  84.11   7.52  22.56   1.15  52.65  49.41  60.04      -   0.13   0.02   0.00   0.09   0.06   0.19   0.30
12. CA steelhead (6)              11.68   3.64   9.03  81.47   3.89   9.55   8.22  23.75  43.24  18.94  12.77      -   0.20   0.00   0.19   0.00   0.02   0.10
13. CA steelhead (8)              42.67  10.00   3.61  73.11   8.75  20.02   0.43  46.01  41.12  56.06   0.40  12.11      -   0.06   0.12   0.09   0.20   0.28
14. CA steelhead (12)             28.43   1.22   3.21 108.65   0.48   9.81   8.52  44.15  64.09  37.59   9.87   4.80  10.95      -   0.18   0.04   0.02   0.16
15. CA steelhead (13)             55.70  34.91  21.37  29.47  32.46  27.98  10.63  32.80  13.24  59.33  15.12  26.95  10.75  34.92      -   0.11   0.18   0.20
16. CA golden (15)                13.39  12.20  14.44  60.25  11.51   5.34  12.22  10.79  28.12  12.16  20.01   3.01  17.24  11.36  22.48      -   0.08   0.05
17. CA golden (16)                10.00  11.70  16.64  68.54  11.45   6.57  15.21  11.83  34.02   9.29  23.17   2.63  20.78  11.25  28.72   0.43      -   0.05
18. Eagle Lake (16)               11.53  31.80  37.19  45.57  31.85  18.31  29.26   1.98  20.42   5.48  41.30  13.94  36.31  32.52  30.81   5.73   5.83      -

(a) Distance measures (Rst) obtained according to the Slatkin (1995) method using a stepwise mutation process are given above the diagonal. Distance measures calculated according to Goldstein et al. (1995) using a one-step mutation model are given below the diagonal.
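To make the two kinds of quantity in Table II more concrete, the following sketch computes a simplified, single-locus version of a ratio of the form (S - Sw)/S, together with within- and between-population mean squared size differences of the D0 and D1 kind. It is only an illustration under stated assumptions: the allele sizes (in repeat units) are invented, the estimator weighting and multi-locus summation used in the published programs are omitted, and the function names are hypothetical.

```python
from itertools import combinations

def mean_sq_diff_within(sizes):
    """Mean squared size difference over all pairs of distinct copies in one sample."""
    pairs = list(combinations(sizes, 2))
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)

def mean_sq_diff_between(x, y):
    """Mean squared size difference over all pairs drawn one from each sample."""
    return sum((a - b) ** 2 for a in x for b in y) / (len(x) * len(y))

def rst(populations):
    """Simplified Rst-like ratio: (pooled - within) / pooled mean squared difference."""
    s_w = sum(mean_sq_diff_within(p) for p in populations) / len(populations)
    pooled = [a for p in populations for a in p]
    s_bar = mean_sq_diff_within(pooled)
    return (s_bar - s_w) / s_bar

# Hypothetical allele sizes (repeat counts) for two populations at one locus.
pop_a = [10, 10, 11, 11, 12]
pop_b = [14, 15, 15, 16, 16]

rst_ab = rst([pop_a, pop_b])
d0 = (mean_sq_diff_within(pop_a) + mean_sq_diff_within(pop_b)) / 2
d1 = mean_sq_diff_between(pop_a, pop_b)
print(f"Rst-like ratio = {rst_ab:.3f}, D0 = {d0:.2f}, D1 = {d1:.2f}")
```

Because the ratio divides one mean squared difference by another, any constant scaling of allele-size variance cancels, which is the property the text attributes to Slatkin's measure. The difference D1 - D0 grows with the squared difference in mean allele size between the two samples.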
FIGURE 5 Consensus unrooted phylogenetic tree for three microsatellite loci combined (Omy77, Omy207, and Ssa289) inferred from pairwise distances (Rst) resulting from microsatellite distance analysis based on Slatkin (1995) and using neighbor-joining analysis (PHYLIP) of distance values to construct the tree. Bootstrap probability values based on 1000 replicate trees developed from bootstrapping of the original Rst distance data are given in percentages on the tree branches; values >50% are indicated in bold type.
not congruent for most haplotype populations (Figs. 5 and 6). Bootstrap values > 50% were rare among the microsatellite NJ branching units, making comparisons between the microsatellite and mtDNA trees difficult. No similar branching patterns were found by analyses of microsatellites that reflected the biogeographic associations developed from the authors' analyses of mtDNA haplotypes. The genetic similarity of the Rio Santo Domingo trout from Baja (MYS1), trout from the Little Kern River (MYS3), and a haplotype found only in southern steelhead (MYS13) was supported with > 50% bootstrap values in both microsatellite NJ trees. In both trees, close associations among the coastal steelhead populations (with the exception of haplotype MYS13) and the McCloud River rainbow trout were supported. Only the Goldstein distance method, however, supported this association with bootstrap values > 50%. Eagle Lake trout that shared a mtDNA haplotype with the South Fork Kern River golden trout (MYS16) were more closely associated with the Kern River and Sacramento River rainbow trout (both MYS3 haplotypes) using microsatellites. Neither tree supported these associations with high bootstrap values.

FIGURE 6 Unrooted phylogenetic tree for three microsatellite loci combined (Omy77, Omy207, and Ssa289) inferred from pairwise distances resulting from microsatellite distance analysis based on the Goldstein et al. (1995) single-step distance model and using neighbor-joining analysis (PHYLIP) of distance values. Bootstrap (1000 replicates) probability values developed from the Goldstein et al. (1995) program are given in percentages on the tree branches; values >50% are indicated in bold type.

C. Biogeographic Concordance

A significant correlation was observed between mtDNA haplotype variation and both latitude (ANOVA F test < 0.001) and longitude (F test = 0.01), with latitude explaining 46% of the variance within haplotypes and longitude explaining 39% of the variance. Factor analysis of mtDNA haplotype frequency showed that the first principal component explained 72% of the variation across sampling areas, whereas the second factor explained 21% of the haplotype variance. Multiple regression of the mtDNA first principal component on latitude had the highest correlation (r² = 0.74; Fig. 7), with the maximum trend detected between populations above and below 37° latitude (approximate location of Santa Cruz, CA). Regression of the first principal component on longitude gave r² = 0.62. The frequency distribution of microsatellite alleles, however, was weakly associated with longitude (F test = 0.05) and not at all with latitude (F test = 0.46). The first principal component explained 33% of the variation in microsatellite allelic frequencies across all sampling areas, whereas the second factor contributed only 9% of the variance. Multiple regression analyses of the first principal component on longitude (r² = 0.55; Fig. 8) demonstrated a maximum trend in allelic variation around 121° longitude (the approximate boundary of the Sierra Nevada Crest in north-central California). Principal components analysis of genotype distributions using both mtDNA and microsatellite data combined found that the first principal component explained 68% of the variation in genotype frequency, whereas the second factor contributed 31% of the proportionate genotype variance. Factor axis loadings for mtDNA were -0.72 (factor one) and -0.09 (factor two); for microsatellites, factor one and two axis loadings were 0.16 and 0.98, respectively.

FIGURE 7 Regression of the first principal component derived from factor analysis of mtDNA haplotype diversity on latitude (r² = 0.74). The maximum trend was detected between populations above and below 37° latitude (approximate location of Santa Cruz, CA). [Scatter plot: first principal component score by degrees latitude, 28-42.]
IV. Discussion
FIGURE 8 Regression of the first principal component derived from factor analysis of microsatellite allelic frequency on longitude (r² = 0.55). The maximum trend was detected between populations east and west of 121° longitude (approximate location of Sierra Nevada Crest in north-central California). [Scatter plot: first principal component score by degrees longitude, 108-126.]

Biogeographic structure based on analyses of mtDNA and nuclear microsatellites proved to be noncongruent in this study, with no intraspecific phylogenetic relationships supported by both markers with significant bootstrap values. This noncongruence may be explained by the documented differences found between these genetic markers and geography. Mitochondrial DNA haplotypes showed significant correlation with both longitude and latitude. Nuclear microsatellites, however, correlated only weakly with longitude and not at all with latitude. Although it is widely understood that data from different parts of the genome often evolve differently (Avise, 1994; Huelsenbeck et al., 1996), the influence different evolutionary processes may have on phylogeographic structure within closely related populations is not generally known.

Three mtDNA haplotype bioregions for coastal steelhead were suggested in Nielsen et al. (1994b). Neeley (1995) confirmed these findings using additional mtDNA haplotype data and showed significant genetic subdivision in coastal trout at 38.7° (just above the mouth of the Russian River on the north coast of California) and at 36.7° at the Pajaro River in central California. The analyses of trout mtDNA diversity presented here included interior populations from the McCloud River, the upper Sacramento River, and the Kern River basin, as well as two southern populations from Mexico. These new results support a latitudinal cline in O. mykiss mtDNA haplotype variation, but suggest that the maximum difference in variation for inland and coastal populations occurs north and south of 37° latitude. The resolution of mtDNA frequency distributions in interior trout populations from California would gain from the addition of samples from the San Joaquin River that were not available at the time of these analyses.

One mtDNA haplotype (MYS3) was common in anadromous steelhead from the Russian River north of San Francisco Bay to the Carmel River just south of Monterey, California.
This mtDNA haplotype was also found in dominant frequencies in resident trout from the upper Sacramento River, California golden trout from the South Fork Kern River at Ramshaw Meadows, Golden Trout Creek, and Johnson Creek, and in rainbow trout populations from the Kern River and the Little Kern River. These data imply an extensive geographic distribution of this haplotype in the interior populations and suggest a strong genetic relationship between resident and anadromous trout in the Sacramento River drainage and the trout of the Kern River basin. Behnke in his 1992 monograph on native trout suggests such a linkage between Sacramento River redband trout and the California golden trout based on coloration and other taxonomic characters, which would appear to support the mtDNA findings. According to Behnke (1992), the most primitive redband trout found in the Sacramento River basin is represented by fish from an isolated population found in Sheepheaven Creek near the McCloud River. mtDNA was sequenced from 11 fin clips taken from trout from Sheepheaven Creek that were sent to the author's laboratory by the California Department of Fish and Game. These Sheepheaven Creek fish were monomorphic for mtDNA haplotype MYS1, as were all 54 McCloud River rainbow trout that were sequenced. Haplotype MYS1 was most frequently found in coastal steelhead from northern California. This haplotype has never been found in California golden trout. McCloud River redband trout had microsatellite alleles that have not been found in coastal steelhead groups. One notable example was the Omy77 allele (Omy77-79), which was common in the upper Sacramento River, Kern River, and Little Kern River rainbow trout, in Eagle Lake trout, and in California golden trout, but has been found in only one steelhead from the Carmel River. A second Omy77 allele (Omy77-121) dominated frequency in the McCloud trout populations and was rarely found in coastal steelhead.

Two new mtDNA haplotypes (MYS15 and MYS16), never seen in coastal populations, were found among golden trout captured in Taylor, Fay and Manter Creeks, and in the South Fork Kern River at Ramshaw Meadows. Haplotype MYS16 was also found to be monomorphic in Eagle Lake trout. The isolated geographic distribution of this haplotype into this northern interior lake remains unclear. There have been no officially documented fish transfers from the South Fork Kern River to Eagle Lake in recent history (E. Gerstung, California Department of Fish and Game, personal communications). Microsatellite distance analyses did not link these two populations with any statistical rigor. A third unique mtDNA haplotype (MYS18) was found in the Rio Yaqui trout from northwestern Mexico.
This group of fish had a significantly different genetic profile for both mtDNA and microsatellites when compared to the rest of O. mykiss. Several alleles that dominated the microsatellite frequency in the Rio Yaqui fish were found only rarely or not at all in California trout populations. The position of this group in the evolutionary history of Pacific trout has been speculated on in several early studies (Miller, 1950, 1972; Needham and Gard, 1964), but their taxonomic status remains undefined. These genetic findings support a unique identity for this group of trout which deserves further study. The mtDNA haplotype (MYS1), which dominated anadromous steelhead populations in northern California, was also found to be fixed in Rio Santo Domingo rainbow trout from Baja California. It has been
speculated that the Baja rainbow trout originated from the anadromous coastal steelhead of southern California (Ruiz-Campos and Pister, 1995). The rare, but ubiquitous, distribution of the MYS1 haplotype throughout southern California supports a possible historic connectivity between these anadromous stocks and the resident rainbow trout populations of Baja. In an earlier study using electrophoretic analyses of allozymes, Berg (1987) found a unique creatine kinase allele (Ck-2) in Baja trout that was not found in other coastal populations. Microsatellites also paint a different picture of biogeographic associations for the Baja trout. Microsatellite alleles (Omy77-77, Omy77-87, and Omy207-124) show a closer relationship between the Baja fish and trout populations in the Kern River and the South Fork Kern River. Omy77-87 was found in only one fish from Bull Frog Lake on the Little Kern River. Omy77-77 was found only in fish from Dry Meadows Creek on the Kern River. These associations demonstrate a possible evolutionary connection between the Baja trout and the Kern River basin, suggesting an alternate evolutionary path for these fish. Both analyses of microsatellite distance supported the unity of the Baja trout with the rainbow trout of the Little Kern River with high bootstrap values. This lack of congruence between mtDNA and microsatellite allelic frequencies argues against a single Pleistocene radiation for O. mykiss. An alternative hypothesis is two radiations from a Gulf of California refugium as suggested by Behnke (1992), with one contributing to the interior redband/golden trout complex and one to the coastal radiation of steelhead and coastal rainbow trout. It is interesting that three controversial trout populations, McCloud River's Sheepheaven Creek redband trout, the Eagle Lake trout, and the Baja trout, have demonstrable differences in interpretation of their evolutionary associations based on mtDNA and microsatellites. 
It is possible that these unique trout populations represent different ancestral nodes for both radiations. Another possible explanation for the lack of congruence between mtDNA phylogenetic structure and microsatellite data is male-mediated gene flow (Karl et al., 1992). This study found significant differences in population structure in nuclear vs mtDNA assays of sea turtles (Chelonia mydas) and attributed this finding to life history differences between males and females, where females alone demonstrated a strong natal site philopatry in rookery use. Male straying from natal streams during spawning migrations in anadromous salmon is thought to be more pervasive than straying in females (Flemming and Gross, 1994; Quinn and Foote, 1994). Similar behavior and male-mediated gene flow in resident rainbow trout, however, would be limited to
straying among tributaries of the same river basin and would not seem to be a credible cause of the microsatellite allelic panmixia shown in this study across many interior river basins where there is currently no access to the ocean. The artificial transfer of trout from basin to basin could explain such a panmixia, but artificial stock transfers would not be limited strictly to male fish. A more likely mechanistic explanation for the lack of congruence among these molecular markers lies in the fact that microsatellites probably have diverged more rapidly than the mtDNA control region and may, therefore, not be useful in detecting phylogenetic relationships among closely related taxa due to a lack of lineage sorting in these markers. Our comparison of the Slatkin (1995) and Goldstein et al. (1995) distance measures gave no indication as to which of the two methods used for constructing distance matrices for microsatellites might more likely reflect trout phylogeny. The three microsatellite loci used in this study seem to fit the expectation of the singlestep allele model with the exception of a few large alleles in Omy77 and Omy207. The general results of the consensus trees were similar, with few significant branching patterns based on bootstrap analyses of 1000 trees. Analyses of additional polymorphic microsatellite loci may provide a more reliable signal for divergence of O. mykiss, but it is clear from these data that mtDNA control region sequence and microsatellites can give very different evolutionary signals for closely related groups. In a study that inferred phylogenetic trees from 10 vertebrate species, Cummings et al. (1995) suggested that a large number of genes and nucleotide sites are needed to exactly determine phylogenetic relationships. 
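The bootstrap percentages discussed for the consensus trees can be thought of as the share of replicate trees in which a given grouping reappears. The sketch below illustrates that counting step only. It assumes each replicate tree has already been reduced to a set of clades (sets of tip names); the tip labels and replicates are invented, and the function name is hypothetical rather than part of PHYLIP.

```python
from collections import Counter

def clade_support(replicate_trees):
    """Percentage of replicate trees in which each clade (set of tips) appears."""
    counts = Counter()
    for clades in replicate_trees:
        counts.update(set(clades))  # count each clade at most once per tree
    n = len(replicate_trees)
    return {clade: 100.0 * c / n for clade, c in counts.items()}

# Hypothetical replicates over five tips A-E, each given as a set of clades.
replicates = [
    {frozenset("AB"), frozenset("CD")},
    {frozenset("AB"), frozenset("CD")},
    {frozenset("AB"), frozenset("DE")},
    {frozenset("AC"), frozenset("DE")},
]

support = clade_support(replicates)
# Keep only groupings recovered in more than half of the replicates.
well_supported = {clade for clade, pct in support.items() if pct > 50.0}
for clade, pct in sorted(support.items(), key=lambda kv: -kv[1]):
    print(sorted(clade), f"{pct:.0f}%")
```

With only a few loci, many groupings fall at or below the 50% threshold, which matches the pattern of weak support described for the microsatellite trees.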
The selection of molecular markers used in phylogenetic studies is frequently made based on factors related to the historic use of the marker in systematic studies, the functional characteristics of the marker, and the ease of extraction and amplification, but not on the relevance of that marker to the evolutionary history of the population. The conflicting results reported here for mtDNA sequence data and nuclear microsatellites confirm the need to draw phylogenetic inference from several independent markers before reaching conclusions that are presumed to represent the evolutionary history of the organism.

In summary, the biogeographic results derived from mtDNA and microsatellites were not congruent for this study of trout and steelhead populations. The phylogeographic structure for mtDNA was significantly associated with both longitude and latitude in western trout populations. Unlike the conclusion drawn by Phillips and Oakley (1997), intraspecific mtDNA control region data retain significant biogeographic structure, suggesting that control region divergence can serve as a rigorous marker in the documentation of stock structure in this species. Only a weak association, however, was shown between longitude and the frequency of microsatellite alleles. The most significant separation for this marker occurred at the approximate boundary of the Sierra Nevada Crest, weakly supporting the biogeographic subdivision of O. mykiss previously reported by Allendorf (1975) for allozymes in trout. These data suggest that microsatellite, allozyme, and mtDNA data do not reflect the same evolutionary architecture in O. mykiss.

Based on morphological data, Behnke (1992) suggested a Gulf of California refugium for Oncorhynchus during the mid-Pleistocene, approximately 250,000 years ago. With 4.5% mtDNA control region sequence divergence (Nielsen et al., 1994b), the female lineage of O. mykiss appears to have retained significant phylogenetic structure for a far longer period, assuming an expected substitution rate of around 4% per million years (Avise, 1994). Microsatellites, however, with only a weak geographic association between longitude and allelic frequency distributions, seem to represent population structure that has more recently diverged, perhaps during the mid to late Pleistocene (Bailey, 1966) when the Sierra Nevada area was strongly uplifted and tilted to the west. It is interesting, however, to note that factor analyses of the geographic range of samples and genetic data from both molecular markers showed that the first two factors could be used to explain 99% of the genetic variance reported in this study, suggesting that a combination of molecular markers reflecting independent evolutionary histories does a far better job of depicting phylogeography than either one alone.
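The "far longer period" implied by the mtDNA divergence can be checked with simple arithmetic. The sketch below assumes that the 4% per million years figure is a constant rate of sequence divergence between lineages, so that time is divergence divided by rate; that reading, and the variable names, are assumptions made for illustration.

```python
# Figures quoted in the text.
divergence_percent = 4.5            # mtDNA control region sequence divergence
rate_percent_per_myr = 4.0          # assumed divergence rate per million years
refugium_age_years = 250_000        # proposed mid-Pleistocene refugium

# Time needed to accumulate the observed divergence at the assumed rate.
implied_age_years = divergence_percent / rate_percent_per_myr * 1_000_000
ratio = implied_age_years / refugium_age_years

print(f"Implied age: {implied_age_years:,.0f} years "
      f"({ratio:.1f} times the proposed refugium age)")
```

Under these assumptions the mtDNA structure would be on the order of a million years old, several times older than the proposed refugium, which is the contrast the paragraph draws.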
Acknowledgments
Numerous people were instrumental in collecting tissue from steelhead and trout for this project, clarifying our analytical approach, and editing difficult and cumbersome drafts of this paper. We express our special appreciation to Murice Cardenas, Cindy Carpanzano, Sara Chubb, Bill Cox, Karen Crow, Tom Dowling, Chris Gan, Eric Gerstung, Ed Henke, Buddy Jensen, Wendy Jones, Mat Lectner, Giles Manwaring, Bob Miller, Eric Minch, Steve Nettie, Steve Parmenter, Phil Pister, Dennis Powers, Mike Rode, Gorgonio Ruiz-Campos, Monty Slatkin, Kelley Thomas, Doug Tupper, Steve Turek, and Don Weidlein. We are grateful for the suggestions and corrections made in this manuscript by the editors and two anonymous reviewers.
References Allendorf, F. W. 1975. "Genetic Variability in a Species Possessing Extensive Gene Duplication: Genetic Interpretation of Duplicate Loci and Examination of Genetic Variation in Populations of
Rainbow Trout." Unpublished Ph.D. dissertation, University of Washington, Seattle, WA. Avise, J. C. 1994. "Molecular Markers, Natural History and Evolution." Chapman & Hall, New York. Avise, J. C., Arnold, J., Ball, R. M., Bermingham, E., Lamb, T., Neigel, J. E., Reeb, C. A., and Saunders, N. C., 1987. Intraspecific phylogeography: The mitochondrial DNA bridge between population genetics and systematics. Annu. Rev. Ecol. Syst. 18: 489-522. Avise, J. C., Nelson, W. S., and Sibley, C. G. 1994a. DNA sequence support for a close phylogenetic relationship between some storks and New World vultures. Proc. Natl. Acad. Sci. USA 91: 5173-5177. Avise, J. C., Nelson, W. S., and Sibley, C. G. 1994b. Why one-kilobase sequences from mitochondrial DNA fail to solve the Hoatzin phylogenetic enigma. Mol. Phylogenet. Evol. 3:175-184. Bailey, E. H. 1966. "Geology of Northern California." USGS Bulletin 190. CA Div. Mines and Geol. Ferry Bldg., San Francisco. Behnke, R. J. 1965. "A Systematic Study of the Family Salmonidae with Special Reference to the Genus Salmo." Doctoral dissertation, University of California, Berkeley, CA. Behnke, R. J. 1968. A new subgenus and species of trout, Salmo (Platysalmo) platycephalus, from south-central Turkey, with comments on the classification of the subfamily Salmonidae. Mitteil. Hamburg. Zool. Mus. Inst. 66:1-15. Behnke, R. J. 1992. "Native Trout of Western North America." Am. Fish. Soc. Mon. Berg, W. J. 1987. "Evolutionary Genetics of Rainbow Trout, Parasalmo gairdnerii (Richardson)." Doctoral dissertation, University of California, Davis, CA. Bernatchez, L. 1995. A role for molecular systematics in defining evolutionarily significant units in fishes. In: "Evolution and the Aquatic Ecosystem: Defining Unique Units in Population Conservation," (J. L. Nielsen, ed.), pp. 114-132 Am. Fish. Soc. Symposium No. 17, Bethesda, MD. Bernatchez, L., and Danzmann, R. G. 1993. 
Congruence in controlregion sequence and restriction site variation in mitochondrial DNA of Brook char (Salvelinus fontinalis Mitchill) Mol. Biol. Evol. 10:1002-1014. Birky, C. W., Fuerst, P., and Maruyama, T. 1989. Organelle gene diversity under migration, mutation, and drift: Equilibrium expectations, approach to equilibrium, effects of heteroplasmic cells, and comparison to nuclear genes. Genetics 121:613-627. Bowcock, A. M., Ruiz-Linares, A., Tomfohrde, J., Minch, E., Kidd, J. R., and Cavalli-Sforza, L. L. 1994. High resolution of human evolutionary trees with polymorphic microsatellites. Nature 368: 455-457. Bowen, B. W., Richardson, J. I., Melan, A. B., Margaritoulis, D., Hopkins Murphy, R., and Avise, J. C. 1993. Population structure of loggerhead turtles (Caretta caretta) in the northwestern Atlantic Ocean and Mediterranean Sea. Conserv. Biol. 7:834-844. Bruford, M. W., and Wayne, R. K. 1993. Microsatellites and their application to population genetic studies. Curr. Opin. Genet. Dev. 3: 939-943. Burke, T., Davies, N. B., Bruford, M. W., and Hatchwell, B. J. 1989. Parental care and mating behavior of polyandrous dunnocks Prunella vulgaris related to paternity by DNA fingerprinting. Nature 338:249-251. Cummings, M. P., Otto, S. P., and Wakeley, J. 1995. Sampling properties of DNA sequence data in phylogenetic analyses. Mol. Biol. Evol. 12(5):814-822. Currens, K. P., Schreck, C. B., and Li, H. W. 1990. AUozyme and morphological divergence of rainbow trout (Oncorhynchus mykiss) above and below waterfalls in the Deschutes River, Oregon. Copeia 1990(3):730-746. Digby, T. J., Gray, M. W., and Lazier, C. B. 1992. Rainbow trout mito-
chondrial DNA: Sequence and structural characteristics of the non-coding region and flanking tRNA genes. Gene 118:197-204. Di Rienzo, A. A., Peterson, A. C., Garza, J. C., Valdes, A. M., Slatkin, M., and Freimer, N. B. 1994. Mutational processes of simple sequence repeat loci in human populations. Proc. Natl. Acad. Sci. USA 91:3166-170. Ellegren, H. 1995. Microsatellites. In "Methods in Molecular Population Genetics for Ecologists" (D. T. Parkin, ed.). Blackwell Sci., Oxford. Estoup, A., Presa, P., Krieg, F., Vaiman, D., and Guyomard, R., 1993. (CT)n and (GT)n microsatellites: A new class of genetic markers for Salmo trutta L. (brown trout). Heredity 71:488-496. Felsenstein, J. 1985. Confidence limits on phylogenies: An approach using bootstrap. Evolution 39: 783-791. Felsenstein, J. 1991. "PHYLIP 3.4--Phylogeny Inference Package Distributed by Author. Department of Genetics SK-10, University of Washington, Seattle, WA. FitzSimmons, N. N., Moritz, C., and Moore, S. S. 1995. Conservation and dynamics of microsatellite loci over 300 million years of marine turtle evolution. Mol. Biol. Evol. 12(3):432-440. Flemming, I. A., and Gross, M. R. 1994. Breeding competition in a pacific salmon (Coho: Oncorhynchus mykiss): Measures of natural and sexual selection. Evolution 48:637-657. Foote, C. J., Mayer, I., Wood, C. C., Clarke, W. C., and Blackburn, J. 1994. On the developmental pathways to anadromony in sockeye salmon, Oncorhynchus nerka. Ca. J. Zool. 72:397-405. Gall, G. A. E. 1995. "California Trout of the Kern River: A Genetic Analysis. Report submitted to California Department of Fish and Game, Inland Fisheries Division, Sacramento, CA. Gall, G. A. E., Bentley, B., and Nuzum, R. C. 1990. Genetic isolation of steelhead rainbow trout in Kaiser and Redwood Creeks, California. Calif. Fish Game 76:216-223. Gerloff, U., Schlotterer, C., Rassmann, K., Rambold, I., Hohmann, G., Fruth, B., and Tautz, D. 1995. 
Amplification of hypervariable simple sequence repeats (microsatellites) from excremental DNA of wild living bonobos (Pan paniscus). Mol. Ecol. 4:515-518. Goldstein, D. B., Linares, A. R., Cavalli-Sforza, L. L., and Feldman, M. W. 1995. An evaluation of genetic distances for use with microsatellite loci. Genetics 139: 463-471. Graybeal, A. 1994. Evaluating the phylogenetic utility of genes: A search for genes informative about deep divergence among vertebrates. Syst. Biol. 43(2): 174-193. Henderson, S. T., and Petes, T. D. 1992. Instability of simple sequence DNA in Saccharomyces cerevisiae. Mol. Cell. Biol. 12:2749-2757. Hillis, D. M. 1995. Approaches for assessing phylogenetic accuracy. Syst. Biol. 44:3-16. Huelsenbeck, J. P., Bull, J. J., and Cunningham, C. W. 1996. Combining data in phylogenetic analyses. TREE 11(4): 152-158. Karl, S. A., Bowen, B. W., and Avise, J. C. 1992. Global population genetic structure and male-mediated gene-flow in the green turtle (Chelonia mydas): RFLP analyses of anonymous nuclear loci. Genetics 131:163-173. Kimura, M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111 - 120. Kelly, R., Gibbs, M., Collick, A., and Jeffreys, A. J. 1991. Spontaneous mutation at the hypervariable mouse microsatellite Ms6-hm: Flanking DNA sequence and analysis of and early somatic events. Proc. R. Soc. Lond. B 245:235-245. Kendall, A. W., Jr., and Behnke, R. J. 1984. Salmonidae: Development and relationships. In "Ontogeny and Systematics of Fishes." (H. G. Moser, W. J. Richards, D. M. Cohen, M. P. Fahay, A. W. Kendall, Jr., and S. L. Richardson, eds.), pp. 142-149. Am. Soc. Ichthyol. Herpetol., Special Publication 1, Allen Press, Lawrence, KS.
Kocher, T. D., Thomas, W. K., and Meyer, A. 1989. Dynamics of mitochondrial DNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86: 6196-6200. Long, E. O. and David, I. B. 1980. Repeated genes in eukaryotes.Ann. Rev. Biochem. 49: 727- 764. Louis, E. J. and Dempster, E. R. 1987. An exact test for Hardy-Weinberg and multiple alleles. Biometrics 43:805-811. McConnell, S. K., O'Reilly, P., Hamilton, L., Wright, J. M., and Bentzen, P. 1995. Polymorphic microsatellite loci from Atlantic salmon (Salmo salar): Genetic differentiation of North American and European populations. Can. J. Fish. Aquat. Sci. 52:18631872. Miklos, G. L. G. 1985. Localized, highly repetitive DNA sequences in vertebrate and invertebrate genomes. In "Molecular Evolutionary Genetics" (R. J. MacIntyre, ed.), pp. 231-241 Plenum Press, New York. Miller, R. R. 1950. Notes on the cutthroat and rainbow trouts with the description of a new species from the Gila River, New Mexico. Occ. Pap. Mus. Zool. Univ. M1529 :1-42. Miller, R. R. 1972. Classification of the native trouts of Arizona, with the description of a new species, Salmo apache. Copeia 1972:401422. Moore, S. S., Sargent, L. L., King, T. J., Mattick, J. S., Georges, M., and Hetzel, D. J. S. 1991. The conservation of dinucleotide microsatellites among mammalian genomes allows the use of heterologous PCR primer pairs in closely related species. Genomics 10:654-660. Morin, P. A., Moore, J. J., Chakraborty, R., Jin, L., Goodall, J., and Woodruff, D. S. 1994a. Kin selection, social structure, gene flow, and the evolution of chimpanzees. Science 265:1193-1201. Morin, P. A., Wallis, J., Moore, J. J., and Woodruff, D. S. 1994b. Paternity exclusion in a community of wild chimpanzees using hypervariable simple sequence repeats. Mol. Ecol. 3:469-478. Moritz, C., Dowling, T. E., and Brown, W. M. 1987. Evolution of animal mitochondrial DNA: Relevance for population biology and systematics. Ann. Rev. Ecol. 
Syst. 18:269-292. Moritz, C., Lavery, S., and Slade, R. 1995. Using allele frequency and phylogeny to define units for conservation and management. In "Evolution and the Aquatic Ecosystem: Defining Unique Units in Population Conservation" (J. L. Nielsen, ed.), pp. 249-262. Am. Fish. Soc. Symposium No. 17, Bethesda, MD. Morris, D. B., Richard, K. R., and Wright, J. M. 1996. Microsatellites from rainbow trout (Oncorhynchus mykiss) and their use for genetic studies of salmonids. Can. J. Fish. Aquat. Sci. 53:120-126. Needham, P. R., and Gard, R. 1964. A new trout from central Mexico: Salmo chrysogaster, the Mexican golden trout. Copeia 1964:169173. Neeley, D. 1995. A statistical evaluation of coastal California steelhead genetic data gathered by J. L. Nielsen et al. and by Trihey and Associates. Prepared for S. P. Cramer & Asso. Submitted to Association of California Water Agencies, Sacramento, CA. Neigel, J. E., and Avise, J. C. 1986. Phylogenetic relationships of mitochondrial DNA under various demographic models of speciation. In "Evolutionary Processes and Theory" (E. Nevo and S. Karlin, eds.), pp. 515-534. Academic Press, New York. Nielsen, J. L. 1996. Molecular genetics and the conservation of salmonid biodiversity: Oncorhynchus at the edge of their range. In "'Molecular Genetic Approaches in Conservation" (T. Smith and R. Wayne, eds.) pp. 383-398. Oxford University Press, London. Nielsen, J. L., Gan, C. A., and Thomas, W. K. 1994a. Differences in genetic diversity for mtDNA between hatchery and wild populations of Oncorhynchus. Can. J. Fish Aquat. Sci. 51(Suppl. 1):290297. Nielsen, J. L. Gan, C. A., Wright, J. M., Morris, D. B., and Thomas, W. K. 1994b. Biogeographic distributions of mitochondrial and
nuclear markers for southern steelhead. Mol. Marine Bio. Biotech. 3:281-293. Okazaki, T. 1984. Genetic divergence and its zoogeographic implications in closely related species Salmo gairdneri and Salmo mykiss. Jap. J. Ichthyol. 31:297-310. O'Reilly, P., and Wright, J. M. 1995. The evolving technology of DNA fingerprinting and its application to fisheries and aquaculture. J. Fish. Biol. 47(Suppl. A) :29-55. Pamilo, P., and Nei, M. 1988. Relationships between gene trees and species trees. Mol. Biol. Evol. 5:568-583. Parkinson, E. A. 1984. Genetic variation in populations of steelhead (Salmo gairdneri) in British Columbia. Can. J. Fish. Aquat. Sci. 41: 1412-1420. Phillips, R. B., and Oakley, T. H. 1997. Phylogenetic relationships among the Salmonidae based on nuclear DNA and mitochondrial DNA sequences. In "Molecular Systematics of Fishes" (T. D. Kocher and C. A. Stepien, eds.). Academic Press, San Diego. Queller, D. C., Strassmann, J. E., and Hughs, C. R. 1993. Microsatellites and kinship. Trends Ecol. Evol. 8:285-288. Quinn, T. P., and Foote, C. J. 1994. The effects of body size and sexual dimorphism on the reproductive behavior of sockeye salmon (Oncorhynchus nerka). Anim. Behav. 48: 751-761. Regan, C. T. 1914. The systematic arrangement of the fishes of the family Salmonidae. Ann. Mag. Nat. Hist. 13(8):405-408. Reisenbichler, R. R., McIntyre, J. D., Solazzi, M. F., and Landing, S. W. 1992. Genetic variation in steelhead of Oregon and Northern California. Trans. Am. Fish. Soc. 121:158-169. Ruiz-Campos, G., and Pister, E. P. 1995. Distribution, habitat, and current status of the San Pedro Martir rainbow trout, Oncorhynchus mykiss nelsoni (Evermann). Bull. S. CA Acad. Sci. 94(2):131148. Schlotterer, C., and Tautz, D. 1992. Slippage synthesis of simple sequence DNA. Nucleic Acids Res. 20: 211-215. Slatkin, M. 1991. Inbreeding coefficients and coalescence times. Genet. Res. 58:167-175. Slatkin, M. 1995. 
A measure of population subdivision based on microsatellite allele frequencies. Genetics 139:457-462. Smith, G. R., and Stearley, R. F. 1989. The classification and scientific names of rainbow and cutthroat trouts. Fisheries 14:4-10. Spencer, P. B. S., Odorico, D. M., Jones, S. J., Marsh, H. D., and Miller, D. J. 1995. Highly variable microsatellites in isolated colonies of the rock-wallaby (Petrogale assimilis) Mol. Ecol. 4:523-525. Stearley, R. F., and Smith, G. R. 1993. Phylogeny of the Pacific trouts and salmon (Oncorhynchus) and genera of the family Salmonidae. Trans. Am. Fish. Soc. 122:1-33. Stoneking, M., Hedgecock, D., Higuchi, R. G., Vigilant, L., and Erlich, H. A. 1991. Population variation of human mtDNA control region sequence detected by enzymatic amplification and sequence-specific oligonucleotide probes. Am. J. Hum. Genet. 48:370-382. Swift, C. C., Haglund, T. R., Ruiz, M., and Fisher, R. N. 1993. The status and distribution of the freshwater fishes of southern California. Bull. South. Calif. Acad. Sci. 92(2): 101-167. Tautz, D. 1989. Hypervariability of simple sequences as a general source for polymorphic DNA markers. Nucleic Acids Res. 17: 6463-6471. Thomas, W. K., and Beckenbach, A. T. 1989. Variation in salmonid mitochondrial DNA: Evolutionary constraints and mechanisms of substitution. J. Mol. Evol. 29:233-245. Thomas, W. K., Withler, R. E., and Beckenbach, A. T. 1986. Mitochondrial DNA analysis of Pacific salmonid evolution. Can. J. Zool. 64:1058-1064. Titus, R. G., Erman, D. C., and Snider, W. M. History and status of steelhead in California coastal drainages south of San Francisco Bay. Hilgardia, in press.
Utter, F. M., Allendorf, F. W., and Hodgins, H. O. 1973. Genetic variability and relationships in Pacific salmon and related trout based on protein variation. Syst. Zool. 22:257-270. Utter, F. M., and Allendorf, F. W. 1994. Phylogenetic relationships among species of Oncorhynchus: A consensus view. Conserv. Biol. 8:864-867. Vladykov, V. 1963. A review of salmonid genera and their broad geographical distribution. Trans. Roy. Bd. Can. 1 (Ser. 4, Sect. 3):459-504. Weber, J., and May, P. 1989. Abundant class of human DNA polymorphisms which can be typed using the polymerase chain reaction. Am. J. Hum. Genet. 44:388-396. Weller, S. J., Pashley, D. P., Martin, J. A., and Constable, J. L. 1994.
Phylogeny of noctuoid moths and the utility of combining independent nuclear and mitochondrial genes. Syst. Biol. 43(2):194-211. Wilson, G. M., Thomas, W. K., and Beckenbach, A. T. 1985. Intra- and inter-specific mitochondrial DNA sequence divergence in Salmo: Rainbow, steelhead, and cutthroat trouts. Can. J. Zool. 63:2088-2094. Wright, J. M. 1993. DNA fingerprinting in fishes. In "Biochemistry and Molecular Biology of Fishes" (P. W. Hochachka and T. Mommsen, eds.), Vol. 2, pp. 57-91. Elsevier Press, New York. Wright, J. M., and Bentzen, P. 1994. Microsatellites: Genetic markers for the future. In "Reviews in Fish Biology and Fisheries" (G. R. Carvalho and T. J. Pitcher, eds.). Chapman and Hall, London.
APPENDIX I
mtDNA haplotype
Location
North coast: Van Duzen River, Eel River, Albion River, Navarro River, Gualala River, Garcia River, Russian River, Salmon River, Usal Creek, Cottoneva Creek, Howard Creek, Redwood Creek, Lagunitas Creek
North interior: Sacramento River, Mears Creek, Soda Creek, Dog Creek, Slate Creek, McCloud River, Edson Creek, Dry Creek, Trout Creek, Sheepheaven Creek, Eagle Lake
1
2
5 10 7 2 1 1
1 10
3
6
8
12
13
15
16
18
Total
Total
8 25 8 2 1 1 2 4 3 3 3 5 3 68
Total
5 5 3 2 8 6 8 5 4 11 10 67
Total
4 4 6 15 9 6 6 5 7 6 9 77
2 2
3 1
1 2 1 3
6 8 5 4 11 10
South coast: San Lorenzo River, Zayante Creek, Carmel River, Santa Ynez River, Morro Bay, Scott Creek, Waddell Creek, Santa Rosa Creek, Pico Creek, Gaviota Creek, Malibu Creek
South interior: Kern River, Dry Meadows Creek, Freeman Creek, Little Kern River, Bullfrog Lake, Sheep Creek, Willow Creek, South Fork Kern River, Fay Creek, Manter Creek, Ramshaw Meadows, Taylor Creek, Golden Trout Creek, Johnson Creek
5
2 2 3 3
4
1 3 2 7
1
3 2
1 2
11 6 6
11 6 6
3 8 8 13 15 6
3 7 10
6 1
1 Total
13 15 9 6 9 10 98
APPENDIX I (Continued)
mtDNA haplotype
Location
1
Mexican coastal: Rio Santo Domingo, Arroyo San Rafael, Arroyo San Antonio, Arroyo La Zanja, Arroyo El Potrero
12 6 4 3 2
2
3
5
6
8
12
13
15
16
Mexican interior Rio Yaqui Total
99
17
108
20
7
25
7
8
7
45
18
Total
Total
12 6 4 3 2 27
11
11
11
354
APPENDIX II
Locus = Omy 77
Population (haplotype)
77
79
81
83
85
87
89
CA steelhead (1) McCloud rainbow (1) Rio Santo Domingo (1)
91
93
95
97
101
103
105
107
109
2
1
1
14
5
5
4
3
4
117
6
CA steelhead (3) 6
Little Kern golden (3)
14
Sacramento rainbow (3)
20
3 1
4
2 1
1
9
1 5
3
2
CA steelhead (6)
1 1
3
CA steelhead (12)
1
CA steelhead (13)
5
CA golden trout (15)
3 1
Eagle Lake rainbow (16) Rio Yaqui trout (18)
12 2
5 30
1 2
10
8
2
2 2 11 1
1 3
5 2
16
5
3 2
3 2
1
CA steelhead (5) CA steelhead (8)
125
127
129
131
135
1
2
3
1
7
137
141 Total
1
2
1
1
1
6
2 1
5
3 1
2
4
1 1
4
1
3
1
1 1
4 3 6
1
2 1
3
1
4
2
1 2
32 38 1
3
2
1
1
1
2
3
4
10
1 1
1
3
32 14
3
56 1
2
14 16 14 18
1
3 20
44 16 38 46
1
11
2
1 1
2 4
2 5
2
2
7 10 3
1
1 4
54 32 20
14 3
2
121
24
CA golden trout (3)
CA golden trout (16)
115
8
CA steelhead (2)
Kern River rainbow (3)
111
20 22
2
Locus = Omy 207
76
CA steelhead (1)
78
80
82
84
86
88
90
4
5
1
1
1
5
92
McCloud rainbow (1)
94
96
98
100
6
2
1
9
2
Rio Santo Domingo (1) CA steelhead (2) CA golden trout (3)
104
106
4
1
1
14 1
1
1
1
2
2
CA steelhead (3)
2
2
1 2
3
2
2
1
10
1
7
1
1
1
108
110
112
4
116
118
2
1 8
1
3
124
1
1
1
1
2
1
7
6
4
10
122
126
128
130
132
134
136
138
148 Total
1
50
2
4 1
120
1
1 1
1 4
114
6 1
3
6
21 2
Kern River rainbow (3) Little Kern golden (3)
102
20 6 2
19
30 30
1 1
38
1
1
5
1
1
3
1
38 1
2
3
1
16 36
Sacramento rainbow (3) CA steelhead (5) Kern River rainbow (5) CA steelhead (6) CA steelhead (8) Kern River rainbow (8) CA steelhead (12) CA steelhead (13) CA golden trout (15) CA golden trout (16) Eagle Lake rainbow (16) Rio Yaqui trout (18)
2
19 2
3
6
7
5 2
1
3
9 3 2 2 2
4 1
1
1
4 3
1 1
3
1
1
7
2 8
1
5
2
2
89
CA steelhead (1)
Y
91
93
95
97
2
30 8 16 14 2 17 3 22 10 12 5 9 3 6 3 14
3
4
2 11
1
1
4 5
1
2 4 14 2 4 14 14 1 1 36 2 1 4 5 12 22
3
2 1
2
9 5 3 1 2
6
2
6
1 2 1 6 2
4 4
5 27 3
2 2 2
1 3 1 3
2 3
4 1 4 1 3 1 4 2
4 4
5
3 2 1 6 1 4 1 3 4 1
1
1 1
2 2
3
4
1
1
2 1 1 2 6
101 103 105 107 109 Total
3 26 2 1
1
1
1 3
Locus = Ssa 289
McCloud rainbow (1) Rio Santo Domingo (1) CA steelhead (2) CA golden trout (3) CA steelhead (3) Kern River rainbow (3) Little Kern golden (3) Sacramento rainbow (3) CA steelhead (5) Kern River rainbow (5) CA steelhead (6) CA steelhead (8) Kern River rainbow (8) CA steelhead (12) CA steelhead (13) CA golden trout (15) CA golden trout (16) Eagle Lake rainbow (16) Rio Yaqui trout (18)
1 2
50 34 20 32 34 44 16 36 44 32 14 14 54 10 14 16 14 20 20 22
1 1 1 2
4 2
46 30 12 12 36 10 14 16 14 20 20 20
CHAPTER 6
Mitochondrial DNA Sequence Variation among the Sand Darters (Percidae: Teleostei)

E. O. WILEY
Museum of Natural History and Department of Systematics and Ecology
University of Kansas
Lawrence, Kansas 66045

ROBERT H. HAGEN
Department of Entomology
University of Kansas
Lawrence, Kansas 66045

I. Introduction

The interface between population genetics and systematics remains one of the most challenging areas in evolutionary biology. Part of the difficulty arises from differing analytical approaches. The goal of most phylogenetic studies is to reconstruct historical (sister group) relationships among taxa through analysis of character distributions. Standard phylogenetic methods work best with characters that are monomorphic within species (Swofford et al., 1996). In contrast, the goals of most population genetic studies are to infer rates and directions of ongoing processes that affect relationships among individuals and populations through analysis of allele frequencies within and among populations. Standard population genetic methods do not include information about the historical relationship among different alleles (Weir, 1996). The contrasting emphases on monomorphic versus polymorphic characters, on character states versus alleles, and on historical versus contemporaneous processes have tended to create barriers between disciplines. However, since the mid-1980s, there has been increasing convergence in the types of data used for population genetic and systematic studies. At the same time, developments in population genetic theory also have begun to provide bridges between disciplines (e.g., Slatkin and Maddison, 1989; Hudson, 1990; Templeton et al., 1992).

In 1987, Avise and colleagues coined the term "intraspecific phylogeography" for the use of molecular data to reconstruct population histories in relation to geography. The essence of their approach is a three-stage process. Molecular data are obtained from individuals sampled from geographically distinct populations. These data are next used to generate a tree showing genealogical relationships among individuals. Finally, the geographic distribution of individuals is compared with the tree. Avise et al. (1987) argued that patterns of concordance between genealogy and geography should reflect historical events responsible for current distribution of an organism. A fundamental assumption is that molecular data preserve a record of genealogy that is independent of the historical pattern of dispersal or vicariance among populations. Intraspecific phylogeography may offer significant insight into processes occurring at the interface between systematics and population genetics, but much work is still needed to refine methods.

Since 1987, several authors have proposed methods that may allow formal statistical assessment of inferences about population histories. The approaches can be divided into three categories: (1) extensions of the standard F statistics used in population genetic studies (Excoffier et al., 1992; Excoffier and Smouse, 1994); (2) extensions of spatial autocorrelation methods (Bertorelle and Barbujani, 1995); and (3) cladistically based tests of geographic associations (Slatkin and Maddison, 1989; Templeton et al., 1992, 1995). In addition to theoretical work, empirical studies are also needed to assess the utility and robustness of assumptions and methods.

This chapter uses phylogenetic methods to address both inter- and intraspecific variation among sand darters within a single hierarchical framework. The chapter begins with a brief account of cytochrome b and its relevance to the study of inter- and intraspecific variation. The chapter then describes sand darters, a small group of North American percid fishes, before moving on to the results of the study.
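As a rough illustration of the third category, a cladistically based test of geographic association, the following Python sketch counts the minimum number of location changes needed on a genealogy (Fitch parsimony) and compares that count with counts obtained after shuffling sampling locations among tips. It is written in the spirit of Slatkin and Maddison (1989) and is not the procedure used in this chapter; the tree, the tip names, the locations, and the function names are all hypothetical.

```python
import random


def min_location_changes(tree, locations):
    """Fitch parsimony count of location changes on a rooted binary tree.

    `tree` is a nested 2-tuple whose leaves are tip names (strings);
    `locations` maps each tip name to the place where it was sampled.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):        # a tip: its state set is its location
            return {locations[node]}
        left, right = (visit(child) for child in node)
        shared = left & right
        if shared:                       # children agree: no change required
            return shared
        changes += 1                     # children disagree: one change
        return left | right

    visit(tree)
    return changes


def association_test(tree, locations, n_perm=999, seed=1):
    """Permutation test: is the observed count unusually small?"""
    rng = random.Random(seed)
    observed = min_location_changes(tree, locations)
    tips = list(locations)
    places = [locations[tip] for tip in tips]
    as_few = 0
    for _ in range(n_perm):
        rng.shuffle(places)              # break any genealogy/geography link
        shuffled = dict(zip(tips, places))
        if min_location_changes(tree, shuffled) <= observed:
            as_few += 1
    return observed, (as_few + 1) / (n_perm + 1)


# Hypothetical genealogy: four western and four eastern individuals
# that fall into two geographically pure clades.
tree = ((("w1", "w2"), ("w3", "w4")), (("e1", "e2"), ("e3", "e4")))
locations = {tip: ("west" if tip.startswith("w") else "east")
             for tip in ("w1", "w2", "w3", "w4", "e1", "e2", "e3", "e4")}

observed, p_value = association_test(tree, locations)
print(observed, p_value)
```

A small count relative to the shuffled data sets suggests that genealogy and geography are concordant. In this toy example a single change explains the whole distribution, and few random relabelings do as well.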
A. Cytochrome b

Analysis of mitochondrial DNA (mtDNA) sequence variation has become a well-established tool for studying fish evolution (reviewed in Meyer, 1994a). For this study, the authors chose to sequence a portion of the cytochrome b gene. Because the cytochrome b gene is a well-characterized gene that codes for an evolutionarily conservative protein, it has been used in a large number of systematic and population studies. The availability of polymerase chain reaction (PCR) primers that reliably amplify portions of the gene (Irwin et al., 1991), ease of aligning sequences from different species, and the ability to compare results from other studies all contribute to this popularity.

Graybeal (1993) and Meyer (1994b) have issued cautions about the uncritical use of cytochrome b sequences in systematic studies. However, most of the difficulties appear at high levels of divergence, when widely separated taxa are included in the analyses. Krajewski and King (1996), using data from a series of phylogenetic studies on cranes (Gruidae), found that cytochrome b sequences yielded consistent results even with uncorrected divergences of up to 11%. Most studies have used cytochrome b sequences for studies at lower taxonomic levels, including studies not reviewed by Meyer (1994b) on Rivulus (Murphy and Collier, 1996) and Gambusia (Lydeard et al., 1995).

The usefulness of cytochrome b sequences for intraspecific studies is more likely to be limited by lack of variation, although there is no obvious reason why the amount of variation should be less than for any other mitochondrial region. In a comparison of restriction fragment polymorphism with partial cytochrome b sequences, Birt et al. (1995) found comparable levels of mitochondrial variation within Mallotus villosus (Atlantic capelin) population samples from both techniques. Cytochrome b sequences have been used to detect intraspecific variation in five nominal species of rainbow fishes (Melanotaenia: Zhu et al., 1994), in three species of South American rodents (Patton et al., 1996), in the Atlantic cod (Gadus morhua: Carr et al., 1995), and in the Pacific sockeye salmon (Oncorhynchus nerka: Bickham et al., 1995).
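The "uncorrected divergence" mentioned above is simply the proportion of aligned sites at which two sequences differ. A minimal Python sketch follows, assuming pre-aligned sequences of equal length; the function name and the short example sequences are hypothetical and are not taken from any cytochrome b data set.

```python
def uncorrected_divergence(seq_a, seq_b):
    """Proportion of compared sites that differ (the p-distance).

    Sites where either sequence has a gap or an ambiguity code are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue                     # ignore gaps and ambiguous bases
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical 10-base fragments differing at one site
print(uncorrected_divergence("ACGTACGTAC", "ACGTACGTAT"))  # 0.1
```

Because this measure does not correct for multiple substitutions at the same site, it tends to underestimate the true amount of change as sequences become more divergent, which is one reason caution is advised at higher taxonomic levels.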
II. Systematics of Sand Darters

Percidae comprises some of the more familiar Eurasian and North American freshwater fishes, including the yellow perches (Perca), walleyes and saugers (Stizostedion), the ruffes (Gymnocephalus), the North American darters (Crystallaria, Percina, and Etheostoma), and two darter-like European genera (Zingel and Romanichthys). Darters are the largest percid group with approximately 164 described species distributed throughout eastern North America (Mayden et al., 1992). Sand darters consist of six species of small (a maximum of 50-60 mm standard length), translucent, insectivorous predators that live in clear streams, usually over sand bottoms. They typically bury into the sand and await their prey. Two typical species are shown in Fig. 1. Prior to Simons (1991, 1992) and Wiley (1992), seven species of percids were considered sand darters and placed in their own genus, Ammocrypta (Williams, 1975). One species, the crystal darter (Crystallaria asprella), was shown to be the sister group of Percina + Etheostoma (Simons, 1992; Wiley, 1992). The remaining species were shown to be related to species well embedded in Etheostoma (Simons, 1992), specifically to darters of the subgenera Ioa (monotypic: Etheostoma vitreum) and Boleosoma (five species including the common johnny darter, E. nigrum). Thus, Ammocrypta in the strict sense (s.s.) is now regarded as a subgenus of Etheostoma.

FIGURE 1 Two members of Etheostoma (Ammocrypta): (a) E. beanii and (b) E. bifascia. From Williams (1975); reproduced with permission of the author and the Bulletin, Alabama Museum of Natural History.

Williams (1975) recognized two species groups within Ammocrypta s.s.: the E. beanii group and the E. pellucidum group (Fig. 2a). The E. beanii group comprised three species. Etheostoma beanii (Jordan) inhabits the Gulf Coastal Plain from the Hatchie River, southwest Tennessee, south along eastern tributaries of the Mississippi River to Lake Pontchartrain, Louisiana, and south and east to the Tombigbee and Alabama rivers of Alabama (Stauffer, 1980a; Fig. 3). Etheostoma bifascia (Williams) is distributed along Gulf Coast drainages in southern Alabama and western Florida from the Perdido River east to the Choctawhatchee, with possible introduction to the Apalachicola River (Stauffer et al., 1980; Fig. 3). Etheostoma clarum (Jordan and Meek) is sporadically distributed from the Neches and Sabine rivers in Texas north through the Mississippi Valley to Minnesota and Wisconsin, with populations in the Green and Cumberland river drainages of Kentucky (Stauffer, 1980b; Fig. 4). The E. pellucidum group also comprised three species. Etheostoma pellucidum (Agassiz) is the northern species of the group and is found throughout the Ohio River basin south to western Kentucky and north to the southern margin of Lake Huron, around Lake Claire, and Lake Erie, with a disjunct population in the central tributaries of the St. Lawrence-Lake Champlain drainage (Hocutt, 1980b; Fig. 4). Etheostoma vivax (Hay) is
distributed from the Trinity River basin of eastern Texas, east to the Pascagoula River drainage of Mississippi, and north along the major tributaries of the Mississippi River to southern Missouri and western Tennessee and Kentucky (Stauffer and Hocutt, 1980; Fig. 4). Etheostoma meridianum (Williams) occupies the Tombigbee and Alabama rivers and their tributaries in Mississippi and Alabama, immediately adjacent to the southeastern range of E. vivax (Hocutt, 1980a; Fig. 4). Williams's (1975) recognition of two groups of sand darters was largely intuitive, i.e., not based on synapomorphic characters. Simons (1992) analyzed the relationships among members of the group with phylogenetic methods using a number of different morphological characters and arrived at a different hypothesis (Fig. 2b). He hypothesized that E. clarum was the basal member of the clade, removing it from the E. beanii group while maintaining the E. pellucidum group sensu Williams (1975). Although Simons (1992) hypothesized that E. meridianum and E. pellucidum were sister species, he acknowledged that support for this hypothesis was weak and that recognition of the three species as closest relatives rested on a single character. The most recent attempt to understand the relationships of sand darters was undertaken by Shaw et al. (1997) using morphology and allozyme data. They removed E. pellucidum to a more basal position, between E. clarum and the remaining four species (Fig. 2c), and hypothesized that E. vivax was the sister of E. beanii + E. bifascia. This chapter presents a new level of analysis of the sand darters, based on comparison of mitochondrial DNA sequences. Its objectives are threefold: (1) to further test the three different hypotheses of sand darter
[Figure 10 graphic: haplotype network overlaid with drainage labels (Alabama, Perdido, Escatawpa & Pascagoula) and species labels (E. beanii, E. bifascia).]
FIGURE 10 A network of haplotypes observed in E. beanii and E. bifascia. Dots represent unobserved haplotypes one mutational step removed from other dots and/or observed haplotypes. The root occurs between haplotypes B-17 and B-12. Drainages from which haplotypes were observed are overlaid on the network. Note that for graphic reasons the geographic positions of drainages are not accurate. The actual spatial relationships among the drainages from west to east are: Pascagoula, Escatawpa, Tombigbee, Alabama (Tombigbee + Alabama = Mobile Bay drainage), Perdido, Escambia, Blackwater, Yellow rivers.
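As a rough illustration of how a network such as the one in Fig. 10 is assembled from sequence differences, the Python sketch below joins every pair of haplotypes that differ at exactly one site and then counts closed loops among those one-step connections. The haplotype names and the four-base sequences are invented for the example and do not correspond to the B-series haplotypes; the function names are likewise hypothetical, and published network methods add further steps (such as inferring unobserved intermediate haplotypes) that are omitted here.

```python
from itertools import combinations


def n_differences(a, b):
    """Number of aligned sites at which two haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def one_step_edges(haplotypes):
    """All pairs of haplotypes separated by a single mutational step."""
    return [(p, q) for p, q in combinations(sorted(haplotypes), 2)
            if n_differences(haplotypes[p], haplotypes[q]) == 1]


def count_loops(nodes, edges):
    """Count independent closed loops with a simple union-find pass."""
    parent = {node: node for node in nodes}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]   # path compression
            node = parent[node]
        return node

    loops = 0
    for p, q in edges:
        root_p, root_q = find(p), find(q)
        if root_p == root_q:
            loops += 1                   # edge closes a cycle
        else:
            parent[root_p] = root_q      # edge joins two components
    return loops


# Hypothetical haplotypes: h1-h4 form a square, h5 hangs off h4
haplotypes = {"h1": "ACCG", "h2": "ATCG", "h3": "ACTG",
              "h4": "ATTG", "h5": "ATTA"}

edges = one_step_edges(haplotypes)
print(edges)
print(count_loops(haplotypes, edges))  # 1
```

In the toy data, h4 can be reached from h1 through either h2 or h3 in the same number of steps, so the one-step connections form one closed loop. Sequence data alone cannot say which of the two routes was actually taken.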
types are all unique within the set of E. beanii/E. bifascia haplotypes. The root of the entire E. beanii/E. bifascia clade lies between haplotypes B-17 and B-12, based on the reconstructed ancestral character state vector derived from the interspecific phylogenetic tree shown in Fig. 7. Of these two haplotypes, B-17 is one step from the ancestral node reconstructed by the parsimony analysis whereas B-12 is three steps removed. Another feature of the network in Fig. 10 is the closed loop connecting haplotypes B-1, B-3, B-2, and B-9. The presence of such loops indicates ambiguity among alternative, equally short, pathways connecting the
haplotypes (Excoffier and Smouse, 1994). Two circumstances could produce this condition: either one of the haplotypes arose from independent mutations in two different lineages (i.e., convergence) or one of the line segments represents a direct evolutionary path that did not actually occur (i.e., a case of back mutation resulting in character reversal at a particular site). As examples, haplotype B-2 may have been derived independently from B-9 and B-3 (convergence), or haplotypes B-1 and B-3 may differ by three mutations rather than one (transitions from C to T at positions 147 and 282, followed by reversal to C at 147; Table II). The rooted version of the network shown in Fig. 10 provides more biogeographic information than the haplotype tree. Haplotypes common to both species (B-1) or with descendants in both species (B-12) are near the middle of the network and have many descendants. Haplotypes represented by single individuals occur at the tips of the branches, as would be expected if these represent recent divergences. As in the tree, the three Escatawpa and Pascagoula haplotypes are clearly distinct, but co-occurring haplotypes in other drainages tend to be clustered on the network. There is an overall west-to-east pattern of haplotype distributions and most haplotypes are confined to single drainages (e.g., Perdido River populations) or drainages that share a common bay (e.g., Escatawpa and Pascagoula populations). This congruence would be expected of populations that have experienced little gene flow over a long period of time, sufficient for new haplotypes to evolve in situ. The authors note that while gene flow is lacking, the rooted network apparently does contain some haplotype relationships that imply historical connections between drainages. For example, the Escambia River is linked to both the Yellow and Blackwater and the Perdido through a series of ancestral/descendant haplotypes. Some of these relationships are probably younger than others.
For example, the relationship between B-3 and B-11 suggests a relatively recent gene flow event because B-11 is far removed from any relationship with other Perdido haplotypes.

2. E. vivax and E. meridianum
Eight haplotypes were detected among 11 E. vivax individuals sequenced. The V-2 haplotype occurred in fish from the geographically close samples taken from the Mississippi and St. Francis rivers; the remaining haplotypes appeared only in single drainages. Two haplotypes were observed among the four E. meridianum individuals sequenced (Table V). For these species, the phylogenetic trees based on equally weighted substitutions (Figs. 5 and 6) and the minimum spanning network (Fig. 11) yield identical information. The eight E. vivax and two E. meridianum haplotypes resolve into five distinct haplotype lineages, none of which can be identified as basal. Each of the haplotype lineages is composed of individuals from the same or interconnected drainages. Greater divergence among haplotypes does limit the effectiveness of the network approach to genealogical reconstruction. When basal haplotypes are missing from the array (due to extinction or insufficient sampling), most of the network reduces to a tree, although information may still be obtained at the branch tips (Templeton et al., 1992). The small sample sizes representing these taxa limit the biogeographic analyses that can be made. The estimated nucleotide diversity among E. vivax was four times higher than for E. beanii and E. bifascia: 0.0142. However, the geographic area represented by these samples was also considerably larger, which could also account for the difference.

6. mtDNA Sequence Variation among Sand Darters

TABLE V Distribution of Haplotypes among E. vivax and E. meridianum Samples

Haplotype columns: M-1, M-2, V-1, V-2, V-3, V-4, V-5, V-6, V-7, V-8

Species      Drainage       Sample size
meridianum   Tombigbee      4 (M-1: 3; M-2: 1)
             Subtotal       4
vivax        Mississippi    2
             St. Francis    2
             Ouachita       3
             Pearl          2
             Sabine         2
             Subtotal       11
Total                       15
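The nucleotide diversity comparison above can be illustrated with a minimal sketch. It assumes diversity is estimated as the mean proportion of differing sites over all pairs of sampled sequences (one common form of the estimator in Nei, 1987); this section does not state the exact estimator used, so the function names and the toy sequences below are hypothetical and are not data from this study.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(x != y for x, y in zip(seq_a, seq_b))
    return differences / len(seq_a)


def nucleotide_diversity(sequences):
    """Mean pairwise p-distance over all pairs of sampled individuals."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment (four individuals, ten sites); not real data.
toy_sample = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTACGTAT",
    "ACGAACGTAT",
]

print(round(nucleotide_diversity(toy_sample), 4))
```

Because every individual enters the average, a haplotype carried by many fish pulls the estimate down, which is one reason a widely sampled species such as E. vivax can show higher diversity than a more narrowly sampled one.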
FIGURE 11 A network of haplotypes observed in E. vivax and E. meridianum. Dots represent unobserved haplotypes one mutational step removed from other dots and/or observed haplotypes. Drainage labels in the figure: Ouachita, Tombigbee, Sabine, Mississippi & St. Francis, Pearl.

V. Discussion

A. Phylogenetic Analysis
The major difference between the DNA-only analyses and those based on morphology and allozymes presented by Shaw et al. (1997) concerns the phylogenetic placement of E. pellucidum and the E. vivax-E. meridianum species pair. Shaw et al. (1997) hypothesized that E. pellucidum was the sister group of all Ammocrypta except E. clarum, a hypothesis identical to DNA tree 1 (Fig. 5), but not to DNA tree 2 (Fig. 6). If only a few individuals of each species had been sampled, the authors would have consistently drawn a conclusion at variance with the morphological and allozymic data unless they employed an approach minimally incorporating the DNA and morphological evidence. The authors' DNA-morphology tree favors the basal placement of E. clarum (Fig. 5) rather than a monophyletic group comprising E. clarum and E. pellucidum (Fig. 6). The authors consider this combined analysis tree (Fig. 9) to be their best estimate of the phylogenetic relationships among sand darters in this study because of the principle of total evidence (Kluge, 1989) and thus conclude that Shaw et al. (1997) correctly placed E. clarum and E. pellucidum. Regarding
E. O. WILEY AND ROBERT H. HAGEN
E. vivax and E. meridianum, the authors consistently arrived at the hypothesis that this species pair is monophyletic regardless of the data and weighting scheme used (Figs. 5, 6, and 9). Given that the morphological data used by Shaw et al. (1997) as well as DNA data were used, the authors suggest that this is a more robust phylogenetic hypothesis than the alternative hypothesis that E. vivax is the sister to the E. beanii-E. bifascia pair. The biogeographic history implied by the total evidence phylogeny also provides some corroboration. A biogeographic pattern of a Mobile Bay drainage endemic being closely related to a species found west and north of the Mobile Bay is replicated in both the Hybopsis longirostris species group (Wiley and Titus, 1992) and the Lythrurus roseipinnis species group (Wiley and Siegel-Causey, 1994).
B. Biogeography of E. vivax and E. meridianum

E. vivax is a relatively widespread species and it is not surprising that the populations sampled displayed a diversity of haplotypes. Although two of the E. vivax samples were geographically close (upper Pearl River drainage) to the samples of E. meridianum, they were not particularly close to that species in terms of sequence similarity. The authors hesitate to speculate as to the cause of the diversity observed in E. vivax haplotypes and suggest that a broader survey, including greater sample sizes and better geographic coverage, is needed.
C. Biogeography of E. beanii and E. bifascia
Despite the relative proximity of the rivers from which E. beanii and E. bifascia samples were collected, there was no indication of ongoing gene flow between populations within species as evidenced by haplotypes shared among drainages. The few exceptions were not sufficient to obscure the correlation. AMOVA analysis (Excoffier et al., 1992) does not contribute much additional information because there is little overlap among haplotype distributions. AMOVA may be more appropriate for situations where current gene flow plays a more significant role between populations within species. In contrast to clear differentiation among river drainage systems within species, relationships between the two species were obscure. Clearly, the gene tree shown through parsimony analysis and the haplotype network was not congruent with relationships on the species level, i.e., neither the gene tree nor the haplotype network supports the conclusion that E. beanii and
E. bifascia are separate species, but both morphological (Williams, 1975) and allozyme data (Shaw et al., 1997) leave no doubt that E. beanii and E. bifascia are separate lineages ("good" species). The authors suspect that this incongruence is due to the retention of ancestral haplotypes subsequent to speciation followed by mutational events derived from these haplotypes isolated in separate drainages. Such a scenario would not be expected to mirror evolution on the morphological or allozymic levels. This speculation is based on three lines of evidence. First, when the haplotype network was inspected, the authors observed that the haplotype shared by both species (B-1) is relatively far removed from the root of the network as shown by the phylogenetic analysis and that one found only in E. bifascia (B-12) is between the shared haplotype (B-1) and the ancestral node (Fig. 10). Second, the shared haplotype is not found in geographically contiguous drainages, being found in the Tombigbee (E. beanii) and Yellow and Blackwater (E. bifascia) rivers. Given the lack of gene flow within species between drainages, it is unlikely that this shared haplotype is the result of gene flow. Third, the root of the network, as shown by the phylogenetic tree, is between the two major populations of E. beanii. Although the authors do not doubt that the Pascagoula/Escatawpa system is isolated from the Mobile Bay (Alabama-Tombigbee) drainage, it is believed that this occurred after the speciation event that separated E. beanii and E. bifascia. This is based on the general geologic history of the region, the monophyly of the species pair, and the distinctiveness of the two species, which indicates that any split occurring between populations of E. beanii must have occurred after, not before, the origin of E. beanii and E. bifascia from their common ancestor. If this interpretation is accepted, then the historical gene flow patterns implied by the rooted network can be further resolved (Fig. 
10). Older gene flow patterns are represented by haplotypes B-1 and B-12 and denote gene flow that occurred before the origin of either descendant species from their common ancestor. If so, then the ancestor/descendant patterns between these haplotypes and others found in both E. beanii and E. bifascia do not imply recent gene flow between these two species. Furthermore, they do not imply recent gene flow within E. bifascia relative to the relationship between the Blackwater and Yellow rivers on the one hand and the Escambia River on the other hand. Thus, the authors suggest that the mere presence of shared haplotypes cannot be automatically interpreted as evidence of gene flow. Rather, shared haplotypes are better interpreted within a historical context.
Parts of the rooted network give indications of historical gene flow between drainages within species. There are two possible kinds of relationships within species between drainages. First, a haplotype might be shared between drainages. Haplotype B-1, as an ancestral haplotype, may not be indicative of a relationship between the Yellow and the Blackwater rivers, but B-9, a derived haplotype, probably is indicative of that relationship. Second, ancestor/descendant relationships among different haplotypes occupying different drainages might be indicative of a relationship. For example, however we resolve the loop B-1, B-9, B-2, B-3, the relationships of these haplotypes to other haplotypes in the Escambia, Yellow, and Blackwater rivers denote historical gene flow exclusive to these drainages. The Perdido, highly isolated, shows some affinities to the contiguous Escambia through the B-3/B-11 haplotype relationship, but we must be cautious. Haplotype B-1, if interpreted as an ancestral haplotype, is not indicative of a close relationship among the Perdido, Escambia, Yellow, and Blackwater rivers. The link between B-1 and B-12 does not imply a close relationship among the Yellow, Blackwater, and Perdido rivers. Just as shared haplotypes might not be indicative of a relationship between drainages, the link between ancestral and descendant haplotypes may be an indication of more ancient rather than more recent gene flow.
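The notion of a closed loop in a haplotype network, such as the one among B-1, B-9, B-2, and B-3, can be made concrete with a small sketch. It treats the network as an undirected graph and uses a union-find structure to report every edge that closes a loop. The edge list is only an illustrative subset pieced together from the haplotype links mentioned in the text (the four loop edges, the B-3/B-11 link, and the B-1/B-12 link); it is an assumption, not the full network of Fig. 10, and `find_loop_edges` is a hypothetical helper name.

```python
def find_loop_edges(edges):
    """Return the edges that close a loop in an undirected network.

    Edges are processed in order; an edge whose two endpoints are already
    connected by earlier edges creates an alternative pathway (a loop).
    """
    parent = {}

    def find(node):
        # Locate the representative of the node's component (path halving).
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    closing_edges = []
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            closing_edges.append((a, b))
        else:
            parent[root_a] = root_b
    return closing_edges


# Illustrative subset of links named in the text; NOT the full Fig. 10 network.
network_edges = [
    ("B-1", "B-3"),
    ("B-3", "B-2"),
    ("B-2", "B-9"),
    ("B-9", "B-1"),   # fourth side of the loop
    ("B-3", "B-11"),
    ("B-1", "B-12"),
]

print(find_loop_edges(network_edges))
```

Which edge is reported as the closing one depends on input order. This mirrors the ambiguity discussed above: the loop itself does not say which of its segments represents a path that did not actually occur.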
D. Speciation and Biogeography

E. beanii and E. bifascia are part of a larger story involving replicate speciation among several groups of fishes inhabiting the northern Gulf Coastal Plain. Wiley (1977) suggested that the speciation event involving two sister species of topminnows (Fundulidae), Fundulus nottii and F. escambiae, might be correlated with tectonic events occurring in the region that had apparently shifted drainage patterns from a northeast-southwest direction to a north-south direction (Price and Whetstone, 1977). This event would have separated the Mobile Bay drainage basin (Alabama and Tombigbee rivers) from those immediately to the east (Perdido and Escambia rivers), producing the biogeographic pattern observed in the two species. Wiley and Mayden (1985) reviewed biogeographic distributions of a number of species groups of fresh and brackish water fishes, as well as selected other aquatic vertebrates and invertebrates. The same drainage boundaries corresponded to species limits for a number of groups. Wiley and Mayden (1985) identified six additional species pairs distributed in such a manner that one of the pair occupied the Mobile Bay basin and drainages to the west, whereas the sister species occupied the Perdido and Escambia river drainages and
drainages to the east. These include two groups of topminnows (F. nottii-F. escambiae; F. confluentus-F. pulvereus), two species pairs of darters (E. chlorosoma-E. davisoni; E. beanii-E. bifascia), two species pairs of minnows (L. roseipinnis-L. atripiculus; H. longirostris-Hybopsis sp.), and three species pairs of snakes (Natrix rhombifera-N. taxispilota; N. cyclopion-N. floridana; Farancia reinwardti-F. abacura). Two apparent exceptions to this pattern, a group of Hybopsis minnows (the H. longirostris group) and a group of Lythrurus minnows (the L. roseipinnis group), were shown by subsequent phylogenetic analyses to conform to the pattern (Wiley and Titus, 1992; Wiley and Siegel-Causey, 1994). Additional groups that might be implicated in the pattern include subspecies of the pike Esox americanus (Crossman, 1966), species of the blenny genus Chasmodes (Williams, 1983), and the mosquito fishes Gambusia affinis and G. holbrooki (Wooten et al., 1988; Wooten and Lydeard, 1990; Scribner and Avise, 1993). What emerges from the phylogenetic studies is a pattern of vicariance. In the case of closely related and relatively recent species pairs such as E. beanii and E. bifascia, dispersal apparently has not altered the original vicariance pattern and it is relatively easy to correlate this pattern with geologic events. When one examines the patterns of the clades to which the various species pairs studied by Wiley and Mayden (1985) belong, it quickly becomes apparent that simple vicariance explanations of deeper nodes are difficult (Wiley and Mayden, 1985), but there is some trace among the fishes. The closest relatives of H. longirostris and Hybopsis sp. live in the Mobile Bay basin and in tributaries west to the Mississippi. The same is generally true of the relatives of L. roseipinnis and L. atripiculus: L. bellus, a Mobile Bay endemic, is the sister species of the pair whereas other relatives are found to the north and west (L. ardens and L. umbratilis). F. 
nottii and F. escambiae are a bit different in having an eastern sister species (F. lineolatus), but the three are related to two species that have more western distributions. Thus, finding that E. pellucidum is not, in fact, closely related to either E. vivax or E. meridianum (Shaw et al., 1997) brings the biogeography of the subgenus in closer agreement with what is known of the phylogeny and biogeography of fishes with similar distributions and presumably similar histories. Swift et al. (1986) suggest that the eastern Gulf Coast lowland fish fauna represented by such groups as the F. nottii group and, by extension, Ammocrypta colonized the coastal plain during the late Miocene during a period of low sea levels. Their time table of vicariance suggests a late Miocene-early Pliocene (4-5 mybp) event and certainly an event that happened no later than the late Pliocene [1-2 mybp: see Price and Whetstone (1977) for geological evidence involving changes in drainage patterns]. In other groups, Pleistocene events are not correlated with speciation, but with geographic variation (Swift et al., 1986). Thus, differences within species among the various drainages along the Gulf Coastal Plain would be expected, but the authors conclude that they are younger than the vicariance events that produced the species observed in this study. If a relatively recent event (isolation of the Pascagoula/Escatawpa from the Mobile Bay within E. beanii) falls out at the base of a phylogenetic tree while a more ancient event (speciation of E. beanii and E. bifascia) does not, the most likely conclusion is that both species retained an array of haplotypes that were present in their common ancestor and that, given this, there is no expectation that the gene tree will reflect the species tree until such time as these ancestral haplotypes go extinct and more lineage-specific haplotypes emerge. Obviously, 2-4 million years has been insufficient time for this to occur.
VI. Summary

DNA sequence data were obtained from the N-terminal end of the mitochondrial cytochrome b gene and a portion of the adjacent glutamine tRNA gene for 89 individuals representing six species of Etheostoma (Ammocrypta) and two closely related species, E. vitreum and E. nigrum. Substitutions were found at 27% of 422 sites and 71% of these were transitions. Six substitutions produce nonsynonymous codons out of the 134 cytochrome b codons included in the sequence. Among the recently diverged sister species E. beanii and E. bifascia, there were 21 substitutions (5%), none of which were fixed between species. Phylogenetic analyses of DNA data with haplotypes as terminal taxa were performed using equally weighted and transversion-weighted matrices. A total evidence analysis was also performed that favored one of the DNA-only trees. In no case were the relationships among the haplotypes of E. beanii and E. bifascia resolved. Rather, some E. beanii haplotypes were basal relative to the remaining E. beanii and all E. bifascia haplotypes. A minimum spanning network of haplotypes was constructed. This network suggests that river drainage systems are isolated from each other and that haplotypes show high drainage system affinity. The placement of the root derived from the phylogenetic analysis onto the network suggests that there has been insufficient time for complete lineage sorting as E. beanii and E. bifascia diverged from their common ancestor. Despite this, the rooted network does suggest biogeographic relationships of populations from some of the drainages within E. bifascia.
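The distinction between transitions and transversions that underlies the weighting schemes summarized above can be sketched briefly. A transition exchanges one purine for the other (A, G) or one pyrimidine for the other (C, T), as in the C to T changes discussed for the haplotype network; every other substitution is a transversion. The helper names and the two toy sequences are hypothetical and are not sequences from this study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(base_a, base_b):
    """Return 'transition', 'transversion', or None if the bases match."""
    if base_a == base_b:
        return None
    pair = {base_a, base_b}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def count_substitutions(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"transition": 0, "transversion": 0}
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        # Skip gaps and ambiguity codes rather than guessing their state.
        if base_a not in "ACGT" or base_b not in "ACGT":
            continue
        kind = classify_substitution(base_a, base_b)
        if kind is not None:
            counts[kind] += 1
    return counts


# Hypothetical toy sequences; not data from the study.
print(count_substitutions("ACGTACGT", "ATGTGCGA"))
```

A transversion-weighted analysis would then assign the transversion count a larger cost than the transition count when scoring trees; the specific weights are a choice of the analyst and are not given in this summary.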
Acknowledgments

We thank Bob Cashner and Steve Stevenson (University of New Orleans), Rick Mayden, Berney Kuhajda, and Andrew Simons (University of Alabama), Hank Bart (Tulane University), Doug Siegel-Causey (University of Nebraska), Tim Schmidt (Wayne State University), George Harp (Arkansas State University), and Frank Cross and Kate Shaw (University of Kansas) for their help in collecting specimens in the field. Thanks to Larry Page (Illinois Natural History Survey) for specimens of E. pellucidum. Thanks to Bob Jenkins (Roanoke College) for specimens of E. vitreum. This project was generously supported by grants from the General Research Fund, University of Kansas (3208), and from the National Science Foundation (BSR 8722562 for field work and DEB 9207600 for data collecting and analysis).
References

Avise, J. C. 1992. Molecular population structure and the biogeographic history of a regional fauna: A case history with lessons for conservation biology. Oikos 63:62-76.
Avise, J. C. 1994. "Molecular Markers, Natural History and Evolution." Chapman and Hall, New York.
Avise, J. C., Arnold, J., Ball, R. M., Bermingham, E., Lamb, T., Neigel, J. E., Reeb, C. A., and Saunders, N. C. 1987. Intraspecific phylogeography: The mitochondrial DNA bridge between population genetics and systematics. Annu. Rev. Ecol. Syst. 18:489-522.
Bertorelle, G., and Barbujani, G. 1995. Analysis of DNA diversity by spatial autocorrelation. Genetics 140:811-819.
Bickham, J. W., Wood, C. C., and Patton, J. C. 1995. Biogeographic implications of cytochrome b sequences and allozymes in sockeye (Oncorhynchus nerka). J. Hered. 86:140-144.
Birt, T. P., Friesen, V. L., Birt, R. D., Green, J. M., and Davidson, W. S. 1995. Mitochondrial DNA variation in Atlantic capelin, Mallotus villosus: A comparison of restriction and sequence analyses. Mol. Ecol. 4:771-776.
Carr, S. M., Snellen, A. J., Howse, K. A., and Wroblewski, J. S. 1995. Mitochondrial DNA sequence variation and genetic stock structure of Atlantic cod (Gadus morhua) from bay and offshore locations on the Newfoundland continental shelf. Mol. Ecol. 4:79-88.
Crossman, E. J. 1966. A taxonomic study of Esox americanus and its subspecies in Eastern North America. Copeia 1966(1):1-20.
Danzmann, R. E., and Ihssen, P. E. 1995. A phylogeographic survey of brook charr (Salvelinus fontinalis) in Algonquin Park, Ontario based upon mitochondrial DNA variation. Mol. Ecol. 4:681-697.
Excoffier, L., and Smouse, P. E. 1994. Using allele frequencies and geographic subdivision to reconstruct gene trees within a species: Molecular variance parsimony. Genetics 136:343-359.
Excoffier, L., Smouse, P. E., and Quattro, J. M. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: Application to human mitochondrial DNA restriction data. Genetics 131:479-491.
Graybeal, A. 1993. The phylogenetic utility of cytochrome b: Lessons from bufonid frogs. Mol. Phylo. Evol. 2:256-269.
Hocutt, C. H. 1980a. Ammocrypta meridiana Williams. In "Atlas of North American Freshwater Fishes" (D. S. Lee et al., eds.). North Carolina St. Mus. Nat. Hist., Raleigh, NC.
Hocutt, C. H. 1980b. Ammocrypta pellucida (Agassiz). In "Atlas of North American Freshwater Fishes" (D. S. Lee et al., eds.). North Carolina St. Mus. Nat. Hist., Raleigh, NC.
Hudson, R. R. 1990. Gene genealogies and the coalescent process. Oxford Surv. Evol. Biol. 7:1-44.
Irwin, D. M., Kocher, T. D., and Wilson, A. C. 1991. Evolution of the cytochrome b gene of mammals. J. Mol. Evol. 32:128-144.
Kluge, A. G. 1989. A concern for evidence and a phylogenetic hypothesis of relationships among Epicrates (Boidae, Serpentes). Syst. Zool. 38:7-25.
Kocher, T. D., Meyer, W. K., et al. 1989. Dynamics of mitochondrial DNA evolution in animals. Proc. Natl. Acad. Sci. USA 86:6196-6200.
Krajewski, C., and King, D. G. 1996. Molecular divergence and phylogeny: Rates and patterns of cytochrome b evolution in cranes. Mol. Biol. Evol. 13:21-30.
Lydeard, C., Wooten, M. C., and Meyer, A. 1995. Cytochrome b sequence variation and a molecular phylogeny of the live-bearing fish genus Gambusia (Cyprinodontiformes: Poeciliidae). Can. J. Zool. 73:213-227.
Maddison, W. P., and Maddison, D. R. 1992. "MacClade." Sinauer, Sunderland, MA.
Magoulas, A., Tsimenides, N., and Zouros, E. 1996. Mitochondrial DNA phylogeny and the reconstruction of the population history of a species: The case of the European anchovy (Engraulis encrasicolus). Mol. Biol. Evol. 13:178-190.
Maniatis, T., Fritsch, E. F., and Sambrook, J. 1982. "Molecular Cloning: A Laboratory Manual." Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.
Mayden, R. L., Burr, B. M., Page, L. M., and Miller, R. R. 1992. The native freshwater fishes of North America. In "Systematics, Historical Ecology and North American Freshwater Fishes" (R. L. Mayden, ed.), pp. 827-863. Stanford Univ. Press, Stanford, CA.
Meyer, A. 1994a. DNA technology and phylogeny of fish. In "Genetics and Evolution of Aquatic Organisms" (A. R. Beaumont, ed.), pp. 219-249. Chapman and Hall, London.
Meyer, A. 1994b. Shortcomings of the cytochrome b gene as a molecular marker. Trends Ecol. Evol. 9:278-280.
Murphy, W. J., and Collier, G. E. 1996. Phylogenetic relationships within the aplocheiloid fish genus Rivulus (Cyprinodontiformes, Rivulidae): Implications for Caribbean and Central American biogeography. Mol. Biol. Evol. 13:642-649.
Nei, M. 1987. "Molecular Evolutionary Genetics." Columbia Univ. Press, New York.
Pääbo, S. 1990. Amplifying ancient DNA. In "PCR Protocols: A Guide to Methods and Applications" (M. A. Innis et al., eds.), pp. 159-166. Academic Press, San Diego.
Patton, J. L., da Silva, M. N. F., and Malcolm, J. R. 1996. Hierarchical genetic structure and gene flow in three sympatric species of Amazonian rodent. Mol. Ecol. 5:229-238.
Price, R. C., and Whetstone, K. N. 1977. Lateral stream migration as evidence for regional geologic structures in the eastern Gulf Coastal Plain. Southeast. Geol. 18(3):129-147.
Saiki, R. K. 1990. Amplification of genomic DNA. In "PCR Protocols: A Guide to Methods and Applications" (M. A. Innis et al., eds.), pp. 13-20. Academic Press, San Diego.
Scribner, K. T., and Avise, J. C. 1993. Cytonuclear genetic architecture in mosquitofish populations and the possible roles of introgressive hybridization. Mol. Ecol. 2:139-149.
Shaw, K., Simons, A. M., and Wiley, E. O. 1997. A reexamination of the phylogenetic relationships of the sand darters (Teleostei: Percidae). Occas. Publ. Mus. Nat. Hist. Univ. Kansas. Submitted for publication.
Simons, A. M. 1991. Phylogenetic relationships of the crystal darter, Crystallaria asprella. Copeia 1991:927-936.
Simons, A. M. 1992. Phylogenetic relationships of the Boleosoma species group (Percidae: Etheostoma). In "Systematics, Historical Ecology and North American Freshwater Fishes" (R. L. Mayden, ed.), pp. 268-292. Stanford Univ. Press, Stanford, CA.
Slatkin, M., and Maddison, W. P. 1989. A cladistic measure of gene flow inferred from the phylogenies of alleles. Genetics 123:603-613.
Stauffer, J. R. 1980a. Ammocrypta beani Jordan. In "Atlas of North American Freshwater Fishes" (D. S. Lee et al., eds.), p. 616. North Carolina St. Mus. Nat. Hist., Raleigh, NC.
Stauffer, J. R. 1980b. Ammocrypta clara Jordan and Meek. In "Atlas of North American Freshwater Fishes" (D. S. Lee et al., eds.), p. 618. North Carolina St. Mus. Nat. Hist., Raleigh, NC.
Stauffer, J. R., and Hocutt, C. H. 1980. Ammocrypta vivax Hay. In "Atlas of North American Freshwater Fishes" (D. S. Lee et al., eds.), p. 621. North Carolina St. Mus. Nat. Hist., Raleigh, NC.
Stauffer, J. R., Hocutt, C. H., and Gilbert, C. R. 1980. Ammocrypta bifascia Williams. In "Atlas of North American Freshwater Fishes" (D. S. Lee et al., eds.), p. 617. North Carolina St. Mus. Nat. Hist., Raleigh, NC.
Swift, C. C., Gilbert, C. R., Bortone, S. A., Burgess, G. H., and Yerger, R. W. 1986. Zoogeography of the freshwater fishes of the southeastern United States: Savannah River to Lake Pontchartrain. In "The Zoogeography of North American Freshwater Fishes" (C. H. Hocutt and E. O. Wiley, eds.), pp. 213-265. Wiley-Interscience, New York.
Swofford, D. L. 1993. PAUP 3.1.1. Computer program distributed by the Illinois Natural History Survey, Champaign, IL.
Swofford, D. L., Olsen, G. J., Waddell, P. J., and Hillis, D. M. 1996. Phylogenetic inference. In "Molecular Systematics" (D. M. Hillis, C. Moritz, and B. K. Mable, eds.), 2nd Ed., pp. 407-514. Sinauer Associates, Sunderland, MA.
Templeton, A. R., Crandall, K. A., and Sing, C. F. 1992. A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III. Cladogram estimation. Genetics 132:619-633.
Templeton, A. R., Routman, E., and Phillips, C. A. 1995. Separating population structure from population history: A cladistic analysis of the geographical distribution of mitochondrial DNA haplotypes in the tiger salamander, Ambystoma tigrinum. Genetics 140:767-782.
Weir, B. S. 1996. Intraspecific differentiation. In "Molecular Systematics" (D. M. Hillis, C. Moritz, and B. K. Mable, eds.), 2nd Ed., pp. 385-406. Sinauer Associates, Sunderland, MA.
Wiley, E. O. 1977. The phylogeny and systematics of the Fundulus nottii species group (Teleostei: Cyprinodontidae). Occ. Pap. Mus. Nat. Hist. Univ. Kansas 67:1-31.
Wiley, E. O. 1992. Phylogenetic relationships of the Percidae (Teleostei: Perciformes): A preliminary hypothesis. In "Systematics, Historical Ecology and North American Freshwater Fishes" (R. L. Mayden, ed.), pp. 247-267. Stanford University Press, Stanford, CA.
Wiley, E. O., and Mayden, R. L. 1985. Species and speciation in phylogenetic systematics, with examples from the North American fish fauna. Ann. Missouri Bot. Gard. 72:596-635.
Wiley, E. O., and Siegel-Causey, D. 1994. A phylogenetic analysis of the Lythrurus roseipinnis species group (Teleostei: Cyprinidae), with comments on the relationship of other Lythrurus. Occas. Pap. Mus. Nat. Hist. Univ. Kansas 171:1-20.
Wiley, E. O., and Titus, T. A. 1992. Phylogenetic relationships among members of the Hybopsis dorsalis species group (Teleostei: Cyprinidae). Occas. Pap. Mus. Nat. Hist. Univ. Kansas 152:1-18.
Williams, J. D. 1975. Systematics of the percid fishes of the subgenus Ammocrypta, genus Ammocrypta, with descriptions of two new species. Bull. Alabama Mus. Nat. Hist. 1:1-56.
Williams, J. T. 1983. Taxonomy and ecology of the genus Chasmodes (Pisces: Blenniidae) with a discussion of its zoogeography. Bull. Florida St. Mus. Biol. Sci. 29(2):1-100.
Wooten, M. C., and Lydeard, C. 1990. Allozyme variation in a natural contact zone between Gambusia affinis and Gambusia holbrooki. Biochem. Syst. Ecol. 18(2/3):169-173.
Wooten, M. C., Scribner, K. T., and Smith, M. H. 1988. Genetic variation and systematics of Gambusia in the Southeastern United States. Copeia 1988(2):283-289.
Zhu, D., Jamieson, B. G. M., Hugall, A., and Moritz, C. 1994. Sequence evolution and phylogenetic signal in control-region and cytochrome-b sequences of rainbow fishes (Melanotaeniidae). Mol. Biol. Evol. 11(4):672-683.
Appendix I: Specimens Examined
Specimens are presented by KU number (number of specimens), state, and major drainage. Exact localities are available from E. O. Wiley. Vouchered specimens are actual specimens used in this chapter, and the series may contain more specimens than analyzed.

Etheostoma beanii: KU 22898 (6), Alabama, Escatawpa Dr. KU 24380 (5), Alabama, Tombigbee Dr. KU 24382 (6), Alabama, Tombigbee Dr. KU 24381 (6), Alabama, Alabama Dr. KU 24383 (6), Mississippi, Pascagoula Dr. KU 24384 (6), Mississippi, Pascagoula Dr.
Etheostoma bifascia: KU 24385 (6), Alabama, Perdido Dr. KU 24388 (5), Florida, Perdido Dr. KU 22146 (6), Florida, Escambia Dr. (1988 collection). KU 24387 (6), Florida, Escambia Dr. (1989 collection). KU 24386 (6), Alabama, Yellow Dr. KU 24862 (5), Florida, Blackwater Dr.
Etheostoma clarum: KU 23145 (2), Arkansas, Strawberry Dr.
Etheostoma meridianum: KU 23148 (2), Mississippi, Tombigbee Dr. KU 23149 (2), Mississippi, Tombigbee Dr.
Etheostoma nigrum: KU 23143 (1), Kansas, Kansas Dr.
Etheostoma pellucidum: KU 23150 (2), Indiana, Tippecanoe Dr.
Etheostoma vitreum: KU 23144 (1), KU 24389 (1), Virginia, Blackwater Dr.
Etheostoma vivax: KU 24390 (1), Missouri, Mississippi R. KU 23146 (2), Arkansas, St. Francis Dr. KU 24391 (1), Louisiana, Ouachita Dr. KU 24392 (1), Louisiana, Ouachita Dr. KU 24393 (1), Louisiana, Sabine Dr. KU 24394 (1), Mississippi, Pearl Dr.
Appendix II: Morphological Characters
The following characters are taken from Simons (1992) and Shaw et al. (1997). In each case, the observed attribute of individual specimens is considered a character, and homologous characters are organized into a transformation series (TS). Thus a transformation series is a column of data, and a cell in the matrix is the character observed for individuals of a species (see Wiley et al., 1991). This convention circumvents the usual (and inaccurate) convention of treating columns as "characters" and cells as "character states."
TS 117: Ascending process of premaxilla perpendicular to the alveolar process (0) or inclined posteriorly (1).
TS 118: Premaxillary process of maxilla V-shaped (0) or U-shaped (1).
TS 119: Notch lying posteroventral to the articular process of the quadrate shallow or absent (0) or deeply cut into the quadrate body (1).
TS 120: Body of quadrate rounded (0) or rectangular (1).
TS 121: Hyomandibular struts present as cruciform thickenings (0) or reduced to absent (1).
TS 122: Descending process of the hyomandibular long and extending beyond the preopercular groove (0) or short and terminating at the end of the groove (1).
TS 123: Hyomandibular spur absent (0) or present (1).
TS 124: Ventral plate of the urohyal flattened (0) or curved (1).
TS 125: Interhyal articular process of the posterior ceratohyal present (0) or absent (1).
TS 126: Posterior margin of the preopercle smooth (0) or serrate (1).
TS 127: Notch in the anterior angle of the preopercular roofing the articular facet for the interhyal present (0) or absent (1).
TS 128: Opercular spine present (0) or absent (1).
TS 129: Opercular strut extending from the hyomandibular articulatory facet strong (0) or greatly reduced (1).
TS 130: Posterodorsal extension of the subopercle elongate (0) or truncated (1).
TS 131: Mesethmoid thick and expanded anteriorly (0) or thin and not expanded (1).
TS 132: Maxillary ligament inserted on two dorsomedial ridges of the mesethmoid (1), or inserted on two dorsolateral projections (0). Recoded from the three-character transformation series of Shaw et al. (1996).
TS 133: Vomerine teeth present (0) or usually absent (1).
TS 134: Membrane bone on the lateral margin of the nasal well developed (0) or present as a thin slip (1).
TS 135: Remnant of the lateral line canal on the supracleithrum present (0) or absent (1).
TS 136: Postcleithrum 2 present (0) or absent (1).
TS 137: Longitudinal struts on the proximal anal pterygiophores present (0) or absent (1).
TS 138: Processes for the insertion of the infracarinalis medius muscles on the anterior face of the first anal pterygiophore present (0) or absent (1).
TS 139: Body scalation almost complete (0) or reduced laterally to a few scale rows (1).
TS 140: Male anal fin breeding tubercles absent (0) or present (1).
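The coding convention described above (a transformation series is a column, and each cell holds the character observed in one species) can be sketched as a small data structure. This is only an illustrative sketch: the taxon names and cell values below are hypothetical and are not taken from the data matrix of the study; only the TS numbers and the 0/1 codes follow the list above.

```python
# Hypothetical matrix: rows are taxa, columns are transformation series (TS).
# Taxon names and cell values are invented for illustration only.
matrix = {
    "taxon_A": {117: 0, 118: 0, 119: 1},
    "taxon_B": {117: 1, 118: 0, 119: 1},
    "taxon_C": {117: 1, 118: 1, 119: 0},
}


def transformation_series(matrix, ts):
    """Return one column: the character observed in each taxon for a given TS."""
    return {taxon: cells[ts] for taxon, cells in matrix.items()}


def is_variable(matrix, ts):
    """A TS is variable if more than one character is observed among the taxa."""
    return len(set(transformation_series(matrix, ts).values())) > 1


print(transformation_series(matrix, 117))
print(is_variable(matrix, 118))
```

Keeping the TS number as the column key makes it straightforward to trace any cell back to its verbal definition in the list.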
CHAPTER 7
Phylogeographic Patterns in Populations of Cichlid Fishes from Rocky Habitats in Lake Tanganyika
CHRISTIAN STURMBAUER Department of Zoology University of Innsbruck A-6020 Innsbruck, Austria
ERIK VERHEYEN Royal Belgian Institute of Natural Sciences B-1000 Brussels, Belgium
LUKAS RÜBER Zoological Museum of the University of Zürich Switzerland
AXEL MEYER Department of Ecology and Evolution State University of New York Stony Brook, New York 11794
I. Lake Tanganyika and Its Cichlid Species Flock
The cichlid species flocks of the great East African lakes represent the most diverse assemblages of freshwater fishes in the world. Lake Tanganyika is by far the oldest of the three major East African rift lakes, with an estimated age of about 9 to 12 million years (Cohen et al., 1993). Its geological history is relatively well known (reviewed in Tiercelin and Mondeguer, 1991). The lake is formed of three basins which fused into one large lake about 5 to 6 million years ago. Seismic data show that about 200,000 (Tiercelin and Mondeguer, 1991) to 75,000 (Scholz and Rosendahl, 1988; C. A. Scholz, personal communication) years ago the level of Lake Tanganyika dropped 600 m below its present level, possibly even splitting the lake into three sublakes for several tens of thousands of years. This vicariant event must have had severe effects on several habitats and their fish populations. After this period the lake level rose again, with additional minor fluctuations in more recent history. At present the lake has reached its greatest extent owing to the addition of a large tributary river, the Ruzizi, at the northern edge of the lake. The Ruzizi was formed about 10,000 years ago by the formation of the Virunga volcano chain in Rwanda, which blocked the former connection of this area with the Nile system. The influx of the Ruzizi River also ended a long period of isolation from the Zaire River system and caused an overflow of Lake Tanganyika via the Lukuga into the Lualaba, the upper reaches of the Zaire River.
Although the cichlid flock of Lake Victoria is considered to be monophyletic (Meyer et al., 1990), the Malawi and Tanganyika cichlid flocks are probably of polyphyletic origin (Greenwood, 1981; Nishida, 1991; Sturmbauer and Meyer, 1993; Moran et al., 1994; Sturmbauer et al., 1994; Kocher et al., 1993, 1995). Lake Malawi harbours a small subflock of five endemic species of tilapiine cichlids (Eccles and Trewavas, 1989; Axelrod, 1993), in addition to its subflock of "haplochromines," which is considered monophyletic (Moran et al., 1994). The Lake Tanganyika cichlid flock is composed of several lineages, assigned to 12 tribes (Poll,
1986). The ancestors of some tribes are likely to be older than the lake and probably colonized the proto-lakes of Tanganyika to radiate in parallel into subflocks. The Victoria and Malawi cichlids are all, without exception, maternal mouthbrooders (females brood their eggs by buccal incubation; reviewed by Barlow, 1991; Keenleyside, 1991), and the Tanganyika flock contains several lineages of mouthbrooders as well as substrate breeders (Nishida, 1991; Sturmbauer et al., 1994; Kocher et al., 1995). The Tanganyikan cichlid fauna is morphologically, ecologically, and behaviorally the most diverse species flock of the African lakes (Fryer and Iles, 1972; Greenwood, 1984). Owing to its great age, the radiation may be in a highly advanced stage, and the phylogeographic history of species and populations may reach back far in time compared to the cichlids of Lake Malawi and Lake Victoria. Several species are split into numerous populations which might have complex histories. Some are likely to be old and may therefore have diversified genetically to such an extent that their history can be deduced from gene sequences.
A. Modeling Adaptive Radiation
The Tanganyika cichlid species flock thus provides an excellent model system to elucidate the evolutionary mechanisms which induce and trigger explosive speciation events. An important aspect of understanding adaptive radiations is concerned with the mode of speciation which led to their diversification. Specifically, the relative importance of intrinsic biological characteristics such as ecology, anatomy, and behavior versus abiotic factors such as geological history, geographic structuring of the lake basin, barriers to gene flow, and fluctuations of the lake level is controversial. Although abiotic factors are thought to provide or prevent the opportunities for dispersal, biotic factors may define the dispersal capability of each species once an opportunity for gene flow is provided. Abiotic factors may thus be viewed as shape parameters of habitats in the lake ecosystem, defining their location, size, and discontinuity in time and space. Changes in any abiotic parameter might reshape habitats, and existing barriers might be "torn down" at one time whereas others might arise at another. The degree of habitat change extends from small-scale fluctuations to vicariant events affecting almost all habitats and their species communities. Because abiotic factors most likely affect whole species communities equally in their habitats, differences in the distribution patterns among species may primarily arise from species-specific biological differences.
Among biotic factors presumed to affect the dispersal and consequently the amount of gene flow among cichlid populations are ecological specialization and niche partitioning, e.g., habitat specificity, site fidelity or territoriality, homing behavior, and social organization (Fryer and Iles, 1972; McKaye and Gray, 1984; McElroy and Kornfield, 1990; Yanagisawa and Nishida, 1991; Hert, 1992; Sturmbauer and Dallinger, 1995). These species-specific characteristics may influence the extent to which species will be split into distinct populations, the extent to which populations will be isolated from each other, and also the extent to which physical changes might affect their population structures. During periods of physical separation, genetic differences between populations will accumulate. Prezygotic isolation mechanisms might evolve as byproducts of genetic isolation, possibly driven by sexual selection on traits involved in social and/or reproductive behavior. This mechanism was suggested for color patterns of males being the decisive criterion of mate recognition and choice (Mayr, 1984; Dominey, 1984). Given the behavioral diversity in Tanganyikan mouthbrooders, the relative importance of sexual selection may also vary among species or lineages.
II. Speciation and DNA
The comparison of genetic patterns among species assemblages living sympatrically in geographically isolated populations is expected to provide insights into the dynamics of population histories and their evolutionary causes (e.g., reviewed in Avise, 1994). The amount of genetic divergence within and among populations, as well as frequencies and distributions of different genotypes, will provide information about their historical demography. By relating the observed patterns to ecology, habitat specificity, and behavior, the decisive characteristics triggering the degree of isolation may be identified for several species on a comparative basis. The goal of such an approach is to identify the causes of isolation in various species and, ultimately, to identify possible patterns for various groups of species of similar biology.
This chapter combines results of mitochondrial (mt)DNA sequence data presently available for three endemic Tanganyika cichlid lineages: Tropheus (Sturmbauer and Meyer, 1992), Simochromis (Meyer et al., 1996), and the members of the tribe Eretmodini (Rüber, 1994; Verheyen et al., 1996). MtDNA was shown to be a sensitive marker for population differentiation because it evolves 5 to 10 times faster than nuclear DNA (Avise, 1994). It is exclusively maternally transmitted in cichlids, making it more sensitive to population size fluctuations (reviewed in Meyer, 1993; Avise, 1994). This chapter focuses on results based on the mitochondrial control region because it is the most variable region of the entire genome (reviewed in Meyer, 1993, 1994) and thus most suitable for addressing phylogenetic questions at the population level.
All species in this chapter inhabit rock and cobble shores along the lake, where they often occur in sympatry. They all are epilithic algae feeders and are habitat specific to different degrees (Sturmbauer et al., 1992). For all three taxa, geographically distinct populations have been described, distinguishable only by minor, if any, morphological variation, but sometimes by pronounced differences in coloration. DNA sequence data are available for populations of all three taxa along the central eastern coast of Lake Tanganyika.¹ This shoreline contains the major breakpoints which correspond to the locations of the three main basins of the lake and thus covers habitats in shallow water, which are more strongly affected by fluctuations of lake level, as well as habitats situated at very steep shorelines, which were probably not affected by periods of low lake level (Fig. 1A). Additional, as yet unpublished, data were added to the Tropheus data set to increase geographical overlap with the data sets for Simochromis and the Eretmodini in the central eastern region of the lake.
Species of the genus Tropheus are strictly confined to rock habitats for foraging and mating, and have a limited capacity for dispersal across open water (Brichard, 1989; Sturmbauer and Dallinger, 1995). Six nominal species are described (Poll, 1986), some of which have overlapping distributions (Snoeks et al., 1994), and more than 70 distinctly colored "races" have been reported (P. Schupke, personal communication). Samples from 23 localities are included in Fig. 1A.
The genus Simochromis is closely related to Tropheus (Nishida, 1991; Sturmbauer and Meyer, 1992; Kocher et al., 1995) and both genera are classified within the same tribe, the Tropheini (Poll, 1986). The two Simochromis species studied so far, S. babaultii and S. diagramma, appear to be similar to Tropheus in their ecology, but they are typically less abundant. Their number of described geographical "races" is much smaller than that of Tropheus, and some behavioral differences also exist between the taxa of the two genera. Although both sexes of Tropheus are highly sedentary, Simochromis species have been observed to move about in schools, and only dominant males keep territories (Brichard, 1989; C. Sturmbauer et al., unpublished observations). In contrast to Tropheus and to the Eretmodini, Simochromis is sexually dichromatic. Samples from 11 and 13 localities were analyzed for S. babaultii and S. diagramma, respectively.
Species of the tribe Eretmodini are small, stenotopic cichlids and are the only group of cichlids adapted to living in shallow coastal areas exposed to wave action. Three genera and four nominal species, Eretmodus cyanostictus, Spathodus erythrodon, Spathodus marlieri, and Tanganicodus irsacae, have been described. Because eretmodines have a reduced swimbladder, they actually "sit" on the substrate, like the Gobiidae and the marine Blenniidae. The species differ in their ecology and dental morphology: three species were classified as epilithic algae feeders, and one species (Tanganicodus) tends to feed on higher proportions of invertebrates (Yamaoka et al., 1986). As in the case of Tropheus, numerous geographically isolated populations have been described for all four species, but color differences among populations are less pronounced than in Tropheus. Samples of 43 specimens from 32 localities were analyzed for the Eretmodini.
¹ The nucleotide sequences in this chapter are available from EMBL/GenBank and are as follows: Z12047 to Z12100 and Z75694 to Z75709 for Tropheus; X90593 to X90638 for the eretmodines; U40524 to U40532 for Simochromis babaultii; and U38808 and U38984 to U38995 for Simochromis diagramma.
A. Genetic Variation in Tropheus
Comparisons of the amount of sequence variation found within the genus Tropheus with that within the haplochromine species flocks of Lake Malawi and Lake Victoria suggested that Tropheus may be roughly twice the age of the whole Malawi species flock and six times the age of the Lake Victoria cichlid flock (Sturmbauer and Meyer, 1992). Tropheus duboisi is the most basal species in the genus, sister group to seven distinct lineages comprising the remaining five presently recognized species (Fig. 1B). Although the average corrected sequence divergence among these seven different lineages amounted to 9.1% (standard deviation 1.6%), the average genetic divergence among populations of the same mitochondrial lineage was only 3.0% (standard deviation 1.1%). On the basis of the observed short branches and thus similar levels of genetic divergence defining the major mitochondrial lineages, and of similar levels among populations within each lineage, two successive radiations are hypothesized: in the primary radiation, Tropheus colonized rocky shores along the entire lake and the seven lineages originated. Relatively recently, each of those lineages underwent secondary radiations during which time geographically separated populations diversified to the patterns presently observed. The major fluctuation of the lake level was suggested as the trigger of the secondary radiations.
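To illustrate what a "corrected" sequence divergence means, the following sketch computes the uncorrected proportion of differing sites (p-distance) and a distance corrected for multiple substitutions between two aligned sequences. The text does not state which correction was applied to the Tropheus data; the Kimura two-parameter formula is used here purely as an assumed, commonly used example, and the function name and toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def divergence(seq1, seq2):
    """Return (p_distance, k2p_distance) for two aligned DNA sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    The Kimura two-parameter correction is an assumption of this sketch,
    not necessarily the method used in the chapter.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    differences = sum(1 for a, b in pairs if a != b)
    # A transition stays within purines or within pyrimidines.
    transitions = sum(1 for a, b in pairs
                      if a != b and ({a, b} <= PURINES or {a, b} <= PYRIMIDINES))
    p = transitions / n                   # proportion of transitions
    q = (differences - transitions) / n   # proportion of transversions
    k2p = -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
    return p + q, k2p


# Toy sequences (invented): one transition in ten sites.
p_dist, k2p_dist = divergence("AAAAAAAAAA", "GAAAAAAAAA")
print(round(p_dist, 4), round(k2p_dist, 4))
```

The corrected value is always at least as large as the raw proportion of differences, because the correction estimates substitutions hidden by repeated changes at the same site. For highly divergent (saturated) sequences the logarithm is undefined and `math.log` raises a `ValueError`.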
[FIGURE 1 (caption not recovered). Surviving labels: panel B, "primary radiation (7 lineages arise)" and "secondary radiations"; localities Rutunga, Nyanza Lac, Bemba, Kiriza, Ubwari, and Kabimba; Northern basin, Central basin, Southern basin; Mahale Mountain Range.]
FIGURE 7 Location of major drainages in Costa Rica, Panama, and Colombia where samples of Roeboides were collected.
fauna which led Myers (1966) to state that he could "see no escape from the conclusion that Central America possessed no obligatory freshwater ostariophysans until the Pliocene or even the Pleistocene, since which time the most aggressive and ubiquitous of all characoid genera (Astyanax) has, in a geological sense, raced northward to the Rio Grande, trailed a little more slowly by Hyphessobrycon, Brycon, Roeboides, Gymnotus, and a few others." Furthermore, the significant mtDNA divergence and reciprocal monophyly of fish mtDNA lineages among lower Central American drainage basins foreshadow an analysis of the pattern and rate of freshwater fish exchange that took place before and following the Pliocene completion of the Panamanian isthmus.
VI. Concluding Remarks
Surveying large numbers of individuals across moderate numbers of species with overlapping distributions should be a goal in evolutionary biology for both theoretical and applied reasons. On the theoretical side, species richness may be influenced more strongly by extrinsic biogeographical relationships and historical circumstances than by such intrinsic, local processes as competition and predation (Ricklefs, 1987; Cornell and Lawton, 1992; Ricklefs and Schluter, 1993). The sheer magnitude of systematic description required in the tropics indicates a pervasive role for molecular systematics if we are to determine the dependence of local richness on regional species richness in tropical ecosystems. On the practical side, molecular genetic analyses can provide a reasonably rapid means for surveying regional biotic diversity. Indices of species richness, sometimes taking into account abundance, have been the traditional measures of diversity. When used to make decisions regarding the preservation of biodiversity, however, it has been argued that these indices fail because they consider all species to be equal or nearly equal. Erwin (1991), Vane-Wright et al. (1991), and others (Crozier, 1992; Faith, 1992; Weitzman, 1992; Solow
et al., 1993; reviewed by Krajewski, 1994) have suggested that phylogenetic history and/or genetic diversity should be used in biodiversity indices to emphasize the phylogenetic and genetic distinctiveness of some groups compared to others. To the degree this view is adopted by conservation biologists, molecular systematics will undoubtedly be called upon to provide measures of taxonomic distinctiveness. The resulting taxic diversity measures, when coupled to detailed knowledge of organismal distribution patterns, can be used to identify priority areas for conservation (Vane-Wright et al., 1991).
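The contrast between a simple species count and an index that weights phylogenetic distinctiveness can be sketched with a toy calculation. The example below sums the branch lengths connecting a set of taxa to the root of a tree, in the spirit of the phylogenetic diversity measure of Faith (1992). The tree, taxon names, and branch lengths are hypothetical, and the rooted formulation is an assumption of this sketch rather than a statement of any cited author's exact procedure.

```python
# Toy rooted tree stored as child -> (parent, branch length).
# All names and lengths are invented for illustration.
tree = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}


def phylogenetic_diversity(tree, taxa):
    """Sum the lengths of all branches on the paths from the root to the taxa.

    Each branch is counted once, even when shared by several taxa.
    """
    counted = set()
    total = 0.0
    for taxon in taxa:
        if taxon not in tree:
            raise KeyError(f"unknown taxon: {taxon}")
        node = taxon
        # Walk toward the root, stopping at branches already counted.
        while node in tree and node not in counted:
            parent, length = tree[node]
            total += length
            counted.add(node)
            node = parent
    return total


# Two sets with the same species richness (two taxa each):
print(phylogenetic_diversity(tree, ["A", "B"]))  # close relatives
print(phylogenetic_diversity(tree, ["A", "C"]))  # distant relatives
```

In this toy tree both sets contain two species, yet the set spanning the deeper split accumulates more branch length, which is the kind of distinction a richness index alone cannot make.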
Acknowledgments
The research reported in this chapter results from collaborations initiated by EB, SM, and Haris Lessios on marine fishes and AM and EB on freshwater fishes. We gratefully acknowledge the financial support of the Smithsonian Institution (Tupper Postdoctoral Fellowship to AM and the STRI Molecular Systematics program), the National Geographic Society, and NSF (BSR-8607403 to Myra Shulman and EB). We thank the following for granting scientific collecting/research permits: INRENARE, Panama; The Comarcas of the Kuna, Ngobe, Embera, and Waunaan; Recursos Marinos, Panama; Ministerio de Recursos Naturales, Energia, y Minas, Costa Rica; and the Museo Nacional de Colombia. Most of all, we owe very heartfelt thanks to the following people for extensive help in the field and laboratory: Heidi Banford, Bill Bussing, German Galvis, Luifer Garcia, Nimiadina Gomez, Myra Shulman, Ross Robertson, and Gustavo Ybazeta.
References
Allen, G. R. 1991. "Damselfishes of the World." Mergus Publishers, Germany.
Avise, J. C. 1994. "Molecular Markers, Natural History and Evolution." Chapman and Hall, New York.
Avise, J. C., Neigel, J. E., and Arnold, J. 1984. Demographic influences on mitochondrial DNA lineage survivorship in animal populations. J. Mol. Evol. 20:99-105.
Avise, J. C., Bowen, B. W., Lamb, T., Meylan, A. B., and Bermingham, E. 1992. Mitochondrial DNA evolution at a turtle's pace: Evidence for low genetic variability and reduced microevolutionary rate in the Testudines. Mol. Biol. Evol. 9:457-473.
Avise, J. C., Arnold, J., Ball, R. M., Bermingham, E., Lamb, T., Neigel, J. E., Reeb, C. A., and Saunders, N. C. 1987. Intraspecific phylogeography: The mitochondrial DNA bridge between population genetics and systematics. Annu. Rev. Ecol. Syst. 18:489-522.
Bermingham, E., and Avise, J. C. 1986. Molecular zoogeography of freshwater fishes in the southeastern United States. Genetics 113:939-965.
Bermingham, E., and Lessios, H. 1993. Rate variation of protein and mtDNA evolution as revealed by sea urchins separated by the Isthmus of Panama. Proc. Natl. Acad. Sci. USA 90:2734-2738.
Bermingham, E., Rohwer, S., Wood, C., and Freeman, S. 1992. Vicariance biogeography in the Pleistocene and speciation in North American wood warblers: A test of Mengel's model. Proc. Natl. Acad. Sci. USA 89:6624-6628.
Bermingham, E., Seutin, G., and Ricklefs, R. E. 1996. Regional approaches to conservation biology: RFLPs, DNA sequences, and Caribbean birds. In "Molecular Genetic Approaches to Conservation Biology" (T. Smith and R. Wayne, eds.), pp. 104-124. Oxford University Press, London.
Beverly, S. M., and Wilson, A. C. 1985. Ancient origin for Hawaiian Drosophilinae inferred from protein comparisons. Proc. Natl. Acad. Sci. USA 82:4753-4757.
Brawn, J. D., Collins, T. M., Medina, M., and Bermingham, E. 1996. Associations between physical isolation and geographical variation within three species of Neotropical birds. Mol. Ecol. 5:33-46.
Britten, R. J. 1986. Rates of DNA sequence evolution differ between taxonomic groups. Science 231:1393-1398.
Bussing, W. A. 1976. Geographic distribution of the San Juan ichthyofauna of Central America with remarks on its origin and ecology. In "Investigations of the Ichthyofauna of Nicaraguan Lakes" (T. B. Thorson, ed.), pp. 157-175. University of Nebraska, Lincoln, NE.
Bussing, W. A. 1985. Patterns of distribution of the Central American ichthyofauna. In "The Great American Biotic Interchange" (F. G. Stehli and S. D. Webb, eds.), pp. 453-473. Plenum, New York.
Capparella, A. P. 1991. Neotropical avian diversity and riverine barriers. In "Acta XX Congressus Internationalis Ornithologici," pp. 307-316. Washington, D.C.
Chernoff, B. 1982. Character variation among populations and the analysis of biogeography. Am. Zool. 22:425-439.
Coates, A. G., Jackson, J. B. C., Collins, L. S., Cronin, T. M., Dowset, H. J., Bybell, L. M., Jung, P., and Obando, J. A. 1992. Closure of the Isthmus of Panama: The near-shore marine record of Costa Rica and western Panama. Bull. Geol. Soc. Am. 104:814-828.
Coates, A. G., and Obando, J. A. 1996. The geologic evolution of the Central American isthmus. In "Evolution and Environment in Tropical America" (J. Jackson, A. F. Budd, and A. G. Coates, eds.), pp. 21-56. The University of Chicago Press, Chicago, IL.
Cornell, H. V., and Lawton, J. H. 1992. Species interactions, local and regional processes, and limits to the richness of ecological communities: A theoretical perspective. J. Anim. Ecol. 61:1-12.
Crozier, R. H. 1992. Genetic diversity and the agony of choice. Biol. Conserv. 61:11-15.
Darlington, P. J. 1957. "Zoogeography: The Geographical Distribution of Animals." Wiley, New York.
Darlington, P. J. 1964. Drifting continents and Late Paleozoic geography. Proc. Natl. Acad. Sci. USA 52:1084-1091.
daSilva, M., and Patton, J. 1993. Amazonian phylogeography: mtDNA sequence variation in arboreal echimyid rodents (Caviomorpha). Mol. Phylogenet. Evol. 2:243-255.
Duque-Caro, H. 1990. Neogene stratigraphy, paleoceanography and paleogeography in northwest South America and the evolution of the Panama Seaway. Palaeogeogr. Palaeoclimatol. Palaeoecol. 77:203-234.
Erwin, T. L. 1991. An evolutionary basis for conservation strategies. Science 253:758-761.
Escalante-Pliego, B. P. 1991. Genetic differentiation in yellowthroats (Parulinae: Geothlypis). In "Acta XX Congressus Internationalis Ornithologici," pp. 333-343. Washington, D.C.
Excoffier, L., Smouse, P. E., and Quattro, J. M. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: Application to human mitochondrial DNA restriction data. Genetics 131:479-491.
Faith, D. P. 1992. Conservation evaluation and phylogenetic diversity. Biol. Conserv. 61:1-10.
Felsenstein, J. 1993. "Phylogeny Inference Package (PHYLIP) 3.5 edition." University of Washington, Seattle, WA.
Gaut, B. S., Muse, S. V., Clark, W. D., and Clegg, M. T. 1992. Relative rates of nucleotide substitution at the rbcL locus of monocotyledonous plants. J. Mol. Evol. 35:292-303.
Gill, T. N. 1862. Catalogue of fishes of lower California in the Smithsonian Institution, collected by Mr. J. Xantus. Proc. Acad. Nat. Sci. Philadelphia 14:140-151.
Grande, L. 1985. The use of paleontology in systematics and biogeography, and a time control refinement for historical biogeography. Paleobiology 11:234-243.
Hackett, S. J., and Rosenberg, K. V. 1990. Comparison of phenotypic and genetic differentiation in South American antwrens (Formicariidae). Auk 107:473-489.
Hasegawa, M., and Hashimoto, T. 1993. Ribosomal RNA trees misleading? Nature 361:23.
Hensley, D. A. 1978. "Revision of the Indo-West Pacific Species Abudefduf (Pisces: Pomacentridae)." Unpublished Ph.D. dissertation, University of South Florida, Tampa, FL.
Hillis, D. M., Mable, B. K., and Moritz, C. 1996. Applications of molecular systematics: The state of the field and a look to the future. In "Molecular Systematics" (D. M. Hillis, C. Moritz, and B. K. Mable, eds.), 2nd Ed., pp. 515-543. Sinauer, Sunderland, MA.
Irwin, D. M., Kocher, T. D., and Wilson, A. C. 1991. Evolution of the cytochrome b gene of mammals. J. Mol. Evol. 32:128-144.
Jordan, D. S. 1908. The law of geminate species. Am. Nat. XLII(494):73-80.
Joseph, L., Moritz, C., and Hugall, A. 1995. Molecular support for vicariance as a source of diversity in rainforest. Proc. R. Soc. Lond. B 260:177-182.
Keigwin, L. D. 1978. Pliocene closing of the Isthmus of Panama based on biostratigraphic evidence from nearby Pacific Ocean and Caribbean Sea cores. Geology 6:630-634.
Keigwin, L. D. 1982. Isotopic paleoceanography of the Caribbean and east Pacific: Role of Panama uplift in late Neogene time. Science 217:350-353.
Kimura, M. 1980. A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.
Kimura, M. 1983. "The Neutral Theory of Molecular Evolution." Cambridge University Press, Cambridge, England.
Knowlton, N., Weigt, L. A., Solórzano, L. A., Mills, D. K., and Bermingham, E. 1993. Divergence in proteins, mitochondrial DNA, and reproductive compatibility across the Isthmus of Panama. Science 260:1629-1632.
Krajewski, C. 1994. Phylogenetic measures of biodiversity: A comparison and critique. Biol. Conserv. 69:33-39.
Leis, J. M. 1991. The pelagic stage of reef fishes. In "The Ecology of Fishes on Coral Reefs" (P. F. Sale, ed.), pp. 183-230. Academic Press, San Diego.
Lessios, H. A. 1979. Use of Panamanian sea urchins to test the molecular clock. Nature 280:599-601.
Lessios, H. A. 1981. Divergence in allopatry: Molecular and morphological differentiation between sea urchins separated by the Isthmus of Panama. Evolution 35:618-634.
Lessios, H. A., Allen, G. R., Wellington, G. M., and Bermingham, E. 1995. Genetic and morphological evidence that the Eastern Pacific damselfish Abudefduf declivifrons is distinct from A. concolor (Pomacentridae). Copeia 1995(2):277-288.
Li, W.-H., Tanimura, M., and Sharp, P. M. 1987. An evaluation of the molecular clock hypothesis using mammalian DNA sequences. J. Mol. Evol. 25:330-342.
Li, W.-H., Wu, C.-I., and Luo, C.-C. 1985. A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Mol. Biol. Evol. 2:150-174.
Lockhart, P. J., Howe, C. J., Bryant, D. A., Beanland, T. J., and Larkum, A. W. D. 1992. Substitutional bias confounds inference of cyanelle origins from sequence data. J. Mol. Evol. 34:153-162.
Lockhart, P. J., Steel, M. A., Hendy, M. D., and Penny, D. 1994. Recovering evolutionary trees under a more realistic model of sequence evolution. Mol. Biol. Evol. 11:605-612.
Lundberg, J. G. 1993. African-South American freshwater fish clades and continental drift: Problems with a paradigm. In "Biotic Relationships between Africa and South America" (P. Goldblatt, ed.), pp. 156-198. Yale University Press, New Haven, CT.
Lundelius, E. L. 1987. The North American quaternary sequence. In "Cenozoic Mammals of North America" (M. O. Woodburne, ed.), pp. 211-235. Univ. Calif. Press, Los Angeles, CA.
Marshall, L. G. 1988. Land mammals and the great American interchange. Am. Sci. 76:380-388.
Martin, A. P. 1995. Metabolic rate and directional nucleotide substitution in animal mitochondrial DNA. Mol. Biol. Evol. 12(6):1124-1131.
Martin, A. P., Naylor, G. J. P., and Palumbi, S. R. 1992. Rates of mitochondrial DNA evolution in sharks is slow compared with mammals. Nature 357:153-155.
Martin, A. P., and Palumbi, S. R. 1993. Body size, metabolic rate, generation time, and the molecular clock. Proc. Natl. Acad. Sci. USA 90:4087-4091.
Mayr, E. 1963. "Animal Species and Evolution." Belknap Press, Cambridge, MA.
McFarland, W. N., Brothers, E. B., Ogden, J. C., Shulman, M. J., and Bermingham, E. L. 1985. Recruitment patterns in young French grunts Haemulon flavolineatum (family Haemulidae) at St. Croix, U.S.V.I. Fish. Bull. 83:413-426.
McMillan, W. O., and Bermingham, E. 1996. The phylogeographic pattern of mitochondrial DNA variation in the Dall's porpoise Phocoenoides dalli. Mol. Ecol. 5:47-61.
Meyer, A. 1993. Evolution of mitochondrial DNA of fishes. In "The Biochemistry and Molecular Biology of Fishes" (P. W. Hochachka and P. Mommsen, eds.), pp. 1-38. Elsevier, Amsterdam.
Miller, R. R. 1966. Geographical distribution of freshwater fish fauna of Central America. Copeia 1966(4):773-802.
Moritz, C. 1994. Defining evolutionary significant units for conservation. Trends Ecol. Evol. 9:373-375.
Muse, S. V., and Weir, B. S. 1992. Testing for equality of evolutionary rates. Genetics 132:269-276.
Myers, G. S. 1966. Derivation of the freshwater fish fauna of Central America. Copeia 1966(4):766-773.
Page, R. D. M. 1991. Clocks, clades, cospeciation: Comparing rates of evolution and timing of cospeciation events in host-parasite assemblages. Syst. Zool. 40:188-198.
Page, R. D. M. 1993. Genes, organisms, and areas: The problem of multiple lineages. Syst. Biol. 42(1):77-84.
Patterson, C. 1975. The distribution of Mesozoic freshwater fishes. Mem. Mus. Natl. Hist. Nat. Ser. Paris. A Zool. 88.
Patton, J., and Smith, M. F. 1992. mtDNA phylogeny of Andean mice: A test of diversification across ecological gradients. Evolution 46:174-183.
Perna, N. T., and Kocher, T. D. 1995a. Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes. J. Mol. Evol. 41:353-358.
Perna, N. T., and Kocher, T. D. 1995b. Unequal base frequencies and the estimation of substitution rates. Mol. Biol. Evol. 12(2):359-361.
Peterson, A. T., Escalante, P., and Navarro, A. 1992. Genetic variation and differentiation in Mexican populations of common bush-tanagers and chestnut-capped brush-finches. Condor 94:244-253.
Ricklefs, R. E. 1987. Community diversity: Relative roles of local and regional processes. Science 235:167-171.
Ricklefs, R. E., and Schluter, D. 1993. Species diversity: An introduction to the problem. In "Species Diversity in Ecological Communities: Historical and Geological Perspectives" (R. E. Ricklefs and D. Schluter, eds.), pp. 1-10. University of Chicago Press, Chicago, IL.
Rohlf, F. J. 1993. "NTSYS-pc: Numerical Taxonomy and Multivariate Analysis System." Exeter Software, Applied Biostatistics, Setauket, New York.
Rosen, D. E. 1978. Vicariant patterns and historical explanation in biogeography. Syst. Zool. 27:158-188.
Rubinoff, I., and Leigh, E. G. 1990. Dealing with diversity: The Smithsonian Tropical Research Institute and tropical biology. Trends Ecol. Evol. 5(4):115-118.
Rzhetsky, A., and Nei, M. 1994a. A simple method for estimating and testing minimum-evolution trees. Mol. Biol. Evol. 9(5):945-967.
Rzhetsky, A., and Nei, M. 1994b. METREE: A program package for inferring and testing minimum-evolution trees. CABIOS 10(4):409-412.
Saccone, C., Pesole, G., and Preparata, G. 1989. DNA microenvironments and the molecular clock. J. Mol. Evol. 29:407-411.
Sarich, V. M., and Wilson, A. C. 1967. Immunological time scale for hominid evolution. Science 158:1200-1203.
Seutin, G., Brawn, J., Ricklefs, R. E., and Bermingham, E. 1993. Genetic divergence among populations of a tropical passerine, the Streaked Saltator (Saltator albicollis). Auk 110:117-126.
Seutin, G., Klein, N. K., Ricklefs, R. E., and Bermingham, E. 1994. Historical biogeography of the bananaquit (Coereba flaveola) in the Caribbean region: A mitochondrial DNA assessment. Evolution 48(4):1041-1061.
Shulman, M. J., and Bermingham, E. 1995. Early life histories, ocean currents, and the population genetics of Caribbean reef fishes. Evolution 49(5):897-910.
Sidow, A., and Wilson, A. C. 1990. Compositional statistics: An improvement of evolutionary parsimony and its deep branches in the tree of life. J. Mol. Evol. 31:51-68.
Sneath, P. H. A., and Sokal, R. R. 1973. "Numerical Taxonomy." Freeman, San Francisco.
Sokal, R. R., and Rohlf, F. J. 1981. "Biometry." Freeman, San Francisco.
Solow, A. R., Broadus, J. M., and Tonring, N. 1993. On the measurement of biological diversity. J. Environ. Econ. Manag. 24:60-68.
Springer, V. G. 1982. "Pacific Plate Biogeography, with Special Reference to Shorefishes." Smithsonian Institution Press, Washington, D.C.
Steel, M. A., Lockhart, P. J., and Penny, D. 1993. Confidence in evolutionary trees from biological sequence data. Nature 364:440-442.
Swofford, D. L., Olsen, G. J., Waddell, P. J., and Hillis, D. M. 1996. Phylogenetic inference. In "Molecular Systematics" (D. M. Hillis, C. Moritz, and B. K. Mable, eds.), 2nd Ed., pp. 407-514. Sinauer, Sunderland, MA.
Tamura, K., and Nei, M. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512-526.
Thomas, W. K., and Beckenbach, A. T. 1989. Variation in salmonid mitochondrial DNA: Evolutionary constraints and mechanisms of substitution. J. Mol. Evol. 29:233-245.
Thomson, D. A., Findley, L. T., and Kerstitch, A. N. 1979. "Reef Fishes of the Sea of Cortez." Wiley, New York.
Thresher, R. E., and Brothers, E. B. 1989. Evidence of intra- and inter-oceanic regional differences in the early life history of reef-associated fishes. Mar. Ecol. Prog. Ser. 57:187-205.
Vane-Wright, R. I., Humphries, C. J., and Williams, P. H. 1991. What to protect? Systematics and the agony of choice. Biol. Conserv. 55:235-254.
Vawter, A. T., Rosenblatt, R., and Gorman, G. C. 1980. Genetic divergence among fishes of the eastern Pacific and the Caribbean: Support for the molecular clock. Evolution 34:705-711.
Vawter, L., and Brown, W. M. 1986. Nuclear and mitochondrial DNA comparisons reveal extreme rate variation in the molecular clock. Science 234:194-196.
Vermeij, G. J. 1978. "Biogeography and Adaptation." Harvard University Press, Cambridge, MA.
Victor, B. C. 1986. Duration of the planktonic larval stage of one hundred species of Pacific and Atlantic wrasses (family Labridae). Mar. Biol. 90:317-326.
Weitzman, M. L. 1992. On diversity. Quart. J. Econ. 107:363-405.
Wellington, G. M., and Victor, B. C. 1989. Planktonic larval duration of one hundred species of Pacific and Atlantic damselfishes (Pomacentridae). Mar. Biol. 101:557-567.
Zuckerkandl, E., and Pauling, L. 1965. Evolutionary divergence and convergence in proteins. In "Evolving Genes and Proteins" (V. Bryson and H. J. Vogel, eds.), pp. 97-166. Academic Press, New York.
CHAPTER 9
The Utility of Mitochondrial DNA Control Region Sequences for Analyzing Phylogenetic Relationships among Populations, Species, and Genera of the Percidae
JOSEPH E. FABER and CAROL A. STEPIEN
Department of Biology, Case Western Reserve University, Cleveland, Ohio 44106
I. Introduction
The rapid mutation rate and predominantly maternal inheritance of vertebrate mitochondrial (mt)DNA (Brown et al., 1979; Wilson et al., 1985) make it a valuable tool for evaluating evolutionary genetic divergence (Stepien and Kocher, Chapter 1). Comparison of sequences among vertebrates shows that both rapidly and slowly evolving areas lie within the mtDNA control region (Brown, 1986; Lee et al., 1995). Control region DNA sequences may therefore reveal evolutionary relationships at various taxonomic levels (Moritz et al., 1987).
Recent studies of fishes illustrate the use of the mtDNA control region to study problems involving different evolutionary time scales. For example, variability in sequences has been used to detect the evolutionarily recent population structure of the Dover sole Microstomus pacificus and the thornyhead Sebastolobus alascanus in biogeographic provinces of the Pacific continental slope (Stepien, 1995), the spotted sand bass Paralabrax maculatofasciatus between the Pacific Coast and Sea of Cortez (Stepien, 1995), and white sturgeon Acipenser transmontanus among rivers of the Pacific coast of North America (Brown et al., 1993). Lack of nucleotide diversity in this rapidly evolving region has also suggested stock depression and genetic bottlenecks in the white sturgeon A. transmontanus (Brown et al., 1993) and lack of substructure of Atlantic cod Gadus morhua populations in the north Atlantic Ocean (Arnason and Rand, 1992). At higher taxonomic levels, nucleotide divergence in control region sequences was used to construct phylogenies of relationships among species and genera of morphologically variable and homoplastic African rift lake cichlids (Meyer et al., 1990; Sturmbauer and Meyer, 1992, 1993). The more slowly evolving central conserved section of the control region appears to contain phylogenetically reliable information up to the family level and higher in teleost fishes, despite rapid evolutionary rates of the flanking "left" and "right" domains of the control region (Lee et al., 1995; Stepien, 1995). Thus, research involving several different fish taxa indicates that the mtDNA control region "bridges the gap" (Avise et al., 1987) between phylogenetics and population genetics.
However, little research has been conducted to determine the rates and patterns of mtDNA control region nucleotide evolution and the phylogenetic signal at various taxonomic levels within single lineages. Phylogenetic relationships among species and genera of rainbow fishes (Melanotaeniidae) were investigated, but only a small portion (approximately 330 bp) of the left domain of the mtDNA control region was utilized (Zhu et al., 1994). Relationships among genera and among higher taxonomic levels were reviewed by Lee et al. (1995), but among- and within-species relationships were not addressed.
The purpose of this chapter is to compare the genetic divergence of mtDNA control region sequences within and among closely related species and genera in the teleost fish family Percidae. Gene trees from control region data are compared to morphology-based phylogenies, and population genetic statistics are compared to hypotheses of population structure derived from geological evidence and tagging studies, to test phylogenetic signal (phylogenetic information in the data set) and to determine the utility of mtDNA control region sequences across a range of evolutionary time scales.
A. Morphological Evolution of the Percidae
The holarctic family Percidae includes 162 described species in 10 genera (Nelson, 1994). The darters, comprising the genera Ammocrypta, Crystallaria, Etheostoma, and Percina, are Nearctic, and Gymnocephalus, Percarina, Zingel, and Romanichthys are Palearctic in distribution. Perca and Stizostedion are endemic to both North America and Eurasia. Several species, including the ruffe Gymnocephalus cernuus, the perch Perca fluviatilis, and the walleye Stizostedion vitreum, have become established outside their historical ranges through accidental and intentional introductions (Nelson, 1994), which may affect geographic patterns of genetic diversity and local adaptations of resident stocks (Billington and Hebert, 1991).
The taxonomy of the Percidae was described by Collette (1963) and Collette and Banarescu (1977). Two subfamilies are recognized, and suprageneric relationships are presently disputed in four different morphological phylogenetic hypotheses, which are given in Fig. 1 (reviewed in Coburn and Gaglione, 1992). Collette's (1963) subfamily Percinae included the tribes Percini (Gymnocephalus, Perca, and Percarina) and Etheostomatini (Ammocrypta, Crystallaria, Etheostoma, and Percina), and the subfamily Luciopercinae contained the Luciopercini (Stizostedion) and Romanichthyini (Romanichthys and Zingel) (Fig. 1A). Wiley (1992) alternatively divided the Percidae into the subfamilies Percinae (including only Perca) and Etheostomatinae (including all other genera except Percarina, which was not analyzed) (Fig. 1B). Page (1985) hypothesized a similar suprageneric phylogeny based on reproductive behavior characters, including the subfamilies Percinae (the tribe Percini including Gymnocephalus, Perca, and Percarina) and Etheostomatinae (the tribes Romanichthyini, including Romanichthys and Zingel; Luciopercini, including Stizostedion; and Etheostomatini, including Ammocrypta, Crystallaria, Etheostoma, and Percina) (Fig. 1C).
The phylogenies of Page (1985) and Wiley (1992) differed at the generic level: Wiley included only Perca in the Percini, whereas Page included Perca, Gymnocephalus, and Percarina. Disparate phylogenetic relationships were also suggested within the Etheostomatinae, as Page (1985) hypothesized that the Luciopercini is the sister group of Etheostomatini and both share a common ancestor with Romanichthyini (Fig. 1C). Wiley (1992) regarded Romanichthys as the sister taxon of Etheostomatini, and the clade containing both as the sister group to Zingel (Fig. 1B). Wiley (1992) also suggested that Stizostedion is the sister group to the Zingel-Romanichthys-Etheostomatini clade and that Gymnocephalus is, in turn, the sister clade to the Stizostedion-Zingel-Romanichthys-Etheostomatini group (Fig. 1B). Coburn and Gaglione (1992) presented a phylogeny similar to that of Wiley (1992), except for placing Romanichthys as the sister group to Zingel, which is the sister taxon to Etheostomatini, and placing Gymnocephalus as the sister taxon to Perca in the Percinae (Fig. 1D).
FIGURE 1 Hypotheses of phylogenetic relationships among taxa of the teleost fish family Percidae: (A) Collette (1963) and Collette and Banarescu (1977); (B) Wiley (1992); (C) Page (1985); and (D) Coburn and Gaglione (1992). Adapted from Coburn and Gaglione (1992).
Phylogenetic relationships at lower taxonomic levels in the Percidae are poorly understood. Bailey and Gosline (1955) recognized subgenera, and Page (1981) and Bailey and Etnier (1988) examined subgeneric relationships in Etheostoma, but group assignments differed between these studies. Essentially, few phylogenetic studies have attempted to resolve relationships within this large genus (n = 150 species; Nelson, 1994).
B. Genetics of the Percidae
Genetic relationships of most percid taxa have not been studied. However, some North American taxa have been examined using allozyme electrophoresis and restriction fragment length polymorphism (RFLP) analysis of mtDNA. Allozyme polymorphisms indicated that the darter genus Percina was the sister group of a clade containing other morphologically derived darter genera, including Crystallaria, Ammocrypta, and Etheostoma (Page and Whitt, 1973a,b; Page, 1974). Using mtDNA RFLPs, Simons (1989, 1992) found that species assigned to Ammocrypta (sand darters) by Bailey and Gosline (1955) are not monophyletic and instead should be assigned subgeneric status in Etheostoma. Wiley and Hagen (Chapter 6) sequenced the cytochrome b gene of mtDNA to test intra- and interspecific relationships of sand darters. Intraspecific variability of allozymes in Etheostoma has also been used to resolve biogeographic variability, both among drainage systems (among populations; Wiseman et al., 1978) and within drainages (among and within populations; Echelle et al., 1975).
Evolutionary relationships among and within species of the genus Stizostedion have been studied using genetic techniques. Billington et al. (1990, 1991) used allozyme and mtDNA RFLP data to test, and found support for, both the morphological hypothesis that the European taxa are the sister group to the North American taxa and the evolutionary hypothesis that North America was colonized from Eurasia via Beringia during the Pliocene (Collette and Banarescu, 1977). The majority of genetic research on this genus, however, has focused on determining the population structure
of the economically important North American walleye, S. vitreum. Identification of allozyme polymorphisms and mtDNA RFLP haplotypes has resolved broad-scale biogeographic patterns across much of the North American range of the walleye, including the Great Lakes (Billington and Hebert, 1988; Ward et al., 1989; Todd, 1990; Billington et al., 1992; Billington and Strange, 1995). However, among more closely spaced sites, representing populations (stocks) identified by tag and recapture methods (Ferguson and Derksen, 1971; Bodaly, 1980), allozyme analyses appear to lack the resolving power necessary to discern significant population-level divergences (i.e., within the Great Lakes; Todd, 1990; Hawley et al., 1991). MtDNA RFLP analysis has revealed population divergence among female walleye from spawning sites in two closely spaced tributaries in Lake Erie (Mercker and Woodruff, 1996), suggesting that walleye population markers can be identified. Allozyme studies of the yellow perch, P. flavescens, have suggested that genetic variability is low or nonexistent in Green Bay, Lake Michigan (Leary and Booke, 1982), and in Lake Erie, Lake Champlain, and Lake Oneida (Strittholt et al., 1988). RFLP analysis of mtDNA indicates variability in a small sample of yellow perch from Lake Erie (Billington, 1993), suggesting that useful population markers may be available in the mitochondrial genome.
II. Materials and Methods
A. Collection of Specimens
Eight species representing five genera of the family Percidae, including the banded darter Etheostoma zonale, bluebreast darter Etheostoma camurum, ruffe Gymnocephalus cernuus, yellow perch Perca flavescens, blackside darter Percina maculata, sauger Stizostedion canadense, walleye S. vitreum, and zander S. lucioperca, were collected from sites in eastern North America and Eurasia (see Section V). Intraspecific variation was studied in four species. One hundred and seventeen specimens of S. vitreum were collected from spawning sites in four tributary rivers and one nearshore reef in Lake Erie, as well as from one tributary to Lake St. Clair (Fig. 2). Twenty G. cernuus were collected from the site of recent introduction in Lake Superior (ca. 1987; Simon and Vondruska, 1989) and from one site near St. Petersburg, Russia. Four S. canadense and 10 P. flavescens were also collected to examine genetic variability in these species. Specimens were collected by seine, electroshocking, or hook and line. Whole individuals or tissues (fin, muscle, eggs, or liver) were either immediately frozen at -80°C or preserved in 95% ethanol in the field. Frozen samples were stored at -80°C, and ethanol-preserved materials were stored at room temperature prior to DNA extraction.
FIGURE 2 Collection sites of spawning walleye, Stizostedion vitreum, and relative frequencies of eight mtDNA control region sequence haplotypes in Lake Erie and Lake St. Clair, including the Clinton River, Maumee River, Sandusky River, Grand River, and Van Buren Bay.
B. Genetic Analysis
Whole small fishes or tissues of larger individuals were frozen and ground in liquid nitrogen using a cylindrical mortar and pestle. DNA was extracted in a guanidine thiocyanate buffer and purified using proteinase K, RNase, phenol, and chloroform for each individual following standard protocols (Stepien et al., 1993; Stepien, 1995). The entire mtDNA control region was amplified in three sections using conserved primers (Kocher et al., 1989; Meyer et al., 1990) and the polymerase chain reaction (PCR). The 5' end or "left" domain of the control region from tRNA-Pro to the central conserved section was amplified using oligonucleotide primers L15926, 5'-TCA AAG CTT ACA CCA GTC TTG TAA ACC-3' (Kocher et al., 1989), and H16498, 5'-CCT GAA GTA GGA ACC AGA TG-3' (Meyer et al., 1990). The 3' end or "right" domain of the control region from the central conserved section to tRNA-Phe was amplified with the light strand complement of H16498, L16498 5'-CAT CTG GTT CCT ACT TCA GG-3', and H503 5'-GCA CGA GAT TTA CCA ACC C-3' (Titus and Larson, 1995). One hundred and sixty-eight nucleotides of the central conserved section were amplified with primers designed from sequences conserved among fishes sequenced in this study, using the light chain primer L16378, 5'-AAT GTA GTA AGA GCC TA-3', and the heavy chain primer H16578, 5'-GGG TAA CGA GGA GTA TG-3'. Heavy chain primers were end-labeled with biotin (Hultman et al., 1989) and the strands were separated using Dynabeads M-280 streptavidin (Dynal Corp., Oslo, Norway) for single strand sequencing (Hultman et al., 1989; Uhlen, 1989). Both strands were sequenced separately using diluted PCR primers with Sanger dideoxy sequencing (Sanger et al., 1977) and Sequenase PCR product sequencing kits (Amersham/U.S. Biochemical Corp., Cleveland, OH). Sequencing reactions were run on 6% polyacrylamide gels for 2, 5, and 8 hr in order to resolve approximately 600 bp from the primer and were visualized by autoradiography.
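The statement that L16498 is the light strand complement of H16498 can be checked directly from the two primer sequences printed above. The following is a minimal Python sketch, not part of the original protocol; the function names are illustrative, and the GC fraction is included only as a common rule-of-thumb descriptor of a primer.

```python
# Primer sequences as printed in the text, with spaces removed.
H16498 = "CCTGAAGTAGGAACCAGATG"
L16498 = "CATCTGGTTCCTACTTCAGG"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # True when the two primers anneal to the same site on opposite strands.
    print(reverse_complement(H16498) == L16498)
    print(gc_fraction(H16498))
```

Because the two primers occupy the same site, the left-domain and right-domain amplicons abut at that position in the central conserved section.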
C. Data Analysis
DNA sequences were read into a Macintosh computer using an IBI/Kodak digitizer, aligned using MacVector-AssemblyLIGN software (International Biotechnologies, Inc., 1992), checked by hand, and aligned among species to identify evolutionarily conserved and variable sections. Percentage nucleotide composition and relative proportion of polymorphic nucleotides (pn; Nei, 1987) were calculated by hand. Secondary folding structures of nucleotide sequences were explored using the RNAdraw program (Version 1.01; Matzura and Wennborg, 1995). Phylogenies were analyzed for different sections of the control region, based on apparently variable evolutionary rates and phylogenetic signal among these sections in primates (Kocher and Wilson, 1991) and teleost fishes (Lee et al., 1995). In addition to calculating phylogenies for data from the entire control region, the rapidly evolving left domain from tRNA-Pro to the central conserved section, the slowly evolving central conserved section, and the rapidly evolving right domain from the central conserved section to tRNA-Phe were considered separately. Phylogenies were analyzed using two methods: distance analysis of pairwise genetic distances (percentage sequence divergence) and parsimony analysis of character states (cladistics). Gamma (γ) genetic distances that approximate substitution rate variation among nucleotide sites in mtDNA control regions (Kocher and Wilson, 1991; Tamura and Nei, 1993) were calculated using the Tamura-Nei model of nucleotide substitution, which accounts for variable substitution rates between purines and pyrimidines (Tamura and Nei, 1993), using MEGA (Molecular Evolutionary Genetics Analysis, Version 1.01; Kumar et al., 1993).
A γ parameter of 0.11 was used for analysis of the entire control region and separate analysis of the central conserved section (Kocher and Wilson, 1991; Tamura and Nei, 1993), and a parameter of 0.50 was implemented for separate analysis of the more rapidly evolving left and right domains
(Wakeley, 1993). Distance phylogenies were inferred with a neighbor-joining algorithm (Saitou and Nei, 1987), and support for individual nodes of resulting trees was examined with 1000 bootstrap replicates (Felsenstein, 1985). Most parsimonious relationships among species and genera were determined with the branch and bound algorithm using PAUP (Phylogenetic Analysis Using Parsimony, Version 3.1.1; Swofford, 1993), and support for nodes of cladograms was estimated using 1000 bootstrap replications (Felsenstein, 1985) and 50% majority rule consensus analysis of the shortest trees (Margush and McMorris, 1981). The percid species with the greatest pairwise genetic distance between itself and all other species examined (the zander, Stizostedion lucioperca) was utilized as the outgroup for parsimony analysis because no mtDNA control region sequence data were available for closely related outgroup taxa. Parsimony and distance trees were both midpoint rooted. Exhaustive maximum parsimony searches (excluding intraspecific variability) were conducted using PAUP Version 3.1.1 (Swofford, 1993) for the entire control region and for the left domain, central conserved section, and right domain separately. The skewness of the frequency distribution of tree lengths relative to the most parsimonious tree length (g1 statistic) was used to determine phylogenetic signal in the data set (Hillis and Huelsenbeck, 1992). A topology-dependent cladistic permutation tail probability analysis (T-PTP; Faith, 1991) was also used to determine if most parsimonious trees differed significantly from morphologically derived phylogenetic hypotheses (Fig. 1). Sequence data were randomly permuted 1000 times, and the frequency distribution of 1000 branch and bound tree lengths was calculated using T-PTP in PAUP* (Swofford, 1996).
Tree length changes between most parsimonious branch and bound trees and the morphological trees (excluding taxa not surveyed in this study) were determined in MacClade (Version 3.0; Maddison and Maddison, 1992), and these lengths were compared to the frequency distributions of tree length differences between the 1000 randomly permuted trees and the most parsimonious trees to test the statistical support of control region data for their relationships in morphological hypotheses (for a discussion of this method see Bernardi, Chapter 12). T-PTP analyses were repeated for the entire control region and for the left domain, central conserved section, and right domain to test their relative phylogenetic signals.
Intraspecific variability was analyzed using population genetic statistics. Haplotypic diversity (h) was calculated following Nei (1987). Geographic heterogeneity in frequency distributions of haplotypes was analyzed with a χ² test, using a Monte Carlo simulation approach with 1000 randomizations to account for small sample sizes and empty cells in the contingency matrix (Roff and Bentzen, 1989), with the MONTE option in REAP (Restriction Enzyme Analysis Package, Version 4.0; McElroy et al., 1992). In addition, molecular variability was ascribed to variance components including among regions, among populations within regions, and within populations, and the Fst analog Φst (Weir and Cockerham, 1984) was calculated using AMOVA (Analysis of Molecular Variance, Version 1.53; Excoffier et al., 1992).
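The two population-level statistics named above can be illustrated with a short Python sketch. It is not the REAP implementation: it assumes the usual form of Nei's (1987) unbiased haplotype diversity, h = n/(n - 1) × (1 - Σx²), where x is the frequency of each haplotype in a sample of n individuals, and it approximates the Roff and Bentzen (1989) approach by shuffling haplotype labels among sites and counting how often the randomized χ² is at least as large as the observed value. All function names and the example data are hypothetical.

```python
import random
from collections import Counter


def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two individuals")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))


def chi_square(sites, haplotypes):
    """Pearson chi-square for the site-by-haplotype contingency table."""
    n = len(sites)
    site_totals = Counter(sites)
    hap_totals = Counter(haplotypes)
    observed = Counter(zip(sites, haplotypes))
    stat = 0.0
    # Sorted keys keep the summation order identical across shuffles.
    for s in sorted(site_totals):
        for h in sorted(hap_totals):
            expected = site_totals[s] * hap_totals[h] / n
            stat += (observed[(s, h)] - expected) ** 2 / expected
    return stat


def monte_carlo_p(sites, haplotypes, randomizations=1000, seed=0):
    """Proportion of label shuffles whose chi-square is >= the observed value."""
    rng = random.Random(seed)
    observed = chi_square(sites, haplotypes)
    shuffled = list(haplotypes)
    hits = 0
    for _ in range(randomizations):
        rng.shuffle(shuffled)
        if chi_square(sites, shuffled) >= observed - 1e-9:
            hits += 1
    return hits / randomizations


if __name__ == "__main__":
    # Hypothetical sample: two sites with completely different haplotypes.
    sites = ["river_1"] * 10 + ["river_2"] * 10
    haps = ["hap_1"] * 10 + ["hap_2"] * 10
    print(haplotype_diversity(haps))
    print(monte_carlo_p(sites, haps))
```

The randomization replaces the asymptotic χ² distribution, which is unreliable when many cells of the contingency table are empty or hold very small counts.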
III. Results
Sequences are deposited in GenBank (accession numbers 90617-90624). Alignment of mtDNA control region sequences reveals a common pattern of conserved sections with intervening variable sections (Fig. 3). The left domain flanking tRNA-Pro contains a putative termination associated sequence (TAS), 5'-AAA CTA TTC TTT-3', which appears homologous to that of other vertebrates (e.g., mouse; Doda et al., 1981). In the downstream 3' direction, alignment reveals conserved sequences that also appear homologous to conserved sequence blocks (CSBs) of mammals (Southern et al., 1988; Saccone et al., 1991) and other fishes (Lee et al., 1995), including the central conserved section, a pyrimidine-rich "tract," and conserved sequence boxes 2 and 3 (CSB-2, 3).
FIGURE 3 Structure and sites of variability in the mtDNA control region of fishes from the teleost fish family Percidae. The control region is flanked by sequences that code for transfer RNA. The length of the control region varies between 908 and 1248 nucleotides and consists of several conserved sections, including the termination associated sequence (TAS; Doda et al., 1981; Southern et al., 1988), central conserved section (Lee et al., 1995), and conserved sequence boxes 2 and 3 (CSB-2 and CSB-3; Southern et al., 1988; Saccone et al., 1991), intervened by variable sections, including tandem repeats and base substitutions. Length heterogeneity is due to variable numbers of tandem repeats, totaling 50 to 388 nucleotides. Sites of intraspecific base substitutions for the walleye Stizostedion vitreum and the ruffe Gymnocephalus cernuus are indicated.
[Diagram, 5' to 3': tRNA-Pro, TAS, repeated sequences (50-388 nucleotides), central conserved section, CSB-2, CSB-3, tRNA-Phe; 855-861 nucleotides excluding repeats and 908-1248 nucleotides total length.]
The total length of the control region is similar among the percids analyzed, ranging from 908 nucleotides in yellow perch (P. flavescens) to 1248 nucleotides in walleye (S. vitreum) (Fig. 3). Variability in length is primarily due to variable numbers of repeated sequences (e.g., tandem repeats; Buroker et al., 1990) located immediately downstream of the TAS (Fig. 3). Sequences of 10 to 12 nucleotides, similar in sequence to the TAS, are repeated from 5 times for 50 total nucleotides in zander (S. lucioperca) to 34 times for 388 nucleotides in walleye. Nucleotide sequences of tandem repeats vary both among species and within species. For example, in sauger (S. canadense) the same motif is repeated 18 times for all individuals (n = 10), but for walleye the same motif is repeated from 7 to 14 times, followed by 17 to 20 repeats that are variable in both primary sequence and number among individuals (n = 117). Analyses of folding kinetics indicate that tandem repeats may potentially form hairpin-shaped secondary folding structures with negative associated energies (Fig. 4).
FIGURE 4 Secondary structures of tandem repeat sequences in the mtDNA control region of fishes from the family Percidae, determined using RNAdraw (Matzura and Wennborg, 1995). Concentric circles denote 5' end of sequence. (A) Folding structure of tandem repeats (274 bases) for sauger, Stizostedion canadense, with estimated free energy of -139.80 kJ (37°C). A similar stem and loop-shaped structure was found for walleye, S. vitreum. (B) Folding structure of tandem repeats (132 bases) for blackside darter, Percina maculata, with estimated free energy of -278 kJ (37°C). Similar hairpin-shaped structures were found for the banded darter Etheostoma zonale, bluebreast darter E. camurum, zander S. lucioperca, yellow perch Perca flavescens, and ruffe Gymnocephalus cernuus.
Excluding tandem repeats, lengths of control region sequences are more similar among species of percids, ranging from 855 bp in the bluebreast darter E. camurum to 861 bp in sauger S. canadense. Alignment of nonrepeated sequences shows 196 variable sites (pn = 0.227), with 61 transitions, 82 transversions, 26 that show both transitions and transversions, 20 insertions/deletions, 6 that include both transversions and insertions/deletions, and 1 site that includes transitions, transversions, and insertions/deletions. Higher levels of nucleotide polymorphism are found in the left domain (pn = 0.282) and in the right domain (pn = 0.248), with a considerably lower level in the central conserved section (pn = 0.088). Despite sequence variability among species, nucleotide compositions are similar, with high A-T content and significantly different frequencies among the four nucleotides than expected at random (χ², df = 3, p < 0.05). For example, nucleotide frequency ratios for sauger (S. canadense) are (G)0.14:(A)0.34:(T)0.29:(C)0.23; these ratios differ by no more than two percentage points among species.
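The arithmetic behind these summary values can be retraced in a few lines of Python. This is an illustrative sketch only. The number of aligned sites is not stated in the text, so the value of 863 used here is an assumption chosen because it is consistent with the reported pn of 0.227. The composition test likewise assumes that the sauger frequencies apply to its 861 nonrepeated nucleotides and that "expected at random" means equal frequencies of 0.25.

```python
# Variable-site counts by type, as reported in the text.
VARIABLE_SITES = {
    "transitions": 61,
    "transversions": 82,
    "transitions and transversions": 26,
    "insertions/deletions": 20,
    "transversions and insertions/deletions": 6,
    "transitions, transversions, and insertions/deletions": 1,
}

ALIGNED_SITES = 863  # assumption: not given in the text
SAUGER_SITES = 861   # nonrepeated control region length reported for sauger
SAUGER_FREQS = {"G": 0.14, "A": 0.34, "T": 0.29, "C": 0.23}
CHI2_CRITICAL_DF3 = 7.815  # 5% critical value of chi-square with 3 df


def proportion_polymorphic(n_variable, n_sites):
    """Proportion of polymorphic nucleotide sites (pn)."""
    return n_variable / n_sites


def composition_chi_square(freqs, n_sites):
    """Chi-square of observed base counts against equal expected counts."""
    expected = n_sites / len(freqs)
    return sum((f * n_sites - expected) ** 2 / expected for f in freqs.values())


if __name__ == "__main__":
    total_variable = sum(VARIABLE_SITES.values())
    print(total_variable)
    print(round(proportion_polymorphic(total_variable, ALIGNED_SITES), 3))
    chi2 = composition_chi_square(SAUGER_FREQS, SAUGER_SITES)
    print(chi2 > CHI2_CRITICAL_DF3)
```

Under these assumptions the six site categories sum to the 196 variable sites reported, and the sauger composition departs from equal base frequencies by far more than the 5% critical value.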
A. Phylogenetic Analysis
Genetic distance analysis of the entire control region (excluding tandem repeats) produced the tree shown in Fig. 5.
FIGURE 5 MEGA (Molecular Evolutionary Genetics Analysis; Kumar et al., 1993) neighbor-joining distance tree for the mtDNA control region of eight species of fishes from the family Percidae. Distances were estimated using pairwise γ genetic distances (a = 0.50) and the Tamura-Nei model of nucleotide substitution (Tamura and Nei, 1993); the tree is midpoint rooted. Distances are listed above branches, and values for 1000 bootstrap replications (Felsenstein, 1985) are below branches.
Two primary branches were supported, with one containing walleye (S. vitreum) and sauger (S. canadense) as the sister group to zander (S. lucioperca) and yellow perch (P. flavescens), and the other clade depicting ruffe (G. cernuus) as the sister group to the darters, with the banded darter (Etheostoma zonale) as the sister taxon to the bluebreast (E. camurum) and blackside darter (Percina maculata). All relationships, except those among walleye haplotypes, were supported by bootstrap values greater than 60%. Genetic distance analysis of the right domain alone produced a tree with the same topology and similar bootstrap values (not shown), except that the blackside darter was the sister group to the other two species of darters (bootstrap = 39%). Similar trees were produced using the left domain and the central conserved section alone; however, the position of yellow perch and ruffe and relationships among the darters differed. The left domain supported sister group relationships between zander and ruffe (bootstrap = 37%) and between yellow perch and the walleye/sauger (bootstrap = 37%). The central conserved section supported a sister group relationship between ruffe and walleye/sauger (bootstrap = 72%), the blackside darter was the sister group of the other two darter species (bootstrap = 46%), and the yellow perch was the sister taxon to all other species.
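The bootstrap percentages quoted above come from resampling alignment columns with replacement and asking how often a grouping reappears (Felsenstein, 1985). The Python sketch below illustrates only that resampling step. It is not the MEGA or PAUP procedure: a real analysis rebuilds the whole tree for every pseudoreplicate, whereas here a simple comparison of uncorrected p-distances stands in for tree building. The sequences are hypothetical toy data and all names are illustrative.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement to build one pseudoreplicate."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        taxon: "".join(seq[i] for i in columns)
        for taxon, seq in alignment.items()
    }


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def bootstrap_support(alignment, grouping_recovered, replicates=1000, seed=0):
    """Percentage of pseudoreplicates in which the grouping is recovered."""
    rng = random.Random(seed)
    hits = sum(
        bool(grouping_recovered(bootstrap_alignment(alignment, rng)))
        for _ in range(replicates)
    )
    return 100.0 * hits / replicates


# Hypothetical toy sequences, not data from the chapter.
TOY = {
    "walleye": "ACGTACGTAC",
    "sauger": "ACGTACGTAT",
    "zander": "TCGAACTTAG",
}


def walleye_nearest_sauger(aln):
    """Stand-in criterion: walleye is closer to sauger than to zander."""
    return p_distance(aln["walleye"], aln["sauger"]) < p_distance(
        aln["walleye"], aln["zander"]
    )


if __name__ == "__main__":
    print(bootstrap_support(TOY, walleye_nearest_sauger))
```

A grouping supported by many independent sites survives most resamplings, while one that rests on a few sites is lost whenever those columns are not drawn, which is why short sections such as the central conserved section tend to give lower values.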
Cladistic analysis produced six most parsimonious trees of 313 steps, and the 50% majority rule consensus cladogram is shown in Fig. 6. Excluding the walleye haplotypes produced one most parsimonious tree with the same topology (not shown). All relationships supported by distance analysis were also supported by cladistics. For example, two primary clades were produced: one containing walleye (S. vitreum) as the group sister to sauger (S. canadense), and the zander (S. lucioperca) as the group sister to yellow perch (P. flavescens) and the other containing ruffe (G. cernuus) as the taxon sister to the darters, with the banded darter (E. zonale) as the taxon sister to the bluebreast (E. camurum) and blackside darter (P. maculata). Separate cladistic analyses of the control region by section supported the same general topology (trees not shown). However, the right domain (one shortest tree, 170 steps) suggested that the bluebreast darter is the sister group of the other two species of darters (bootstrap = 99%), the left domain (50% majority rule of 3 trees, 112 steps) suggested that ruffe is the sister group of the clade containing walleye/sauger and the darters (bootstrap = 67%), and the central conserved section (50% majority rule of 3 trees,
JOSEPH E. FABER AND CAROL A. STEPIEN
FIGURE 6 PAUP (Phylogenetic Analysis Using Parsimony, Version 3.1.1; Swofford, 1993) branch and bound 50% majority-rule consensus (Margush and McMorris, 1981) tree of the six most parsimonious trees of 313 steps, for the mtDNA control region of eight species of fishes from the family Percidae, with the zander (Stizostedion lucioperca) as the outgroup. The tree is midpoint rooted. Consensus values are listed above branches, and values for 1000 bootstrap replications (Felsenstein, 1985) are below branches. [Tree graphic not reproduced. Terminal taxa, top to bottom: sauger, Stizostedion canadense; walleye haplotypes 1, 2, 7, 3, 5, 6, 4, and 8, S. vitreum; zander, S. lucioperca; yellow perch, Perca flavescens; ruffe (Lake Superior) and ruffe (Russia), Gymnocephalus cernuus; banded darter, Etheostoma zonale; bluebreast darter, E. camurum; blackside darter, Percina maculata.]
18 steps) suggested that yellow perch is included in the same clade with the darters (bootstrap = 55%). The g1 skewness statistic was -0.859, indicating significant skew of the most parsimonious (shortest) tree relative to all other trees produced (p < 0.05; Hillis and Huelsenbeck, 1992). The g1 statistic was also significant (p < 0.05) for separate analyses of the left and right domains and the central conserved section. T-PTP tests indicated that control region data supported relationships among taxa (Figs. 5 and 6) that were not significantly different than those supported by morphology (Fig. 1). In other words, for the five genera surveyed, control region data statistically supported relationships found in the morphological trees. Similar levels of support for morphological hypotheses were found for the left and right domains (p > 0.05), but central conserved section data produced relationships that were significantly different than those indicated by morphology (p < 0.05).
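The g1 statistic above is the skewness of the distribution of tree lengths. A minimal sketch of that calculation follows, assuming the usual moment-based definition g1 = m3 / m2^1.5; the function name and the example tree lengths are hypothetical and are not data from this study.

```python
from statistics import fmean


def g1_skewness(tree_lengths):
    """Moment-based sample skewness: g1 = m3 / m2**1.5 (central moments)."""
    n = len(tree_lengths)
    mean = fmean(tree_lengths)
    m2 = sum((x - mean) ** 2 for x in tree_lengths) / n
    m3 = sum((x - mean) ** 3 for x in tree_lengths) / n
    if m2 == 0:
        raise ValueError("all tree lengths are identical; g1 is undefined")
    return m3 / m2 ** 1.5


# Hypothetical tree lengths (steps): a few short trees and many long ones,
# which gives the long left tail associated with phylogenetic signal.
example_lengths = [313, 330, 338, 341, 343, 344, 345, 346, 346, 347]
g1 = g1_skewness(example_lengths)
print(f"g1 = {g1:.3f}")  # negative for a left-skewed distribution
```

A strongly negative g1 is read as evidence of signal only after comparison with critical values such as those of Hillis and Huelsenbeck (1992); the sketch computes the statistic and nothing more.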
B. Population Genetics
Seven intraspecific polymorphic nucleotide sites (pn = 0.007) were identified in the control region of walleye (S. vitreum), with four sites in the left domain (nucleotides 114, 165, 225, and 250; Fig. 3), and two rarer substitutions (found in one or two individuals) in the right domain (nucleotides 573 and 646; Fig. 3). For 117 individuals examined, eight primary haplotypes (excluding differences in tandem repeats) were identified. The most parsimonious relationships among these haplotypes were uncertain (Fig. 6), and genetic distance analysis support for these relationships was weak, with bootstrap values less than 50% (Fig. 5). Most haplotypes were distributed widely among sites in Lake St. Clair and Lake Erie (h = 0.707 ± 0.025), with haplotypes 2 and 3 found in each population sampled (Fig. 2). However, four haplotypes were identified only in the Sandusky River or Van Buren Bay populations. The most variance in molecular data was attributable to within-population variability, followed by variability between regions and lakes. Both within-population and between region variance and associated Φ statistics were significantly different than expected by chance (Table Ia). Among-population variability accounted for the smallest variance component, and Φst, although greater than zero, was not significantly different from that expected by chance (0.05 <
9. Phylogenetic Analysis of the Percidae

TABLE Ia Hierarchical Analyses of Molecular Variance among Haplotypes of Walleye (Stizostedion vitreum) a

Variance component                 Variance   % total variance   p b        Φ statistic
Among regions (σa²)                0.0628     11.08%             <0.001*    Φct = 0.100
Among populations/regions (σb²)    0.0214      3.78%              0.096     Φsc = 0.042
Within populations (σc²)           0.4827     85.14%             <0.001*    Φst = 0.149

a From Excoffier et al. (1992). b Probability of finding a greater variance component and Φ statistic than the observed values by chance. An asterisk denotes significance at p < 0.05.

TABLE Ib χ² a and Standard Error Values, and Their Significance, for Haplotype Frequency Distributions of Walleye (S. vitreum) among All Sites and for Unplanned Pairwise Comparisons of Frequency Distributions between Sites b

                  Lake St. Clair   Lake Erie
                  Clinton River    Maumee River   Sandusky River   Grand River   Van Buren Bay
Clinton River     —                0.001*         0.001*           0.001*        0.001*
Maumee River      0.001            —              0.018*           0.216         0.022*
Sandusky River    0.001            0.004          —                0.006*        0.801
Grand River       0.001            0.013          0.002            —             0.019*
Van Buren Bay     0.001            0.005          0.013            0.004         —

a Monte Carlo simulation (Roff and Bentzen, 1989). b Among all sites: df = 4, χ² = 49.44, p = 0.001 ± 0.001*. Tabled matrix includes p values in the upper right and associated standard errors in the lower left. An asterisk denotes significance at p < 0.05.
p < 0.10; Table Ia). However, haplotype frequencies were not distributed randomly among sites (χ², p < 0.001 ± 0.001; Table Ib), and additional unplanned pairwise χ² tests suggested significantly different frequencies of haplotypes between sites (p < 0.05), except between the Maumee River and Grand River and between Van Buren Bay and the Sandusky River (Table Ib). Four nucleotide substitutions were identified in the control region of ruffe, G. cernuus, with three in the left domain at nucleotides 85, 87, and 201, and one in the central conserved section at nucleotide 396 (Fig. 3). Two haplotypes were identified, one in the North American Lake Superior region (n = 10) and one in St. Petersburg, Russia (n = 10) (h = 0.500). Intraspecific sequence variability was not found in the control regions of sauger, S. canadense (n = 4), or yellow perch, P. flavescens (n = 10).
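The percentages and Φ statistics in Table Ia follow from the three variance components. The sketch below assumes the standard AMOVA definitions of Excoffier et al. (1992); the function and variable names are illustrative only. Because the tabled components are rounded, small differences from the tabled values are expected. Φct in particular computes to about 0.111 here against the tabled 0.100, and this sketch does not try to resolve that gap.

```python
def amova_summary(var_among_regions, var_among_pops, var_within_pops):
    """Percent of total variance and Phi statistics from AMOVA components.

    Assumed definitions:
      Phi_ct = sigma_a^2 / sigma_total^2
      Phi_sc = sigma_b^2 / (sigma_b^2 + sigma_c^2)
      Phi_st = (sigma_a^2 + sigma_b^2) / sigma_total^2
    """
    total = var_among_regions + var_among_pops + var_within_pops
    return {
        "pct_among_regions": 100 * var_among_regions / total,
        "pct_among_pops": 100 * var_among_pops / total,
        "pct_within_pops": 100 * var_within_pops / total,
        "phi_ct": var_among_regions / total,
        "phi_sc": var_among_pops / (var_among_pops + var_within_pops),
        "phi_st": (var_among_regions + var_among_pops) / total,
    }


# Variance components as printed in Table Ia.
summary = amova_summary(0.0628, 0.0214, 0.4827)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With these inputs the within-population share is about 85% and Φst is about 0.149, matching the table to rounding.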
IV. Discussion
The mtDNA control region of vertebrate taxa, including bony fishes, contains highly variable sequences punctuated by conserved sections that are involved in mtDNA replication (reviewed by Lee et al., 1995). Apparently, homologous conserved sections revealed in this study imply similar function in the mtDNA control region of the Percidae. Tandem repeats identified in Percidae are also similar to those identified in several other vertebrate taxa (for a review see Hoelzel et al., 1994). Repeats have been documented at several sites in the vertebrate control region, but are known from only the RS1 (repeat site 1; Hoelzel et al., 1994) in some fishes, including Pacific white sturgeon A. transmontanus (Buroker et al., 1990) and Atlantic cod G. morhua (Arnason and Rand, 1992). This study found that tandem repeats occur at the RS1 in all the Percidae examined. Several mechanisms (reviewed in Arnason and Rand, 1992) have been suggested to explain the formation, maintenance, and variability (in primary sequence and repeat number) of tandem repeats, including slipped strand mispairing (Levinson and Gutman, 1987), recombination (Rand and Harrison, 1989), and illegitimate elongation (Buroker et al., 1990). Essentially, these hypotheses require the secondary folding of sequences in or near the TAS, causing imperfect termination of strand replication that produces repeated TAS-like sequences. Energetically favorable hairpin structures associated with tandem repeats in the Percidae (Fig. 4) support these hypotheses.
Comparison of control region sequences also reveals different amounts of phylogenetic signal and phylogenetic noise at different taxonomic levels in the Percidae. The data set appears to contain more phylogenetic noise at higher taxonomic levels and more phylogenetic signal at lower taxonomic levels.
A. Comparisons among Genera Mitochondrial DNA control region sequences show phylogenetic utility at the generic level in the family Percidae. Significant skew of tree lengths from the exhaustive cladistic search indicates phylogenetic signal in the data set, which should support at least some "correct" lineages (Hillis and Huelsenbeck, 1992), and T-PTP analysis indicates significant support in data for aspects of all four morphological hypotheses of relationships among genera (Fig. 1). Genetic distance and most parsimonious tree topologies support congruent relationships among all species examined (Figs. 5 and 6), and both show pike-perches (Stizostedion) and darters (Etheostoma and Percina) correctly assigned to separate clades. Despite support for phylogenetic signal in the data set, inconsistencies between the genetic distance and cladistic trees and the morphological alternatives indicate phylogenetic noise. For example, Stizostedion is paraphyletic, with S. lucioperca (zander) as the sister group to P. flavescens (yellow perch) in the distance and cladistic trees. The sister relationship of yellow perch and zander is inconsistent with all morphological hypotheses and appears unlikely. Finally, the sister relationship of ruffe to the darters (distance and cladistic analyses) supports only Collette and Banarescu's (1977) morphological hypothesis (Fig. 1A), which is not widely accepted. Our study is presently missing several genera (including the probably extinct Percarina, the darters Ammocrypta and Crystallaria, and the European darters Romanichthys and Zingel), which precludes the effective testing of alternative morphological hypotheses. For example, the hypotheses of Page (1985) and Coburn and Gaglione (1992) (Figs. 1C and 1D, respectively) differ only in placement of the European darters, which are not included here. 
In addition, low sample size (few genera) and/or inherent sensitivity problems of the T-PTP test (Alroy, 1994) may have falsely provided statistical support of our data set for all four morphological hypotheses (Fig. 1).
B. Comparisons among Species
Two species of Stizostedion and one of the two genera of darters examined (Etheostoma) were paraphyletic in both genetic distance and cladistic trees (Figs. 5 and 6), suggesting that the phylogenetic signal may be imperfect at this taxonomic level. However, both distance and cladistic trees support the well-accepted hypothesis of a sister relationship between S. vitreum (walleye) and S. canadense (sauger), with both more distantly related to the S. lucioperca (zander) (Collette and Banarescu, 1977). Evolutionary divergence times estimated from genetic distances appear to support divergence time estimates from other molecular data. If divergence time is estimated at 1 million years per 2% sequence divergence (Brown et al., 1979), as has been assumed for several other taxa of bony fishes (Thomas et al., 1986; Grewe et al., 1990; Bermingham and Avise, 1986), our distance data (Fig. 5) suggest that North American walleye and sauger and the European zander last shared a common ancestor 4.75 ± 1.45 million years before present (mybp), supporting the hypothesis of North American colonization from Eurasia via Beringia during the Pliocene, previously supported with allozyme and mtDNA RFLP data (Billington et al., 1990, 1991). Genetic distances also estimate a divergence time of 3.85 ± 0.90 mybp between walleye and sauger, similar to that proposed using allozymes and mtDNA RFLPs by Billington et al. (1990, 1991). Although both genetic distance and most parsimonious trees show the bluebreast and blackside darters (E. camurum and P. maculata, respectively) as more closely related to each other than to the banded darter (E. zonale), bootstrap values indicate little support for this relationship. Alternatively, the relationships of Percina may need to be reexamined. Genetic distances are consistent with the hypothesis that darters have originated and diversified in North America since the Pliocene (reviewed by Page, 1983). Assuming 2% sequence divergence per million years (Brown et al., 1979), divergence time estimates among darters range from 3.70 ± 0.78 mybp between E. camurum and P. maculata to 5.02 ± 1.02 mybp between E. zonale and P. maculata.
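The divergence times in this section rest on a linear clock of 2% sequence divergence per million years (Brown et al., 1979). A minimal sketch of that conversion is given below. The function names are hypothetical, and the percent divergences printed by the usage lines are back-calculated from the reported times; they are not values reported in the text.

```python
def divergence_time_my(percent_divergence, rate_percent_per_my=2.0):
    """Divergence time in millions of years under a linear molecular clock."""
    return percent_divergence / rate_percent_per_my


def implied_percent_divergence(time_my, rate_percent_per_my=2.0):
    """Inverse conversion: percent sequence divergence implied by a time."""
    return time_my * rate_percent_per_my


# Back-calculated illustration: reported times (my) with their standard errors.
reported = {
    "walleye/sauger vs. zander": (4.75, 1.45),
    "walleye vs. sauger": (3.85, 0.90),
}
for pair, (time_my, se_my) in reported.items():
    pct = implied_percent_divergence(time_my)
    pct_se = implied_percent_divergence(se_my)
    print(f"{pair}: {time_my} my implies about {pct:.1f}% +/- {pct_se:.1f}% divergence")
```

Because the clock is linear, the standard error of the time estimate scales by the same factor as the distance. The calibration itself is an assumption that the chapter treats as only generally applicable.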
C. Comparisons among Populations
Sequencing the mtDNA control region of walleye detected divergence at a finer geographic scale in Lake Erie and Lake St. Clair than either allozyme or mtDNA RFLP analyses (Todd, 1990; Billington and Hebert, 1988; Ward et al., 1989; Billington et al., 1992). Although the majority of molecular variance was found within populations (Clinton River, Maumee River, Sandusky River, Grand River, and Van Buren Bay) and between regions (Lakes Erie and St. Clair), the nonrandom distribution of haplotypes among populations strongly suggests that walleye do not mate randomly and probably home to natal spawning sites. The population substructure is most easily definable if unique genetic traits arise in isolated populations over long evolutionary time periods (e.g., among sunfish populations in the southeastern United States; Bermingham and Avise, 1986). Unique populations may also arise as indicated by nonrandom frequency distributions of genetic markers due to random extinctions of lineages in isolated populations (Avise, 1994; e.g., Caribbean reef fishes; Shulman and Bermingham, 1995). Most present-day genetic variability of Great Lakes walleye probably originated before or during the Wisconsin glaciation, which lasted from at least 1,000,000 years bp to approximately 10,000 years bp (Pielou, 1991), when populations were putatively isolated in glacial refugia (Billington et al., 1992). The present-day nonrandom distribution of walleye haplotype frequencies in Lake Erie may be due to the postglacial dispersal of walleye haplotypes, followed by random extinction of haplotypes among subsequently isolated spawning populations. Work in progress demonstrates that control region sequences are also useful at a broader geographic range in walleye, with fixed nucleotide substitutions among geographic regions that were probably colonized by populations from different glacial refugia. Some of these differences were also found with allozymes and mtDNA RFLPs by Billington et al. (1990, 1991). Lack of within-population variation in ruffe suggests genetic drift or founding events in the populations sampled. Low variability in Eurasia may be due to genetic drift, and low variability in the recently introduced (ca. 1987) North American sample may be due to a founder effect (a small number of founding individuals in cargo vessel ballast discharge). Alternatively, small sample sizes may have precluded the detection of within-population variability.
Distinct fixed genetic differences at least indicate that the North American Lake Superior and the Russian Lake Komsomolskoe populations are genetically divergent and that the Lake Superior introductions did not originate from the Lake Komsomolskoe region of the ruffe's native range. Lack of variability in yellow perch (n = 10) and sauger (n = 4) is also likely due to small sample sizes, but may reflect lack of true population variability. For example, yellow perch populations tend to experience boom and bust cycles (Ney, 1978), which may reduce genetic variability (Strittholt et al., 1988). However, previously reported allozyme variability in sauger from the Ohio River (White, 1993) and mtDNA RFLP variability in yellow perch from Lake Erie (Billington, 1993), indicate that variability should be detected in larger sample sizes. Alternatively, the Percidae
as a family may be characterized by low population-level variability in the mtDNA control region.
D. General Discussion
Relatively poor resolution of relationships among genera and higher resolution of phylogenetic and biogeographic patterns among and within species indicate phylogenetic noise and reduced signal in the control region data set among taxa that have diverged for longer evolutionary periods of time. Unlikely associations among distantly related taxa are probably due to multiple mutations at polymorphic nucleotide sites, which is evidenced by a relatively high nucleotide polymorphism (pn = 0.227) and a transversion to transition ratio of 2.2:1. DNA sequences with nucleotide polymorphism greater than 0.25 and transversion to transition ratios greater than approximately 1:10 may accumulate mutations and homoplasies (reversals of character state changes) that obscure accurate phylogeny reconstruction (Brown, 1983). Genetic distance phylogenetic analyses of mtDNA sequences may also be inaccurate when genetic distances approach or exceed 15% (Brown, 1983; e.g., pairwise distances ranged from about 15 to 28.8% between the bluebreast darter and yellow perch). Once p values exceed 15%, the molecular clock calibration becomes nonlinear (Brown, 1983) and phylogenetic resolution is reduced by homoplasy (Moritz et al., 1987). Alternatively, genetic distance estimates among species of Stizostedion and darters (Etheostoma and Percina) that probably diverged for shorter evolutionary periods were less than 15%, suggesting that fewer homoplasies (and less phylogenetic noise) have probably accumulated. Although the molecular clock estimate for mtDNA divergence is only generally applicable among different regions of the molecule and among different taxa (reviewed by Avise, 1994), these results indicate that control region sequences are at least partially predictive for more closely related genera and species of pike-perches and darters. 
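The quantities discussed above (pairwise p distance, transitions, and transversions) can be counted directly from two aligned sequences. The sketch below is illustrative only: the function name and the two short sequences are hypothetical, gaps and ambiguity codes are simply skipped, and the 15% threshold is taken from the discussion above (Brown, 1983), not derived here.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
SATURATION_THRESHOLD = 0.15  # p distance above which the text expects nonlinearity


def compare_sequences(seq1, seq2):
    """Return (p_distance, transitions, transversions) for two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p_distance = (transitions + transversions) / compared
    return p_distance, transitions, transversions


# Hypothetical 10-bp alignment: one G/A transition and one A/C transversion.
p, ts, tv = compare_sequences("ACGTACGTAC", "ACATACGTCC")
print(f"p = {p:.2f}, transitions = {ts}, transversions = {tv}")
print("above 15% threshold" if p > SATURATION_THRESHOLD else "below 15% threshold")
```

Uncorrected p distances undercount multiple substitutions at the same site, which is the homoplasy problem the paragraph describes. A corrected model would be needed for distances near or above the threshold.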
More recent divergences of walleye and ruffe populations preclude homoplasy in control region sequences within species, indicating that they are accurate population genetic markers. Separate analyses of control region sections indicated that the central conserved section (CCS) evolves more slowly than do the left and right domains. The central conserved section exhibited lower pn and produced phylogenetic trees with very different topologies. Lower polymorphism in the CCS and apparent usefulness in other phylogenetic studies up to the family level (Lee et al., 1995) indicate fewer homoplasies and probable phylogenetic utility at higher taxonomic
levels. However, both cladistic and genetic distance trees supported unrealistic phylogenies, and T-PTP analysis of the CCS data set did not support any of the four currently accepted morphological phylogenetic hypotheses (Fig. 1). These results probably do not indicate the true phylogenetic utility of the CCS due to few available characters; only 15 genetic distance and 7 cladistically informative characters were found out of 168 total nucleotides. Control region sequences, though, may help to clarify relationships that cannot be clearly resolved using more slowly evolving genes. For example, rapidly evolving left domain data have been used in tandem with more slowly evolving cytochrome b data to examine the phylogeny of cichlids (Sturmbauer and Meyer, 1993) and rainbow fishes (Zhu et al., 1994). It may also prove instructive to include control region CCS data with such analyses, as this section evolves more slowly than the rest of the control region. Conversely, the left and right domains (including most of the control region) evolve at higher rates, with relatively high nucleotide polymorphism (pn) and support for phylogenetic trees approximately the same as that with the entire control region. Population genetic markers for walleye and ruffe were found primarily in the left domain, suggesting that this section may house population genetic markers in other percids. Population genetic analyses of other fishes, including the pleuronectid Dover sole Microstomus pacificus (Stepien, 1995), scorpaenid thornyhead Sebastolobus alascanus (Stepien, 1995), serranid spotted sand bass Paralabrax maculatofasciatus (Stepien, 1995), acipenserid white sturgeon A. 
transmontanus (Brown et al., 1993), xiphiid swordfish Xiphias gladius (Bremer et al., 1995), and the salmonid rainbow trout (steelhead) Oncorhynchus mykiss (Nielson et al., Chapter 5), each identify population genetic markers in the left domain of the mtDNA control region, further supporting its population genetic utility across a wide range of fish taxa. The utility of mtDNA sequences for population genetics and biogeography for several taxa of bony fishes indicates that mtDNA control regions of other teleosts may evolve at similarly rapid rates. We expect that mtDNA control region sequences will continue to prove valuable for elucidating within-species and congeneric relationships. Higher-level phylogenies that rely solely on control region data will probably be affected by homoplasy and may suggest unrealistic associations among taxa. It is prudent (although too rarely practiced) to examine variability above and below the taxonomic level being studied in order to evaluate whether the level of variability is appropriate for the problem. Relative evolutionary rates of DNA sequences can only be elucidated through multispecies
comparisons as have been performed here, and such comparisons will undoubtedly help determine phylogenetic signal and utility for other regions of the mitochondrial and nuclear genomes.
V. Material Examined Etheostoma zonale: Tuscarawas River, Ohio, May 1994, n = 1. Gymnocephalus cernuus: St. Louis River, Lake Superior, Wisconsin, April 1994, n = 10; Komsomolskoe Lake, St. Petersburg, Russia, 1994, n = 10. Perca flavescens: South Bass Island, Lake Erie, Ohio, May 1993, n = 2; Sandusky Bay, Lake Erie, Ohio, May 1993, n = 8. Percina maculata: Shade River, Ohio, May 1994, n = 1. Stizostedion canadense: Hannibal lock spillway, Ohio River, Ohio, January 1993, n = 2; Racine lock spillway, Ohio River, Ohio, December 1994, n = 2. Stizostedion lucioperca: unknown European site, 1994, n = 1. Stizostedion vitreum: Clinton River, Lake St. Clair, Michigan, April 1995, n = 24; Maumee River, Lake Erie, Ohio, April 1993, n = 23; Sandusky River, Lake Erie, Ohio, April 1993, n = 24; Grand River, Lake Erie, Ohio, April 1993, n = 25; Van Buren Bay, Lake Erie, Dunkirk, New York, April 1993, n = 21.
Acknowledgments We thank C. Baker, C. Knight, R. Knight, T. Bader, and the Ohio Division of Wildlife; D. Einhouse and the New York Department of Environmental Conservation; and R. Haas and the Michigan Division of Wildlife for collecting spawning walleye. The Ohio Division of Wildlife also provided zander. Ruffe were provided by J. Gunderson and Minnesota Sea Grant. Other samples were collected by J.E.F. and C.A.S. under Ohio scientific permits. Thanks also to N. Billington for help in identifying our zander sample. M. Chandler (C.A.S. laboratory) collected ruffe DNA sequence data and A. Hubers provided technical assistance. E. Bermingham, M. Coburn, T. Kocher, R. Wilson, and an anonymous reviewer provided constructive criticism of this manuscript. This work was supported by Ohio Sea Grant Project RF-726750 (1994), National Sea Grant Project RF-707294 (1995-1998), and Lake Erie Protection Fund Grant LEPF-07-94 (1995 to 1998) to C. Stepien.
References Alroy, J. 1994. Four permutation tests for the presence of phylogenetic structure. Syst. Biol. 43:430-437. Arnason, E., and Rand, D. M. 1992. Heteroplasmy of short tandem repeats in mitochondrial DNA of Atlantic cod, Gadus morhua. Genetics 132: 211- 220. Avise, J. C. 1994. "Molecular Markers, Natural History and Evolution." Chapman and Hall, New York. Avise, J. C., Arnold, J., Ball, R. M., Bermingham, E., Lamb, T., Neigel,
J. E., Reeb, C. A., and Saunders, N. C. 1987. Intraspecific phylogeography: The mitochondrial DNA bridge between population genetics and systematics. Annu. Rev. Ecol. Syst. 18: 489-522. Bailey, R. M., and Gosline, W. A., 1955. Variation and systematic significance of vertebral counts in the American fishes of the family Percidae. Misc. Publ. Mus. Zool. Univ. Michigan 93:1-44. Bailey, R. M., and Etner, D. A. 1988. Comments on the subgenera of darters (Percidae) with descriptions of two new species of Etheostoma (Ulocentra) from southeastern United States. Misc. Publ. Mus. Zool. Univ. Michigan 175:1-48. Bermingham, E., and Avise, J. C. 1986. Molecular zoogeography of freshwater fishes in the southeastern United States. Genetics 113: 939-965. Billington, N. 1993. Genetic variation in Lake Erie yellow perch (Perca flavescens) demonstrated by mitochondrial DNA analysis. J. Fish. Biol. 41:941-950. Billington, N., Barrette, R. J., and Hebert, P. D. N. 1992. Management implications of mitochondrial DNA variation in walleye stocks. N. Am. J. Fish. Mng. 12:276-284. Billington, N., Danzmann, R. G., Hebert, P. D. N., and Ward, R. D. 1991. Phylogenetic relationships among four members of Stizostedion (Percidae) determined by mitochondrial DNA and allozyme analyses. J. Fish. Biol. 39:251-258. Billington, N., and Hebert, P. D. N. 1988. Mitochondrial DNA variation in Great Lakes walleye (Stizostedion vitreum) populations. Can. J. Fish. Aquat. Sci. 45:643-654. Billington, N., and Hebert, P. D. N. 1991. Mitochondrial DNA diversity in fishes and its implications for introductions. Can. J. Fish. Aquat. Sci. 48:80-94. Billington, N., Hebert, P. D. N., and Ward, R. D. 1990. Allozyme and mitochondrial DNA variation among three species of Stizostedion (Percidae): Phylogenetic and zoogeographic implications. Can. J. Fish. Aquat. Sci. 47:1093-1102. Billington, N., and Strange, R. M. 1995.
Mitochondrial DNA analysis confirms the existence of a genetically divergent walleye population in northeastern Mississippi. Trans. Am. Fish. Soc. 124: 770-779. Bodaly, R. A. 1980. Pre- and post-spawning movements of walleye, Stizostedion vitreum, in southern Indian Lake, Manitoba. Can. Tech. Rep. Fish. Aquat. Sci. 931:1-30. Bremer, J. P. A., Baker, A. J., and Mejuto, J. 1995. Mitochondrial DNA control region sequences indicate extensive mixing of swordfish (Xiphias gladius) populations in the Atlantic Ocean. Can. J. Fish. Aquat. Sci. 52:1720-1732. Brown, G. G. 1986. Structural conservation and variation in the Dloop-containing region of vertebrate mitochondrial DNA. J. Mol. Biol. 192:503-511. Brown, J. R., Beckenbach, A. T., and Smith, M. J. 1993. Intraspecific DNA sequence variation of the mitochondrial control region of white sturgeon (Acipenser transmontanus). Mol. Biol. Evol. 10: 326-341. Brown, W. M. 1983. Evolution of animal mitochondrial DNA. In "Evolution of Genes and Proteins" (M. Nei and R. K. Koehn, eds.). Sinauer, Sunderland, MA. Brown, W. M., George, M., Jr., and Wilson, A. C. 1979. Rapid evolution of animal mitochondrial DNA. Proc. Natl. Acad. Sci. USA 76: 1967-1971. Buroker, N. E., Brown, J. R., Gilbert, T. A., O'Hara, P. J., Beckenbach, A. T., Thomas, W. K., and Smith, M. J. 1990. Length heteroplasmy of sturgeon mitochondrial DNA: An illegitimate elongation model. Genetics 124:157-163. Coburn, M. M., and Gaglione, J. I. 1992. A comparative study of Percid scales (Teleostei: Perciformes). Copeia 1992:986-1001. Collette, B. B. 1963. The subfamilies, tribes, and genera of the Percidae (Teleostei). Copeia 1963:615-623.
Collette, B. B., and Banarescu, P. 1977. Systematics and zoogeography of the fishes of the family Percidae. J. Fish. Res. Board Can. 34: 1450-1463. Doda, J. N., Wright, C. T., and Clayton, D. A. 1981. Elongation of displacement-loop strands in human and mouse mitochondrial DNA is arrested near specific template sequences. Proc. Natl. Acad. Sci. USA 78:6116-6120. Echelle, A. A., Echelle, A. F., Smith, M. H., and Hill, L. G. 1975. Analysis of genic continuity in a headwater fish, Etheostoma radiosum (Percidae). Copeia 1975:197-204. Excoffier, L., Smouse, P. E., and Quattro, J. M. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: Application to human mitochondrial DNA data. Genetics 131: 479-491. Faith, D. P. 1991. Cladistic permutation tests for monophyly and nonmonophyly. Syst. Zool. 40:366-375. Felsenstein, J. 1985. Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39:783-791. Ferguson, R. G., and Derksen, A. J. 1971. Migration of adult and juvenile walleyes (Stizostedion vitreum vitreum) in southern Lake Huron, Lake St. Clair, Lake Erie and connecting waters. J. Fish. Res. Board Can. 28:1133-1142. Grewe, P. M., Billington, N., and Hebert, P. D. N. 1990. Phylogenetic relationships among members of Salvelinus inferred from mitochondrial DNA divergence. Can. J. Fish. Aquat. Sci. 47:984-991. Hawley, G. J., D. H., Dehayes, and Labar, G. W. 1991. "Genetic Analysis of Lake Champlain and New York Walleye Populations." Report to Vermont Dept. of Fish and Wildlife. Hillis, D. A., and Huelsenbeck, J. P. 1992. Signal, noise, and reliability in molecular phylogenetic analyses. J. Hered. 83:189-195. Hoelzel, A. R., Hancock, J. M., and Dover, G. A. 1991. Evolution of the cetacean mitochondrial D-loop region. Mol. Biol. Evol. 8: 475493. Hoelzel, A. R., Lopez, J. V., Dover, G. A., and O'Brien, S. J. 1994. Rapid evolution of a heteroplasmic repetitive sequence in the mitochondrial DNA control region of carnivores. 
J. Mol. Evol. 39:191-199. Hultman, T., Stahl, S., Homes, E., and Uhlen, M. 1989. Direct solid phase sequencing of genomic and plasmid DNA using magnetic beads as solid support. Nucleic Acids Res. 17: 4937-4946. International Biotechnologies, Inc., Kodak. 1992. Assembly LIGN Sequence Assembly Software. Kocher, T. D., Thomas, W. K., Meyer, A., Edwards, S. V., Paabo, S., Villablanca, F. X., and Wilson, A. C. 1989. Dynamics of mitochondrial DNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86:6196-6200. Kocher, T. D., and Wilson, A. C. 1991. Sequence evolution of mitochondrial DNA in humans and chimpanzees: Control region and a protein-coding region. In "Evolution of Life" (S. Osawa and T. Honjo, eds.), pp. 391-413. Springer-Verlag, New York. Kumar, S., Tamura, K., and Nei, M. 1993. "MEGA, Molecular Evolutionary Genetics Analysis, Vers. 1.01." Institute of Molecular Evolutionary Genetics, The Pennsylvania State University, University Park, PA. Leary, R., and Booke, H. E. 1982. Genetic stock analysis of yellow perch from Green Bay and Lake Michigan. Trans. Am. Fish. Soc. 111:52-57. Lee, W., Conroy, J., Howell, W. H., and Kocher, T. D. 1995. Structure and evolution of teleost mitochondrial control regions. J. Mol. Evol. 41:54-66. Levinson, G., and Gutman, G. A. 1987. Slipped-strand mispairing: A major mechanism for DNA sequence evolution. Mol. Biol. EvoI. 4: 203-221. Maddison, W. P., and Maddison, D. R. 1992. "MacClade Version 3.0. Analysis of Phylogeny and Character Evolution." Sinauer, Sunderland, MA.
Margush, T., and McMorris, F. R. 1981. Consensus n-trees. Bull. Math. Biol. 43:239-244.
Matzura, R., and Wennborg, G. 1995. RNAdraw Version 1.01.
McElroy, D., Moran, P., Bermingham, E., and Kornfield, I. 1992. REAP: An integrated environment for the manipulation and phylogenetic analysis of restriction data. J. Hered. 83:157-158.
Mercker, R. J., and Woodruff, R. C. 1996. Molecular evidence for divergent breeding groups of walleye (Stizostedion vitreum) in tributaries to western Lake Erie. J. Great Lakes Res. 22:280-288.
Meyer, A., Kocher, T. D., Basasibwaki, P., and Wilson, A. C. 1990. Monophyletic origin of Lake Victoria cichlid fishes suggested by mitochondrial DNA sequences. Nature 347:550-553.
Moritz, C., Dowling, T. E., and Brown, W. M. 1987. Evolution of animal mitochondrial DNA: Relevance for population biology and systematics. Annu. Rev. Ecol. Syst. 18:269-292.
Nei, M. 1987. "Molecular Evolutionary Genetics." Columbia University Press, New York.
Nelson, J. S. 1994. "Fishes of the World." Wiley, New York.
Ney, J. J. 1978. A synoptic review of yellow perch and walleye biology. Am. Fish. Soc. Spec. Publ. 11:1-12.
Page, L. M. 1974. The subgenera of Percina (Percidae: Etheostomatini). Copeia 1974:66-86.
Page, L. M. 1981. The genera and subgenera of darters (Percidae, Etheostomatini). Occ. Pap. Mus. Nat. Hist. Univ. Kansas 90:1-69.
Page, L. M. 1983. "Handbook of Darters." TFH Publications, Inc., Champaign, IL.
Page, L. M. 1985. Evolution of reproductive behaviors in Percid fishes. Ill. Nat. Hist. Surv. 33:275-295.
Page, L. M., and Whitt, G. S. 1973a. Lactate dehydrogenase isozymes, malate dehydrogenase isozymes and tetrazolium oxidase mobilities of darters (Etheostomatini). Comp. Biochem. Physiol. B 44:611-623.
Page, L. M., and Whitt, G. S. 1973b. Lactate dehydrogenase isozymes of darters and the inclusiveness of the genus Percina. Ill. Nat. Hist. Surv. 82:1-7.
Pielou, E. C. 1991. "After the Ice Age: The Return of Life to Glaciated North America." University of Chicago Press, Chicago, IL.
Rand, D. M., and Harrison, R. G. 1986. Mitochondrial DNA transmission genetics in crickets. Genetics 114:955-970.
Roff, D. A., and Bentzen, P. 1989. The statistical analysis of mitochondrial DNA polymorphisms: χ2 and the problem of small samples. Mol. Biol. Evol. 6:539-545.
Saccone, C., Pesole, G., and Sbisa, E. 1991. The main regulatory region of mammalian mitochondrial DNA: Structure-function model and evolutionary pattern. J. Mol. Evol. 33:83-91.
Saitou, N., and Nei, M. 1987. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.
Sanger, F., Nicklen, S., and Coulson, A. R. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.
Shulman, M. J., and Bermingham, E. 1995. Early life histories, ocean currents, and the population genetics of Caribbean reef fishes. Evolution 49:897-910.
Simon, T. P., and Vondruska, J. T. 1989. Larval identification of the ruffe, Gymnocephalus cernuus (Linnaeus) (Percidae: Percini), in the St. Louis River estuary, Lake Superior drainage basin, Minnesota. Can. J. Zool. 69:436-441.
Simons, A. M. 1989. "The Phylogenetic Relationships of the Sand Darters (Teleostei: Percidae)." Masters thesis, University of Kansas, Lawrence, KS.
Simons, A. M. 1992. Phylogenetic relationships of the Boleosoma species group (Percidae: Etheostoma). In "Systematics, Historical Ecology, and North American Freshwater Fishes" (R. L. Mayden, ed.), pp. 268-292. Stanford University Press, Stanford, CA.
Southern, S. O., Southern, P. J., and Dizon, A. E. 1988. Molecular and phylogenetic studies with a cloned dolphin mitochondrial genome. J. Mol. Evol. 28:32-42.
Stepien, C. A. 1995. Population genetic divergence and geographic patterns from DNA sequences: Examples from marine and freshwater fishes. In "Evolution and the Aquatic Ecosystem: Defining Unique Units in Population Conservation." (J. U Nielson, ed.), pp. 263-287. American Fisheries Society, Bethesda, MD.
Stepien, C. A., Dixon, M. T., and Hillis, D. M. 1993. Evolutionary relationships of the fish families Clinidae, Labrisomidae, and Chaenopsidae: Congruence between DNA sequence and allozyme data. Bull. Mar. Sci. 52:873-896.
Strittholt, J. R., Guttman, S. I., and Wissing, T. E. 1988. Low levels of genetic variability of yellow perch (Perca flavescens) in Lake Erie and selected impoundments. In "The Biogeography of the Island Region of Western Lake Erie." (J. F. Downhower, ed.). Ohio State University Press, Columbus, OH.
Sturmbauer, C., and Meyer, A. 1992. Genetic divergence, speciation and morphological stasis in a lineage of African cichlid fishes. Nature 358:578-581.
Sturmbauer, C., and Meyer, A. 1993. Mitochondrial phylogeny of the endemic mouthbrooding lineages of cichlid fishes from Lake Tanganyika in Eastern Africa. Mol. Biol. Evol. 10:751-768.
Swofford, D. L. 1993. PAUP (Phylogenetic Analysis Using Parsimony) vers. 3.1. for Macintosh Computers. Ill. Nat. Hist. Surv., Champaign, IL.
Swofford, D. L. 1996. "PAUP* (Phylogenetic Analysis Using Parsimony): Preliminary Test Version." Sinauer, Sunderland, MA.
Tamura, K., and Nei, M. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512-526.
Thomas, W. K., Withler, R. E., and Beckenbach, A. T. 1986. Mitochondrial DNA analysis of Pacific salmonid evolution. Can. J. Zool. 64:1058-1064.
Titus, T. A., and Larson, A. 1995. A molecular phylogenetic perspective on the evolutionary radiation of the salamander family Salamandridae. Syst. Biol. 44:125-151.
Todd, T. N. 1990. "Genetic Differentiation of Walleye Stocks in Lake St. Clair and Western Lake Erie." U.S. Department of the Interior, Fish and Wildlife Service, Fish and Wildlife Technical Report 28:1-19.
Uhlen, M. 1989. Magnetic separation of DNA. Nature 340:733-734.
Wakely, J. 1993. Substitution rate variation among sites in hypervariable region I of human mitochondrial DNA. J. Mol. Evol. 37:613-623.
Walberg, M. W., and Clayton, D. A. 1981. Sequence and properties of the human KB cell and mouse L cell D-loop regions of mitochondrial DNA. Nucleic Acids Res. 9:5411-5420.
Ward, R. D., Billington, N., and Hebert, P. D. N. 1989. Comparison of allozyme and mitochondrial variation in populations of walleye, Stizostedion vitreum. Can. J. Fish. Aquat. Sci. 46:2074-2084.
Weir, B. S., and Cockerham, C. C. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38:1358-1370.
White, M. M. 1993. "An Evaluation of the Genetic Integrity of Ohio River Walleye and Sauger Stocks." Report to the Ohio Department of Natural Resources.
Wiley, E. O. 1992. Phylogenetic relationships of the Percidae (Teleostei: Perciformes): A preliminary hypothesis. In "Systematics, Historical Ecology, and North American Freshwater Fishes" (R. L. Mayden, ed.), pp. 247-267. Stanford University Press, Stanford, CA.
Wilson, A. C., Cann, R. L., Carr, S. M., George, M., Jr., Gyllensten, U. B., Helm-Bychowski, K. M., Higuchi, R. G., Palumbi, S. R., Prager, E. M., Sage, R. D., and Stoneking, M. 1985. Mitochondrial DNA and two perspectives on evolutionary genetics. Biol. J. Linn. Soc. 26:375-400.
Wiseman, E. D., Echelle, A. A., and Echelle, A. F. 1978. Electrophoretic evidence for subspecific differentiation and intergradation in Etheostoma spectabile (Teleostei: Percidae). Copeia 1978:320-327.
Zhu, D., Jamieson, B. G. M., Hugall, A., and Moritz, C. 1994. Sequence evolution and phylogenetic signal in control-region and cytochrome b sequences of rainbow fishes (Melanotaeniidae). Mol. Biol. Evol. 11:672-683.
CHAPTER 10

Phylogenetic Relationships among the Salmoninae Based on Nuclear and Mitochondrial DNA Sequences

RUTH B. PHILLIPS and TODD H. OAKLEY
Department of Biological Sciences
University of Wisconsin-Milwaukee
Milwaukee, Wisconsin 53201
I. Introduction

Clarification of salmonid relationships is important for conservation of these fishes, many of which are threatened or endangered. In this chapter, major questions concerning the relationships of the fishes in the subfamily Salmoninae are reviewed and conflicting data are evaluated. Trees based on recently obtained molecular data from sequencing specific nuclear and mitochondrial genes are presented and explanations for conflicts are discussed. Assuming that the consensus tree based on molecular data is correct, the implications for the evolution of chromosomes and life history traits are considered.

Salmonid fishes have been one of the most intensively studied fish groups. There are three subfamilies of salmonid fishes: Coregoninae (whitefishes and ciscoes), Thymallinae (graylings), and Salmoninae (lenoks, huchen, trouts, chars, and salmons). These fishes have been the subject of many systematic studies using morphological (Behnke, 1989; Sanford, 1990; Stearley and Smith, 1993), karyological (Hartley, 1987; Cavender, 1984; Cavender and Kimura, 1989; Phillips et al., 1989), ontogenetic (Pavlov, 1980; Kendall and Behnke, 1984), and allozyme markers (Utter and Allendorf, 1994; Crane et al., 1994). Molecular data including sequences from both mitochondrial and nuclear genes have been used in phylogenetic analyses of Oncorhynchus (Thomas et al., 1986; Shedlock et al., 1992; Domanico and Phillips, 1995; McKay et al., 1996) and Salvelinus (Grewe et al., 1990; McVeigh and Davidson, 1991; Pleyte et al., 1992; Phillips et al., 1994). Analysis of these molecular data has resulted in clarification of the species relationships in these two genera.

Despite all of these studies, major questions remain concerning the relationships among genera, species, subspecies, and populations of these fishes. Salmonid relationships are especially problematic for systematists for the following reasons. First, these fishes underwent a rapid adaptive radiation following tetraploidization around 50-100 million years ago (Allendorf and Thorgaard, 1984). Rapid radiations are characterized by star phylogenies, and branches are supported by only a few shared derived characters (synapomorphies). Second, hybridization and introgression appear to have been quite common in this group (reviewed in Utter and Allendorf, 1994), leading to inconsistencies between maternally inherited characters [such as mitochondrial (mt) DNA] and characters that are biparentally inherited. Finally, recolonization of lakes released from glaciation within the past 10,000 years has resulted in assemblages of sympatric morphotypes in different degrees of reproductive isolation in different northern lakes; these are especially
Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved.
TABLE I  Scientific and Common Names of Fishes of the Subfamily Salmoninae

Scientific name          Common name         Range
Brachymystax lenok       Lenok               Russia
Hucho hucho hucho        Danube salmon       Danube River
H. hucho taimen          Taimen              Russia
H. perryi                Huchen              Japan
Salvelinus fontinalis    Brook trout         North America
S. namaycush             Lake trout          North America
S. alpinus               Arctic char         Circumpolar regions
S. malma                 Dolly Varden char   North Pacific
S. confluentus           Bull trout          North America
S. leucomaenis           Spotted char        Japan, Russia
Salmo trutta             Brown trout         Europe
S. salar                 Atlantic salmon     Atlantic Ocean
Salmothymus              Adriatic salmon     Eastern Europe
Oncorhynchus mykiss      Rainbow trout       Western North America, Asia
O. clarki                Cutthroat trout     Western North America
O. tshawytscha           Chinook salmon      North Pacific
O. kisutch               Coho salmon         North Pacific
O. keta                  Chum salmon         North Pacific
O. nerka                 Sockeye salmon      North Pacific
O. gorbuscha             Pink salmon         North Pacific
common in fishes of the genus Salvelinus (reviewed in Savvaitova, 1989, 1995), leading to a large number of species and subspecies being named in this genus.

A. Intergeneric Relationships

1. Taxonomy and Status of Proposed Genera

The subfamily Salmoninae contains between five and nine extant genera. There is general agreement regarding the status of five genera: Brachymystax, Hucho, Salvelinus, Salmo, and Oncorhynchus (Table I). Four other genera, Acantholingua, Parahucho, Salmothymus, and Salvethymus, have been proposed, but a consensus has not been reached with respect to their status. All of the disputed genera have been erected for single species with restricted ranges with the exception of Salmothymus, which has been proposed to include from one (Hadzisce, 1961) to as many as five species (Stearley and Smith, 1993). The competing hypotheses for the disputed genera are listed in Table II.

TABLE II  Summary of Disputed Generic Classifications

Disputed genus: Salvethymus
  Specific name: svetovidovi
  Range: Endemic to the Siberian Lake Elgygytgyn
  Hypotheses:
    1. Salvethymus is a monotypic genus (Chereshnev and Skopets, 1992)
    2. Salvethymus is a subspecies of Salvelinus derived from a sympatric form of Arctic char (Chereshnev and Skopets, 1992; Glubokovsky et al., 1992; reviewed in Glubokovsky and Frolov, 1994)

Disputed genus: Salmothymus
  Specific name: From one (obtusirostris) to as many as five species
  Range: The species obtusirostris is native to rivers in Bosnia-Herzegovina which flow into the Adriatic Sea
  Hypotheses:
    1. Salmothymus is a monotypic genus (Hadzisce, 1961)
    2. Salmothymus is a subgenus of Salmo (Behnke, 1968)
    3. Salmothymus is a genus containing two species: S. ohridana and S. (Acantholingua) obtusirostris (Svetovidov, 1975)
    4. Salmothymus is a genus containing as many as five species (Stearley and Smith, 1993)

Disputed genus: Acantholingua
  Specific name: ohridana
  Range: Endemic to Lake Ohrid in Macedonia
  Hypotheses:
    1. Acantholingua is a monotypic genus (Hadzisce, 1961)
    2. Acantholingua is a subgenus of Salmo (Behnke, 1968)
    3. Acantholingua is a subgenus of Salmothymus (Svetovidov, 1975)

Disputed genus: Parahucho
  Specific name: perryi
  Range: Japan
  Hypotheses:
    1. Parahucho is a subgenus of Hucho (Vladykov, 1963)
    2. Recognition of Parahucho is not necessary, and perryi is a member of the genus Hucho (Jordan and Snyder, 1902; Stearley and Smith, 1993)
    3. Parahucho is a monotypic genus (Viktorovsky et al., 1985; Dorofeyeva, 1989; Phillips et al., 1995a)

As a result of recent genetic data (Chereshnev and Skopets, 1992), a consensus has been attained regarding the position of Salvethymus. Salvethymus svetovidovi is an endemic species discovered recently in Lake Elgygytgyn (Chereshnev and Skopets, 1992). Despite its unusual primitive morphology, which led initially to its being placed in a new genus, genetic distances based on allozyme data suggest that it is very closely related to the two forms of Arctic char found in the same lake and it should be considered a subgenus of Salvelinus (reviewed in Glubokovsky and Frolov, 1994; Savvaitova, 1995). The karyotype of S. svetovidovi is highly derived (Frolov, 1992, 1993), which also contradicts the original hypothesis that the fish is more primitive than other extant species of Salvelinus.

The status of another controversial genus, Parahucho, has also been the subject of osteological (reviewed in Holcik, 1988; Dorofeyeva, 1989), chromosomal (Kang and Park, 1973; Viktorovsky et al., 1985; Rab and Liehman, 1982), and allozyme (Osinov, 1991) studies. This controversy involves the relationships of three species: Brachymystax lenok, Hucho hucho sp., and Hucho perryi or Parahucho perryi, referred to as perryi in the following discussion. The three hypotheses are illustrated in Fig. 1.

FIGURE 1  Three hypotheses regarding the relationships among Brachymystax lenok, Hucho hucho, and H. (or Parahucho) perryi. (A) H. hucho and perryi are closely related sibling species. Parahucho is not necessary and perryi should be H. perryi (Smith and Stearley, 1993). (B) H. hucho and perryi are diverged enough that perryi can be placed in its own genus. (C) H. hucho and B. lenok are sister species, so including perryi in Hucho makes Hucho paraphyletic.

The first possibility (Fig. 1A) is that perryi and H. hucho are closely related species and Parahucho is not a valid genus (Stearley and Smith, 1993). The other two alternatives support the legitimacy of Parahucho. If perryi is not closely related to H. hucho, as in Fig. 1B, it could validate the existence of Parahucho. The third possibility (Fig. 1C) is that B. lenok and H. hucho are more closely related than are H. hucho and perryi. If this is the case, placing perryi in the genus Hucho would make Hucho a paraphyletic taxon. Placing perryi in Parahucho would prevent this unnatural grouping.

Both karyotypic and allozyme data support the third hypothesis. Osinov (1991) calculated genetic distance based on allozymes and found that B. lenok and H. hucho are the closest pair (Table III). Karyotypic data also support this hypothesis because perryi has a very different karyotype (2n = 62) compared with H. hucho (2n = 82-84) and B. lenok (2n = 92), which share karyotypic features not present in perryi (reviewed in Rab et al., 1994). H. hucho is divided into two major subspecies: H. hucho hucho from the Danube in eastern Europe
and H. hucho taimen from Siberia (reviewed in Holcik, 1982a,b). The presence of natural hybrids between H. hucho taimen and B. lenok in the Amur river system reported by Hsieh et al. (1959) is also consistent with a close relationship between these two species.

2. Molecular Data on Disputed Genera

Molecular data are available only for the problem of the controversial genus Parahucho. Collection of fresh specimens of the other disputed genera has been difficult because they are located in remote areas. Museum specimens are available for all of them and could be the focus of future studies using ancient DNA techniques. Data relevant to the Parahucho problem are derived from two nuclear genes: ribosomal DNA (rDNA) and intron C of growth hormone 2 (GH-IIC), as well as mitochondrial DNA restriction fragment length polymorphisms (RFLPs) (Table III). Restriction maps of the ribosomal DNA from B. lenok, H. hucho taimen, and perryi were prepared and a phylogenetic analysis of these data was done (Phillips et al., 1995a) using maximum parsimony as implemented in PAUP (Swofford, 1993). These results (Fig. 2) support the placement of perryi in

TABLE III  Genetic Distances among the Huchonini
lenok/ lenok/ lenok/ taimen/ taimen/ hucho/ blunt/ taimen hucho perryi hucho perryi perryi sharp
Allozyme (a)   mtDNA RFLP (b)   rDNA RFLP (c)   GH2 C (d, e)   GH2 C (d, f)
0.335 3.25 1.97 2.45 1.77
2.46 1.77
0.891 7.08 3.89 4.49 4.50
1.11 1.32
0.755 6.54 2.68 3.57 3.58
0.103 2.24 3.57 3.58
(a) Data from Osinov (1991).
(b) Data from Shedko and Ginatulina (1993).
(c) Data from Phillips et al. (1995a).
(d) Growth hormone 2 intron C data from T. H. Oakley and R. B. Phillips (manuscript in preparation).
(e) Distances were calculated including the 80-bp insertion.
(f) Distances were calculated without the 80-bp insertion.
FIGURE 2  Relationships among Brachymystax, Hucho, and Salvelinus based on maximum parsimony analysis of molecular data from RFLPs of ribosomal DNA. Majority-rule consensus cladogram based on 500 bootstrap replications using the branch and bound search option of PAUP. Numbers represent bootstrap percentages (Phillips et al., 1995a). Taxa shown: Thymallus arcticus, Hucho hucho taimen, Brachymystax lenok, Parahucho perryi, Salvelinus leucomaenis, Salvelinus fontinalis, and Salvelinus namaycush.
a separate genus Parahucho and suggest that B. lenok is a sister species to H. hucho taimen. Evidence supporting the genus Parahucho was also obtained from analysis of the sequences of GH-IIC. B. lenok and the two H. hucho subspecies have an 80-bp insertion at base pair 28, which is not found in any other salmonids, including perryi (T. H. Oakley and R. B. Phillips, manuscript in preparation). Genetic distances based on allozymes, mtDNA RFLPs, nuclear rDNA RFLPs, and the nuclear GH-IIC all support the placement of perryi in a separate genus, Parahucho (Table III).
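The distance argument above can be illustrated with a minimal Python sketch. It assumes that the three five-value columns of Table III correspond, in order, to the lenok/taimen, lenok/perryi, and taimen/perryi comparisons, and that the five values in each column follow the row order of the table; that reading of the table layout is an assumption, not something the text states. The helper name `closest_pair` is hypothetical.

```python
# Distances read from Table III under the column assignment described above
# (an assumption about the table layout, not a statement from the authors).
DISTANCES = {
    "allozyme": {
        ("lenok", "taimen"): 0.335,
        ("lenok", "perryi"): 0.891,
        ("taimen", "perryi"): 0.755,
    },
    "mtDNA RFLP": {
        ("lenok", "taimen"): 3.25,
        ("lenok", "perryi"): 7.08,
        ("taimen", "perryi"): 6.54,
    },
    "rDNA RFLP": {
        ("lenok", "taimen"): 1.97,
        ("lenok", "perryi"): 3.89,
        ("taimen", "perryi"): 2.68,
    },
    "GH2 intron C, with 80-bp insertion": {
        ("lenok", "taimen"): 2.45,
        ("lenok", "perryi"): 4.49,
        ("taimen", "perryi"): 3.57,
    },
    "GH2 intron C, without 80-bp insertion": {
        ("lenok", "taimen"): 1.77,
        ("lenok", "perryi"): 4.50,
        ("taimen", "perryi"): 3.58,
    },
}


def closest_pair(pairwise):
    """Return the species pair with the smallest distance in one data set."""
    return min(pairwise, key=pairwise.get)


# For each data set, report which pair is closest.
closest = {dataset: closest_pair(pairs) for dataset, pairs in DISTANCES.items()}
for dataset, pair in closest.items():
    print(f"{dataset}: closest pair is {pair[0]}/{pair[1]}")
```

Under this reading, every data set ranks lenok/taimen as the closest pair and both comparisons involving perryi as more distant, which is the pattern the text cites in favor of Parahucho.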
3. Relationships among Genera

There is a consensus that the genus Oncorhynchus is the most derived genus in the subfamily Salmoninae, followed by the genus Salmo and then the genus Salvelinus (Fig. 3). Four major hypotheses have been proposed for the relationships among Brachymystax, Hucho, and Salvelinus (Fig. 3): (1) each genus is monophyletic and branches off separately from the main stem (Fig. 3A, Norden, 1961); (2) all three genera form a monophyletic group (Fig. 3B, Kendall and Behnke, 1984); (3) Brachymystax, Hucho, and Parahucho form a monophyletic group, and Salvelinus is on a separate branch (Fig. 3C, Dorofeyeva, 1989); and (4) Brachymystax is on the most basal branch, and Hucho (including perryi) and Salvelinus form a separate monophyletic group (Fig. 3D, Stearley and Smith, 1993). Part of the confusion probably stems from the fact that some of the investigators
FIGURE 3  Four major hypotheses for relationships among Brachymystax, Hucho, and Salvelinus: (A) Norden (1961), (B) Kendall and Behnke (1984), (C) Dorofeyeva (1989), and (D) Stearley and Smith (1993).
examined only one or two of the species within the genus Hucho, and the species in this genus do not appear to be a monophyletic group as described earlier.
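The monophyly argument can be made concrete with a small sketch. Rooted trees are written here as nested Python tuples, and a group counts as monophyletic when its members are exactly the leaves under some node. The two topologies encode Fig. 1A and Fig. 1C as described in the caption; the position of perryi as sister to the lenok + hucho pair in the second tree is an assumption, and the function names are hypothetical.

```python
def clade_sets(tree):
    """Return (leaves under this node, set of all clades) for a nested-tuple tree."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
        return leaves, {leaves}
    leaves, clades = frozenset(), set()
    for child in tree:
        child_leaves, child_clades = clade_sets(child)
        leaves |= child_leaves
        clades |= child_clades
    clades.add(leaves)
    return leaves, clades


def is_monophyletic(tree, group):
    """True if `group` is exactly the leaf set of some node in the rooted tree."""
    _, clades = clade_sets(tree)
    return frozenset(group) in clades


# Fig. 1A: perryi is sister to H. hucho.
FIG_1A = ("outgroup", ("B. lenok", ("H. hucho", "perryi")))
# Fig. 1C: B. lenok is sister to H. hucho (placement of perryi assumed).
FIG_1C = ("outgroup", (("B. lenok", "H. hucho"), "perryi"))

hucho_with_perryi = {"H. hucho", "perryi"}
print(is_monophyletic(FIG_1A, hucho_with_perryi))  # genus Hucho including perryi
print(is_monophyletic(FIG_1C, hucho_with_perryi))
```

On the Fig. 1A topology a genus Hucho that includes perryi is a clade; on the Fig. 1C topology it is not, which is why placing perryi in Parahucho avoids an unnatural grouping.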
4. Molecular Data on Relationships among Genera

Molecular data relevant to intergeneric relationships have been obtained from sequences of the GH-IIC (T. H. Oakley and R. B. Phillips, manuscript in preparation). Salmonid fishes have at least two unlinked growth hormone genes (GH-I and GH-II). There are six exons and five introns in the GH genes. The two largest introns are C and D, which average 750 bp for GH-IC, 450 bp for GH-IIC, 1100 bp for GH-ID, and 1300 bp for GH-IID (Devlin, 1993; Blackhall, 1994). The GH-IIC intron was sequenced from the two morphotypes of B. lenok, the two subspecies of Hucho (H. hucho hucho and H. hucho taimen), P. perryi, three species of Salvelinus, Salmo trutta, Oncorhynchus kisutch, and Oncorhynchus gorbuscha. These sequences were combined with previously published GH-IIC sequences from Salmo salar (Johansen et al., 1989), five other species of Oncorhynchus (Devlin, 1993; Du et al., 1993), and two sex-linked pseudogenes from O. kisutch and O. tshawytscha (Du et al., 1993; Forbes et al., 1994) and were analyzed with maximum parsimony using the PAUP program. The tree based on this analysis is shown in Fig. 4. This analysis shows that 13 synapomorphies support the placement of P. perryi with Salvelinus, Salmo, and Oncorhynchus and suggests that these other genera radiated rapidly. The additional GH intron sequences should contain enough informative sites to resolve these relationships.

Although rDNA RFLP data are available for almost all of the species in the Salmoninae, there are very few informative sites for intergeneric relationships (Phillips et al., 1992, 1995a,b). Maximum parsimony analysis of the ITS1 rDNA sequences produced one most parsimonious tree in which Salmo and Oncorhynchus form one clade and Parahucho is on a branch between this clade and Salvelinus (Fig. 5).

FIGURE 4  Intergeneric relationships based on maximum parsimony analysis of molecular data from the GH-IIC sequences. Strict consensus tree of the 12 most parsimonious trees obtained using the branch and bound search option of PAUP. Numbers represent synapomorphies. Taxa shown: Brachymystax lenok (sharp-snouted form), Brachymystax lenok (blunt-snouted form), Hucho hucho taimen, Hucho hucho hucho, Salvelinus namaycush, Salvelinus alpinus, Parahucho perryi, Salmo trutta, Salmo salar, Oncorhynchus kisutch pseudogene, Oncorhynchus tshawytscha pseudogene, Oncorhynchus clarki, Oncorhynchus mykiss, Oncorhynchus kisutch, Oncorhynchus tshawytscha, Oncorhynchus nerka, Oncorhynchus gorbuscha, and Oncorhynchus keta.
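The figure captions in this chapter refer to strict and majority-rule consensus trees. As a rough illustration of the difference, the sketch below keeps the clades found in every input tree (strict) or in more than half of them (majority-rule). The three input trees use hypothetical taxa A to D and are toy data, not the trees analyzed here; the function names are likewise hypothetical, and this is not a description of how PAUP computes its consensus.

```python
from collections import Counter


def clades(tree):
    """Return the set of leaf sets (clades) of a rooted nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def strict_consensus(trees):
    """Clades present in every tree."""
    return set.intersection(*(clades(t) for t in trees))


def majority_rule(trees):
    """Clades present in more than half of the trees."""
    counts = Counter(c for t in trees for c in clades(t))
    return {c for c, n in counts.items() if n > len(trees) / 2}


def informative(clade_set, n_taxa):
    """Drop single-taxon clades and the all-taxa clade, which every tree shares."""
    return {c for c in clade_set if 1 < len(c) < n_taxa}


# Hypothetical toy trees on four taxa.
TREES = [
    ((("A", "B"), "C"), "D"),
    ((("A", "B"), "D"), "C"),
    (("A", ("B", "C")), "D"),
]

print(informative(strict_consensus(TREES), 4))  # no grouping survives in all trees
print(informative(majority_rule(TREES), 4))     # groupings found in 2 of 3 trees
```

A strict consensus therefore collapses any node on which the input trees disagree, whereas a majority-rule consensus retains groupings that most, but not all, of the trees support.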
FIGURE 5  Intergeneric relationships based on maximum parsimony analysis of molecular data from the rDNA ITS1. Majority-rule consensus tree based on 500 bootstrap replications using the branch and bound search option of PAUP. Numbers represent bootstrap percentages. Taxa shown: Coregonus, Brachymystax, Salvelinus, Parahucho, Salmo, and Oncorhynchus.

B. Relationships among Chars of the Genus Salvelinus

1. Summary of Morphological, Karyological, and Allozyme Data

Behnke (1980) and Cavender (1980) have proposed that the genus Salvelinus includes six major morphologically distinct species. Three major lineages in North America have been designated as subgenera by Behnke (1980) (Fig. 6A). These are the subgenus Cristovomer with one species, S. namaycush (lake trout), confined to lakes in northern North America; the subgenus Baione with one species, S. fontinalis (brook trout), found in streams in eastern and southern North America; and the Arctic char complex, named subgenus Salvelinus. The subgenus Salvelinus includes S. alpinus (Arctic char) with a circumpolar distribution in the Arctic, S. malma (Dolly Varden char) which occurs sympatrically with S. alpinus in the North Pacific, and S. confluentus (bull trout) which is found in the North American Rocky Mountains. Initially, S. confluentus was confused with S. malma because they are very similar in external appearance, but a detailed osteological analysis by Cavender (1978, 1980) showed that they are distinct species. In North America, S. malma is found along the coast from the Olympic peninsula through northern Alaska and S. confluentus is found inland from southern Washington to the Yukon Territory (Haas and McPhail, 1991).

In eastern Asia, another distinct species, S. leucomaenis (Japanese char), was considered to be more closely related to S. namaycush based on morphology and life history data by Savvaitova (1980) and Viktorovsky (1975), but was placed in the subgenus Salvelinus by Behnke (1965). Many additional char species and subspecies have been named by various Russian researchers (Glubokovsky, 1976; Glubokovsky and Chereshnev, 1982; Glubokovsky and Frolov, 1994), but Savvaitova (1980, 1995) makes a convincing argument that most of these are recently derived forms of the S. alpinus-S. malma complex. Savvaitova (1995) suggests that Salvethymus svetovidovi from Lake Elgygytgyn should be recognized as an additional species in the genus Salvelinus.

The description of a fossil (Salvelinus larsoni, formerly Paleolox larsoni) with characteristics of both Parahucho and Salvelinus (Smith et al., 1982) suggested that P. perryi might be a good outgroup for the phylogenetic analysis of Salvelinus species. A close relationship has been confirmed by genetic distances based on allozymes and molecular data. A cladistic analysis of Salvelinus using osteological characters (Stearley, 1992) has produced the cladogram shown in Fig. 6B. There is weak support for most of the nodes, and the placement of S. namaycush and S. confluentus as sister species is not supported by other data. In fact, a different cladogram for Salvelinus was obtained by Cavender and Kimura (1989) based on osteological data which places S. confluentus as a sister species with S. leucomaenis from Asia (Fig. 6C). Cavender and Kimura (1989) obtained the same cladogram for Salvelinus using karyotype data as they did with osteological data, and this was similar to the one obtained by Phillips et al. (1989) using karyological data. Crane et al. (1994) completed a comprehensive study of allozymes in this genus and analyzed data using the Fitch-Margoliash algorithm (Fig. 6D). This study has confirmed a close relationship between S. malma and S. alpinus and between S. leucomaenis and S. confluentus. S. fontinalis was closest to the outgroup, P. perryi, followed by S. namaycush.

FIGURE 6  Suggested relationships among chars of the genus Salvelinus: (A) Behnke (1984), (B) Stearley (1990), (C) Cavender and Kimura (1989), and (D) Crane et al. (1994).
2. Molecular Data from Mitochondrial Genes

Trees based on data from mitochondrial genes differ from those based on allozymes and morphological data. The analysis based on restriction site data of the mtDNA molecule (Grewe et al., 1990) found a close relationship among S. alpinus, S. malma, and S. confluentus, but it did not include the Asian species S. leucomaenis. McVeigh and Davidson (Memorial University, Newfoundland, personal communication) have examined 289 bp of the cytochrome b gene in all six species with P. perryi as an outgroup. Their data suggest that S. malma, S. alpinus, and S. confluentus are very closely related (1- to 2-bp differences), but none of them are closely related to the Asian species S. leucomaenis (13-bp differences from the others). The 451-bp mtND3 gene has been sequenced from all of the Salvelinus species, including one member of each of the subspecies in the S. alpinus-S. malma (Arctic char-Dolly Varden) complex (Phillips et al., 1995b). The ND3 gene is evolving faster than cytochrome b in all of the Salvelinus species and P. perryi (Phillips et al., 1995b). However, almost all of the nucleotide differences among species within the genus are transition mutations, suggesting that the mtDNA is not very diverged. Phylogenetic analysis of the ND3 sequence data confirms the close relationship among S. malma, S. alpinus, and S. confluentus, but S. namaycush is located on a branch next to S. alpinus. A maximum parsimony analysis of all of the available mtDNA data produced a tree in which most of the branches were unresolved, suggesting that recent genetic contact has occurred among these species. In the neighbor-joining tree based on these combined mtDNA data (Fig. 7A), bull trout is placed within the S. alpinus-S. malma complex. These results conflict with those obtained from allozymes and nuclear DNA sequences from the ribosomal DNA spacers (see later).

FIGURE 7  Relationships among species of the genus Salvelinus based on molecular data. (A) Neighbor-joining tree calculated using MEGA based on the two-parameter distance of Kimura (1980) from the combined sequences of the mtND3 gene and a portion of the cytochrome b gene (data from Domanico and Phillips, 1995; H. P. McVeigh and W. S. Davidson, personal communication). Numbers represent bootstrap percentages based on 500 replications. (B) Majority-rule cladogram based on a maximum parsimony analysis of the sequences from the ITS1 and ITS2 of the nuclear ribosomal DNA (Phillips et al., 1994) using the branch and bound search option of PAUP. Numbers represent bootstrap values based on 500 replications.

3. Molecular Data from Nuclear Genes
When the internal transcribed spacers of the ribosomal DNA were sequenced from the six species of Salvelinus, they were found to be more divergent than the mtDNA ND3 gene in these same species (Phillips et al., 1994; Table IV). These sequences were easy to align because the sequence divergence between any two species pairs did not exceed 7%. Most of the length variation was produced by runs of G's and C's which show
TABLE IV
Pairwise Nucleotide Differences in Selected Genes in Salmonid Fishes
Selected Genesa Species Pink/sockeye Rainbow/sockeye Hucho/lake trout Lake trout/Arctic char Arctic char-NWT/Alaska
mtDNA ND3b mtDNAcyt b c 0.091 0.117 0.111 0.034 0.017
0.048 0.093 0.070 0.025 0.010
rDNAITS1-2~,e 0.024, 0.062 0.037, 0.073 0.87, 0.137 0.050, 0.065 0.03
GH-IIC~,/ 0.020, 0.017 0.028, 0.027 0.020, 0.037 0.02, 0.03
amtDNA, mitochondrial DNA; rDNA ITS1-2, ribosomal DNA internal transcribed spacers 1 and 2; GH2, growth hormone 2. bData from Thomas and Beckenbach (1989) and Domanico and Phillips (1995). 28 0, ossified; 1, cartilagenous 0, free; 1, embedded, leading to 90~ angle of lower mandible 0, mode ~15; 1, mode ->15 0, absent; 1, present 0, absent; 1, present 0, absent; 1, present 0, without ventral extension; 1, with ventral extension 0, nominal; 1, enlarged and weakly striated 0, nominal; 1, enlarged and weakly striated 0, unicuspid; 1, tricuspid; 2, bicuspid; 3, absent
ALEX PARKER
APPENDIX 11: Morphological data employed in analysis of cyprinodontoid relationships Character
Cynolebias whitei
Rivulus sp.
R. harti
Nothobranchius melanospilus
Profundulus guatemalensis
Fundulus heteroclitus
Poecilia caucana
Xiphophorus maculatus
X. signum
Cnesterodon decemmaculatus
Tomeurus gracilis
Aplocheilichthys kassenjiensis
A. spilauchen
Fluviphylax pygmaeus
Anableps anableps
Jenynsia lineata
Crenichthys baileyi
Zoogoneticus quitzeoensis
Xenotoca eiseni
Cubanichthys pengellyi
Cyprinodon rubrofluviatilis
Jordanella floridae
1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0
1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0
0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 1 1 1
1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0
1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 2 2 2 2 2 2 2 2 2 2 1 1 1 3 3 3
0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 1 1 1 3 3 3 3 3 2 2 3 3 3 0 0 0 0 0 0
0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
0 0 0 0 0 0 0 2 2 2 2 2 0 0 0 0 1 1 1 1 1 1
0 0 0 0 0 0 2 2 2 2 2 2 2 2 2 2 1 1 1 2 2 2
0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0
22. Combined Data of Cyprinodontiformes

Characters 31-73 and 88
0 0 0 0 0 0 0 0 0 0 1 2 2
0 0 0 0 0 0 1 1 1 i 1 1 1
0 0 0 0 0 0 0 0 0 o 1 0 0
0 0 0 0 0 0 0 0 0 ~ 0 0 0
0 0 0 0 0 0 0 0 0 o 0 0 0
1 1 1 1 1 1 1 1 1 i 1 0 0
0 0 0 0 0 0 0 0 0 o 0 0 0
0 0 0 0 0 0 0 0 0 o 0 0 0
0 0 0 1 1 1 1 1 1 i 1 1 1
1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 o o o o o i i i i 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0
1 1 1 1 1 1 0 0 0 o 0 0 0
1 1 1 1 1 1 1 1 1 o 1 1 1
1 1 1 1 1 1 2 2 2 i 2 1 1
1 1 1 1 1 1 2 2 2 2 2 1 1
1 1 1 1 1 1 2 2 2 2 2 1 1
1 1 1 1 1 1 1 1 1 2 1 1 1
1 1 1 1 1 1 1 1 1 i 1 1 1
0 0 0 0 0 0 1 1 1 1 1 0 0
0 0 0 0 0 0 1 1 1 i 0 0 0
0 0 0 0 0 0 1 1 1 i 1 0 0
1 1 1 1 1 1 0 0 0 i 0 1 1
1 1 1 1 0 0 0 0 0 o 0 1 1
0 0 0 0 0 0 0 0 0 ~ 0 0 0
0 0 0 0 0 0 0 0 0 o 0 0 0
0 0 0 0 0 0 0 0 0 o 0 0 0
1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 0 o i o o o i 1 o i i o 1 0 0 0 0 1 0 1 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0
0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 ? 0 0 1 1 1 1 1 1 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 1 0 0
2 2 0 0 0 0 0 0
1 1 0 0 0 0 0 0
0 0 0 0 0 1 1 1
0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 1
1 1 1 1 1 1 0 0
0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 1
1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1
2 1 1 1 1 1 1 1
1 1 1 0 0 1 1 1
2 2 1 0 0 1 1 1
2 1 1 0 0 1 1 1
2 1 1 0 0 1 1 1
2 2 1 1 1 1 1 1
2 2 1 1 1 1 1 1
0 0 0 0 0 0 0 0
1 1 0 1 1 0 0 0
0 0 0 0 0 0 0 0
1 1 1 1 1 1 0 0
1 1 0 0 0 1 1 1
1 1 0 0 2 0 0 0
1 1 0 0 0 0 0 0
1 1 0 0 0 0 0 0
0 0 1 1 1 1 2 2
0 1 1 1 1 0 0 0
0 0 1 1 1 0 0 0
0 0 1 1 1 0 0 0
1 1 0 1 1 0 0 0
2 2 0 3 3 0 0 0
0 0 0 1 1 0 0 0
1 1 1 0 0 1 1 1
1 1 0 0 0 0 0 0
1 1 2 0 0 0 1 1
APPENDIX III: Morphological data employed in analysis of cyprinodontid relationships

Taxa (in column order): Aphanius fasciatus, A. dispar, A. mento, A. chantrei, Kosswigichthys asquamatus, Orestias ispi, O. luteus, O. agassii, Cyprinodon variegatus, Megupsilon aporus, Jordanella floridae, Garmanella pulchra, Cualac tesselatus, Floridichthys carpio, Cubanichthys pengellyi, Fundulus heteroclitus

Characters 7, 15, 16, 33-39, 43, 47, 49, 60, 61, 66, 72, and 74-87 (one row per character; columns follow the taxon order above)
1 1 0 0 0 0 0 0 1 2 1 1 1 1 1 1
2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 2 1 2 2 1 1 1 1
1 1 1 1 1 2 2 2 1 1 1 1 1 1 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0
1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1
0 0 0 0 0 2 2 2 0 0 0 0 0 0 0 0
0 0 0 0 0 2 2 2 0 0 0 0 0 0 0 0
1 1 1 1 1 2 2 2 1 2 1 1 1 1 1 1
1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 1
1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 1
0 0 0 0 0 2 2 2 0 0 0 0 0 0 0 0
1 1 1 1 1 2 2 2 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0
0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0
0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0
1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1
0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
1 1 1 1 0 0 0 0 1 1 1 1 1 1 0 0
CHAPTER 12

Molecular Phylogeny of the Fundulidae (Teleostei, Cyprinodontiformes) Based on the Cytochrome b Gene

GIACOMO BERNARDI
Department of Biology
University of California at Santa Cruz
Santa Cruz, California 95064
I. Introduction

The family Fundulidae has a long and complex taxonomic history. After being included in the cyprinodontid subfamily Fundulinae by Myers (1931), the genera Adinia, Fundulus, Lucania, Leptolucania, and Plancterus were elevated to family status (Fundulidae) by Parenti (1981) in her major revision of the order Cyprinodontiformes. In addition to these extant genera, a few fossil forms, generally attributed to either Fundulus or Parafundulus, are also included in the family (Eastman, 1917; Miller, 1945; Parenti, 1981). A Central American family, Profundulidae, which includes one genus, Profundulus, with five species, is generally considered a sister clade to fundulids and other cyprinodontoids (Fig. 1). The fundulid genera themselves have been the subject of extensive taxonomic work, with special emphasis on the most speciose genus of the family, Fundulus. Fundulus systematics dates as far back as Linnaeus. The genus was revised several times by researchers including Garman (1895), Jordan and Evermann (1896), Jordan et al. (1930), Hubbs (1931), Miller (1955), Farris (1968), Parenti (1981), and Wiley (1986). Allozymic (Cashner et al., 1992, for a review) and DNA (Bernardi and Powers, 1995) data have also been added to the list of characters used to unravel the phylogenetic relationships among Fundulidae. This chapter gives a general overview of the major phylogenetic issues relevant to the family and presents molecular data that address some of these issues.

Fundulidae is a relatively large group of cyprinodontiform fishes that live in fresh, brackish, and coastal marine waters. They are distributed over Central and North America, and their tolerance for high salinity probably explains their presence on Cuba and Bermuda (Fig. 1). An introduced population of F. heteroclitus is also found in southern Spain (Bernardi et al., 1995). Two species, F. parvipinnis and F. lima, are isolated on the western part of the North American continent, in California and Baja California (Mexico). Fundulids are oviparous, and their reproduction and egg development have been thoroughly studied (on earth as well as in space!) (Hubbs and Burnside, 1972; Koenig and Livingston, 1976; Taylor et al., 1977; Hoffman et al., 1977). Other aspects of fundulid biology have also been studied, such as hybridization (Hubbs and Drewry, 1959; Setzer, 1970), behavior (Foster, 1967), and karyology (Chen, 1971; Chen and Ruddle, 1970). F. heteroclitus is probably the best-studied fish model for enzyme kinetics and expression. Overall, this group has been thoroughly studied in almost every possible aspect; however, its phylogenetic relationships, so essential for comparative studies, are still poorly understood.

FIGURE 1 Distributional limits of the family Fundulidae (redrawn after Parenti, 1981).

II. Morphology

Parenti (1981) defines the family using two morphological synapomorphies: "(1) inner arms of the maxillaries directed anteriorly, and often pronounced hooks; and (2) snout pointed and drawn anteriorly with the autopalatine projecting and not articulating with the lateral ethmoid." Wiley (1986) agrees with this definition but questions the validity of the second character. He proposes, however, another morphological character to support the family: "in all fundulids, the epipleural ribs overlap the pleural ribs and are either directly connected to the parapophysis (Adinia, Leptolucania) or to the parapophysis via connective tissue (Fundulus, Lucania, 'Plancterus')" (Wiley, 1986).

Once the boundaries of the family are defined, the major issues concerning this family are the interrelationships of the different genera and the monophyletic status of Fundulus. Few attempts have been made to establish phylogenetic relationships among fundulid genera, the most precise ones being presented by Parenti (1981) and Wiley (1986) (Fig. 2). Wiley (1986) questions several morphological characters used by Parenti (1981) to derive phylogenetic relationships within the family and concludes "that the placement of nominal genera within the family is problematical and a solution must await additional characters."

The phylogenetic relationships among Fundulus species, however, have been studied in great detail. By removing Plancterus from Fundulus, Parenti was able to find a single character in support of Fundulus monophyly: a broad articular surface on the second pharyngobranchial. This character is questioned by Wiley (1986), but no alternative character is proposed. In any case, both Parenti and Wiley have doubts about Fundulus monophyly. Indeed, Parenti (1981) says that "a more parsimonious interpretation would place some species of Fundulus as more closely related to Lucania, Leptolucania, or Adinia," and Wiley cannot show Fundulus "to be monophyletic and cannot exclude the possibility that it might be para- or polyphyletic" (Wiley, 1986).

Fundulus is the most speciose genus of the family. Although Adinia, Leptolucania, Lucania, and Plancterus comprise 5 or 6 species altogether, Fundulus alone includes more than 35 species. Studies on Fundulus relationships were first attempted by Miller (1955), who ranked 27 species in a tentative phylogenetic sequence, and by Brown (1957), who placed, without explanation, these taxa into five subgenera: Fontinus, Fundulus, Plancterus, Xenisma, and Zygonectes. Griffith (1972, 1974) established evolutionary relationships among the different taxa based on 70 characters, and Farris (1968), using morphological characters, placed Fundulus taxa into four monophyletic subgenera: Fundulus, Plancterus, Xenisma, and Zygonectes. Lastly, Wiley (1986) provided a phylogenetic analysis of the genus using morphological characters. Wiley recognized the five subgenera described by Brown (1957) but did not find a place for Plancterus and the West Coast Fundulus (i.e., F. parvipinnis and F. lima), which were assigned to the "other species" category.

FIGURE 2 Relationships of the major genera among Fundulidae according to Parenti (1981) (top) and Wiley (1986) (bottom). (Top) Numbers in parentheses correspond to the number of species within each genus.
[Tip labels, top panel: Profundulus (5) (Profundulidae); Plancterus (1), Fundulus (40), Lucania (2), Leptolucania (1), Adinia (1) (Fundulidae); other cyprinodontoids. Tip labels, bottom panel: Profundulus (Profundulidae); "Plancterus", Lucania, "Fundulus", Leptolucania, Adinia (Fundulidae); other cyprinodontoids.]
III. Allozymes and DNA

Allozyme data have been used to study Fundulus phylogenetic relationships at the population (Powers and Place, 1978) and species level (Fleming et al., 1962; Duggins et al., 1989). More extensive investigations at the subgeneric and generic level were presented by Cashner and co-workers (Rogers and Cashner, 1987; Cashner et al., 1988; Grady et al., 1990; Cashner et al., 1992). Allozyme work not only provided support for the monophyletic status of subgenera Xenisma and Zygonectes, as well as a clarification of the relationships of taxa within these subgenera, but also provided a framework to better understand the biogeographical implications of Fundulus distributions (Cashner et al., 1992).

At the DNA level, nuclear and mitochondrial markers have been used. The nuclear lactate dehydrogenase-B gene has been studied extensively by Powers and co-workers (Powers et al., 1993, for a review), mostly in F. heteroclitus populations. Mitochondrial DNA (mtDNA) restriction fragment length polymorphisms (RFLPs) and sequences were also studied for the same populations (Gonzales-Villasenor and Powers, 1990; Bernardi et al., 1993). At a higher taxonomic level, mtDNA gene sequences were determined for the genera Crenichthys and Empetrichthys, which were confirmed as nonfundulids (Grant and Riddle, 1995), for nine species of Fundulus, and for Plancterus zebrinus (Bernardi and Powers, 1995). West Coast Fundulus were found to be very divergent, but because sequences from only two genera, Fundulus and Plancterus, were analyzed, and Plancterus was used as an outgroup, the monophyletic status of Fundulus could not be addressed. This chapter presents sequence data from all fundulid genera and one outgroup, Profundulus, and discusses the phylogenetic implications derived from these results.

IV. Fish Samples

Samples were obtained from all the extant fundulid genera. Adinia xenica, Fundulus olivaceus, F. chrysotus, and F. dispar were collected in Louisiana by B. J. Granier; Leptolucania ommata was collected in Alabama by R. Harper; F. notatus and F. catenatus were collected in Texas by A. Stock and D. W. Stock; F. lima was collected in San Ignacio, Baja California Sur, Mexico, by C. H. Stowell; and F. parvipinnis was collected in Santa Barbara, California, by S. Anderson. DNA sequences from Plancterus zebrinus and DNA from Profundulus punctatus were made available by C. Grant. DNA was extracted from liver tissue following Bernardi and Bernardi (1990).

V. DNA Sequences

The polymerase chain reaction (PCR) (Saiki et al., 1988) was used to amplify a 270-bp region of the cytochrome b gene, beginning at the position corresponding to human amino acid 34. Primers and PCR protocols followed Kocher et al. (1989) and Palumbi et al. (1991). Sequencing and PCR primers used were CB2-H, CB1-L, and GLUDG-L (Palumbi et al., 1991). Approximately 100 ng of DNA was used as template for 100-µl PCR reactions containing 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 0.01% (w/v) gelatin, 200 µM each dNTP, 2.5 units of Taq DNA polymerase (Perkin-Elmer Cetus), and 1 µM each amplification primer. PCR products were used for Taq DyeDeoxy Terminator cycle-sequencing reactions (Applied Biosystems Inc.) and loaded on an automated sequencer (Applied Biosystems 373A).

Cytochrome b sequences were aligned using the Navigator program (Applied Biosystems Inc.). Phylogenetic analyses employed maximum parsimony (MP) using the heuristic search option of PAUP (Phylogenetic Analysis Using Parsimony; Swofford, 1993). The degree of confidence assigned to nodes in trees obtained by MP was determined by bootstrapping (Felsenstein, 1985) with 2000 replicates (Hedges, 1992). The topology-dependent cladistic permutation tail probability analysis (T-PTP) (Faith, 1991) was performed by randomly shuffling the data sets 99 times (after removing the outgroup sequence), using the RANDOMIZER package (Trueman, 1994) and these permuted data sets as input files in PAUP. Actual tree topologies were considered significantly better than random ones when less than 5% of the random sets produced shorter trees than the actual data. The maximum likelihood test of Kishino and Hasegawa (1989) was also used to test significance between different trees. The test was performed using the corresponding option in the PHYLIP package (Felsenstein, 1989).

The 270-bp portion of the cytochrome b gene was analyzed (contact author for raw data). Of 263 aligned positions, 111 were variable and 92 were phylogenetically informative. The number of transitions was higher than the number of transversions (Table I), with an average ratio of 3.5 (thus, when weighting was used, transversions were given a weight of 3 and transitions a weight of 1). This result indicates that the data are likely to be close to the multiple-hit zone (Brown et al., 1982; Meyer and Wilson, 1990) and corroborates the idea that cytochrome b genes are often not ideal molecular markers for this level of phylogenetic analysis (Meyer, 1994). In the case reported here, cytochrome b data do not allow us to completely resolve the phylogenetic relationships among all taxa that were studied, but do allow us to statistically test some phylogenetic hypotheses.

TABLE I

                                            1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
 1 Fundulus heteroclitus heteroclitus       -  inf  9.0  3.0  2.9  2.8  4.1  3.7  2.6  3.2  4.7  2.8  3.1  2.5  4.2
 2 Fundulus heteroclitus macrolepidotus     4    -   10  3.0  2.9  2.9  4.2  3.9  2.6  3.1  4.7  3.0  3.2  2.5  3.9
 3 Fundulus grandis                        18   20    -  3.5  3.8  3.1  4.8  4.2  2.8  3.4  4.6  2.8  2.9  2.5  4.0
 4 Fundulus notatus                        45   45   46    -  5.0  4.0  4.8  4.2  3.0  3.7  2.6  3.2  3.5  2.9  3.2
 5 Fundulus olivaceus                      41   41   45   15    -  3.5  4.9  3.9  3.2  3.4  2.9  3.7  3.9  3.4  3.6
 6 Fundulus dispar                         44   46   49   36   35    -  3.9  3.6  2.4  2.9  3.4  3.7  4.1  2.5  3.2
 7 Fundulus chrysotus                      49   51   48   43   39   47    -  4.5  3.0  3.3  4.2  3.7  7.2  3.6  3.4
 8 Fundulus catenatus                      37   39   34   46   47   50   45    -  2.6  3.1  3.2  3.9  3.4  2.7  2.8
 9 Fundulus lima                           50   50   53   54   55   51   62   50    -  4.0  2.3  2.7  2.8  2.7  2.2
10 Fundulus parvipinnis                    51   49   55   55   54   58   59   49   20    -  2.5  2.8  3.4  3.3  2.3
11 Plancterus zebrinus                     47   47   46   44   47   47   51   45   53   60    -  4.3  3.7  2.7  3.1
12 Lucania parva                           37   39   37   52   48   48   41   43   53   58   47    -  4.4  2.6  3.0
13 Adinia xenica                           37   39   35   39   39   41   43   41   58   62   44   48    -  2.6  2.8
14 Leptolucania ommata                     49   50   49   44   47   49   50   53   63   66   59   50   47    -  2.6
15 Profundulus punctatus                   59   55   56   54   58   58   54   57   59   59   56   62   57   62    -

a The number of substitutions between taxa is shown below the diagonal. Transition/transversion ratios are shown above the diagonal.
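The pairwise quantities tabulated in Table I, together with the counts of variable and phylogenetically informative positions, can be derived from any set of aligned sequences. The following is only a minimal sketch of that bookkeeping, not the author's procedure: the function names and the toy alignment are hypothetical, gaps and ambiguity codes are simply skipped, and a site is treated as informative when at least two states each occur in at least two taxa.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Return (transitions, transversions) between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # identical site, gap, or ambiguity code: not counted
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            tv += 1  # purine<->pyrimidine
    return ts, tv


def ts_tv_ratio(ts, tv):
    """Transition/transversion ratio; infinite when no transversions occur."""
    if tv == 0:
        return float("inf") if ts else float("nan")
    return ts / tv


def site_summary(alignment):
    """Return (variable, parsimony_informative) column counts."""
    variable = informative = 0
    for column in zip(*(seq.upper() for seq in alignment.values())):
        counts = {}
        for base in column:
            if base in "ACGT":
                counts[base] = counts.get(base, 0) + 1
        if len(counts) > 1:
            variable += 1
            # Informative: at least two states, each shared by two or more taxa.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative


# Hypothetical toy alignment, not data from this chapter.
toy = {"t1": "ACGTAC", "t2": "ACGTAT", "t3": "GCGTAT", "t4": "GCATAC"}
ts, tv = count_ts_tv(toy["t1"], toy["t4"])
print(ts + tv, ts_tv_ratio(ts, tv), site_summary(toy))
```

A pair that differs only by transitions yields an infinite ratio, which is presumably what the "inf" entry between the two F. heteroclitus subspecies in Table I denotes.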
VI. Phylogenetic Relationships

A single most parsimonious tree was obtained (unweighted tree length = 342 steps, consistency index = 0.49; weighted tree length = 538 steps). When transition/transversion weights were changed or removed, the topology of the tree remained mostly unchanged, with the exception of the unstable position of F. chrysotus. Although the overall topology of the tree was stable, only a few clades showed high bootstrap support. A consensus tree (50% majority-rule consensus) of 2000 bootstrap replicates is shown in Fig. 3 (only bootstrap values higher than 50% are shown). Three questions may be addressed from these results: (1) Are the West Coast Fundulus a sister clade to all other fundulids? (2) Is Fundulus monophyletic? (3) Are phylogenetic relationships among fundulids based on morphological and DNA characters concordant?
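The tree lengths quoted above, and the "extra steps" used later to compare constrained and unconstrained topologies, are sums over sites of the minimum number of (possibly weighted) changes a topology requires. The sketch below illustrates that calculation for a fixed rooted binary tree using Sankoff's dynamic-programming algorithm. It is an illustration under stated assumptions rather than a description of PAUP: the tree search itself is not shown, the nested-tuple tree format, function names, and toy data are hypothetical, and every tip is assumed to carry an unambiguous A, C, G, or T.

```python
BASES = "ACGT"
PURINES = {"A", "G"}


def step_cost(a, b, tv_weight=1, ts_weight=1):
    """Cost of changing base a into base b along one branch."""
    if a == b:
        return 0
    is_transition = (a in PURINES) == (b in PURINES)
    return ts_weight if is_transition else tv_weight


def site_length(tree, states, tv_weight=1, ts_weight=1):
    """Minimum cost of one site on a rooted binary tree (Sankoff algorithm).

    tree   -- nested 2-tuples of tip names, e.g. (("a", "b"), ("c", "d"))
    states -- dict mapping each tip name to its base at this site
    """
    def costs(node):
        if isinstance(node, str):  # tip: only the observed base is allowed
            return {b: (0 if b == states[node] else float("inf")) for b in BASES}
        left, right = (costs(child) for child in node)
        return {
            b: min(left[x] + step_cost(b, x, tv_weight, ts_weight) for x in BASES)
            + min(right[x] + step_cost(b, x, tv_weight, ts_weight) for x in BASES)
            for b in BASES
        }

    return min(costs(tree).values())


def tree_length(tree, alignment, tv_weight=1, ts_weight=1):
    """Sum of per-site minimum costs over the whole alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        site_length(tree, {t: s[i] for t, s in alignment.items()}, tv_weight, ts_weight)
        for i in range(n_sites)
    )


# Hypothetical toy data, not sequences from this chapter.
toy = {"a": "AAC", "b": "AAC", "c": "GAT", "d": "GCT"}
best = (("a", "b"), ("c", "d"))
alternative = (("a", "c"), ("b", "d"))
extra_steps = tree_length(alternative, toy) - tree_length(best, toy)
print(tree_length(best, toy), tree_length(best, toy, tv_weight=3), extra_steps)
```

Setting `tv_weight=3` mirrors the 3:1 transversion:transition weighting mentioned in the text; with both weights equal to 1 the function returns the unweighted number of steps.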
FIGURE 3 Phylogenetic tree of the family Fundulidae obtained using 270 bp of the cytochrome b gene. Consensus tree (50% majority-rule consensus) resulting from 2000 bootstrap replicates. The numbers by each branch indicate the result of a bootstrap analysis (2000 replicates) using maximum parsimony (heuristic search) when the bootstrap value was greater than 50%. Vertical bars correspond to the subgenera recognized by Farris (1968), whereas italic letters correspond to the subgenera described by Wiley (1986) (F, Fundulus; Z, Zygonectes; X, Xenisma; and O.S., other species).
[Tip labels, top to bottom, with Wiley's codes: F. h. heteroclitus (F), F. h. macrolepidotus (F), F. grandis (F), F. catenatus (X), F. notatus (Z), F. olivaceus (Z), F. dispar (Z), L. ommata, F. chrysotus (Z), A. xenica, P. zebrinus (O.S.), L. parva, F. lima (O.S.), F. parvipinnis (O.S.), P. punctatus. Farris's bars, top to bottom: Fundulus, Xenisma, Zygonectes, Zygonectes, Plancterus, Xenisma.]
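The bootstrap values reported in Fig. 3 rest on two simple operations: resampling alignment columns with replacement to build each pseudoreplicate, and counting how often a clade appears among the replicate trees. The sketch below shows only those two steps. It is a hedged illustration, not the PAUP implementation: the tree search for each pseudoreplicate is assumed to happen elsewhere, and the function names, the representation of a tree as a set of clades, and the toy data are all hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one pseudoreplicate: columns resampled with replacement."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in picks) for taxon, seq in alignment.items()}


def clade_support(replicate_clades, clade):
    """Percentage of replicate trees that contain the given clade.

    replicate_clades -- one set of frozensets (clades) per replicate tree
    clade            -- iterable of taxon names defining the clade of interest
    """
    clade = frozenset(clade)
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return 100.0 * hits / len(replicate_clades)


# Hypothetical toy alignment and replicate trees, not data from this chapter.
toy = {"t1": "ACGTAC", "t2": "ACGTAT", "t3": "GCGTAT", "t4": "GCATAC"}
replicate = bootstrap_alignment(toy, random.Random(1))

replicate_trees = [
    {frozenset({"x", "y"})},
    {frozenset({"x", "y"}), frozenset({"x", "y", "z"})},
    {frozenset({"x", "z"})},
    {frozenset({"x", "y"})},
]
print(clade_support(replicate_trees, {"x", "y"}))
```

A 50% majority-rule consensus, as used for Fig. 3, would retain only the clades whose support computed this way exceeds 50%.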
A. Are West Coast Fundulus a Sister Clade to All Other Fundulids?

As mentioned earlier, both Parenti (1981) and Wiley (1986) have questioned the monophyletic status of Fundulus. Indeed, only a single character was found by Parenti to support Fundulus monophyly. It is also worth noting that, within the family, the other genera include only one or two species each, whereas Fundulus comprises more than 35 species (Fig. 2). With the cytochrome b sequence data, 12 steps would have to be added to the most parsimonious tree to obtain a monophyletic Fundulus. To determine whether these 12 steps are statistically significant, a topology-dependent cladistic permutation tail probability (T-PTP) test was performed (Faith, 1991; Halanych et al., 1995). The data reported here did not support a monophyletic Fundulus (data not shown). If, as shown in Fig. 3, West Coast Fundulus are the sister clade of all other fundulids, then by definition the genus Fundulus is not monophyletic.

Two species of West Coast Fundulus, F. lima and F. parvipinnis, live in an isolated area of the West Coast of the United States and Mexico. Although F. parvipinnis can live in fresh, brackish, or salt water, generally preferring brackish estuaries and sloughs along the coasts of California and Baja California (Mexico), F. lima lives in freshwater lagoons in the Baja California desert close to San Ignacio. These species have been isolated from the rest of the group since the beginning of the Pliocene, 5.3 million years ago (Griffith, 1972). F. lima and F. parvipinnis may have migrated to the western part of the continent from the East Coast using a southern route before the closing of the Isthmus of Panama (Griffith, 1972). Although these species have tentatively been assigned to subgenus Xenisma by Farris (1968), Wiley could not place them in any subgenus and preferred to include them in an undefined "other species" group (Wiley, 1986).

The data show that the two West Coast species form a robust clade (supported in 100% of the bootstrap replicates) and that they are the sister group to all other species examined. Four additional steps would be necessary to disrupt this sister-group relationship. A T-PTP test showed that this result was highly significant (Fig. 4a). The maximum likelihood test of Kishino and Hasegawa (1989) also indicates that the alternative topologies differ significantly. Two important implications can be derived from these results: (1) the West Coast Fundulus are shown to be the sister clade of all other fundulids and (2) Fundulus is not monophyletic.
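The Kishino and Hasegawa (1989) comparison invoked above asks whether the difference in log-likelihood between two topologies is large relative to its site-to-site variability. The sketch below shows the usual normal-approximation form of that test, given per-site log-likelihoods that some other program has already computed for each tree. It is an illustration only and does not reproduce PHYLIP's implementation; the function name and the numbers in the example are hypothetical.

```python
import math


def kishino_hasegawa(site_lnl_tree1, site_lnl_tree2):
    """Return (delta, standard_error, z, two_sided_p) for two topologies.

    Both arguments are per-site log-likelihoods for the same alignment,
    listed in the same site order.
    """
    n = len(site_lnl_tree1)
    if n != len(site_lnl_tree2) or n < 2:
        raise ValueError("need two equal-length lists with at least two sites")
    diffs = [a - b for a, b in zip(site_lnl_tree1, site_lnl_tree2)]
    delta = sum(diffs)  # total log-likelihood difference
    mean = delta / n
    # Variance of the summed difference, estimated from the per-site spread.
    variance = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    if variance == 0:
        raise ValueError("per-site differences show no variance")
    se = math.sqrt(variance)
    z = delta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return delta, se, z, p


# Hypothetical per-site log-likelihoods, not values from this chapter.
lnl_a = [-1.0, -2.0, -1.5, -1.0]
lnl_b = [-1.5, -2.5, -1.0, -2.0]
delta, se, z, p = kishino_hasegawa(lnl_a, lnl_b)
print(round(delta, 3), round(se, 3), round(z, 3))
```

Two topologies would conventionally be called significantly different when the returned probability falls below 0.05.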
B. Is the Genus Fundulus Monophyletic?

Because the data show that West Coast Fundulus are not to be included in the genus, the next question is whether the other Fundulus representatives form a monophyletic assemblage. The author analyzed the data while constraining the genus Fundulus (after removing the West Coast species) to be monophyletic. However, these data were unable to provide statistically significant evidence for either hypothesis (Fig. 4b).
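The T-PTP tests used in this and the preceding section compare a tree-length difference computed from the real data with the distribution of the same difference over randomized data sets. The sketch below covers only the permutation and tail-probability steps, under explicit assumptions: states are shuffled among taxa independently within each column (the chapter did this with the RANDOMIZER package after removing the outgroup), the tree-length difference is supplied by the caller because the tree searches were run in PAUP, and larger values of the statistic are taken to mean stronger support for the hypothesis. The names `permute_characters`, `t_ptp`, and `shared_sites` are hypothetical, and `shared_sites` is merely a stand-in statistic for demonstration.

```python
import random


def permute_characters(alignment, rng):
    """Shuffle states among taxa independently within every column."""
    taxa = list(alignment)
    columns = [list(col) for col in zip(*(alignment[t] for t in taxa))]
    for col in columns:
        rng.shuffle(col)
    return {t: "".join(col[i] for col in columns) for i, t in enumerate(taxa)}


def t_ptp(alignment, statistic, n_permutations=99, seed=0):
    """Return (observed statistic, permutation tail probability)."""
    rng = random.Random(seed)
    observed = statistic(alignment)
    null = [statistic(permute_characters(alignment, rng)) for _ in range(n_permutations)]
    as_extreme = sum(1 for value in null if value >= observed)
    # The observed data set counts as one member of the reference set.
    return observed, (as_extreme + 1) / (n_permutations + 1)


def shared_sites(alignment):
    """Toy stand-in for a tree-length difference (assumption, not T-PTP proper):
    columns where taxa x and y share a state found in no other taxon."""
    count = 0
    for i in range(len(alignment["x"])):
        others = {seq[i] for name, seq in alignment.items() if name not in ("x", "y")}
        if alignment["x"][i] == alignment["y"][i] and alignment["x"][i] not in others:
            count += 1
    return count


# Hypothetical toy alignment, not data from this chapter.
toy = {"x": "AAAAAA", "y": "AAAAAA", "z": "CCGTCG", "w": "CTGTCC"}
observed, p_value = t_ptp(toy, shared_sites)
print(observed, p_value)
```

With 99 permutations the smallest attainable probability is 0.01, which is consistent with the 5% criterion described in the methods.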
FIGURE 4 Distributions of tree length differences (in steps) between constrained and unconstrained topologies. Randomized data (white bars) are compared to actual data (black bar). (a) A T-PTP test for West Coast Fundulus (i.e., F. parvipinnis and F. lima) being the sister clade of all other fundulids. (b) A T-PTP test for Fundulus (after removing West Coast representatives) monophyly.

C. Are Phylogenetic Relationships among Fundulids Based on Morphological and DNA Characters Concordant?

1. Fundulidae

Fundulid relationships have been proposed by Parenti (1981) and Wiley (1986) (Fig. 2); however, neither is supported by the author's data. Indeed, the phylogenetic relationships suggested by Parenti and Wiley require, respectively, 16 and 19 more steps than the relationships based on cytochrome b sequences (as shown in Fig. 3). Although the author's data do not support these relationships, no statistically supported alternative emerges from these data (most of the clades have low bootstrap support).

2. Fundulus
Fundulus has been divided into three subgenera by most authors, Fundulus, Xenisma, and Zygonectes; two other subgenera have also been proposed, Fontinus and Plancterus. Figure 3 compares the author's molecular results with previous subgeneric assignments based on morphological characters (Farris, 1968; Wiley, 1986). Representatives of subgenera Fundulus, Xenisma, and Zygonectes were included in the analysis. The subgenus Fundulus, which is the least controversial of the groupings, is consistent for the three studies presented in Fig. 3. Molecular data support this group with high bootstrap values (94% of bootstrap replicates); however, data for more taxa are needed to confirm these results. Within Zygonectes, the striped species F. notatus and F. olivaceus are found to be sister taxa (95% bootstrap). This result is not surprising and is generally accepted (Farris, 1968; Wiley, 1986; Cashner et al., 1992). Another Zygonectes representative, F. chrysotus, does not cluster with the remaining Zygonectes. However, as mentioned earlier, the branch leading to F. chrysotus is unstable and data are not incompatible with a monophyletic Zygonectes. F. catenatus, a Xenisma representative, is found to be the sister clade of subgenus Fundulus. This result is in disagreement with Farris (1968), who considers Fundulus to be closely related to Zygonectes. The author's results are also in disagreement with the placement of the West Coast species in the subgenus Xenisma (Farris, 1968). As mentioned earlier, F. parvipinnis and F. lima are found to be the sister clade of the rest of the fundulids.
VII. Conclusion
Fundulids have been the subject of several conflicting phylogenetic analyses making them a system of choice for molecular studies. Hypotheses based on morphology, behavior, and allozymic studies can be compared with molecular data, and the differences can be statistically tested. Our results are mostly in agreement with subgeneric assignments of different Fundulus species. The subgenera Fundulus and Zygonectes are concordant between the different studies; only Xenisma
exhibits important differences among the analyses. At the other end of the hierarchical scale, the generic positions within the family are different between the two morphological studies and sequence data presented here. More taxa and more characters will be needed to clearly define relationships at the intrafamilial level. The West Coast Fundulus species, previously assigned to the "other species" group by Wiley (1986), seem to form a monophyletic sister clade to all other fundulids (or at least all other fundulids studied here). This finding could be the result of long time and geographical isolation of F. lima and F. parvipinnis from the rest of the group, which would produce long branches that might artificially group the two clades. However, for both species the branch length is less than the average branch length of other taxa making this possibility unlikely. If West Coast Fundulus are a sister clade to all other Fundulidae, as data suggest, some taxonomic revisions concerning these two species may have to be considered. Furthermore, it has been shown that F. lima and F. parvipinnis occupy a basal position, making them good indicators for the time of divergence of the family. The family Fundulidae would have diverged before the divergence of the West Coast fundulids from the rest of the family, approximately 5 million years ago.
Acknowledgments

This study would not have been possible without the samples or sequences provided by Chris Grant, B. G. Granier, Chris Stowell, Rodney Harper, Shane Anderson, Albert Stock, and David Stock. Chris Grant, Robert Cashner, and Dennis Powers provided useful comments and discussion. Thanks to John Trueman (Australian National University) for providing the Randomizer program and his expertise on T-PTP tests. This research was partly supported by faculty research funds granted by the University of California, Santa Cruz.
References

Bernardi, G., and Bernardi, G. 1990. Compositional patterns in the nuclear genome of cold-blooded vertebrates. J. Mol. Evol. 31:265-281.
Bernardi, G., Fernandez-Delgado, C., Gomez-Chiarri, M., and Powers, D. A. 1995. Origin of a Spanish population of Fundulus heteroclitus inferred by cytochrome b sequence analysis. J. Fish Biol. 47:737-740.
Bernardi, G., and Powers, D. A. 1995. Phylogenetic relationships among nine species from the genus Fundulus (Cyprinodontiformes, Fundulidae) inferred from sequences of the cytochrome b gene. Copeia 469-471.
Bernardi, G., Sordino, P., and Powers, D. A. 1993. Concordant mitochondrial and nuclear DNA phylogenies for populations of the teleost fish Fundulus heteroclitus. Proc. Natl. Acad. Sci. USA 90:9271-9274.
Brown, J. L. 1957. A key to the species and subspecies of the cyprinodont genus Fundulus in the United States and Canada east of the continental divide. J. Wash. Acad. Sci. 47:69-77.
Brown, W. M., Prager, E. M., Wang, A., and Wilson, A. C. 1982. Mitochondrial DNA sequences of primates: Tempo and mode of evolution. J. Mol. Evol. 18:225-239.
Cashner, R. C., Rogers, J. S., and Grady, J. M. 1988. Fundulus bifax, a new species of the subgenus Xenisma from the Tallapoosa and Coosa river systems of Alabama and Georgia. Copeia 674-683.
Cashner, R. C., Rogers, J. S., and Grady, J. M. 1992. Phylogenetic studies of the genus Fundulus. In "Systematics, Historical Ecology, and North American Freshwater Fishes" (Richard L. Mayden, ed.). Stanford University Press, Stanford, CA.
Chen, T. R. 1971. A comparative chromosome study of twenty killifish species of the genus Fundulus (Teleostei: Cyprinodontidae). Chromosoma 32:436-453.
Chen, T. R., and Ruddle, F. H. 1970. A chromosome study of four species and a hybrid of the killifish genus Fundulus (Cyprinodontidae). Chromosoma 29:255-267.
Duggins, C. F. J., Relyea, K. G., and Karlin, A. A. 1989. Biochemical systematics in southeastern populations of Fundulus heteroclitus and Fundulus grandis. Northeast Gulf Sci. 10:95-102.
Eastman, C. R. 1917. Fossil fishes in the collection of the United States National Museum. Proc. U.S. Natl. Mus. 52:235-304.
Faith, D. P. 1991. Cladistic permutation tests for monophyly and nonmonophyly. Syst. Zool. 40:366-375.
Farris, J. S. 1968. "The Evolutionary Relationships between the Species of the Killifish Genera Fundulus and Profundulus (Teleostei: Cyprinodontidae)." Unpublished Ph.D. dissertation, University of Michigan, Ann Arbor, MI.
Felsenstein, J. 1985. Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39:783-791.
Felsenstein, J. 1989. PHYLIP, manual version 3.4 (University Herbarium, University of California, Berkeley, 1989).
Fleming, W. R., Scheffel, K. O., and Linton, J. R. 1962. Studies on the gill cholinesterase activity of several cyprinodontid species. Comp. Biochem. Physiol. 6:205-213.
Foster, N. R. 1967. "Comparative Studies of the Biology of Killifishes (Pisces: Cyprinodontidae)." Unpublished Ph.D. dissertation, Cornell University, Ithaca, NY.
Garman, S. 1895. The cyprinodonts. Mem. Mus. Comp. Zool. 19:1-179.
Gonzales-Villasenor, L. I., and Powers, D. A. 1990. Mitochondrial-DNA restriction-site polymorphisms in the teleost Fundulus heteroclitus support secondary intergradation. Evolution 44:27-37.
Grady, J. M., Cashner, R. C., and Rogers, J. S. 1990. Evolutionary and biogeographic relationships of Fundulus catenatus (Fundulidae). Copeia 315-323.
Grant, E. C., and Riddle, B. R. 1995. Are the endangered springfish (Crenichthys Hubbs) and poolfish (Empetrichthys Gilbert) Fundulines or Goodeids? A mitochondrial DNA assessment. Copeia 209-212.
Griffith, R. W. 1972. "Studies on the Physiology and Evolution of Killifishes of the Genus Fundulus." Unpublished Ph.D. dissertation, Yale University, New Haven, CT.
Griffith, R. W. 1974. Environment and salinity tolerance in the genus Fundulus. Copeia 319-331.
Halanych, K. M., Bacheller, J. D., Aguinaldo, A. M. A., Liva, S. M., Hillis, D. M., and Lake, J. A. 1995. Evidence from 18S ribosomal DNA that the lophophorates are protostome animals. Science 267:1641-1643.
Hedges, S. B. 1992. The number of replications needed for accurate estimation of the bootstrap P value in phylogenetic studies. Mol. Biol. Evol. 9:366-369.
Hoffman, R. B., Salinas, G. A., and Baky, A. A. 1977. Behavioral analysis of killifish exposed to weightlessness in the Apollo-Soyuz test project. Aviat. Space Environ. Med. 48:712-717.
Hubbs, C. 1931. Studies of the fishes of the order Cyprinodontes. X. Four nominal species of Fundulus placed in synonymy. Occas. Pap. Mus. Zool. Univ. Mich. 16:1-86.
Hubbs, C., and Burnside, D. F. 1972. Developmental sequences of Zygonectes notatus at several temperatures. Copeia 862-865.
Hubbs, C., and Drewry, G. E. 1959. Survival of F1 hybrids between cyprinodont fishes, with a discussion of the correlation between hybridization and phylogenetic relationships. Publ. Inst. Mar. Sci. 6:81-91.
Jordan, D. S., and Evermann, B. W. 1896. The fishes of North and Middle America. Bull. U.S. Nat. Mus. 47:1-3313.
Jordan, D. S., Evermann, B. W., and Clark, H. W. 1930. Checklist of the fishes and fishlike vertebrates of North and Middle America north of the northern boundary of Venezuela and Colombia. Rept. U.S. Comm. Fish. 1928:1-670.
Kishino, H., and Hasegawa, M. 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order of the Hominoidea. J. Mol. Evol. 29:170-179.
Koenig, C. C., and Livingston, R. J. 1976. The embryological development of the diamond killifish (Adinia xenica). Copeia 435-445.
Kocher, T. D., Thomas, W. K., Meyer, A., Edwards, S. V., Pääbo, S., Villablanca, F. X., and Wilson, A. C. 1989. Dynamics of mitochondrial DNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86:6196-6200.
Maddison, W. P., and Maddison, D. R. 1989. Interactive analysis of phylogeny and character evolution using the computer program MacClade. Folia Primatol. 53:190-202.
Meyer, A. 1994. Shortcomings of the cytochrome b gene as a molecular marker. Trends Ecol. Evol. 9:278-280.
Meyer, A., and Wilson, A. C. 1990. Origin of tetrapods inferred from their mitochondrial DNA affiliation to lungfish. J. Mol. Evol. 31:359-364.
Miller, R. R. 1945. Four species of fossil cyprinodont fishes from eastern California. J. Wash. Acad. Sci. 5:315-321.
Miller, R. R. 1955. An annotated list of the American cyprinodontid fishes of the genus Fundulus, with the description of Fundulus persimilis from Yucatan. Occas. Pap. Mus. Zool. Univ. Mich. 568:1-25.
Myers, G. S. 1931. The primary groups of oviparous cyprinodont fishes. Stanford Univ. Publ. 6:1-14.
Palumbi, S. R., Martin, A., Romano, S., McMillan, W. O., Stice, L., and Grabowski, G. 1991. "The Simple Fool's Guide to PCR." University of Hawaii, Honolulu.
Parenti, L. R. 1981. A phylogenetic and biogeographic analysis of cyprinodontiform fishes (Teleostei, Atherinomorpha). Bull. Am. Mus. Nat. Hist. 168:335-557.
Poss, S. G., and Miller, R. R. 1983. Taxonomic status of the plains killifish, Fundulus zebrinus. Copeia 55-67.
Powers, D. A., and Place, A. R. 1978. Biochemical genetics of Fundulus heteroclitus (L.). I. Temporal and spatial variation in gene frequencies of Ldh-B, Mdh-A, Gpi-B, and Pgm-A. Biochem. Genet. 16:593-607.
Powers, D. A., Smith, M., Gonzalez-Villasenor, I., DiMichele, L., Crawford, D., Bernardi, G., and Lauerman, T. 1993. A multidisciplinary approach to the selectionist/neutralist controversy using the model teleost Fundulus heteroclitus. In "Oxford Surveys in Evolutionary Biology" (D. Futuyma and J. Antonovics, eds.), Vol. 9, pp. 43-107.
Rogers, J. S., and Cashner, R. C. 1987. Genetic variation, divergence, and relationships in the subgenus Xenisma of the genus Fundulus. In "Community and Evolutionary Ecology of North American Stream Fishes" (W. J. Matthews and D. C. Heins, eds.), pp. 251-264. University of Oklahoma Press, Norman, OK.
Saiki, R., Gelfand, D., Stoffel, S., Sharf, S., Higuchi, R., Horn, G., Mullis, K., and Erlich, H. A. 1988. Primer-directed enzymatic ampli-
12. Fundulidae
fication of DNA with a thermostable DNA polymerase. Science 239: 487-491. Setzer, P. Y. 1970. An analysis of a natural hybrid swarm by means of chromosome morphology. Trans. Am. Fish. Soc. 99:139-146. Swofford, D. L. 1993. PAUP: Phylogenetic Analysis Using Parsimony, Version 3.1 (Illinois Natural History Survey, Champaign, 1991).
197
Taylor, M. H., DiMichael, L., and Leach, G. J. 1977. Egg stranding in the life cycle of the mummichog, Fundulus heteroclitus. Copeia 397-399. Trueman, J. W. H. 1994. RANDOMISER program package, version 11/94. Wiley, E. O. 1986. A study of evolutionary relationships of Fundulus topminnows (Teleostei: Fundulidae). Am. Zool. 26:121-130.
CHAPTER 13
Interrelationships of Lamniform Sharks: Testing Phylogenetic Hypotheses with Sequence Data
GAVIN J. P. NAYLOR
Department of Biology, Osborn Memorial Laboratory, Yale University, New Haven, Connecticut 06520
ANDREW P. MARTIN
Biological Sciences, University of Nevada Las Vegas, Las Vegas, Nevada 89154
ERIK G. MATTISON and WESLEY M. BROWN
Department of Biology, University of Michigan, Ann Arbor, Michigan 48109
MOLECULAR SYSTEMATICS OF FISHES. Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved.
I. Introduction
Phylogenetic reconstruction involves estimating relationships from patterns of character-state covariation seen among taxa. The endeavor would be straightforward were each evolutionary lineage to acquire its own set of unique traits at birth and then pass them on immutably to all descendants. If this were the case, phylogenetic reconstruction would require no more than a search for the evolutionary tree that accounted for the distribution of traits as a perfectly nested set. Unfortunately, evolution is not so simple. Character-state changes occur with markedly different probabilities across both characters and taxa, traits frequently revert to previous conditions, lineages occasionally coalesce, and identical character states arise in multiple lineages by parallel or convergent evolution. In many cases these vagaries conspire to confound or bias inferences about evolutionary history. It is important, when estimating phylogeny, to explore the strengths and limitations of data in light of these potentially confounding influences.
In the past decade, much has been made of the power of molecular sequences for inferring evolutionary history (Avise, 1994). Arguments often promote a notion that molecular data are intrinsically better templates than are morphological or behavioral data for recording the tell-tale imprint of evolutionary history. The authors believe that DNA sequence data offer advantages for phylogenetic reconstruction, but do not subscribe to the view that they are intrinsically "better" than other forms of data. The advantages seen for DNA sequence data are: (1) A large number of potentially informative, heritable, and discrete characters can be obtained. This can be useful when the group under investigation is conservative, is characterized by a scarcity of good morphological characters, or has been subjected to repeated bouts of evolutionary parallelism of phenotypic characters. (2) Protein-encoding DNA sequence data can be broken down into different constraint categories based on a knowledge of the genetic code. First, second, and third codon positions can be recognized, as can two- and four-fold degenerate sites. These categories can be treated as distinct classes of data and analyzed separately. Morphological and behavioral traits cannot be broken down in this way. (3) Because distinct classes of data can be recognized, differences in the evolutionary dynamics among classes can be investigated. Observations in a given class can be pooled across a number of sites and substitutional changes contrasted. This capacity to pool observations across sites provides the sample size necessary to detect subtle patterns of character change that might otherwise go undetected if sites had to be analyzed individually (Collins et al., 1994). (4) Because a large number of characters with similar evolutionary dynamics can be assembled, it is possible to estimate relative branch lengths among lineages for a particular class of trait. This can provide a relative (albeit approximate) temporal scale to the pattern of inferred relationships. While the same might be attempted for morphological data, the diverse range of evolutionary dynamics spanned by a collection of morphological traits likely increases the error term of any estimate and thus diminishes reliability.
The advantages ascribed to molecular data are thus predominantly associated with the ability to describe their evolutionary dynamics (i.e., to specify a model of character-state change).
There are, however, serious shortcomings to DNA data. First, there are only four character states (G, A, T, and C). This limited number of states increases the likelihood of reversion caused by multiple substitutions at a site. Such reversion, termed "site saturation," is a general problem for any set of characters and depends on intrinsic rates of character-state change and the character-state space available. Higher rates and fewer states promote rapid saturation. Problems associated with saturation can sometimes be sidestepped by focusing on slowly evolving sites or on characters that contain a greater number of character states [e.g., focusing on the codon as a character rather than on single nucleotide sites (Goldman and Yang, 1994)]. Second, DNA sites are probably not independent of one another. Most phylogenetic inference methods assume character independence (although many appear quite resilient to violation of this assumption).
When sites are linked or otherwise not independent, undue weight can be assigned unknowingly to certain character-state distribution patterns at the expense of others. This has the potential to confound phylogenetic inference. Third, there is often considerable variation in the rates at which sites evolve, either across sites within a lineage (Gillespie, 1986a) or for homologous sites across lineages (e.g., Martin et al., 1992). In certain circumstances, rate variation can seriously compromise phylogenetic inferences (Huelsenbeck and Hillis, 1993). Fourth, the nucleotide bases G, A, T, and C are often found in unequal proportions. Such bias in base composition diminishes the character-state space available for recording evolutionary change and effectively lowers the saturation ceiling (De Salle et al., 1987). If bias is similar in different lineages, all lineages should be similarly affected. However, if bias differs across lineages (a condition known as "deviation from stationarity"), there is a tendency for phylogenetic methods to group taxa by the similarity of base composition, regardless of their historical relatedness (Lockhart et al., 1992, 1994). Finally, the various processes and constraints that act on DNA sequences can interact to increase the variance in the evolutionary rate seen for a given class of site, making the task of fitting a model of evolutionary change to the data particularly difficult. For example, redundancy of the genetic code renders third codon positions more free to vary than other codon positions. However, sites free to vary also accumulate compositional biases. The bias, in turn, restricts the amount of evolutionary change that can be recorded. Third position sites can thus range from appearing to be highly variable, fast-evolving sites that record a large number of evolutionary changes to highly constrained, slowly evolving sites that seldom record an event, depending on the evenness of base composition.
Phylogenetic inferences derived from molecular data should be critically evaluated in light of these shortcomings so that character-state covariations due to site saturation, rate variation, and base compositional effects are not mistaken for evolutionary signal.
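The partitioning by codon position and the check on base composition described above can be sketched in a few lines. This is a minimal illustration only, not the authors' procedure: the taxon names and sequences are hypothetical, the sequences are assumed to be in frame starting at the first base, and ambiguous bases and gaps are simply ignored.

```python
from collections import Counter


def codon_position_sites(seq, position):
    """Return the bases found at one codon position (1, 2, or 3).

    Assumes the sequence is in frame, i.e. its first base is a first
    codon position.
    """
    return seq[position - 1::3]


def base_composition(seq):
    """Return the proportion of A, C, G, and T among unambiguous sites."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return {base: 0.0 for base in "ACGT"}
    return {base: counts[base] / total for base in "ACGT"}


# Hypothetical in-frame sequences, for illustration only.
alignment = {
    "taxon_A": "ATGGCCTTAACC",
    "taxon_B": "ATGGCTTTGACT",
}

for name, seq in alignment.items():
    for position in (1, 2, 3):
        composition = base_composition(codon_position_sites(seq, position))
        summary = ", ".join(f"{b}={p:.2f}" for b, p in composition.items())
        print(f"{name} codon position {position}: {summary}")
```

Comparing the per-position profiles across taxa gives a rough first look at whether base composition is similar among lineages; a formal test of stationarity would require more than this tabulation.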
A. Assumptions
In phylogenetic inference, a model of evolutionary transformation between character states is applied to a distribution of character states for a group of taxa to yield a tree that best explains the data (Sober, 1988). Different tree-building algorithms invoke different models. Some models are very specific and restrictive (e.g., distance cluster analysis of corrected genetic distances among taxa). Others are more general and have fewer restrictions (e.g., cladistic parsimony). There is generally a trade-off: the more restrictive the model, the more explanatory power is reaped; however, as restrictions increase, so does the likelihood that assumptions of the model will be violated (Huelsenbeck and Hillis, 1993). Because of the heterogeneous nature of evolutionary change both among characters and among lineages, it is perhaps best to rely on models that can be (1) applied across different types of characters, and (2) modified to include restrictive assumptions when suggested by the data. Cladistic parsimony has been championed as a method that requires few restrictive assumptions (Farris, 1983) and, therefore, as a method that should be widely applicable across a broad range of evolutionary dynamics, such as those seen at the different positions of a codon. The authors subscribe to this view, but feel it important to outline the assumptions that are implicit in parsimony analyses. This is done to underscore that the conclusions are inferences contingent on data fitting the implied model.
Parsimony requires that homoplasy be randomly distributed among taxa. When this is the case, the true historical "signal" (if it is the most influential source of character-state covariance among taxa) should overshadow any "noise" due to homoplasy. In keeping with this view, it is assumed (1) that incorrect inferences are due to stochastic error associated with a small sample of characters, and (2) that erroneous inferences should disappear as more data are collected. This argument is appealing. However, highly structured, nonhistorical sources of character-state covariation among taxa can often dilute or eradicate the signal due to shared history. In some cases these can even swamp any phylogenetic signal with a positively misleading signal. For example, as previously alluded to, nucleotide base compositional frequencies can vary across taxa in such a way that distantly related organisms have more similar base compositions than do close relatives. In such situations, parsimony will be predisposed to incorrectly group the distantly related taxa together because of their base compositional similarity (Loomis and Smith, 1990; Penny et al., 1990; Sidow and Wilson, 1990, 1991; Lockhart et al., 1992; Hasegawa and Hashimoto, 1993; Steel et al., 1993). However, such nonrandom distributions of homoplasy do not necessarily preclude the effective use of parsimony. A careful inspection of data can help identify classes of characters that have the potential to be misleading. Problems so identified can sometimes be ameliorated by judicious use of differential weighting schemes (Hillis et al., 1994; Huelsenbeck et al., 1994). For example, in a data set of six taxa where the two most divergent forms share a base composition profile comprising 45% G, 45% C, 5% T, and 5% A, while the remainder share a profile of 5% G, 5% C, 45% T, and 45% A, parsimony will be predisposed to link the two most divergent forms together as sister taxa.
However, if the nucleotides are recoded as either purines or pyrimidines, all six taxa take on an unbiased 50:50 purine:pyrimidine base composition. In essence, the transformation of data amplifies the phylogenetic signal-to-noise ratio by bringing the data more into line with parsimony's requirement for the random distribution of homoplasy.
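The purine/pyrimidine recoding in this example can be checked with a short sketch. The two 100-base sequences below are hypothetical constructs that simply reproduce the stated base-composition profiles; the function names are illustrative and not taken from any phylogenetics package.

```python
# Purines (A, G) are recoded as R and pyrimidines (C, T) as Y, following the
# standard IUPAC ambiguity codes. Other symbols are left unchanged.
RY_TABLE = str.maketrans("AGCT", "RRYY")


def ry_recode(seq):
    """Recode a nucleotide sequence into purines (R) and pyrimidines (Y)."""
    return seq.upper().translate(RY_TABLE)


def purine_fraction(seq):
    """Return the fraction of recoded sites that are purines."""
    recoded = ry_recode(seq)
    purines = recoded.count("R")
    pyrimidines = recoded.count("Y")
    return purines / (purines + pyrimidines)


# Hypothetical sequences built to match the two profiles in the text.
gc_rich = "G" * 45 + "C" * 45 + "T" * 5 + "A" * 5   # 45% G, 45% C, 5% T, 5% A
at_rich = "G" * 5 + "C" * 5 + "T" * 45 + "A" * 45   # 5% G, 5% C, 45% T, 45% A

print(purine_fraction(gc_rich))  # purines = 45% G + 5% A
print(purine_fraction(at_rich))  # purines = 5% G + 45% A
```

Both profiles reduce to half purines and half pyrimidines because each pairs one abundant purine with one abundant pyrimidine. The recoding discards transitions, so it trades some signal for the removal of the compositional bias.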
B. Fossils and Phylogenies
Investigation of the sources of character-state covariation in molecular data sets is best accomplished for groups in which the phylogeny is known [e.g., bacteriophage (Hillis et al., 1992), mice (Sage et al., 1993), corn (Kellog and Birchler, 1993)] or can be corroborated by independent means (e.g., higher-order vertebrate classes). In most cases these "model groups" may not lead to predictive results that can be widely applied to different groups of organisms because they either focus on unusual genomes in contrived conditions (e.g., bacteriophage) or address issues that arise when highly divergent lineages are compared (e.g., vertebrate classes). Successes and pitfalls encountered in the analysis of higher taxa may have little relevance for analyses at lower taxonomic ranks because higher taxa generally differ in so many ways that it is impossible to attribute patterns of character-state covariation to any specific subset of biological influences. Ideally, character-state covariation in molecular data sets is best explored by describing patterns of molecular evolution for a group whose phylogeny can be corroborated by independent means and then expanding the taxonomic scope to include taxa that are biologically similar. The success of these studies often depends on information about the history of the group derived from the fossil record. Although first appearances of fossil lineages that lead to extant forms do not provide information about phylogeny, positive correlation between the age of lineages estimated from fossils and the age of lineages determined from phylogenetic analysis of DNA sequences can provide a gauge for the accuracy of phylogenetic inference. Although a lack of significant correlation between age and clade ranks may be indicative of a poor fossil record (Norell and Novacek, 1992) or of a grossly inaccurate reconstruction of phylogeny, a significant positive correlation between age and clade ranks most likely reflects correspondence between phylogenetic pattern and evolutionary history recorded in the paleontological record (Norell and Novacek, 1992).
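The comparison of age ranks with clade ranks can be illustrated with a rank correlation. The sketch below uses Spearman's coefficient as one reasonable choice; the text does not say which statistic was applied, and the clade ranks and first-appearance ages are invented for illustration.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) upward, averaging ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: clade rank 1 is the earliest-branching lineage, and
# first appearances are given in millions of years before present.
clade_rank = [1, 2, 3, 4, 5]
first_appearance_mya = [120, 100, 65, 70, 30]

# Negate the ages so that the oldest lineage receives age rank 1.
rho = spearman_rho(clade_rank, [-age for age in first_appearance_mya])
print(round(rho, 3))
```

A coefficient near 1 indicates that lineages inferred to branch early also tend to appear early in the fossil record. Judging significance would need a permutation test or tabulated critical values, which this sketch omits.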
Times of first appearance in the fossil record have been documented for diversifying lineages in a number of groups [e.g., bryozoans (Jackson and Cheetham, 1994); catfishes (Lundberg, 1992); sharks (Maisey, 1984; Cappetta, 1987)]. When the fossil record for such groups is dense and continuous, the first appearance times of different lineages can be used to calibrate rates of molecular evolution and to investigate rate heterogeneity within and among taxonomic groups (Martin et al., 1992). This can be important for testing alternative hypotheses of molecular evolution (Gillespie, 1986b; Kimura, 1983) and for establishing the phylogenetic utility of specific genes at various levels of taxonomic differentiation (Graybeal, 1994; Friedlander et al., 1994).
C. Sharks and the Order Lamniformes
The fossil record of sharks is dense, relatively continuous, and consists almost entirely of teeth (Maisey, 1984; Cappetta, 1987). Many of these teeth are distinctive enough to allow identification of the fossil lineages that gave rise to extant forms, making sharks a model group for the type of fossil-calibrated molecular systematics study described earlier. In general, sharks are a morphologically conservative group. Phylogenetic hypotheses based on morphology have been hampered by a scarcity of shared derived character states (Fechhelm and McEachran, 1984; Compagno, 1988). Molecular sequences may provide a much needed source of shared derived character information with which to infer phylogenetic hypotheses for different shark groups (e.g., Martin, 1993). One group of sharks that is particularly well represented in the fossil record is the order Lamniformes, which originated 124-140 million years ago (Maisey, 1984; Cappetta, 1987). Paleontological work indicates that the order was at its most diverse in the middle and late Cretaceous, but subsequently suffered repeated bouts of extinction. Extant lamniform sharks thus constitute a relictual assemblage of highly divergent lineages. The differentiation among contemporary species is reflected in their classification. There are 16 recognized species classified in 10 genera and seven families. Five of the genera and four of the families are monotypic. The order comprises the relatively well-known, endothermic superpredators (Lamnidae), i.e., the great white, the two makos, the porbeagle, and the salmon shark; the three species of thresher sharks (Alopiidae) with their extremely long caudal fins, which they use like whips to stun and kill schooling fishes (Compagno, 1984); the whale-like, filter-feeding basking shark (Cetorhinidae), which can attain lengths of up to 30 feet; the sluggish, benthic sand tiger sharks (Odontaspididae); the deep-water crocodile and goblin sharks (Pseudocarchariidae and Mitsukurinidae, respectively); and the recently discovered megamouth shark (Megachasmidae).
Attempts to estimate the evolutionary relationships among these extant taxa based on morphological characters have yielded conflicting hypotheses (Maisey, 1985, Fig. 1A; Compagno, 1990, Fig. 1B). In order to evaluate these alternate hypotheses and to investigate the correspondence between phylogenetic inference and paleontological information, the sequences of the mitochondrial protein-encoding cytochrome b and NADH 2 genes have been determined for all but 2 of the 16 extant lamniform species and the data have been subjected to phylogenetic analysis. Particular attention has been paid to issues that might confound covariation of character states due to shared ancestry. Results based solely on these sequence data suggest a new phylogenetic hypothesis for the order. When inferred branch lengths based on sequence data are contrasted with first appearance information from the fossil record, the new hypothesis shows a better fit to the fossil record than do the hypotheses of either Compagno (1990) or Maisey (1985).
II. Materials and Methods
Fresh tissue samples were obtained for all but 2 of the 16 lamniform species; tissues from Odontaspis noronhai and Carcharias tricuspidatus were not obtained. Where possible, multiple specimens were sequenced for each species. A list of specimens sequenced is presented in the Appendix with corresponding locality data. Sequences for the mitochondrial NADH 2 and cytochrome b genes were obtained. Both were amplified using the polymerase chain reaction (PCR), with a different protocol for each. The NADH 2 gene was amplified in two steps. The double-stranded DNA product was made by subjecting a total DNA preparation to 30 cycles of PCR amplification in a 100-µl reaction using Perkin-Elmer Taq polymerase and two universal NADH 2 primers (Kocher et al., 1995). The product was chloroform extracted, precipitated with ammonium acetate and ethanol, washed in a 70% ethanol/10 mM Tris, 1 µM EDTA (TE) solution, air dried, resuspended in TE, and run out on a low melting point gel. The band, visualized under low intensity ultraviolet light, was excised, purified (Gene-Clean; United States Biochemical), and stored in 80 µl of 0.1 TE. A single-stranded DNA product was then made by using the double-stranded product as the template in a second 20-cycle PCR reaction, to which only one of the two original amplification primers was added. The resulting single-stranded product was cleaned and concentrated using four flushes of 0.1 TE solution through ultrafree microconcentration tubes (Millipore Corporation), then sequenced using the Sequenase protocol (USB) employing dideoxy-NTP termination reactions (Sanger et al., 1977, 1980) in conjunction with a series of sequencing primers spaced at approximately 150-bp intervals along the fragment. The cytochrome b sequence was amplified in a 25-µl reaction using 12.5 pmol of the primers GluDG and Cb1211H (Palumbi et al., 1991), Perkin-Elmer buffer, 200 µM of each nucleotide, and 1 µl of Perkin-Elmer Taq polymerase.
An initial 30 amplification cycles were carried out at 94°C for 30 sec, 52°C for 15 sec, and 72°C for 60 sec. One microliter of the amplified product was used to seed a second, 150-µl reaction containing 75 pmol of the GluDG primer and 7.5 pmol of biotin-labeled Cb1211H. After another 35 cycles of amplification, the DNA was precipitated with 7.5 M ammonium acetate and 50% ethanol, pelleted by centrifugation at high speed for 10 min, washed once with ethanol, air dried, and resuspended in 40 µl of water. For each sample, 20 µl of Dynal streptavidin beads was washed with 50 µl of binding and washing (BW) buffer (4 M NaCl, 10 mM Tris, 1 mM EDTA, 0.1% Nonidet P-40). The beads were resuspended in 40 µl of BW buffer, combined with the DNA, and the solution was incubated for 1 hr with slow rotation at 45°C to allow the biotin-labeled DNA to bind to the streptavidin beads. The beads were then washed once with 50 µl of BW buffer, twice with 50 µl of sterile, distilled water, and resuspended in 12 µl of water. The sample was boiled for 15 sec and quickly put on the magnet to remove the beads from solution. The solution containing the nonbiotin-labeled DNA was collected, diluted with 28 µl of water, labeled, and stored at 4°C. Following heat denaturation, the beads were incubated at room temperature for 10 min in 0.1 N NaOH, washed twice with 50 µl of sterile, distilled water, and resuspended in 40 µl of water. Both strands were sequenced using a battery of primers and the Sequenase protocol (USB).
FIGURE 1. Hypotheses of lamniform relationships forwarded by (A) Maisey (1985) and (B) Compagno (1990).
In most cases, both genes were sequenced for each individual. However, there were instances where NADH 2 was sequenced from one individual whereas cytochrome b was sequenced from a different individual of the same species (see Appendix). In these cases, the genes from the different individuals were combined to represent that species. Such use could pose a problem if within-species polymorphism was so great as to render a species paraphyletic with respect to other taxa. However, in all cases in which multiple individuals were sequenced there was very little within-species sequence variation (
The number of variable sites observed in each structural category and the transition/transversion ratios for the three data sets are plotted in Fig. 3. Transition/transversion ratios were considerably higher in stems than in loops for both genes, presumably reflecting structural constraints (Vawter and Brown, 1993). Because the G-U pair has a stable conformation in stems, most transitions do not disrupt pairing (e.g., G-U may change to G-C or A-U by a single transition, without disrupting the stem structure). Strong selection for maintaining secondary structure was also suggested
FIGURE 3. Number of variable sites (top) and transition/transversion ratios (bottom) for changes reconstructed on the most parsimonious tree for each of three mitochondrial DNA data sets (12S and 16S): Serrasalminae, Characiformes, and Ostariophysi. Values for stem and loop regions for each gene fragment are shown separately.
GUILLERMO ORTI
and Hillis, 1993). For example, for the characiform data set, a total of 192 out of 243 substitutions in stems did not disrupt base pairing (of the 243, only 50 nondisrupting mutations are expected by chance alone). The percentage of all potential compensatory mutations observed was close to 70% for the three data sets, but as the level of divergence among sequences increased, a decrease in the frequency of single substitutions in stems was observed (i.e., a nucleotide changing in one strand only). For the serrasalmin data set 61% of all changes in stems were single substitutions, but this value dropped to 36% among ostariophysan sequences. Therefore, increasing sequence divergence was paralleled by a higher frequency of double substitutions in stems (i.e., a nucleotide change accompanied by a change in the corresponding base on the opposite strand), and by an increase in the proportion of transversions (Fig. 3). These observations suggest that although the fraction of mutations in stems resulting in compensatory changes is not affected by the level of divergence among sequences, the kinds of substitutions involved (single or double) are different.
3. Nucleotide Variation and Saturation
An increase in the number of variable sites and a drop in transition/transversion ratios were both correlated with increasing taxonomic diversity (Fig. 3). However, differences in these two parameters between the characiform and the ostariophysan data sets were relatively small compared to differences between the characiform and the serrasalmin data sets. The same pattern can be observed in a comparison of the number of changes per site (Fig. 4, right). The small differences between the characiform and the ostariophysan data sets in both of these parameters suggest that nucleotide changes may have reached saturation beyond the interfamilial level (especially in loop regions, where the reconstructed number of changes per site is about 2.5; Fig. 4). Among genes and structural categories, the 16S and 12S fragments were overall equally variable (when the number of variable sites was corrected for category size), but loop regions (in both genes) were more variable than stems. The pattern of relative frequency of change per category was not affected by the level of taxonomic diversity (Fig. 4, left).
FIGURE 4. Distribution of variation among structural categories in the 12S and 16S sequences. From top to bottom: changes among genera within the subfamily Serrasalminae (33 taxa), among families within the order Characiformes (27 taxa), and among orders within the superorder Ostariophysi (22 taxa). Histograms on the left show the relative frequency of change in the different genes and in stems and loops (number of changes observed in the category divided by total number of changes). Histograms on the right show the amount of change per site in each category (number of changes in category divided by category size). All changes were reconstructed on the most parsimonious trees with MacClade.
A dramatic view of why saturation at the nucleotide level might occur among sequence comparisons beyond the interfamilial level can be seen by identifying the variable sites in the secondary structural models (Figs. 5 and 6). Structural constraints limit variation to well-defined regions of the molecules (shown by lower-case letters in Figs. 5 and 6). These regions are already apparent even among closely related serrasalmin sequences, and only a few more sites are seen to vary when taxonomically more distant sequences are compared. Almost identical profiles in the sliding-window analyses of the characiform and ostariophysan data sets suggest that the variation recorded among characiform families might be close to the maximum "allowed" by structural constraints. Indeed, the most divergent characiform sequences were 21.3% different (Gnathocharax and Hoplias), nearly the same as the maximum sequence divergence observed between otophysan orders (21.9% between Crossostoma and Boulengerella), and only slightly lower than the maxi-
14. Radiation of Characiform Fishes
[Figure: secondary structural model of the 12S rRNA fragment; diagram not recoverable.]
[Figure caption, partial:] (neighbor-joining bootstrap values above branches, parsimony values below branches). African characiform taxa are shown in black boxes.
B. Evidence from Ependymin Sequences
1. Variation among Ependymin Sequences
About 600 bp of the cDNA sequence was obtained for 13 characiforms, two gymnotiforms, and four catfish taxa. The inferred amino acid sequences, aligned with published sequences from cyprinids, salmoniforms, and a herring, are shown in Fig. 13. Percentage sequence differences among species were large (Fig. 7), confirming previous observations that ependymin is a rapidly evolving gene (Müller-Schmid et al., 1993). Even cysteine residues are not fully conserved, but the pattern of variation seems to agree with major taxonomic divisions. For example, cysteine residues are found at position 20 for catfish taxa only and at positions 154 and 155 for most (but not all) catfish, gymnotids, and characiforms (Fig. 13). Length variation among sequences, comprising 1-8 amino acid residues, is most noteworthy in comparisons involving catfish taxa; Distichodus and Nannobrycon share a deletion at position 48. Conserved features in the sequences, including potential N-glycosylation sites and cysteine and tryptophan residues, are also shown in Fig. 13. The most conserved region is located around the potential N-glycosylation site at position 80. This site is presumably necessary for binding crucial oligo-
Data sets including all 25 taxa and only catfish, electric fish, and characiforms (19 taxa) were analyzed separately. Figure 14 shows parsimony trees obtained with these data sets using different weighting strategies. For 25 taxa, a total of 588 bp were aligned of which 442 were variable and 359 phylogenetically informative. When third codon positions were excluded, only 258 sites were variable and 193 were phylogenetically informative. A herring (Clupea harengus) was used as the outgroup. Currently accepted relationships among these orders, based on morphology, are shown in Fig. 2E. Most parsimonious trees obtained by excluding transitions in third codon positions and by eliminating third positions completely were mostly congruent with each other, but differed somewhat from the ones using all characters with equal weight (Fig. 14A and B). The most basal branches on the trees resulting from all weighting strategies are congruent and receive very high bootstrap support. Protacanthopterygii (Esox + Salmo), Otophysi (cyprinids + characiforms + siluriforms + gymnotiforms), Characiphysi (characiforms + siluriforms + gymnotiforms), Gymnotiformes, and Cyprinidae are all strongly supported. In contrast, characiform monophyly is not well supported as the African Distichodus either tends to branch off before electric fish (Fig. 14B) or forms a trichotomy with siluriforms and gymnotiforms (Fig. 14A). Although the topology within characiforms is congruent for trees A and B, it receives very low bootstrap support, except for the grouping of Alestes + Phenacogrammus (African subfamily Alestinae, family Characidae) and Gymnocorymbus + Paracheirodon (Neotropical family Characidae). Note that Chalceus and Metynnis, traditionally included in the family Characidae, come out in separate branches while Gasteropelecus (family Gasteropelecidae) groups with Gymnocorymbus + Paracheirodon. 
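The counts of variable and phylogenetically informative sites quoted above follow from a simple column-by-column rule: a site is variable if it shows more than one state, and parsimony-informative if at least two states each occur in at least two taxa. The following is a minimal sketch of that rule in Python, not the procedure actually used for the ependymin data; the function name `classify_sites` and the toy alignment are hypothetical, and gaps or ambiguity codes are simply ignored.

```python
from collections import Counter


def classify_sites(alignment, gap_chars="-?N"):
    """Return (n_variable, n_informative) for a list of aligned sequences.

    Hypothetical helper for illustration: gap and ambiguity characters
    are skipped rather than treated as character states.
    """
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")

    n_variable = 0
    n_informative = 0
    for column in zip(*alignment):
        counts = Counter(c for c in column if c not in gap_chars)
        # Variable: more than one state observed at this site.
        if len(counts) > 1:
            n_variable += 1
        # Informative: at least two states, each shared by two or more taxa.
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            n_informative += 1
    return n_variable, n_informative


# Toy alignment (invented data): sites 2 and 4 are informative,
# sites 5 and 6 are variable but carry only singleton changes.
toy = [
    "ACGTAC",
    "ACGTAT",
    "ATGCAC",
    "ATGCGC",
]
print(classify_sites(toy))
```

Applying the same function to the alignment with third codon positions removed would give the reduced counts, since only the column set changes.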
FIGURE 13 [Alignment of inferred ependymin amino acid sequences; the graphic and its caption are not recoverable.]

FIGURE 14 Parsimony trees from ependymin cDNA sequences. (A) Strict consensus tree obtained using all taxa and all characters with equal weight. (B) Strict consensus tree using all taxa and excluding transitions in third codon positions. For A and B, boldface type, thicker branches, and a solid bar identify characiform taxa, whereas boxes with horizontal lines, cross-hatched, and open identify gymnotiform, siluriform, and cyprinid taxa, respectively. (C) Shortest tree using first and second codon positions only. (D) Strict consensus tree excluding transitions in third codon positions. For C and D, characiform taxa in boldface type belong to families other than the Characidae; branch lengths are proportional to the number of changes (scale corresponding to five changes is shown). For all trees, bootstrap values are shown above the branches only when those branches were recovered in the bootstrap majority-rule consensus tree. L, tree length; CI, consistency index (excluding uninformative characters); and RI, retention index. African taxa are enclosed in black boxes. Panel labels legible in the figure: (A) all positions, equal weights, 2 trees, L = 1542, CI = 0.53, RI = 0.59; (B) all positions, no transitions in third positions, 2 trees, L = 1029, CI = 0.63, RI = 0.69; (C) first and second positions only, 1 tree, L = 377, CI = 0.69, RI = 0.68; (D) all positions, no transitions in third positions, 2 trees, L = 581, CI = 0.69, RI = 0.68.

The most important difference between the trees is the relationship among electric fish, catfish, and characiforms. Whereas tree A (Fig. 14) groups catfish with electric fish (the "traditional" hypothesis also shown in Fig. 2), tree B suggests a closer relationship between electric fish and characiforms. Both of these topologies are stable to a posteriori reweighting (Farris, 1969; Carpenter, 1988). For example, the tree obtained when this procedure is applied to the data set with all characters equally weighted shows Distichodus branching off before Siluriformes + Gymnotiformes and is equal to one of the shortest trees. Of these two alternative hypotheses, tree A (Fig. 14) is less well resolved, has lower bootstrap values, and a lower consistency index (CI) than tree B as a likely consequence of considering "noisy" third codon positions. Furthermore, forcing the topology shown in Fig. 14B on the data set with all characters equally weighted required only four additional steps (L = 1546), in contrast to eight additional steps required by the topology shown in Fig. 14A (L = 1037) on data excluding transitions in third positions. Excluding the fast-evolving third codon positions also results in higher bootstrap support for grouping the electric fish with characiforms (Fig. 14B) instead of with catfish (Fig. 14A). An alternative approach to test for how well particular clades are supported by data is by inspection of suboptimal trees ("decay analysis or Bremer support," Bremer, 1988), counting how many extra steps are required to collapse the clade of interest. For the clade grouping electric fish with catfish (Fig. 14A) two extra steps are required (with all characters, equal weights), whereas for the clade grouping electric fish with characiforms (Fig. 14B) three additional steps are required to break the group up (with no transitions in third positions). Although no statistical value can be attached to these decay indices, they also suggest that the grouping of electric fish with characiforms receives slightly better support than its alternative. Neighbor-joining analyses, with or without third codon positions included, always grouped electric fish with characiforms. Bootstrap support (500 pseudoreplicates) was very high for Protacanthopterygii, Otophysi, Cyprinidae, Gymnotiformes, and Siluriformes (values > 90) when all positions were included in the analysis. The main difference between trees including or excluding third codon positions was the placement of cyprinids and of Distichodus.
When all positions were considered, characiform monophyly was supported with a bootstrap value of 63, and electric fish and characiforms were grouped together with a bootstrap value of 42. When third positions were excluded, Distichodus grouped with electric fish and this clade grouped with characiforms, supported by bootstrap values of 29 and 67, respectively. Excluding third positions also had the effect of placing cyprinids as the sister group of characiforms + electric fish, to the exclusion of catfish. Protein Poisson-corrected distances and Kimura (1981) distances excluding third positions resulted in the same topology. Relationships among characiform lineages were poorly supported in the neighbor-joining trees, but agreed with parsimony analyses in placing Distichodus at the base of characiforms and in grouping Alestes + Phenacogrammus and Paracheirodon + Gymnocorymbus + Gasteropelecus with high bootstrap support.
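The Poisson-corrected protein distance mentioned above can be illustrated briefly. Assuming the standard form of the correction, d = -ln(1 - p), where p is the proportion of aligned amino acid positions that differ, a minimal Python sketch looks like this. The function names and the two short peptide strings are hypothetical and are not taken from the ependymin alignment; positions with a gap in either sequence are skipped, which is one common convention among several.

```python
import math


def p_distance(seq_a, seq_b, gap_chars="-?X"):
    """Proportion of differing positions, skipping gapped/unknown sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


def poisson_distance(p):
    """Poisson-corrected distance, d = -ln(1 - p), for 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return -math.log(1.0 - p)


# Invented peptides for illustration only.
p = p_distance("MKTLLV", "MKSLLI")
print(round(p, 3), round(poisson_distance(p), 3))
```

The correction grows faster than p itself, so the more divergent a pair of sequences is, the larger the share of the estimated distance that comes from inferred multiple substitutions.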
Maximum likelihood analysis was used to compare alternative hypotheses. The rate of change at each codon position was estimated by counting the number of changes reconstructed over the shortest tree (tree B in Fig. 14) using the program MacClade. These values were 373, 270, and 860 for first, second, and third positions, respectively. They were used as auxiliary information with the input to the fastDNAml program to activate the "categories and rates" option (Olsen et al., 1994). Five runs of the program using the jumble input option (27,249 trees examined) resulted in the same best tree every time (identical to tree B in Fig. 14), with a log likelihood of -6906.79. The alternative topology (Fig. 14A) had a log likelihood of -6929.36. The same best tree (Fig. 14B) was obtained in 3 out of 10 "jumbled" runs of fastDNAml with only first and second positions in the data set. To evaluate the extent to which the best tree is significantly better than its alternatives, the standard errors (SE) of the differences between log likelihoods (Δl_i; Kishino and Hasegawa, 1989) were computed using the program NUCML 2.2 (Adachi and Hasegawa, 1994; Hasegawa et al., 1985) for trees A and B (and alternative topologies, not shown), using data sets including either all positions or only first and second positions (NUCML does not allow rate categories in the input). The differences in log likelihood between trees are not statistically significant because all upper bounds of the 95% confidence intervals are greater than zero. According to Kishino and Hasegawa (1989), this means that none of the best trees is significantly better than the alternative hypotheses. However, the data set including only first and second codon positions provides somewhat better resolution among alternative trees than the one including all positions. First and second codon positions seem to be less "noisy" over the whole data set.
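The significance criterion described above can be sketched as a confidence-interval check: a log-likelihood difference is treated as significant only if the interval around it excludes zero. The Python sketch below uses the differences and standard errors reported in this section for the comparison of tree A with tree B. The multiplier of 1.96 (a normal approximation to the 95% interval) and the function names are assumptions for illustration; the text does not state how NUCML derives its interval.

```python
def kh_interval(delta_lnl, se, z=1.96):
    """Approximate confidence interval for a log-likelihood difference.

    z = 1.96 is an assumed normal-approximation multiplier for 95%.
    """
    if se < 0:
        raise ValueError("standard error must be non-negative")
    return delta_lnl - z * se, delta_lnl + z * se


def is_significant(delta_lnl, se, z=1.96):
    """True only when the interval around the difference excludes zero."""
    low, high = kh_interval(delta_lnl, se, z)
    return not (low <= 0.0 <= high)


# Differences and SEs reported for tree A versus tree B.
comparisons = {
    "all positions": (-6.8, 9.0),
    "first and second positions": (-10.1, 7.2),
}
for label, (delta, se) in comparisons.items():
    low, high = kh_interval(delta, se)
    print(f"{label}: [{low:.2f}, {high:.2f}] significant={is_significant(delta, se)}")
```

Under this assumption both intervals straddle zero, and the interval for first and second positions has the upper bound closer to zero, which matches the statement that this data set comes closer to discriminating between the trees.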
For the comparison between tree A and tree B, (Δl ± SE) is -6.8 ± 9.0 (all data) and -10.1 ± 7.2 (first and second only), the SE being larger than the difference in the first case, smaller and closer to being significant in the second case (even though it used only two-thirds of all sites). Using protein sequences, the best tree from maximum likelihood analysis (PROTML 2.2, Adachi and Hasegawa, 1994) is the tree shown in Fig. 14B, but differences between the log likelihood of this tree and alternative topologies were not statistically significant, according to the test of Kishino and Hasegawa (1989). Although maximum likelihood analyses also favor the grouping of electric fish with characiforms, more data are obviously necessary to determine with confidence the best phylogenetic hypothesis. In order to test for the effect of the choice of taxa (see Lecointre et al., 1993) on the resolution of characiform relationships, the more distant taxa were excluded from the analysis and only catfish and electric fish
were used as outgroups. Although different results were obtained for different character weighting and reconstruction methods used, some elements were common to all results. The basal position of Distichodus and the grouping of Alestes and Phenacogrammus (Alestinae) and of Paracheirodon, Gymnocorymbus, and Gasteropelecus were found in all trees obtained and were supported by relatively high bootstrap values (Fig. 14C and D). These relationships were stable to outgroup choice because they were also retrieved when all 25 taxa were used (Fig. 14A and B). The position of Chalceus and Metynnis remained uncertain, but they never grouped together with the other taxa in the Characidae. A close relationship between Leporinus and Hemiodus, only weakly suggested in trees A and B (Fig. 14), seems to receive better support with a closer outgroup and downweighting third codon positions (trees C and D, Fig. 14). The major discrepancy among trees A-D involves the position of Hoplias and Boulengerella. When third codon positions (or only third position transitions) were excluded from the analysis, these taxa were no longer placed with Alestes + Phenacogrammus as a derived group within the Characiformes, but rather branched off next after Distichodus, at the base of the characiform clade. The same pattern is observed when amino acid sequences are used for parsimony analysis. Although no firm set of relationships can be established among characiform lineages other than those mentioned earlier, the monophyly of Neotropical taxa seems a very unlikely hypothesis. Under all alternative weighting strategies, Distichodus comes out as the sister group of all other characiforms, and the Alestinae always groups among the Neotropical taxa. Forcing monophyly of Neotropical taxa results in 7, 8, and 10 extra steps when all characters were equally weighted, when transitions in third positions were excluded, and when third positions were excluded from the analysis, respectively.
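The "extra steps" quoted above are differences in parsimony tree length between the shortest tree and a tree forced to contain a particular group. As a rough illustration of how such lengths are counted, the Python sketch below applies Fitch's (unordered, equal-weight) parsimony count to two small rooted topologies. It is a toy under stated assumptions: the taxon labels (`O`, `A1`, `A2`, `N1`, `N2`), the five-site alignment, and both topologies are invented, and the sketch does no tree searching of the kind a real analysis requires.

```python
def fitch_site_length(tree, states):
    """Return (state_set, steps) for one character on a rooted binary tree.

    A tree is either a taxon name (str) or a 2-tuple of subtrees.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch_site_length(left, states)
    right_set, right_steps = fitch_site_length(right, states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no change needed here.
        return common, left_steps + right_steps
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, alignment):
    """Sum Fitch step counts over all sites of an aligned data set."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        states = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site_length(tree, states)[1]
    return total


# Invented data: O = outgroup, A* = "African", N* = "Neotropical" taxa.
toy = {
    "O": "AAACG",
    "A1": "AAACG",
    "A2": "GGGCG",
    "N1": "GGGTG",
    "N2": "AAATG",
}
unconstrained = ("O", ("A1", (("A2", "N1"), "N2")))
constrained = ("O", ("A1", ("A2", ("N1", "N2"))))  # forces N1 + N2 together

extra_steps = tree_length(constrained, toy) - tree_length(unconstrained, toy)
print(tree_length(unconstrained, toy), tree_length(constrained, toy), extra_steps)
```

In this toy case three sites support the grouping A2 + N1 and one supports N1 + N2, so forcing the latter group costs two additional steps. The same subtraction, carried out on the shortest trees found with and without a constraint, yields figures of the kind reported in the text.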
Mitochondrial DNA sequence evidence (see earlier discussion) also suggests that the African and Neotropical lineages do not form reciprocally monophyletic groups. Neighbor-joining analysis of the 19 taxon data set (with catfish as the outgroup) always resulted in a monophyletic Characiformes with Distichodus branching out at the base. As in parsimony analysis, by excluding third codon positions (or using protein distances) the placement of Hoplias and Boulengerella in the tree changed from being close to the Alestinae to a more basal position in the characiform clade. The grouping of Leporinus and Hemiodus was also supported, but the monophyly of neither Characidae nor characiforms was supported by neighbor-joining bootstrap analyses. The topology of the best tree from fastDNAml (with the categories and rates options) is the same as that shown in Fig. 14D.
C. Systematic and Biogeographic Implications

1. Sequence Variation and the Limits of Phylogenetic Resolution

Comparisons of 12S and 16S sequences among characiform families showed a slightly lower level of mean sequence divergence (14.9%) than comparisons among orders of otophysans (17.3%) (see Fig. 7). Assuming rate constancy across all lineages, this observation could be taken as evidence for dating the origination of the major lineages of Characiformes very close to the origin of the otophysan orders (cypriniforms, catfishes, electric fishes). Alternatively, similar values of sequence divergence among lineages may reflect saturation at the DNA level, given the structural constraints on sequence variation discussed earlier. As pointed out, transition/transversion ratios (Fig. 3), the amount of change per site in different data sets (Fig. 4), and sliding window analyses of variation (Figs. 5 and 6) all indicate that beyond the family level, multiple changes per site are to be expected in the 12S and 16S DNA sequences. Furthermore, even though average divergence between gonorhynchiforms and otophysans (21.1%) suggests that divergence values among otophysans (17.3%) might be close to but have not yet reached complete saturation, maximum divergence values among characiform families, otophysan, and ostariophysan orders were essentially all the same (21.3, 21.9, and 24%, respectively; Fig. 7), indicating that, indeed, saturation is a problem beyond the family level. Comparison of ependymin DNA and amino acid sequence divergences (Fig. 7) clearly shows that the mitochondrial rRNA genes have reached saturation. For ependymin, amino acid sequence divergence between Distichodus and the other characiforms (close to 22%) was slightly smaller than divergence between characiforms and electric fish (25%) and than between characiforms and cyprinids (27%).
But ependymin amino acid sequence divergence between characiforms and catfishes and between cyprinids and electric fishes was above 34%. Furthermore, distances among characiform taxa other than Distichodus were lower than 15%. In the 12S and 16S sequences no such difference in sequence divergence among characiform taxa including or excluding the distichodontid-citharinid lineage was found.
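The saturation argument rests on the fact that observed percentage divergence undercounts change once sites begin to change more than once. One way to see the size of this effect is the Jukes-Cantor (JC69) correction, d = -(3/4) ln(1 - 4p/3). The Python sketch below applies it to the 12S and 16S divergence values quoted in this section. This is an illustration only: JC69 was not used in the analyses described here, it assumes that all sites are free to vary at equal rates, and the chapter argues that structural constraints restrict variation to a subset of sites, so the true amount of hidden change is probably larger than these figures suggest. The function and dictionary names are hypothetical.

```python
import math


def jc69_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion p.

    Defined only for 0 <= p < 0.75, the expected divergence between
    random sequences under this model.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the JC69 correction")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


# Observed 12S + 16S divergences quoted in the text (as proportions).
observed = {
    "mean, characiform families": 0.149,
    "mean, otophysan orders": 0.173,
    "mean, gonorhynchiforms vs otophysans": 0.211,
    "max, characiform families": 0.213,
    "max, otophysan orders": 0.219,
    "max, ostariophysan orders": 0.24,
}
for label, p in observed.items():
    d = jc69_distance(p)
    # d - p estimates the substitutions per site hidden by multiple hits.
    print(f"{label}: observed {p:.3f}, corrected {d:.3f}, hidden {d - p:.3f}")
```

Even under this simple model the gap between corrected and observed distance widens as p increases, which is why similar maximum divergences at very different taxonomic levels are read in the text as a sign of saturation rather than of simultaneous origin.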
2. Relationships among Orders

12S and 16S data did not contain appropriate information to establish relationships at this level (Fig. 12). But, given that ependymin sequences show nonsaturating levels of divergence even among the most divergent taxa, can we expect well-supported phylogenies for high-order relationships? One of the most significant results obtained from the phylogenetic analysis of ependymin is the highly supported sister group relationship of Esox and Salmo (Fig. 14), corroborating, in part, the notion of Protacanthopterygii (sensu Rosen 1973, 1974) also adopted by Nelson (1994, see Fig. 2E). Although this result was previously reported by Müller-Schmid et al. (1993), its implication for lower euteleostean systematics remained unnoticed. The superorder Protacanthopterygii, containing a diverse assemblage of basal "Division III" fishes, was advanced in the seminal paper by Greenwood et al. (1966), but shortly after its inception all groups except Salmoniformes were removed (Rosen, 1973). The monophyly of Salmoniformes, which included Esocoidei (pikes, mudminnows, and Lepidogalaxias), Argentinoidei plus Osmeroidei (smelts and their relatives), and Salmonoidei (salmonids), was proposed based on gill arch anatomy (Rosen, 1974). But esocoids were later removed from the Salmoniformes and were regarded as the primitive sister group of euteleosts (Fink and Weitzman, 1982; Lauder and Liem, 1983; Fink, 1984). Salmoniformes became coextensive with Salmonidae, and much controversy clouded the relationships among salmonids, pikes, and the other euteleosts (for a review see Fink, 1984; Begle, 1991, 1992; Nelson, 1994). Morphological analyses have been complicated because a high proportion of characters show evolutionary losses and reductions or mosaic evolution, or exhibit a primitive condition for the euteleosts (Begle, 1992; Nelson, 1994). Ependymin DNA sequences have established the first molecular evidence for the monophyly of a group containing salmonids and esociforms, and hold great promise for the resolution of higher order relationships of fishes (Fig. 2E). The sister group relationship of electric fish (Gymnotiformes) and Characiformes suggested by ependymin sequences (Fig.
14B) constitutes a significant departure from the currently accepted hypothesis of otophysan relationships (Fig. 2E; Fink and Fink, 1981), but had been considered the "traditional" hypothesis before 1981 (e.g., Regan, 1922; Weitzman, 1962; Greenwood et al., 1966; Rosen and Greenwood, 1970). Gymnotiforms were then thought to be highly modified characins, albeit only based on circumstantial evidence (e.g., Mago-Leccia and Zaret, 1978). The first explicit cladistic analysis of morphological characters published by Fink and Fink (1981) proposed 20 synapomorphies for the clade formed by catfish + electric fish. More recently, Dimmick and Larson (1996) presented molecular data (1200 bp of mitochondrial DNA sequences encompassing most of the 12S and 16S genes and the intervening valine tRNA gene, and 1200 bp from the small and large subunit nuclear-encoded rRNA genes) that support the alternative hypothesis suggested by ependymin sequences. Analyzed separately and combined, nuclear and mitochondrial sequence data independently support the grouping of Gymnotiformes and Characiformes (Dimmick and Larson, 1996). In agreement with the morphological evidence (Fink and Fink, 1981), ependymin (and the nuclear and mitochondrial sequences of Dimmick and Larson) support the basal position of cypriniforms among otophysan lineages (Figs. 12 and 14A and B).

3. Relationships among Characiform Families
Whether saturation plagues the 12S and 16S data sets at the family level is less apparent, but it might be suggested by the differences in sequence divergence discussed earlier. Low consistency indices of the phylogenetic trees obtained for the different data sets indicate a high degree of homoplasy at every level. For example, the consistency index was 0.50, 0.34, and 0.42 for the serrasalmin (33 taxa), characiform (27 taxa), and ostariophysan (22 taxa) data sets, respectively. Mindell and Honeycutt (1990) and Hillis and Dixon (1991) suggested that mitochondrial ribosomal genes could resolve phylogenetic relationships among taxa that had diverged as long as 300 or 65 million years ago, respectively. The oldest unequivocal gonorhynchiform fossils date from the early Cretaceous (Patterson, 1975, 1984), and the earliest otophysan fossils are late Cretaceous catfishes and characiforms (reviewed by Lundberg, 1993, 1996). This suggests that the otophysan stem group had originated before the separation of Africa and South America (Lundberg, 1993), dated at 84-106 million years ago (Pitman et al., 1993; Parrish, 1993). Fossils do not provide detailed evidence on the sequence of origins of the main otophysan and characiform lineages, but suggest a window of application for the 12S and 16S molecular markers closer to 100 than to 300 million years. Given these limitations of the ribosomal DNA sequences for comparisons among characiform families, only a few hypotheses of relationships among Characiformes could be established with confidence. These were the clades numbered 1-12 (Figs. 9-11), of which only three propose interfamilial (or subfamilial) sister group relationships, in addition to the citharinid-distichodontid clade already discussed. A close relationship of Prochilodontidae and Curimatidae was proposed by Vari (1983) and Buckup (1991) and was supported by molecular data (Fig. 2C and component 10, Fig. 11).
Within the Characidae, the systematic position of Oligosarcus (subfamily Acestrorhynchinae) close to Astyanax (subfamily Tetragonopterinae) and Poptella (subfamily Stethaprioninae) was strongly supported by molecular data (component 5, Figs. 9-11), but a close relationship of Astyanax with Tetragonopterus, both tetragonopterines, was not supported. Oligosarcus was traditionally placed with Acestrorhynchus,
but Buckup (1991), Lucena (1993), and P. Petry (personal communication) found evidence for a closer relationship of Oligosarcus with tetragonopterines (Fig. 2C) than with Acestrorhynchus. Lucena (1993) proposed a close relationship of Poptella with Tetragonopterus, but not with Astyanax (Fig. 2A). The third component supported by molecular data is formed by Hepsetus and Hoplias (number 2, Figs. 9-11), members of African and South American families Hepsetidae and Erythrinidae, respectively. Its relevance for biogeography and systematics of characiform fishes is discussed later. Ependymin sequences also failed to provide robust phylogeny estimates for characiform families (Fig. 14A-D). However, the position of Distichodus as a primitive taxon among characiforms is well established (Fig. 14), corroborating the mitochondrial DNA results (Fig. 12) and previous morphological evidence (Fink and Fink, 1981; Buckup, 1991). Distichodus forms part of a well-defined monophyletic lineage of African characiforms composed of the families Distichodontidae and Citharinidae (Vari, 1979). Among the South American Characidae, a close relationship between Paracheirodon ("neon tetra," subfamily Cheirodontinae) and Gymnocorymbus ("black tetra," subfamily Tetragonopterinae) is strongly suggested by ependymin (Fig. 14). Tetragonopterines and cheirodontines were also suggested by Lucena (1993) to be closely related (Fig. 2A). The genera Metynnis ("silver dollar," subfamily Serrasalminae) and Chalceus (subfamily Bryconinae), usually included in the Characidae, are not shown here to form a monophyletic group with the other characids (Fig. 14). The placement of serrasalmins (represented by Metynnis, Colossoma, and Pygocentrus in various trees) among the other putative characid taxa remained equivocal (Figs. 9-12 and 14).
In an extensive survey of morphological characters, Machado-Allison (1983) presented convincing evidence for monophyly of the subfamily Serrasalminae but also failed to find the sister group of this unit among characids. More recently, Lucena (1993) proposed a monophyletic group including (in addition to other taxa) serrasalmins, Chalceus, Brycon, and Alestinae (Fig. 2A). Gasteropelecus (family Gasteropelecidae) is shown here to have a close relationship with Gymnocorymbus + Paracheirodon to the exclusion of Chalceus and Metynnis, based on ependymin (Fig. 14). Based on 12S and 16S sequences, gasteropelecids come out as the sister group of a clade containing anostomids, Chilodus and Characidium, in a clade which also includes Raphiodon and Apareiodon (Fig. 11) or of Boulengerella in the most inclusive ostariophysan data set (Fig. 12). The selection of taxa clearly has a major impact on inferences about the phylogenetic position of gasteropelecids. This effect was illustrated by Lecointre et al. (1993) using a gnathostome 28S rRNA data set. The gasteropelecids were considered a subfamily of the family Characidae (Weitzman, 1960) but were later elevated to the rank of family by Greenwood et al. (1966). The suggestion that the family Characidae (sensu Greenwood et al., 1966) will undergo major taxonomic changes as phylogenetic relationships among the major lineages are established has been mentioned repeatedly (e.g., Weitzman and Fink, 1983; Buckup, 1991; Lucena, 1993) and seems to be supported by molecular data discussed herein.
4. African-South American Relationships and Biogeography

A close relationship of Distichodus + Citharinus with the African subfamily Alestinae is not supported by ependymin, mitochondrial DNA sequences, or morphological evidence (Buckup, 1991). Hypotheses of the monophyly of Neotropical taxa were rejected by the mitochondrial DNA sequences (see earlier discussion). Therefore, at least three levels of Afro-South American sister group relationship have been suggested (Fig. 11, arrows 1-3; Fig. 14): (1) between the distichodontids (plus citharinids) and the rest of the characiforms (discussed earlier), (2) between Hoplias and Hepsetus, and (3) between the alestins and a group of undetermined South American characiforms. The sister group relationship of the African pike-characiform Hepsetus and the Neotropical family Erythrinidae, genus Hoplias (Figs. 9-11), was also suggested by Uj (1990). Although this hypothesis seems well supported by molecular data (but see Fig. 10), ctenolucids and erythrinids (both Neotropical groups) or ctenolucids alone were proposed as the sister group of Hepsetus, based on morphology (Fig. 2; Buckup, 1991; Lucena, 1993; Vari, 1995). The third clade with a trans-Atlantic sister group relationship included the African subfamily Alestinae and some Neotropical lineages (mitochondrial DNA data suggest Acestrorhynchus to be the closest Neotropical taxon to alestins, see Figs. 9-11). However, relationships of Alestinae and Acestrorhynchus with Neotropical characids are controversial (Fig. 2), and no agreement may be reached regarding the systematic position of these two groups based on morphology (Uj, 1990; Buckup, 1991; Lucena, 1993) and molecular data. Mean percentage sequence divergences (12S and 16S genes) between the African taxa and their corresponding Neotropical sister group were 16.2% for Distichodus + Citharinus, 11.2% for Hepsetus, and 15.1% for the Alestinae, respectively.
Divergence between Hepsetus and ctenolucids (putative sister groups according to morphological studies) was 16.6%. These values are within the same range of divergence values recorded among the other families of Characiformes (and below the 21-24% saturation value shown in Fig. 7), suggesting that most lineages (families) of characiform fishes had originated before the vicariant event separating African and Neotropical taxa, approximately 100 million years ago. If Characiformes experienced a rapid evolutionary radiation, comparable to that of cichlid fishes in East African lakes (e.g., Greenwood, 1984; Meyer, 1993), but 100 million years ago, resolution of phylogenetic relationships among the major lineages is not expected to be easily obtained. Poor resolution of relationships among characiform taxa using phylogenetic analyses of ependymin and mitochondrial DNA sequences and conflicting phylogenetic hypotheses from morphological data seem to agree with this prediction. Analyzing the phylogenetic hypothesis of Buckup (1991) in a biogeographic context, Lundberg (1993) also arrived at the conclusion that the major groups of characiforms had originated before the African-South American vicariant event (although the proposed African-South American sister group relationships differed). He then raised the important question of why most of the characiform groups now endemic to the Neotropics do not have close relatives in the African fauna. Assuming a strict vicariant view and no dispersals of characiforms across the widening Atlantic ocean, the present biogeographic distribution implies a remarkably high rate of extinction among African characiforms (Lundberg, 1993). For example, if the cladogram shown in Fig. 11 is taken at face value, then all six lineages enclosed in boxes and indicated by a cross must have gone extinct in Africa after the continental break. Although the fossil record of Characiformes is not very informative to test this hypothesis, intriguing fossils described by Greenwood and Howes (1975) and Stewart (1994) merit discussion. These are teeth and skulls of Miocene to lower Pleistocene age that were assigned to now extinct characiform fishes (Sindacharax lepersonnei and S. deserti), apparently widespread in northern and eastern Africa.
They show greater similarity with the teeth of modern serrasalmins like Colossoma and Piaractus than with any African characiform fish (Greenwood and Howes, 1975; Stewart, 1994). Serrasalmins form a well-supported monophyletic taxon endemic to South America (Machado-Allison, 1982; Fig. 8, and clade number 7, Figs. 9-11) that includes herbivorous forms like Colossoma and Piaractus, considered the primitive sister group to the more derived predatory piranhas (e.g., Pygocentrus; Fig. 8). The systematic position of serrasalmins within Characiformes could not be resolved with confidence in the present study (Figs. 9-11, and 14), but no close relationship of serrasalmins with other Neotropical characids was suggested. South American serrasalmin fossils indicate that forms similar to Colossoma had differentiated by at least 13 million years ago (Lundberg et al., 1986; Lundberg, 1996). Considering that serrasalmins are exclusively freshwater fishes, if Sindacharax really belongs to the serrasalmin clade, the origin of serrasalmins would have to be unequivocally placed before the African-South American continental split (84 million years ago), in agreement with conclusions from DNA sequence divergences discussed earlier. Sindacharax would also provide an example of extirpation in Africa of one trans-South Atlantic clade (Lundberg, 1993). Fossil serrasalmins from Miocene Amazonian-Orinocoan faunas discovered in the present Magdalena River basin in Colombia provide a good example for extirpation of a clade from a formerly diverse fauna (Lundberg et al., 1986). The depauperate fauna of the present Magdalena River does not include Colossoma and piranha species, and local extinction due to tectonism and climatic changes during the Cenozoic was suggested to explain the loss of diversity (Lundberg et al., 1986; Lundberg and Chernoff, 1992). Similar geological and climatic processes might have affected a previously characiform-rich African fauna and may be invoked to explain why only three lineages of characiforms are found there at present. Paleocene tectonic movements of the African plate and post-Miocene aridification affected the African continent more severely than South America and might have caused the well-known paucity of the tropical African flora (Goldblatt, 1993). Two alternative hypotheses are also plausible. Extinction of characiform lineages in Africa could also have resulted from competition with other fish groups that invaded that continent after the Gondwanian fracture. For example, knerids, notopterids, mormyriforms, and cypriniforms are freshwater fishes present in Africa but not in South America. Cyprinids such as Barbus and Labeo have been suggested to enter Africa from Asia during the late Miocene (Stewart, 1994). No evidence for this kind of competitive exclusion process is available.
Another alternative scenario assumes ad hoc geographic distributions to minimize the number of extinctions: members of a clade, or single species that later gave rise to the clade, could have been restricted to a small part of the Gondwanian land mass and carried off in toto when the continent broke up. This assumption would reduce the number of necessary extinction events of characiform lineages needed to explain their modern geographic distribution.
Acknowledgments
This work was supported by Doctoral Dissertation Improvement Grant BSR9112367 to G. Ortí and grants to A. Meyer (BSR9119867, BSR9107838) and M. A. Bell (INT9117104) from the U.S. National Science Foundation. All the molecular work reported here was conducted in A. Meyer's laboratory. The author thanks numerous colleagues who contributed valuable specimens. A. Meyer, M. A. Bell, D. Futuyma, W. Eanes, and R. Vari provided helpful comments on earlier versions of the manuscript. This paper was prepared in partial fulfillment of requirements for the Ph.D. in Ecology and Evolution by G. Ortí. This is contribution 960 from the Graduate Program in Ecology and Evolution at SUNY at Stony Brook.
References
Adachi, J., and Hasegawa, M. 1994. "MOLPHY: A Program Package for Molecular Phylogenetics, V. 2.2." The Institute of Statistical Mathematics, Tokyo.
Alves-Gomes, J. A., Ortí, G., Haygood, M., Heiligenberg, W., and Meyer, A. 1995. Phylogenetic analysis of the South American electric fishes (order Gymnotiformes) and the evolution of their electrogenic system: A synthesis based on morphology, electrophysiology, and mitochondrial sequence data. Mol. Biol. Evol. 12:298-318.
Begle, D. P. 1991. Relationships of the osmeroid fishes and the use of reductive characters in phylogenetic analysis. Syst. Zool. 40:33-53.
Begle, D. P. 1992. Monophyly and relationships of the argentinoid fishes. Copeia 350-366.
Bremer, K. 1988. The limits of amino acid sequence data in Angiosperm phylogenetic reconstruction. Evolution 42:795-803.
Buckup, P. A. 1991. "The Characidiinae: A Phylogenetic Study of the South American Darters and Their Relationships with Other Characiform Fishes." Ph.D. dissertation, The University of Michigan, Ann Arbor, MI.
Carpenter, J. 1988. Choosing among equally parsimonious cladograms. Cladistics 4:291-296.
Collins, T. M., Wimberger, P. H., and Naylor, G. J. P. 1994. Compositional bias, character-state bias, and character-state reconstruction using parsimony. Syst. Biol. 43:482-496.
Dimmick, W. W., and Larson, A. 1996. A molecular and morphological perspective on the phylogenetic relationships of the otophysan fishes. Mol. Phylo. Evol. 6:120-133.
Dixon, M. T., and Hillis, D. M. 1993. Ribosomal RNA secondary structure: Compensatory mutations and implications for phylogenetic analysis. Mol. Biol. Evol. 10:256-267.
Farris, J. S. 1969. A successive approximations approach to character weighting. Syst. Zool. 18:374-385.
Felsenstein, J. 1981. Evolutionary trees from DNA sequences: A maximum likelihood approach. J. Mol. Evol. 17:368-376.
Felsenstein, J. 1985. Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39:783-791.
Fink, S. V., and Fink, W. L. 1981. Interrelationships of the Ostariophysan fishes (Teleostei). Zool. J. Linn. Soc. 72:297-353.
Fink, W. L. 1984. Basal euteleosts: Relationships. In "Ontogeny and Systematics of Fishes" (H. G. Moser, eds.). American Society of Ichthyologists and Herpetologists Special Publication 1.
Fink, W. L., and Weitzman, S. H. 1982. Relationships of the stomiiform fishes (Teleostei), with a description of Diplophos. Bull. Mus. Comp. Zool. 150:31-93.
Fitch, W. M., and Markowitz, E. 1970. An improved method for determining codon variability in a gene and its application to the rate of fixation of mutations in evolution. Biochem. Genet. 4:579-593.
Gatesy, J., DeSalle, R., and Wheeler, W. C. 1994. Alignment-ambiguous nucleotide sites and the exclusion of data. Mol. Phylo. Evol. 2:152-157.
Géry, J. 1977. "Characoids of the World." Tropical Fish Hobbyist Publications, Neptune City, NJ.
Goldblatt, P. 1993. Biological relationships between Africa and South America: An overview. In "Biological Relationships between Africa and South America" (P. Goldblatt, ed.), pp. 3-14. Yale University Press, New Haven, CT.
Greenwood, P. H. 1984. African cichlids and evolutionary theories. In "Evolution of Fish Species Flocks" (A. A. Echelle and I. Kornfield, eds.), pp. 13-19. University of Maine Press, Orono, ME.
Greenwood, P. H., and Howes, G. J. 1975. Neogene fossil fishes from the Lake Albert-Lake Edward rift (Zaire). Bull. Brit. Mus. (Nat. Hist.) Geol. 26:69-127.
Greenwood, P. H., Rosen, D. E., Weitzman, S. H., and Myers, G. S. 1966. Phyletic studies of teleostean fishes, with a provisional classification of living forms. Bull. Am. Mus. Nat. Hist. 131:339-455.
Gyllensten, U. B., and Erlich, H. A. 1988. Generation of single-stranded DNA by the polymerase chain reaction and its application to direct sequencing of the HLA-DQa locus. Proc. Natl. Acad. Sci. USA 85:7652-7656.
Hasegawa, M., Kishino, H., and Yano, T. 1985. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22:160-174.
Hillis, D. M., and Dixon, M. T. 1991. Ribosomal DNA: Molecular evolution and phylogenetic inference. Q. Rev. Biol. 66:411-453.
Hoffmann, W. 1994. Ependymins and their potential role in neuroplasticity and regeneration: Calcium binding meningeal glycoproteins of the cerebrospinal fluid and extracellular matrix. Int. J. Biochem. 26:607-619.
Kimura, M. 1981. Estimation of evolutionary distances between homologous nucleotide sequences. Proc. Natl. Acad. Sci. USA 78:454-458.
Kishino, H., and Hasegawa, M. 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J. Mol. Evol. 29:170-179.
Kocher, T. D., Thomas, W. K., Meyer, A., Edwards, S. V., Pääbo, S., Villablanca, F. X., and Wilson, A. C. 1989. Dynamics of mitochondrial DNA evolution in animals. Proc. Natl. Acad. Sci. USA 86:6196-6200.
Kumar, S., Tamura, K., and Nei, M. 1993. "MEGA: Molecular Evolutionary Genetics Analysis, V. 1.0." The Pennsylvania State University, University Park, PA.
Lauder, G. V., and Liem, K. F. 1983. The evolution and interrelationships of the Actinopterygian fishes. Bull. Mus. Comp. Zool. 150:95-197.
Lecointre, G., Philippe, H., Lê, H. L. V., and Le Guyader, H. 1993. Species sampling has a major impact on phylogenetic inference. Mol. Phylo. Evol. 2:205-224.
Lucena, C. A. S. D. 1993. "Estudo filogenético da família Characidae com uma discussão dos grupos naturais propostos (Teleostei, Ostariophysi, Characiformes)." Doutoramento diss., Universidade de São Paulo, Brazil.
Lundberg, J. G. 1993. African-South American freshwater fish clades and continental drift: Problems with a paradigm. In "Biological Relationships between Africa and South America" (P. Goldblatt, ed.), pp. 156-199. Yale University Press, New Haven, CT.
Lundberg, J. G. 1996. Fishes of the La Venta Fauna: Additional taxa, biotic and paleoenvironmental implications. In "Vertebrate Paleontology in the Neotropics: The Miocene Fauna of La Venta Colombia" (R. F. Kay et al., eds.), pp. 67-91. Smithsonian Institution Press, Washington, DC.
Lundberg, J. G., and Chernoff, B. 1992. A Miocene fossil of the Amazonian fish Arapaima (Teleostei, Arapaimidae) from the Magdalena River region of Colombia: Biogeographic and evolutionary implications. Biotropica 24:2-14.
Lundberg, J. G., Machado-Allison, A., and Kay, R. F. 1986. Miocene characid fishes from Colombia: Evolutionary stasis and extirpation. Science 234:208-209.
Machado-Allison, A. 1982. "Studies on the Systematics of the Subfamily Serrasalminae (Pisces-Characidae)." Ph.D. dissertation, The George Washington University.
Machado-Allison, A. 1983. Estudios sobre la sistemática de la subfamilia Serrasalminae (Teleostei, Characidae). II. Discusión sobre la condición monofilética de la subfamilia. Acta Biol. Venez. 11:145-195.
Maddison, W. P., and Maddison, D. R. 1992. "MacClade: Analysis of Phylogeny and Character Evolution, V. 3.0." Sinauer Associates, Sunderland, MA.
Mago-Leccia, F., and Zaret, T. M. 1978. The taxonomic status of Rhabdolichops troscheli (Kaup, 1856) and speculations on gymnotiform evolution. Environ. Biol. Fish. 3:379-384.
Meyer, A. 1993. Phylogenetic relationships and evolutionary processes in East African cichlid fishes. Trends Ecol. Evol. 8:279-284.
Mindell, D. P., and Honeycutt, R. L. 1990. Ribosomal RNA in vertebrates: Evolution and phylogenetic implications. Annu. Rev. Ecol. Syst. 21:541-566.
Müller-Schmid, A., Ganss, B., Gorr, T., and Hoffmann, W. 1993. Molecular analysis of ependymins from the cerebrospinal fluid of the orders Clupeiformes and Salmoniformes: No indication for the existence of an euteleost infradivision. J. Mol. Evol. 36:578-585.
Myers, G. S. 1938. Freshwater fishes and West Indian zoogeography. Annu. Rep. Smith. Inst. 1937:339-364.
Myers, G. S. 1949. Salt-tolerance of freshwater fish groups in relation to zoogeographical problems. Bijdragen tot de Dierkunde 28:315-322.
Nelson, J. S. 1994. "Fishes of the World." Wiley, New York.
Olsen, G. J., Matsuda, H., Hagstrom, R., and Overbeek, R. 1994. fastDNAml: A tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Comput. Appl. Biosci. 10:41-48.
Ortí, G. 1995. "The Evolutionary Radiation of Characiform Fishes: A Molecular Phylogenetic Perspective." Ph.D. dissertation, State University of New York at Stony Brook.
Ortí, G., and Meyer, A. 1996. Molecular evolution of ependymin and the phylogenetic resolution of early divergences among euteleost fishes. Mol. Biol. Evol. 13:556-573.
Ortí, G., and Meyer, A. 1997. The radiation of characiform fishes and the limits of resolution of mitochondrial ribosomal DNA sequences. Syst. Biol. 46:75-100.
Ortí, G., Petry, P., Porto, J. I. R., Jégu, M., and Meyer, A. 1996. Patterns of nucleotide change in mitochondrial ribosomal RNA genes and the phylogeny of piranhas. J. Mol. Evol. 42:169-182.
Palumbi, S., Martin, A., Romano, A., McMillan, W. O., Stice, L., and Grabowski, G. 1991. "The Simple Fool's Guide to PCR." Department of Zoology and Kewalo Marine Laboratory, University of Hawaii, Honolulu, HI.
Parrish, J. T. 1993. The palaeogeography of the opening South Atlantic. In "The Africa-South America Connection" (W. George and R. Lavocat, eds.), pp. 8-27. Clarendon Press, Oxford.
Patterson, C. 1975. The distribution of Mesozoic freshwater fishes. Mém. Mus. Natl. Hist. Nat. Paris A Zool. 88:156-174.
Patterson, C. 1984. Chanoides, a marine Eocene otophysan fish (Teleostei: Ostariophysi). J. Vertebr. Paleontol. 4:430-456.
Pitman, W. C. I., Cande, S., LaBrecque, J., and Pindell, J. 1993. Fragmentation of Gondwana: The separation of Africa from South America. In "Biological Relationships between Africa and South America" (P. Goldblatt, ed.), pp. 15-34. Yale University Press, New Haven, CT.
Regan, C. T. 1922. The distribution of the fishes of the order Ostariophysi. Bijdragen tot de Dierkunde, Amsterdam 22:203-208.
Rosen, D. E. 1973. Interrelationships of higher euteleostean fishes. In "Interrelationships of Fishes" (P. H. Greenwood, R. S. Miles, and C. Patterson, eds.), pp. 397-513. Academic Press, London.
Rosen, D. E. 1974. Phylogeny and zoogeography of salmoniform fishes and relationships of Lepidogalaxias salamandroides. Bull. Am. Mus. Nat. Hist. 153:265-326.
Rosen, D. E., and Greenwood, P. H. 1970. Origin of the Weberian apparatus and the relationships of ostariophysan and gonorhynchiform fishes. Am. Mus. Novitat. 2428:1-25.
Saiki, R. K., Gelfand, D. H., Stoffel, S., Scharf, S., Higuchi, R., Horn, G. T., Mullis, K. B., and Erlich, H. A. 1988. Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239:487-491.
Saitou, N., and Nei, M. 1987. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.
Sanger, F., Nicklen, S., and Coulson, A. R. 1977. DNA sequencing with chain terminator inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.
Shashoua, V. E. 1991. Ependymin, a brain extracellular glycoprotein, and CNS plasticity. Ann. N.Y. Acad. Sci. 627:94-114.
Stewart, K. M. 1994. A late Miocene fish fauna from Lothagam, Kenya. J. Vertebr. Paleontol. 14:592-594.
Sverlij, S. B., and Espinach Ros, A. 1986. El Dorado, Salminus maxillosus (Pisces, Characiformes) en el Rio de la Plata y Rio Uruguay inferior. Rev. Invest. Desarrollo Pesquero 6:57-75.
Swofford, D. L. 1993. "PAUP: Phylogenetic Analysis Using Parsimony, V. 3.1.1." Illinois Natural History Survey, Champaign, IL.
Swofford, D. L., and Maddison, W. P. 1992. Parsimony, character-state reconstructions, and evolutionary inferences. In "Systematics, Historical Ecology, and North American Freshwater Fishes" (R. L. Mayden, ed.), pp. 186-223. Stanford University Press, Stanford, CA.
Thompson, J. D., Higgins, D. G., and Gibson, T. J. 1994. CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.
Uj, A. 1990. "Etude comparative de l'osteologie cranienne des poissons de la famille des Characidae et son importance phylogenetique." Ph.D. dissertation, Université de Geneva.
Vari, R. P. 1979. Anatomy, relationships, and classification of the families Citharinidae and Distichodontidae (Pisces, Characoidea). Bull. Brit. Mus. (Nat. Hist.) Zool. 36:261-344.
Vari, R. P. 1983. Phylogenetic relationships of the families Curimatidae, Prochilodontidae, Anostomidae, and Chilodontidae. Smith. Contrib. Zool. 378:1-60.
Vari, R. P. 1995. The Neotropical fish family Ctenoluciidae (Teleostei: Ostariophysi: Characiformes): Supra and intrafamilial phylogenetic relationships, with a revisionary study. Smith. Contrib. Zool. 564:1-97.
Vawter, L., and Brown, W. M. 1993. Rates and patterns of base change in the small subunit ribosomal RNA gene. Genetics 134:597-608.
Weitzman, S. H. 1960. Further notes on the relationships and classification of the South American characid fishes of the subfamily Gasteropelecinae. Stanford Ichthyol. Bull. 7:217-239.
Weitzman, S. H. 1962. The osteology of Brycon meeki, a generalized characid fish, with an osteological definition of the family. Stanford Ichthyol. Bull. 8:1-77.
Weitzman, S. H., and Fink, W. L. 1983. Relationships of the neon tetras, a group of South American freshwater fishes (Teleostei, Characidae), with comments on the phylogeny of New World Characiformes. Bull. Mus. Comp. Zool. 150:339-395.
Weitzman, S. H., and Vari, R. P. 1988. Miniaturization in South American freshwater fishes: An overview and discussion. Proc. Biol. Soc. Wash. 101:444-465.
Appendix
Below is a classification of fish taxa discussed in this chapter, with the GenBank accession numbers (GB) of their DNA sequences (12S, 16S, and ependymin indicated by "epy"). African taxa are indicated by "AFR." Serrasalmin specimens have been numbered from 1 to 34 and are referred to by these numbers in Ortí et al. (1996). When voucher specimens were deposited in museum collections, their accession numbers are preceded by INPA for the specimens deposited at the Instituto Nacional de Pesquisas da Amazonia, Manaus, Brazil, and by USNM for those at the U.S. National Museum of Natural History (Washington, DC).

Order Characiformes
1. Family Hepsetidae (AFR)
   Hepsetus odoe. GB: U33852, U33992.
2. Family Citharinidae (AFR)
   Citharinus congicus. GB: U33826, U33993.
3. Family Distichodontidae (AFR)
   Distichodus sp. GB: U33827, U33994, epy: U33477.
4. Family Crenuchidae
   Characidium sp. (USNM 318101). GB: U33828, U34030.
5. Family Characidae
   Subfamily Alestinae (AFR)
      Alestes sp. GB: U33829, U33995, epy: U33475.
      Phenacogrammus sp. GB: U33830, U33996, epy: U33476.
      Hydrocyon sp. GB: U33960, U33997.
   Subfamily Characinae
      Tribe Characini
         Cynopotamus sp. (USNM 325689). GB: U33961, U33998.
         Gnathocharax steindachneri. GB: U33589, U33624.
      Tribe Acestrorhynchini
         Acestrorhynchus sp. GB: U33962, U33999.
         Oligosarcus sp. (USNM 235690). GB: U33963, U34000.
   Subfamily Rhaphiodontinae
      Rhaphiodon vulpinus. GB: U33964, U34001.
   Subfamily Bryconinae
      Tribe Salminini
         Salminus sp. GB: U33965, U34002.
      Tribe Bryconini
         Brycon sp. (USNM 326005). GB: U33966, U34003.
         Chalceus macrolepidotus. GB: U33587, U33622, epy: U33478.
      Tribe Triportheini
         Triportheus paranensis. GB: U33588, U33623.
   Subfamily Aphyocharacinae
      Aphyocharax sp. GB: U33968, U34005.
   Subfamily Glandulocaudinae
      Corynopoma riisei. GB: U33969, U34006.
      Gephyrocharax sp. GB: U33970, U34007.
   Subfamily Stethaprioninae
      Poptella sp. GB: U33971, U34008.
   Subfamily Tetragonopterinae
      Astyanax fasciatus. GB: U33972, U34009.
      Tetragonopterus sp. GB: U33973, U34010.
      Gymnocorymbus ternetzi. GB: epy: U33480.
   Subfamily Cheirodontinae
      Cheirodon sp. (USNM 325676). GB: U33974, U34011.
      Paracheirodon innesi. GB: U33975, U34012, epy: U33479.
   Subfamily Serrasalminae
      genus Pygocentrus
         1. P. nattereri. GB: U33558, U33590.
         2. P. nattereri. GB: U33558, U33590.
         3. P. nattereri (INPA 10143). GB: U33558, U33590.
         4. P. nattereri (USNM 325686). GB: U33559, U33591.
      genus Serrasalmus
         5. S. spilopleura (USNM 325683). GB: U33560, U33592.
         6. S. n.sp. 2n = 58. GB: U33561, U33593.
         7. S. compressus (cf. altuvei? 2n = 60). GB: U33562, U33594.
      genus Pristobrycon
         8. P. sp. GB: U33563, U33595.
         9. P. striolatus. GB: U33597, U33596.
         10. P. striolatus. GB: U33564, U33598.
      genus Catoprion
         11. C. mento. GB: U33565, U33599.
         12. C. mento (INPA 10145). GB: U33565, U33599.
      genus Metynnis
         13. M. sp. GB: U33566, U33600, epy: U33481.
         14. M. cf. mola (INPA 10146). GB: U33567, U33601.
      genus Myleus
         15. M. Myloplus rubripinnis. GB: U33568, U33602.
         16. M. Myloplus asterias. GB: U33569, U33603.
         17. M. Myloplus tiete (INPA 10147). GB: U33570, U33604.
         18. M. Prosomyleus schomburgkii. GB: U33571, U33605.
         19. M. Myleus pacu. GB: U33572, U33606.
         20. M. Myleus pacu. GB: U33573, U33607.
      genus Mylesinus
         21. M. paraschomburgkii. GB: U33574, U33608.
         22. M. paraschomburgkii. GB: U33574, U33609.
      genus 'N. gen. A'
         23. N. gen. A n.sp. (R. Xingu, Pará, Brazil). This specimen could not be assigned to any valid genus of the Serrasalminae, but is similar in many respects to Utiaritichthys and Myleus (Jégu unpublished data). GB: U33575, U33610.
      genus Acnodon
         24. A. normani. GB: U33576, U33611.
         25. A. normani. GB: U33577, U33612.
      genus Mylossoma
         26. M. duriventri (INPA 10154). GB: U33578, U33613.
         27. M. paraguayensis (INPA 10152). GB: U33579, U33614.
         28. M. aureum (INPA 10153). GB: U33580, U33615.
      genus Colossoma
         29. C. macropomum (INPA 10149). GB: U33581, U33616.
         30. C. macropomum (INPA 10150). GB: U33582, U33617.
      genus Piaractus
         31. P. mesopotamicus (INPA 10151). GB: U33583, U33618.
         32. P. brachipomus (INPA 10148). GB: U33584, U33619.
         33. P. mesopotamicus. GB: U33585, U33620.
         34. P. brachipomus. GB: U33586, U33621.
6. Family Erythrinidae
   Hoplias malabaricus. GB: U33976, U34013, epy: U33485.
7. Family Ctenoluciidae
   Ctenolucius sp. GB: U33977, U34014.
   Boulengerella maculata. GB: U33978, U34015.
   Boulengerella sp. GB: epy: U33486.
8. Family Lebiasinidae
   Nannostomus sp. GB: U33979, U34016.
   Pyrrhulina sp. (USNM 325675). GB: U33980, U34017.
   Nannobrycon sp. GB: epy: U33487.
9. Family Hemiodontidae
   Hemiodus sp. GB: U33981, U34018, epy: U33484.
10. Family Parodontidae
   Apareiodon affinis. GB: U33982, U34019.
11. Family Gasteropelecidae
   Carnegiella sp. GB: U33983, U34020.
   Gasteropelecus sp. GB: U33984, U34021, epy: U334482.
12. Family Curimatidae
   Cyphocharax gilberti (USNM 318079). GB: U33985, U34022.
   Steindachnerina sp. (USNM 325691). GB: U33986, U34023.
13. Family Prochilodontidae
   Prochilodus lineatus. GB: U33987, U34034.
14. Family Anostomidae
   Abramites sp. GB: U33988, U34025.
   Leporinus obtusidens. GB: U34031, U34026.
   Leporinus sp. GB: epy: U33483.
15. Family Chilodontidae
   Chilodus sp. GB: 33989, U34027.

Order Gymnotiformes
Family Eigenmanniidae
   Eigenmannia sp. GB: U15269, U15245 (from Alves-Gomes et al., 1995).
   Eigenmannia sp. GB: epy: U33492.
Family Rhamphichthyidae
   Rhamphichthys sp. GB: U15257, U15233 (Alves-Gomes et al., 1995).
   Rhamphichthys sp. GB: epy: U33493.
Family Apteronotidae
   Apteronotus albifrons. GB: U15275, U15226 (from Alves-Gomes et al., 1995).

Order Siluriformes
Family Loricariidae
   Hypostomus sp. GB: epy: U33488.
   Hypostomus sp. GB: U15263, U15239 (from Alves-Gomes et al., 1995).
Family Cetopsidae
   Cetopsis sp. GB: U15272, U15248 (from Alves-Gomes et al., 1995).
Family Trichomycteridae
   Trichomycterus sp. GB: U15251, U15227 (from Alves-Gomes et al., 1995).
Family Malapteruridae
   Malapterurus sp. GB: U15261, U15237 (from Alves-Gomes et al., 1995).
Family Pimelodidae
   Pimelodus sp. GB: epy: U33489.
Family Schilbeidae
   Schilbe sp. GB: epy: U33490.
Family Mochokidae
   Synodontis sp. GB: epy: U33491.

Order Cypriniformes
Family Cyprinidae
   Cyprinus carpio. GB: X61010, epy: U00432.
   Carassius auratus. GB: epy: U00433, X14134.
   Danio rerio. GB: epy: M89643.
Family Gastromyzontidae
   Crossostoma lacustre. GB: M91245.

Order Gonorhynchiformes
Family Kneriidae
   Kneria sp. GB: U33990, U34028.
   Parakneria sp. GB: U33991, U34029.

Order Salmoniformes
Family Salmonidae
   Salmo salar. GB: epy: M93699.

Order Esociformes
Family Esocidae
   Esox lucius. GB: epy: L09066.

Order Clupeiformes
Family Clupeidae
   Clupea harengus. GB: epy: L09065.
CHAPTER 15
The Evolution of Blennioid Fishes Based on an Analysis of Mitochondrial 12S rDNA
CAROL A. STEPIEN, ALISON K. DILLON, MERIEL J. BROOKS, KRISTEN L. CHASE, and ALLYSON N. HUBERS
Department of Biology, Case Western Reserve University, Cleveland, Ohio 44106
I. Introduction
Blennioids are a suborder of perciform teleost fishes comprising approximately 732 species, 127 genera, and six families (Table I; Nelson, 1994). They are present in most temperate and tropical nearshore marine habitats, with a few species in brackish and fresh water (summarized in Springer, 1993; Nelson, 1994). They are among the most common demersal fishes (Springer, 1993), but may be overlooked due to their relatively small sizes and cryptic color patterns (Stepien, 1986a,b, 1987; Stepien et al., 1988). Their distinguishing characteristics include elongate dorsal and anal fins and jugular pelvic fins (see Fig. 1). Springer (1993) defined the Blennioidei by the following combination of characters (some of which may be plesiomorphies): anal fin with one or two spines and all simple soft rays; pelvic fins with one spine, two to four simple soft rays, and insertion ahead of the pectorals; paired nostrils; cirri often present on the head; a single bone representing infrapharyngobranchials 2-4; no autogenous parhypural (absent or fused to hypurals); hypurals 3 and 4 fused to each other and to the urostylar centrum; and pelvic bones shaped in a nut-like pod. Johnson (1993) added the synapomorphy of the first vertebra lacking a neural spine, and Mooi and Gill (1995) described a synapomorphy of the epaxial musculature (which is absent in the family Labrisomidae).

Six families are presently recognized in the Blennioidei: Clinidae (clinid kelpfish), Labrisomidae (labrisomid kelpfish), Chaenopsidae (tube blennies), Tripterygiidae (triplefin blennies), Blenniidae (combtooth blennies), and Dactyloscopidae (sand stargazers; Fig. 1 and Table I; Springer, 1993). Blennioid groups have generated considerable systematic interest, including the following contemporary studies of the phylogenetic relationships of some component taxa: Fukao and Okazaki (1987), Acero (1987), Williams (1990), Stepien and Rosenblatt (1991), Hastings (1991), Stepien (1992), Stepien et al. (1993), Springer (1993), Fricke (1994), and Hastings and Springer (1994).

Historically, relationships among blennioid taxa and related groups have been controversial (Springer, 1993; Johnson, 1993; Stepien et al., 1993). Studies based on morphological data have not resolved higher-level relationships among blennioid families, tribes, and other suborders (see summary by Springer, 1993). Earlier work illustrated the utility of molecular data from allozyme studies (Stepien and Rosenblatt, 1991; Stepien, 1992; Stepien et al., 1993) and nuclear ribosomal DNA internal transcribed spacer (ITS) sequences (Stepien et al., 1993) to address evolutionary
TABLE I
Summary of Taxonomy of the Suborder Blennioidei: Number of Taxa, Distribution, Primary Morphological Characters^b, and Genera Sequenced^a

1. Family Clinidae (clinid kelpfish)
   N taxa: 3 tribes, 26 genera, 89 species
   Distribution: Marine; mostly temperate
   Characters: Ceratohyal connected to dentary symphysis; scales small and embedded and radii in all fields
   a. Tribe Ophiclinini (snake blennies)
      Genera sequenced: Ophiclinus, Sticharium
      N taxa: 4 genera, 12 species
      Distribution: Southern Australia
      Characters: Dorsal and anal fins united to caudal fin; cirri and lateral line reduced; male intromittent organ; ovoviviparous
   b. Tribe Clinini (klipfish)
      Genera sequenced: Heteroclinus
      N taxa: 17 genera, 68 species
      Distribution: Indo-West Pacific and New Zealand; mostly temperate; 4 tropical species
      Characters: Male intromittent organ; ovoviviparous
   c. Tribe Myxodini (kelpfish)
      Genera sequenced: Clinitrachus (Mediterranean), Gibbonsia (northeastern Pacific), Heterostichus (northeastern Pacific), Myxodes (southeastern Pacific)
      N taxa: 5 genera, 9 species
      Distribution: Temperate New World and Mediterranean
      Characters: Oviparous; often sexually dimorphic in size; females larger; males guard nests

2. Family Labrisomidae (labrisomid kelpfish)
   N taxa: 6 tribes, 16 genera, 106 species
   Characters: Scales with radii confined to anterior margin; scales sometimes absent, but never small and embedded
   a. Tribe Cryptotremini
      Genera sequenced: Auchenionchus
      N taxa: 4 genera, 7 species
      Distribution: Temperate eastern Pacific
      Characters: Branched caudal fin rays
   b. Tribe Neoclinini
      Genera sequenced: Neoclinus
      N taxa: 1 genus, 9 species
      Distribution: Temperate eastern Pacific and western Pacific
      Characters: Tube dwellers
   c. Tribe Mnierpini (rock skippers)
      Genera sequenced: Mnierpes
      N taxa: 2 genera, 2 species
      Distribution: Tropical eastern Pacific
      Characters: Thickened corneas; divided eyes; thickened anal fin rays; amphibious
   d. Tribe Labrisomini
      Genera sequenced: Labrisomus, Malacoctenus
      N taxa: 2 genera, 35 species
      Distribution: Tropical New World and Africa
      Characters: No known morphological synapomorphies
   e. Tribe Starksiini
      Genera sequenced: Starksia
      N taxa: 2 genera, 24 species
      Distribution: Tropical New World
      Characters: Male intromittent organ; ovoviviparous or oviparous
   f. Tribe Paraclinini
      Genera sequenced: Exerpes, Paraclinus
      N taxa: 2 genera, 21 species
      Distribution: New World; tropical and warm temperate
      Characters: Spine on opercle
   g. Unknown placement (may be Chaenopsidae; Hastings and Springer, 1994)
      Genera sequenced: Stathmonotus
      N taxa: 6 species
      Distribution: New World; tropical
      Characters: Unique testis lobe arrangement

3. Family Chaenopsidae (tube blennies)
   Genera sequenced: Acanthemblemaria, Chaenopsis, Emblemaria
   N taxa: 11 genera, 64 species
   Distribution: New World; Pacific and Atlantic; most tropical and some warm temperate
   Characters: Tube dwellers; lack scales; no lateral line; single epural; two infraorbital bones

4. Family Tripterygiidae (triplefin blennies)
   N taxa: 2 tribes, 28 genera, 103 species
   Distribution: Atlantic, Indian, and Pacific; tropical and warm temperate; greatest diversity in New Zealand
   Characters: Dorsal fin divided in three parts; no spine on first segmented dorsal ray
   a. Tribe Lepidoblenninae
      Genera sequenced: Axoclinus, Karalepis
      N taxa: 9 genera, 31 species
   b. Tribe Tripterygiinae
      Genera sequenced: Rosenblatella, Notoclinus, Tripterygion
      N taxa: 19 genera, 72 species

5. Family Blenniidae (combtooth blennies)
   N taxa: 6 tribes, 55 genera, 346 species
   Distribution: Circumglobal; mostly marine tropical, some temperate; many estuarine
   Characters: Comb-like teeth; scales absent; coracoid ankylosed to cleithrum
   a. Tribe Salariini
      Genera sequenced: Ecsenius, Entomacrodus, Ophioblennius, Rhabdoblennius
      N taxa: 26 genera, 198 species
      Distribution: Primarily Indo-West Pacific
      Characters: Two, four, or five circumorbitals; three or four segmented pelvic rays
   b. Tribe Parablenniini
      Genera sequenced: Hypsoblennius, Parablennius
      N taxa: 14 genera, 70 species
      Distribution: Circumglobal; mostly marine; tropical to temperate
      Characters: Branched caudal fin rays; five circumorbital bones
   c. Tribe Omobranchini
      Genera sequenced: Omobranchus
      N taxa: 7 genera, 30 species
      Distribution: Indo-Pacific and one Caribbean spp. (introduced); marine, some fresh water
      Characters: Unbranched caudal fin rays; two segmented pelvic fin rays
   d. Tribe Nemophini (saber-toothed blennies)
      Genera sequenced: Petroscirtes
      N taxa: 5 genera, 48 species
      Distribution: Indian and Pacific Oceans; marine; one brackish and fresh water
      Characters: Unbranched fin rays; swim bladder present; basisphenoid absent

6. Family Dactyloscopidae (sand stargazers)
   Genera sequenced: Myxodagnus
   N taxa: 9 genera, 41 species
   Distribution: New World in Pacific and Atlantic Oceans; tropical and warm temperate
   Characters: Fringed upper gill cover; gill membranes separate and free from isthmus; no endopterygoid

^a Based on Springer (1993) and Nelson (1994).
^b Caution: Some of these characters are probably plesiomorphies.
relationships among blennioid taxa. The objective of the present study was to use mitochondrial 12S rDNA sequences in order to test the monophyly of the six blennioid families and many of their component tribes, analyze the evolutionary relationships among them, and examine their possible relationships to outgroups. Blennioid higher taxa have distinctive distributional patterns in several marine provinces (see Table I; Springer, 1982, 1993). Although the majority of blennioids are primarily tropical groups, the family Clinidae and the labrisomid tribes Neoclinini and Cryptotremini are primarily temperate and antitropically distributed (Fig. 2; Hubbs, 1952; Stephens and Springer, 1973; Springer, 1993). Evolutionary relationships among the families and tribes analyzed in this study (Table I) offer a biogeographic framework to address hypotheses of the relative ages of tropical versus temperate groups, relationships among Old and New World taxa, and questions of dispersal and distributional history of nearshore fishes.

A. Hypotheses Tested
FIGURE 1 Drawings of representative taxa [reprinted from "Fishes of the World," 3rd edition, by J. S. Nelson (1994). Reprinted with permission of John Wiley & Sons, Inc.]. Blennioidei: (A) Clinidae, (B) Labrisomidae, (C) Chaenopsidae, (D) Tripterygiidae, (E) Blenniidae, (F) Dactyloscopidae. Zoarcoidei: (G) Stichaeidae, (H) Pholidae [reproduced with permission of Miller and Lea (1972), California Department of Fish and Game], (I) Zoarcidae. Notothenioidei: (J) Nototheniidae, (K) Bathydraconidae.
Some of the primary evolutionary and biogeographic questions that may be addressed with a resolved phylogeny for these groups include: (1) Are the temperate members of the Clinidae and Labrisomidae ancestral to the tropical labrisomid and chaenopsid clades? (2) What is the relationship of the Mediterranean myxodin clinid Clinitrachus to the New World clinids? (3) What are the relationships between the Blenniidae and Tripterygiidae? How are they related to the other blennioids? (4) Are the dactyloscopids appropriately grouped with the blennioids? (5) How are the blennioids, zoarcoids, and notothenioids related?
B. Relationships of the Family Clinidae
George and Springer (1980) redefined the family Clinidae (Fig. 1A), excluding the Tripterygiidae, Labrisomidae, and Chaenopsidae, and adding the tribe Ophiclinini. Clinids can be distinguished by several characters, including a cord-like ligament extending from the ceratohyal to the symphysis of the dentaries and the presence of radii on all margins of the scales (Hubbs, 1952; Springer et al., 1977; Springer, 1993). The Clinidae contains three tribes: the matrotrophic (ovoviviparous) Clinini and Ophiclinini, and the oviparous Myxodini (George and Springer, 1980; Stepien and Rosenblatt, 1991). The family largely has a temperate distribution, except for five tropical species in the tribe Clinini (Fig. 2; Springer, 1982, 1993). A fossil clinid very similar to the extant Mediterranean myxodin Clinitrachus has been described from the Miocene of Romania (Bannikov, 1989; also see Springer, 1993), which is the sole known fossil record of the family. The clinids present an interesting biogeographic scenario in that the live-bearing and egg-laying tribes do not overlap in distribution (Fig. 2A), and it has been postulated that live-bearing taxa are more derived (Wourms and Lombardi, 1992). The question of origin of antitropical taxa and whether they are ancestral to tropical groups, such as most of the labrisomid tribes (except Neoclinini and Cryptotremini) and the Chaenopsidae (Briggs, 1974), may also be addressed using these groups.
C. Relationships of the Family Labrisomidae
The family Labrisomidae (Fig. 1B) has often been regarded as the sister group to the Clinidae or as part of the Clinidae (together with the Chaenopsidae; Hubbs, 1952). Studies by Springer (1993) and Hastings and Springer (1994) did not find morphological synapomorphies to define the Labrisomidae. Labrisomid scales (when present) have radii only on the anterior margin and are never small and embedded, which are apparent plesiomorphies distinguishing them from clinids (Hubbs, 1952; Stephens and Springer, 1973; George and Springer, 1980; Springer, 1993). The absence of an anterior extension of the dorsal epaxial slip to the skull is an apparent reversal to an ancestral state that may characterize the Labrisomidae (Mooi and Gill, 1995). Most labrisomids are found in the New World, except for six species of Neoclinus in the northwestern Pacific (Fukao and Okazaki, 1987) and the eastern Atlantic species Labrisomus nuchipinnis and Malacoctenus africanus (Fig. 2B, Table I; Springer, 1993). A fossil labrisomid (Labrisomus pronuchipinnis) has been described from Miocene deposits in the Mediterranean, where the family is no longer represented (Springer, 1970; George and Springer, 1980), and is the sole known fossil.
There are few known morphological synapomorphies to suggest relationships among the tribes; however, allozyme data provided some synapomorphies that support presently recognized tribal groupings (Table I; Stepien et al., 1993). Allozyme data suggested that the labrisomids are paraphyletic and that the labrisomid tribe Cryptotremini may be the sister group of the clinids (Stepien et al., 1993); these hypotheses are tested in this chapter. Consensus trees from allozyme data failed to conclusively resolve relationships among the labrisomid tribes Cryptotremini, Paraclinini, and Starksiini (Stepien et al., 1993), which are further examined with DNA sequences. Inclusion of the tribe Neoclinini in the Labrisomidae is controversial, and Hastings and Springer (1994) suggest that it belongs in the family Chaenopsidae. Neoclinins are provisionally treated as labrisomids here, as indicated by allozyme characters (Stepien et al., 1993). Their relationships to chaenopsids and labrisomids are tested. Familial affinity of the small, rarely observed eel-like genus Stathmonotus is also examined. Stathmonotus has been classified as a labrisomid but, most recently, as a chaenopsid (Hastings and Springer, 1994).
D. Relationships of the Family Chaenopsidae
The family Chaenopsidae (tube blennies; Fig. 1C) is restricted to the tropical and temperate New World and is defined by several morphological synapomorphies (Table I; Springer, 1993; Hastings and Springer, 1994). Parsimonious phylogenies based on nuclear rDNA sequence and allozyme data in Stepien et al. (1993) supported the traditional concept of a close relationship among the families Clinidae, Labrisomidae, and Chaenopsidae, as suggested by morphological data (Hubbs, 1952; Stephens, 1963; Stepien, 1992; Springer, 1993), which is further examined in this study. The Chaenopsidae has often been regarded as most closely related to the neoclinin labrisomids (Hubbs, 1952; Stephens, 1963; Springer, 1993; Hastings and Springer, 1994). Most-parsimonious phylogenies based on allozyme and rDNA sequence data suggested that the Chaenopsidae may be the sister group to a clinid-labrisomid clade (Stepien et al., 1993). However, the next parsimonious alternative phylogeny based on rDNA sequence data placed the Labrisomidae and Chaenopsidae as sister groups. These possible relationships are examined in this study.
E. Relationships of the Family Tripterygiidae
The Tripterygiidae (triplefin blennies; Fig. 1D) is widely distributed in temperate and tropical regions
throughout the Atlantic, Indian, and Pacific Oceans. Tripterygiids have a dorsal fin divided into three distinct segments: the first two are composed of spines and the third of seven or more soft rays. They are also defined by the synapomorphy of lack of a dorsal fin spine on the pterygiophore supporting the first segmented dorsal fin ray (Table I; see Springer, 1993). They have been assumed to be related to the Clinidae/Labrisomidae/Chaenopsidae clade and to the Blenniidae (Springer, 1993), and these relationships are tested in the present study. The relationship between the two subfamilies (Lepidoblenninae and Tripterygiinae; Table I) is also examined. A fossil species (Tripterygion pronasus) has been described in Miocene deposits from the Mediterranean Sea (Arambourg, 1927; Wirtz, 1980), and one of the members of this genus is included in this study.
F. Relationships of the Family Blenniidae
The combtooth blennies (Fig. 1E), family Blenniidae, are widely distributed in the Atlantic, Indian, and Pacific Oceans and the Mediterranean Sea. Blennies are a species-rich group and are defined by the synapomorphies of their comb-like teeth (in most), the nonprotractile premaxillae, the ankylosed coracoid, and a vertical pair of processes on each side of the urohyal (Springer, 1993). Six tribes are recognized, some of which are undefined by morphological synapomorphies (Table I); four are included here. Some tribal relationships were hypothesized by Smith-Vaniz (1976). This study tests relationships among these tribes, as well as the monophyly of two of them (Salariini and Parablenniini).
G. Relationships of the Family Dactyloscopidae
The family Dactyloscopidae (sand stargazers, Fig. 1E) is found exclusively in warm temperate and tropical marine waters of the New World. Dactyloscopids are well characterized by several synapomorphies, including a unique branchiostegal pump, finger-like elements on the upper edge of the gill cover, and lack of vomerine teeth (Table I; see Springer, 1993). Springer and Friehofer (1976) and Springer (1993) placed the dactyloscopids in the Blennioidei, but various researchers have included them in other groups. Inclusion of the Dactyloscopidae in the Blennioidei is tested in the present study.
H. Relationships with Other Suborders
Relationships of blenny-like perciform fishes have been debated in modern ichthyology (see Gosline, 1968, 1971; Greenwood et al., 1966; Rosenblatt, 1984; Anderson, 1994). The four blenniiform suborders recognized by Nelson (1994), Blennioidei, Zoarcoidei, Notothenioidei, and Trachinoidei, have been regarded as being closely related. A possible synapomorphy is that the pelvic fin, when present, originates in front of the pectorals in all species of the four suborders (see Springer, 1993; Nelson, 1994). However, morphological characters suggesting their relationships, including this fin placement, may alternatively be due to evolutionary convergence for bottom-dwelling modes of life (Rosenblatt, 1984). Whether these four groups represent monophyletic lineages, are each other's closest relatives, or have closer affinities with other groups is presently uncertain. The relationships among three of these suborders, Blennioidei, Zoarcoidei, and Notothenioidei, and some of their component families are examined in this study. Members of the suborder Zoarcoidei (Fig. 1F, G, and H) are united by having a single nostril, loss of the basisphenoid, and the structure of the adductor mandibulae (Anderson, 1994). The zoarcoids are found primarily in the North Pacific (Table I; Anderson, 1994; summary in Nelson, 1994). In the authors' study, relationships among the families Zoarcidae (Fig. 1I), Stichaeidae (Fig. 1G), and Pholidae (Fig. 1H) are tested. The perciform suborder Notothenioidei (Fig. 1J and K) contains biochemically derived low-temperature specialists (Eastman, 1993) that are primarily found in coastal Antarctica. Analysis of this group using molecular characters may aid in the understanding of the biogeographic origins of modern Antarctic fish fauna. Notothenioids are united by having one nostril on each side of the head and by the loss of one pectoral actinost (Table II; summarized in Eastman, 1993; Miller, 1993; Nelson, 1994). A fossil notothenioid has been described from the late Eocene of Antarctica (Balushkin, 1994).
Relationships of the notothenioids to blennioids are problematic as there are no known morphological synapomorphies linking them (Eastman and Grande, 1989). The study described in this chapter also examines the relationship between the notothenioid families Nototheniidae (Fig. 1J) and Bathydraconidae (Fig. 1K). One of the hypotheses examined is whether Pagothenia is an early offshoot of the Nototheniidae, as projected by Eastman and Grande (1989).
II. Materials and Methods
A. Collection of Specimens
Fishes were collected by netting intertidally with use of the anesthetic quinaldine or subtidally by hand
TABLE II Summary of Outgroup Taxa Sequenced^a
Taxon | N taxa | Distribution | Characters
1. Suborder Zoarcoidei | 8 families, 95 genera, 318 species | Marine; primarily North Pacific | Single nostril; no known synapomorphies
A. Family Stichaeidae, pricklebacks | 36 genera, 65 species | Marine; primarily North Pacific, a few North Atlantic | Elongate dorsal fin
Dictyosoma | | |
Plectobranchus | | |
B. Family Pholidae, gunnels | 4 genera, 14 species | Marine; North Atlantic and North Pacific | Elongate dorsal fin; small pectoral fins; rudimentary or no pelvic fin
Subfamily Apodichthyinae | 2 genera, 4 species | |
Apodichthys (Xererpes) | | |
Subfamily Pholinae | 2 genera, 10 species | |
Pholis | | |
C. Family Zoarcidae, eelpouts | 45 genera, 220 species | Marine; most North Atlantic and North Pacific | All with single nostril; postorbital lateralis canal ends at the lateral extrascapulars, free of the pelvic bone
Lycodes (Aprodon) | | |
Lycodichthys | | |
Zoarces | | |
2. Suborder Notothenioidei | 5 families, 46 genera, 122 species | Many Antarctic endemics | Pelvic fins with one spine; single nostril on each side; three flat, plate-like pectoral radials
A. Family Nototheniidae, cod icefishes | 17 genera, 50 species | Marine; coastal Antarctic and southern hemisphere | Gill membranes in fold across isthmus; body scaled; mouth protractile
1. Subfamily Nototheniinae | 8 genera, 30 species | |
Notothenia | | |
2. Subfamily Trematominae | 4 genera, 14 species | |
Pagothenia | | |
Trematomus (Pseudotrematomus) | | |
B. Family Bathydraconidae, dragonfishes | 10 genera, 16 species | Marine; Antarctic | Gill membranes united; mouth nonprotractile; no spinous dorsal fin
Gymnodraco | | |
Parachaenichthys | | |
^a Based on Miller (1993), Nelson (1994), and Anderson (1994).
nets while scuba diving. Specimens were sacrificed either by freezing in liquid nitrogen or on dry ice or were placed directly in 95% ethanol. Notothenioids and some zoarcoids were obtained from frozen tissue collections of George Somero (Hopkins Marine Laboratory, Pacific Grove, California). All frozen samples were stored at -80°C until use. For large specimens, either liver or muscle tissue was used for DNA extractions. For small specimens, the gut was removed and one side of the fish was used. Voucher specimens were formalin-preserved when sufficient in number and many were deposited in the Marine Vertebrates Collections at Scripps Institution of Oceanography, University of California, San Diego.
B. Preparation of DNA, Amplification, and Sequencing
Frozen tissues were pulverized in liquid nitrogen using a cylindrical stainless-steel mortar and pestle. Ethanol-preserved tissues were wrapped in foil, placed in liquid nitrogen, and pulverized with a hammer. DNA was extracted in a guanidine thiocyanate buffer (Perbal, 1988) to circumvent degradation, purified using proteinase K, RNase, phenol, and chloroform, and then precipitated following methods used in the authors' laboratory (Stepien et al., 1993; Stepien, 1995). A small sample of the DNA was run on a mini-gel to verify relative amounts and quality. Mitochondrial (mt) DNA primers used included 12S light strand 5'-AAACTGGGATTAGATACCCCACTAT-3' and 5'-GTCAGGTCAAGGTGTAGCAAT-3' and 12S heavy strand 5'-AGGAGGGTGAcGGGcGGTGTGT-3' from Kocher et al. (1989) and Titus and Larson (1995). The primer for the heavy mitochondrial strand was end labeled with biotin (Hultman et al., 1989) for later separation of the double-stranded polymerase chain reaction (PCR) product by means of Dynal streptavidin magnetic beads (Dynal Corp.). Procedures, amounts of reagents, and buffers followed the Perkin-Elmer protocol in their AmpliTaq DNA polymerase kit (Perkin-Elmer Inc., N808-0167). Typical amplification parameters were 35 cycles of denaturation at 96°C for 45 sec, annealing at 53°C for 55 sec, and polymerization at 72°C for 90 sec. Amplified DNA was then bound to Dynabead M-180 streptavidin (Dynal Corp.), which produced high yields of purified, single-stranded template DNA for sequencing (Hultman et al., 1989; Uhlen, 1989). Sanger dideoxy sequencing (Sanger et al., 1977) was performed by means of Sequenase II and PCR product sequencing kits (Amersham/U.S. Biochemical Corp.), using the complementary primer and the purified, single-stranded DNA as a template. Samples from sequencing reactions were run on 6% acrylamide gels with constant temperatures of 50°C at approximately 2500 V. Samples were usually run on three separate gels for 2.5, 5, and 8 hr in order to resolve sequences at various distances to 500 bp from the primer. Gels were transferred to blotting paper, dried for 2 hr, and visualized by autoradiography after 72 hr or longer of exposure to Kodak X-OMAT film. Sequences from gels were read into a Macintosh computer using an IBI/Kodak digitizer and MacVector-AssemblyLIGN software (International Biotechnologies, Inc., 1992).
C. Alignment and Data Analysis
Sequences were aligned with each other using MacVector and AssemblyLIGN IBI/Kodak sequence analysis software and by hand. Pairwise (p) genetic distances, which are the proportion of nucleotide sites differing between each pair of sequences, were calculated using the program Phylogenetic Analysis Using Parsimony (PAUP* 4.0) (Swofford, 1996), and their standard errors were determined using MEGA (Kumar et al., 1993). Neighbor-joining (Saitou and Nei, 1987) clustering analyses were used to generate distance trees from the p distances using PAUP* 4.0 (Swofford, 1996). Support of the data set for nodes of the trees was determined by 100 bootstrapping replications, and a standard error test for the interior branch lengths of the neighbor-joining tree was conducted using MEGA (Kumar et al., 1993). For purposes of providing a very rough comparison of possible relative divergence times, a "conventional" mtDNA calibration rate of 1% sequence divergence per million years (myr) for an ectothermic animal was used (Brown et al., 1979; Avise, 1994). Preliminary results indicated that 12S rDNA sequences appeared to evolve in blennioids at moderately rapid rates in comparison with other mtDNA regions. Caution should be used with such extrapolations to evolutionary times because different nucleotide positions and genes within mtDNA may evolve at varying rates within some lineages (Gillespie, 1986; Moritz et al., 1987) and the pace
of mtDNA evolution has been linked to differences in metabolic rate and/or to body size differences in some groups (Thomas and Beckenbach, 1989; Martin et al., 1992; Rand, 1994; see Section IV). Most of the blennioids examined in the present study were approximately similar in size, ranging from about 4 to 8 cm TL; inhabit similar nearshore habitats, from intertidal to approximately 30 m in depth; and are warm temperate to tropical species (see Section I and Table I). Exceptions in this study are members of the notothenioid outgroup and the zoarcid Lycodichthys, which inhabit the much colder waters of Antarctica, and some zoarcids (i.e., Lycodes and Zoarces) and stichaeids (Plectobranchus), which inhabit deeper, colder waters of temperate regions. These taxa thus have markedly lower metabolic rates, which may influence the rates of mitochondrial evolution (see review by Rand, 1994).
In the present study, approximate divergence estimates were compared with independent estimates from the fossil record, geologic events, and other genetic distance studies, including DNA and allozyme analyses, where available. For groups of taxa analyzed with both mtDNA and allozyme (Stepien and Rosenblatt, 1991; Stepien, 1992) data, regression analysis (SPSS, 1992, version 5.0.1) was used to compare the p distances with Nei's (1972) D values.
Maximum parsimony in the PAUP* 4.0 program (Swofford, 1996) was the primary method used to analyze relationships from the blennioid DNA sequences. Characters were coded as unordered, and uninformative characters and missing data were excluded. Deletions were treated as single, independent characters. Fifty separate heuristic searches with random input order of taxa were used to analyze the entire data set for all taxa, due to its size. The trees were rooted to the Notothenioidei and Zoarcoidei.
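The p distance and the rough rate calibration described in this section can be illustrated with a short sketch. This is not the PAUP* or MEGA implementation; the function names, the toy sequences, and the choice to skip gapped or ambiguous sites are assumptions made for illustration, as is the reading of the 1% per myr figure as applying directly to the pairwise p distance.

```python
def p_distance(seq_a, seq_b, skip_chars="-?N"):
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue  # assumption: gaps and ambiguous sites are not compared
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def rough_divergence_myr(p, percent_per_myr=1.0):
    """Very rough divergence time (myr) from a p distance and a fixed rate."""
    return (p * 100.0) / percent_per_myr


# Hypothetical 20-bp aligned fragments (not taken from the study's data set).
taxon_1 = "ACGTTGCAACGTTGCAACGT"
taxon_2 = "ACGTTGCAATGTTGCGACGT"

p = p_distance(taxon_1, taxon_2)  # 2 differing sites out of 20
print(f"p distance = {p:.3f}, rough divergence = {rough_divergence_myr(p):.1f} myr")
```

As the text cautions, a single fixed rate ignores rate variation among sites, genes, and lineages, so the resulting times are useful only for coarse comparison.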
After initial PAUP heuristic searches of all taxa were completed, individual families and clades of families were analyzed separately using either exhaustive searches or the branch-and-bound algorithm (Hendy and Penny, 1982). Members of the sister family and several other outgroup taxa, determined from the prior heuristic searches of all taxa, were designated as outgroups. Independent searches tested different relative weightings for transversions and transitions, according to their relative frequencies in the data set, as well as insertions and deletions. Consistency indices (CIs), lengths of the most-parsimonious and near-most-parsimonious trees, and strict and 50% majority-rule consensus trees were used to evaluate competing phylogenies. Support of the data set for nodes was estimated with 500 bootstrap replications of the data set and either the branch-and-bound algorithm (Hendy and Penny, 1982) or heuristic searches, when size of the data set precluded the former.
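The distinction between transitions and transversions that underlies these weighting schemes can be sketched as follows. The function names and toy sequences are hypothetical, and counting differences between two aligned sequences is a simplification: the study tallied substitutions through PAUP*, not with this code. The inverse-frequency weight at the end is one assumed way of deriving a weighting from relative frequencies.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def substitution_type(base_a, base_b):
    """Return 'transition', 'transversion', or None if there is no scorable change."""
    a, b = base_a.upper(), base_b.upper()
    if a == b or a not in BASES or b not in BASES:
        return None
    # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
    same_class = (a in PURINES) == (b in PURINES)
    return "transition" if same_class else "transversion"


def count_ti_tv(seq_a, seq_b):
    """Count pairwise transitions and transversions between two aligned sequences."""
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq_a, seq_b):
        kind = substitution_type(a, b)
        if kind is not None:
            counts[kind] += 1
    return counts


def inverse_frequency_tv_weight(n_transitions, n_transversions):
    """Assumed scheme: weight transversions by how much rarer they are."""
    return n_transitions / n_transversions


# Hypothetical aligned fragments.
print(count_ti_tv("AACG", "GCCT"))
```

Under this assumed scheme, a data set with roughly 60% transitions and 40% transversions would give transversions a relative weight of about 1.5, that is, approximately 3:2.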
Distance clustering trees, such as neighbor joining, are based on reducing the character-state data set to a single number (the p distances here) between each pair of taxa. Although they are useful for comparing overall amounts of sequence divergence, as used in this study, distance models are generally not regarded as a rigorous approach for evaluating and comparing phylogenies. In contrast, maximum parsimony analyses are based on character state changes throughout the data set and allow competing phylogenies to be systematically compared (see discussions by Avise, 1994; Swofford et al., 1996). For this reason, in cases of discrepancy between the two types of trees in the present study, the parsimony tree was regarded as the more likely evolutionary scenario.
The authors also tested for possible unequal rates of nucleotide evolution due to the secondary structure in the paired (stem) versus unpaired (loop and single stranded) elements of the mitochondrial 12S ribosomal DNA, as have been found in some other studies of nuclear and mitochondrial ribosomal DNA sequences (Wheeler and Honeycutt, 1988; Vawter and Brown, 1993; Orti et al., 1996). These influences may bias phylogenetic results (Hillis and Dixon, 1991; Dixon and Hillis, 1993; Orti et al., 1996). The authors' aligned mitochondrial 12S rDNA sequences were compared with secondary structure models for Homo sapiens (Neefs et al., 1991) and piranhas (Teleostei: Characiformes: Characidae: Serrasalminae; Orti et al., 1996) to formulate a model of blennioid secondary structure for Paraclinus integripinnis, following methods used by Orti et al. (1996). This model (Fig. 3) was then used to identify paired and unpaired regions for the other taxa according to the aligned sequences. Base composition, numbers of variable positions, transition:transversion ratios, and numbers of informative characters (from PAUP* 4.0; Swofford, 1996) were compared in the two types of structural elements versus the entire data set. Relative rates of nucleotide substitution were determined by dividing the number of changes in the paired versus unpaired regions by the number of nucleotides in each region, following Orti et al. (1996). Frequencies of variations were compared between paired versus unpaired regions using contingency table tests (Siegel and Castellan, 1988). Separate neighbor-joining and parsimony analyses were conducted (as discussed earlier) on subsets of data for the paired versus unpaired elements using PAUP* 4.0 (Swofford, 1996). Resulting trees were then compared with each other and with analyses based on the whole data set (see earlier discussion). DNA sequences were deposited in GenBank (accession numbers U90356-U90414).
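The partitioning of an alignment into paired and unpaired elements can be sketched as below. This is an illustration only: the mask notation ('P' for paired, 'U' for unpaired), the function name, and the toy alignment are assumptions, and the sketch counts variable sites per region as a simple proxy, whereas the text describes dividing the number of changes by the number of nucleotides in each region.

```python
def variable_sites_by_region(alignment, region_mask):
    """Tally sites and variable sites for each structural region.

    alignment   : list of equal-length aligned sequences
    region_mask : string with one character per column,
                  'P' (paired/stem) or 'U' (unpaired/loop); hypothetical notation
    """
    n_cols = len(region_mask)
    if any(len(seq) != n_cols for seq in alignment):
        raise ValueError("all sequences must match the mask length")
    summary = {"P": {"sites": 0, "variable": 0},
               "U": {"sites": 0, "variable": 0}}
    for col, region in enumerate(region_mask):
        # Only unambiguous bases are considered when scoring variation.
        states = {seq[col].upper() for seq in alignment if seq[col].upper() in "ACGT"}
        summary[region]["sites"] += 1
        if len(states) > 1:
            summary[region]["variable"] += 1
    for stats in summary.values():
        stats["proportion"] = (stats["variable"] / stats["sites"]
                               if stats["sites"] else 0.0)
    return summary


# Hypothetical 8-column alignment: first four columns paired, last four unpaired.
toy_alignment = ["ACGTACGT", "ACGTTCGA", "ACCTACGA"]
toy_mask = "PPPPUUUU"
result = variable_sites_by_region(toy_alignment, toy_mask)
print(result["P"]["proportion"], result["U"]["proportion"])
```

In the toy data one of four paired columns and two of four unpaired columns vary, so the unpaired region shows the higher proportion of variable sites.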
III. Results
The aligned mitochondrial 12S rDNA data set for 59 blennioid and outgroup taxa used for analysis consists of 400 bp. Table III indicates the numbers of transitional and transversional substitutions per family and suborder. These ratios are approximately consistent among taxa at the levels of families and suborders (Table III), comprising 60% transitions and 40% transversions in the entire suborder Blennioidei and 58% transitions and 41% transversions in the outgroups (Zoarcoidei and Notothenioidei combined). The sole familial exception is a preponderance of transversions in the Nototheniidae. Transition:transversion ratios vary considerably within groups of congeners analyzed, ranging from 0.58 in Labrisomus (N substitutions = 27) and 0.83 in Gibbonsia (N substitutions = 22) to 1.9 in Entomacrodus (N substitutions = 26) and 4.0 in Lycodes (N = 15), in contrast to their more stable proportions at the familial level (Table III). Differential weighting schemes, including weighting transversional:transitional substitutions 3:2 (determined from their relative proportions, see earlier discussion) and insertions/deletions 3:1 and 10:1, did not change the most-parsimonious trees in the PAUP analyses and are not shown.
TABLE III Numbers of Transitional and Transversional Base Substitutions in Families and Suborders^a
Taxon | N transitions | N transversions | Ratio | Total
Family Clinidae | 113 | 73 | 1.55 | 186
Family Labrisomidae | 115 | 75 | 1.53 | 190
Family Chaenopsidae | 48 | 35 | 1.37 | 83
Family Tripterygiidae | 76 | 42 | 1.80 | 118
Family Blenniidae | 132 | 81 | 1.63 | 213
Suborder Blennioidei | 249 | 164 | 1.52 | 413
Family Stichaeidae | 8 | 6 | 1.33 | 14
Family Pholidae | 15 | 9 | 1.67 | 24
Family Zoarcidae | 27 | 14 | 1.93 | 41
Suborder Zoarcoidei | 59 | 35 | 1.69 | 94
Family Nototheniidae | 20 | 22 | 0.91 | 42
Family Bathydraconidae | 13 | 8 | 1.63 | 21
Suborder Notothenioidei | 40 | 38 | 1.05 | 78
^a Ratio is transitions/transversions. There are no significant differences in the proportions of transitions and transversions among blennioid families (χ² = 1.0, df = 4, p > 0.90), zoarcoid families (χ² = 0.35, df = 2, p > 0.90), at the familial versus blennioid suborder level (χ² = 0.12, df = 1, p > 0.70), or among the three suborders (χ² = 2.7, df = 2, p > 0.50).
Figure 3 shows the secondary structure model for the blennioid P. integripinnis. The first 54 bp of the blennioid data set was not used in constructing the model or for further structural comparisons due to difficulty in aligning to the piranha sequences (Orti et al., 1996; see Section II) and, consequently, determining secondary structure. Paired elements of the 12S blennioid sequence data have fewer nucleotide changes (64 of 169 sites vary, equal to 40% of the overall variability) than do the unpaired regions (96 of 177 sites vary, equal to 60% of the overall variability). Proportions of variable sites are significantly greater in the unpaired regions (χ² = 8.7, df = 1, P < 0.005). Paired regions have a slightly greater proportion of phylogenetically informative characters (113 of 169, equal to 55% of the number of informative characters in the entire data set) than do unpaired regions (93 of 177, equal to 45% of the total number of informative characters), which is a significant difference (χ² = 7.4, df = 1, P < 0.01). The transition:transversion ratio is slightly (but not significantly) higher in paired (2.5:1.0) versus the unpaired (1.6:1.0) areas, and the former are thus somewhat (but not significantly) less saturated (χ² = 2.2, df = 1, P < 1.0).
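The contingency-table comparison of variable sites in paired versus unpaired regions can be worked through from the counts given above (64 of 169 paired sites and 96 of 177 unpaired sites vary). The sketch below is a generic 2 x 2 chi-square, not the authors' software; the function name is hypothetical, and whether a continuity correction was applied in the original analysis is not stated, so both variants are computed.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square statistic for the 2 x 2 table [[a, b], [c, d]] (df = 1).

    With yates=True, the Yates continuity correction is applied.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2.0, 0.0)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff ** 2 / denom


# Counts reported in the text: variable sites out of total sites per region.
paired_variable, paired_total = 64, 169
unpaired_variable, unpaired_total = 96, 177

table = (paired_variable, paired_total - paired_variable,
         unpaired_variable, unpaired_total - unpaired_variable)

chi2_corrected = chi_square_2x2(*table, yates=True)
chi2_uncorrected = chi_square_2x2(*table, yates=False)
print(f"with correction: {chi2_corrected:.2f}; without: {chi2_uncorrected:.2f}")
```

With the continuity correction the statistic is about 8.67, close to the reported 8.7; without it the value is about 9.32. Either exceeds the df = 1 critical value of 7.88 for P = 0.005, consistent with the significance level reported in the text.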
There are significant biases in nucleotide composition within the paired (24.9% G, 14.1% A, 24.9% T, 31.1% C; χ² = 227, df = 3, P < 0.0001) and unpaired (12.0% G, 40.3% A, 22.2% T, 25.5% C; χ² = 1374, df = 3, P < 0.0001) sequence regions. These nucleotide proportions are also significantly different between the paired and the unpaired areas (χ² = 1032, df = 3, P < 0.0001). Paired elements are significantly richer in guanine and cytosine nucleotides (56%), whereas unpaired sites have significantly greater numbers of adenine and thymine bases (62%; χ² = 562, df = 1, P < 0.005). Separate neighbor-joining and parsimony analyses showed only slight variations in tree topologies among paired, unpaired, and combined data sets and are thus not included. The neighbor-joining distance tree of all blennioid genera for the entire data set, based on p distances
(PAUP* 4.0; Swofford, 1996), is shown in Fig. 4. Percentages on the nodes of the trees in this study show bootstrap support above 50% for nodes. Figure 5 is a summary of familial groupings from strict consensus of most-parsimonious trees, calculated using all genera and 50 independent repeated heuristic searches with PAUP*4.0 (Swofford, 1996). Parsimonious relationships among tribes, species, and genera are shown in greater detail in Figs. 6 and 7. The neighbor-joining (Fig. 4) and parsimony trees (Figs. 5 and 6) for blennioids are similar, but differ from each other in positionings of the family Dactyloscopidae and of the labrisomid tribe Mnierpini. In neighbor joining, the dactyloscopid is closest to the family Tripterygiidae. In parsimony analyses (Figs. 5 and 6), Dactyloscopidae is the basal clade in the suborder Blennioidei. In the neighbor-joining tree (Fig. 4), Mnierpini is genetically closest to the North American myxodin clinids (Gibbonsia and Heterostichus). In the parsimony analyses (Fig. 6A), Mnierpini is depicted as the sister taxon to the clade containing the other labrisomids (and the chaenopsids), and this entire clade is then the sister group of a monophyletic Clinidae. Both neighbor-joining and parsimony analyses group the "family Labrisomidae" as paraphyletic and the Chaenopsidae as a monophyletic group contained within it. The neighbor-joining (Fig. 4) and parsimony trees (Fig. 6) also differ in some cluster relationships among the clinid and labrisomid tribes, which are separated by relatively short genetic distances. Because the standard errors of these short branch lengths in the neighbor-joining analysis are high (MEGA analyses; Kumar et al., 1993), this tree cannot adequately distinguish the order of these higher-level relationships. This may be due either to site saturation (swamping of transitions; see Brown et al., 1979), which does not appear to be the case here, or to rapid taxon divergence rates. 
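The bootstrap percentages shown on the trees derive from resampling alignment columns with replacement and repeating the tree search on each pseudoreplicate. Only the resampling step is sketched here; the tree building itself was done in PAUP* and MEGA and is not reproduced. The function names, toy alignment, and fixed random seed are assumptions for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample columns of an alignment with replacement.

    alignment : dict mapping taxon name -> aligned sequence (equal lengths)
    rng       : random.Random instance, so that replicates are reproducible
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    # Draw n_sites column indices with replacement; apply the same draw to every taxon.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in columns)
            for taxon, seq in alignment.items()}


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Generate a list of pseudoreplicate alignments."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Hypothetical three-taxon alignment.
toy = {"taxon_A": "ACGTACGT", "taxon_B": "ACGTTCGA", "taxon_C": "ACCTACGA"}
replicates = bootstrap_replicates(toy, n_replicates=100, seed=42)
print(len(replicates), len(replicates[0]["taxon_A"]))
```

Support for a node is then the percentage of pseudoreplicate trees in which that grouping is recovered; values below 50% are not shown on the trees in this study.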
Figure 6A is the 50% majority-rule consensus tree of the most-parsimonious trees from a branch-and-bound search of the families Clinidae, Labrisomidae, and Chaenopsidae. A 50% majority-rule consensus of branch-and-bound search maximum parsimony trees depicting the relationships of the families Dactyloscopidae, Tripterygiidae, and the Blenniidae is shown in Fig. 6B. A single most-parsimonious tree was obtained from a branch-and-bound search for the relationships of the suborders Zoarcoidei and Notothenioidei and is shown in Fig. 7. Separate exhaustive searches were also conducted for each family, and results are indicated in the legends of Figs. 6 and 7. Figure 8 shows results of the regression analysis of Nei's (1972) D from allozyme studies (reported in Stepien and Rosenblatt, 1991; Stepien, 1992) versus p distances from 12S mtDNA sequences.
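The regression of allozyme distances against sequence distances can be sketched with an ordinary least-squares fit. The original analysis used SPSS; the function below is a generic stand-in, the choice of Nei's D as the predictor is an assumption, and the paired values are hypothetical numbers invented for illustration, not data from this study.

```python
def least_squares(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared). Assumes x and y each vary.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical paired distances for five taxon pairs (illustrative only).
nei_d = [0.10, 0.25, 0.40, 0.60, 0.90]
p_dist = [0.03, 0.05, 0.08, 0.10, 0.16]

slope, intercept, r2 = least_squares(nei_d, p_dist)
print(f"p = {intercept:.3f} + {slope:.3f} * D, r^2 = {r2:.3f}")
```

A high r² in such a fit would indicate that the two distance measures rank taxon pairs similarly, which is the comparison summarized in Fig. 8.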
FIGURE 3 Secondary structure of the labrisomid Paraclinus integripinnis, showing paired and unpaired regions. Secondary structure was not determined for the first 54 bases of the blennioid data set (see Section III).
FIGURE 4 Neighbor-joining distance tree (PAUP* 4.0; Swofford, 1996) using p distances for 12S mtDNA sequence data for all taxa. Branch lengths are indicated by decimals, where space is available, and can be calculated by length comparisons for others. Distances among taxa may be estimated by adding the branch lengths. Bootstrap values above 50% are shown as percentage support for nodes.
A. Parsimonious Relationships of the Families Clinidae, Labrisomidae, and Chaenopsidae
The chaenopsids, labrisomids, and clinids together form a monophyletic clade in the parsimony analyses
(Figs. 5 and 6) and are also most closely related to each other in the neighbor-joining tree (Fig. 4). Maximum parsimony trees (Figs. 5 and 6A) show that the Clinidae and Chaenopsidae are each monophyletic, but the Labrisomidae is paraphyletic and contains the Chaenopsidae (Fig. 6A). Members of the egg-laying tribe
15. BlennioidRelationships
257 Clinidae
Labrisomidae/Chaenopsidae
Tripterygiidae
100%
Blenniidae
76%
Dactyloscopidae
Zoarcoidei
100%
Notothenioidei
Consensus of three most-parsimonious trees summarizing primary taxonomic groupings from 50 heuristic searches of all genera using PAUP 94.0 (Swofford, 1996), excluding uninformative characters. The topology of these major clades was identical in all three most-parsimonious trees (CI excluding uninformative characters = 0.30, length = 1320 steps). FIGURE 5
Myxodini are located basally within the family Clinidae in the parsimony analyses (Fig. 6A), but do not comprise a separate sister clade to the live-bearing tribes Clinini and Ophiclinini. Instead, the North American clinids form a monophyletic basal clade among the myxodins, and this clade is the sister group to Myxodes and the remaining clinids. Myxodes is then the sister taxon of the monotypic Mediterranean Clinitrachus and the livebearing tribes. Next, Clinitrachus is the sister taxon to the clade containing the tribes Clinini and Ophiclinini. Within the Labrisomidae, the tribe Mnierpini (Mnierpes macrocephalus) is depicted as
the basal taxon and as the sister group of the other labrisomids, as well as the Clinidae. The tribe Starksiini is shown as the next most basal labrisomid group and is then the sister group of the remaining labrisomids. The tribe Labrisomini (comprising the genera Labrisomus and Malacoctenus) is monophyletic and is the sister group of a clade grouping the tribes Neoclinini and Cryptotremini together. The genera Paraclinus and Exerpes comprise the monophyletic tribe Paraclinini. The family Chaenopsidae is depicted as a monophyletic clade within the Labrisomidae, and the relationships among Stathmonotus, the Paraclinini, and the Chaenopsidae are not resolved in these analyses. Within Chaenopsidae, Chaenopsis is the sister group to the Acanthemblemaria clade, and Emblemaria is then the sister group of the entire clade.

[Figure 6: most-parsimonious trees, not reproduced here. Panel A groups: Clinidae (Ophiclinini, Clinini, Myxodini), "Labrisomidae" (Paraclinini, Cryptotremini, Neoclinini, Labrisomini, Starksiini, Mnierpini; Stathmonotus sp. of uncertain placement), Chaenopsidae, and outgroups. Panel B groups: Tripterygiidae, Blenniidae (Parablenniini, Salariini, Omobranchini, Nemophini), Dactyloscopidae, and outgroups.]

FIGURE 6 Most-parsimonious (MP) trees from branch-and-bound searches for (A) the families Clinidae, Labrisomidae, and Chaenopsidae (rooted to tripterygiid, blenniid, and dactyloscopid outgroups; CI excluding uninformative characters = 0.41, length = 756 steps) and (B) the families Tripterygiidae, Blenniidae, and Dactyloscopidae (rooted to clinid, labrisomid, and chaenopsid outgroups). Bootstrap values are shown as percentage support for nodes. The trees were first analyzed using the basal species for each genus with more than one species represented, i.e., Heteroclinus, Gibbonsia, Labrisomus, and Malacoctenus in tree A and Hypsoblennius, Entomacrodus, and Omobranchus in tree B. Tree A contains a trichotomy, based on strict consensus of relationships among a labrisomid-chaenopsid clade. Separate exhaustive searches were conducted using all species. These had identical topologies for the families Clinidae (rooted to Paraclinus, Labrisomus, Emblemaria, Starksia, and Mnierpes; one MP tree; CI = 0.53), Labrisomidae/Chaenopsidae (rooted to Heterostichus, Axoclinus, and Omobranchus; two MP trees, CI = 0.43), Chaenopsidae (rooted to Paraclinus, Labrisomus, Starksia, and Mnierpes; one MP tree; CI = 0.64), Tripterygiidae (rooted to Omobranchus, Rhabdoblennius, Starksia, and Mnierpes; one MP tree; CI = 0.62), and Blenniidae (rooted to Axoclinus and Myxodagnus; four MP trees, which differed in the relationships among species of Omobranchus and in relative positioning of Ecsenius and Rhabdoblennius; CI = 0.52).
B. Relationships of the Families Dactyloscopidae, Tripterygiidae, and Blenniidae

The family Dactyloscopidae is the basal group of the blennioids in the parsimony analyses (Figs. 5 and 6B) and is very divergent in the neighbor-joining tree (Fig. 4). The dactyloscopid sequenced has one deletion and one insertion which distinguish it from all other taxa analyzed in this study. The families Tripterygiidae and Blenniidae are monophyletic and group together as sister taxa in the most-parsimonious branch-and-bound tree (Fig. 6B). In the overall parsimony analysis (Fig. 5), the family Blenniidae is the basal taxon to the clade containing the Tripterygiidae as the next most basal group. The single most-parsimonious tree (Fig. 6B) shows members of the tripterygiid tribe Leptoblenninae (Axoclinus and Karalepis) as paraphyletic basal groups to the monophyletic tribe Tripterygiinae (Notoclinus, Tripterygion, and Rosenblatella). Two primary sister clades are found in the family Blenniidae: one contains the tribe Parablennini and most of the tribe Salariini (Rhabdoblennius, Ecsenius, and Entomacrodus) as sister taxa and the other contains the tribe Omobranchini as the sister taxon to a clade containing the tribe Nemophini and the remaining salariin (Ophioblennius).

[Figure 7: most-parsimonious tree, not reproduced here. Groups shown: Zoarcoidei (Zoarcidae, Stichaeidae, Pholidae), Notothenioidei (Nototheniidae, Bathydraconidae), and outgroups.]

FIGURE 7 Single most-parsimonious tree obtained from a branch-and-bound search for relationships of the suborders Zoarcoidei and Notothenioidei (rooted to a dactyloscopid, a tripterygiid, and two blenniids; CI excluding uninformative characters = 0.62, length = 451). Bootstrap values are shown as percentage support for nodes. A separate exhaustive search of the Zoarcoidei (rooted to Omobranchus, Myxodagnus, Notothenia, and Gymnodraco) yielded a single most-parsimonious tree (CI excluding uninformative characters = 0.66, length = 319 steps), which was identical to the topology of the tree shown. A separate exhaustive search of the Notothenioidei (rooted to Omobranchus, Myxodagnus, Zoarces, and Pholis) yielded a single most-parsimonious tree (CI excluding uninformative characters = 0.74, length = 309), which was identical to the topology of the tree shown.

[Figure 8: scatter plot with regression line, not reproduced here. Horizontal axis: p distance (0.00 to 0.18). Vertical axis: Nei's D (0.0 to 2.0).]

FIGURE 8 Regression analysis of Nei's (1972) genetic distances (D) from allozyme data (Stepien and Rosenblatt, 1991; Stepien, 1992) versus p distances from 12S mtDNA sequences (see Fig. 4). F = 19.42, P < 0.0002.
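The regression of Nei's D on p distance summarized in Fig. 8 reports an F statistic. As a hedged illustration of how such a value arises in a simple least-squares regression, the sketch below computes the slope, intercept, and F ratio from paired values. The function name is hypothetical, and no data are supplied because the paired distances themselves are not listed in the text.

```python
def simple_regression_f(x, y):
    """Least-squares fit of y on x; returns (slope, intercept, F).

    F is the regression mean square (1 degree of freedom) divided by
    the residual mean square (n - 2 degrees of freedom).
    """
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if n < 3:
        raise ValueError("at least three points are needed")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_regression = slope * sxy
    ss_residual = ss_total - ss_regression
    if ss_residual <= 0:
        return slope, intercept, float("inf")  # perfect fit
    f_stat = ss_regression / (ss_residual / (n - 2))
    return slope, intercept, f_stat
```

Here x would hold the p distances and y the corresponding Nei's D values for the same pairs of taxa; a large F indicates that the linear relationship explains much more variance than it leaves unexplained.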
C. Relationships of the Suborders Zoarcoidei and Notothenioidei

The two outgroup suborders, Zoarcoidei and Notothenioidei, are monophyletic and are sister taxa in the most-parsimonious trees (Figs. 5 and 7). They are also most closely related to each other by genetic distances in the neighbor-joining tree (Fig. 4). The zoarcoid families tested (Zoarcidae, Stichaeidae, and Pholidae) are each monophyletic, and Pholidae is the basal clade to a sister group comprising Zoarcidae and Stichaeidae (Fig. 7). A unique insertion unites the notothenioids and this study depicts the families Nototheniidae and Bathydraconidae as sister groups.

IV. Discussion
A. Molecular Features of the Data Set

The secondary structure of blennioid 12S rDNA (Fig. 3) is highly consistent with other vertebrates (Neefs et al., 1991), especially other teleost fishes (Orti et al., 1996). Base compositional biases in the paired (G/C rich) and unpaired (A/T rich) regions have also been found in nuclear srDNA (Vawter and Brown, 1993) and other mitochondrial 12S rDNA (Orti et al., 1996) data sets. The (G/C) bias of the paired elements is believed to increase ribosomal subunit structural stability, and the (A/T) bias of the unpaired regions is thought to facilitate protein binding (Gutell et al., 1985). Transitions outnumber transversions in both the paired and the unpaired regions of the 12S rDNA blennioid data set, a bias found in all studies of mitochondrial DNA reviewed by Meyer (1993). Transition:transversion ratios are similar among sets of congeners, families, and suborders in the authors' data set (see Section III and Table III), and consistency of these ratios may suggest a retention of the phylogenetic signal at the various hierarchical levels. Neither differential weighting of transversions:transitions (3:2; according to their relative frequencies) for the entire sequence data set nor the use of transversions only changed the topologies of parsimonious trees.

The secondary structure of the 12S rDNA region does not appear to affect the phylogenetic reconstruction of blennioid taxa. Comparisons of paired and unpaired regions in some other studies have suggested that unpaired regions produce more reliable phylogenies (Wheeler and Honeycutt, 1988; Vawter and Brown, 1993; Orti et al., 1996). Dixon and Hillis (1993) suggested that when relative rates of evolution are markedly different in paired and unpaired regions, weighting may be used to compensate. In the authors' study, separate parsimony and neighbor-joining analyses of data from the paired and unpaired structural regions resulted in few changes to the overall tree topologies, compared with those based on the entire sequence (see Section III). These variant trees split some clades that are well characterized on the basis of morphology, and the separate analyses thus appear to sacrifice informative characters needed to resolve these relationships. Orti et al. (1996) found that small subunit nuclear rDNA unpaired regions evolved four times as fast as paired regions in piranha taxa.

In comparison, the authors' results indicate that blennioid unpaired regions evolve more slowly, less than two times as fast as the paired regions. Informative characters are thus more evenly distributed between the paired and the unpaired elements in the authors' data set (see Section III). The entire 12S rDNA sequence data set of blennioid taxa in this study, including the paired and unpaired structural elements, contains phylogenetically informative characters.
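The distinction between transitions and transversions, and the 3:2 transversion:transition weighting mentioned above, can be illustrated with a short sketch. It counts the two substitution classes between a pair of aligned sequences and applies the weights; the function names are hypothetical, and this pairwise count is only an illustration, not the character weighting procedure of a parsimony program.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_substitutions(seq_a: str, seq_b: str):
    """Return (transitions, transversions) between two aligned sequences.

    Sites where either base is not A, C, G, or T are ignored
    (an assumption made for this illustration).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    return transitions, transversions


def weighted_changes(transitions: int, transversions: int,
                     ti_weight: int = 2, tv_weight: int = 3) -> int:
    """Weighted count of changes under a 3:2 transversion:transition scheme."""
    return ti_weight * transitions + tv_weight * transversions
```

Setting `ti_weight` to 0 reproduces a transversions-only count, the other scheme tested in the text.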
B. Overall Phylogenetic and Distance Relationships

The neighbor-joining tree based on p distances (Fig. 4) and the most-parsimonious PAUP trees (Figs. 5, 6, and 7) largely support morphological hypotheses for the relationships of these groups (Springer, 1993). A
close correspondence exists between p distances based on these sequence data and distance estimates based on allozyme data (Nei's D, 1972; Stepien et al., 1993; see Fig. 8). These similar distance ratios may suggest similar evolutionary periods of time, especially for the lower taxonomic levels common to both studies. Examples of approximate molecular clock/evolutionary time calibrations are given later in this chapter, but should be regarded with extreme caution due to difficulty in calibrating clocks, possible differences in evolutionary rates among lineages (Gillespie, 1986; Moritz et al., 1987), and possible site saturation (Brown et al., 1979) at the higher taxonomic levels. A rate of sequence divergence of 1% per million years was used for calibration in the present study, which is at the lower end of the conventional range of 1 to 2%, adjusted for ectotherms (reviewed in Avise, 1994). If the rate of molecular evolution of these taxa has been relatively constant with time, then the proportional distances will allow future calibration adjustments. If the phylogenetic signal is "swamped" at higher taxonomic levels by too many substitutions at given sites, then the divergences at the deeper branches of the distance tree (Fig. 4) are underestimated. The relative rate of mtDNA evolution has been postulated to be correlated with differences in metabolic rate, body size, and/or generation time in some animal groups (Thomas and Beckenbach, 1989; Martin et al., 1992; Martin and Palumbi, 1993; Rand, 1994). Martin et al. (1992) examined sharks and Thomas and Beckenbach (1989) tested salmonids, and both studies compared the rates of these cold-blooded animals with those of mammals. These studies suggest that the rate of substitution may be two to five times lower in ectotherms than in endotherms.
However, other studies of marine and freshwater teleosts have identified relatively high and similar rates of mtDNA substitutions in groups inhabiting a variety of different biogeographic temperature zones (Stepien, 1995). For example, Stepien (1995) found that deep-sea teleost fishes (members of the pleuronectid genus Microstomus and the scorpaenid genus Sebastolobus) inhabiting cold waters (approximately 4°C) and having very low metabolic rates (living in the oxygen minimum zone) had high levels of variability in the mtDNA control region (comparable to teleosts inhabiting shallow, warmer waters). It is also possible that the higher mtDNA evolution rates in populations of the species of Microstomus and Sebastolobus examined by Stepien (1995) may be due to the influence of warmer temperatures and/or mutagenic effects of ultraviolet radiation during their pelagic early life history stages (the larval period may extend to 1 year for the Dover sole, M. pacificus). Sharks and sea turtles (which also have slow rates of
mtDNA substitutions; Avise et al., 1992) have less exposure to radiation during early life history stages than do relatively transparent pelagic fish larvae in surface waters. Blennioid fishes typically have pelagic larvae and many have relatively long larval periods (Matarese et al., 1984; Thresher, 1984), e.g., 2 months in myxodin clinids (Stepien, 1986a). It is possible that damage to mtDNA may be extensive during this early life history period, resulting in high mutation rates. In support of using these genetic distances to roughly estimate separation times in this study, there are no marked differences in relative magnitudes separating congeners belonging to different biogeographic temperature regions, including the deep water Lycodes; the temperate shallow water Apodichthys, Gibbonsia, Heteroclinus, Hypsoblennius, Entomacrodus, and Omobranchus; and the shallow water tropical Malacoctenus, Labrisomus, and Starksia (Fig. 4). These results suggest that there may be no direct correlation between habitat temperature and the rate of mtDNA mutations among these ectothermic taxa. Total horizontal genetic distances separating all taxa in the neighbor-joining tree (Fig. 4) are equivalent to a possible divergence of approximately 30.0 ± 3.0 myr, during the mid to late Oligocene epoch (or earlier, if the calibration rate should be increased and/or if site saturation is responsible for underestimation). Distances suggest that the lineage containing the clinids, labrisomids, and chaenopsids stemmed from a common ancestor shared with the other blennioids by 23.0 ± 2.0 to 27.0 ± 3.0 myr. Ancestors of the families Tripterygiidae and Blenniidae may have similarly diverged by approximately 22.0 ± 2.0 and 26.0 ± 2.0 myr, respectively. These distances may suggest a relatively rapid diversification of blennioid higher taxa in a variety of demersal tropical and temperate habitats during the early to mid-Miocene epoch.
Alternatively, the deeper phylogenetic radiations in this study may erroneously appear to have occurred at approximately similar times due to site saturation of the sequence. This hypothesis may be tested with more slowly evolving nuclear DNA regions, such as the ribosomal array (Stepien et al., 1993). Trees of blennioid familial relationships obtained from nuclear ribosomal DNA ITS-1 spacer sequences appear congruent with those obtained in this study (Stepien et al., 1993), supporting resolution of these relationships with this mitochondrial DNA data set. White (1986, 1989) hypothesized that many modern antitropical distributions, such as that of the family Clinidae (Fig. 2; see Stepien, 1992; Stepien et al., 1993), may have a common paleoclimatic origin in a mid-Miocene, low-latitude warming event, which appears consistent with DNA distances separating the primary blennioid groups. For example, members of the egg-laying clinid tribe Myxodini and the temperate live-bearing clinids appear to have stemmed from an early to mid-Miocene ancestor approximately 21.0 ± 1.5 myr. In the mid-Miocene, the two live-bearing tribes Ophiclinini and Clinini may have diverged from each other about 16.5 ± 1.5 myr, and the egg-laying myxodin clinids split into North and South America groups by 13.3 ± 1.5 myr (congruent with estimates from allozyme data; Stepien, 1992; Stepien et al., 1993). Similarly, the clade containing the temperate cryptotremin and neoclinin labrisomids seems to have separated from common ancestors shared with tropical labrisomids in the New World about 16.3 ± 1.5 myr. Many of the primarily tropical clades in the labrisomid group on the neighbor-joining tree (Fig. 4) may also have diversified during the hypothesized tropical Miocene warming, including Chaenopsidae (16.3 ± 1.5 myr), Starksiini (15.4 ± 1.5 myr), the enigmatic genus Stathmonotus (15.4 ± 2.0 myr), and Labrisomini (14.6 ± 2.0 myr). Tribes in the family Blenniidae likewise appear to share this divergence pattern, including Salariini (19.3 ± 2.0 myr), Omobranchini (17.3 ± 2.0 myr), Nemophini (17.7 ± 2.0 myr), and Parablennini (16.8 ± 1.5 myr). The longest single branch divergence within the Blennioidei leads to the Dactyloscopidae (Fig. 4), also suggesting a possible mid-Miocene divergence (approximately 18.4 ± 2.0 myr). Estimated separation times from blennioid groups not directly discussed in this text may be calculated by adding the branch lengths in Fig. 4. Ancestors of the Notothenioidei and Zoarcoidei appear to have separated from a common ancestor shared with the Blennioidei by at least 28.0 ± 3.0 myr. The zoarcoid and notothenioid lineages may have diverged from each other by the early to mid-Miocene, approximately 20.5 ± 2.5 myr.
According to these estimates, modern zoarcoid groups may have diversified by at least 10.0 ± 0.5 myr and modern notothenioids about 9.2 ± 0.5 myr, the latter following the expansion of a true Antarctic ice cap approximately 15 myr (Van Andel, 1985; White, 1989; other researchers suggest an older date; see summaries by Eastman, 1993; Miller, 1993). In contrast to these estimates of evolutionary time, a fossil notothenioid in Antarctica dated to 38 myr (Balushkin, 1994) lends support to the metabolic rate/temperature hypothesis for a slower rate of mtDNA change in taxa inhabiting colder waters (Thomas and Beckenbach, 1989; Martin et al., 1992; Avise et al., 1992; Rand, 1994). The rate of mtDNA changes may be markedly slower in these cold water outgroups and these taxa may thus be considerably older (perhaps four times those given here). Alternatively, the calibration time of 1% per million years may underestimate the divergence times for this entire study, although other fossil dates and allozyme data (discussed later) appear to correspond to these estimates.
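The clock conversion used throughout this section is a simple division of sequence divergence by a calibration rate. The sketch below is a minimal illustration, assuming the distance is a sequence divergence expressed as a proportion and the rate is 1% (0.01) per million years as stated above; the function name is hypothetical, and how the authors read divergences off the tree branches is not specified here.

```python
def divergence_time_myr(divergence: float, rate_per_myr: float = 0.01) -> float:
    """Convert a sequence divergence (proportion) to time in million years.

    rate_per_myr is the assumed calibration rate of sequence divergence
    per million years (0.01 = 1% per million years).
    """
    if divergence < 0:
        raise ValueError("divergence must be non-negative")
    if rate_per_myr <= 0:
        raise ValueError("rate must be positive")
    return divergence / rate_per_myr


# A divergence of 0.23 at 1% per million years corresponds to about 23 myr.
clinid_lineage_myr = divergence_time_myr(0.23)

# If the rate were four times slower (0.25% per million years), the same
# divergence would imply a time four times older.
notothenioid_slow_clock_myr = divergence_time_myr(0.092, rate_per_myr=0.0025)
```

Under the fourfold slower rate, the 9.2 myr notothenioid estimate becomes about 36.8 myr, which is close to the 38 myr fossil date cited above and shows why the choice of calibration rate dominates these estimates.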
C. Evolution and Biogeography of the Family Clinidae

Phylogenies of clinids from 12S mtDNA sequences (Figs. 4, 5, and 6A) and allozyme data (Stepien and Rosenblatt, 1991; Stepien, 1992; Stepien et al., 1993) yield trees that show the same ordering of relationships within the family. They differ in that the allozyme tree depicts live-bearing taxa as basal and as more closely related to the labrisomids (Stepien et al., 1993). In the mtDNA parsimony tree (Fig. 6A), the egg-laying myxodins are basal. Neighbor-joining distance analysis (Fig. 4) suggests that live-bearing taxa have the greatest degree of divergence in the family from a common labrisomid ancestor shared with the myxodins and suggests relative timing of divergences that support the allozyme tree. Morphological data (Stepien, 1992; Springer, 1993; Stepien et al., 1993) and molecular data (Stepien et al., 1993; present study) support monophyly of the Clinidae. An exception is the depiction of a close relationship of the labrisomid M. macrocephalus to the North American myxodin clinids in the authors' neighbor-joining tree (Fig. 4), which is not supported by the parsimony analyses (Figs. 5 and 6A) or by the allozyme study (Stepien et al., 1993). Examination of the mtDNA data set reveals no synapomorphies that would place the Mnierpini as part of the Clinidae, to the exclusion of other labrisomids. Inclusion of Mnierpini in the Myxodini appears unlikely based on morphology, but should be further tested and the other mnierpin genus (Dialommus) should be included. Mitochondrial DNA sequence relationships (Figs. 4 and 6A) support the morphological hypothesis that the tribes Ophiclinini and Clinini are sister groups and that inclusion of the ophiclinins (snake blennies) by George and Springer (1980) in the family Clinidae is correct. Parsimony analyses (Fig. 6A) of mtDNA, in contrast to neighbor-joining distance (Fig.
4) and allozyme data (Stepien et al., 1993), suggest that oviparity and external fertilization are the ancestral states among the clinid/labrisomid/chaenopsid clade, supporting the hypothesis of Wourms and Lombardi (1992) that the evolution of viviparity is usually derived in fishes. The mtDNA parsimony tree (Fig. 6A) also supports the hypothesis of Penrith (1969) that the live-bearing groups Clinini and Ophiclinini are less closely related to the common clinid ancestor shared with the Labrisomidae. Evolution of matrotrophic viviparity in the Clinini and Ophiclinini may be responsible for their comparatively greater species richness,
in comparison with the less numerous oviparous Myxodini (Table I), supporting the hypothesis of Lydeard (1993) that viviparity in actinopterygian fishes may be positively correlated with speciation. Parsimony analyses of mtDNA sequences (Fig. 6A), allozymes (Stepien and Rosenblatt, 1991; Stepien, 1992; Stepien et al., 1993), and nuclear rDNA sequences (Stepien et al., 1993) support the conclusion that the live-bearing tribes Ophiclinini and Clinini form a monophyletic sister clade to the egg-laying myxodins. According to p distances, divergence of modern myxodin taxa appears to have occurred at least 16.7 ± 1.5 myr, corresponding to a mid-Miocene separation, possibly the warming of the tropics proposed by White (1986, 1989). Allozyme data (Nei's D = 0.812 ± 0.031) for the previously mentioned separation of the Australian clinin Heteroclinus and the South American myxodin Myxodes (Clinitrachus was not available to the allozyme study) similarly estimated this time as 15.4 ± 0.6 myr (Stepien, 1992; calibrated according to Grant, 1987, D of 1.0 = 19 myr). MtDNA sequence data show that the Mediterranean myxodin (the monotypic Clinitrachus argentatus) is the sister taxon of the live bearers and of the South American Myxodes. Divergence of Myxodes from a common ancestor shared with Clinitrachus is estimated at approximately 11.8 ± 1.0 myr, another apparent Miocene event. There is fossil evidence for a Miocene Clinitrachus in Romania (Bannikov, 1989), which appears congruent with the dates estimated in this chapter. The southeastern Pacific myxodin genus Myxodes is shown to be the sister group to Clinitrachus and to a monophyletic clade containing the northeastern Pacific genera Heterostichus and Gibbonsia (Fig. 6A); the latter relationship is also supported by allozyme data (Stepien and Rosenblatt, 1991; Stepien, 1992).
Separation of the North and South American taxa may have occurred about 13.3 ± 1.5 myr, comparable to allozyme estimates of 13.5 ± 1.0 myr (Nei's D = 0.712 ± 0.031) and compatible with the mid-Miocene climatic warming hypothesis (White, 1986). The genera Heterostichus and Gibbonsia are sister groups (Fig. 6A), as shown in the analyses of allozyme and mtDNA data (Stepien and Rosenblatt, 1991; Stepien, 1992; Stepien et al., 1993). The mtDNA trees show that G. metzi is the sister species to the clade of G. montereyensis and G. elegans (Fig. 6A), discerning between the two most-parsimonious trees from allozyme data (Stepien, 1992). The sister relationship between G. elegans and G. montereyensis is supported by two morphological synapomorphies: unequally spaced posterior dorsal fin rays and prominent dorsal ocelli (Hubbs, 1952; Stepien and Rosenblatt, 1991; Stepien, 1992). The divergence of common ancestors shared by Heterostichus and Gibbonsia may have occurred during the late Miocene, estimated here as 6.5 ± 0.3 myr and as 7.8 ± 1.0 myr (Nei's D = 0.41 ± 0.031) using allozymes (Stepien and Rosenblatt, 1991; Stepien, 1992). Separation of the species of Gibbonsia appears to have occurred about 3.0 to 4.5 ± 0.2 myr. These divergence dates also correspond to those estimated from allozyme data (3.28 ± 0.09 to 4.45 ± 0.13 myr; Stepien and Rosenblatt, 1991; Stepien, 1992). Temperature changes during the Pliocene may have served as vicariant events separating formerly continuous distributions of these intertidal fishes, resulting in speciation (Stepien and Rosenblatt, 1991; Stepien, 1992).
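The allozyme dates quoted in this section follow from the calibration cited above (Grant, 1987), in which a Nei's D of 1.0 corresponds to 19 myr. The sketch below reproduces that arithmetic for the central estimates only; the function name is hypothetical, and the error terms are not derived because the text does not say how they were calculated.

```python
MYR_PER_UNIT_D = 19.0  # calibration cited in the text: D of 1.0 = 19 myr


def allozyme_time_myr(nei_d: float, myr_per_unit_d: float = MYR_PER_UNIT_D) -> float:
    """Convert Nei's genetic distance D to time in million years."""
    if nei_d < 0:
        raise ValueError("Nei's D must be non-negative")
    return nei_d * myr_per_unit_d


# Central values of D reported in this section
heteroclinus_vs_myxodes = allozyme_time_myr(0.812)  # about 15.4 myr
north_vs_south_america = allozyme_time_myr(0.712)   # about 13.5 myr
heterostichus_vs_gibbonsia = allozyme_time_myr(0.41)  # about 7.8 myr
```

Rounded to one decimal place, the three results match the 15.4, 13.5, and 7.8 myr allozyme estimates given in the text.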
D. Phylogenetic Relationships of the "Family Labrisomidae"

Mitochondrial DNA sequences (Figs. 4 and 6A), nuclear rDNA sequences (Stepien et al., 1993), and allozyme data (Stepien et al., 1993) confirm the close relationship among the "Labrisomidae," the Chaenopsidae, and the Clinidae. Trees from mtDNA (Figs. 4 and 6A) and allozyme data (Stepien et al., 1993) show that the "labrisomids" are not monophyletic. Mooi and Gill (1995) reported that the labrisomids they examined (Labrisomus, Malacoctenus, Paraclinus, and Starksia) are characterized by a less derived type of epiaxial muscle morphology than that possessed by tripterygiids, dactyloscopids, clinids (myxodin clinids were not included), chaenopsids, and blenniids, which may be a possible character uniting them. MtDNA sequence divergences suggest that the "labrisomid" and clinid clades may have shared a common ancestor approximately 23.0 ± 2.0 myr. Estimates from allozyme divergences appear congruent in suggesting a most recent common ancestry of 22.3 ± 1.1 myr (Stepien and Rosenblatt, 1991; Stepien, 1992). George and Springer (1980) hypothesized that the Labrisomidae may not be closely related to the Clinidae, which is contradicted by a suite of molecular evidence, including allozymes (Stepien et al., 1993), nuclear rDNA sequences (Stepien et al., 1993), and the present mtDNA sequence data (Fig. 6A). "Labrisomids" lack clear morphological synapomorphies and have been referred to as a "wastebasket" of scaled blennioids not clearly falling into other families (Springer, 1993). Molecular data sets differ somewhat in the relative positionings of the clinids, "labrisomids," and chaenopsids. Parsimony (Fig. 6A) and neighbor-joining (Fig. 4) analyses of the mtDNA data set suggest that a labrisomid-chaenopsid clade is the sister group of the Clinidae, with the Chaenopsidae contained as a monophyletic clade within a paraphyletic "Labrisomidae." The nuclear rDNA data analysis was unable to
distinguish among these relationships, which varied among the most-parsimonious and two next most-parsimonious trees (Stepien et al., 1993). Analyses of allozyme data (Stepien et al., 1993) placed the Chaenopsidae as the basal clade of a paraphyletic "Labrisomidae" and a monophyletic Clinidae was the terminal group of the clade. In the allozyme trees, the chaenopsids were the sister group to sister clades comprising the Neoclinini and the remaining "labrisomids," respectively (Stepien et al., 1993). The most-parsimonious tree from mtDNA sequences suggests that there are six "labrisomid" clades: the Mnierpini, Paraclinini, the Chaenopsidae, the Neoclinini-Cryptotremini, the Starksiini, and the Labrisomini (Fig. 6A). Placement of the Mnierpini is very weakly supported, as are the relationships of the Starksiini, Neoclinini, and Cryptotremini. Parsimony analyses of mtDNA sequences suggest that the Starksiini is the sister group of the Labrisomini (Fig. 6A), whereas those based on allozyme data placed the tribe Starksiini as the sister group of either the Paraclinini or the Cryptotremini (Stepien et al., 1993). Rosenblatt and Taylor (1971) hypothesized that starksiins may be derived from either a cryptotremin or a Labrisomus-like ancestor, indicating morphological support for one of the allozyme hypotheses, as well as the mtDNA hypothesis. Low resolution for tribal relationships from the molecular data sets (Fig. 6A and Stepien et al., 1993), coupled with lack of morphological synapomorphies, leaves these relationships speculative. Although trees from allozyme data (Stepien et al., 1993) place the Neoclinini as the basal "labrisomid" group and the sister group of the chaenopsids, mtDNA sequences (Figs. 4 and 6A) suggest that it is more closely related to the tribe Cryptotremini.
The neoclinin-cryptotremin clade is then the sister group of a clade containing the Chaenopsidae, Stathmonotus, and Paraclinini (the latter clade is unresolved by consensus of the most-parsimonious trees). In contrast, allozyme trees do not indicate a close relationship between cryptotremins and neoclinins (Stepien et al., 1993). Hubbs (1952) had placed the genus Neoclinus in the Chaenopsidae and Springer (1955) then removed it to the Clinidae-Labrisomidae, postulating that it is derived from ancestors of the Paraclinini. Stephens (1963) excluded Neoclinus from the Chaenopsidae on the basis of presence of scales, a lateral line, and four circumorbital bones. Hastings and Springer (1994) suggested that morphological characters may place Neoclinus as the sister group of the family Chaenopsidae, compatible with the allozyme study (Stepien et al., 1993). Both allozyme (Stepien et al., 1993) and mtDNA sequence data (Fig. 6A) support traditional morphological groupings of genera within the tribes (shown
in Table I), although they differ in the relationships among the tribes. For example, the genera Exerpes and Paraclinus comprising the tribe Paraclinini are sister taxa in the most-parsimonious mtDNA sequence (Fig. 6A) and allozyme trees (Stepien et al., 1993). A sister relationship among the labrisomin genera Malacoctenus and Labrisomus based on allozyme data (Stepien et al., 1993) and mtDNA sequences (Figs. 4 and 6A) is congruent with morphological similarity postulated by Hubbs (1952). Their hypothesized mid-to-late Miocene radiation of 12.8 ± 1.0 myr appears congruent with the Miocene fossil Labrisomus pronuchipinnis in the southwestern Mediterranean, where the genus is no longer represented. The modern descendant Labrisomus nuchipinnis is widespread throughout much of the western Atlantic (Springer, 1993). "Labrisomids" and clinids have been described to "raft" in pieces of drift algae, which may explain their wide dispersal capability (Hubbs, 1952; Stepien, 1986a, 1992). Their larvae and postlarvae are planktonic for about 2 months and juveniles tend to congregate in groups in drift algae, which apparently aids in dispersal across deep water areas (Stepien, 1986a).

E. Phylogeny of the Family Chaenopsidae
MtDNA sequence data confirm the monophyly of the Chaenopsidae, whose morphological synapomorphies have been analyzed by Hastings and Springer (1994). MtDNA analyses (Figs. 4 and 6A) place the chaenopsids as the sister group of some of the "labrisomids." Parsimony trees (Fig. 6A) group the chaenopsids as being closely related to Stathmonotus and the Paraclinini. Neighbor-joining analyses of genetic distances from mtDNA (Fig. 4) suggest the closest affinity of chaenopsids with Paraclinini, Starksiini, and Stathmonotus. Hastings and Springer (1994) hypothesized Stathmonotus to be the sister group of the chaenopsids, based on morphological characters. Hastings and Springer (1994) also stated that among the currently recognized tribes of "labrisomids," the Starksiini share the greatest number of apparent synapomorphies with chaenopsids. In contrast to their placement in the mtDNA study (Fig. 6A), trees based on allozyme data showed the Chaenopsidae as the basal clade and the sister group of the clades comprising the Neoclinini and the labrisomid-clinid lineage (Stepien et al., 1993). A sister relationship among neoclinins and chaenopsids (including Stathmonotus) was also hypothesized by Hastings and Springer (1994), based on morphological characters. The chaenopsids analyzed in this study (Emblemaria, Chaenopsis, and Acanthemblemaria) appear to be separated by a total divergence of approximately 15.0 ± 1.5
myr. The single most-parsimonious tree (Fig. 6A) from an exhaustive search of mtDNA data shows Chaenopsis as the sister taxon of Emblemaria, which is then the sister group of a clade containing Acanthemblemaria. In comparison, Hastings and Springer (1994) hypothesized that the Acanthemblemaria clade forms the basal sister group to a clade containing Emblemaria and Coralliozetus as sister groups.
F. Evolutionary Relationships of the Family Tripterygiidae
MtDNA sequence data place the monophyletic Tripterygiidae as either the sister group of the family Blenniidae (in parsimony analyses; Fig. 6B) or as the sister group of the "labrisomid"-chaenopsid-clinid clade (in neighbor-joining and overall parsimony analyses; Figs. 4 and 5). In the latter hypothesis, the family Blenniidae is then the sister group to the clades containing the Tripterygiidae and the clinids, "labrisomids," and chaenopsids (Fig. 5). MtDNA sequences suggest that ancestors of the tripterygiids diverged about 22.0 ± 2.0 myr and that the tribes separated by 13.4 ± 1.0 myr (Fig. 4), compatible with early and mid-Miocene separations. A fossil species (Tripterygion pronasus) has been described from Miocene deposits by the Mediterranean Sea (Arambourg, 1927; discussed by Wirtz, 1980), which appears compatible with these dates. The tripterygiid tribe Lepidoblenninae (represented here by Axoclinus and Karalepis) appears to be paraphyletic, as Axoclinus is depicted as the basal group to Karalepis, which is then the sister group of the tribe Tripterygininae (represented here by Notoclinus, Tripterygion, and Rosenblatella; see Figs. 4 and 6B). Neighbor-joining (Fig. 4) and parsimony trees (Fig. 6) suggest that Notoclinus forms the sister group of a clade containing Tripterygion and Rosenblatella of those taxa analyzed. Arrangements of these taxa differ from those hypothesized by Fricke (1994), based on morphology. Additional tripterygiids need to be sequenced in order to further elucidate their relationships.
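The distinction drawn above between a tribe that forms a clade and one that does not can be checked mechanically once a branching order is written down. The following Python sketch is illustrative only. It encodes the branching order described in the text for the five tripterygiid genera sampled; it is not the authors' data or software, and the names `leaves`, `clades`, and `is_monophyletic` are hypothetical helpers introduced here.

```python
# Illustrative sketch: the nested tuples encode only the branching order
# described in the text for the sampled genera; helper names are hypothetical.

def leaves(node):
    """Return the set of taxon names under a node (a name or a tuple of subtrees)."""
    if isinstance(node, str):
        return {node}
    found = set()
    for child in node:
        found |= leaves(child)
    return found


def clades(node):
    """Yield the leaf set of every internal node (clade) of a rooted tree."""
    if isinstance(node, str):
        return
    yield leaves(node)
    for child in node:
        yield from clades(child)


def is_monophyletic(tree, group):
    """True when some clade contains exactly the members of `group`."""
    target = set(group)
    return any(clade == target for clade in clades(tree))


# Axoclinus basal to Karalepis, which is sister to the Tripterygininae sample;
# within the latter, Notoclinus is sister to (Tripterygion, Rosenblatella).
TRIPTERYGIID_TREE = (
    "Axoclinus",
    ("Karalepis", ("Notoclinus", ("Tripterygion", "Rosenblatella"))),
)
LEPIDOBLENNINAE = {"Axoclinus", "Karalepis"}
TRIPTERYGININAE = {"Notoclinus", "Tripterygion", "Rosenblatella"}

if __name__ == "__main__":
    print("Tripterygininae form a clade:",
          is_monophyletic(TRIPTERYGIID_TREE, TRIPTERYGININAE))
    print("Lepidoblenninae form a clade:",
          is_monophyletic(TRIPTERYGIID_TREE, LEPIDOBLENNINAE))
```

On this assumed topology the Tripterygininae sample forms a clade and the Lepidoblenninae sample does not, in line with the verbal description. With so few taxa, the result is only as reliable as the branching order supplied.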
G. Phylogeny of the Family Blenniidae
Monophyly of the combtooth blennies is supported by five morphological characters (Springer, 1968, 1993; Williams, 1990), nuclear rDNA sequences (Stepien et al., 1993), and these mtDNA sequence data (Figs. 4, 5, and 6B). Six tribes are recognized (Table I), of which mtDNA sequences from four are analyzed. Most-parsimonious trees support the idea that the tribe Parablenniini is monophyletic, which has been hypothesized based on two possible morphological synapomorphies (Williams, 1990). The tribe Salariini appears paraphyletic (Fig. 6B), with the genus Ophioblennius not grouping with the others. This needs to be investigated further. The remainder of the Salariini form a sister group to the Parablenniini. A sister relationship between the Parablenniini and the Salariini was also shown based on osteological characters (Bock and Zander, 1986). The most-parsimonious tree from mtDNA sequences (Fig. 6B) depicts the Nemophini (saber-tooth blennies) as closely related to the Omobranchini. A close relationship between the Nemophini and the Omobranchini has also been suggested by Springer (1968) based on jaws, dentition, and caudal fin osteology and by Bock and Zander (1986) based on neurocranial osteology. Most-parsimonious trees (Fig. 6B, see legend) did not resolve relationships among the Omobranchus species, and some reversed the ordering of the genera Rhabdoblennius and Ecsenius from that shown. These questions need to be addressed further with additional taxa and a larger sequence data set.
H. Placement of the Family Dactyloscopidae
Springer (1993) placed the dactyloscopids (sand stargazers) in the Blennioidei, but some researchers have placed them with the Uranoscopidae (e.g., Gosline, 1968). Uranoscopidae is now included in the suborder Trachinoidei (Nelson, 1994). 12S rDNA data in the present study support the morphological hypothesis (Springer, 1993) that dactyloscopids (represented here by Myxodagnus opercularis) are blennioids. Examination of the data set also provides support for considerable divergence from the other blennioids, as shown by the single longest horizontal branch on the neighbor-joining tree in Fig. 4. Morphologically, the Dactyloscopidae is also the most divergent family from the other blennioids, corroborating mtDNA data (Springer, 1993; V. G. Springer, personal communication 1996). Most-parsimonious trees in the authors' investigation (Figs. 5 and 6B) place the Dactyloscopidae as the sister group to other blennioids. The neighbor-joining tree (Fig. 4) shows it as most closely related to the tripterygiids.
I. Phylogenetic Relationships of the Suborders Zoarcoidei and Notothenioidei
The outgroups Notothenioidei and Zoarcoidei form two sister clades, corresponding to their division in separate monophyletic suborders (Figs. 4, 5, and 7). Some morphologists have also hypothesized a sister relationship among the notothenioids and zoarcoids (Anderson, 1990). Results of the authors' study suggest a close relationship among the blenniiform suborders Blennioidei, Notothenioidei, and Zoarcoidei. Their relationships to the Trachinoidei are being tested. These mtDNA data suggest that the ancestors of the notothenioid and zoarcoid clade stemmed from a common ancestor shared with the Blennioidei (Figs. 4, 5, and 7) by at least 28.0 ± 3.0 myr and that the suborder lineages diverged from each other about 20.5 ± 2.5 myr. White (1987) hypothesized that deep sea groups, such as zoarcids, may have speciated during evolutionary pulses associated with oceanic anoxic events by which advancing oxygen minima promoted taxonomic diversification at intermediate depths on the continental slope by restricting isolated populations to disjunct hydrochemical refugia. Modern zoarcoid and notothenioid lineages may have diversified about 10.0 ± 0.5 and 11.6 ± 0.5 myr, respectively, according to the 1% calibration used here. In contrast, Anderson (1994) has suggested a much older origin for the suborder Zoarcoidei, as early as the Eocene in the North Pacific Ocean. He postulated that the early zoarcoids then spread throughout the Pacific Rim and that the family Zoarcidae radiated along the western coasts of the Americas during the pre-Miocene. An earlier date is also indicated by the description of a nototheniid fossil in Antarctica dated 38 myr (Balushkin, 1994). It is possible that the date discrepancy for these coldwater outgroups may be due to low sequence variability correlated with slow metabolic rate or to a calibration error in this study. If so, this fossil suggests that the true divergence dates may actually be four times greater than those indicated. Within the monophyletic Zoarcoidei, the family Pholidae (gunnels) appears to be the sister group of the families Stichaeidae (pricklebacks) and Zoarcidae (eelpouts; Figs. 4 and 7), among the taxa included here.
The pholids may have diverged by 6.8 ± 0.5 myr (using the 1% per million year estimate), possibly during temperature changes in the Pliocene, as hypothesized for the northeastern Pacific clinid genera (Stepien and Rosenblatt, 1991; Stepien, 1992). Pholids and clinids inhabit similar algal-covered rocky intertidal areas and are sensitive to temperature changes (Stepien et al., 1991). The most-parsimonious tree depicts the genus Pholis as the sister taxon of the genus Apodichthys, which is congruent with a morphological analysis by Yatsu (1985). Among the members of the Zoarcidae included in this chapter, the genus Zoarces is depicted as the sister group of Lycodichthys, which is then the sister taxon of Lycodes (Fig. 7). The notothenioids analyzed are monophyletic and form two clades (Figs. 4 and 7), the families Nototheniidae (Notothenia, Pagothenia, and Trematomus) and Bathydraconidae (Fig. 7; Gymnodraco and Parachaenichthys), corresponding to their morphological classification (De Witt et al., 1990; Eastman, 1993). The nototheniids (cod icefishes) and bathydraconids (dragonfishes) may have been separated since at least 11.6 ± 1.5 myr (Fig. 4), following the expansion of the Antarctic ice cap hypothesized about 15 myr (Van Andel, 1985; White, 1989; other researchers have suggested a much earlier date; see discussions by Anderson, 1990; Eastman, 1993; Miller, 1993). Taxon divergence times estimated by Bargelloni et al. (1994) are very similar to the estimates described in this chapter. The two clades within the family Nototheniidae follow its morphological division in two subfamilies, with the Nototheniinae (Notothenia) as the sister group of the Trematominae (Trematomus and Pagothenia). The distance (Fig. 4) and parsimony (Fig. 7) trees of the authors' study do not support the hypothesis by Eastman and Grande (1989) that Pagothenia is an early branch of the Nototheniidae. In another sequencing study, which included a smaller portion of the 12S rDNA gene (overlapping the end portion of the sequence in this study) and part of the 16S rDNA gene, Bargelloni et al. (1994) found less close correspondence with morphological groupings than the authors did. The trees of Bargelloni et al. (1994) depicted the Nototheniidae as paraphyletic, with the Bathydraconidae placed between the Nototheniinae and the Trematominae, which had low consensus and bootstrap support. However, higher consensus and bootstrap support for the authors' data and correspondence between the trees reported in this chapter (Figs. 4 and 7) and morphology-based systematics support monophyly of the Nototheniidae and a close relationship between the subfamilies Nototheniinae and Trematominae. The phylogenies of Bargelloni et al. (1994), the authors' trees, and morphological characters (Table II; Eastman, 1993) support a sister relationship between the trematomin genera Trematomus and Pagothenia. Divergence of these trematomins is similar in both studies, with the authors' data suggesting about 3.6 ± 0.5 myr of separation (Fig. 4).
These separation times are congruent with those estimated by McDonald et al. (1992) from allozyme distances.
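The dates given throughout this section rest on a simple linear clock: an estimated percentage sequence divergence is divided by an assumed rate, here 1% per million years. The Python sketch below makes that arithmetic explicit. The function names are hypothetical, and the 11.6% input is an assumption back-calculated from the 11.6 myr estimate quoted above under the 1% calibration; it is not a value taken from the authors' distance matrix.

```python
# Minimal linear molecular-clock sketch; names and the example divergence
# value are illustrative assumptions, not the authors' code or raw data.

def divergence_time_myr(percent_divergence, rate_percent_per_myr=1.0):
    """Convert percent sequence divergence to time (myr) under a linear clock."""
    if rate_percent_per_myr <= 0:
        raise ValueError("rate must be positive")
    return percent_divergence / rate_percent_per_myr


def implied_rate_percent_per_myr(percent_divergence, fossil_age_myr):
    """Rate that a fossil-dated split would imply for the same divergence."""
    if fossil_age_myr <= 0:
        raise ValueError("fossil age must be positive")
    return percent_divergence / fossil_age_myr


if __name__ == "__main__":
    divergence = 11.6   # assumed percent divergence (see the lead-in)
    fossil_age = 38.0   # nototheniid fossil age cited in the text (Balushkin, 1994)
    clock_age = divergence_time_myr(divergence)
    print(f"clock estimate: {clock_age:.1f} myr")
    print(f"fossil / clock ratio: {fossil_age / clock_age:.2f}")
    rate = implied_rate_percent_per_myr(divergence, fossil_age)
    print(f"implied rate: {rate:.2f}% per myr")
```

Under these assumptions the fossil age exceeds the clock estimate by a factor of a little over three, of the same order as the discrepancy discussed above. A slower rate in these coldwater lineages would lengthen every clock-based date in proportion.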
V. Summary
Analyses of mtDNA sequences from the 12S rDNA region result in phylogenies that are largely congruent with known morphological classification (summarized by Springer, 1993), supporting monophyly of the blenniiform suborders Blennioidei, Notothenioidei, and Zoarcoidei. Results also support monophyly of a Clinid-Labrisomid-Chaenopsid superfamily and the families Clinidae, Chaenopsidae, Tripterygiidae, Blenniidae, Nototheniidae, Bathydraconidae, Zoarcidae, Stichaeidae, and Pholidae. Trees of blennioid
relationships are congruent with those based on sequences of nuclear rDNA spacer regions (Stepien et al., 1993) and are largely congruent with those based on allozyme data (Stepien et al., 1993; Stepien, 1992; Stepien and Rosenblatt, 1991). The present investigation suggests that the chaenopsids form a monophyletic clade within the "Labrisomidae." Relationships among the "labrisomids" remain enigmatic due to lack of synapomorphies discerned from DNA, allozyme, and morphological data. Phylogenies based on mtDNA data support inclusion of the family Dactyloscopidae as blennioids, and parsimony trees (Figs. 4 and 6B) suggest their placement as the basal clade. These data support a possible sister relationship between the outgroups used and the suborders Notothenioidei and Zoarcoidei, with the Trachinoidei remaining to be investigated. Molecular data also seem to support most familial radiations as occurring relatively rapidly, possibly during the early Miocene epoch 22 to 27 myr and most tribal radiations as occurring during the mid-Miocene about 13.5 to 21 myr, using a calibration of 1% divergence per million years. These dates appear consistent with the Miocene fossils of a labrisomid (Springer, 1970; George and Springer, 1980), a clinid (Bannikov, 1989; see Springer, 1993), and a tripterygiid (Arambourg, 1927) and may be related to Miocene warming of the tropics (White, 1986, 1989). Tropical warming may have vicariantly separated formerly continuous distributions, promoting speciation (White, 1986, 1989; Stepien, 1992). Alternatively, similar divergence estimates may be artifacts of site saturation, although this does not appear to be the case, given the consistency of transition to transversion rates and the relatively high proportions of phylogenetically informative sites in both paired and unpaired regions coded by the 12S rDNA.
Fossil evidence suggests that divergence of the notothenioid outgroup may actually be four times older than estimated in this chapter (Balushkin, 1994), possibly due to their low metabolic rates. Results of this study indicate that 12S mtDNA sequences are useful for resolving phylogenetic hypotheses at taxonomic levels ranging from species through suborders and that this region appears to retain phylogenetic signals for these various hierarchies. This is part of an ongoing comprehensive investigation of these groups by C. A. Stepien, using mitochondrial and nuclear DNA sequences.
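The argument against saturation in this summary rests on counts of transitions and transversions between aligned sequences. As a rough illustration of how such counts might be tallied, the Python sketch below compares two short aligned sequences. The function name `count_substitutions` and both toy sequences are invented for this example and are not taken from the 12S rDNA data set.

```python
# Illustrative tally of transitions and transversions for one aligned pair.
# The sequences are made-up toy data, not part of the study.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_substitutions(seq_a, seq_b):
    """Return (transitions, transversions, sites_compared) for aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code: not comparable
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1   # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    return transitions, transversions, compared


if __name__ == "__main__":
    toy_a = "ACGTACGTAC"
    toy_b = "GCGTATGTCC"
    ts, tv, n = count_substitutions(toy_a, toy_b)
    print(f"transitions={ts} transversions={tv} sites={n}")
    print(f"uncorrected p-distance={(ts + tv) / n:.2f}")
```

Comparing such counts across progressively more distant pairs is one common way to look for saturation: if the transition count stops rising while transversions keep accumulating, multiple hits at the same sites are likely.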
Acknowledgments
We thank the following persons for helping collect specimens: P. Wirtz, R. R. McConnaughey, R. H. Rosenblatt, R. E. Thresher, M. E. Anderson, E. O. Wiley, K. Amaoka, K. Kawaguchi, T. Abe, O. Okamura, G. Somero, A. A. Naffziger, L. Badzioch, S. Mesnick, K. Dickson, D. Hoese, and A. C. Gill. This manuscript benefited substantially from critical reviews by V. G. Springer, R. H. Rosenblatt, P. Wirtz, J. T. Williams, M. E. Anderson, C. Lydeard, R. R. Wilson, B. N. White, and T. D. Kocher. A pilot study for this work was begun by CAS during a Sloan Postdoctoral Fellowship in Molecular Evolution, sponsored by D. M. Hillis at the University of Texas, Austin. Data acquisition, analysis, and writing were done in the laboratory of CAS at CWRU. This study was supported by the CWRU Department of Biology, a George B. Mayer assistant professorship to CAS, and laboratory setup funds from the Ohio Board of Regents and a Howard Hughes Medical Institute grant to the Department of Biology, CWRU. KLC thanks the Howard Hughes Medical Institute summer undergraduate research program in the Department of Biology at CWRU for fellowship support. MJB was supported by the CWRU Department of Biology during a 1-year postdoctoral fellowship in the laboratory of CAS. Specimen collections in Japan by CAS were supported by the National Research Council, in Chile by National Geographic Society Grant 3615-87 to CAS and R. H. Rosenblatt, in California and Mexico by NSF BSR-8600180 to CAS, and in Portugal by a travel grant from the Centro de Ciencia e Tecnologia da Madeira (CITMA) to CAS and P. Wirtz. Undergraduate research students L. Naftalin, G. Johns, N. Valtz, H. Strick, and J. Skidmore assisted in some of the DNA extractions.
References
Acero, A. P. 1987. The chaenopsine blennies of the southwestern Caribbean (Pisces, Clinidae, Chaenopsinae). III. The genera Chaenopsis and Coralliozetus. Bol Ecotrop 16:1-21.
Anderson, M. E. 1990. The origin and evolution of the Antarctic ichthyofauna. In "Fishes of the Southern Ocean" (O. Gon and P. C. Heemstra, eds.), pp. 28-33. J. L. B. Smith Institute of Ichthyology, Grahamstown, South Africa.
Anderson, M. E. 1994. Systematics and osteology of the Zoarcidae (Teleostei: Perciformes). J. L. B. Smith Inst. Ichthyol. Ichthyol. Bull. 60:1-120.
Arambourg, G. 1927. Les poissons fossiles d'Oran. Mater. Carte géol. Alger (paleont.). 6:1-289.
Avise, J. C. 1994. "Molecular Markers, Natural History, and Evolution." Chapman and Hall, New York.
Avise, J. C., Bowen, B. W., Lamb, T., Meylan, A. B., and Bermingham, E. 1992. Mitochondrial DNA evolution at a turtle's pace: Evidence for low genetic variability and reduced microevolutionary rate in the testudines. Mol. Biol. Evol. 9(3):457-473.
Balushkin, A. 1994. Proeleginops grandeastmanorum gen. et. sp. nov. (Perciformes, Notothenioidei, Eleginopsidae) from the late Eocene of Seymour Island (Antarctica) is a fossil notothenioid, not a gadiform. J. Ichthyol. 34(8):10-23.
Bannikov, A. F. 1989. The first discovery of scale-bearing blennies (Teleostei) in the Sarmatian of Moldavia. Paleont. J. 2:64-70.
Bargelloni, L., Ritchie, P. A., Patarnello, T., Battaglia, B., Lambert, D. M., and Meyer, A. 1994. Molecular evolution at subzero temperatures: Mitochondrial and nuclear phylogenies of fishes from Antarctica (suborder Notothenioidei), and the evolution of antifreeze glycopeptides. Mol. Biol. Evol. 11(6):854-863.
Bock, M., and Zander, C. D. 1986. Osteological characters as tools for blenniid taxonomy: A generic revision of European Blenniidae (Percomorphi; Pisces). Zool. Inst. Zool. Mus. Univ. Hamburg. 1986:138-143.
Briggs, J. C. 1974. "Marine Zoogeography." McGraw-Hill, New York.
Brown, W. M., George, M., Jr., and Wilson, A. C. 1979. Rapid evolution of animal mitochondrial DNA. Proc. Natl. Acad. Sci. USA 76:1967-1971.
De Witt, H. H., Heemstra, P. C., and Gon, O. 1990. Nototheniidae. In "Fishes of the Southern Ocean" (O. Gon and P. C. Heemstra, eds.), pp. 279-331. J. L. B. Smith Institute of Ichthyology, Grahamstown, South Africa.
Dixon, M. T., and Hillis, D. M. 1993. Ribosomal RNA secondary structure: Compensatory mutations and implications for phylogenetic analysis. Mol. Biol. Evol. 10(1):256-267.
Eastman, J. T. 1993. "Antarctic Fish Biology." Academic Press, San Diego.
Eastman, J. T., and Grande, L. 1989. Evolution of the Antarctic fish fauna with emphasis on the recent notothenioids. In "Origins and Evolution of the Antarctic Biota" (J. A. Crame, ed.). Geol. Soc. Spec. Pub. 47:241-252.
Fricke, R. 1994. Tripterygiid fishes of Australia, New Zealand and the Southwest Pacific Ocean, with descriptions of 2 new genera and 16 new species (Teleostei). Theses Zoologicae, Vol. 24. Koeltz Scientific Books.
Fukao, R., and Okazaki, T. 1987. A study on the divergence of Japanese fishes of the genus Neoclinus. Jap. J. Ichth. 34(3):309-323.
George, A., and Springer, V. G. 1980. Revision of the Clinid fish tribe Ophiclinini, including five new species, and definition of the family Clinidae. Smith. Contr. Zool. 307:1-30.
Gillespie, J. H. 1986. Variability of evolutionary rates of DNA. Genetics 113:1077-1091.
Gosline, W. A. 1968. The suborders of perciform fishes. Proc. U. S. Natl. Mus. 124:1-78.
Gosline, W. A. 1971. "Functional Morphology and Classification of Teleostean Fishes." University Press of Hawaii, Honolulu, HI.
Grant, W. S. 1987. Genetic divergence between congeneric Atlantic and Pacific Ocean fishes. In "Population Genetics and Fishery Management" (N. Ryman and F. Utter, eds.), pp. 225-246. Washington Sea Grant Program, Univ. of Washington Press, Seattle, WA.
Greenwood, P. H., Rosen, D. E., Weitzman, S. H., and Myers, G. S. 1966. Phyletic studies of teleostean fishes, with a provisional classification of living forms. Bull. Am. Mus. Nat. Hist. 131:339-456.
Gutell, R. R., Weiser, B., Woese, C. R., and Noller, H. F. 1985. Comparative anatomy of 16S-like ribosomal RNA. Prog. Nucleic Acid Res. Mol. Biol. 32:155-216.
Hastings, P. A. 1991. Phylogenetic relationships of the tube blennies of the genus Acanthemblemaria (Pisces: Blennioidea). Bull. Mar. Sci. 47(3):725-737.
Hastings, P. A., and Springer, V. G. 1994. A review of Stathmonotus, with redefinition and phylogenetic analysis of the Chaenopsidae (Pisces: Blennioidei). Smith. Contr. Zool. 558:1-48.
Hendy, M. D., and Penny, D. 1982. Branch and bound algorithms to determine minimal evolutionary trees. Math. Biosci. 59:277-290.
Hillis, D. M., and Dixon, M. T. 1991. Ribosomal DNA: Molecular evolution and phylogenetic inference. Quart. Rev. Biol. 66(4):411-453.
Hubbs, C. 1952. A contribution to the classification of the blennioid fishes of the family Clinidae, with a partial revision of the eastern Pacific forms. Stanford Ichth. Bull. 4:41-65.
Hultman, T., Stahl, S., Hornes, E., and Uhlen, M. 1989. Direct solid phase sequencing of genomic and plasmid DNA using magnetic beads as solid support. Nucleic Acids Res. 17:4937-4946.
International Biotechnologies, Inc. (IBI) 1992. Assembly LIGN Sequence Assembly Software, Kodak.
Johnson, G. D. 1993. Percomorph phylogeny: Progress and problems. Bull. Mar. Sci. 52(1):3-28.
Kocher, T. D., Thomas, W. K., Meyer, A., Edwards, S. V., Pääbo, S., Villablanca, F. X., and Wilson, A. C. 1989. Dynamics of mitochondrial DNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86:6196-6200.
Kumar, S., Tamura, K., and Nei, M. 1993. "MEGA: Molecular Evolutionary Genetics Analysis, Version 1.01." Pennsylvania State University, University Park, PA.
Lydeard, C. 1993. Phylogenetic analysis of species richness: Has viviparity increased the diversification of Actinopterygian fishes? Copeia 1993(2):514-518.
Martin, A. P., Naylor, G. J. P., and Palumbi, S. R. 1992. Rates of mitochondrial DNA evolution in sharks are slow compared with mammals. Nature 357:153-155.
Martin, A. P., and Palumbi, S. R. 1993. Body size, metabolic rate, generation time and the molecular clock. Proc. Natl. Acad. Sci. USA 90:4087-4091.
Matarese, A. C., Watson, W., and Stevens, E. G. 1984. Blennioidea: Development and Relationships. In "Ontogeny and Systematics of Fishes" (H. G. Moser et al., eds.), pp. 565-573. Allen Press, Lawrence, KS.
Meyer, A. 1993. Evolution of mitochondrial DNA of fishes. In "The Biochemistry and Molecular Biology of Fishes" (P. W. Hochachka and P. Mommsen, eds.), Vol. 2, pp. 1-38. Elsevier Press, Amsterdam.
McDonald, M. A., Smith, M. H., Smith, M. W., Novak, J. M., Johns, P. E., and Devries, A. L. 1992. Biochemical systematics of notothenioid fishes from Antarctica. Biochem. Syst. Ecol. 20:233-241.
Miller, R. G. 1993. "A History and Atlas of the Fishes of the Antarctic Ocean." Foresta Institute for Ocean and Mountain Studies, Carson City, NV.
Miller, D. J., and Lea, R. N. 1972. "Guide to the coastal fishes of California." Fish Bulletin 157. State of California, Department of Fish and Game, Sacramento, CA.
Mooi, R. D., and Gill, A. C. 1995. Association of epaxial musculature with dorsal-fin pterygiophores in acanthomorph fishes, and its phylogenetic significance. Bull. Nat. Hist. Mus. Lond. (Zool.). 61(2):121-137.
Moritz, C., Dowling, T. E., and Brown, W. M. 1987. Evolution of animal mitochondrial DNA: Relevance for population biology and systematics. Annu. Rev. Ecol. Syst. 18:269-292.
Neefs, J. M., Van de Peer, Y., De Rijk, P., Goris, A., and De Wachter, R. 1991. Compilation of small ribosomal subunit RNA sequences. Nucleic Acids Res. 19s:1987-2015.
Nei, M. 1972. Genetic distance between populations. Am. Nat. 106:283-292.
Nelson, J. S. 1994. "Fishes of the World," 3rd Ed. Wiley, New York.
Orti, G., Petry, P., Porto, J. I. R., Jegu, M., and Meyer, A. 1996. Patterns of nucleotide change in mitochondrial ribosomal RNA genes and the phylogeny of piranhas. J. Mol. Evol. 42:169-182.
Penrith, M. L. 1969. The systematics of the fishes of the family Clinidae in South Africa. Ann. S. Afr. Mus. 55(1):1-127.
Perbal, B. 1988. "A Practical Guide to Molecular Cloning." Wiley, New York.
Rand, D. M. 1994. Thermal habit, metabolic rate and the evolution of mitochondrial DNA. TREE 9(4):125-131.
Rosenblatt, R. H. 1984. Blennioidei: An introduction. In "Ontogeny and Systematics of Fishes" (H. G. Moser et al., eds.), pp. 551-552. Based on an international symposium dedicated to the memory of Elbert Halvor Ahlstrom, Allen Press, Lawrence, KS.
Rosenblatt, R. H., and Taylor, L. R., Jr. 1971. The Pacific species of the clinid fish tribe Starksiini. Pacific Sci. 25:436-463.
Saitou, N., and Nei, M. 1987. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.
Sanger, F., Nicklen, S., and Coulson, A. R. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.
Siegel, S., and Castellan, N. J., Jr. 1988. "Nonparametric Statistics for the Behavioral Sciences," 2nd Ed. McGraw-Hill, New York.
Smith-Vaniz, W. F. 1976. The saber-toothed blennies, tribe Nemophini (Pisces: Blenniidae). Acad. Nat. Sci. Philadelphia 19:1-196.
Springer, V. G. 1955. The taxonomic status of the fishes of the genus Stathmonotus, including a review of the Atlantic species. Bull. Mar. Sci. Gulf Carib. 5(1):66-80.
Springer, V. G. 1968. "Osteology and Classification of the Fishes of the Family Blenniidae." U.S. Nat. Mus. Bull. 284. Smith. Inst. Press, Washington, D.C.
Springer, V. G. 1970. The western south Atlantic clinid fish Ribeiroclinus eigenmanni with discussion of the intrarelationships and zoogeography of the Clinidae. Copeia 1970(3):430-436.
Springer, V. G. 1982. Pacific plate biogeography with special reference to shorefishes. Smith. Contr. Zool. 367:1-182.
Springer, V. G. 1993. Definition of the suborder Blennioidei and its included families (Pisces: Perciformes). Bull. Mar. Sci. 52(1):427-495.
Springer, V. G., and Freihofer, W. C. 1976. Study of the monotypic fish family Pholidichthyidae (Perciformes). Smith. Contr. Zool. 216:1-43.
Springer, V. G., Smith, C. L., and Fraser, T. H. 1977. Anisochromis straussi, new species of protogynous hermaphroditic fish, and synonymy of Anisochromidae, Pseudoplesiopidae, and Pseudochromidae. Smith. Contr. Zool. 252:1-15.
SPSS, Statistical Package for the Social Sciences. 1992. Version 5.0.1.
Stephens, J. S. 1963. A revised classification of the blennioid fishes of the American family Chaenopsidae. Univ. Calif. Pub. Zool. 68:1-165.
Stephens, J. S., and Springer, V. G. 1973. Clinid fishes of Chile and Peru, with description of a new species, Myxodes ornatus, from Chile. Smith. Contr. Zool. 159:1-24.
Stepien, C. A. 1986a. Life history and larval development of the giant kelpfish, Heterostichus rostratus Girard. Fish. Bull. 84(4):809-826.
Stepien, C. A. 1986b. Regulation of color morphic patterns in the giant kelpfish, Heterostichus rostratus Girard: Genetic versus environmental factors. J. Exp. Mar. Biol. Ecol. 100:181-208.
Stepien, C. A. 1987. Color pattern and habitat differences between male, female, and juvenile giant kelpfish. Bull. Mar. Sci. 41:45-58.
Stepien, C. A. 1991. Population structures, diets, and biogeographic relationships of rocky intertidal fishes in central Chile: High levels of herbivory in a temperate system. Bull. Mar. Sci. 47(3):598-612.
Stepien, C. A. 1992. Evolution and biogeography of the Clinidae (Teleostei: Blennioidei). Copeia 1992(2):375-392.
Stepien, C. A. 1995. Population genetic divergence and geographic patterns from DNA sequences: Examples from marine and freshwater fishes. In "Evolution and the Aquatic Ecosystem: Defining Unique Units in Population Conservation" (J. Nielsen, ed.), pp. 263-287. American Fisheries Symposium 17, Bethesda, MD.
Stepien, C. A., Dixon, M. T., and Hillis, D. M. 1993. Evolutionary relationships of the blennioid fish families Clinidae, Labrisomidae, and Chaenopsidae: Congruence between DNA sequence and allozyme data. In "Symposium on Evolution of Percomorph Fishes" (G. D. Johnson, ed.). Bull. Mar. Sci. 52(1):873-921.
Stepien, C. A., Glattke, M., and Fink, K. M. 1988. Regulation and significance of color patterns of the spotted kelpfish, Gibbonsia elegans Cooper, 1864 (Blennioidei: Clinidae). Copeia 1988(1):7-15.
Stepien, C. A., Phillips, H., Adler, J. A., and Mangold, P. J. 1991. Biogeographic relationships of a rocky intertidal fish assemblage in an area of cold water upwelling off Baja California, Mexico. Pacific Sci. 45(1):63-71.
Stepien, C. A., and Rosenblatt, R. H. 1991. Patterns of gene flow and genetic divergence in the northeastern Pacific myxodin Clinidae (Teleostei: Blennioidei), based on allozyme and morphological data. Copeia 1991(4):873-896.
Swofford, D. L. 1996. "PAUP* (Phylogenetic Analysis Using Parsimony) vers. 4.0 (test version O)." Sinauer, Sunderland, MA.
Swofford, D. L., Olson, G. J., Waddell, P. J., and Hillis, D. M. 1996. Phylogenetic Inference. In "Molecular Systematics, Second Ed." (D. M. Hillis, C. Moritz, and B. K. Mable, eds.), pp. 407-514. Sinauer Assoc., Inc., Sunderland, MA.
Thomas, W. K., and Beckenbach, A. T. 1989. Variation in salmonid mitochondrial DNA: Evolutionary constraints and mechanisms of substitution. J. Mol. Evol. 29:233-245.
Thresher, R. E. 1984. "Reproduction in Reef Fishes." T. F. H. Publications, Neptune City, NJ.
Titus, T. A., and Larson, A. 1995. A molecular phylogenetic perspective on the evolutionary radiation of the salamander family Salamandridae. Syst. Biol. 44:125-151.
Uhlen, M. 1989. Magnetic separation of DNA. Nature 340:733-734.
Van Andel, T. H. 1985. "New Views on an Old Planet: Continental Drift and the History of the Earth." Cambridge University Press, Cambridge.
Vawter, L., and Brown, W. M. 1993. Rates and patterns of base change in the small subunit ribosomal RNA gene. Genetics 134:597-608.
Wheeler, W. C., and Honeycutt, R. L. 1988. Paired sequence difference in ribosomal RNAs: Evolutionary and phylogenetic implications. Mol. Biol. Evol. 5(1):90-96.
White, B. N. 1986. The Isthmian link, antitropicality and American biogeography: Distributional history of the Atherinopsinae (Pisces: Atherinidae). Syst. Zool. 35:176-194.
White, B. N. 1987. Oceanic anoxic events and allopatric speciation in the deep sea. Biol. Oceanogr. 5:243-259.
White, B. N. 1989. Antitropicality and vicariance: A reply to Briggs. Syst. Zool. 38(1):77-79.
Williams, J. T. 1990. Phylogenetic relationships and revision of the blenniid fish genus Scartichthys. Smith. Contr. Zool. 492:1-30.
Wirtz, P. 1980. A revision of the eastern-Atlantic Tripterygiidae (Pisces, Blennioidei) and notes on some west African blennioid fish. Cybium 1980(21):83-101.
Wourms, J. P., and Lombardi, J. 1992. Reflections on the evolution of piscine viviparity. Am. Zool. 32:276-293.
Yatsu, A. 1985. Phylogeny of the family Pholidae (Blennioidei) with a redescription of Pholis scopoli. J. Ichthyol. 32(3):273-282.
CHAPTER
16 Major Histocompatibility Complex Genes in the Study of Fish Phylogeny
DAGMAR KLEIN Department of Microbiology and Immunology University of Miami School of Medicine Miami, Florida 33136
JAN KLEIN Max-Planck-Institut für Biologie Abteilung Immungenetik D-72076 Tübingen, Germany and Department of Microbiology and Immunology University of Miami School of Medicine Miami, Florida 33136
AKIE SATO Max-Planck-Institut für Biologie Abteilung Immungenetik D-72076 Tübingen, Germany
FELIPE FIGUEROA Max-Planck-Institut für Biologie Abteilung Immungenetik D-72076 Tübingen, Germany
COLM O'HUIGIN Max-Planck-Institut für Biologie Abteilung Immungenetik D-72076 Tübingen, Germany
I. Introduction
The major histocompatibility complex (Mhc) is a gene system that arose early in the evolution of vertebrates in response to an increased need for protection against parasites. Because of its key function in the immune response, which it has retained during its entire evolution, the Mhc has been studied extensively by immunologists and is consequently one of the best characterized genetic complexes in vertebrates. To fish taxonomists, the Mhc offers several advantages that other molecular systems do not provide. Foremost among these is the trans-species character of Mhc polymorphism. The functional Mhc loci are highly polymorphic and many of these polymorphisms predate speciation. Closely related species, such as those constituting the haplochromine flocks of East African Great Lakes, share identical Mhc alleles. The frequencies of alleles can be used to determine the phylogenetic relationships among the various species of the flocks. The general approach to using Mhc genes in phylogenetic and systematic studies and the advantages, as well as possible pitfalls, are discussed.
II. Major Histocompatibility Complex (Mhc) Structure and Function
All jawed vertebrates possess a set of molecules that have a characteristic, highly conserved quaternary, tertiary, and secondary structure but, at the same time, are highly divergent in their primary structure: the major histocompatibility complex molecules (for reviews, see Klein, 1986; Srivastava et al., 1991; Kasahara et al., 1995). During their early evolution, the Mhc molecules were apparently assembled from three types of modules that arose independently (Figs. 1 and 2): the membrane-anchoring module (MAM), the immunoglobulin-like module (ILM), and the peptide-binding module (PBM; Klein and O'hUigin, 1993). The MAM is composed of a short connecting peptide (CT), a transmembrane (TM) region, and a cytoplasmic (CY) tail. The ILM consists of domains homologous to those of the immunoglobulin (Ig) superfamily proteins. The PBM resembles interleukin-8 (IL-8) and related proteins, and possibly also the endothelial-cell protein C receptor (EPCR).

The three modules consist of domains whose arrangement distinguishes two types of Mhc molecules, class I and class II (Figs. 1 and 2). Molecules in both classes are heterodimers that consist of noncovalently associated α and β polypeptide chains. The class I β chain contains a single Ig-like domain (ILD), which also occurs in a free form as β2-microglobulin in tissue fluids. In the class I α chain, two peptide-binding domains (PBD), α1 and α2, constitute the PBM; one ILD joins noncovalently with the β chain ILD to form the ILM; and a single MAM fastens the molecule to the plasma membrane. In the class II molecule, PBDs of the α and β chains (α1 and β1, respectively) form the PBM; another domain of the α chain (α2), together with a domain of the β chain (β2), forms the ILM; and the entire extracellular part of the molecule is fastened to the plasma membrane by two anchors, one contributed by the α and the other by the β chain. The extracellular parts of the polypeptide chains are glycosylated, rendering the Mhc molecules glycoproteins.

All class I and class II Mhc molecules thus far identified, be they from fish, amphibian, reptile, bird, or mammal, appear to have the same structure and the encoding genes the same exon-intron organization (Figs. 1 and 2). Each extracellular domain is encoded by a separate exon: E1 encodes the signal peptide; E2, E3, and E4 of the class I A genes encode the α1, α2, and α3 domains, respectively; E2 and E3 of the class II A genes encode the α1 and α2 domains, respectively; and E2 and E3 of the class II B genes encode the β1 and β2 domains, respectively. The single domain of the class I β chain is encoded in three exons, although the bulk of the sequence is specified by a single exon (E2). The number of exons specifying the membrane-anchoring domain (MAD) is somewhat more variable, both among genes and among species.

FIGURE 1 Relationship between exons (E) of Mhc class I genes (A and B) and domains of class I α and β polypeptide chains. Different shading indicates modules: light, peptide-binding module (PBM); intermediate, immunoglobulin-like module (ILM); and dark, membrane-anchoring module (MAM). CT, connecting peptide; CY, cytoplasmic tail; TM, transmembrane region. Arrows indicate correspondence between exons and domains.

FIGURE 2 Relationship between exons (E) of Mhc class II genes (A and B) and domains of class II α and β polypeptide chains. For an explanation of symbols, see legend to Fig. 1.

The Mhc molecules are receptors that bind peptides produced by degradation of other proteins. Most of the time the peptides are derived from the body's own proteins, but in an infected animal, some of them originate from the parasite. Peptides originating from intracellular parasites, such as viruses, bind predominantly to class I molecules, whereas those derived from extracellular parasites, such as many bacteria, largely bind to class II molecules. The binding is dependent on interaction with a small number of amino acid residues of the peptide-binding region (PBR) specified by exons 2 and 3 in the case of the class I A gene and exon 2 in the case of class II A or B genes. The PBMs of class I and class II molecules are constructed somewhat differently so that they can accommodate peptides of different lengths and constitutions. Each PBM is capable of binding a large array of peptides, which, however, share amino acid residues at a few critical positions. The bound peptides, if derived from parasites, are recognized, together with parts of the Mhc molecules, by specific receptors on T lymphocytes. This recognition initiates the specific immune response to the parasite.

The structural differences between class I and class II molecules outside the PBMs may reflect the distinctive modes of biosynthesis and intracellular transport of the two proteins. Class I molecules are synthesized and loaded with peptides in the endoplasmic reticulum. Class II molecules are synthesized in the endoplasmic reticulum and loaded with peptides in the early endosomes. The peptides used by the two classes also differ in their origin. Peptides for class I molecules are produced by processing intracellular proteins in specialized molecular aggregates (the proteasomes) in the cytosol. Peptides for class II molecules are produced by the enzymatic degradation of extracellularly derived proteins in the endocytic vesicle.
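The exon-to-domain correspondence described in this section can be summarized as a small lookup table. The Python sketch below is only an illustration: the names EXON_TO_DOMAIN, PBR_EXONS, domain_for, and encodes_pbr are hypothetical, and it assumes, as one reading of the text, that E1 encodes the signal peptide in each of the three gene types. Exons of the membrane-anchoring region are left out because their number varies among genes and species.

```python
# Exon-to-domain correspondence for the extracellular part of Mhc chains,
# as stated in the text. Assumption: E1 = signal peptide for all three genes.
EXON_TO_DOMAIN = {
    "class I A": {"E1": "signal peptide", "E2": "alpha1", "E3": "alpha2", "E4": "alpha3"},
    "class II A": {"E1": "signal peptide", "E2": "alpha1", "E3": "alpha2"},
    "class II B": {"E1": "signal peptide", "E2": "beta1", "E3": "beta2"},
}

# Exons that specify the peptide-binding region (PBR), per the text:
# exons 2 and 3 for class I A, exon 2 for class II A or B.
PBR_EXONS = {
    "class I A": ("E2", "E3"),
    "class II A": ("E2",),
    "class II B": ("E2",),
}


def domain_for(gene: str, exon: str) -> str:
    """Return the domain encoded by an exon of the given gene type."""
    return EXON_TO_DOMAIN[gene][exon]


def encodes_pbr(gene: str, exon: str) -> bool:
    """Return True if the exon specifies part of the peptide-binding region."""
    return exon in PBR_EXONS[gene]


if __name__ == "__main__":
    for gene, exons in EXON_TO_DOMAIN.items():
        for exon, domain in exons.items():
            flag = " (PBR)" if encodes_pbr(gene, exon) else ""
            print(f"{gene} {exon}: {domain}{flag}")
```

A table of this kind makes it easy to see which exons would be the variable ones in a sequence comparison and which encode the more conserved Ig-like domains.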
III. Mhc as a Source of Systematic Information
Very few molecules have been studied as extensively and from so many different perspectives as those controlled by the Mhc. As a result, the Mhc products are among the best characterized glycoproteins. The main reason for this has been the desire to understand how the vertebrate immune system functions and how it originated. As such studies involve a variety of organisms, they provide not only the information sought, but also phylogenetic information. In the gene banks, Mhc sequences are well represented and thus provide a rich source of information for phylogenetic and taxonomical comparisons. Increasingly, however, Mhc genes are being studied with the sole purpose of obtaining phylogenetic information because they offer certain advantages over many other nuclear genes. For example, the Mhc genes are members of a rich multigene family which undergoes frequent rearrangements and thus constitutes a source of chromosomal mutations that can be used as characters in cladistic analysis. Another advantage is that certain regions of the Mhc genes, specifically the PBR, are highly variable. The variability is maintained by balancing selection (Hughes and Nei, 1988; Takahata et al., 1992), which retains alleles in populations as polymorphisms despite speciation events. These "trans-species polymorphisms" and their usefulness in systematics will be described in greater detail later.

Study of the Mhc provides three types of phylogenetic information: sequence data, characters stemming from macromutations, and frequency data. Sequence differences originate from point mutations and can be evaluated by using either distance or parsimony (character-based) methods. Macromutations are defined as changes that simultaneously affect more than one nucleotide, in contradistinction to point mutations, which affect one site only. They include duplications, deletions, and other chromosomal rearrangements, and insertions of repetitive elements (transposons). Frequency data are derived from the study of gene and haplotype polymorphisms. Although they are normally used to evaluate relationships among populations, they can also be used to test relationships among closely related species.
IV. Sequences as a Source of Phylogenetic and Systematic Information
Like other genes, the Mhc genes of two species that diverged from a common ancestor accumulate substitutional differences roughly in proportion to the elapsed time (Kimura, 1983). This "molecular clock" seems to tick not only for the neutral sites of the Mhc genes (synonymous, intron, and intergenic sites), but also for sites subject to balancing selection (largely the PBR sites; see Satta et al., 1991). The latter constancy of evolutionary rate presumably reflects a constancy of selection pressure. Because of these constancies, it is possible to use Mhc sequences to infer gene and species phylogenies. Examples are given in Figs. 3, 4, and 5 in the form of phylogenetic trees constructed on the basis of fish class I and class II amino acid sequences.

The usefulness of Mhc sequence information for fish taxonomy has thus far been minimal. The trees in Figs. 3, 4, and 5 are congruent with established relationships among fish taxa, but do not add new information because only very few sequences are available from different taxa. The number can, however, be expected to grow rapidly in the near future and, with it, the utility of Mhc sequence information. Moreover, Mhc genes are already being used to help resolve long-standing taxonomical problems by focusing on specific taxa. One example is the relationship among the Dipnoi, Crossopterygii, and tetrapods (reviewed by Meyer, 1995). Class I Mhc genes of the coelacanth, Latimeria chalumnae (Betz et al., 1994), and of the African lungfish Protopterus aethiopicus (A. Sato, H. Sültmann, and J. Klein, unpublished data) have been cloned and show that the coelacanth class I Mhc genes are more closely related to the
FIGURE 3 Phylogenetic tree of fish class I α polypeptide chain sequences. The tree was constructed by the neighbor-joining method (Saitou and Nei, 1987); genetic distances were determined as percentage identity (Poisson corrected) between proteins. Numbers on nodes indicate percentage recovery of these nodes per 500 bootstrap replications. Pore, Poecilia reticulata, guppy (Sato et al., 1995); Brre, Brachydanio rerio, zebrafish (Takeuchi et al., 1995); Cyca, Cyprinus carpio, carp (Okamura et al., 1993); Sasa, Salmo salar, Atlantic salmon (Grimholt et al., 1993). Scale: genetic distance, 0.0 to 0.3.
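The legend to Fig. 3 states that the distances between proteins were Poisson corrected. A minimal sketch of such a calculation follows, assuming the standard form of the correction, d = -ln(1 - p), where p is the proportion of aligned sites at which two proteins differ. The function names and the two short sequences are invented for illustration and are not taken from the data behind the figures.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two protein sequences differ.

    Columns in which either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance d = -ln(1 - p) (assumed standard form)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance undefined when every site differs")
    return -math.log(1.0 - p)


if __name__ == "__main__":
    # Hypothetical aligned fragments, 10 residues each, differing at 2 sites.
    a = "MKTLLVAGAL"
    b = "MKTILVSGAL"
    print(f"p = {p_distance(a, b):.3f}, d = {poisson_distance(a, b):.3f}")
```

For the two toy sequences, 2 of 10 sites differ, so p = 0.2 and d is about 0.223. The correction grows with p because multiple substitutions at the same site become more likely as sequences diverge.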
FIGURE 4 Phylogenetic tree of fish class II α polypeptide chain sequences. The tree was constructed by the neighbor-joining method (Saitou and Nei, 1987); genetic distances were determined as percentage identity between proteins. Numbers on nodes indicate percentage recovery of these nodes per 500 bootstrap replications. Gici, Ginglymostoma cirratum, nurse shark (Kasahara et al., 1993); Brre, Brachydanio rerio, zebrafish (Sültmann et al., 1993, 1995); Mosa, Morone saxatilis, striped bass (Hardee et al., 1995). Scale: genetic distance, 0.0 to 0.3.
amphibian homologs than are any other fish class I genes, including the lungfish genes. The Mhc thus helps to resolve a dispute that so far has been based largely on morphological and paleontological data (but see Meyer and Wilson, 1990; Meyer and Dolven, 1992). There are, however, at least two problems that could arise in applying Mhc sequences to systematic studies,
one technical and the other interpretative. The technical problem lies in the difficulty of cloning Mhc genes from new taxa. Mhc sequences of distant taxa are so dissimilar that Mhc clones cannot be isolated by cross-hybridization. The only possibility is to use degenerate primers for polymerase chain reaction (PCR) amplification, but even then, success depends very much on luck and persistence. Although there are residues shared by all or most Mhc proteins of a particular class, they occur mostly at single sites scattered along the entire sequence and are therefore often not suitable for designing PCR primers. Nevertheless, Mhc genes have been cloned from different taxa and the success rate will undoubtedly increase as more sequences become available. The interpretative problem lies in the fact that homology relationships among the Mhc genes are equivocal. The problem can be illustrated by a hypothetical example (Fig. 6). Consider an ancestral gene A that has duplicated in an ancestral species 1 and produced genes A1 and A2. The duplication then became fixed, and when two new species, 2 and 3, arose from the ancestral species 1, both duplicated genes were inherited. Since the time of the duplication, the A1 and A2 genes have been diverging from each other, first during the remaining time of existence of species 1 (time T1) and then after cladogenesis of species 1 into species 2 and 3 (time T2). Comparing A1 or A2 sequences from species 2 and 3 (orthologous genes) will reflect the species phylogeny, but comparison of A1 (A2) of species 2 with A2 (A1) of species 3 (paralogous genes) will not. The difficulty arises because it is not always possible to know whether a comparison is between orthologous or paralogous genes, especially when further duplications and deletions followed the initial event. The possibility of homoplasy exists in all multigene systems
FIGURE 5 Phylogenetic tree of fish class II β polypeptide chain sequences. The tree was constructed by the neighbor-joining method (Saitou and Nei, 1987); genetic distances were determined as percentage identity between proteins. Numbers on nodes indicate percentage recovery of these nodes per 500 bootstrap replications. Gici, Ginglymostoma cirratum, nurse shark (Bartl and Weissman, 1994); Onmy, Oncorhynchus mykiss, rainbow trout (Glamann, 1995); Sasa, Salmo salar, Atlantic salmon (Hordvik et al., 1993); Brre, Brachydanio rerio, zebrafish (Ono et al., 1992); Auha, Aulonocara hansbaenschi, cichlid fish (Ono et al., 1993c); Mosa, Morone saxatilis, striped bass (Walker and McConnell, 1994); Pore, Poecilia reticulata, guppy (Sato et al., 1995). Scale: genetic distance, 0.0 to 0.3.

FIGURE 6 A hypothetical example of gene duplication and divergence within and between species. A, A1, and A2 are loci (represented by rectangles); T, time. For discussion, see text.
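The duplication scenario of Fig. 6 can be put into numbers with a short sketch. It assumes a strict molecular clock with the same rate on every lineage and ignores multiple substitutions at one site, so that the expected distance between two gene copies is simply 2 x rate x time since their last common ancestor. Under these assumptions orthologs are separated by T2 and paralogs by T1 + T2. The rate and the time values below are arbitrary, and the function names are hypothetical.

```python
from typing import Tuple


def expected_distance(rate: float, time_since_common_ancestor: float) -> float:
    """Expected divergence between two gene copies under a strict clock.

    Both lineages accumulate substitutions, hence the factor of 2.
    Multiple hits at the same site are ignored (assumption of this sketch).
    """
    return 2.0 * rate * time_since_common_ancestor


def ortholog_and_paralog_distance(rate: float, t1: float, t2: float) -> Tuple[float, float]:
    """Distances for the Fig. 6 scenario.

    t1: time from the duplication to the split of species 2 and 3.
    t2: time since that split.
    Orthologs (A1 vs. A1) share an ancestor t2 ago; paralogs (A1 vs. A2)
    share an ancestor t1 + t2 ago.
    """
    ortholog = expected_distance(rate, t2)
    paralog = expected_distance(rate, t1 + t2)
    return ortholog, paralog


if __name__ == "__main__":
    rate = 0.001  # hypothetical substitutions per site per unit time
    for t1, t2 in [(1.0, 100.0), (100.0, 1.0)]:
        ortho, para = ortholog_and_paralog_distance(rate, t1, t2)
        print(f"T1={t1:g}, T2={t2:g}: ortholog={ortho:.3f}, "
              f"paralog={para:.3f}, ratio={para / ortho:.2f}")
```

The ratio of the two distances is 1 + T1/T2. When T1 is small relative to T2 the paralogous comparison overestimates the species divergence only slightly, whereas when T1 is large relative to T2 it mostly measures the age of the duplication.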
(and may not even be excluded in single gene systems because they, too, might once have been multigenic) but it is particularly acute in the Mhc, in which contractions and expansions of the cluster are frequent occurrences (Klein et al., 1993b). This problem, however, does not occur when T1 is much smaller than T2. In such situations, even the comparison of paralogous genes will provide a meaningful species phylogenetic tree. The fact that the available Mhc dendrograms are congruent with species phylogenies (Fig. 3), even though some of the former are almost certainly based on paralogous comparisons (all the Latimeria class I genes, for example, probably arose from an ancestral gene that emerged after the separation of Crossopterygii from other fish taxa), indicates that in "long distance" comparisons, paralogy is not a serious problem (this will be expanded on later). Mhc sequences, together with those of other nuclear genes, are therefore a useful source of phylogenetic information.

V. Cladistic Analysis with Macromutations
Macromutations are likely to be unique events. Although a gene can duplicate repeatedly, it is highly improbable that the different duplications will involve exactly the same DNA segment and hence be indistinguishable. This is also true of deletions, insertions, and other rearrangements. If a macromutation occurs and becomes fixed in an ancestral population before the latter gives rise to extant taxa, a synapomorphic character for cladistic analysis is generated. Macromutations are, of course, not restricted to the Mhc; they can occur at any other locus or chromosomal region. The Mhc, however, has the potential to become a very rich source of macromutations (as it is in mammals; see Mňuková-Fajdelová et al., 1994; Satta et al., 1996) for two reasons. First, a dense cluster of closely related genes is more likely to undergo rearrangements than a chromosomal region occupied by unrelated loci. Second, because of the considerable attention accorded to the Mhc, various macromutations are likely to be discovered in this chromosomal region by chance. One example of a cladistically useful macromutation, serendipitously discovered during studies of the Mhc in cichlid fishes, is given below (Figueroa et al., 1995). As mentioned earlier, the organization of the Mhc exons and introns that code for the extracellular domains is the same in all genes studied thus far, with one exception. In Aulonocara hansbaenschi and other cichlids, the class II B loci all contain an extra intron which splits the ILD-encoding exon 3 into two (Ono et al., 1993c; Fig. 7). Further examination has revealed the extra intron to be present not only in cichlids, but in all other Percomorpha examined, as well as in representative species of Atheriniformes and Cyprinodontifor-
FIGURE 7 Exon-intron organization of class II B genes in zebrafish (Brre, Brachydanio rerio) and cichlid fish (Auha, Aulonocara hansbaenschi). Filled rectangles represent exons (E), open rectangles represent untranslated regions, and connecting lines represent introns (I). Border codon positions are indicated by numerals above the exons; numerals below two-way arrows give distances in base pairs (from Figueroa et al., 1995).