AN INTRODUCTION TO POPULATION GENETICS THEORY
JAMES F. CROW
UNIVERSITY OF WISCONSIN
MOTOO KIMURA
NATIONAL INSTITUTE OF GENETICS JAPAN
Front Cover: An illustration of Motoo Kimura's principle of quasilinkage equilibrium.
After a few generations of directional selection, linkage disequilibrium cancels the epistatic
variance, so that the additive variance alone (VG) is the best predictor of the change in mean fitness (dm/dt). See Figure 5.7.1 on page 222.
Reprint of 1970 Edition by Harper and Row, Publishers, Inc. This book was previously published by Pearson Education, Inc. Copyright © 1970 by James F. Crow and Motoo Kimura
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, except as may be expressly permitted by the applicable copyright statutes or in writing by the publisher.
An Introduction to Population Genetics Theory
ISBN-10: 1-932846-12-3
ISBN-13: 978-1-932846-12-6
Library of Congress Control Number: 2005920027
THE BLACKBURN PRESS
P. O. Box 287
Caldwell, New Jersey 07006
U.S.A.
973-228-7077
www.BlackburnPress.com
To
Sewall
Wright
CONTENTS

PREFACE
INTRODUCTION

1. MODELS OF POPULATION GROWTH
1.1 Model 1: Discrete, Nonoverlapping Generations
1.2 Model 2: Continuous Random Births and Deaths
1.3 Model 3: Overlapping Generations, Discrete Time Intervals
1.4 Model 4: Overlapping Generations, Continuous Change
1.5 Fisher's Measure of Reproductive Value
1.6 Regulation of Population Number
1.7 Problems

2. RANDOMLY MATING POPULATIONS
2.1 Gene Frequency and Genotype Frequency
2.2 The Hardy-Weinberg Principle
2.3 Multiple Alleles
2.4 X-linked Loci
2.5 Different Initial Gene Frequencies in the Two Sexes
2.6 Two Loci
2.7 More Than Two Loci
2.8 Polyploidy
2.9 Subdivision of a Population: Wahlund's Principle
2.10 Random-mating Proportions in a Finite Population
2.11 Problems

3. INBREEDING
3.1 Decrease in Heterozygosity with Inbreeding
3.2 Wright's Inbreeding Coefficient, f
3.3 Coefficients of Consanguinity and Relationship
3.4 Computation of f from Pedigrees
3.5 Phenotypic Effects of Consanguineous Matings
3.6 The Effect of Inbreeding on Quantitative Characters
3.7 Some Examples of Inbreeding Effects
3.8 Regular Systems of Inbreeding
3.9 Inbreeding with Two Loci
3.10 Effect of Inbreeding on the Variance
3.11 The Inbreeding Effect of a Finite Population
3.12 Hierarchical Structure of Populations
3.13 Effective Population Number
3.14 Problems

4. CORRELATION BETWEEN RELATIVES AND ASSORTATIVE MATING
4.1 Genetic Variance with Dominance and Epistasis, and with Random Mating
4.2 Variance Components with Dominance and Inbreeding
4.3 Identity Relations Between Relatives
4.4 Correlation Between Relatives
4.5 Comparison of Consanguineous and Assortative Mating
4.6 Assortative Mating for a Single Locus
4.7 Assortative Mating for a Simple Multifactorial Trait
4.8 Multiple Alleles, Unequal Gene Effects, and Unequal Gene Frequencies
4.9 Effect of Dominance and Environment
4.10 Effects of Assortative Mating on the Correlation Between Relatives
4.11 Other Models of Assortative Mating
4.12 Disassortative-mating and Self-sterility Systems
4.13 Problems

5. SELECTION
5.1 Discrete Generations: Complete Selection
5.2 Discrete Generations: Partial Selection
5.3 Continuous Model with Overlapping Generations
5.4 The Effects of Linkage and Epistasis
5.5 Fisher's Fundamental Theorem of Natural Selection: Single Locus with Random Mating
5.6 The Fundamental Theorem: Nonrandom Mating and Variable Fitnesses
5.7 The Fundamental Theorem: Effects of Linkage and Epistasis
5.8 Thresholds and Truncation Selection for a Quantitative Trait
5.9 A Maximum Principle for Natural Selection
5.10 The Change of Variance with Selection
5.11 Selection Between and Within Groups
5.12 Haldane's Cost of Natural Selection
5.13 Problems

6. POPULATIONS IN APPROXIMATE EQUILIBRIUM
6.1 Factors Maintaining Gene Frequency Equilibria
6.2 Equilibrium Between Selection and Mutation
6.3 Equilibrium Under Mutation Pressure
6.4 Mutation and Selection with Multiple Alleles
6.5 Selection and Migration
6.6 Equilibrium Between Migration and Random Drift
6.7 Equilibrium Under Selection: Single Locus with Two Alleles
6.8 Selective Equilibrium with Multiple Alleles
6.9 Some Other Equilibria Maintained by Balanced Selective Forces
6.10 Selection and the Sex Ratio
6.11 Stabilizing Selection
6.12 Average Fitness and Genetic Loads
    1. Kinds of Genetic Loads and Definitions
    2. The Mutation Load
    3. The Segregation Load
    4. The Incompatibility Load
    5. The Load Due to Meiotic Drive
6.13 Evolutionary Advantages of Mendelian Inheritance
6.14 Problems

7. PROPERTIES OF A FINITE POPULATION
7.1 Increase of Homozygosity Due to Random Gene Frequency Drift
7.2 Amount of Heterozygosity and Effective Number of Neutral Alleles in a Finite Population
7.3 Change of Mean and Variance in Gene Frequency Due to Random Drift
7.4 Change of Gene Frequency Moments with Random Drift
7.5 The Variance of a Quantitative Character Within and Between Subdivided Populations
7.6 Effective Population Number
    1. Introduction
    2. Inbreeding Effective Number
    3. Variance Effective Number
    4. Comparison of the Two Effective Numbers
    5. An A Priori Approach to Predicting Effective Number in Selection Programs
7.7 Problems

8. STOCHASTIC PROCESSES IN THE CHANGE OF GENE FREQUENCIES
8.1 The Rate of Evolution by Mutation and Random Drift
8.2 Change of Gene Frequencies as a Stochastic Process
8.3 The Diffusion Equation Method
8.4 The Process of Random Genetic Drift Due to Random Sampling of Gametes
8.5 Change of Gene Frequency Under Linear Pressure and Random Sampling of Gametes
8.6 Change of Gene Frequency Under Selection and Random Sampling of Gametes
    1. Genic Selection
    2. Case of Complete Dominance
    3. Arbitrary Degree of Dominance
    4. Overdominant Case
8.7 Change of Gene Frequency Due to Random Fluctuation of Selection Intensities
8.8 Probability of Fixation of Mutant Genes
    1. Introductory Remarks
    2. Discrete Treatment Based on the Branching-process Method
    3. Continuous Treatment Based on the Kolmogorov Backward Equation
8.9 The Average Number of Generations Until Fixation of a Mutant Gene in a Finite Population

9. DISTRIBUTION OF GENE FREQUENCIES IN POPULATIONS
9.1 Wright's Formula for the Gene Frequency Distribution
9.2 Distribution of Gene Frequencies Among Subgroups Under Linear Pressure
9.3 Distribution of Gene Frequencies Under Selection and Reversible Mutation
9.4 Distribution of Lethal Genes
9.5 Effect of Random Fluctuation in Selection Intensity on the Distribution of Gene Frequencies
9.6 Number of Neutral Alleles Maintained in a Finite Population
9.7 The Number of Overdominant Alleles in a Finite Population
9.8 The Number of Heterozygous Nucleotide Sites Per Individual Maintained in a Finite Population by Mutation
9.9 Decrease of Genetic Correlation with Distance in the Stepping-stone Model of Population Structure

APPENDIX. SOME STATISTICAL AND MATHEMATICAL METHODS FREQUENTLY USED IN POPULATION GENETICS
A.1 Various Kinds of Averages
A.2 Measures of Variability: The Variance and Standard Deviation
A.3 Population Values and Sample Values
A.4 Correlation and Regression
A.5 Binomial, Poisson, and Normal Distributions
A.6 Significance Tests and Confidence Limits
    1. Significance Tests for Enumeration Data: The Chi-square Test
    2. Confidence Limits for Enumeration Data
    3. Significance Tests for Measurement Data
    4. The Significance of a Correlation Coefficient
    5. Confidence Limits for the Mean with Measurement Data
A.7 Matrices and Determinants
A.8 Eigenvalues and Eigenvectors
A.9 The Method of Maximum Likelihood
A.10 Lagrange Multipliers

BIBLIOGRAPHY
GLOSSARY AND INDEX OF SYMBOLS
INDEX OF NAMES
INDEX OF SUBJECTS

PREFACE TO THE 2009 REPRINTING

Population genetics has undergone enormous changes since this book was written. It has become broader, deeper, and more rigorous. Computers have brought major advances. Some parts of the book are in error or at least not as well done as they could have been; many more have been superseded. A proper revision would be an enormous task; so it is better, I think, to leave the book as it was originally than to undertake what, at best, would be a makeshift job. I have, however, made a few minor corrections.

This book has been out of print for some years and there has been a strong demand for copies. There are now several good books on population genetics, so is there still a place for this one? I believe there is, for some items are unique. It also, I believe, has a place as a historical document. So I am grateful that The Blackburn Press has taken on the task of reprinting it.

Motoo Kimura died on November 13, 1994, his seventieth birthday anniversary. We had several times discussed producing a revised edition of this book, but circumstances always intervened. For Motoo, his neutral theory became almost an obsession and occupied a major part of his time during his last 25 years. At the same time I was busy with a number of teaching, research, and administrative commitments. And there was the isolating influence of the Pacific Ocean. As a result, we never got around to doing a revision.

This reprinting provides me with an opportunity to honor Kimura: my student and collaborator, outstanding scientist, and close friend.

JFC
November 2008
PREFACE

This book is intended primarily for graduate students and advanced undergraduates in genetics and population biology. We hope that it will be of value and interest to others also. It is an attempt to present the field of population genetics, starting with elementary concepts and leading the reader well into the field. At first we intended to include experimental work; but this has been largely omitted, partly in the interest of coherence and partly because the book is already long.

The first two-thirds of the book do not require advanced mathematical background. An ordinary knowledge of the calculus will suffice. For the reader who is not familiar with the mathematical and statistical procedures employed, we have added an appendix. The latter parts of the book, which deal with populations stochastically, use more advanced methods. We have made no attempt to explain all of these, either in the text or in the appendix. The reader with only elementary knowledge will have to accept some of the conclusions on faith. We have tried, however, to present the model and the conclusion, leaving it to the reader to follow as much of the intermediate mathematical manipulation as he wishes. We have also added a number of tables and graphs in these chapters so that the major conclusions are available in this form.

There are problems at the end of the first seven chapters. These are more numerous in the early chapters where the reader is more likely to wish for a means to test his knowledge.

The bibliography is longer than is customary. There are many more articles listed than are referred to. We hope in this manner to provide a list which is a useful guide to the literature and which shows the richness and diversity of research in this field.

Without any implication that they share any responsibility for the choice of content or for errors (although they are specifically responsible for the removal of a number of errors), we should like to thank the many people who have helped in various ways. We have benefited greatly from critical comments by Joseph Felsenstein. Others who have helped by suggesting alternative ways of presentation and by pointing out errors are Takeo Maruyama, Laurence Resseguie, Daniel Hartl, Carter Denniston, Thomas Wolfe, Etan Markowitz, and Tomoko Ohta. We should also like to thank the many students who have used in class the notes that were the forerunner of this book; they have been especially helpful in pointing out ambiguities.

We should like to thank both our institutions, the University of Wisconsin and the National Institute of Genetics in Japan, for leaves that permitted us to work together on several occasions either in the United States or in Japan. The Rockefeller Foundation generously provided the initial support to get us started.

Finally, our indebtedness to Professor Sewall Wright will be apparent in every chapter. In many, many instances it is his pioneering work that gave us something to write about. A number of topics that we have treated only lightly, or not at all, are included in his four-volume treatise, "Evolution and the Genetics of Populations."

J.F.C.
M.K.
INTRODUCTION
We are concerned in this book mainly with population genetics in a strict sense. We deal primarily with natural populations and less fully with the rather similar problems that arise in breeding livestock and cultivated plants. The latter subject, sometimes called quantitative or biometrical genetics, emphasizes economically important measurements where the breeding system is under human control. Although this is not neglected, we emphasize more the behavior of genes and population attributes under natural selection where the most important measure is Darwinian fitness.

As do most sciences, population genetics includes both observations and theory. The observations come both from studies on natural populations and from laboratory experiments. Sometimes the observations are verifications of existing theory, sometimes they are tests to distinguish among alternative theories, or they may lead to totally new ideas. Our emphasis is on the theory, but we shall occasionally make use of experimental or observational data, usually for illustration.

The theory of population genetics is largely mathematical. By a biologist's standards it is highly developed, although a theoretical physicist might well regard it as rather primitive. A mathematical theory that could take into account all the relevant phenomena of even the simplest population would be impossibly complex. Therefore it is absolutely necessary to make simplifying assumptions. To a large extent the success or failure of a theory is determined by the choice of assumptions, that is, by the extent to which the model accounts for important facts, ignores trivia, and suggests new basic concepts.

The general pattern of this book will be to start with the simplest models and then to extend these to more complicated, but more realistic, formulations. We have tried to steer a middle course between completely verbal biological arguments and the rigor of the mathematician. We have not hesitated to appeal to the reader's biological intuition and we often have used derivations and proofs that fail to take into account all the mathematical possibilities when these can fairly clearly be ruled out on biological grounds. Furthermore, we have sometimes used models which are somewhat vague, but which seem to us to have considerable biological interest and generality. We frequently use approximations rather than exact expressions, since we are more interested in finding an approximate solution to a model that seems to be biologically interesting than an exact solution to one that is less interesting or realistic. When a choice is necessary we prefer generality and realism to precision and rigor.

We continue the tradition of Sewall Wright and R. A. Fisher in using heuristic arguments that are not rigorously proven, especially in the use of continuous approximations and diffusion models. This involves a risk of later being shown to be wrong, but the history of the physical sciences is on the side of such a strategy.
The view is well expressed by Richard Feynman in his Nobel Prize lecture (Science 153: 699, 1966):

    In the face of the lack of direct mathematical demonstration one must be careful and thorough to make sure of the point, and one should make a perpetual attempt to demonstrate as much of the formula as possible. Nevertheless, a very great deal more truth can become known than can be proven.

1 MODELS OF POPULATION GROWTH
Theoretical population genetics is concerned with model building. Any model of nature is an oversimplification, as is any verbal description of a natural process. The model is an attempt to abstract from nature some significant aspect of the true situation. The models employed in population genetics are mathematical.

The model is always unsatisfactory in some respects. Inevitably, it is unable to reflect all the complexities of the true situation. On the other hand, it is usually true that the more closely the model is made to conform to nature the more unmanageable it becomes from the mathematical standpoint. If it is as complex as the true situation, it is not a model. We have to choose some sort of compromise between a model that is so crude as to be unrealistic or misleading and one that is incomprehensible or too complex to handle.

Those men who have laid the mathematical foundation for the theory of population genetics (J. B. S. Haldane, R. A. Fisher, and Sewall Wright) have had the capacity of inventing mathematical models that extracted the essence of the situation in a formulation that could be handled mathematically. With increased mathematical sophistication, a more comprehensive and rigorous theory can be developed and much of the current research in theoretical population genetics is concerned with such developments.

Just as the economist considers the broad consequences of individual transactions in a more or less free market, the population geneticist is interested in the overall consequences of a large number of events: births, deaths, choice of mates, and the host of individual circumstances, habits, decisions, and accidents that determine these. As the physicist or chemist works with the statistical averages of molecular behavior and does not try to describe the behavior of each individual molecule, the population geneticist tries to describe the overall effect of a large number of individual events.

It is convenient to divide mathematical models of population structure into two kinds, deterministic and stochastic. With a deterministic model, the population is assumed to be large enough and the factors determining individual birth rates and death rates constant enough that the consequences of random fluctuations can be ignored. This would be true only for an infinite population under highly idealized conditions; but actually many populations are large enough that the "noise" introduced into the system by random processes is small enough in relation to systematic factors that, for the degree of approximation needed, it may be ignored. Deterministic models are much easier to handle mathematically, as will be abundantly clear throughout the book.

Stochastic models take account of the effects of the finiteness of the population and other random elements. Some populations are small enough or the conditions are variable enough that random fluctuations are appreciable. The difficulty is with the mathematical complexity.
Only the simplest situations have exact solutions. However, it was the genius of R. A. Fisher (1930, 1958) and Sewall Wright (1931, 1945, 1960) to devise procedures that provided very accurate approximations which have led to deep biological insights. Recently, more sophisticated mathematical techniques and electronic computers have been used to give more exact and extensive results.

In this chapter we describe briefly some of the deterministic models that have been used in population genetics. Genetics is ordinarily concerned more with the relative frequencies of different genes and genotypes in a population than with the size of the total population. Nevertheless, in this chapter we shall consider the population as a whole. The purpose is to introduce the models rather than make use of them in genetic analysis; they will be used later. We introduce four deterministic models:

1. Discrete, nonoverlapping generations,
2. Continuous random births and deaths,
3. Overlapping generations, discrete time intervals,
4. Overlapping generations, continuous change.
In this book we shall make use of only the first two for genetic problems. Thus far the more realistic models 3 and 4 have been used mainly by ecologists and demographers. But the increasing mutuality of interest of population geneticists, demographers, and ecologists forecasts a greater emphasis on these models in population genetics.

1.1 Model 1: Discrete, Nonoverlapping Generations
This is in many ways the simplest description of population growth. We assume that the parent generation reproduces and that, before the offspring reach reproductive age, the parents have all died (or at least are no longer counted). Time is measured most conveniently in units of generations. This model is a realistic description of some populations, such as annual plants. For others it may still be a useful first approximation. It is widely used because of its mathematical simplicity.

Let $N_t$ be the number of individuals in the population at time $t$, measured in generations. If the average number of progeny per individual is $w$, the population number in generation $t$ can be expressed in terms of the number in the previous generation, $t - 1$, by

$$N_t = w N_{t-1}. \tag{1.1.1}$$
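The recurrence $N_t = wN_{t-1}$ of 1.1.1 can be iterated directly. The following is a minimal sketch, not part of the original text; the function name `project_population` and the numerical values (100 founders, w = 1.1) are hypothetical choices made only for illustration.

```python
def project_population(n0, w, generations):
    """Iterate N_t = w * N_{t-1}; return [N_0, N_1, ..., N_generations]."""
    sizes = [float(n0)]
    for _ in range(generations):
        # One application of the recurrence per generation.
        sizes.append(w * sizes[-1])
    return sizes


# Hypothetical example: 100 founders, each leaving w = 1.1 progeny on average.
for t, n in enumerate(project_population(100, 1.1, 5)):
    print(t, round(n, 2))
```

The sketch treats w as constant over generations, which is the simplest case and an assumption of the example.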
We regard $w$ as a measure of both survival and reproduction; individuals who do not survive to the reproductive age are counted as leaving 0 progeny. We call $w$ the Darwinian fitness, or simply the fitness. Each generation must be counted at the same age. It is often convenient to count the population as zygotes, so that the survival and reproduction of an individual occur within the same generation.

The relation between $N_{t-1}$ and $N_{t-2}$ is the same as that between $N_t$ and $N_{t-1}$. Therefore, if $w$ remains constant,

$$N_t = w(wN_{t-2}) = w^2 N_{t-2}.$$

Continuing this process, $N_t = w^3 N_{t-3}$, and finally

$$N_t = N_0 w^t, \tag{1.1.2}$$
where $N_0$ is the number in the population in generation 0.

The change in population size is analogous to money invested at compound interest. If $w = 1 + s$, then $s$ is equivalent to the interest rate. If $w > 1$, or $s > 0$, the population is increasing; alternatively, it is decreasing if $w < 1$, or $s < 0$. Equation 1.1.1 may be written $\Delta N_t =$
$$\int_x^\infty e^{-my}\, l(y)\, b(y)\, dy.$$
The exponential term serves to diminish the value of children born a long time in the future. This is analogous to the situation where the present value of a loan or investment is greater if it is to be paid soon rather than later. This is reversed, of course, if $m$ is negative (as would be the present value of a loan or investment if interest rates were negative or the investment were decreasing in value). The reproductive value is proportional to the total contribution per individual of this age, so we divide the contribution by the number of persons of that age. This leads to the definition of reproductive value at age $x$,

$$v(x) = \frac{\int_x^\infty e^{-my}\, l(y)\, b(y)\, dy}{e^{-mx}\, l(x)}. \tag{1.5.1}$$

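Definition 1.5.1 can be evaluated numerically once a survival function $l(x)$ and a birth-rate function $b(x)$ are given. The sketch below is illustrative only: the life table (a constant death rate of 0.05 per year, births at 0.2 per year between ages 15 and 45), the finite upper age limit `MAX_AGE` standing in for the infinite limit, and all function names are assumptions made for the example. The parameter $m$ is found by bisection so that the integral taken from birth equals 1.

```python
import math

MAX_AGE = 50.0  # assumed finite cutoff replacing the infinite upper limit


def survival(x):
    """Hypothetical l(x): probability of surviving from birth to age x."""
    return math.exp(-0.05 * x)


def birth_rate(x):
    """Hypothetical b(x): births per individual per year at age x."""
    return 0.2 if 15.0 <= x <= 45.0 else 0.0


def integrate(f, a, b, steps=20000):
    """Composite trapezoid rule for f on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


def discounted_births(m, x=0.0):
    """Integral from x to MAX_AGE of exp(-m*y) * l(y) * b(y) dy."""
    return integrate(
        lambda y: math.exp(-m * y) * survival(y) * birth_rate(y), x, MAX_AGE
    )


def malthusian_parameter(lo=-1.0, hi=1.0, tol=1e-10):
    """Bisection for m with discounted_births(m) = 1 (decreasing in m)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if discounted_births(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def reproductive_value(x, m):
    """v(x) as in 1.5.1: discounted future births over exp(-m*x) * l(x)."""
    return discounted_births(m, x) / (math.exp(-m * x) * survival(x))


m = malthusian_parameter()
for age in (0, 15, 30, 46):
    print(age, reproductive_value(age, m))
```

With this assumed life table the value at birth comes out as 1 by construction, and the value falls to 0 once reproduction has ended.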
If $x = 0$, the denominator is equal to 1. Likewise, if $x = 0$, the numerator is equal to 1 from equation 1.4.3. Therefore, the reproductive value at birth is 1, and $v(x)$ is a measure of the reproductive value of an individual of age $x$ relative to that of a newborn child.

In developing 1.5.1 we discussed the situation as if the population were in age-distribution equilibrium. On the other hand, we can accept the definition as given and apply it to populations in general. From this a remarkable property emerges: Irrespective of the age distribution, the total reproductive value of a population increases at a rate given by $m$. This can be shown as follows. First we rewrite 1.5.1 as

$$e^{-mx}\, l(x)\, v(x) = \int_x^\infty e^{-my}\, l(y)\, b(y)\, dy.$$
Differentiating both sides with respect to x leads to

$$e^{-mx}\left[ v(x)\frac{dl(x)}{dx} + l(x)\frac{dv(x)}{dx} - v(x)\, l(x)\, m \right] = -e^{-mx}\, l(x)\, b(x).$$

Cancelling $e^{-mx}$ and dividing both sides by $v(x)l(x)$ gives

$$\frac{1}{l(x)}\frac{dl(x)}{dx} + \frac{1}{v(x)}\frac{dv(x)}{dx} - m = -\frac{b(x)}{v(x)}.$$

The leftmost term, with sign changed, is simply the death rate $d(x)$, for it is the rate of decrease in the number of age $x$ expressed as a fraction of the proportion alive at that age. Making this substitution and rearranging, we obtain

$$\frac{dv(x)}{dx} - v(x)\, d(x) + b(x) = m\, v(x). \tag{1.5.2}$$

The first term is the rate of change in the reproductive value of an individual as his age increases. The second is the rate of decrease in reproductive value per individual caused by deaths of individuals of age $x$. The third is the rate of increase in reproductive value from new births; this is simply the instantaneous birth rate, since the value of each newborn child, $v(0)$, is equal to 1. The left side of the equation, then, is the net change in reproductive value of the population contributed by an individual of age $x$, either by growing older, by dying, or by giving birth. The $n(x)$ individuals of age $x$ then contribute $m\,n(x)\,v(x)$ to the increase in reproductive value of the population. Thus the rate of change in reproductive value for each age group is given by $m$. Adding up all ages, we have

$$\frac{dV}{dt} = mV \quad \text{and} \quad V_t = V_0 e^{mt}, \tag{1.5.3}$$

where $V$ is the total reproductive value of the population.

This demonstrates Fisher's principle: The rate of increase in total reproductive value is equal to the Malthusian parameter times the total reproductive value, regardless of the age distribution. This means that the equations of Section 1.2 become applicable for populations not at age equilibrium if each individual in the population is weighted by the reproductive value appropriate to his age.

1.6 Regulation of Population Number
We have said nothing so far about population regulation. It is obvious that a population cannot grow exponentially forever. It must eventually reach a state where $m$ becomes 0 or negative, or where in a discrete model $w$ becomes 1 or less. The growth rate is eventually limited by all the factors that collectively make up the carrying capacity of the environment.
In population genetics we are mainly concerned with the changes in proportions of different types of individuals, rather than total numbers. We shall consider some examples of this under various types of population regulation. However, we shall ignore until later in the book the complications introduced by Mendelian inheritance.

We shall be dealing in this section with continuous models of the type introduced in Section 1.2. We are assuming the kind of model described in that section, or if the population has a more complicated structure we assume that it has reached age-distribution stability. Alternatively, in principle we could deal with reproductive values rather than actual numbers by weighting each individual by the reproductive value appropriate to its age.

Equation 1.2.2 or 1.4.5 can be modified to take regulation into account by writing

$$\frac{dN}{dt} = rN[1 - f(N)]. \tag{1.6.1}$$

The quantity $r$ is the intrinsic rate of increase, the rate at which the population would grow if it had unlimited food supply and room for expansion. The function $f(N)$ implies some change in the rate of increase with the size of the population. The regulation may be, for example, by limitation of food supply, by the space available, by the accumulation of toxic products, or by territorial behavior patterns.

A particularly simple model is provided by letting $f(N)$ be a linear function of $N$, say $N/K$, where $K$ is a constant sometimes called the carrying capacity of the environment. Such a population will grow approximately exponentially as long as $N$ is much smaller than $K$, but as $N$ approaches $K$ the rate will decrease until size stability is reached at $N = K$. Substituting $N/K$ for $f(N)$ in 1.6.1 and rearranging, we have

$$\frac{dN}{dt} = \frac{rN(K - N)}{K}. \tag{1.6.2}$$

The equation may be rewritten as

$$\frac{dN}{N} + \frac{dN}{K - N} = r\, dt,$$

which is readily integrated to give

$$t = \frac{1}{r} \ln \frac{N_t (K - N_0)}{(K - N_t) N_0}. \tag{1.6.3}$$

Here and throughout the book ln means $\log_e$.
For example, if the intrinsic rate of increase of a population is 1% per year ($r = .01$) and the carrying capacity, $K$, is 5000, the time, $t$, required to change the number from $N_0 = 1000$ to $N_t = 2000$ is

$$t = \frac{1}{.01} \ln \frac{2000 \times 4000}{3000 \times 1000} = 98 \text{ years}.$$

If there were no regulation, the time required (see 1.2.3) would be

$$t = \frac{1}{r} \ln \frac{N_t}{N_0} = 69 \text{ years}.$$

Notice that, whether there is regulation or not, the time required for a certain change is proportional to $1/r$.
We can also write 1.6.3 in the inverse form, giving the number at time $t$ as a function of $t$ and the initial number $N_0$. This is

$$N_t = \frac{K}{1 + C_0 e^{-rt}}, \tag{1.6.4}$$

where

$$C_0 = \frac{K - N_0}{N_0}.$$
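The worked example above, together with the inverse form 1.6.4, can be checked with a few lines of code. This is a sketch for illustration; the function names `time_to_reach` and `logistic_size` are hypothetical, and the numbers are the ones used in the text (r = .01, K = 5000, growth from 1000 to 2000).

```python
import math


def time_to_reach(n0, nt, r, k=None):
    """Time to grow from n0 to nt.

    With a carrying capacity k this is equation 1.6.3; with k=None it is the
    unregulated exponential case, t = (1/r) ln(nt/n0).
    """
    if k is None:
        return math.log(nt / n0) / r
    return math.log(nt * (k - n0) / ((k - nt) * n0)) / r


def logistic_size(t, n0, r, k):
    """Population number at time t from equation 1.6.4."""
    c0 = (k - n0) / n0
    return k / (1.0 + c0 * math.exp(-r * t))


t_regulated = time_to_reach(1000, 2000, 0.01, k=5000)
t_unregulated = time_to_reach(1000, 2000, 0.01)
print(round(t_regulated), round(t_unregulated))  # 98 and 69, as in the text

# Round trip: feeding the time from 1.6.3 into 1.6.4 should recover N_t.
print(logistic_size(t_regulated, 1000, 0.01, 5000))
```

Halving r in either function doubles the time returned, which is the proportionality to 1/r noted above.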
This function i s shown graphicall y i n Figure 1 .6. 1 . The curve is often called the " l ogistic " curve of populat ion i ncre ase
and has been widely used i n ecology (Pearl, 1 940 ; Nair, 1 954 ; Slobod k i n ,
N
o
2
3
5
4
6
7
8
rr
Figure 1 .6.1 . The logistic curve of population i ncrease. The
ordinate is the population number ; the abscissa is rt, the product
of the intrinsic rate of increase and the ti me. No , the initial
population, is taken as 1 /20 the value of K, the final number.
M O DELS OF P O P U LATI O N G ROWTH
25
1 962}. Of course it is merely the simplest of a number of equations that could be d erived and many populations, natural and experimental, depart widely from the model . As stated earlier, we are mainly concerned in popu lation genetics with the proportion of different genes and genotypes rather than the total number. We shal l see that many of the equations for the proportions of different types are the same, despite quite different mechanisms for regulation of the popula tion number. The intrinsic rate of increase, r, is closely related to the Malthusian parameter m. We shall use the latter for the actual rate of change in numbers of the population, or of a part of the population, and r for the value this would take in a situation where the growth rate is not regulated. In Chapter 5 we shall consider the effects of natural and artificial selection on the composition of a population. Now we are considering a simplified system in which the complications of Mendelian genetics are omitted. The essential points are brought out more clearly this way. Furthermore, the general principles that we are trying to show can be illustrated with only two classes ; the extension to an arbitrary nu mber presents no difficulties. So we are treating two competing species or asexual clones. Actually the same equations can be adapted to a single locus in a haploid population, to a cytoplasmic particle, to a gene on the Y chromosome or to competition between two self-fertilized strains. The methods in the following sections come largely from Egbert Leigh. Consider two strains, 1 and 2, with numbers nl and n 2 and intrinsic growth rates r1 and r2' These are the same as the Malthu sian parameters when there is no restriction on continuous exponential growth_ Let N be the total population number ; i.e. , N = nl + n2 We shall designate by PI = nl/N and P 2 = 1 PI = n2/N the proportions of the two strains. 
1. Unregulated Growth

If there is no regulation of the growth of either strain, the rates of increase are

$$\frac{dn_1}{dt} = r_1 n_1 \quad\text{and}\quad \frac{dn_2}{dt} = r_2 n_2. \tag{1.6.5}$$
The rate of increase of the total population is

$$\frac{dN}{dt} = r_1 n_1 + r_2 n_2 = \bar{r} N, \tag{1.6.6}$$

where $\bar{r}$ is the mean of the $r$'s, weighted by the numbers in each population.
To obtain the rate of change in the proportions of the two types, we write

$$\frac{d\ln(p_1/p_2)}{dt} = \frac{d\ln(n_1/n_2)}{dt} = \frac{d\ln n_1}{dt} - \frac{d\ln n_2}{dt} = \frac{1}{n_1}\frac{dn_1}{dt} - \frac{1}{n_2}\frac{dn_2}{dt} = r_1 - r_2. \tag{1.6.7}$$
Notice also that

$$\frac{d\ln(p_1/p_2)}{dt} = \frac{d\ln p_1}{dt} - \frac{d\ln(1-p_1)}{dt} = \frac{1}{p_1}\frac{dp_1}{dt} + \frac{1}{1-p_1}\frac{dp_1}{dt} = \frac{1}{p_1(1-p_1)}\frac{dp_1}{dt}.$$

Putting these two equations together gives

$$\frac{dp_1}{dt} = (r_1 - r_2)\,p_1(1 - p_1). \tag{1.6.8}$$
Notice that if we let $p_1 = N/K$ and $r_1 - r_2 = r$ we obtain equation 1.6.2. So 1.6.8 is the equation of a logistic curve. Despite the fact that both strains are growing exponentially, the proportion of one type (the faster growing one) is increasing according to the logistic equation.

Equation 1.6.8 can be written in another form by noting that

$$\bar{r} = p_1 r_1 + p_2 r_2.$$

Substituting for $r_2$ in 1.6.8 gives

$$\frac{dp_1}{dt} = p_1(r_1 - \bar{r}). \tag{1.6.8a}$$
This form of the equation suggests the extension to more than two strains. When three or more strains are present the same equation is correct for the rate of change of a particular strain, and $\bar{r}$ is the weighted average of the rates of increase of all the strains.
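As a numerical check on 1.6.8, the following Python sketch lets two strains grow exponentially and compares the resulting proportion of strain 1 with the logistic curve obtained by integrating 1.6.8. The function names, starting numbers, and growth rates are illustrative assumptions, not values from the text.

```python
import math

def proportion_exponential(n1_0, n2_0, r1, r2, t):
    """Proportion of strain 1 when each strain grows exponentially (1.6.5)."""
    n1 = n1_0 * math.exp(r1 * t)
    n2 = n2_0 * math.exp(r2 * t)
    return n1 / (n1 + n2)

def proportion_logistic(p0, r1, r2, t):
    """Solution of dp/dt = (r1 - r2) p (1 - p), i.e. equation 1.6.8."""
    odds = p0 / (1.0 - p0) * math.exp((r1 - r2) * t)
    return odds / (1.0 + odds)

# Hypothetical starting numbers and intrinsic rates.
n1_0, n2_0, r1, r2 = 10.0, 990.0, 0.5, 0.3
p0 = n1_0 / (n1_0 + n2_0)
for t in (0, 10, 20, 40):
    direct = proportion_exponential(n1_0, n2_0, r1, r2, t)
    logistic = proportion_logistic(p0, r1, r2, t)
    print(f"t={t:3d}  direct={direct:.6f}  logistic={logistic:.6f}")
```

The two columns agree because the ratio $n_1/n_2$ grows as $e^{(r_1 - r_2)t}$, which is exactly the statement of 1.6.7.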
In this model the total population is increasing exponentially at any instant, although the rate of increase, $\bar{r}$, is changing continuously as the faster growing strain replaces the slower. This model is obviously unrealistic for any long period of time. We discuss it here to illustrate the point that equations of the type 1.6.8 can accurately describe the rate of change of the
proportion of one type in a mixed population even when the numbers are changing according to quite a different rule. This is also true with various forms of regulation, as the following examples show.

2. Logistic Regulation of Total Number

A simple model for this situation is given by the equations

$$\frac{dn_1}{dt} = n_1(r_1 - \bar{r}N/K) \quad\text{and}\quad \frac{dn_2}{dt} = n_2(r_2 - \bar{r}N/K). \tag{1.6.9}$$
The total population increases logistically until it reaches an equilibrium at $N = K$, as can be seen by writing

$$\frac{dN}{dt} = \frac{d(n_1 + n_2)}{dt} = r_1 n_1 + r_2 n_2 - (n_1 + n_2)\,\bar{r}N/K.$$

But, since $N = n_1 + n_2$ and $\bar{r}N = r_1 n_1 + r_2 n_2$,

$$\frac{dN}{dt} = \bar{r}N(1 - N/K). \tag{1.6.10}$$
This is the equation for logistic growth (see 1.6.2). At the same time that $N$ is changing according to this rule, we can see what is happening to the proportions of the two types by writing

$$\frac{d\ln(p_1/p_2)}{dt} = \frac{1}{n_1}\frac{dn_1}{dt} - \frac{1}{n_2}\frac{dn_2}{dt} = r_1 - r_2,$$

or, following the pattern of 1.6.7 and 1.6.8,

$$\frac{dp_1}{dt} = p_1(r_1 - \bar{r}). \tag{1.6.11}$$
Again, the proportion of strain 1 is changing logistically.

Equation 1.6.8 may be written in integrated form as

$$\ln[p/(1 - p)] = C + rt, \tag{1.6.12}$$

where $r = r_1 - r_2$ and $C$ is $\ln[p_0/(1 - p_0)]$, a constant determined by the initial composition. This suggests a convenient way of plotting data from
selection experiments; by plotting $\ln[p/(1 - p)]$ against time one can easily see whether the trend is linear and thus see if the logistic equation is appropriate. Alternatively, if we wish to know how much time is needed to change the proportion from $p_0$ to $p_t$, we can write 1.6.12 as

$$t = \frac{1}{r}\ln\frac{p_t(1 - p_0)}{(1 - p_t)\,p_0}, \tag{1.6.12a}$$

which gives the time as a function of the frequency of the type of interest. Note that the time required to accomplish a certain change in proportion is strictly proportional to the reciprocal of $r$.
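The time formula 1.6.12a is easy to evaluate directly. The short Python sketch below does so for two hypothetical values of $r$; the function name and the numbers are assumptions chosen only for illustration.

```python
import math

def time_to_change(p0, pt, r):
    """Time for the proportion to move from p0 to pt, following 1.6.12a.

    r is the difference in intrinsic rates, r1 - r2, assumed nonzero.
    """
    return math.log(pt * (1.0 - p0) / ((1.0 - pt) * p0)) / r

# Hypothetical example: going from 1% to 99% at two different values of r.
t_slow = time_to_change(0.01, 0.99, 0.1)
t_fast = time_to_change(0.01, 0.99, 0.2)
print(t_slow, t_fast)
```

Doubling $r$ halves the time, which is the reciprocal proportionality noted above.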
3. Weaker Population Control

The previous example assumed that the total population has an absolute upper limit, $K$. We now consider a population that is limited, but the limit is proportional to $\bar{r}$, so that as one type replaces the other the total population increases. A simple model is

$$\frac{dn_1}{dt} = n_1(r_1 - cN) \quad\text{and}\quad \frac{dn_2}{dt} = n_2(r_2 - cN). \tag{1.6.13}$$

The total number changes according to

$$\frac{dN}{dt} = N(\bar{r} - cN). \tag{1.6.14}$$

The population reaches a limit when $N = \bar{r}/c$; for this value $dN/dt = 0$. The same procedure as before leads to the equation for change in proportion of type 1,

$$\frac{dp_1}{dt} = p_1(r_1 - \bar{r}). \tag{1.6.15}$$
This situation is probably quite unusual in nature. The size of the population is usually determined mainly by factors other than the $r$'s. A replacement of the original strain by one with a higher $r$ will cause only a slight increase in the final population number, if indeed there is any change at all.

The point of these three examples is to show that, despite great differences in the way in which the total population changes, the changes in proportion follow the same general rule given by 1.6.15.
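The claim that all three models give the same change in proportions can be checked numerically. The Python sketch below integrates the three pairs of growth equations with a crude Euler scheme and compares the final proportion of strain 1 with the integrated logistic form. The rate functions follow the three models as described in the text; the parameter values, step size, and function names are assumptions made for illustration.

```python
import math

R1, R2 = 0.5, 0.3        # intrinsic rates (hypothetical values)
K, C = 1000.0, 0.0005    # carrying capacity and crowding constant (hypothetical)

def r_bar(n1, n2):
    """Mean intrinsic rate, weighted by numbers."""
    return (R1 * n1 + R2 * n2) / (n1 + n2)

def unregulated(n1, n2):
    return R1 * n1, R2 * n2

def logistic_total(n1, n2):
    damp = r_bar(n1, n2) * (n1 + n2) / K
    return n1 * (R1 - damp), n2 * (R2 - damp)

def weaker_control(n1, n2):
    damp = C * (n1 + n2)
    return n1 * (R1 - damp), n2 * (R2 - damp)

def final_proportion(model, n1=5.0, n2=45.0, t_end=20.0, dt=0.001):
    """Euler integration of the chosen model; returns p1 at time t_end."""
    for _ in range(int(round(t_end / dt))):
        d1, d2 = model(n1, n2)
        n1, n2 = n1 + d1 * dt, n2 + d2 * dt
    return n1 / (n1 + n2)

p0 = 5.0 / 50.0
odds = p0 / (1 - p0) * math.exp((R1 - R2) * 20.0)
expected = odds / (1 + odds)   # logistic prediction for p1 at t = 20
results = {m.__name__: final_proportion(m)
           for m in (unregulated, logistic_total, weaker_control)}
for name, p in results.items():
    print(f"{name:15s} p1 = {p:.4f}  (logistic prediction {expected:.4f})")
```

The three totals behave very differently (unbounded growth, a ceiling at $K$, and a ceiling that depends on $\bar{r}$), yet the proportions agree up to the small error of the Euler step.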
4. Regulation by Efficiency of Space or Food Utilization

If populations are regulated by the available space, food, or some other limiting factor, the type that wins in the competition may not be the one with the higher intrinsic rate of increase, but rather the one that can maintain the largest numbers in this environment. A simple model illustrating this possibility is given by

$$\frac{dn_1}{dt} = r_1 n_1 (K_1 - N)/K_1 \quad\text{and}\quad \frac{dn_2}{dt} = r_2 n_2 (K_2 - N)/K_2. \tag{1.6.16}$$
One interpretation of $K_1$ is that this is the maximum population size that strain 1 can maintain when it is the only species; $K_2$ has the same meaning for strain 2. Suppose, to bring out the point of interest, that $r_1 = r_2$, but $K_1 \neq K_2$; the two strains differ, not in their intrinsic growth rates, but in the maximum number that this environment can support. The change in total number is given by

$$\frac{dN}{dt} = rN\left[1 - \frac{n_1}{K_1} - \frac{n_2}{K_2}\right], \tag{1.6.17}$$

where $r = r_1 = r_2$. The change in the proportion of type 1 is

$$\frac{dp_1}{dt} = R\,p_1 p_2 = R\,p_1(1 - p_1), \tag{1.6.18}$$

where

$$R = Nr\left[\frac{K_1 - K_2}{K_1 K_2}\right].$$
Equation 1.6.18 is in the general form of the logistic equation (see 1.6.8), but it is not the same since $R$ is not a constant. However, in many cases $R$ is changing slowly and an equation of the form of 1.6.8 describes the rate of change at any particular time.

In an uncrowded environment the success of a population is determined mainly by its intrinsic rate of increase, $r$. In a crowded environment the carrying capacity, $K$, for the species may be more important. MacArthur and Wilson (1967) refer to "$r$ selection" and "$K$ selection." In an uncrowded environment ($r$ selection) types which harvest the most food, even if they are wasteful, have the largest rate of increase. On the other hand, in a crowded environment ($K$ selection) there is a great value on efficiency of utilization rather than simple productivity.
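A small simulation makes the K-selection case concrete. The Python sketch below integrates the model in which each strain grows at rate $r\,n_i (K_i - N)/K_i$ with equal intrinsic rates, using a simple Euler step. All parameter values and the function name are assumptions for illustration only.

```python
def simulate_k_selection(r=1.0, k1=1000.0, k2=500.0, n1=10.0, n2=490.0,
                         t_end=60.0, dt=0.001):
    """Euler integration with r1 = r2 = r and K1 != K2; returns (p1, N)."""
    for _ in range(int(round(t_end / dt))):
        total = n1 + n2
        d1 = r * n1 * (k1 - total) / k1   # strain 1 still grows while N < K1
        d2 = r * n2 * (k2 - total) / k2   # strain 2 declines once N > K2
        n1, n2 = n1 + d1 * dt, n2 + d2 * dt
    return n1 / (n1 + n2), n1 + n2

p1, total = simulate_k_selection()
print(f"p1 = {p1:.4f}, N = {total:.1f}")
```

Although the two strains have the same intrinsic rate, the one with the larger carrying capacity replaces the other, and the total number settles near $K_1$.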
Ecologists have considered in more detail equations of these types. For many purposes it is more meaningful to measure biomass rather than simply count numbers, so that one isn't in the position of equating one mouse with one elephant. But we shall do no more with the subject here and get on to problems that are more strictly genetic.

In Chapter 5 we shall be dealing with the changes in gene and genotype frequencies under selection. The equations will be basically similar to 1.6.15, complicated by the Mendelian mechanism. The purpose of this section, as stated before, is to show that equations in the same form are widely applicable, even though the populations may differ greatly in their states. The total population may be growing, or static, or decreasing and the different genotypes may differ in intrinsic birth rates, death rates, or their response to the environment; yet the equations for changes in proportions may be basically similar. For these reasons, population genetics has usually ignored the total numbers and concentrated on the proportions.

1.7 Problems

1. In a population with discrete generations and with fitness $w$, how many generations are required to double the population number?
2. How long is required for the population to double with model 2?
3. A population under model 3 has reached age stability. How long, in units of A, will be required for the population to double? What is the effective generation length, defined as the unit that will give the same answer as problem 1?
4. Suppose you know the age-specific death rates (the probability that an individual of age $x$ will die during the next time unit). What is the life expectancy, that is, the mean length of life? What is the median length of life?
5. Show that equation 1.6.8a is correct for any number of strains.
6. What are the median and mean length of life under model 2, expressed in terms of the death rate, $d$?
7. Show that the time required to change the number from $N_0$ to $N_t$ in a logistically growing population exceeds that in an unregulated population with the same intrinsic rate of increase by $\ln[(K - N_0)/(K - N_t)]/r$.
8. Again considering a logistic population with carrying capacity $K$, what is the time required to go from a fraction $x$ to a fraction $y$ of this capacity?
9. One bacterium which reproduces by fission and follows a logistic growth pattern is introduced into each of several ponds. Show that the time required to fill a pond to half its capacity is proportional to the log of the carrying capacity.
2 RANDOMLY MATING POPULATIONS
We are now ready to consider populations with Mendelian inheritance and begin by inquiring into the frequencies of the different genotypes that comprise the population. In the last chapter we were interested in the total number of individuals in the population and in different subpopulations. In genetic studies there is usually greater interest in the relative numbers of different genotypes, so it is convenient to express the numbers of different types as proportions of the total. As long as we use deterministic models the total number in the population is not important and we can deal as well with proportions. With stochastic models the number in the population becomes important in determining the extent of random fluctuations and therefore must be taken into consideration. Only deterministic models will be considered in this chapter, and for the most part we shall be concerned with a model of discrete generations.
2.1 Gene Frequency and Genotype Frequency
The number of possible genotypes in a population greatly exceeds the number of genes and soon becomes enormous. A diploid population with only two alleles at each of 100 loci would have $3^{100}$ possible genotypes, a number far larger than the number of individuals in any population. Therefore we effect a great simplification by writing formulae in terms of gene frequencies rather than genotype frequencies. This usually entails some loss of information, for knowledge of the gene frequencies is not sufficient to specify the genotype frequencies; but it is usually possible to do this to a satisfactory approximation by introducing other information, such as the mating system and linkage relations.

Furthermore, in a sexually reproducing population the genes are reassorted by the Mendelian shuffle that takes place every generation. The effects of such reassortment are largely transitory, being undone as fast as they are done. For long-range trends we look to the changes in the frequencies of the genes themselves. As R. A. Fisher (1953) said:

The frequencies with which the different genotypes occur define the gene ratios characteristic of the population, so that it is often convenient to consider a natural population not so much as an aggregate of living individuals as an aggregate of gene ratios. Such a change of viewpoint is similar to that familiar in the theory of gases, where the specification of the population of velocities is often more useful than that of a population of particles.
At the outset we need formulae to specify the relation between gene frequencies and genotype frequencies. This will also serve to introduce the kind of notation that will be used throughout the book.

We start with a single locus with only two alleles. Consider a population of $N$ diploid individuals, of which

$N_{11}$ are of genotype $A_1A_1$,
$2N_{12}$ are of genotype $A_1A_2$, and
$N_{22}$ are of genotype $A_2A_2$,

where $N_{11} + 2N_{12} + N_{22} = N$. It is sometimes convenient to distinguish, among $A_1A_2$ heterozygotes, those that received $A_1$ from the mother and those that received it from the father. We can do this by designating the two numbers as $N_{12}$ and $N_{21}$. However, in most populations $N_{12} = N_{21}$ and it is not necessary to make any distinction between them. We designate the frequency or proportion of the three genotypes by $P_{11}$, $2P_{12}$, and $P_{22}$ as follows:

$A_1A_1$: $P_{11} = N_{11}/N$,
$A_1A_2$: $2P_{12} = 2N_{12}/N$,
$A_2A_2$: $P_{22} = N_{22}/N$.
From these genotype frequencies, we can write the frequencies of alleles $A_1$ and $A_2$, which are designated $p_1$ and $p_2$, as follows:

$$p_1 = P_{11} + P_{12}, \qquad p_2 = P_{12} + P_{22}. \tag{2.1.1}$$
The blood groups provide convenient examples, and Table 2.1.1 shows some data on the frequency of MN types in Britain.

Table 2.1.1. Numbers of persons of blood types M, MN, and N in a sample from a British population. Data from Race and Sanger (1962).

PHENOTYPE    M        MN       N        TOTAL
GENOTYPE     MM       MN       NN
NUMBER       363      634      282      1,279
FREQUENCY    0.284    0.496    0.220    1.000

$p_1$ or $p_M$ = .284 + ½(.496) = .532
$p_2$ or $p_N$ = .220 + ½(.496) = .468
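The gene-counting arithmetic behind Table 2.1.1 can be written out as a short Python sketch. The function name is an assumption; the counts are those given in the table.

```python
def allele_frequencies(n_mm, n_mn, n_nn):
    """Return (p_M, p_N) from the numbers of MM, MN, and NN persons."""
    total = n_mm + n_mn + n_nn
    # Each MM person carries two M genes, each MN person carries one.
    p_m = (2 * n_mm + n_mn) / (2 * total)
    return p_m, 1.0 - p_m

p_m, p_n = allele_frequencies(363, 634, 282)
print(f"p_M = {p_m:.3f}, p_N = {p_n:.3f}")
```

Counting genes directly gives the same result as adding the homozygote frequency to half the heterozygote frequency.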
Extension of this principle to multiple alleles causes no difficulty, though the symbolism becomes a little more abstract. Equations of the type of 2.1.1 suggest the nature of the extension.

Consider a locus with $n$ alleles. As before, we designate allele frequencies with small letters and genotype frequencies with capital letters.

Alleles:       $A_1$   $A_2$   ...   $A_i$   ...   $A_j$   ...   $A_n$
Frequencies:   $p_1$   $p_2$   ...   $p_i$   ...   $p_j$   ...   $p_n$

The subscripts $i$ and $j$ are used to designate any two different alleles. With these symbols, and letting $P_{ii}$ stand for the frequency of genotype $A_iA_i$ and $2P_{ij}$ stand for the frequency of the heterozygous genotype $A_iA_j$, we obtain

$$p_i = \sum_{j=1}^{n} P_{ij}. \tag{2.1.2}$$
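The multiple-allele rule, in which the frequency of allele $A_i$ is the sum of $P_{ij}$ over $j$, amounts to summing a row of a symmetric matrix. The Python sketch below builds that matrix from a table of genotype frequencies; the three-allele frequencies and the function names are hypothetical and serve only as an illustration.

```python
def genotype_matrix(genotype_freq, n_alleles):
    """Build P with P[i][i] = homozygote frequency and
    P[i][j] = P[j][i] = half the A_iA_j heterozygote frequency."""
    P = [[0.0] * n_alleles for _ in range(n_alleles)]
    for (i, j), freq in genotype_freq.items():
        if i == j:
            P[i][i] = freq
        else:
            P[i][j] = P[j][i] = freq / 2.0
    return P

def allele_frequencies_from_matrix(P):
    """Row sums of P give the allele frequencies p_i."""
    return [sum(row) for row in P]

# Hypothetical genotype frequencies for alleles indexed 0, 1, 2.
genotype_freq = {(0, 0): 0.25, (0, 1): 0.20, (0, 2): 0.10,
                 (1, 1): 0.16, (1, 2): 0.20, (2, 2): 0.09}
p = allele_frequencies_from_matrix(genotype_matrix(genotype_freq, 3))
print([round(x, 2) for x in p])
```

Storing each heterozygote as two equal off-diagonal entries is what lets a single row sum replace the separate homozygote and half-heterozygote terms.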
This procedure is applicable in any situation where each genotype is identifiable. The effects of other loci may be ignored if they do not obscure the distinction between the genotypes of the locus under consideration.

Because of dominance, it is often not possible to distinguish all genotypes. For example, in most blood group studies no distinction is made between AA and AO persons, both being classified simply as belonging to blood group A. Under such circumstances the allele frequency can be measured only if there is some knowledge about the way in which the genes are combined into genotypes in the population. The simplest assumption, and fortunately one that is often very closely approximated in many actual populations, is random mating, the subject of the next section.

2.2 The Hardy-Weinberg Principle
With random mating, the relation between gene frequency and genotype frequency is greatly simplified. By random mating we mean that the matings occur without regard to the genotypes in question. In other words, the probability of choosing a particular genotype for a mate is equal to the relative frequency of that genotype in the population.

Notice that it is possible for a population to be at the same time in random mating proportions for some genes and not for others. For example, it is quite reasonable for mating to be random with respect to a blood group or serum protein factor but be nonrandom with respect to genes for skin color or intelligence.

In many respects, random mating among the different genotypes in the population is equivalent to random combination of the gametes produced

Figure 2.5.1. Random combination of gametes when the allele frequencies are different in the two sexes. The single and double asterisks refer to male and female frequencies, respectively.

                                      Male gametes
                                      $A_1$ ($p_1^*$)               $A_2$ ($p_2^*$)
Female gametes   $A_1$ ($p_1^{**}$)   $A_1A_1$: $p_1^* p_1^{**}$    $A_2A_1$: $p_2^* p_1^{**}$
                 $A_2$ ($p_2^{**}$)   $A_1A_2$: $p_1^* p_2^{**}$    $A_2A_2$: $p_2^* p_2^{**}$
For an autosomal locus, the Hardy-Weinberg frequencies are attained, but after a delay of one generation. Consider a population in which the frequencies of the alleles $A_1$ and $A_2$ are $p_1^*$ and $p_2^*$ in males and $p_1^{**}$ and $p_2^{**}$ in females. If the gametes are combined at random, we obtain the results in Figure 2.5.1. In the next generation the frequency of gene $A_1$ in both sexes is

$$p_1 = \tfrac{1}{2}(p_1^* + p_1^{**}), \tag{2.5.1}$$

and in all following generations the genotypes are in the proportions $p_1^2$, $2p_1p_2$, and $p_2^2$. As expected, the gene frequency is the unweighted average of what it was originally in the two sexes.

The demonstration that random mating of the different genotypes gives the same results as random combination of the gametes follows the same general method as was used in Table 2.2.1. Likewise, extension to multiple alleles is straightforward and will not be discussed here.

With an X-linked locus, starting out with different allele frequencies in the two sexes, the situation is quite different. The equilibrium, instead of being attained in two generations as for an autosomal locus, is reached only gradually. Consider a particular X-linked allele, $A$, in a multiple-allelic series. Let the frequency of this allele in generation $t$ be $p_t^*$ in males and $p_t^{**}$ in females. Since a male always gets his X chromosome from his mother, (1) the allele frequency in males will always be what it was in females a generation earlier. Likewise, (2) the frequency in females will be the average of the two sexes in the preceding generation, since each sex contributes one X chromosome. Finally, (3) the mean frequency of the gene will be the weighted average of that in the two sexes, attaching twice as much weight to the female frequency as the male because the female has twice as many X chromosomes. This quantity must be a constant, since the mean frequency of the gene does not change.

These statements may be stated mathematically as follows, where $t - 1$ designates the previous generation.

$$
\begin{aligned}
&(1)\quad p_t^* = p_{t-1}^{**},\\
&(2)\quad p_t^{**} = \tfrac{1}{2}p_{t-1}^* + \tfrac{1}{2}p_{t-1}^{**},\\
&(3)\quad \bar{p}_t = \tfrac{1}{3}p_t^* + \tfrac{2}{3}p_t^{**} = \tfrac{1}{3}p_{t-1}^* + \tfrac{2}{3}p_{t-1}^{**} = \bar{p}_{t-1} = \bar{p},
\end{aligned}
\tag{2.5.2}
$$

where $\bar{p}$ is a constant throughout the process. From the third equation we have $p_{t-1}^* = 3\bar{p} - 2p_{t-1}^{**}$. Substituting this into the second equation, after some algebraic rearrangement we obtain a
relation between the gene frequency in the female sex from generation to generation:

$$p_t^{**} - \bar{p} = -\tfrac{1}{2}\,(p_{t-1}^{**} - \bar{p}). \tag{2.5.3}$$

Since the same relationship holds for $p_{t-1}^{**}$ and $p_{t-2}^{**}$,

$$p_t^{**} - \bar{p} = (-\tfrac{1}{2})^2\,(p_{t-2}^{**} - \bar{p}) = (-\tfrac{1}{2})^3\,(p_{t-3}^{**} - \bar{p})$$

and, finally,

$$p_t^{**} - \bar{p} = (-\tfrac{1}{2})^t\,(p_0^{**} - \bar{p}), \tag{2.5.4}$$

where $p_0^{**}$ is the allele frequency in females in the initial generation. Notice that these formulae do not depend on equal numbers of males and females in the population.
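The oscillating approach to equilibrium at an X-linked locus can be followed by iterating statements (1) and (2) and comparing the female frequency with the closed form in which the deviation from the mean is multiplied by $-\tfrac{1}{2}$ each generation. The Python sketch below does this; the starting frequencies and the function name are hypothetical.

```python
def x_linked_trajectory(p_male, p_female, generations):
    """Iterate: males take the previous female frequency; females take
    the average of the two sexes. Returns a list of (male, female)."""
    history = [(p_male, p_female)]
    for _ in range(generations):
        p_male, p_female = p_female, 0.5 * (p_male + p_female)
        history.append((p_male, p_female))
    return history

p_male0, p_female0 = 1.0, 0.0                 # hypothetical starting values
p_bar = p_male0 / 3 + 2 * p_female0 / 3       # weighted mean, constant over time
trajectory = x_linked_trajectory(p_male0, p_female0, 6)
for t, (pm, pf) in enumerate(trajectory):
    closed_form = p_bar + (-0.5) ** t * (p_female0 - p_bar)
    print(t, round(pm, 4), round(pf, 4), round(closed_form, 4))
```

The female frequency overshoots the mean in alternate generations, with the deviation halved each time, and the male frequency follows one generation behind.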
$p_2, \ldots,$ and $p_k$ in these subpopulations. Then the mean proportion of $AA$ homozygotes in the whole population is

$$\frac{n_1p_1^2 + n_2p_2^2 + \cdots + n_kp_k^2}{n_1 + n_2 + \cdots + n_k} = \overline{p^2}. \tag{2.9.1}$$
Now suppose that these populations are pooled into a single panmictic unit. The average frequency of the $A$ allele is now (as before) $\bar{p}$, the weighted average of the frequencies in the different populations. Then the proportion of $AA$ homozygotes in the pooled population after one generation of random mating is $\bar{p}^2$. Recall that the variance is $V_p = \overline{p^2} - \bar{p}^2$ (see A.2.2 and A.2.4). Hence

$$\bar{p}^2 = \overline{p^2} - V_p, \tag{2.9.2}$$
where $V_p$ is the variance in the frequency of the gene $A$ among the $k$ subpopulations. This explains why the proportion of individuals with recessive traits is reduced by migration between previously isolated communities. Since the variance is always positive, there will always be a decrease unless the gene frequency is identical in the subpopulations. The magnitude of the decrease will depend on the diversity of frequencies among the populations, as measured by the variance.

The previous discussion has referred to a situation where two or more populations are pooled, and then matings occur without regard to the origin of the individuals. The situation is somewhat different if the first matings are all between individuals from different populations. We shall consider only two populations.

If $p_1$ and $p_2$ represent the frequency of the $A$ gene in the two populations, the proportion of $AA$ homozygotes in the $F_1$ hybrids is $p_1p_2$. The gene
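frequency in the $F_1$ is the mean of the two parent population frequencies, or $(p_1 + p_2)/2$. The variance in the two original populations (equally weighted) is

$$V_p = \frac{p_1^2 + p_2^2}{2} - \left(\frac{p_1 + p_2}{2}\right)^2 = \frac{(p_1 - p_2)^2}{4}. \tag{2.9.3}$$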
Note that $\bar{p}^2 - V_p = p_1p_2$, which is the frequency of $AA$ homozygotes in the $F_1$ population.

For comparison, the proportion of $AA$ homozygotes in the three populations is:

(1) Separate populations    $\bar{p}^2 + V_p$,
(2) $F_1$ population        $\bar{p}^2 - V_p$,
(3) $F_2$ and later         $\bar{p}^2$.
Hybridization between two populations causes an initial decrease in homozygosity, followed by a rise to a point halfway between. This argument does not consider linkage, the effect of which is to slow the approach to the final value.
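The three homozygote proportions compared in this section can be verified with a few lines of Python. The sketch below computes the weighted mean of $p^2$, the square of the mean frequency, and their difference (the variance), and then checks the $F_1$ value for two equally weighted populations. The allele frequencies and the function name are hypothetical.

```python
def wahlund(freqs, sizes):
    """Return (mean of p^2, square of mean p, variance), weighted by sizes."""
    total = sum(sizes)
    p_bar = sum(n * p for n, p in zip(sizes, freqs)) / total
    mean_sq = sum(n * p * p for n, p in zip(sizes, freqs)) / total
    return mean_sq, p_bar ** 2, mean_sq - p_bar ** 2

# Two hypothetical populations of equal size.
p1, p2 = 0.2, 0.6
separate, pooled, v_p = wahlund([p1, p2], [1, 1])
f1 = p1 * p2        # AA frequency when all first matings are between populations
print(separate, pooled, v_p, f1)
```

With these numbers the homozygote proportion falls from the separate-population value to the $F_1$ value and then rises to the pooled value, which lies halfway between them.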
2.10 Random-mating Proportions in a Finite Population
The Hardy-Weinberg proportions are realized exactly only in an infinite population. For one thing, a finite population is subject to chance deviations from the expected proportions. There is also a systematic bias because of the discreteness of the possible numbers of different genotypes. The bias can become important if there are a number of individually very rare alleles. For example, one might determine the allele frequencies from a natural population and then wish to inquire if these are in random-mating proportions. The problem has been considered in detail by Hogben (1946) and Levene (1949).

Consider a population of size $N$. Since we are considering diploid populations, there are $2N$ genes per locus. Let $p_i$ be the proportion of allele $A_i$ in this population; hence there are $2Np_i$ representatives of the $A_i$ gene. Then we regard the zygotes as made up by combining these $2N$ genes at random in pairs. The probability of drawing an $A_i$ allele is $2Np_i/2N$; after this is done, the probability of drawing another $A_i$ allele from the remaining genes is $(2Np_i - 1)/(2N - 1)$. Thus the expected proportion of $A_iA_i$ individuals, given that there are exactly $2Np_i$ $A_i$ alleles, is

$$P_{ii} = p_i\,\frac{2Np_i - 1}{2N - 1} = p_i^2 - f\,p_i(1 - p_i), \tag{2.10.1}$$
where $f = 1/(2N - 1)$. Likewise the expected proportion of $A_iA_j$ heterozygotes is

$$2P_{ij} = 2p_i\,\frac{2Np_j}{2N - 1} = 2p_ip_j(1 + f). \tag{2.10.2}$$

Thus the heterozygotes are increased by a fraction $f = 1/(2N - 1)$ and the homozygotes are correspondingly decreased, in comparison with the proportions in an infinite population with the same allele frequencies.

As a simple example, consider a population with only two alleles at the $A$ locus, $A_1$ and $A_2$. Assume further that the $A_1$ allele is represented only once. Thus $p_1 = 1/2N$ and $p_2 = 1 - p_1$. Substituting these values into 2.10.1 we obtain 0 for the frequency of $A_1A_1$, as we should; for if there is only one $A_1$ gene there can be no homozygotes. Furthermore, substituting into 2.10.2 leads to a frequency of $A_1A_2$ heterozygotes of $1/N$; this is also correct, since only one heterozygote exists in the population of $N$ individuals.

In Chapter 3 we shall see that within a finite population there is an opposite effect, a decrease in heterozygosity. However, this decrease is strictly due to changes in the gene frequencies due to random gains and losses in a small population. Within the population the relation between the gene and genotype frequencies is given by the Hardy-Weinberg principle, with the slight correction given here.
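The single-copy example can be checked with exact arithmetic. The Python sketch below assumes that the expected proportions obtained from the drawing argument are $p_i^2 - f\,p_i(1 - p_i)$ for homozygotes and $2p_ip_j(1 + f)$ for heterozygotes, with $f = 1/(2N - 1)$; the population size and the function name are hypothetical.

```python
from fractions import Fraction

def finite_proportions(p_i, p_j, n_individuals):
    """Expected A_iA_i and A_iA_j proportions when 2N genes are paired at random."""
    f = Fraction(1, 2 * n_individuals - 1)
    homozygotes = p_i ** 2 - f * p_i * (1 - p_i)
    heterozygotes = 2 * p_i * p_j * (1 + f)
    return homozygotes, heterozygotes

N = 50                          # hypothetical population size
p1 = Fraction(1, 2 * N)         # allele A1 is represented exactly once
hom, het = finite_proportions(p1, 1 - p1, N)
print(hom, het)
```

Using `Fraction` keeps the arithmetic exact, so the homozygote proportion comes out as exactly zero and the heterozygote proportion as exactly $1/N$.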
2.11 Problems
In all problems, unless the contrary is stated, assume random mating.

1. In a population there are 8 times as many heterozygotes as homozygous recessives. What is the frequency of the recessive gene?
2. Show that, for a very rare recessive gene, the proportion of heterozygous carriers is approximately twice the frequency of the recessive gene.
3. If 16% of the population are Rh− ($dd$), what fraction of the Rh+ population ($DD$ and $Dd$) are homozygous?
4. From the data in problem 3, what fraction of the children from a large group of families where both parents were Rh+ would be expected to be Rh+?
5. Show that if the A and B antigens of the ABO blood group system were caused by two dominant genes, independently inherited, the product of the frequency of A and B should equal the product of O and AB.
6. What is the maximum proportion of heterozygotes with two alleles? With three alleles? With $n$ alleles? (See A.10)
7. From the data given on color blindness, what fraction of women would be of normal vision, but carrying two different color-blind factors?
8. Show that in a randomly-mating population with two alleles half the heterozygotes have heterozygous mothers.
9. Here are some hypothetical data on the frequencies of the ABO blood groups:

   Genotype:    OO     OA     AA     OB     BB     AB
   Frequency:   .40    .30    .08    .12    .04    .06

   What would the frequencies be next generation if mating were at random?
10. Letting $p$, $q$, and $r$ stand for the frequencies of the A, B, and O blood group alleles, what is the probability that two persons chosen at random have the same blood group?
11. An outrageously careless hospital gives transfusions at random. What proportion would be mismatches? (To refresh your memory, group O can give to anyone, A to A or AB, B to B or AB, and AB only to AB.)
12. Show that if $p$ is the frequency of a recessive allele, the average proportion of recessive children when one parent is of the dominant phenotype and the other of the recessive phenotype is $p/(1 + p)$, and when both parents are of the dominant phenotype is $[p/(1 + p)]^2$.
13. The two ratios, $p/(1 + p)$ and $[p/(1 + p)]^2$, are sometimes called Snyder's ratios and the fact that one is the square of the other is sometimes used as a test for recessive inheritance. Does this discriminate between a trait caused by a single pair of recessive genes and one caused by simultaneous homozygosity for several recessive genes?
14. Does the answer to problem 13 depend on whether the genes are independent or not? Must they be in gametic phase equilibrium?
15. A. G. Searle (J. Genet. 56:1-17, 1959) reports the following frequencies of coat colors of cats in Singapore. The observed numbers were as follows:

   FEMALES                              MALES
   DARK      CALICO     YELLOW          DARK     YELLOW
   +/+       +/y        y/y             +        y
   63        55         12              74       38

   Use the maximum-likelihood method to compute the gene frequency and test by Chi-square the agreement with the hypothesis of random-mating proportions.
16. Give an example of a set of gamete frequencies for three loci such that any two are in linkage equilibrium, but the set of three are not in equilibrium.
17. Genes A and B are linked with 20% recombination between them. An initial population is composed of AB/AB, AB/ab, and ab/ab plants in the ratio of 1 : 2 : 1. The population is allowed to pollinate at random.
   a. What would be the frequencies of the four kinds of chromosomes in the next generation?
   b. What would be the frequency of the AB/aB genotype in the next generation?
   c. What would be the chromosome frequencies when equilibrium is reached?
   d. What would be the frequency of the AB/aB genotype at equilibrium?
   e. How many generations would be required for the population to go halfway to equilibrium?
18. Two homozygous strains aa bb and AA BB are crossed. The A and B loci are on separate chromosomes. Show that these loci are in linkage equilibrium in the $F_2$ generation. Why doesn't equation 2.6.3 apply?
19. Show that in an autotetraploid the value of $\alpha$ is 1/7 as the locus becomes far enough from the centromere to be independent of it.
20. If $p_1$, $p_2$, $p_3$, and $p_4$ represent the frequencies of alleles $A_1$, $A_2$, $A_3$, and $A_4$ in a randomly mating tetraploid population that has reached equilibrium, and the relevant locus is very far from the centromere, what will be the frequency of $A_1A_1A_1A_1$ plants? $A_1A_1A_2A_2$? $A_1A_2A_3A_4$?
21. The equations $p_O = \sqrt{O}$, $p_A = 1 - \sqrt{B + O}$, and $p_B = 1 - \sqrt{A + O}$, where O, A, and B represent the frequencies of these three blood groups, are often used to estimate the gene frequencies. Show that these are not the solutions to the maximum-likelihood equations. Derive these equations from the relations below equation 2.3.2.
22. Two plausible hypotheses that explain the much greater incidence of early baldness in males than in females are (1) an autosomal dominant that is normally expressed only in males and (2) an X-linked recessive. If the first hypothesis is correct, and $q$ is the frequency of the gene for baldness, what proportion of the sons of bald fathers are expected to be bald? What proportion from nonbald fathers? What are the corresponding expectations on the X-linked recessive hypothesis?
23. Harris (Ann. Eugen. 13:172-181, 1946) found that 13.3% of males in a British sample were prematurely bald. He also found that of 100 bald men, 56 had bald fathers. Show that this is consistent with the sex-limited dominant hypothesis but not the sex-linked recessive. (You may want to satisfy yourself that the expected fraction of bald sons when the father is bald is the same as the expected fraction of bald fathers when the son is bald. It is easier to get data by selecting a group of bald men and inquiring about their fathers than it is to wait for their sons to grow up.)
24. Show that if a group of previously isolated populations are pooled the proportion of heterozygotes for alleles $A_i$ and $A_j$ is equal to the average proportion of heterozygotes before pooling minus twice the covariance of the two allele frequencies. Show also that when there are only two alleles the covariance is minus the variance.
25. Prove the statements in the legend of Figure 2.11.1. Assume that the base and altitude are each equal to 1. In particular, show that $P + H + R = 1$, and that the perpendicular line from the point to the hypotenuse divides it in the ratio $p : q$. Show also that the equation of the Hardy-Weinberg parabola is $P^2 - 2PR + R^2 - 2P - 2R + 1 = 0$. [You might find it useful to note that, with random-mating proportions, $H^2 = 4PR$.]

Figure 2.11.1. Representation of a population as a point in a 2-dimensional diagram. $P$, $H$, and $R$ represent the frequencies of the genotypes $AA$, $Aa$, and $aa$; $p$ and $q$ are the frequencies of the $A$ and $a$ alleles. $P$ is given by the distance from the vertical axis, $R$ by that from the horizontal axis. $H$ is given by either the horizontal or vertical distance to the hypotenuse of the triangle. All possible populations lie within the triangle; populations in Hardy-Weinberg ratios lie along the parabola.
26. Another way of representing a population was used by De Finetti (1926). Prove that, if the altitude of the triangle is 1, $P + H + R = 1$. Show also that the perpendicular from the point to the base divides it in the ratio $p : q$. See Figure 2.11.2.

Figure 2.11.2. De Finetti diagram of a population in triangular coordinates. $P$, $H$, and $R$ are given by the perpendicular distances to the three sides. The vertical line divides the base in the ratio of the gene frequencies, $p$ and $q$. Points in Hardy-Weinberg ratios lie along the curve.
27. Use Wahlund's principle or the definition of the variance to show that with n alleles the minimum homozygosity with random mating occurs when all alleles are equally frequent.

28. Assume two segregating loci, each with two alleles. Let $P_{AB}$, $P_{Ab}$, $P_{aB}$, and $P_{ab}$ be the frequencies of the four chromosomes and c the amount of recombination between the two loci. A conventional measure of linkage disequilibrium is $D = P_{AB}P_{ab} - P_{Ab}P_{aB}$. What is the equilibrium value of D? How fast is the equilibrium approached?
3 INBREEDING
Inbreeding occurs when mates are more closely related than they would be if they had been chosen at random from the population. Related individuals have one or more ancestors in common, so the extent of inbreeding is related to the amount of ancestry that is shared by the parents of the inbred individuals. Alternatively stated, the degree of inbreeding of an individual is determined by the proportion of genes that his parents have in common.

An immediate consequence of this sharing of parental genes is that the inbred individual will frequently inherit the same gene from each parent. Thus inbreeding increases the amount of homozygosity. So one observable effect of inbreeding is that recessive genes, previously hidden by heterozygosity with dominant alleles, will be expressed. Since most such genes are harmful in one way or another, inbreeding usually leads to a decrease in size, fertility, vigor, yield, and fitness. There are also likely to be loci segregating in a population where a heterozygote is fitter than either of the two
corresponding homozygotes. In this case, too, inbreeding leads to a decreased fitness.

Another consequence of consanguineous mating within the population is greater genetic variability, since similar genes tend to be concentrated in the same individuals. Usually, because of the correlation between genotype and phenotype, this leads to an increase in phenotypic variability.

Inbreeding may follow either of two patterns. There may be a certain amount of consanguineous mating within a population, with the consequences just mentioned. On the other hand, the inbreeding may be such as to break the population into subgroups. An extreme example is continued self-fertilization in which the population (if it is of constant size and each parent contributes equally to the next generation) is divided into a set of subpopulations of one individual each. Likewise, a pattern of repeated sib mating could lead to a series of isolated populations of size 2. As a third example, there may be a natural population which is divided into isolated subpopulations, within each of which mating is random or nearly so. The effect will be that each subpopulation becomes more homozygous, and therefore the whole population does. The individual subpopulations become more uniform genetically; but, since they become homozygous for different genes, the population as a whole becomes more variable. Of course there may be only partial isolation, with intermediate consequences.

A point that at first seems paradoxical is that within a subpopulation there is an increase in homozygosity despite the fact that mating within this group is random. The reason, as will be discussed in Section 3.11, is that there are random changes in the frequencies of the individual alleles and these, on the average, lead to a decrease in heterozygosity. As an extreme example, self-fertilization can be regarded as random mating (i.e., random combination of gametes) within a population of one. The gene frequencies at different, previously heterozygous loci change from 1/2 to 0 or 1.

Whether inbreeding leads to subdivision or not, it can be measured in the same way, by Wright's (1922) coefficient of inbreeding, f, which measures the proportion by which the heterozygosity has been decreased. As we shall show later, other population properties can also be related to f. However, before discussing f, we shall illustrate with two simple examples the effect of continued inbreeding.

3.1 Decrease in Heterozygosity with Inbreeding
The qualitative effect of continued inbreeding can be seen by examining the most extreme form, self-fertilization. In a self-fertilized population the progeny of homozygotes are like their parents, whereas the progeny of heterozygotes are 1/2 heterozygotes and 1/4 each of the two homozygous types.
Thus, in each generation the proportion of heterozygous loci is reduced by half and the homozygous types are correspondingly increased. This is illustrated in Table 3.1.1.

Table 3.1.1. The changes in the probabilities of different genotypes with continued self-fertilization. D, H, and R stand for the initial proportions of dominant, heterozygous, and recessive types.

                 FREQUENCY OF GENOTYPE
GENERATION    AA             Aa        aa
0             D              H         R
1             D + H/4        H/2       R + H/4
2             D + 3H/8       H/4       R + 3H/8
3             D + 7H/16      H/8       R + 7H/16
4             D + 15H/32     H/16      R + 15H/32
Limit         D + H/2        0         R + H/2
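The rows of Table 3.1.1 can be reproduced by applying the selfing rule one generation at a time. The following is a minimal sketch, not part of the original text; the function names and the starting frequency p = 1/4 are illustrative assumptions. It also checks that the gene frequency D + H/2 does not change from generation to generation.

```python
from fractions import Fraction


def selfing_step(d, h, r):
    """One generation of complete self-fertilization.

    Homozygotes breed true; heterozygotes yield 1/4 AA, 1/2 Aa, 1/4 aa.
    """
    return d + h / 4, h / 2, r + h / 4


def selfing_series(d, h, r, generations):
    """Genotype frequencies (AA, Aa, aa) for generations 0..generations."""
    rows = [(d, h, r)]
    for _ in range(generations):
        rows.append(selfing_step(*rows[-1]))
    return rows


# Illustrative starting point (an assumption): Hardy-Weinberg proportions with p = 1/4.
p = Fraction(1, 4)
q = 1 - p
rows = selfing_series(p * p, 2 * p * q, q * q, 4)
for t, (freq_aa_dom, freq_het, freq_aa_rec) in enumerate(rows):
    gene_freq = freq_aa_dom + freq_het / 2  # frequency of allele A
    print(t, freq_aa_dom, freq_het, freq_aa_rec, gene_freq)
```

Exact fractions are used so that the printed values can be compared directly with the entries D + 7H/16, H/8, and so on.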
If $H_0$ is the initial proportion of heterozygotes, the proportion after t generations of self-fertilization is $H_0/2^t$. If the original population were panmictic, with AA, Aa, and aa genotypes in the proportions $p^2$, $2pq$, and $q^2$ ($p + q = 1$), the individual lines eventually become homozygous. The probability of being AA is $D + H/2 = p^2 + pq = p(p + q) = p$; likewise the probability of being aa is q. Thus, the population becomes broken into separate lines, each homozygous for one or the other of the genes in the ratio of their original frequencies in the population.

Notice one other fact: There has been no change in the gene frequency. Inbreeding per se does not change the proportions of the various genes, only the way they are combined into homozygous and heterozygous genotypes.

With less extreme forms of inbreeding the results are similar, though the change in heterozygosity is less rapid. The results for continued brother-sister mating are shown in Table 3.1.2. Again there is a decrease in the proportion of heterozygotes, with the amount deducted being divided equally and added on to the two homozygous types.

These results may be obtained by writing out all the possible matings generation after generation, as was done by the early investigators (Fish, 1914; Jennings, 1916). This and several other systems of recurrent inbreeding were worked out by these authors. The papers are now mainly of historical interest since more general methods are available. We shall discuss them
Table 3.1.2. The decrease in heterozygosity with successive generations of brother-sister mating.

              RELATIVE          DECREASE IN               RATES OF CHANGE IN
              HETEROZYGOSITY    HETEROZYGOSITY            HETEROZYGOSITY
GENERATION t  $H_t/H_0 = P$     $(H_0 - H_t)/H_0 = f$     $H_t/H_{t-1}$      $(H_{t-1} - H_t)/H_{t-1}$
0             1                 0                         1                  0
1             2/2               0                         1                  0
2             3/4               1/4                       3/4 = .750         .250
3             5/8               3/8                       5/6 = .833         .167
4             8/16              8/16                      8/10 = .800        .200
5             13/32             19/32                     13/16 = .812       .188
6             21/64             43/64                     21/26 = .808       .192
Limit         0                 1                         $\lambda$ = .809   .191
in Sections 3.4 and 3.8, where the results of this table will appear as a special case.

In this example the heterozygosity follows a simple rule. The numerator in successive generations is given by the Fibonacci series in which each term is the sum of the two preceding terms, while the denominator doubles each generation. The number 1 in the second row is written as 2/2 to make the sequence more obvious. The reduction in heterozygosity, expressed as a fraction of the initial heterozygosity, is the same regardless of the initial gene frequencies and, as we shall show later, the number of alleles.

The relative heterozygosity, $H_t/H_0$, has been called by Wright (1951) the panmictic index, for which he used the letter P. $1 - P$ is the inbreeding coefficient, for which Wright has used the letter F. (We shall use the lower case f in order to reserve F for multiple-locus inbreeding effects.)

The last two columns give the rate of change in heterozygosity. Notice that the ratio $H_t/H_{t-1}$ after a few oscillations rapidly approaches a constant value. The limiting value of the ratio of heterozygosity to that in the previous generation is usually designated by $\lambda$ (Fisher, 1949).
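The Fibonacci rule just described is easy to check numerically. The sketch below is illustrative only (the function name is an assumption, not from the text): it builds $H_t/H_0$ from Fibonacci numerators 1, 2, 3, 5, 8, ... over denominators 1, 2, 4, 8, ..., and then looks at the ratio of successive values, which should settle near the limiting value .809 quoted in Table 3.1.2.

```python
from fractions import Fraction


def sib_mating_heterozygosity(generations):
    """Relative heterozygosity H_t/H_0 for t = 0..generations under sib mating.

    Numerators follow the Fibonacci series (1, 2, 3, 5, 8, ...);
    the denominator doubles each generation (1, 2, 4, 8, ...).
    """
    values = []
    numerator, next_numerator = 1, 2
    for t in range(generations + 1):
        values.append(Fraction(numerator, 2 ** t))
        numerator, next_numerator = next_numerator, numerator + next_numerator
    return values


values = sib_mating_heterozygosity(30)
for t in range(7):
    f = 1 - values[t]  # inbreeding coefficient, 1 - P
    print(t, values[t], f)

# Ratio of heterozygosity in successive generations, late in the series.
limiting_ratio = float(values[30] / values[29])
print(round(limiting_ratio, 3))
```

The limiting ratio is half the golden ratio, which is why the values in the fourth column of the table oscillate around .809.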
3.2 Wright's Inbreeding Coefficient, f

Wright's (1922) original derivation of the inbreeding coefficient, f, was through correlation analysis. An alternative approach using only probability rules has been developed by Haldane and Moshinsky (1939), Cotterman (1940), and
Malécot (1948). They distinguish between two ways in which an individual can be homozygous for a given locus. The two homologous genes may be: (1) alike in state, that is to say, indistinguishable by any effect they produce (or perhaps, when molecular genetics has become sufficiently precise, alike in their nucleotide sequence), and (2) identical by descent, in that both are derived from the same gene in a common ancestor.

We follow the notation of Cotterman in designating an individual whose two homologous genes are identical by descent as autozygous. If the two alleles are of independent origin (as far as known from our pedigree information), the individual is allozygous. The effect of inbreeding is to increase that part of the homozygosity that is due to autozygosity. (Notice that an individual can be homozygous without being autozygous, if the two homologous genes are alike in state but not identical by descent. Conversely, an autozygous individual can be heterozygous for this locus if one of the two alleles has mutated since their common origin, although this is negligibly rare if only a small number of generations is being considered.)

The inbreeding coefficient, f, is defined as the probability that the individual is autozygous for the locus in question. Alternatively stated, it is the probability that a pair of alleles in the two gametes that unite to form the individual are identical by descent.

An individual with inbreeding coefficient f has a probability f that the two genes at a particular locus are identical and a probability $1 - f$ that they are not identical, and therefore independent. If they are independent the frequencies of the genotypes will be given by the binomial formula. If they are identical, the frequencies of the gene pairs will be simply the frequencies of the alleles in the population. Thus, for two alleles, $A_1$ and $A_2$, with frequencies $p_1$ and $p_2$ ($p_1 + p_2 = 1$), the genotype frequencies are:

$$
\begin{array}{lccc}
 & \text{ALLOZYGOUS} & & \text{AUTOZYGOUS} \\
\text{Homozygous, } A_1A_1: & p_1^2(1 - f) & + & p_1 f \\
\text{Heterozygous, } A_1A_2: & 2p_1p_2(1 - f) & & \\
\text{Homozygous, } A_2A_2: & p_2^2(1 - f) & + & p_2 f \\
\text{Total} & 1 - f & & f
\end{array}
\tag{3.2.1}
$$
Notice that when f is 0 these formulae reduce to the usual Hardy-Weinberg proportions. When $f = 1$ the population is completely homozygous. Thus f ranges from 0 in a randomly mating population to 1 with complete homozygosity. How to compute f from a pedigree will be shown later.

Multiple alleles introduce no difficulty. The genotype frequencies are a natural extension of the results for two alleles. The frequencies are
$$A_iA_i: \quad p_i^2(1 - f) + p_i f \tag{3.2.2}$$

for homozygous genotypes, and

$$A_iA_j: \quad 2p_ip_j(1 - f) \tag{3.2.3}$$

for heterozygous genotypes. The expected proportion of heterozygous genotypes with inbreeding coefficient f, $H_f$, is given by

$$H_f = 2\sum_{i<j} p_ip_j(1 - f) = H_0(1 - f), \tag{3.2.4}$$
where $H_0$ is a constant equal to the proportion of heterozygotes expected with random mating ($f = 0$). The summation is over all combinations of values of i and j except when these are equal. This proves the assertion made earlier that the inbreeding coefficient measures the fraction by which the heterozygosity has been reduced.

We have written the formula as if, when $f = 0$, the population is in Hardy-Weinberg proportions. However, for any measured f (as determined, for example, from a pedigree), the heterozygosity, H, is $H_0(1 - f)$, where $H_0$ is whatever the heterozygosity would have been in the absence of the observed inbreeding. To be concrete, the inbreeding coefficient for the child of a cousin marriage is 1/16 (as we shall show later); therefore the child of such a marriage is 15/16 as heterozygous as if his parents had the same relationship as a random pair in this population.

There is a simple relationship between the correlation coefficient, r, and the inbreeding coefficient, f. If we assign numerical values to each allele, then the inbreeding coefficient, f, is the correlation between these values in a pair of uniting gametes. In fact, Wright's original derivation of the inbreeding coefficient was through correlation methods.

The relationship between r and f can be shown in the following way. For convenience we assign the value 1 to allele $A_1$ and 0 to allele $A_2$, though we would get the same result with any values. The calculations are shown in Table 3.2.1. Since the sum of the genotype frequencies is equal to 1, the weighted sum and the mean of any value are the same. For example, the sum (and mean) of the egg value, X, is $[p_2^2(1 - f) + p_2 f](0) + [p_1p_2(1 - f)](1) + [p_2p_1(1 - f)](0) + [p_1^2(1 - f) + p_1 f](1)$, which after some algebraic simplification reduces to $p_1$. The other calculations are given in the table, using the standard formula for calculation of r given in Appendix A.3.

The calculations in Table 3.2.1 are made by assuming that there are only two alleles and letting them have the values 0 and 1. The correlation interpretation of f, however, is completely general. Table 3.2.2 gives the same
Table 3.2.1. Demonstration of the equivalence of the inbreeding coefficient, f, and the coefficient of correlation, $r_{XY}$, between the genetic values of the uniting gametes.

                   FREQUENCY OF                  VALUE OF
EGG     SPERM      THIS COMBINATION              EGG X   SPERM Y   $X^2$   $Y^2$   XY
$A_2$   $A_2$      $p_2^2(1 - f) + p_2 f$        0       0         0       0       0
$A_1$   $A_2$      $p_1p_2(1 - f)$               1       0         1       0       0
$A_2$   $A_1$      $p_2p_1(1 - f)$               0       1         0       1       0
$A_1$   $A_1$      $p_1^2(1 - f) + p_1 f$        1       1         1       1       1
Sum or Mean        1                             $p_1$   $p_1$     $p_1$   $p_1$   $p_1^2(1 - f) + p_1 f$

$\bar{X} = p_1p_2(1 - f) + p_1^2(1 - f) + p_1 f = p_1$.

Likewise, $\bar{Y} = \overline{X^2} = \overline{Y^2} = p_1$.
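The equality of r and f in Table 3.2.1 can also be confirmed numerically. The following is a small sketch under the table's own assumptions (allele $A_1$ scored 1, allele $A_2$ scored 0); the function name and the trial values of $p_1$ and f are illustrative choices, not taken from the text.

```python
import math


def gamete_correlation(p1, f):
    """Correlation between egg value X and sperm value Y (A1 = 1, A2 = 0).

    The four (X, Y, frequency) rows are those of Table 3.2.1.
    """
    p2 = 1.0 - p1
    rows = [
        (0, 0, p2 * p2 * (1 - f) + p2 * f),  # egg A2, sperm A2
        (1, 0, p1 * p2 * (1 - f)),           # egg A1, sperm A2
        (0, 1, p2 * p1 * (1 - f)),           # egg A2, sperm A1
        (1, 1, p1 * p1 * (1 - f) + p1 * f),  # egg A1, sperm A1
    ]
    mean_x = sum(x * w for x, _, w in rows)
    mean_y = sum(y * w for _, y, w in rows)
    var_x = sum(x * x * w for x, _, w in rows) - mean_x ** 2
    var_y = sum(y * y * w for _, y, w in rows) - mean_y ** 2
    cov_xy = sum(x * y * w for x, y, w in rows) - mean_x * mean_y
    return cov_xy / math.sqrt(var_x * var_y)


# Trial values (assumptions): any 0 < p1 < 1 and 0 <= f <= 1 should give r = f.
for p1, f in [(0.3, 0.25), (0.5, 1 / 16), (0.9, 0.0)]:
    print(p1, f, gamete_correlation(p1, f))
```

Algebraically the covariance is $p_1^2(1 - f) + p_1 f - p_1^2 = f p_1 p_2$ and each variance is $p_1 - p_1^2 = p_1 p_2$, so the ratio is f whatever the gene frequency.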
demonstration without restriction as to number of alleles and letting the contribution of the alleles differ. Furthermore, if the genic values are summed over k loci the covariance will be

$$\mathrm{Cov}_{XY} = f \sum_k \sum_i p_{ik} a_{ik}^2,$$

where $p_{ik}$ and $a_{ik}$ are the frequency and value of the ith allele at the kth locus. This is f times the variance. Hence f is the expected value of the correlation between the genetic values of two uniting gametes, regardless of the number of loci and number of alleles under consideration.

The equivalence of r and f suggests an interpretation of the correlation coefficient. If a measurement can be thought of as being the sum of a number of elements, then the correlation coefficient is the measure of the fraction of these elements that are common to the two measurements, the other elements being chosen at random. This interpretation is useful in many branches of science. In quantitative genetics the elements can obviously be interpreted as cumulatively acting genes.

The computation of f will be discussed in Section 3.4.
Table 3.2.2. Demonstration of the equivalence of the inbreeding coefficient and the correlation between the genetic value of the uniting gametes regardless of the contribution of the individual genes and the number of alleles. The contribution, or value, of allele $A_i$ is assumed to be $a_i$, measured as a deviation from the mean value.

                   FREQUENCY OF                  VALUE OF
EGG     SPERM      THIS COMBINATION              EGG X   SPERM Y   $X^2$     $Y^2$     XY
$A_i$   $A_i$      $p_i^2(1 - f) + p_i f$        $a_i$   $a_i$     $a_i^2$   $a_i^2$   $a_i^2$
$A_i$   $A_j$      $p_ip_j(1 - f)$               $a_i$   $a_j$     $a_i^2$   $a_j^2$   $a_ia_j$

$V_X = \sum_i p_i a_i^2$, since the variance of the egg value is the sum of the squares of the allele values, each weighted by its frequency. $V_Y$ is the same.

$$\mathrm{Cov}_{XY} = (1 - f)\left[\sum_i p_i^2 a_i^2 + \sum_{i \neq j} p_ip_ja_ia_j\right] + f \sum_i p_i a_i^2.$$

But the quantity in brackets is equal to $\left[\sum_i p_i a_i\right]^2$, which is equal to 0, because the sum of the deviations from the mean is 0. Therefore,

$$\mathrm{Cov}_{XY} = f \sum_i p_i a_i^2.$$

The correlation coefficient, being the ratio of the covariance to the geometric mean of the two variances (which in this case are the same), is f, as was to be shown.

$$r_{XY} = \mathrm{Cov}_{XY}/V_X = f.$$
3.3 Coefficients of Consanguinity and Relationship
We have used the inbreeding coefficient of an individual I, $f_I$, to give the probability that two homologous genes in that individual are identical by descent. Or, as just shown, this is the correlation between the genetic value of the two gametes that united to produce the individual. Since inbreeding of the progeny depends on the consanguinity of the parents we can use the inbreeding coefficient as a measure of this.

We define the coefficient of consanguinity, $f_{IJ}$, of two individuals I and J as the probability that two homologous genes drawn at random, one from each of the two individuals, will be identical. The answer to this is clearly the same as the inbreeding coefficient of a progeny produced by these two individuals. Hence the inbreeding coefficient of an individual is the same as the coefficient of consanguinity of its parents (Malécot, 1948).

There is a bewildering plethora of alternative names for this coefficient. Malécot, who introduced the idea, called it the coefficient de parenté. Falconer (1960) calls it the coancestry. Kempthorne (1957) translated parenté into parentage. Malécot himself has, on at least one occasion, translated it into kinship. We shall use either consanguinity or kinship.

A different measure of relatedness, introduced much earlier and still widely used, is Wright's (1922) coefficient of relationship, $r_{IJ}$, defined as:

$$r_{IJ} = \frac{2 f_{IJ}}{\sqrt{(1 + f_I)(1 + f_J)}}. \tag{3.3.1}$$
For two individuals that are not inbred, the coefficient of relationship is exactly twice the coefficient of consanguinity. As we shall show later, the coefficient of relationship is the correlation between the genic, or genetic, values of the two individuals. If the genes act without dominance or epistasis, and there is no effect of the environment on the trait being measured, this is the expected correlation. We shall also show later the effect of dominance on the correlation between relatives (Section 4.3).

3.4 Computation of f from Pedigrees
The procedure for computing the inbreeding or consanguinity coefficient from a pedigree follows directly from the definition of f. Consider the pedigree in Figure 3.4.1. In this pedigree individual I is inbred because both his parents are descended from a single common ancestor, A. All unrelated ancestors, which are irrelevant to the inbreeding of I, are omitted from the pedigree.

We ask for the probability that I is autozygous; i.e., that the homologous genes contributed to I by gametes b and e are both descended from the same gene in ancestor A. We shall use the notation Prob(c = b) to mean the probability that c and b carry identical genes for the locus under consideration. Prob(c = b) = 1/2, since the gene in b has an equal chance of having come from C or from B's other parent. Likewise, Prob(c = a) = 1/2.

The probability that a and a' carry identical genes may be obtained as follows: Let the two alleles in A be called W and Z. Then there are four equally likely possibilities for gametes a and a': (1) W and W, (2) Z and Z, (3) W and Z, and (4) Z and W. In the first two cases they are identical, so the probability is 1/2 that a and a' get the same gene from A. However, there is an additional possibility if ancestor A is inbred, for in this case the two alleles W and Z
may both be descended from some more remote ancestor not shown in the figure. The probability that A is autozygous is, by definition, the inbreeding coefficient of A, $f_A$. Altogether, if A is inbred, $\text{Prob}(a = a') = \tfrac{1}{2} + \tfrac{1}{2} f_A = \tfrac{1}{2}(1 + f_A)$; if A is not inbred $\text{Prob}(a = a') = 1/2$.
Figure 3.4.1. A simple pedigree with inbreeding. Circles and squares denote females and males respectively. Eggs and sperms are designated by small letters. Ancestors that do not contribute to the inbreeding of I are omitted.
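Continuing around the path BCADE, $\text{Prob}(a' = d) = \text{Prob}(d = e) = 1/2$. Summarizing, b and e will carry identical genes only if b, c, a, a', d, and e do so. Therefore, since all these probabilities are independent,

$$
f_I = f_{BE} = \text{Prob}(b = e) =
\underbrace{\tfrac{1}{2}}_{b = c} \times
\underbrace{\tfrac{1}{2}}_{c = a} \times
\underbrace{\tfrac{1}{2}(1 + f_A)}_{a = a'} \times
\underbrace{\tfrac{1}{2}}_{a' = d} \times
\underbrace{\tfrac{1}{2}}_{d = e}.
$$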
If A is not inbred (and according to information given in this pedigree she is not) the inbreeding coefficient of I is simply $(1/2)^5$. Notice that whether B, C, D, and E are inbred is irrelevant, since, for example, the probability that c and b are identical is independent of the gene contributed by B's other parent.

The general rule is that the contribution of a path of relationship through a common ancestor is $(1/2)^n(1 + f_A)$ where n is the number of individuals in the path from one parent to the ancestor and back through the other parent.
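The path rule lends itself to a very short calculation. The sketch below is illustrative only: the function name and the way paths are supplied (as pairs of path length n and ancestor inbreeding $f_A$) are assumptions made for the example, and the paths themselves must still be read off the pedigree by hand. Contributions of separate paths are added.

```python
from fractions import Fraction


def inbreeding_from_paths(paths):
    """Sum of (1/2)**n * (1 + f_A) over paths through common ancestors.

    Each path is a pair (n, f_A): n is the number of individuals in the path
    from one parent to the common ancestor and back through the other parent,
    and f_A is the inbreeding coefficient of that ancestor.
    """
    return sum(Fraction(1, 2) ** n * (1 + Fraction(f_a)) for n, f_a in paths)


# Pedigree of Figure 3.4.1: one path, B-C-A-D-E, with A not inbred.
f_single_path = inbreeding_from_paths([(5, 0)])

# Child of first cousins: two paths of five individuals, one through each
# shared grandparent, neither grandparent inbred.
f_first_cousins = inbreeding_from_paths([(5, 0), (5, 0)])

# The same single path if the common ancestor had f_A = 1/4 (hypothetical value).
f_inbred_ancestor = inbreeding_from_paths([(5, Fraction(1, 4))])

print(f_single_path, f_first_cousins, f_inbred_ancestor)
```

The first-cousin value agrees with the figure of 1/16 quoted in Section 3.2 for the child of a cousin marriage.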
Figure 3.4.2. A
$$m_t = \tfrac{1}{2} k_{t-1} + \tfrac{1}{2} m_{t-1}. \tag{3.8.13}$$
Table 3.8.1. Decrease of heterozygosity with four mating systems, starting from a randomly mating population. The numbers are the ratio of the heterozygosity in generation t to the original heterozygosity, $H_t/H_0 = h_t$.

                                                DOUBLE FIRST-     CIRCULAR
               SELF-              SIB           COUSIN            HALF-SIB
               FERTILIZATION      MATING        MATING            MATING
GENERATION t   N = 1              N = 2         N = 4             N = 4
0              1.000              1.000         1.000             1.000
1               .500              1.000         1.000             1.000
2               .250               .750         1.000              .875
3               .125               .625          .875              .813
4               .063               .500          .813              .750
5               .031               .406          .750              .695
6               .016               .328          .688              .644
10              .001               .141          .492              .477
15                                 .048          .324              .327
20                                 .017          .213              .224
30                                 .002          .092              .105
50                                               .017              .023
$\lambda$       .500               .809          .920              .927
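These are now homogeneous and can be written in matrix form

$$
\begin{pmatrix} h_t \\ k_t \\ m_t \end{pmatrix} =
\begin{pmatrix}
0 & 0 & 1 \\
\tfrac{1}{4} & 0 & \tfrac{1}{2} \\
0 & \tfrac{1}{2} & \tfrac{1}{2}
\end{pmatrix}
\begin{pmatrix} h_{t-1} \\ k_{t-1} \\ m_{t-1} \end{pmatrix}.
\tag{3.8.13a}
$$

The characteristic equation is

$$
\begin{vmatrix}
-\lambda & 0 & 1 \\
\tfrac{1}{4} & -\lambda & \tfrac{1}{2} \\
0 & \tfrac{1}{2} & \tfrac{1}{2} - \lambda
\end{vmatrix} = 0,
\tag{3.8.14}
$$

which, upon expansion, becomes

$$\lambda^3 - \tfrac{1}{2}\lambda^2 - \tfrac{1}{4}\lambda - \tfrac{1}{8} = 0, \tag{3.8.14a}$$

and the largest root is $\lambda = .9196$.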
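The largest root can be checked without solving the cubic, simply by iterating the recurrence and watching the ratio of successive heterozygosities. The sketch below assumes the system reads $h_t = m_{t-1}$, $k_t = \tfrac{1}{4}h_{t-1} + \tfrac{1}{2}m_{t-1}$, $m_t = \tfrac{1}{2}k_{t-1} + \tfrac{1}{2}m_{t-1}$ (the reading consistent with the characteristic equation above and with the double first-cousin column of Table 3.8.1); the function name is illustrative.

```python
def double_first_cousin_series(generations):
    """Relative heterozygosity h_t for t = 0..generations.

    Assumed recurrences (h, k, m are 1 minus the inbreeding coefficient,
    the sib consanguinity, and the double first-cousin consanguinity):
        h_t = m_{t-1}
        k_t = h_{t-1}/4 + m_{t-1}/2
        m_t = k_{t-1}/2 + m_{t-1}/2
    starting from a randomly mating population, h = k = m = 1.
    """
    h, k, m = 1.0, 1.0, 1.0
    series = [h]
    for _ in range(generations):
        # The right-hand side uses only values from the previous generation.
        h, k, m = m, 0.25 * h + 0.5 * m, 0.5 * k + 0.5 * m
        series.append(h)
    return series


series = double_first_cousin_series(200)
print([round(value, 3) for value in series[:7]])

# After many generations the ratio h_t / h_{t-1} approaches the largest root.
ratio = series[-1] / series[-2]
print(round(ratio, 4))
```

The first seven values can be compared with the double first-cousin column of Table 3.8.1, and the final ratio with the root quoted above.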
Some numerical values for heterozygosity with this mating system are given in Table 3.8.1. Notice that for a population of size 4 this is the system of mating in which mated pairs are least related. A corresponding system in a population of size 8 would be quadruple second-cousin mating. Wright (1921) designated such systems as having maximum avoidance of inbreeding. Such systems do, in fact, minimize the rate of approach to homozygosity during the initial generations, but somewhat surprisingly there are systems of mating that ultimately have a slower rate of decrease in heterozygosity.

An example, for a population of 4, is half-sib mating, or circular mating, as illustrated in Figure 3.8.4. Letting $g_t$ be the coefficient of consanguinity of individuals one position apart and $j_t$ be that for individuals two positions apart,

$$g_t = f_{AB} = f_{BC} = f_{CD} = f_{AD}$$

and

$$j_t = f_{AC} = f_{BD},$$

we have

$$
f_t = g_{t-1}, \qquad
g_t = \tfrac{1}{4}\left[\tfrac{1}{2}(1 + f_{t-1}) + 2g_{t-1} + j_{t-1}\right], \qquad
j_t = \tfrac{1}{2}(g_{t-1} + j_{t-1}).
\tag{3.8.15}
$$

Letting $h = 1 - f$, $k = 1 - g$, and $m = 1 - j$ as before leads to

$$
\begin{pmatrix} h_t \\ k_t \\ m_t \end{pmatrix} =
\begin{pmatrix}
0 & 1 & 0 \\
\tfrac{1}{8} & \tfrac{1}{2} & \tfrac{1}{4} \\
0 & \tfrac{1}{2} & \tfrac{1}{2}
\end{pmatrix}
\begin{pmatrix} h_{t-1} \\ k_{t-1} \\ m_{t-1} \end{pmatrix}
\tag{3.8.16}
$$

with the characteristic equation

$$\lambda^3 - \lambda^2 + \tfrac{1}{16} = 0 \tag{3.8.17}$$

and the largest root is $\lambda = .9273$.

Figure 3.8.4. Half-sib mating in a population of 4, or circular mating. $\lambda = .927$.
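Double first-cousin mating and circular half-sib mating can be compared generation by generation by iterating both sets of recurrences side by side. This is a sketch under stated assumptions: the recurrences are written in terms of $h = 1 - f$, $k = 1 - g$, $m = 1 - j$, with g the consanguinity of sibs (double first-cousin system) or of neighbors (circular system) and j that of double first cousins or of individuals two positions apart; both are chosen to agree with the characteristic equations and with Table 3.8.1, and the function names are illustrative.

```python
def double_first_cousin(generations):
    """h_t under double first-cousin mating, N = 4 (assumed recurrences)."""
    h, k, m = 1.0, 1.0, 1.0
    series = [h]
    for _ in range(generations):
        h, k, m = m, 0.25 * h + 0.5 * m, 0.5 * k + 0.5 * m
        series.append(h)
    return series


def circular_half_sib(generations):
    """h_t under circular half-sib mating, N = 4 (assumed recurrences)."""
    h, k, m = 1.0, 1.0, 1.0
    series = [h]
    for _ in range(generations):
        h, k, m = k, 0.125 * h + 0.5 * k + 0.25 * m, 0.5 * k + 0.5 * m
        series.append(h)
    return series


def first_generation_ahead(slow_start, fast_start):
    """First generation in which slow_start retains more heterozygosity."""
    for t, (a, b) in enumerate(zip(slow_start, fast_start)):
        if a > b:
            return t
    return None


cousin = double_first_cousin(50)
circular = circular_half_sib(50)
crossing = first_generation_ahead(circular, cousin)

for t in (6, 10, 15, 20, 30, 50):
    print(t, round(cousin[t], 3), round(circular[t], 3))
print("circular mating first ahead in generation", crossing)
```

Circular mating loses heterozygosity faster at first, but its larger limiting ratio means that it eventually retains more than double first-cousin mating.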
Notice that the eventual rate of decrease in heterozygosity is less in this system than with double first-cousin mating. Referring to Table 3.8.1, we see that the heterozygosity curves for the two systems cross at about the fifteenth generation.

The general principle is that more intense inbreeding produces a lower ultimate rate of decrease in heterozygosity, provided that there is no permanent splitting of the population into isolated lines. Conversely, a system that avoids mating of relatives for as long as possible does so at the expense of a more rapid final approach to homozygosis. The breeder therefore may choose a different system of mating if he is more interested in maximum heterozygosity during the initial generations than in the long time future population.

An extension of the procedures of this section to larger populations than N = 4 has been given by Kimura and Crow (1963). Robertson (1964) and Wright (1965a) have shown that many of these results can be brought together very generally under a single point of view. For other types of mating systems see Wright (1921, 1951). Many of Wright's earlier results are summarized by Li (1955).

4. Partial Self-fertilization
All the examples discussed thus far lead eventually to complete homozygosity. This is not always the case, and we shall now consider one such example. This is the simple, yet important, case where a certain fraction each generation are self-fertilized and the remainder are mated at random, a situation found in several plant species.
Let S be the fraction of the population that is produced by self-fertilization; then $1 - S$ is the fraction that is produced by random mating. From 3.8.1 we can write the expected recurrence relation for f as

$$f_t = S\left[\frac{1 + f_{t-1}}{2}\right] + (1 - S)(0) = \frac{S}{2}(1 + f_{t-1}). \tag{3.8.18}$$

This assumes that the plants to be self-fertilized each generation are a random sample of the population; for example, there is no tendency for the progeny of self-fertilized plants to be self-fertilized. Substituting $f_t = (H_0 - H_t)/H_0$ from 3.2.4 into 3.8.18 we get

$$H_t = H_0(1 - S) + \frac{S}{2} H_{t-1}. \tag{3.8.18a}$$

Subtracting $2(1 - S)H_0/(2 - S)$ from both sides and simplifying,

$$
\begin{aligned}
H_t - \frac{2(1 - S)}{2 - S} H_0
&= \frac{S}{2}\left[H_{t-1} - \frac{2(1 - S)}{2 - S} H_0\right] \\
&= \left(\frac{S}{2}\right)^2\left[H_{t-2} - \frac{2(1 - S)}{2 - S} H_0\right] \\
&= \left(\frac{S}{2}\right)^t\left[H_0 - \frac{2(1 - S)}{2 - S} H_0\right].
\end{aligned}
\tag{3.8.19}
$$
Figure 3.8.5. Change in heterozygosity with four mating systems. A. Self-fertilization; B. Sib mating; C. Double first-cousin mating; D. Circular half-sib mating. The ordinate is the heterozygosity relative to the starting population; the abscissa is the time in generations.
Since $(S/2)^t$ approaches 0 as t becomes large, the heterozygosity approaches a limit where the heterozygosity is a fraction $2(1 - S)/(2 - S)$ of its original value. The rate of approach is such that the departure from the equilibrium value is decreased by a fraction $1 - S/2$ each generation. Notice that when $S = 1$ we get the usual formula for self-fertilization.

This situation is striking in that unless S is large there is almost no cumulative effect; most of the effect occurs in the first generation. For example, with 10% self-fertilization, the initial heterozygosity is reduced by 5% in the first generation, but even when equilibrium is reached the reduction is only 5.3%!
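The 5% and 5.3% figures follow directly from the recurrence $H_t = H_0(1 - S) + (S/2)H_{t-1}$ and its limit. A minimal numerical sketch (function names are illustrative, and heterozygosity is expressed relative to $H_0 = 1$):

```python
def partial_selfing_series(s, generations, h0=1.0):
    """Heterozygosity H_t for t = 0..generations with a fraction s selfed.

    Iterates H_t = H_0 * (1 - s) + (s / 2) * H_{t-1}.
    """
    values = [h0]
    for _ in range(generations):
        values.append(h0 * (1 - s) + 0.5 * s * values[-1])
    return values


def equilibrium_heterozygosity(s, h0=1.0):
    """Limiting heterozygosity, 2(1 - s) / (2 - s) times the initial value."""
    return 2 * (1 - s) * h0 / (2 - s)


ten_percent = partial_selfing_series(0.1, 50)
print(round(1 - ten_percent[1], 4))                    # reduction after one generation
print(round(1 - equilibrium_heterozygosity(0.1), 4))   # reduction at equilibrium

# With s = 1 the recurrence reduces to halving each generation.
print(partial_selfing_series(1.0, 3))
```

With S = 0.1 the departure from equilibrium shrinks by a factor S/2 = 0.05 each generation, so the limit is reached almost at once.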
5. Repeated Backcrossing to the Same Strain

Frequently a plant breeder may wish to introduce one or more dominant genes from an extraneous source into a standard variety. For example he may have a highly desirable variety, except for its being susceptible to some disease. The resistant gene may exist in another strain which is less desirable in other respects. He can introduce this gene by crossing the two strains and then repeatedly crossing resistant plants to the susceptible strain. In this way the resistant gene is inserted into a genetic background that becomes more like the susceptible strain with each backcross. As another example, a mouse breeder may wish to introgress a new histocompatibility gene into a standard inbred strain.

It is clear that in recurrent backcrossing the number of loci that contain genes from both strains is reduced by half each generation. Thus, after t generations, a fraction equal to $1 - (1/2)^t$ of the loci carry genes only from the recurrent parental strain. After seven generations less than 1% of the loci contain a gene from the other parental strain. If the recurrent parental strain is homozygous, the heterozygosity will reduce by half each generation, as with self-fertilization.

However, genes that are linked to the resistance or histocompatibility gene will tend to remain heterozygous. The question of how large a linked region will remain after a certain number of generations of backcrossing has been investigated by Haldane (1936) and Fisher (1949).

Consider a chromosome segment on one side of the selected factor and let the length of the segment be 100x map units (see Figure 3.8.6).
Figure 3.8.6. A chromosome segment. One locus is selected during recurrent backcrossing and the problem is to determine the length of chromosome to the right of this locus that will be intact after t generations.
If there is no interference, the probability of no crossover in this interval in one generation is $e^{-x}$ (see Appendix A.5.6). The probability of no crossover in t generations is $e^{-tx}$. The chance of a crossover in the small interval x to $x + dx$ is dx, if we take this interval small enough that multiple crossovers can be ignored. The probability of a crossover in the interval dx sometime during t generations is $t\,dx$. Thus the probability after t generations of having had a crossover in the interval dx but not in the interval x is $e^{-tx}\,t\,dx$. Then the mean value of the intact interval x is
x
=
f
a l l = a2 2
.00 .53 .78 .90 . 98
Heritability is also defined in other ways. Sometimes it is defined as

$$H^2 = \frac{V_h}{V_t}.$$

This gives the fraction of the total variance that is attributable to differences among the genotypes. This is useful if the question is the relative influence of genotype and environment in determining phenotypic differences. This is sometimes called heritability in the broad sense.

On the other hand, the animal or plant breeder is less interested in this than he is in determining that part of the genotypic variability that is responsive to selection. This is $V_g$, so heritability is defined as

$$h^2 = \frac{V_g}{V_t},$$
and is often called heritability in the narrow sense. We have used the capital letter for the larger value and the small letter for the smaller. In some cases those components of epistatic variance which cannot be easily separated from the genic variance, and which contribute to the correlation between parent and offspring, are included. In this book, unless the contrary is specified, we shall use the word heritability to mean $V_g/V_t$.

With epistasis the problems become more complicated. The extension of the procedures we have been using to multiple interacting loci was developed originally by Fisher (1918) and Wright (1935), but has been extended more recently by Cockerham (1954) and Kempthorne (1954, 1955). The basic procedure is, as mentioned earlier, to take up as much variance as possible in the additive term, then as much of the remainder as possible with the dominance deviation, and what is left over is attributed to epistasis.

The epistatic terms can be broken further into those that are due to pairs of loci, then after removing this the remaining variance is associated with loci taken three or more at a time. The 3-locus epistasis can in turn be removed, and so on. One would expect that, unless there are very intricate interactions (for example, a trait that is found only when three rare genes are simultaneously present), most of the variance is removed by 2-factor combinations.

Wright (1935) showed that for one form of epistasis this is exactly true. This is the kind of interaction that occurs when there is selection for an intermediate phenotype, as when animals of intermediate size are more likely to survive and reproduce than those that are too small or too large. This introduces epistasis of a rather extreme type, since a gene that increases size will be favored in genotypes where most of the other genes are for small size but selected against when most of the others are for large size. Such
interaction must be quite frequent in occurrence. Wright assumed that the selective disadvantage is proportional to the square of the deviation from the optimum phenotype. With this model he showed that the total epistatic variance is simply the sum of all possible 2-locus components.

For more general situations 3-factor and higher interactions are undoubtedly involved, but they cannot ordinarily be measured and are probably usually small. In any case, we shall consider only the interactions of pairs of loci. Extensions to three or more are given by Kempthorne (1955, 1957).

The relationships are clearly seen in a two-way array (see Table 4.1.2). The items in the body of the table are the phenotypic measurements of each of the nine genotypes, each measured as a deviation from the population mean. The first two subscripts refer to the A locus and the second two to the B locus. Thus $a_{1112}$ is the amount by which the phenotype of $A_1A_1B_1B_2$ exceeds the population mean. Below each of the $a$'s is its frequency in the population.

Table 4.1.2. Basic calculations for subdividing the genotypic variance determined by two independent loci with two alleles each. The frequencies of alleles $A_1$, $A_2$, $B_1$, and $B_2$ are $p_1$, $p_2$, $q_1$, and $q_2$. The $a$'s are all measured as deviations from the average measurement in the population. The frequency of each class is given below the deviation.

              $A_1A_1$            $A_1A_2$             $A_2A_2$            MEAN
$B_1B_1$      $a_{1111}$          $a_{1211}$           $a_{2211}$          $a_{..11}$
              $p_1^2q_1^2$        $2p_1p_2q_1^2$       $p_2^2q_1^2$        $q_1^2$
$B_1B_2$      $a_{1112}$          $a_{1212}$           $a_{2212}$          $a_{..12}$
              $2p_1^2q_1q_2$      $4p_1p_2q_1q_2$      $2p_2^2q_1q_2$      $2q_1q_2$
$B_2B_2$      $a_{1122}$          $a_{1222}$           $a_{2222}$          $a_{..22}$
              $p_1^2q_2^2$        $2p_1p_2q_2^2$       $p_2^2q_2^2$        $q_2^2$
MEAN          $a_{11..}$          $a_{12..}$           $a_{22..}$          0
              $p_1^2$             $2p_1p_2$            $p_2^2$             1

              $A_1$        $A_2$        MEAN          $B_1$        $B_2$        MEAN
              $\alpha_1$   $\alpha_2$   0             $\beta_1$    $\beta_2$    0
              $p_1$        $p_2$        1             $q_1$        $q_2$        1

$$Z = V_g + V_d + V_i = p_1^2q_1^2a_{1111}^2 + 2p_1p_2q_1^2a_{1211}^2 + \cdots + p_2^2q_2^2a_{2222}^2$$

$$W = V_g + V_d = p_1^2a_{11..}^2 + 2p_1p_2a_{12..}^2 + p_2^2a_{22..}^2 + q_1^2a_{..11}^2 + 2q_1q_2a_{..12}^2 + q_2^2a_{..22}^2$$

$$U = V_g = 2(p_1\alpha_1^2 + p_2\alpha_2^2 + q_1\beta_1^2 + q_2\beta_2^2)$$
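At the bottom of the table are the weighted averages of each column. For example,

$$a_{11..} = q_1^2a_{1111} + 2q_1q_2a_{1112} + q_2^2a_{1122}.$$

Corresponding quantities for the B locus are in the right-hand column. The $\alpha$'s are obtained as before. Thus

$$\alpha_1 = p_1a_{11..} + p_2a_{12..}, \qquad \alpha_2 = p_1a_{12..} + p_2a_{22..}, \tag{4.1.17}$$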
and the $\beta$'s are corresponding quantities for the B locus. Putting all this together, we have the variance components as given in the bottom part of Table 4.1.2.

Table 4.1.3 gives two numerical illustrations. In both cases epistasis and dominance are complete. In the left the dominant genes are complementary and in the right the dominant genes are duplicate, representing extreme cases of diminishing and reinforcing epistasis. We have assumed that the two phenotypes are equally frequent. This necessitates that the dominant allele frequencies are 0.459 and 0.159 in the left and right tables, respectively.

Table 4.1.3. Numerical examples of two contrasting directions of epistatic deviations. On the left the dominant alleles are complementary; on the right, they are duplicate. These lead to the classical 9 : 7 and 15 : 1 Mendelian ratios. The gene frequencies are adjusted so that the two phenotypes are equally frequent in both examples; thus the deviations are equal and have been scaled to make the total variance 1.

COMPLEMENTARY (9 : 7)                 DUPLICATE (15 : 1)
        AA     Aa     aa                      AA     Aa     aa
BB       1      1     -1              BB       1      1      1
Bb       1      1     -1              Bb       1      1      1
bb      -1     -1     -1              bb       1      1     -1

$p_A = q_B = .459$                    $p_A = q_B = .159$
$V_g = .582$                          $V_g = .757$
$V_d = .247$                          $V_d = .072$
$V_i = .172$                          $V_i = .172$
$V_t = 1.000$                         $V_t = 1.000$

Notice that in both cases, despite the complete epistasis, only 17% of the total variance appears in the epistatic term. The ratio of the additive or genic to the dominance variance is larger in the second case. This accords with the results in Table 4.1.1; the dominance component decreases as the dominant gene frequency decreases.

These examples are intended to illustrate the underlying principles. In practice one is dealing with quantitative traits such as size, weight, or fitness, and the effect of individual genes cannot be ascertained. What is observed is a set of cumulative effects of many genes reflected in correlations or covariances between relatives. From these the variance components can often be inferred.

The correlations between relatives will be considered later in the chapter, but we need to be able to subdivide the epistatic contributions further in order to study the correlations. This is necessary because the different epistatic components contribute differently to the covariances between individuals of different degrees of relationship.

Again we consider only two loci with two alleles at each locus. The epistasis may be broken down into interaction of the additive or genic components at the two loci, interaction between the additive component of one locus and the dominance component of the other, and interaction between the dominance components. We shall designate these as $V_{AA}$, $V_{AD}$, and $V_{DD}$. Thus

$$V_i = V_{AA} + V_{AD} + V_{DD}. \tag{4.1.18}$$
The theory for such subdivision follows the principles of factorial experimental design (Fisher 1935) and is described by Kempthorne (1957). We shall not attempt a proof, but will show a simple procedure for obtaining these quantities.

It is simpler to rearrange the phenotype values as in Table 4.1.4. Remember that the $a$'s are deviations from the population mean. The quantities in parentheses are the weighted means of the two immediately adjacent values. For example,

$$a_{111.} = q_1a_{1111} + q_2a_{1112}.$$

Likewise the values along the bottom are the means of the columns. For example,

$$a_{1.1.} = p_1q_1a_{1111} + p_1q_2a_{1112} + p_2q_1a_{1211} + p_2q_2a_{1212}.$$
Table 4.1.4. Basic calculations for subdividing the 2-allele, 2-locus epistasis into additive × additive, additive × dominance, and dominance × dominance components. The gametes and their frequencies are given at the upper and left margins and the phenotypes in the center. Values in parentheses are the weighted means of the two immediately adjacent values. All values are measured as deviations from the population mean.

GAMETES               $A_1B_1$                      $A_1B_2$        $A_2B_1$                      $A_2B_2$
                      $p_1q_1$                      $p_1q_2$        $p_2q_1$                      $p_2q_2$

$A_1B_1$  $p_1q_1$    $a_{1111}$    ($a_{111.}$)    $a_{1112}$      $a_{1211}$    ($a_{121.}$)    $a_{1212}$
                      ($a_{1.11}$)                  ($a_{1.12}$)    ($a_{2.11}$)                  ($a_{2.12}$)
$A_2B_1$  $p_2q_1$    $a_{1211}$    ($a_{121.}$)    $a_{1212}$      $a_{2211}$    ($a_{221.}$)    $a_{2212}$

$A_1B_2$  $p_1q_2$    $a_{1112}$    ($a_{112.}$)    $a_{1122}$      $a_{1212}$    ($a_{122.}$)    $a_{1222}$
                      ($a_{1.12}$)                  ($a_{1.22}$)    ($a_{2.12}$)                  ($a_{2.22}$)
$A_2B_2$  $p_2q_2$    $a_{1212}$    ($a_{122.}$)    $a_{1222}$      $a_{2212}$    ($a_{222.}$)    $a_{2222}$

Mean                  $a_{1.1.}$                    $a_{1.2.}$      $a_{2.1.}$                    $a_{2.2.}$

$$X = 4(p_1q_1a_{1.1.}^2 + p_1q_2a_{1.2.}^2 + p_2q_1a_{2.1.}^2 + p_2q_2a_{2.2.}^2)$$

$$Y = 2(p_1^2q_1a_{111.}^2 + p_1p_2q_1a_{121.}^2 + \cdots + p_2^2q_2a_{222.}^2 + p_1q_1^2a_{1.11}^2 + \cdots + p_2q_2^2a_{2.22}^2)$$
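Now we can put the various formulae together. Some are from Table 4.1.2 and the rest from Table 4.1.4.

$$\begin{aligned}
U &= V_g, \\
W &= V_g + V_d, \\
X &= 2V_g + V_{AA}, \\
Y &= 3V_g + 2V_d + 2V_{AA} + V_{AD}, \\
Z &= V_g + V_d + V_{AA} + V_{AD} + V_{DD}.
\end{aligned} \tag{4.1.19}$$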
From these we readily obtain

$$\begin{aligned}
V_d &= -U + W, \\
V_{AA} &= -2U + X, \\
V_{AD} &= 3U - 2W - 2X + Y, \\
V_{DD} &= -U + W + X - Y + Z.
\end{aligned} \tag{4.1.20}$$

The proof of these relations and extensions to multiple alleles and multiple loci are given by Kempthorne (1954, 1955). An example is given in Table 4.1.5. Again, complementary dominant genes are assumed, but this time we have let the gene frequencies be 1/2 for each allele.
Table 4.1.5. A numerical example; two loci, two alleles at each, complete dominance, complete complementary epistasis. The phenotypic values are given at the left, the deviations from the population mean at the right. The gene frequencies are all equal to 1/2.

PHENOTYPIC VALUES                 DEVIATIONS FROM THE MEAN
        AA     Aa     aa                  AA       Aa       aa
BB     101    101    100          BB     7/16     7/16    -9/16
Bb     101    101    100          Bb     7/16     7/16    -9/16
bb     100    100    100          bb    -9/16    -9/16    -9/16

COMPONENT     VARIANCE     FRACTION OF TOTAL VARIANCE
$V_g$         .1406        .571
$V_d$         .0703        .286
$V_{AA}$      .0156        .064
$V_{AD}$      .0156        .064
$V_{DD}$      .0039        .016

The three epistatic components together account for .143 of the total variance.
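The entries of Table 4.1.5 can be checked numerically. The following is a minimal sketch, not the authors' procedure as printed: it assumes random mating and independent loci, computes U, W, X, Y, and Z as weighted sums of squared marginal means, and then applies 4.1.20. The function name `two_locus_components` and its callable interface are hypothetical, introduced only for illustration.

```python
from fractions import Fraction
from itertools import product


def two_locus_components(value, p1, q1):
    """Return (Vg, Vd, VAA, VAD, VDD) for two loci with two alleles each.

    value(i, j, k, l) is the phenotype of A_i A_j B_k B_l (alleles numbered
    1 and 2) and must not depend on allele order within a locus.
    p1 and q1 are the frequencies of A_1 and B_1.
    """
    p = {1: p1, 2: 1 - p1}
    q = {1: q1, 2: 1 - q1}
    al = (1, 2)
    quads = list(product(al, repeat=4))
    mean = sum(p[i] * p[j] * q[k] * q[l] * value(i, j, k, l)
               for i, j, k, l in quads)

    def avg(fixed):
        """Mean deviation with the None positions of `fixed` averaged out.

        Positions are (A, A, B, B); avg((1, None, 1, None)) is a_{1.1.}.
        """
        total = 0
        for g in quads:
            if all(f is None or f == x for f, x in zip(fixed, g)):
                w = 1
                for pos, (f, x) in enumerate(zip(fixed, g)):
                    if f is None:
                        w *= p[x] if pos < 2 else q[x]
                total += w * (value(*g) - mean)
        return total

    U = 2 * (sum(p[i] * avg((i, None, None, None)) ** 2 for i in al)
             + sum(q[k] * avg((None, None, k, None)) ** 2 for k in al))
    W = (sum(p[i] * p[j] * avg((i, j, None, None)) ** 2 for i in al for j in al)
         + sum(q[k] * q[l] * avg((None, None, k, l)) ** 2 for k in al for l in al))
    X = 4 * sum(p[i] * q[k] * avg((i, None, k, None)) ** 2
                for i in al for k in al)
    Y = 2 * (sum(p[i] * p[j] * q[k] * avg((i, j, k, None)) ** 2
                 for i in al for j in al for k in al)
             + sum(p[i] * q[k] * q[l] * avg((i, None, k, l)) ** 2
                   for i in al for k in al for l in al))
    Z = sum(p[i] * p[j] * q[k] * q[l] * avg((i, j, k, l)) ** 2
            for i, j, k, l in quads)
    # Equations 4.1.20, with Vg = U.
    return U, W - U, X - 2 * U, 3 * U - 2 * W - 2 * X + Y, -U + W + X - Y + Z


def complementary(i, j, k, l):
    """Table 4.1.5: value 101 requires a dominant allele (1) at both loci."""
    return 101 if 1 in (i, j) and 1 in (k, l) else 100


half = Fraction(1, 2)
components = two_locus_components(complementary, half, half)
# (9/64, 9/128, 1/64, 1/64, 1/256), i.e. .1406, .0703, .0156, .0156, .0039
print([round(float(v), 4) for v in components])
```

Calling the same function with phenotypes scored +1 and -1 and a dominant allele frequency of about .459 gives a check on the complementary example of Table 4.1.3.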
4.2 Variance Components with Dominance and Inbreeding
In Section 3.10 of the previous chapter we showed that the genotypic variance increases if there are consanguineous matings within the population. We are referring to the whole population, not to a subpopulation if the inbreeding is such as to break the population into groups. Equation 3.10.3 is

$$V_f = (1 - f)V_0 + fV_1 + f(1 - f)(\bar Y_0 - \bar Y_1)^2, \tag{4.2.1}$$

where $V_0$ and $V_1$ are the variances of a randomly mating ($f = 0$) and a completely inbred population ($f = 1$). $\bar Y_0$ and $\bar Y_1$ are the corresponding means. Unless the two means are the same, which would be true if there were no dominance, the variance is a quadratic function of $f$.

We now inquire as to how the genic and dominance components change with $f$. We generalize the model to include inbreeding.

GENOTYPE            $A_1A_1$                  $A_1A_2$                    $A_2A_2$
FREQUENCY           $p_1^2 + p_1p_2f$         $2p_1p_2(1 - f)$            $p_2^2 + p_1p_2f$
GENOTYPIC VALUE     $Y_{11} = a + a_{11}$     $Y_{12} = a + a_{12}$       $Y_{22} = a + a_{22}$
GENIC VALUE         $a + 2\alpha_1$           $a + \alpha_1 + \alpha_2$   $a + 2\alpha_2$

Following exactly the same procedure as before, we see that again $p_1\alpha_1 + p_2\alpha_2 = 0$. The genic variance is

$$V_g = 2(1 + f)(p_1\alpha_1^2 + p_2\alpha_2^2). \tag{4.2.2}$$

This, which corresponds to 4.1.6, would seem to imply that as the population is inbred the genic variance is simply multiplied by $1 + f$, as is true for a gene with no dominance (see 3.10.6). But this is misleading, for the $\alpha$'s also change as the genotype frequencies change with inbreeding. What we need is an expression for $V_g$ in terms of quantities that do not change with $f$.

Continuing, if we go through the same least squares process as before, we have

$$Q = (p_1^2 + p_1p_2f)(a_{11} - 2\alpha_1)^2 + 2p_1p_2(1 - f)(a_{12} - \alpha_1 - \alpha_2)^2 + (p_2^2 + p_1p_2f)(a_{22} - 2\alpha_2)^2. \tag{4.2.3}$$
Setting the derivatives of $Q$ with respect to $\alpha_1$ and $\alpha_2$ equal to zero, we solve for $\alpha_1$ and $\alpha_2$, getting

$$\alpha_1 = \frac{(p_1 + p_2f)a_{11} + p_2(1 - f)a_{12}}{1 + f}$$

and

$$\alpha_2 = \frac{p_1(1 - f)a_{12} + (p_2 + p_1f)a_{22}}{1 + f}. \tag{4.2.4}$$

Substituting these into 4.2.2 gives

$$V_g = \frac{2p_1p_2[(p_1 + fp_2)(a_{11} - a_{12}) + (p_2 + fp_1)(a_{12} - a_{22})]^2}{1 + f}. \tag{4.2.5}$$
This shows that the genic variance does not change in a linear way with $f$. Notice that $a_{11}$, $a_{12}$, and $a_{22}$ are measured as deviations from a mean $a$. But $a$ is not constant, since it changes with $f$. Yet the quantities $(a_{11} - a_{12})$ and $(a_{12} - a_{22})$ do not change, since they are the same as $Y_{11} - Y_{12}$ and $Y_{12} - Y_{22}$, which are constant. Note that when $a_{11} - a_{12} = a_{12} - a_{22}$ (no dominance), 4.2.5 becomes

$$V_g = 2p_1p_2(1 + f)(a_{11} - a_{12})^2, \tag{4.2.6}$$

linear in $f$ as expected (compare 3.10.6).

We have not given an explicit formula for $V_h$ comparable to 4.2.5 for $V_g$. This can be calculated from 4.2.1. Let

$$Y_{11} - Y_{12} = a_{11} - a_{12} = A, \qquad Y_{12} - Y_{22} = a_{12} - a_{22} = B.$$

Then

$$V_0 = 2p_1p_2(p_1A + p_2B)^2 + p_1^2p_2^2(A - B)^2, \tag{4.2.7}$$

$$V_1 = p_1p_2(A + B)^2, \tag{4.2.8}$$

$$\bar Y_0 - \bar Y_1 = -p_1p_2(A - B). \tag{4.2.9}$$
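A short numerical sketch may help in following how these pieces fit together. It assumes that 4.2.1 has the form $V_f = (1 - f)V_0 + fV_1 + f(1 - f)(\bar Y_0 - \bar Y_1)^2$ and compares the result with a variance computed directly from the genotype frequencies. The function names are hypothetical, and the example values (complete dominance, recessive gene frequency 1/3) are chosen to match the case plotted in Figure 4.2.1.

```python
from fractions import Fraction


def variance_with_inbreeding(p1, A, B, f):
    """Return (V_g, V_h) at inbreeding coefficient f.

    A = Y11 - Y12 and B = Y12 - Y22; p1 is the frequency of allele A_1.
    """
    p2 = 1 - p1
    v0 = 2 * p1 * p2 * (p1 * A + p2 * B) ** 2 + (p1 * p2) ** 2 * (A - B) ** 2  # 4.2.7
    v1 = p1 * p2 * (A + B) ** 2                                                # 4.2.8
    mean_diff = -p1 * p2 * (A - B)                                             # 4.2.9
    v_h = (1 - f) * v0 + f * v1 + f * (1 - f) * mean_diff ** 2   # assumed form of 4.2.1
    v_g = 2 * p1 * p2 * ((p1 + f * p2) * A + (p2 + f * p1) * B) ** 2 / (1 + f)  # 4.2.5
    return v_g, v_h


def direct_total_variance(p1, y11, y12, y22, f):
    """Genotypic variance computed straight from the genotype frequencies."""
    p2 = 1 - p1
    freq = (p1 * p1 + p1 * p2 * f, 2 * p1 * p2 * (1 - f), p2 * p2 + p1 * p2 * f)
    vals = (y11, y12, y22)
    mean = sum(w * y for w, y in zip(freq, vals))
    return sum(w * (y - mean) ** 2 for w, y in zip(freq, vals))


# Complete dominance (Y11 = Y12 = 1, Y22 = 0), recessive gene frequency 1/3.
p_dominant = Fraction(2, 3)
for f in (Fraction(0), Fraction(1, 2), Fraction(1)):
    v_g, v_h = variance_with_inbreeding(p_dominant, 0, 1, f)
    print(f, v_g, v_h, v_g / v_h)   # the last column is the heritability
```

With these values the genic variance rises from half of the total at $f = 0$ to the whole of it at $f = 1$.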
Equations 4.2.7 to 4.2.9 may be substituted directly into 4.2.1, which then gives $V_h$ for any value of $f$. The heritability is given by the ratio of 4.2.5 to 4.2.1. Equation 4.2.7 was obtained from 4.1.10 and 4.1.11; 4.2.8 and 4.2.9 were calculated from the table at the beginning of this section.

Notice that when $f = 1$, the genic variance becomes the same as the total variance and the dominance variance disappears. This is not surprising, for the dominance variance depends on the extent to which the heterozygote departs from the mean of the homozygotes; with no heterozygotes it has no meaning. Heritability increases with inbreeding because, although the total variance increases, the genic variance increases more rapidly and becomes a larger fraction of the total. This is illustrated in Figure 4.2.1.

Figure 4.2.1. The changes in variance components with inbreeding. Notice that the dominance variance, which is the distance between the two lines, decreases to 0 as $f$ approaches 1. Complete dominance is assumed, with the recessive gene frequency equal to 0.333. (Horizontal axis: $f$, from 0 to 1.0.)

4.3 Identity Relations Between Relatives
We introduced in Section 3.2 the coefficients of inbreeding and of consanguinity. These measure the probability that two homologous genes drawn at random from an individual or from each of two individuals are identical by descent. We wish now to extend the ideas to measure different identity relations among diploid individuals who may be identical for neither, for one, or for both of their genes.

The method that we are using was first developed by Cotterman (1940). Cotterman worked only with the relationships between two individuals, neither of which was inbred. The extension of the method to include the relationships between two inbred individuals has been made by Denniston (1967). We shall consider only the simpler case where neither of the two individuals is inbred.

Consider two related individuals, I and J. We define the Cotterman k-coefficients as:

$k_0$ = the probability that no two genes at the locus are identical,
$2k_1$ = the probability that one gene in I is identical to one gene in J, but not both,
$k_2$ = the probability that both genes in I are identical to those in J.
For example,

GENOTYPE OF I     GENOTYPE OF J     k-PROBABILITY
$A_1A_2$          $A_1A_2$          $k_2$
$A_1A_2$          $A_1A_3$          $k_1$
$A_1A_2$          $A_3A_2$          $k_1$
$A_1A_2$          $A_3A_4$          $k_0$
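More precisely, if $a$ and $b$ are the two genes in I and $c$ and $d$ are the two in J, as shown in Figure 4.3.1, then (using $\equiv$ to mean "are identical by descent") the k-coefficients are:

$$\begin{aligned}
k_2 &= \text{Prob}\{[(a \equiv c) \text{ and } (b \equiv d)] \text{ or } [(a \equiv d) \text{ and } (b \equiv c)]\}, \\
2k_1 &= \text{Prob}\{[(a \equiv c) \text{ and } (b \not\equiv d)] \text{ or } [(a \equiv d) \text{ and } (b \not\equiv c)] \\
&\qquad \text{ or } [(b \equiv c) \text{ and } (a \not\equiv d)] \text{ or } [(b \equiv d) \text{ and } (a \not\equiv c)]\}, \\
k_0 &= \text{Prob}[a \not\equiv c,\ a \not\equiv d,\ b \not\equiv c, \text{ and } b \not\equiv d].
\end{aligned} \tag{4.3.1}$$

Notice that, if either or both of the two individuals I and J are inbred, there are other possible relations, such as $a \equiv b \equiv c \not\equiv d$ or $a \equiv b \equiv c \equiv d$. But without inbreeding of I or J the only possibilities are those measured by $k_0$, $k_1$, and $k_2$.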
Figure 4.3.1. A diagram to show gene-identity relationships. The small letters refer to the genes in the gametes produced by the individuals designated by large letters. The curved arrows indicate possible consanguinity farther back in the pedigree.
To compute these k-coefficients we make use of the consanguinity coefficients of the parents. Referring again to Figure 4.3.1, we observe that

$$k_2 = f_{AC}f_{BD} + f_{AD}f_{BC}, \tag{4.3.2}$$

where $f$ is the coefficient of consanguinity as defined in Chapter 3, Section 3. This follows immediately since the four $f$'s are the probabilities that $(a \equiv c)$, $(b \equiv d)$, $(a \equiv d)$, and $(b \equiv c)$. Furthermore,

$$\begin{aligned}
2k_1 &= f_{AC}(1 - f_{BD}) + f_{AD}(1 - f_{BC}) + f_{BC}(1 - f_{AD}) + f_{BD}(1 - f_{AC}) \\
&= f_{AC} + f_{AD} + f_{BC} + f_{BD} - 2(f_{AC}f_{BD} + f_{AD}f_{BC}) \\
&= 4f_{IJ} - 2k_2.
\end{aligned} \tag{4.3.3}$$

We can obtain $k_0$ by subtraction since $k_0 + 2k_1 + k_2 = 1$.

As examples, consider the three sets of relationships shown in Figure 4.3.2. With ordinary single cousins

$$f_{BC} = 1/4, \quad f_{AC} = f_{AD} = f_{BD} = 0, \quad k_2 = 0, \quad 2k_1 = 4f_{IJ} = 1/4, \quad k_0 = 3/4.$$

With double first cousins

$$f_{AC} = f_{BD} = 1/4, \quad f_{BC} = f_{AD} = 0, \quad k_2 = 1/16, \quad 2k_1 = 6/16, \quad k_0 = 9/16.$$
Figure 4.3.2. Single first cousins, double first cousins, and quadruple half-first cousins.
Finally, with quadruple half-first cousins,

$$f_{AC} = f_{AD} = f_{BC} = f_{BD} = 1/8, \quad k_2 = 1/32, \quad 2k_1 = 14/32, \quad k_0 = 17/32.$$

The coefficient of consanguinity $f_{IJ} = (k_1 + k_2)/2 = 1/8$, the same as for double first cousins, as expected.

Figure 4.3.3. Full sibs, and parent and offspring.
For the k-coefficients of sibs we treat the pedigree as if two of the ancestors were collapsed into one, as shown in Figure 4.3.3, and write the $f$'s in terms of the gametes. Thus

$$f_{ac} = f_{bd} = 1/2, \quad f_{ad} = f_{bc} = 0,$$

and

$$k_2 = 1/4, \quad 2k_1 = 1/2, \quad k_0 = 1/4.$$

There is also a slight complication if one individual is an ancestor of the other, as in the parent-offspring relationship shown in Figure 4.3.3. In this case, we have drawn the relevant gametes; this makes the situation clear.
$$f_{ac} = f_{bc} = 1/2, \quad f_{ad} = f_{bd} = 0, \quad k_2 = 0, \quad 2k_1 = 1, \quad k_0 = 0.$$
The results are as would be expected. A parent and child who are otherwise unrelated must share one and only one gene at a locus.

As stated earlier, we assume that neither I nor J is inbred. However, it is all right for other individuals in the pedigree to be inbred. The same rules apply as in the computation of the inbreeding coefficient under ordinary circumstances. Inbreeding is irrelevant for all members of a path except for the common ancestor; if the common ancestor is inbred the path is multiplied by $1 + f$, just as in computing the inbreeding and consanguinity coefficients. Likewise the rules for X-linked traits are still good. Males in a path are not counted and any path with successive males is ignored. Of course the k-coefficients, since they apply to diploids, are meaningful only for females.

The k-coefficients are of particular use in two contexts. One is for the computation of the solution to such problems as: Given that I and J are related and I is genotype $QQ$, what is the probability that J is also? Suppose that $p$ is the gene frequency. The probability, then, is $k_2 + 2k_1p + k_0p^2$. For example, if I and J are double first cousins and I has a recessive disease of incidence $p^2$, then the probability of J being affected is $(1 + 6p + 9p^2)/16$.

The second use is in determining the correlation between relatives when there is dominance. That is the subject of the next section.
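The calculations of this section are easy to mechanize. The sketch below is only an illustration: it applies 4.3.3 together with $k_2 = f_{AC}f_{BD} + f_{AD}f_{BC}$ (the form of 4.3.2 implied by 4.3.3) to the relationships worked out above, and evaluates the probability $k_2 + 2k_1p + k_0p^2$. The function names are hypothetical, and exact fractions are used so that the results can be compared with the text.

```python
from fractions import Fraction as Fr


def k_coefficients(f_ac, f_ad, f_bc, f_bd):
    """Return (k0, 2*k1, k2) for two non-inbred relatives I and J.

    The arguments are the consanguinity coefficients between the parents
    A, B of I and C, D of J (or between the gametes a, b and c, d when the
    pedigree is collapsed, as for sibs and for parent and offspring).
    """
    k2 = f_ac * f_bd + f_ad * f_bc                    # 4.3.2
    two_k1 = f_ac + f_ad + f_bc + f_bd - 2 * k2       # 4.3.3
    k0 = 1 - two_k1 - k2
    return k0, two_k1, k2


def prob_also_homozygous(k, p):
    """P(J is QQ, given I is QQ) = k2 + 2*k1*p + k0*p**2, p = frequency of Q."""
    k0, two_k1, k2 = k
    return k2 + two_k1 * p + k0 * p * p


zero = Fr(0)
single_cousins = k_coefficients(zero, zero, Fr(1, 4), zero)
double_cousins = k_coefficients(Fr(1, 4), zero, zero, Fr(1, 4))
quadruple_half = k_coefficients(Fr(1, 8), Fr(1, 8), Fr(1, 8), Fr(1, 8))
full_sibs = k_coefficients(Fr(1, 2), zero, zero, Fr(1, 2))
parent_child = k_coefficients(Fr(1, 2), zero, Fr(1, 2), zero)

# Double first cousins, recessive gene frequency 1/10:
# (1 + 6p + 9p^2)/16 = 169/1600
print(prob_also_homozygous(double_cousins, Fr(1, 10)))
```

Returning $2k_1$ rather than $k_1$ keeps the three values summing to 1, which is a convenient check on any set of inputs.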
4.4 Correlation Between Relatives
The procedure for determining the correlations between relatives, with dominance, but restricted to individuals that are not inbred, is now given. As in the previous section it is all right for other individuals in the pedigree to be inbred, but not for the two individuals under consideration. The necessary calculations are set forth in Table 4.4.1. The covariance of X and Y is

$$\begin{aligned}
C_{XY} &= k_0(p_1^4a_{11}^2 + 4p_1^3p_2a_{11}a_{12} + 4p_1^2p_2^2a_{12}^2 + 2p_1^2p_2^2a_{11}a_{22} + 4p_1p_2^3a_{12}a_{22} + p_2^4a_{22}^2) \\
&\quad + 2k_1(p_1^3a_{11}^2 + 2p_1^2p_2a_{11}a_{12} + p_1p_2a_{12}^2 + 2p_1p_2^2a_{12}a_{22} + p_2^3a_{22}^2) \\
&\quad + k_2(p_1^2a_{11}^2 + 2p_1p_2a_{12}^2 + p_2^2a_{22}^2).
\end{aligned}$$
Table 4.4.1. The calculation of the correlation between relatives in terms of gene effects, gene frequencies, and k-coefficients. The values $a_{ij}$ are assumed to be measured as deviations from the mean to simplify the arithmetic.

GENOTYPES               PHENOTYPIC VALUES         FREQUENCY OF THIS COMBINATION
X           Y           X           Y
$A_1A_1$    $A_1A_1$    $a_{11}$    $a_{11}$      $k_0p_1^4 + 2k_1p_1^3 + k_2p_1^2$
$A_1A_1$    $A_1A_2$    $a_{11}$    $a_{12}$      $2[k_0 2p_1^3p_2 + 2k_1p_1^2p_2]$ (both orders together)
$A_1A_2$    $A_1A_1$    $a_{12}$    $a_{11}$
$A_1A_2$    $A_1A_2$    $a_{12}$    $a_{12}$      $k_0 4p_1^2p_2^2 + 2k_1p_1p_2 + k_2 2p_1p_2$
$A_1A_1$    $A_2A_2$    $a_{11}$    $a_{22}$      $2[k_0p_1^2p_2^2]$ (both orders together)
$A_2A_2$    $A_1A_1$    $a_{22}$    $a_{11}$
$A_1A_2$    $A_2A_2$    $a_{12}$    $a_{22}$      $2[k_0 2p_1p_2^3 + 2k_1p_1p_2^2]$ (both orders together)
$A_2A_2$    $A_1A_2$    $a_{22}$    $a_{12}$
$A_2A_2$    $A_2A_2$    $a_{22}$    $a_{22}$      $k_0p_2^4 + 2k_1p_2^3 + k_2p_2^2$
But the coefficient of $k_0$ is

$$(p_1^2a_{11} + 2p_1p_2a_{12} + p_2^2a_{22})^2,$$

which, by 4.1.2, is 0. The coefficient of $k_1$ is equivalent to

$$2[p_1(p_1a_{11} + p_2a_{12})^2 + p_2(p_1a_{12} + p_2a_{22})^2],$$

which, from 4.1.9, is

$$2(p_1\alpha_1^2 + p_2\alpha_2^2),$$

which, by 4.1.5, is $V_g$. Finally, the coefficient of $k_2$ is $V_h$, from 4.1.4. Therefore, summing over all loci,

$$C_{XY} = k_1V_g + k_2V_h,$$

or, since $V_h = V_g + V_d$ in the absence of epistatic interaction,

$$C_{XY} = (k_1 + k_2)V_g + k_2V_d. \tag{4.4.1}$$

From the definition of the correlation coefficient (A.6.3) the correlation between two relatives, X and Y, neither of which is inbred, is

$$r_{XY} = (k_1 + k_2)\frac{V_g}{V_t} + k_2\frac{V_d}{V_t}. \tag{4.4.2}$$

Essentially the same results were derived by quite a different procedure by Fisher (1918).
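The covariance formula can also be checked by brute force for a single locus. The following sketch, offered only as an illustration, enumerates the three identity states directly (both genes shared, one shared, none shared) rather than using Table 4.4.1, and compares the result with $k_1V_g + k_2V_h$. The function names and the numerical example (complete dominance, dominant gene frequency 1/3) are assumptions made for the illustration.

```python
from fractions import Fraction as Fr
from itertools import product


def covariance_by_enumeration(k0, two_k1, k2, p1, y11, y12, y22):
    """Covariance of genotypic values of two non-inbred relatives, one locus."""
    p = {1: p1, 2: 1 - p1}
    y = {(1, 1): y11, (1, 2): y12, (2, 2): y22}
    al = (1, 2)
    mean = sum(p[i] * p[j] * y[tuple(sorted((i, j)))] for i in al for j in al)

    def a(i, j):                      # deviation from the population mean
        return y[tuple(sorted((i, j)))] - mean

    # Both genes identical: the relatives have the same genotype.
    both = sum(p[i] * p[j] * a(i, j) ** 2 for i in al for j in al)
    # One gene s shared; the remaining genes u and v are independent.
    one = sum(p[s] * p[u] * p[v] * a(s, u) * a(s, v)
              for s, u, v in product(al, repeat=3))
    # No gene shared: the two genotypes are independent.
    none = sum(p[i] * p[j] * p[u] * p[v] * a(i, j) * a(u, v)
               for i, j, u, v in product(al, repeat=4))
    return k0 * none + two_k1 * one + k2 * both


def genic_and_genotypic_variance(p1, y11, y12, y22):
    """Return (V_g, V_h) for one locus under random mating."""
    p2 = 1 - p1
    mean = p1 * p1 * y11 + 2 * p1 * p2 * y12 + p2 * p2 * y22
    a11, a12, a22 = y11 - mean, y12 - mean, y22 - mean
    alpha1 = p1 * a11 + p2 * a12
    alpha2 = p1 * a12 + p2 * a22
    v_g = 2 * (p1 * alpha1 ** 2 + p2 * alpha2 ** 2)
    v_h = p1 * p1 * a11 ** 2 + 2 * p1 * p2 * a12 ** 2 + p2 * p2 * a22 ** 2
    return v_g, v_h


p1 = Fr(1, 3)
v_g, v_h = genic_and_genotypic_variance(p1, 1, 1, 0)
sib_cov = covariance_by_enumeration(Fr(1, 4), Fr(1, 2), Fr(1, 4), p1, 1, 1, 0)
print(sib_cov, Fr(1, 4) * v_g + Fr(1, 4) * v_h)   # both are 1/9
```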
As a numerical example, consider some correlations on human height. Fisher estimated $V_g/V_t = .74$ and $V_d/V_t = .26$. He found no appreciable environmental component in the population studied. From this, the correlation between parent and child, if marriages were random with respect to height, would be

$$r = \tfrac{1}{2}(.74) = .37,$$

and, for sibs,

$$r = \tfrac{1}{2}(.74) + \tfrac{1}{4}(.26) = .44.$$

Actually, as we shall see later in this chapter, the correlations are considerably increased by the strong assortative marriage for height.

Notice that if there is no dominance, epistasis, or environmental effect, then the correlation becomes simply

$$r_{XY} = 2f_{XY}, \tag{4.4.3}$$

where $f$ is the consanguinity coefficient.

The correlation is the covariance divided by the geometric mean of the two variances. For additively acting genes (no dominance) the effect of the two sets of genes in the zygote is the sum of the two haploid gametic sets. The covariance of $w + x$ with $y + z$ is not influenced by the correlation between $w$ and $x$ or that between $y$ and $z$. On the other hand, the genotypic variance does increase with such a correlation, and in fact is multiplied by $1 + f$, where $f$ is the inbreeding coefficient (or the correlation between alleles). This was shown in equation 3.10.6. Thus, if there is inbreeding, the denominator is increased and the correlation between two inbred individuals becomes

$$r_{XY} = \frac{2f_{XY}}{\sqrt{(1 + f_X)(1 + f_Y)}} \tag{4.4.4}$$

when there is no dominance, epistasis, or environmental effect. Notice, by comparing with 3.3.1, that this is Wright's coefficient of relationship. In fact the original derivation of Wright's measure was through correlation analysis, and his intention was to have the relationship coefficient reflect the correlation between the gene values of the two individuals.

The extension of 4.4.1 to include epistasis is straightforward, but we shall simply give the results rather than the derivations. For two loci, when epistasis is considered, 4.4.1 becomes

$$C_{XY} = (k_1 + k_2)V_g + k_2V_d + (k_1 + k_2)^2V_{AA} + (k_1 + k_2)k_2V_{AD} + k_2^2V_{DD}. \tag{4.4.5}$$
The extension to more than two loci is direct. For an epistatic interaction between additive effects at $r$ loci and dominance effects at $s$ loci, the coefficient is

$$(k_1 + k_2)^r k_2^s. \tag{4.4.6}$$

Some examples are given in Table 4.4.2.

Table 4.4.2. Covariances between relatives of different degree in terms of variance components in a population mating at random. These quantities, when divided by the total (phenotypic) variance, $V_t$, give the correlations.

RELATIONSHIP                                  COVARIANCE
Sib                                           $\tfrac{1}{2}V_g + \tfrac{1}{4}V_d + \tfrac{1}{4}V_{AA} + \tfrac{1}{8}V_{AD} + \tfrac{1}{16}V_{DD}$
Parent-offspring                              $\tfrac{1}{2}V_g + \tfrac{1}{4}V_{AA}$
Half-sibs, Uncle-niece, Parent-grandchild     $\tfrac{1}{4}V_g + \tfrac{1}{16}V_{AA}$
First cousins                                 $\tfrac{1}{8}V_g + \tfrac{1}{64}V_{AA}$
Double first cousins                          $\tfrac{1}{4}V_g + \tfrac{1}{16}V_d + \tfrac{1}{16}V_{AA} + \tfrac{1}{64}V_{AD} + \tfrac{1}{256}V_{DD}$
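The coefficients in Table 4.4.2 follow mechanically from the k-coefficients. As a rough sketch, the snippet below generates them from the rule that a component involving $r$ additive and $s$ dominance factors has coefficient $(k_1 + k_2)^r k_2^s$. The dictionary of relationships and the function name are hypothetical; the $(k_1, k_2)$ pairs are those of Section 4.3, with the half-sib pair obtained from $f_{IJ} = 1/8$ and $k_2 = 0$.

```python
from fractions import Fraction as Fr

# (k1, k2) for each relationship; note that the text tabulates 2*k1.
RELATIVES = {
    "Sib": (Fr(1, 4), Fr(1, 4)),
    "Parent-offspring": (Fr(1, 2), Fr(0)),
    "Half-sibs": (Fr(1, 4), Fr(0)),
    "First cousins": (Fr(1, 8), Fr(0)),
    "Double first cousins": (Fr(3, 16), Fr(1, 16)),
}


def covariance_coefficients(k1, k2):
    """Coefficients of Vg, Vd, VAA, VAD, VDD in the covariance of relatives."""
    u = k1 + k2
    return {
        "Vg": u,            # r = 1, s = 0
        "Vd": k2,           # r = 0, s = 1
        "VAA": u ** 2,      # r = 2, s = 0
        "VAD": u * k2,      # r = 1, s = 1
        "VDD": k2 ** 2,     # r = 0, s = 2
    }


table = {name: covariance_coefficients(*ks) for name, ks in RELATIVES.items()}
for name, row in table.items():
    terms = " + ".join(f"{c} {v}" for v, c in row.items() if c != 0)
    print(f"{name}: {terms}")
```

The same coefficients applied to Fisher's height estimates ($V_g/V_t = .74$, $V_d/V_t = .26$) give .37 for parent and child and .435, or about .44, for sibs.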
It is often of interest to ask for the covariance or correlation between the offspring and the average of the parents, or the mid-parent, $\bar P$. Letting $P_m$ stand for the measurement on the male parent, $P_f$ for that of the female parent, and $O$ for that of the offspring, we obtain from the definition of the covariance (A.2.9)

$$C_{O\bar P} = \tfrac{1}{2}(C_{OP_m} + C_{OP_f}) = C_{OP}, \tag{4.4.7}$$

provided that the two sexes are equivalent. So, the covariance of offspring and mid-parent is the same as that between offspring and parent.

On the other hand, the variance of $\bar P$ is 1/2 the variance of $P_m$ or $P_f$ (if males and females are equally variable, and if they are independent as is the case with random mating). The regression of offspring on the average of the parents is

$$b_{O\bar P} = \frac{C_{O\bar P}}{V_{\bar P}} = \frac{2C_{OP}}{V_P} = \frac{V_g + \tfrac{1}{2}V_{AA}}{V_t}. \tag{4.4.8}$$

Such a formula can be used to predict the rate of improvement by selection. The progeny are expected to deviate from the average by a fraction $b_{O\bar P}$ of the amount by which the mid-parent deviates, or more formally (see A.4.5),

$$\hat O - \bar{\bar P} = b_{O\bar P}(\bar P - \bar{\bar P}), \tag{4.4.9}$$

where $\bar{\bar P}$ is the population average.

Returning to the example of Table 4.1.5, the correlations are as follows:

Half-sibs            .147
Parent-offspring     .302
Full sibs            .382
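These three correlations, and the heritability figures discussed next, can be reproduced from the variance components of Table 4.1.5. The sketch below assumes no environmental variance, so that $V_t$ is the sum of the five genetic components, and uses the exact fractions that round to the tabulated values (9/64 for .1406, and so on). The names are hypothetical.

```python
from fractions import Fraction as Fr

# Table 4.1.5 as exact fractions.
COMPONENTS = {"Vg": Fr(9, 64), "Vd": Fr(9, 128), "VAA": Fr(1, 64),
              "VAD": Fr(1, 64), "VDD": Fr(1, 256)}
V_T = sum(COMPONENTS.values())        # 63/256; no environmental variance assumed


def correlation(k1, k2, comp=COMPONENTS, v_t=V_T):
    """Correlation between non-inbred relatives, two-locus form (4.4.5)."""
    u = k1 + k2
    cov = (u * comp["Vg"] + k2 * comp["Vd"] + u ** 2 * comp["VAA"]
           + u * k2 * comp["VAD"] + k2 ** 2 * comp["VDD"])
    return cov / v_t


r_half_sib = correlation(Fr(1, 4), Fr(0))
r_parent = correlation(Fr(1, 2), Fr(0))
r_full_sib = correlation(Fr(1, 4), Fr(1, 4))

h2_from_half_sibs = 4 * r_half_sib                                    # breeder's estimate
b_mid_parent = (COMPONENTS["Vg"] + COMPONENTS["VAA"] / 2) / V_T       # 4.4.8

for label, value in [("half-sibs", r_half_sib), ("parent-offspring", r_parent),
                     ("full sibs", r_full_sib),
                     ("4 x half-sib r", h2_from_half_sibs),
                     ("regression on mid-parent", b_mid_parent)]:
    print(f"{label}: {float(value):.3f}")
```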
If the epistatic factors are ignored, the half-sib correlation would be estimated as $\tfrac{1}{4}(.571) = .143$, not very different.

In practical problems, the breeder usually estimates the heritability and then uses this value as a guide to selection programs. His estimates usually come from various correlations between relatives. One of the most used measures of heritability is four times the half-sib correlation, particularly half-sibs with the same father and different mothers since this eliminates the confounding effects of a common uterine and early postnatal environment. Four times the half-sib correlation is .587. The correct prediction formula is $b_{O\bar P} = (V_g + V_{AA}/2)/V_t$, or .603. The error is about 2.5%! In this example the epistasis is quite large, since we have assumed completely complementary gene action. Yet it doesn't cause a very large error in heritability measurements or predictions based on these. It is these reasons, as well as the practical difficulty of measuring epistasis, that lead the breeder to ignore epistasis.

Variance Within and Between Groups of Relatives

If the population is broken up into a series of groups we can relate the variance between and within the groups to the correlation coefficient. It is simpler to regard the groups as of equal size, but the general theory does not depend on this.

We think of the quantitative trait or measurement as made up of the sum of a series of additive and independent components. A particular individual, the $j$th member of the $i$th group, has a measurement, $Y_{ij}$, which is the sum of an overall mean ($\mu$), a component common to all members of the group ($b_i$), and an additional component ($w_{ij}$) that is specific to the individual. Then if $V_b$ and $V_w$ are the variances of these quantities (the between-group and within-group variances), the correlation between members of a group is

$$r = \frac{V_b}{V_b + V_w} = \frac{V_b}{V_t}. \tag{4.4.10}$$

For an explanation of these relationships, see A.4.10-A.4.15. For example, the variance within families of full sibs is

$$V_w = (1 - r)V_t. \tag{4.4.11}$$

From the information in Table 4.4.2, this is

$$V_w = \tfrac{1}{2}V_g + \tfrac{3}{4}V_d + \tfrac{3}{4}V_{AA} + \tfrac{7}{8}V_{AD} + \tfrac{15}{16}V_{DD}. \tag{4.4.12}$$
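As a consistency check on 4.4.10 to 4.4.12, the following sketch takes the between-sibship variance to be the full-sib covariance of Table 4.4.2 and confirms that what remains of $V_t$ matches the within-sibship expression. It assumes purely genetic variance and reuses the exact values of Table 4.1.5; the function names are hypothetical.

```python
from fractions import Fraction as Fr


def sib_covariance(vg, vd, vaa, vad, vdd):
    """Full-sib covariance from Table 4.4.2 (taken here as V_b)."""
    return vg / 2 + vd / 4 + vaa / 4 + vad / 8 + vdd / 16


def within_sibship_variance(vg, vd, vaa, vad, vdd):
    """Within-family variance of full sibs, genetic components only (4.4.12)."""
    return (Fr(1, 2) * vg + Fr(3, 4) * vd + Fr(3, 4) * vaa
            + Fr(7, 8) * vad + Fr(15, 16) * vdd)


# Components of Table 4.1.5 as exact fractions.
comps = (Fr(9, 64), Fr(9, 128), Fr(1, 64), Fr(1, 64), Fr(1, 256))
v_t = sum(comps)
v_b = sib_covariance(*comps)
v_w = within_sibship_variance(*comps)
r = v_b / (v_b + v_w)                      # 4.4.10

print(float(v_b), float(v_w), float(r))
```

Here $V_b + V_w$ recovers $V_t$ exactly, and $r$ equals the full-sib correlation of about .382 found earlier.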
There may also be environmental factors that are common to a sibship and others that differ for members of the sibship. Suppose that $V_{e_w}$ and $V_{e_b}$ are the environmental variance components within and between sibships; then the sib correlation is

$$r = \frac{V_b + V_{e_b}}{V_t}. \tag{4.4.13}$$

In animal and plant breeding experiments it is often possible to avoid such environmental correlations by randomization. In human genetics and any study of natural populations such difficulties are unavoidable.

4.5 Comparison of Consanguineous and Assortative Mating
Assortative mating means that mated pairs are more similar for some phenotypic trait than if they were chosen at random from the population. It may have either of two causes, or some combination of both. The tendency toward phenotypic similarity of mating pairs may be a direct consequence of genetic relationship. For example, in a subdivided population there will generally be a greater phenotypic similarity among the members of a subpopulation because they share a common ancestry. The genetic consequences in this case are the same as those of inbreeding.

On the other hand, there may be assortative mating based on similarity for some trait, and any genetic relationship is solely a consequence of similar phenotypes. For example, there is a high correlation between husband and wife for height and intelligence, probably caused much more by nonrandom marriage associated with the traits themselves than by common ancestry.
There are also other situations. For example, there is a considerable correlation in arm length between husband and wife. This is probably a consequence of the fact that those factors, genetic and environmental, that increase height also increase the length of the arm. So, any assortative mating for height will be reflected in a similar assortative mating for arm length, diminished somewhat by the lack of perfect correlation between the two traits.

Assortative mating is between individuals of similar phenotypes; inbreeding is between individuals of similar genotypes. Since individuals with similar phenotypes will usually be somewhat similar in their genotypes, we should expect assortative mating to have generally the same consequences as inbreeding. An excess of consanguineous matings in a population has two effects: (1) an increase in the average homozygosity and (2) an increase in the total population variance. Assortative mating would be expected to produce the same general kinds of results.

In general, assortative mating causes less increase in homozygosity than inbreeding, especially if the trait is determined by several gene loci. On the other hand, assortative mating causes a large increase in the variance of a multifactorial trait, in contrast to that produced by a comparable amount of inbreeding. A further difference is that inbreeding affects all segregating loci, whereas assortative mating affects only those related to the trait involved.

Pure assortative mating, like inbreeding, does not change the gene frequencies. We shall refer in this book to any situation where different genotypes make different contributions to the next generation through differential survival, mating patterns, or fertility, as selection. Only when all genotypes make the same average contribution will we regard it as pure inbreeding or assortative mating. But we shall see later in the chapter that with many assortative-mating systems, and even more so with disassortative-mating systems, there are differential contributions of different genotypes; so the effects of assortment are often confounded with selection.

The variance-enhancing effect of assortative mating is apparent with a simple example. Suppose that an arbitrary quantitative trait is influenced by two loci without dominance. Let each gene with subscript 1 add one unit to the phenotype, whereas each gene with subscript 0 adds nothing. Then the genotype $A_1A_1B_1B_1$ represents one extreme phenotype and $A_0A_0B_0B_0$ the other, with $A_1A_1B_0B_0$, $A_1A_0B_1B_0$, and $A_0A_0B_1B_1$ being exactly intermediate. Inbreeding will increase the frequency of all four homozygous genotypes, $A_1A_1B_1B_1$, $A_0A_0B_1B_1$, $A_1A_1B_0B_0$, and $A_0A_0B_0B_0$. This will increase the variance; in fact, it will exactly double the variance if the population is changed from random-mating proportions to complete homozygosity. On the other hand, with complete assortative mating, the population approaches a state where only the extreme homozygotes, $A_1A_1B_1B_1$ and
$A_0A_0B_0B_0$, remain. This clearly causes a much greater enhancement of the variance, especially if the number of relevant loci is large.

The variance increase with assortative mating has been shown experimentally in Nicotiana (Breese, 1956) and Drosophila (McBride and Robertson, 1963). The latter authors also found the expected decrease with disassortative mating and demonstrated that the rate of change under selection can be increased with assortative mating.

With inbreeding there is no systematic change in the frequencies of the gamete types $A_1B_1$, $A_1B_0$, $A_0B_1$, and $A_0B_0$. On the other hand, as the example shows, assortative mating causes a change in frequency of the gametic types, increasing two while decreasing the other two. So, another way of describing the effect of assortative mating and of understanding its variance-enhancing effect is to note that it causes gametic phase (or linkage) disequilibrium.

The simplest cases of assortative mating were worked out long ago. These involved mainly a single locus (Jennings, 1916; Wentworth and Remick, 1916). One example of this work is the simple case of two alleles where each genotype mates strictly assortatively, that is, only with another individual of the same genotype. The genetic consequences are obviously exactly the same as with self-fertilization. Heterozygosity is reduced by half each generation and the variance is eventually doubled.

It might be thought from this example that assortative mating leads eventually to complete homozygosity, as do many forms of inbreeding, but this is not the case. Partial assortative mating, like partial self-fertilization, leads to an equilibrium level of heterozygosity other than zero.

In the more general treatment of assortative mating two cases are of interest. At one extreme the individuals fall into two (or possibly more) discrete phenotypes with preference for mating within a phenotype. For example, deaf persons tend to marry others with the same trait. At the other extreme is a character like size, for which there is a correlation between mates, but the distribution of sizes is continuous and determined by multiple genetic and environmental factors. Before dealing with more complex multifactorial models, we shall first consider a single-locus trait.

4.6 Assortative Mating for a Single Locus
With inbreeding the choice of a mathematical model is clear from knowledge of the relationships and from the Mendelian mechanism. With assortative mating the choice is not so obvious, as a different behavior pattern can produce a different consequence.
We shall measure the degree of assortative mating by the product-moment correlation between the parents, $r$. For a quantitative trait the correlation coefficient is directly measurable. For qualitative traits we measure the correlation coefficient as the decrease in the proportion of matings between dissimilar phenotypes, divided by that proportion which is expected with random pairs. We consider two situations.

Each Genotype with a Different Phenotype

Assume that each genotype is distinct, the differences being determined by a series of alleles. No restriction is placed on the number of alleles. Assume that in each genotype a fraction $r$ select mates of their own genotype while the remainder mate at random. In this system, perfect assortative mating is equivalent to self-fertilization, so this model of imperfect assortative mating is formally equivalent to partial self-fertilization. This was considered in Section 3.8. From 3.8.19 the heterozygosity at equilibrium is given by

$$H_\infty = 2H_0\,\frac{1 - r}{2 - r}. \tag{4.6.1}$$

This result was first obtained by Wright (1921). Notice that complete homozygosity is not approached unless the assortative mating is complete ($r = 1$). Otherwise, the population approaches a level of homozygosity which is equivalent to an inbreeding coefficient of $f = r/(2 - r)$. We showed in Section 3.10 that when there is no dominance the variance is proportional to $1 + f$. Therefore, with partial assortative mating the population variance at equilibrium is

$$V_\infty = V_0(1 + f) = V_0\left(\frac{2}{2 - r}\right), \tag{4.6.2}$$

where $V_0$ is the variance with random mating. For example, if $r = 1/2$, the equilibrium heterozygosity is reduced by 1/3 and the variance is increased by the same fraction. Only if $r = 1$ does the population become homozygous, in which case the variance is eventually doubled.

Complete Dominance

A more important example, especially in human genetics, is the case where dominance is complete. We now consider this. We assume that there are only two alleles and, since dominance is complete, there are only two phenotypes. If there are more than two alleles it may be that they can be grouped into two sets as regards mating pattern. For example, it might be that the normal allele is dominant and that the wild types tend to mate among themselves, leaving all the mutant types to mate with other mutants.
Let $P_t$ be the frequency of AA in generation $t$, $2Q_t$ be the frequency of heterozygous Aa, and $R_t$ that of the homozygous recessive aa. Let $r$ be the correlation between mating individuals; that is to say, a fraction $r$ mate strictly assortatively and the rest mate at random with respect to the trait considered. To see the algebraic relationships we imagine the population as being divided into three groups: a randomly mating group comprising a fraction $(1 - r)$ of all matings and with the A gene frequency $P + Q = p$; a recessive assortatively mating group making up a fraction $rR$ of matings and with the A gene frequency 0; a dominant assortative group comprising a fraction $r(1 - R)$ and with the A gene frequency $(P + Q)/(1 - R)$ or $p/(1 - R)$ and recessive gene (a) frequency $Q/(1 - R)$.

From the randomly mated group the fraction of AA, Aa, and aa progeny will be $p^2$, $2pq$, and $q^2$, where $q = 1 - p$. The contribution of the dominant assortative group to the AA class next generation will be

$$r(1 - R)\left[\frac{p}{1 - R}\right]^2,$$

to the Aa class will be

$$r(1 - R)\,2\left[\frac{p}{1 - R}\right]\left[\frac{Q}{1 - R}\right],$$

and to the aa class will be

$$r(1 - R)\left[\frac{Q}{1 - R}\right]^2.$$

The recessive assortative-mating group will make its entire contribution, $rR$, to the aa class. Putting all this together, the genotype frequencies next generation will be

$$P(AA) = P_{t+1} = (1 - r)p^2 + r(1 - R_t)\left(\frac{p}{1 - R_t}\right)^2 = (1 - r)p^2 + \frac{rp^2}{1 - R_t}, \tag{4.6.3}$$

$$P(Aa) = 2Q_{t+1} = (1 - r)2pq + r(1 - R_t)\,2\left(\frac{p}{1 - R_t}\right)\left(\frac{Q_t}{1 - R_t}\right) = 2(1 - r)pq + \frac{2rpQ_t}{p + Q_t}, \tag{4.6.4}$$

$$P(aa) = R_{t+1} = (1 - r)q^2 + rR_t + r(1 - R_t)\left(\frac{Q_t}{1 - R_t}\right)^2 = (1 - r)q^2 + r\left[\frac{q^2 + R_t(p - q)}{1 - R_t}\right]. \tag{4.6.5}$$
We have written $p$ and $q$ with no subscripts, since they do not change with time. This can be verified by summing 4.6.3 and half of 4.6.4. Recalling that $p = P + Q$ and $q = Q + R$, this simplifies to $p_{t+1} = p_t$, showing that the gene frequency does not change. As with inbreeding, only the genotype frequencies change, not the gene frequencies.

When assortative mating is complete ($r = 1$), 4.6.4 becomes

$$2Q_{t+1} = \frac{2pQ_t}{p + Q_t}.$$

This approaches 0 as $t$ increases, but extremely slowly. When $p = 1/2$, then $2Q_0 = 1/2$, and the frequency of heterozygotes in successive generations follows the simple harmonic series, 1/2, 1/3, 1/4, 1/5, ..., as first shown by Jennings (1916).

With any value of $r$ except 1 the population never attains complete homozygosity but approaches an equilibrium. We can find the equilibrium heterozygosity by equating $Q_{t+1}$ to $Q_t$, giving

$$Q^2 + p^2(1 - r)Q - p^2q(1 - r) = 0, \tag{4.6.6}$$

whose solution gives the equilibrium value, $\hat{Q}$, in terms of the correlation between mates and the gene frequency. Alternatively, we can do as we did before and express the heterozygosity as a function of the inbreeding coefficient, $f$. Replacing $Q$ by $pq(1 - f)$ in 4.6.6 gives

$$qf^2 + (r - 1 - q - rq)f + rq = 0. \tag{4.6.7}$$
When the correlation, $r$, is equal to 1, there is complete homozygosity ($f = 1$). Otherwise, there is equilibrium at an intermediate value of the inbreeding coefficient, given by the solution of the above equation lying between 0 and 1. Notice that, contrary to the results with inbreeding, the equilibrium value of the inbreeding coefficient with assortative mating is a function of the gene frequencies.

We note here that simply equating the frequency of heterozygotes in two successive generations does not prove that this equilibrium is actually approached, or that it is stable. That both of these are in fact true can be shown, but the biological considerations make it quite clear that there must be a stable equilibrium, so we shall not demonstrate it more rigorously.

Of considerable interest is the extent to which assortative mating increases the frequency of homozygotes for recessive genes. The equilibrium proportion of recessive homozygotes is given by equating $R_{t+1}$ and $R_t$ in 4.6.5. This gives the quadratic

$$R^2 - (1 + q^2 - rp^2)R + q^2 = 0, \tag{4.6.8}$$

with the solution

$$\hat{R} = \frac{1 + q^2 - rp^2 - \sqrt{(1 + q^2 - rp^2)^2 - 4q^2}}{2}. \tag{4.6.9}$$
Some numerical examples are given in Table 4.6.1.

Table 4.6.1. Proportion of recessive homozygotes with assortative mating, for various values of the recessive-allele frequency (q) and the degree of assortative mating (r). The values given are the proportion of recessive homozygotes after one generation of assortative mating (R_1) and at equilibrium (R_∞).

                     RECESSIVE-ALLELE FREQUENCY, q
              .01                  .1                 .5
   r       R_1      R_∞        R_1     R_∞        R_1     R_∞
   0      .00010   .00010     .010    .010       .250    .250
  .125    .00011   .00011     .011    .011       .260    .261
  .25     .00012   .00013     .012    .013       .271    .273
  .50     .00014   .00020     .014    .017       .292    .305
  .75     .00017   .00038     .016    .027       .312    .352
 1.00     .00020   .01000     .018    .100       .333    .500
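The entries of Table 4.6.1 can be recomputed from 4.6.5 and 4.6.9. The sketch below is only illustrative: the function names are ours, the starting population is assumed to be in Hardy-Weinberg proportions ($R_0 = q^2$), and the square root is guarded against a slightly negative argument caused by rounding when $r = 1$.

```python
import math

def recessive_after_one_generation(q, r):
    """R_1 from 4.6.5, starting from Hardy-Weinberg proportions (R_0 = q^2)."""
    p = 1.0 - q
    R0 = q * q
    return (1 - r) * q * q + r * (q * q + R0 * (p - q)) / (1 - R0)

def recessive_at_equilibrium(q, r):
    """Equilibrium proportion of recessive homozygotes from 4.6.9."""
    p = 1.0 - q
    b = 1 + q * q - r * p * p
    disc = max(0.0, b * b - 4 * q * q)   # guard against rounding below zero
    return (b - math.sqrt(disc)) / 2

if __name__ == "__main__":
    for q in (0.01, 0.1, 0.5):
        for r in (0.0, 0.125, 0.25, 0.5, 0.75, 1.0):
            print(q, r,
                  round(recessive_after_one_generation(q, r), 5),
                  round(recessive_at_equilibrium(q, r), 5))
```

With $r = 1$ the equilibrium value returned is the gene frequency $q$ itself, up to rounding error.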
Several general conclusions emerge from examination of this table. First, with weak assortative mating there is little ultimate increase in homozygous-recessive genotypes, as seen in the values near the top of the table. However, the population goes a large fraction of the way to equilibrium in the first generation. On the other hand, as seen in the lower left part of the table, intensive assortative mating with a rare recessive gene can lead eventually to a considerable increase in recessive homozygotes, but this is approached very slowly.

Notice that when $r = 1$, the solution to 4.6.9 is $\hat{R} = q$. As with inbreeding, the proportion of recessive homozygotes approaches the gene frequency. This is expected, of course, since there has been no change in gene frequency during the process. On the other hand, the rate at which the genotypes change, and the change in effective inbreeding coefficient, $f$, depend on the gene frequency.

Assortative mating is quite high for deafness and it might be thought that this is a major factor in increasing the incidence. It has been estimated (Chung, Robison, and Morton, 1959) that there are at least 35 recessive genes, any one of which can cause deafness when homozygous, and with an average frequency of 0.002. Whatever the amount of assortative mating for deafness as a trait, it would be only about 1/35 of this amount for any one recessive gene, and somewhat less because of other causes of deafness. Thus, even with strict assortative mating the incidence would not be increased by more than 2% or 3%. However, the effect might be enhanced if there were a tendency for consanguineous marriages among the deaf.

4.7 Assortative Mating for a Simple Multifactorial Trait
There is strong assortative marriage for height and intelligence in the human population. These traits are determined by a large number of genes and are also influenced by the environment. We should expect that, if there are several genes acting somewhat cumulatively to produce the trait, assortative mating for the entire trait would have a very small effect on any one locus. On the other hand, we would expect that there would be an enhancement of the variability, more than with inbreeding.

The enhancement of variability can be seen by a simple example. Suppose that the trait depends on two pairs of factors, such that each substitution of an allele with a subscript 1 for an allele with subscript 0 adds one unit to the phenotype, as follows:

                                                  PHENOTYPE ON
TYPE   GENOTYPE                                   ARBITRARY SCALE
  1    A1A1B1B1                                   Y + 4
  2    A1A1B1B0, A1A0B1B1                         Y + 3
  3    A1A1B0B0, A0A0B1B1, A1A0B1B0               Y + 2
  4    A1A0B0B0, A0A0B1B0                         Y + 1
  5    A0A0B0B0                                   Y
Inbreeding will increase all four homozygotes. On the other hand, assortative mating will increase only the two extreme types, 1 and 5. This is easily seen to be true for complete assortative mating, for the extreme types can produce progeny only like themselves. Therefore the occurrence of an extreme type is an irreversible process; or, in a different vocabulary, types 1 and 5 represent absorbing barriers.

Assume that $A_1$ and $B_1$ have the same frequency, $p$, and let $q = 1 - p$. Then the frequencies will be as follows:

                                   EQUILIBRIUM FREQUENCY
        CODED        RANDOM        COMPLETE       COMPLETE
TYPE    PHENOTYPE    MATING        INBREEDING     ASSORTATIVE MATING
  1        2         p^4           p^2            p
  2        1         4p^3 q        0              0
  3        0         6p^2 q^2      2pq            0
  4       -1         4pq^3         0              0
  5       -2         q^4           q^2            q

The mean phenotype, $\bar{Y}$, is $2(p - q)$. The three variances are

Random: $V = p^4(4) + 4p^3q(1) + 4pq^3(1) + q^4(4) - \bar{Y}^2 = 4pq$;

Inbred: $V = p^2(4) + q^2(4) - \bar{Y}^2 = 8pq$;

Assortative: $V = p(4) + q(4) - \bar{Y}^2 = 16pq$.
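These three variances can be checked by direct computation from the frequency table. The short Python sketch below uses names of our own choosing and simply evaluates the mean and variance of the coded phenotype under each of the three distributions for a given $p$.

```python
def variance(scores, freqs):
    """Variance of a discrete distribution given scores and their frequencies."""
    mean = sum(x * w for x, w in zip(scores, freqs))
    return sum(w * (x - mean) ** 2 for x, w in zip(scores, freqs))

def two_locus_variances(p):
    """Variances of the coded phenotype (2, 1, 0, -1, -2) for the two-locus example."""
    q = 1.0 - p
    scores = [2, 1, 0, -1, -2]
    random_mating = [p**4, 4 * p**3 * q, 6 * p**2 * q**2, 4 * p * q**3, q**4]
    inbreeding = [p**2, 0.0, 2 * p * q, 0.0, q**2]
    assortative = [p, 0.0, 0.0, 0.0, q]
    return (variance(scores, random_mating),
            variance(scores, inbreeding),
            variance(scores, assortative))

if __name__ == "__main__":
    p = 0.3
    print(two_locus_variances(p))                       # numerical values
    print(4 * p * (1 - p), 8 * p * (1 - p), 16 * p * (1 - p))  # 4pq, 8pq, 16pq
```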
The inbred variance is a confirmation of the principle given in Chapter 3 (3.10.6) that without dominance or epistasis the variance when $f = 1$ is twice the variance when $f = 0$. This is true regardless of the number of factors involved in the trait. On the other hand, with assortative mating the variability increase depends on the number of factors.

The procedure can be extended to any number of loci. This was first done by Wright (1921b). We have modified his method somewhat, but follow the same general idea. The procedure comes from Felsenstein (see Crow and Felsenstein, 1968).

Consider a trait determined by $n$ gene loci. At each locus is a gene with frequency $p$ such that the substitution of this gene for its allele adds a constant amount $\alpha$ to the character under consideration. Later, this restriction to equal gene effects and equal frequencies at all loci will be removed. Let

$n$ = the (haploid) number of relevant gene loci,
$f$ = correlation in value of homologous genes,
$k$ = correlation of nonhomologous genes in the same gamete,
$l$ = correlation of nonhomologues in different gametes,
$m$ = correlation of homologues in different individuals, X and Y, and
$m'$ = correlation of nonhomologues in different individuals.

These relations are shown in Figure 4.7.1.
Figure 4.7.1. Correlations between the values of genes in two parents, X and Y, and their progeny. The circles represent individual genes. Homologous genes are opposite each other and genes from the same gamete are in a single vertical column.
An individual gene has a variance $pq\alpha^2$, where $q = 1 - p$. This can be shown as follows: For convenience, let the value of one allele be $\alpha$ and the other 0, with frequencies $p$ and $q$. The mean value is $p\alpha + q \cdot 0 = p\alpha$. The variance, $V$, is $p(\alpha - p\alpha)^2 + q(0 - p\alpha)^2 = pq\alpha^2$. Likewise the covariance, cov, of two genes, each with the same variance, is the variance times the correlation coefficient. For example, the covariance of two homologous genes is $pq\alpha^2 f$. We can write the variance of the total value of individual X as the sum of the variances of the component genes plus twice the sum of the covariances between them. Thus

$$V(X) = \sum V_i + 2\sum \mathrm{cov}_{ij}, \tag{4.7.1}$$

where $i$ and $j$ designate individual genes.
The variance of an individual gene, $V_i$, is $pq\alpha^2$ and there are $2n$ of them, so $\sum V_i = 2npq\alpha^2$. The covariance of a pair of alleles is $pqf\alpha^2$, and there are $n$ pairs. The covariance between nonalleles from the same gamete is $pqk\alpha^2$ and there are $n(n - 1)$ combinations. Likewise, there are $n(n - 1)$ pairs of nonalleles in different gametes with covariance $pql\alpha^2$. Putting all this together, we find that the variance of X at time $t$ is

$$V(X)_t = 2npq\alpha^2 + 2npqf_t\alpha^2 + 2n(n - 1)pqk_t\alpha^2 + 2n(n - 1)pql_t\alpha^2 = 2npq\alpha^2[1 + f_t + (n - 1)(k_t + l_t)]. \tag{4.7.2}$$

Likewise the covariance of X and Y is

$$C(X, Y)_t = 4npqm_t\alpha^2 + 4n(n - 1)pqm'_t\alpha^2. \tag{4.7.3}$$

If the assortative mating is based solely on the phenotype, rather than being a by-product of common ancestry of the mates, and the gene frequencies are the same for all loci, there is no more reason for alleles in mates to be alike than nonalleles. Therefore $m_t = m'_t$ and we can drop the prime in equation 4.7.3, leading to

$$C(X, Y)_t = 4n^2pqm_t\alpha^2. \tag{4.7.4}$$

From Figure 4.7.1 the following recurrence relations can be seen.

$$f_{t+1} = m_t, \tag{4.7.5}$$

$$l_{t+1} = m_t, \tag{4.7.6}$$

$$k_{t+1} = (1 - c)k_t + cl_t, \tag{4.7.7}$$

where $c$ is the proportion of recombination between the two loci concerned. In this case it is an average of the recombination between all pairs of loci concerned with the character, and for man is very nearly 1/2, since most pairs of loci are unlinked.

If $r$ is the coefficient of correlation between the phenotypes of the two mates, X and Y, which have the same variance, the covariance is

$$C(X, Y) = rV(X). \tag{4.7.8}$$

Substituting into this from 4.7.2 and 4.7.3 gives

$$4n^2pqm_t\alpha^2 = r[1 + f_t + (n - 1)(k_t + l_t)]\,2npq\alpha^2. \tag{4.7.9}$$

Now we substitute $f_{t+1}$ for $m_t$ (see 4.7.5) and $f_t$ for $l_t$ (4.7.5 and 4.7.6), which leads after some rearrangement to

$$f_{t+1} = \frac{r}{2n}[1 + nf_t + (n - 1)k_t]. \tag{4.7.10}$$
Using this and the relation

$$k_{t+1} = (1 - c)k_t + cf_t \tag{4.7.11}$$

(obtained from 4.7.5, 4.7.6, and 4.7.7), we can compute $f_t$ for any generation $t$, given the starting values, $f_0$ and $k_0$, which would both be 0 for a randomly mating population in gametic phase or "linkage" equilibrium.

At equilibrium there is no distinction between $t$ and $t + 1$, so using carets to designate equilibrium values,

$$\hat{k} = \hat{l} = \hat{f} = \hat{m}. \tag{4.7.12}$$

Using these equilibrium relations 4.7.10 becomes

$$\hat{f} = \frac{r}{2n}[1 + n\hat{f} + (n - 1)\hat{f}], \tag{4.7.13}$$

leading to

$$\hat{f} = \frac{r}{2n(1 - r) + r}, \tag{4.7.14}$$

as first shown by Wright (1921). If $n$ is large, $\hat{f}$ is small unless $r$ is very nearly 1. This shows that, unless the number of loci is small or the degree of assortative mating is very intense, there is only a very slight increase in homozygosity.

There is a much larger effect on the variance. From 4.7.2, substituting for $l_t$ from 4.7.5 and 4.7.6,

$$V(X)_t = V_0[1 + nf_t + (n - 1)k_t], \tag{4.7.15}$$

where $V_0 = 2npq\alpha^2$, the variance with random mating and linkage equilibrium.

At equilibrium under assortative mating, substituting into 4.7.15 from 4.7.14 and 4.7.12,

$$\hat{V}(X) = \frac{V_0}{1 - r\left(1 - \dfrac{1}{2n}\right)} \tag{4.7.16}$$

(Wright, 1921), or, for large $n$,

$$\hat{V}(X) = \frac{V_0}{1 - r}, \text{ approximately.} \tag{4.7.17}$$

As a numerical example, let $r = 1/4$, which is roughly the correlation in height between husbands and wives. The homozygosity is increased only trivially if $n$, the number of factors, is large. After one generation $f = 1/8n$ and at equilibrium is $1/(6n + 1)$ or approximately $1/6n$. On the other hand, the variance is increased by 1/8 in the first generation and eventually by 1/3.
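The approach to equilibrium can be followed by iterating 4.7.10 and 4.7.11 together. The Python sketch below is illustrative only: the function name and return format are our own, the starting values $f_0 = k_0 = 0$ are those of a randomly mating population in linkage equilibrium, and $c = 1/2$ is assumed as the default recombination fraction. Each step also records the variance ratio $V(X)_t/V_0$ from 4.7.15.

```python
def iterate_assortative(r, n, c=0.5, generations=1):
    """Iterate 4.7.10 and 4.7.11 from f_0 = k_0 = 0.

    Returns a list of (f_t, k_t, V_t / V_0), with the variance ratio
    taken from 4.7.15: 1 + n f_t + (n - 1) k_t.
    """
    f, k = 0.0, 0.0
    history = [(f, k, 1.0)]
    for _ in range(generations):
        f_next = r / (2 * n) * (1 + n * f + (n - 1) * k)   # 4.7.10
        k_next = (1 - c) * k + c * f                       # 4.7.11
        f, k = f_next, k_next
        history.append((f, k, 1 + n * f + (n - 1) * k))
    return history

if __name__ == "__main__":
    n, r = 10, 0.25
    hist = iterate_assortative(r, n, generations=200)
    print(hist[1])                          # after one generation
    print(hist[-1])                         # close to equilibrium
    print(r / (2 * n * (1 - r) + r))        # 4.7.14
    print(1 / (1 - r * (1 - 1 / (2 * n))))  # 4.7.16, as a ratio to V_0
```

For $r = 1/4$ the first step gives $f_1 = 1/8n$ and a variance ratio of 1.125, and the iteration converges to the values given by 4.7.14 and 4.7.16.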
4.8 Multiple Alleles, Unequal Gene Effects, and Unequal Gene Frequencies

Still assuming no dominance and epistasis and no environmental effects, the assumption of only two alleles with equal effect and equal frequency will be dropped. Let $\sigma_i^2$ be the variance of a gene at the $i$th locus. Thus $\sigma_i^2 = \sum_k p_k\alpha_k^2 - M_i^2$, where $p_k$ and $\alpha_k$ are the frequency and effect on the trait of the $k$th allele, $M_i$ is the mean effect of these alleles, and the summation is over all alleles at the $i$th locus. The $\sigma_i$'s remain constant under assortative mating since the gene frequencies do not change.

The covariance $\sigma_{kl}$ between two genes is $\sigma_k\sigma_l r_{kl}$, where $r_{kl}$ is the correlation between the two genes. The correlations $f$, $k$, $l$, $m$, and $m'$ of Figure 4.7.1 are no longer constant for all pairs of genes. Equations 4.7.2 and 4.7.4 can be written more generally as

$$V(X) = 2\sum_i \sigma_i^2 + 2\sum_i \sigma_i^2 f_i + 2\sum_{i \ne j} k_{ij}\sigma_i\sigma_j + 2\sum_{i \ne j} l_{ij}\sigma_i\sigma_j, \tag{4.8.1}$$

$$C(X, Y) = 4\sum_{i,j=1}^{n} m_{ij}\sigma_i\sigma_j. \tag{4.8.2}$$

The recurrence relations 4.7.5, 4.7.6, and 4.7.7 still apply to individual gene pairs; that is,

$$f_{i,t+1} = m_{ii,t}, \tag{4.8.3}$$

$$l_{ij,t+1} = m_{ij,t}, \tag{4.8.4}$$

$$k_{ij,t+1} = (1 - c_{ij})k_{ij,t} + c_{ij}l_{ij,t}, \tag{4.8.5}$$

so that at equilibrium

$$\hat{f}_i = \hat{m}_{ii}, \qquad \hat{k}_{ij} = \hat{l}_{ij} = \hat{m}_{ij}. \tag{4.8.6}$$

Then at equilibrium 4.8.1 becomes

$$\hat{V}(X) = 4\sum_{i,j} \hat{m}_{ij}\sigma_i\sigma_j - 2\sum_i \hat{m}_{ii}\sigma_i^2 + 2\sum_i \sigma_i^2. \tag{4.8.7}$$

We now let

$$n_e = \frac{\sum_{i,j} \hat{m}_{ij}\sigma_i\sigma_j}{\sum_i \hat{m}_{ii}\sigma_i^2}. \tag{4.8.8}$$

Substituting this into 4.8.2 and 4.8.7 gives

$$\hat{V}(X) = \hat{C}(X, Y) - \frac{\hat{C}(X, Y)}{2n_e} + V_0, \tag{4.8.9}$$

where the carets indicate equilibrium values and

$$V_0 = V(X)_0 = 2\sum_i \sigma_i^2. \tag{4.8.10}$$

$V_0$ is $V(X)$ before the assortative mating began ($f = l = k = m = 0$ in 4.8.1). Since $\hat{C}(X, Y) = r\hat{V}(X)$, we get

$$\hat{V}(X) = \frac{V_0}{1 - r\left(1 - \dfrac{1}{2n_e}\right)}. \tag{4.8.11}$$

At equilibrium the average inbreeding coefficient, weighted by the contribution of each locus to the variance, is

$$\hat{f} = \frac{\sum_i \hat{f}_i\sigma_i^2}{\sum_i \sigma_i^2}. \tag{4.8.12}$$

Substituting from 4.8.8, 4.8.2 and 4.8.10,

$$\hat{f} = \frac{\hat{C}(X, Y)}{2n_eV_0} = \frac{r\hat{V}(X)}{2n_eV_0} = \frac{r}{2n_e(1 - r) + r}. \tag{4.8.13}$$

Comparing 4.8.11 and 4.8.13 with 4.7.16 and 4.7.14 shows the equivalence of $n$ and $n_e$. When $\hat{m}_{ij} = \hat{m}_{ii} = \hat{m}$, then from 4.8.8

$$n_e = \frac{\left(\sum_i \sigma_i\right)^2}{\sum_i \sigma_i^2}, \tag{4.8.14}$$

and if each locus has the same standard deviation ($\sigma_i = \sigma_j = \sigma$) then $n_e = n^2\sigma^2/n\sigma^2 = n$. We therefore call $n_e$ the effective number of loci. It will be equal to the true number when there is free recombination and all loci contribute equally to the variance; otherwise it will be less. Notice that when $n_e = 1$, 4.8.13 gives

$$\hat{f} = \frac{r}{2 - r}, \tag{4.8.15}$$

the value mentioned earlier when we discussed a single locus. Likewise

$$\hat{V}(X) = \frac{2V_0}{2 - r}. \tag{4.8.16}$$
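As a small illustration of the effective number of loci, the sketch below evaluates $n_e = (\sum \sigma_i)^2 / \sum \sigma_i^2$, the form that 4.8.14 takes when the correlations between genes in mates are the same for all pairs. That equality is an assumption of the sketch, and the function name is ours. With equal standard deviations the result is the actual number of loci; unequal contributions give a smaller value.

```python
def effective_number_of_loci(sigmas):
    """n_e = (sum of sigma_i)^2 / (sum of sigma_i^2), assuming equal m_ij for all pairs."""
    total = sum(sigmas)
    return total * total / sum(s * s for s in sigmas)

if __name__ == "__main__":
    print(effective_number_of_loci([2.0, 2.0, 2.0, 2.0]))  # equal effects: n_e = n
    print(effective_number_of_loci([1.0, 1.0, 1.0, 3.0]))  # one major locus: n_e < n
```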
The variance after one generation of assortative mating is readily derived. From equations 4.8.1, 4.8.2, 4.8.3, 4.8.4, and 4.8.10 we can write

$$V(X)_1 = V_0 + 2\sum_i \sigma_i^2 m_{ii,0} + 2\sum_{i \ne j} k_{ij,1}\sigma_i\sigma_j + 2\sum_{i \ne j} m_{ij,0}\sigma_i\sigma_j. \tag{4.8.17}$$

But $k_{ij,1} = 0$, from 4.8.5. Thus

$$V(X)_1 = V_0 + 2\sum_{i,j} m_{ij,0}\sigma_i\sigma_j = V_0 + \tfrac{1}{2}C(X, Y)_0,$$

from 4.8.2, and therefore

$$V(X)_1 = V_0\left(1 + \frac{r}{2}\right), \tag{4.8.18}$$

since $C(X, Y) = rV(X)$.
Table 4.8.1 gives numerical illustrations of the increase in homozygosity and variance after one generation of assortative mating and after equilibrium is reached.

Table 4.8.1. Effect of assortative mating on the average inbreeding coefficient, f, of relevant genes and the variance of the trait, V. Subscripts 0, 1, and ∞ refer to the randomly mating population, the population after one generation of assortative mating, and at equilibrium under assortative mating. Other symbols are: n_e = effective number of gene loci, r = correlation between mates, H = heritability.

                      n_e     f_1      f_∞     V_1/V_0    V_∞/V_0
r = 1,   H = 1         1     .500    1.000     1.500      2.00
                       4     .125    1.000     1.500      8.00
r = .5,  H = 1         1     .250     .333     1.250      1.33
                       4     .063     .111     1.250      1.77
                       ∞     0        0        1.250      2.00
r = .25, H = 1         1     .125     .143     1.125      1.14
                       4     .031     .040     1.125      1.28
                       ∞     0        0        1.125      1.33
r = .5,  H = .5        ∞     0        0        1.063*     1.21

* Exact only if V_d = 0.
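The rows of Table 4.8.1 with $H = 1$ follow directly from the closed forms: $f_1 = r/2n_e$ (the first step of 4.7.10 with $n$ replaced by $n_e$), $\hat{f}$ from 4.8.13, $V_1/V_0$ from 4.8.18, and $\hat{V}/V_0$ from 4.8.11. The Python sketch below is illustrative, with a function name of our own; it does not cover the $H = .5$ row, which requires Fisher's equation of Section 4.9, and it is not meant for the degenerate combination $r = 1$ with infinite $n_e$.

```python
import math

def table_row(r, n_e):
    """Return (f_1, f_inf, V_1/V_0, V_inf/V_0) for H = 1.

    n_e may be math.inf for the limiting case of very many loci
    (provided r < 1).
    """
    f_1 = r / (2 * n_e)                              # first step of 4.7.10 from f_0 = k_0 = 0
    f_inf = r / (2 * n_e * (1 - r) + r)              # 4.8.13
    v_1 = 1 + r / 2                                  # 4.8.18
    v_inf = 1 / (1 - r * (1 - 1 / (2 * n_e)))        # 4.8.11
    return f_1, f_inf, v_1, v_inf

if __name__ == "__main__":
    for r in (1.0, 0.5, 0.25):
        for n_e in (1, 4):
            print(r, n_e, table_row(r, n_e))
    print(0.5, "inf", table_row(0.5, math.inf))
```

For $r = .5$ and $n_e = 4$ the equilibrium variance ratio evaluates to $16/9 \approx 1.778$, which the table lists as 1.77.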
We have not considered the effects of disassortative mating, but there is nothing about these formulae that demands that $r$ be positive. Disassortative mating has opposite effects, a decrease of homozygosity and variance and a building up of linkage disequilibrium in the opposite direction (i.e., an association in the same gamete of genes of opposite effect).

4.9 Effect of Dominance and Environment

In a randomly mating population the variance can be divided into components

$$V_t = V_g + V_d + V_e \tag{4.9.1}$$

or

$$V_t = V_h + V_e, \tag{4.9.2}$$

where $V_t$ is the total variance and $V_g$, $V_d$, $V_h$, and $V_e$ are the genic (additive genetic), dominance, genotypic (or total genetic), and environmental components. The equations above assume that the genetic and environmental factors are independent so that $V_e$ is simply additive to the other components. This is a major limitation to precise quantitative prediction of the phenotypic effects of assortative mating, particularly in human populations. We are also ignoring the effects of epistasis. Finally, all the results from here on are only approximate.

According to Fisher (1918), assortative mating will increase $V_g$, but not $V_d$ and $V_e$. This is not surprising, since with multiple factors only genic effects contribute to the correlation between parent and offspring (Reeve, 1961). However, it is not strictly true, for $V_d$ does change. But, as noted earlier, with a large number of genes there is very little change in heterozygosity under assortative mating, and therefore $V_d$ is not expected to change very much.

We let $A$ be the correlation between the genic values of the mates. Thus,

$$A = r\frac{V_g}{V_t} = rH, \tag{4.9.3}$$

where $H$ is the heritability. This is the same as $h^2$ of Section 4.1. After one generation of assortative mating, the total variance is approximately

$$V_{t,1} = V_g\left(1 + \frac{A}{2}\right) + V_d + V_e \tag{4.9.4}$$

from 4.8.18 after replacing $r$ with $A$. Equation 4.8.18 is reasonably accurate if $V_d$ is small. Otherwise the factor by which $V_g$ is inflated may be appreciably in error. See Reeve (1961) for an exact expression for the 2-allele case. Equation 4.8.18 is strictly correct only in absence of dominance.
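At equilibrium under assortative mating

$$\hat{A} = r\frac{\hat{V}_g}{\hat{V}_t} = r\hat{H}. \tag{4.9.5}$$

Substituting into 4.9.1 from 4.8.11 gives

$$\hat{V}_t = V_g\left(\frac{1}{1 - \hat{A}Q}\right) + V_d + V_e = V_t + V_g\left(\frac{\hat{A}Q}{1 - \hat{A}Q}\right) = V_t\left[1 + H\left(\frac{\hat{A}Q}{1 - \hat{A}Q}\right)\right], \tag{4.9.6}$$

where

$$Q = 1 - \frac{1}{2n_e}.$$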
We have used $\hat{A}$ instead of $r$ because only the genic part of the correlation contributes significantly to the variance of future populations. Furthermore, we use the equilibrium value of $A$, since even with constant $r$ there will be changes in $A$ as the composition of the population changes.

The object is to express the population variance and the correlation between relatives after equilibrium under assortative mating, in terms of quantities that can be measured in the random-mating population before assortative mating began. To do this we must have a measure for $\hat{A}$. Note first the identity

$$\hat{V}_t - \hat{V}_g = \hat{V}_g\left(\frac{1 - \hat{H}}{\hat{H}}\right), \tag{4.9.7}$$

which follows from the definition, $\hat{H} = \hat{V}_g/\hat{V}_t$. But, since $V_d$ and $V_e$ do not change much with assortative mating,

$$\hat{V}_t - \hat{V}_g = V_t - V_g = V_g\left(\frac{1 - H}{H}\right) = \hat{V}_g(1 - \hat{A}Q)\left(\frac{1 - H}{H}\right). \tag{4.9.8}$$

Equating the right sides of 4.9.7 and 4.9.8 and recalling that $\hat{H} = \hat{A}/r$, we obtain after some algebraic rearrangement Fisher's equation for $\hat{A}$,

$$Q(1 - H)\hat{A}^2 - \hat{A} + Hr = 0. \tag{4.9.9}$$

$H$ can be measured in the randomly mating population. Then, if $Q$ is taken as 1 (i.e., the effective number of genes involved is assumed to be large, as it must be if other assumptions are to be correct), the equation can be solved for $\hat{A}$, and this value put into 4.9.6 to give the equilibrium variance.
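The procedure just described can be sketched in a few lines of Python. The function names are our own, and the choice of the smaller root of the quadratic 4.9.9 is an assumption based on the requirement that $\hat{A}$ be a correlation no larger in magnitude than $r$; the other root exceeds 1 for the cases considered here.

```python
import math

def equilibrium_A(H, r, Q=1.0):
    """Solve Fisher's equation 4.9.9, Q(1 - H)A^2 - A + Hr = 0, for A."""
    a = Q * (1 - H)
    if a == 0:
        return H * r                       # equation is linear when H = 1 or Q = 0
    disc = 1 - 4 * a * H * r
    return (1 - math.sqrt(disc)) / (2 * a)   # smaller root

def equilibrium_variance_ratio(H, r, Q=1.0):
    """Equilibrium total variance relative to V_t, from 4.9.6."""
    A = equilibrium_A(H, r, Q)
    return 1 + H * A * Q / (1 - A * Q)

if __name__ == "__main__":
    print(equilibrium_A(0.5, 0.5))               # (2 - sqrt(2)) / 2
    print(equilibrium_variance_ratio(0.5, 0.5))  # increase in total variance
```

With $H = 1$ the quadratic term vanishes and the sketch returns $\hat{A} = r$, so that the variance ratio reduces to $1/(1 - rQ)$, in agreement with 4.8.11.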
As an example, let $H = .5$, $r = .5$, and $Q = 1$. Solving for $\hat{A}$ gives

$$\hat{A} = (2 - \sqrt{2})/2 = .293.$$

Then, from 4.9.6,

$$\hat{V}_t = V_t\left[1 + .5\left(\frac{.293}{1 - .293}\right)\right] = 1.207V_t,$$

so the population variance is increased after equilibrium under assortative mating of this degree by about 21%. This value is given in the bottom row of Table 4.8.1, along with the increase in variance after one generation of assortative mating. As noted above the latter value especially may be a poor approximation if $V_d$ is large.

4.10 Effect of Assortative Mating on the Correlation Between Relatives
This was first done by Fisher (1918) and we follow his method. Consider first parent-offspring correlation. The correlation is $V_g/2V_t$ in a randomly mating population. With equilibrium under assortative mating this will change for two reasons. One is that the variances increase, so we must replace $V_g$ and $V_t$ with their equilibrium values. The other reason is that the correlation between the two parents will, to the extent that this is reflected in genetic differences, add to the correlation of offspring with one parent through influences acting through the other.

If the chosen parent deviates by a unit amount from the population average, the other parent will deviate by $r$ because of the correlation between the two mates. The mean deviation of the parents is thus $(1 + r)/2$, and the expected deviation of the children is the genic part of this, or $\hat{V}_g/\hat{V}_t$ times the parental mean deviation. Thus, the correlation between the chosen parent and the offspring at equilibrium under assortative mating is

$$\hat{r}_{po} = \frac{\hat{V}_g}{\hat{V}_t}\,\frac{1 + r}{2}, \tag{4.10.1}$$

which in terms of the random-mating variances is

$$\hat{r}_{po} = \frac{1}{2}\,\frac{V_g + KV_g}{V_t + KV_g}\,(1 + r), \qquad K = \frac{\hat{A}Q}{1 - \hat{A}Q}, \tag{4.10.2}$$

as given by Fisher (1918). Fisher also gives the grandparent-child correlation as

$$\frac{\hat{V}_g}{\hat{V}_t}\,\frac{1 + r}{2}\,\frac{1 + \hat{A}}{2} \tag{4.10.3}$$
and each additional descendant multiplies the correlation by $(1 + \hat{A})/2$, as expected, since only the genic component is transmitted and therefore $\hat{A}$ replaces $r$.

With full sibs the problem is more complicated because there are also correlations between the dominance components. Recall first the correlation between sibs under random mating, which is

$$r_{oo} = \frac{1}{2}\frac{V_g}{V_t} + \frac{1}{4}\frac{V_d}{V_t}. \tag{4.10.4}$$

The variance within a sibship with parents chosen at random is

$$V_t(1 - r_{oo}) = V_t\left(1 - \frac{1}{2}\frac{V_g}{V_t} - \frac{1}{4}\frac{V_d}{V_t}\right). \tag{4.10.5}$$

However, Fisher notes that this is also a good approximation to the variance within a sibship when the parents are mated assortatively, since the variance within a sibship depends only on genes for which the parents are heterozygous and, as we have learned, with a large number of genes the heterozygosity is only slightly decreased by assortative mating. Considering now the population at equilibrium under assortative mating, the correlation is a measure of the reduction of the variance within a sibship. Thus

$$\hat{V}_t(1 - \hat{r}_{oo}) = V_t\left(1 - \frac{1}{2}\frac{V_g}{V_t} - \frac{1}{4}\frac{V_d}{V_t}\right).$$

But, from 4.9.6, $\hat{V}_t = V_t + KV_g$. Making this substitution and rearranging we obtain

$$\hat{r}_{oo} = \frac{V_g(K + \frac{1}{2}) + \frac{1}{4}V_d}{V_t + KV_g}, \tag{4.10.6}$$

where

$$K = \frac{\hat{A}Q}{1 - \hat{A}Q}. \tag{4.10.7}$$

$\hat{A}$ may be obtained from 4.9.9. $Q$ is taken as 1.

Correlations for other relatives are given in Table 4.10.1.
Table 4.10.1. Correlations between relatives in a randomly mating population and in a population at equilibrium under assortative mating, where r is the phenotypic correlation between mates, H = V_g/V_t, D = V_d/V_t, A = Hr. Equilibrium values under continued assortative mating are indicated by carets. The effective number of genes is assumed to be large, so that (2n_e - 1)/2n_e may be regarded as 1.

                                RANDOM              EQUILIBRIUM UNDER
RELATIONSHIP                    MATING              ASSORTATIVE MATING
Parent-offspring                (1/2)H              (1/2)Ĥ(1 + r)
Grandparent-offspring           (1/4)H              (1/4)Ĥ(1 + r)(1 + Â)
Great grandparent-offspring     (1/8)H              (1/8)Ĥ(1 + r)(1 + Â)^2
Sibs                            (1/2)H + (1/4)D     (1/2)Ĥ(1 + Â) + (1/4)D̂
Double first cousins            (1/4)H + (1/16)D    (1/4)Ĥ(1 + 3Â) + (1/16)D̂
Uncle-niece                     (1/4)H              (1/4)Ĥ(1 + Â)^2 + (1/8)D̂Â
First cousins                   (1/8)H              (1/8)Ĥ(1 + Â)^3 + (1/16)D̂Â^2
Fisher applied these methods to data on human stature. The data (obtained from earlier studies by Pearson and Lee) show

$$r = .2804, \qquad \hat{r}_{po} = .5066, \qquad \hat{r}_{oo} = .5433.$$

From 4.10.1 we calculate the equilibrium heritability

$$\hat{H} = \frac{\hat{V}_g}{\hat{V}_t} = \frac{2\hat{r}_{po}}{1 + r} = .791,$$

from which

$$\hat{A} = \hat{H}r = .222,$$

assuming $Q = 1$. Assuming the observed correlations represent equilibrium values we can ask what the heritability was before assortative mating began.

$$H = \frac{V_g}{V_t} = \frac{\hat{V}_g(1 - \hat{A})}{\hat{V}_t - \hat{A}\hat{V}_g} = \frac{\hat{H}(1 - \hat{A})}{1 - \hat{A}\hat{H}} = .74.$$

So the assortative mating has increased the heritability from .74 to .79. From the sib correlation 4.10.6 we can estimate $V_d/V_t$, which turns out to be about the same as $1 - \hat{H}$. Hence, on the basis of these data, Fisher concluded that environment is of very little importance in determining variance in human stature. The analysis of variance in a population at equilibrium under assortative mating would be

V_g                               62%
V_d                               21%
V_h                               83%
Effect of assortative mating      17%
V̂_t                              100%
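The arithmetic of this example can be retraced with a short Python sketch. The function name is our own, $Q$ is taken as 1 as in the text, and the expression used for the random-mating heritability, $H = \hat{H}(1 - \hat{A})/(1 - \hat{A}\hat{H})$, is the one that follows from 4.9.6 under that assumption.

```python
def fisher_stature(r, r_po):
    """Equilibrium heritability, genic correlation of mates, and
    random-mating heritability from the mate and parent-offspring correlations.

    Assumes Q = 1 (a large effective number of loci).
    """
    H_hat = 2 * r_po / (1 + r)                        # from 4.10.1
    A_hat = H_hat * r                                 # from 4.9.5
    H = H_hat * (1 - A_hat) / (1 - A_hat * H_hat)     # heritability before assortative mating
    return H_hat, A_hat, H

if __name__ == "__main__":
    H_hat, A_hat, H = fisher_stature(0.2804, 0.5066)
    print(round(H_hat, 3), round(A_hat, 3), round(H, 3))
```

The last value comes out near .747, which the text quotes as .74.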
Fisher assumed that the environmental similarity between sibs was no greater than that between parent and offspring. This seems quite dubious; it is probable that genes for height are less dominant than he thought and the environmental influence greater. To make it easier to go from this treatment to Fisher's 1918 paper, here is a list of equivalents:
4.11 Other Models of Assortative Mating

The correlation model that we have been discussing may not always be realistic. It is simple and natural, but of course there is an infinity of possible patterns of assortative-mating behavior. For quantitative traits the complexity may be such that it is not feasible to study more realistic models, except perhaps as special cases by computer simulation. Fortunately many traits of interest are normally distributed, or approximately normally, or may be transformed to be so, and the linear correlation and regression model works very well for most purposes.

On the other hand, there has been considerable discussion in the literature of specific models of assortative mating for single-locus traits (O'Donald, 1960a; Scudo, 1968; Parsons, 1962; Watterson, 1959; Scudo and Karlin, 1969; Karlin and Scudo, 1969).

As one considers the complexities of real populations there are many factors to take into account. The pattern may depend on whether the mating is monogamous or promiscuous, on the sex ratio, on which sex exerts the preference, on the nature of the inheritance of the trait, and on many other variables. Another complication is that the mating pattern may lead to a greater fertility of some genotypes than others. In other words, there may be selection in addition to pure assortative mating.

As stated in the beginning of this chapter, we shall ordinarily use the words assortative mating to describe pure assortative mating with no selection; that is, each genotype has the same expectation of surviving and fertility. When this is not so we shall speak of assortative mating with selection. Even with this definition there will be difficulties in interpretation. For example, it may be that the same genotypes are more fertile in some mating combinations than others. In some instances it may be more convenient to designate the fertility of a mating combination than that of a genotype (see Bodmer, 1965). We shall consider only a few of the many examples that could be used, first uncomplicated by selection and later with selection included.

Assortative-mating Models Without Selection
We return first to the single locus with dominance, first discussed in Section 4.6. We assumed that the same level of preference existed among the recessive phenotypes as among the dominants, both measured by the correlation coefficient, r. But we would now like to be more general. For example, red-haired persons (or some red-haired persons) may prefer to marry others with red hair, but the rest of the population may have different preferences, or be indifferent to hair color. Consider the same model as before, but let the degree of assortment be r and r' among the recessives and dominants, instead of the same value for both. Again we assume that, after the designated fraction of assortative pairs is formed, the rest of the population mate at random. The matings will then occur in the following ratios:

                     FREQUENCY
MATING       ASSORTATIVE               RANDOM                          TOTAL
A- x A-      r'(1 - R)                 (1 - r')^2 (1 - R)^2 / D        } 1 - R
A- x aa                                (1 - r)(1 - r')(1 - R)R / D     }
aa x A-                                (1 - r)(1 - r')(1 - R)R / D     } R
aa x aa      rR                        (1 - r)^2 R^2 / D               }
Total        r' + R(r - r') = 1 - D    1 - r' - R(r - r') = D
The equations corresponding to 4.6.3-4.6.5 can be obtained from this table, giving

$$P_{t+1} = \frac{(1 - r')^2 p_t^2}{D} + \frac{r' p_t^2}{1 - R_t}, \tag{4.11.1}$$

$$2Q_{t+1} = \frac{2(1 - r')\,p_t\,[(1 - r')Q_t + (1 - r)R_t]}{D} + \frac{2r' p_t Q_t}{1 - R_t}, \tag{4.11.2}$$

$$R_{t+1} = \frac{[(1 - r')Q_t + (1 - r)R_t]^2}{D} + rR_t + \frac{r' Q_t^2}{1 - R_t}. \tag{4.11.3}$$
That the gene frequency does not change can be verified by adding 4.11.1 and half of 4.11.2 (or 4.11.3 and half of 4.11.2). This again shows that pure assortative mating does not change the gene frequency. However, the final equilibrium and the rate of approach to this depend on r and r', as well as on the gene frequencies. The equilibrium value for Q is given by a cubic equation (see Scudo and Karlin, 1969), which of course reduces to 4.6.6 when r = r'. As an example, suppose a certain fraction of deaf persons attend common schools and tend therefore to marry assortatively. The rest, say, are educated in public schools and join the population presumed to be marrying at random with respect to this trait. Then r' will be approximately 0. In this case, the equilibrium equation corresponding to 4.6.8 and obtained by equating $R_{t+1} = R_t = R$ is

$$rR^2 - (1 - r + 2qr)R + q^2 = 0, \tag{4.11.4}$$

with the solution

$$R = \frac{1 - r + 2qr - \sqrt{(1 - r + 2qr)^2 - 4rq^2}}{2r}. \tag{4.11.5}$$
When r is less than 1 there is an equilibrium set of genotype frequencies. The proportion of recessive homozygotes is somewhat less than if r = r', and the heterozygosity somewhat greater. When the mating preference is complete (r = 1), the population eventually becomes homozygous. Notice that in this case, 4.11.4 and 4.6.8 are equivalent, as they should be. If there is complete assortment within one phenotype, there must be within the other also.

Assortative Mating with Selection

To continue with the same general model, suppose that those matings which are assortative differ in fertility from those which are random. This might happen, for example, if the assortment took place first; then the later random matings might have their fertility impaired by the delay.
As an example, suppose that in equations 4.11.1-4.11.3 the random matings have their fertility multiplied by a constant C, which may be greater or less than 1. This is equivalent to replacing D by D/C, which we shall call K. However, the equality signs must now become proportionality signs, since the three equations no longer add up to unity. Consider first that r = r'. Then we can write, after some algebraic simplification (it will be helpful to recall that $P_t + Q_t = p_t$ and $Q_t + R_t = q_t = 1 - p_t$),

$$\frac{p_{t+1}}{q_{t+1}} = \frac{P_{t+1} + Q_{t+1}}{Q_{t+1} + R_{t+1}} = \frac{(1 - r)^2 p_t + r p_t K}{(1 - r)^2 q_t + r q_t K} = \frac{p_t}{q_t}, \tag{4.11.6}$$
regardless of the value of K. In this case there is still no selection for, although different kinds of matings take place with different frequency, each genotype makes the same contribution of genes to the next generation. However, if r ≠ r', then the relationship 4.11.6 is no longer true. The gene frequencies change and assortative mating is complicated by an inherent selection in the process. An interesting model that has been used for assortative mating with selection is the following. Suppose that matings occur at random but that disassortative matings are less fertile. This might happen if for some reason matings between unlike types were incompatible. We measure the extent of reduction in matings between different phenotypes by s. The model is specified in this way:

             FREQUENCY         PROGENY RATIOS
MATING       RATIO             AA        Aa             aa
AA x AA      P^2               P^2
AA x Aa      4PQ               2PQ       2PQ
Aa x Aa      4Q^2              Q^2       2Q^2           Q^2
AA x aa      2PR(1 - s)                  2PR(1 - s)
Aa x aa      4QR(1 - s)                  2QR(1 - s)     2QR(1 - s)
aa x aa      R^2                                        R^2
Total        1 - 2R(1 - R)s = D
Collecting the progeny of each genotype,

$$DP_{t+1} = (P_t + Q_t)^2 = p_t^2, \tag{4.11.7}$$

$$DQ_{t+1} = Q_t(P_t + Q_t) + R_t(P_t + Q_t)(1 - s) = p_t q_t - p_t R_t s, \tag{4.11.8}$$

$$DR_{t+1} = (Q_t + R_t)^2 - 2Q_t R_t s = q_t^2 - 2(q_t - R_t)R_t s. \tag{4.11.9}$$
Adding 4.11.7 and 4.11.8 gives

$$p_{t+1} = P_{t+1} + Q_{t+1} = p_t\left[\frac{1 - R_t s}{1 - 2R_t(1 - R_t)s}\right]. \tag{4.11.10}$$
When $R_t < 1/2$, $p_{t+1} > p_t$, and the dominant gene increases in frequency. Whether the quantity in brackets is greater or less than 1 determines whether the gene frequency increases or decreases. So there is a tendency for the population to move away from the point where the two phenotypes are equal (P + 2Q = R = 1/2). The population tends to move toward fixation of whichever phenotype was more common in the first place. We end up eventually with a homozygous population, and which type it is depends mainly on the initial gene frequencies.
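The tendency to move away from R = 1/2 can be seen by iterating 4.11.7-4.11.9 from two starting points on either side of that line. The following Python sketch is illustrative only; the function names, the value s = 0.2, the Hardy-Weinberg starting proportions, and the number of generations are assumptions made for the example.

```python
def selection_step(P, Q, R, s):
    """One generation of equations 4.11.7-4.11.9.

    P, 2Q, R are the frequencies of AA, Aa, aa; s is the reduction in
    fertility of matings between unlike phenotypes.
    """
    p, q = P + Q, Q + R
    D = 1 - 2 * R * (1 - R) * s
    P_next = p * p / D
    Q_next = (p * q - p * R * s) / D
    R_next = (q * q - 2 * (q - R) * R * s) / D
    return P_next, Q_next, R_next


def final_p(q0, s, generations=5000):
    """Frequency of the dominant allele after iterating from Hardy-Weinberg ratios."""
    P, Q, R = (1 - q0) ** 2, q0 * (1 - q0), q0 ** 2
    for _ in range(generations):
        P, Q, R = selection_step(P, Q, R, s)
    return P + Q


# Start with R = 0.36 (below 1/2) and with R = 0.64 (above 1/2).
print("q0 = 0.6:", final_p(0.6, 0.2))   # dominant allele heads toward fixation
print("q0 = 0.8:", final_p(0.8, 0.2))   # recessive allele heads toward fixation
```

Loss of the recessive allele is slow once it is rare, since the recessive phenotype is then very uncommon, whereas loss of the dominant allele in this sketch is roughly geometric.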
Figure 4.11.1. The paths followed by populations under assortative mating of the type described by equations 4.11.7-4.11.9. There is an unstable equilibrium at the point $q_t = q$, R = 1/2. Points corresponding to the Hardy-Weinberg proportions lie along the parabola, whose equation is $P^2 - 2PR + R^2 - 2P - 2R + 1 = 0$. When s is very small, the population tends to move quickly toward this curve and then proceed slowly along the curve to one or the other of the extreme points, P = 1 or R = 1, depending on which side of the horizontal line, R = 1/2, it started from. [Axes: P (abscissa) and R (ordinate).]
Actually the situation is quite complex. There is a point of unstable equilibrium when R = 1/2, and the equilibrium value of the gene frequency q is obtained by solving 4.11.9 when R = 1/2 and $q_t = q$. Sometimes the gene frequencies will reach this equilibrium, but this point is unstable and the slightest displacement starts the process off toward one of the two extremes, fixation of the dominant or of the recessive allele. A discussion of this case has been given by Scudo and Karlin (1969). We have simply sketched the general picture in Figure 4.11.1. This shows the behavior when s is small.

4.12 Disassortative-mating and Self-sterility Systems
There has been much less attention paid to disassortative- than to assortative-mating systems, for the very good reason that with the great majority of traits the mating system, if departures from randomness occur, is more likely to be assortative than disassortative. There are, however, some conspicuous exceptions; one is the ordinary system of biparental reproduction, which may be regarded as an example of disassortative mating. Even more than with assortative mating, disassortative mating tends to be confounded with selection. It is typically accompanied by gene frequency changes. In fact, it may be impossible to have strongly disassortative mating without selection. From the simple example of a population with 90% of one phenotype and 10% of the other, we can see that there is an obvious upper limit to the number of disassortative pairs that can occur. The more common type tends to get left out. We shall consider two examples, both of which involve a mixture of disassortative mating and selection.

Self-sterility Alleles in Plants

A number of plant species have this kind of system. The rule is that pollen is functional only on a plant neither of whose two alleles at this locus is the same as that of the pollen. The system we are considering depends entirely on the haploid genotype of the pollen itself and not on the plant that produced it, although there are examples where the determination is by the genotype of the plant rather than the individual pollen. The self-sterility allele system obviously prevents self-fertilization, since neither of the two types of pollen produced by a plant can function on that plant. It is also clear that the system would work best in preventing self-fertilization, while permitting cross-fertilization, if there were a large number of alleles, so that a randomly chosen pollen is not likely to share its gene with the plant on which it lands.
It is conventional to designate the alleles at this locus with the letter S, with individual alleles indicated by subscripts.
As an example, $S_1$ pollen would function on $S_2S_3$, $S_2S_4$, or $S_2S_{50}$ plants, but not on $S_1S_2$ or $S_1S_{47}$. An immediate consequence of the system is that every plant is heterozygous. Thus, the total frequency of all genotypes carrying one $S_i$ allele is $2p_i$. We shall make two assumptions, both reasonable. One is that pollination is random. The second is that enough pollen is produced that every ovule has an equal chance of being fertilized. When there is no selection between different ovules, then each allele transmitted through the female has an equal chance. The success of a pollen grain of genotype $S_i$ depends on its falling on a plant not carrying this allele, and thus is proportional to $1 - 2p_i$. Thus the proportion of $S_i$ pollen among all successful pollen will be $p_i(1 - 2p_i)/\sum p_i(1 - 2p_i)$. But the denominator is $\sum p_i - 2\sum p_i^2 = 1 - 2\sum p_i^2$, since the gene frequencies must add up to 1. Thus, dropping subscripts for simplicity of notation, the change in allele frequency due to pollen selection in one generation is

$$\frac{p(1 - 2p)}{1 - 2X} - p = \frac{-2p(p - X)}{1 - 2X},$$

where $X = \sum p_i^2$.

But there is no selection on ovules; all the selection is through the pollen. Thus, since the genes contributed to the next generation come equally from the male and female parents, the total change in gene frequency is only half as great. So, we write

$$\Delta p = \frac{-p(p - X)}{1 - 2X}. \tag{4.12.1}$$
This formulation is an excellent approximation, but it is not exact. It fails to take into account the exact nature of the other pollen grains with which any particular grain is competing on a particular stigma. For example, if most of the other grains carry one of the two alleles that the female has, this particular pollen has a better chance of being the successful one. However, if the number of alleles is large, most of the competing pollen will also fail to have an allele corresponding to that of the stigma on which they are competing, so that this factor can be ignored. The more exact formulation has been given by Fisher (1958). It is also discussed by Moran (1962), and a rather similar approach was made by Wright (1939). The exact expression is in fact quite intractable, and the solutions obtained have been by approximations such as the one we have just
given. Our formulation gives the correct equilibrium value, and the correct rate of approach unless the number of alleles is small or their frequencies grossly unequal. It is clear from 4.12.1 that the frequency of an allele will increase if it is less than X, and will decrease if it is greater. The same is true of all alleles. Hence each allele frequency approaches the same value, X, and there is a stable equilibrium when all alleles are of the same frequency. If there are n alleles, then the equilibrium frequency of any one is

$$p_i = 1/n. \tag{4.12.2}$$
Notice that equation 4.12.1 is equal to 0 when $p_i = 1/n$ for any number of alleles. That is, if one or more alleles have frequency 0 the rest again approach equality. Such a system will lead to the maximum number of alleles maintained, for every new mutant will tend to increase and, aside from loss due to sampling accidents, will be incorporated into the population, whereupon a new equilibrium is approached, with $p_i = 1/(n + 1)$. It is no surprise then that actual investigations have revealed a very large number of such alleles persisting in plant populations. The only force tending to reduce the number of alleles (other than some limitation on the total range of mutational possibility) is the accidental loss of alleles from random changes. This problem has been discussed in great detail by Wright (1938, 1964, 1965), Fisher (1958), and others. We shall consider such random processes in the last three chapters.

Disassortative Mating; One Locus, Two Alleles

We shall consider the problem briefly, and only for two alleles. The problem of disassortative mating for more than two alleles is quite complex (see Finney, 1952; Moran, 1962). Imagine first a very simple case where the only matings are between AA x aa and Aa x aa. It is obvious that, after the first generation, there will be no more AA homozygotes and the only remaining matings are Aa x aa. This produces two kinds of progeny, like the parents, and in equal proportions. The equilibrium is immediately stable. This situation is found in some plants where the dominant gene causes short style and the homozygous recessive is long. Fertilization normally occurs only between two different types. This is clearly expected to lead to a stable 1:1 polymorphism. A more familiar example is the ordinary sex-determining system in which all matings are XX by XY, again leading to a stable 1:1 sex-polymorphism.
Consider next a slightly more complicated example, and one with a rather interesting consequence. This time we still consider only two alleles, but each genotype is regarded as different. The rule is that each genotype can mate
with any genotype but its own; otherwise mating is random. If, as usual, we let P, 2Q = H, and R stand for the frequencies of the three genotypes AA, Aa, and aa, we can set forth the various possible matings as follows:

             FREQUENCY         PROGENY RATIO
MATING       RATIO             AA        Aa        aa
AA x Aa      2PH               PH        PH
AA x aa      2PR                         2PR
Aa x aa      2HR                         HR        HR
Total        D = 1 - P^2 - H^2 - R^2
The recurrence relations are easily written

$$P_{t+1} = P_t H_t / D_t, \qquad R_{t+1} = R_t H_t / D_t, \tag{4.12.3}$$

from which

$$\frac{P_{t+1}}{R_{t+1}} = \frac{P_t}{R_t}, \tag{4.12.4}$$

showing that the ratio of the two homozygotes does not change. On the other hand,

$$\frac{P_{t+1}}{P_t} = \frac{R_{t+1}}{R_t} = \frac{H_t}{D_t}, \tag{4.12.5}$$

and $D_t = 1 - P_t^2 - H_t^2 - R_t^2$.
The homozygotes increase when H > D and decrease when H < D. There is a stationary state when H = D, for then the genotypes have no tendency to change frequency. Setting H = D, and dropping subscripts since this is an equilibrium, gives the ellipse,

$$2P^2 + 2PR + 2R^2 - 3P - 3R + 1 = 0. \tag{4.12.6}$$
This is shown in Figure 4.12.1. As can be seen, the equilibrium is a rather peculiar one. Any point along the ellipse is stable with respect to perturbations changing the frequency of heterozygotes, for the population tends to return to the points along the ellipse. On the other hand, there is no tendency to return to the original point if there is a change (chance or otherwise) along the ellipse. Hence there are an infinity of points that are equilibria of this sort.
Figure 4.12.1. The case of complete disassortative mating with two alleles and three genotypes. The population follows the paths indicated by the arrows. Points along the ellipse represent equilibria. The arrows cross the ellipse, since the approach to equilibrium is oscillatory. [Axes: P (abscissa) and R (ordinate).]
The only possible values lie within the triangle. The arrows indicate that there is no tendency for the P/R ratio to change. An actual population would drift randomly along the curve until one or the other of the two homozygotes is lost. Then the situation would reduce to the 2-phenotype polymorphism of the type discussed before, AA and Aa in equal proportions, or Aa and aa. These represent the two points at the end of the curve. Another point of interest about this system is that the approach to the ellipse is oscillatory. The population moves in the direction of the arrows in Figure 4.12.1, but overshoots each generation so that the value moves back and forth along the arrow, crossing the ellipse each time, and with decreasing amplitude until the equilibrium is reached. In plants where there is an incompatibility system in which the pollen function depends on the genotype of the plant that produces the pollen rather than the specific allele in the pollen grain itself, there are two possible mechanisms whose consequences differ slightly. It may be that incompatible pollen fails to fertilize the ovule (pollen elimination); alternatively, the fertilization may occur, but this particular ovule then fails to develop if the mating is incompatible (zygote elimination). The model we have just discussed is equivalent to zygote elimination. For a discussion, see Finney (1952) and Moran (1962).
4.13 Problems

1. Compute the mean and the genic, dominance, and genotypic variances at inbreeding coefficients 0, 1/2, and 1 for the following four examples.

        Y11     Y12     Y22     p1      p2
(a)      99     100     101     p       q
(b)      99     100     100     .9      .1
(c)     100     100      99     .9      .1
(d)      99     100      99     .5      .5
2. For the system $a_{11} \neq a_{12} = a_{22}$ (i.e., $A_1$ completely recessive), what gene frequency maximizes the genic variance? The dominance variance? The total variance? (Assume random mating.)
3. Give two reasons why the correlation between mother and daughter is likely to be lower for human weight than the correlation between sisters.
4. In terms of the model at the beginning of Section 3.10, show that the genic variance and dominance variance for f = 0 are $2p_1p_2[A + D(p_1 - p_2)]^2$ and $(2p_1p_2D)^2$.
5. What are the k-coefficients for a child and grandparent, for half-sibs, for uncle and niece, and for individuals D and H in pedigree 3.4.2?
6. Show that $2k_0 = (1 - 2f_{AC})(1 - 2f_{BD}) + (1 - 2f_{AD})(1 - 2f_{BC}) - 2(f_{AC}f_{BD} + f_{BC}f_{AD})$.
7. Show that the dominance variance with random mating is the square of the difference in the population means at f = 0 and f = 1.
8. Relatives such that $k_2 = 0$ are sometimes called unilineal (Cotterman, 1941) and those with $k_2$ greater than 0 are bilineal. Give an example of a bilineal relationship other than identical twins, sibs, and double first cousins.
9. Show that for additive genes (no dominance, no epistasis, no environment effect) the correlation between parent and offspring is

$$\frac{1 + 2f_O + f_P}{2\sqrt{(1 + f_O)(1 + f_P)}},$$

where $f_O$ and $f_P$ are the inbreeding coefficients of the offspring and parent.
10. In deriving 4.4.4 we assumed that the covariance of the sums of two quantities is not changed if the quantities are correlated. Prove this.
11. Show from 4.11.9 that the equilibrium value of q for R = 1/2 is $\left(s + \sqrt{(2 - s)(1 - s)}\right)/2$.
12. What is the limit of q in Problem 11 as s approaches 0? Show that the genotypes at this point are in Hardy-Weinberg ratios.
13. What is the heritability of a trait determined by a very rare recessive gene? What is the correlation between sibs?
14. Show that with "pure" overdominance ($a_{12} > a_{11} = a_{22}$) the heritability is $(p_1 - p_2)^2/(p_1^2 + p_2^2)$.
15. I and J are first cousins. I has phenylketonuria, the allele frequency being (say) 0.01. What is the probability that J has this recessive disease? Is the answer given by using k-coefficients exact?
16. Show that with three self-sterility alleles the proportion, $P_t$, of $S_1S_2$ heterozygotes in generation t is $(1 - P_{t-1})/2$. What is the equilibrium proportion? Is the approach direct or oscillatory?
17. Assuming the model of Table 4.4.2, what is the covariance of second cousins? Of individuals D and H in Pedigree 3.4.2?
18. Compare R from 4.11.5 for r = r' = 0.5 with r = 0.5, r' = 0 when q = 0.01. Do they differ appreciably?
19. Suppose that in a randomly mating population the heritability, H (= h^2), of IQ is 0.6 and that the correlation between husband and wife is 0.5. By what fraction will the variance be increased when the population reaches equilibrium under this degree of assortative marriage? Compare this with the amount when the heritability is 1. If, prior to the beginning of assortative marriage, the IQ distribution had a mean of 100 and a standard deviation of 15, what fraction of the population would have IQ's above 130 before and after?
5 SELECTION
Selection occurs when one genotype leaves a different number of progeny than another. This may happen because of differences in survival, in mating, or in fertility. We are, as before, ignoring for the present differences that arise from random fluctuations. As mentioned in the preceding chapter, selection is distinguished from inbreeding and pure assortative mating in that under the latter systems the number of descendants is the same for all genotypes. Selection, along with migration and mutation, may alter the gene frequencies. However, it does not necessarily do so; it may be that the fitnesses of the different genotypes differ, but in such a way that opposing tendencies balance and the gene frequency is unchanged. Selection may be because of the greater fitness of some types, as in nature, or through artificial selection as practiced by the animal and plant breeders. Sewall Wright (1931) has said:

Selection, whether in mortality, mating or fecundity, applies to the organism as a whole and thus to the effects of the entire gene system rather than to single
genes. A gene which is more favorable than its allelomorph in one combination may be less favorable in another. Even in the case of cumulative effects, there is generally an optimum grade of development of the character and a given plus gene will be favorably selected in combinations below the optimum but selected against in combinations above the optimum. Again the greater the number of unfixed genes in a population, the smaller must be the average effectiveness of selection for each one of them. The more intense the selection in one respect, the less effective it can be in others. The selection coefficient for a gene is thus in general a function of the entire system of gene frequencies. As a first approximation, relating to a given population at a given moment, one may, however, assume a constant net selection coefficient for each gene.
Selection involving both mortality and fertility is almost always complicated. One consequence of differential mortality is that a population counted at any stage except as zygotes will usually depart from Hardy-Weinberg ratios, even when mating is random. This would suggest that the proper time to census a population would be as soon as possible after fertilization. On the other hand, from the standpoint of assessing the effects of random gene frequency drift, it is more meaningful to count adults at the beginning of the reproductive period (Wright, 1931; Fisher, 1939a). When the probability of mating or the fertility is being considered, it may be more meaningful to measure the fertility of mating pairs than of individuals (Bodmer, 1965). The systematic quantitative theory of natural selection came of age with a series of papers by Haldane (1924-1931). In the beginning of the first paper he said:

A satisfactory theory of natural selection must be quantitative. In order to establish the view that natural selection is capable of accounting for the known facts of evolution we must show not only that it can cause a species to change, but that it can cause it to change at a rate which will account for present and past transmutations. In any case we must specify: (1) The mode of inheritance of the character considered, (2) The system of breeding in the group of organisms studied, (3) The intensity of selection, (4) Its incidence (e.g. on both sexes or only one), and (5) The rate at which the proportion of organisms showing the character increases or diminishes. It should then be possible to obtain an equation connecting (3) and (5).
Starting with the simplest cases (a single pair of alleles, random mating, discrete generations, constant selection coefficients equal in the two sexes), he proceeded to more and more complex cases. These included non-Mendelian inheritance, different intensities in the two sexes, within-family selection, X-linkage, inbreeding and assortative mating, multiple factors, linkage, polyploidy, sex-limited characters, reversal of dominance in the two sexes, gametic selection in one or both sexes, multiple-recessive and multiple-dominant traits, and overlapping generations. These early studies are summarized in his 1932 book The Causes of Evolution. The ways that selection can operate are uncountable, and many special cases are of genetic interest. However, we shall discuss only a few in order to illustrate general principles. The effects of selection were also studied by R. A. Fisher and Sewall Wright, with a greater emphasis on quantitative traits. We shall also discuss the main generalities arising from these studies. Natural selection, like classical mechanics, has both static and dynamic aspects. The statics of evolution will be dealt with in Chapter 6. This involves the relatively stable situation that results from the balance of various opposing forces: mutation, selection, migration, and random fluctuations. In this chapter we consider the dynamics, the way in which selection changes the composition of a population. We shall consider two models, Models 1 and 2 of Chapter 1. The first assumes that generations are discrete and nonoverlapping, as in annual plants. This is also applicable to many problems in animal breeding where pedigrees are known, and the generations can therefore be kept straight. The second model applies strictly to organisms that reproduce and die continuously and with a constant probability of both, a situation approximated by some single-celled organisms. However, we should like to use it as an approximation to the situation in many organisms where generations overlap. The approximation is best when the population has reached stability of age distribution, as discussed in Chapter 1. As expected, the two models become quite similar when the intensity of selection for the trait under consideration is small.
For populations not in age-distribution equilibrium, it may be useful to weight each individual by its reproductive value (see Section 1.5). However, most of the time we shall use the equations in a simple form, regarding them as useful approximations from which we can reach interesting qualitative and semiquantitative generalizations.

5.1 Discrete Generations: Complete Selection
As a first example, we consider simple cases in which one class is lethal or sterile. In animal or plant breeding this corresponds to culling one class completely.

1. Selection Against a Dominant Allele
If there is complete selection against a dominant factor, that is, if this phenotype is eliminated or fails to reproduce, the population next generation will be composed entirely of
homozygous recessives (except for new mutations, incomplete penetrance, and such complications). So, one generation of selection is sufficient to eliminate the undesired type from the population.

2. Selection Against a Recessive Allele

On the other hand, if all homozygous recessives are eliminated there are still recessive factors remaining in the population, hidden by heterozygosis with their dominant alleles. These can combine in later generations to produce zygotes that are homozygous for the recessive gene. If the proportion of recessive genes in generation 0 is $p_0$ and mating is at random, there will be $p_0^2$ recessive homozygotes. When these are eliminated, the only possibility for a homozygous-recessive offspring is by the mating of two heterozygotes, in which case 1/4 of the offspring are expected to be homozygous recessives. Among the dominant phenotypes, a fraction $2p_0(1 - p_0)/[(1 - p_0)^2 + 2p_0(1 - p_0)]$ will be heterozygous. The proportion of recessive homozygotes next generation will then be

$$\frac{1}{4}\left[\frac{2p_0(1 - p_0)}{(1 - p_0)^2 + 2p_0(1 - p_0)}\right]^2 = \left[\frac{p_0}{1 + p_0}\right]^2.$$
In the next generation, we can replace $p_1$ by $p_0/(1 + p_0)$, and so on, leading to the following formulae for the change in gene frequency.

$$p_1 = \frac{p_0}{1 + p_0}, \qquad p_2 = \frac{p_1}{1 + p_1} = \frac{p_0}{1 + 2p_0}, \qquad p_t = \frac{p_0}{1 + tp_0}, \tag{5.1.1}$$

and the zygotic frequencies are given by the squares of these quantities. Another form of expression that might be mnemonically preferable is obtained by replacing p with 1/y. This leads to

$$\frac{1}{y_t} = \frac{1}{1 + y_{t-1}} = \frac{1}{t + y_0}. \tag{5.1.2}$$
The proportion of recessive homozygotes is the square of these quantities. For example, if the present frequency of a rare recessive disease is 1/40,000 or 1/200^2, and if none of these reproduce, the number next generation will decrease to 1/201^2 or 1/40,401. This provides a numerical illustration of what is already well known: selection against a recessive that is rare goes very slowly.
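The arithmetic of 5.1.1 is easy to reproduce with exact fractions. The short Python sketch below is only an illustration; the function name and the choice of five generations are assumptions, while the starting frequency 1/200 is the one used in the example above.

```python
from fractions import Fraction


def recessive_gene_frequency(p0, t):
    """Equation 5.1.1: recessive gene frequency after t generations
    of complete selection against the recessive homozygote."""
    return p0 / (1 + t * p0)


p0 = Fraction(1, 200)            # disease frequency 1/200^2 = 1/40,000

# Apply the one-generation rule p' = p / (1 + p) five times ...
p = p0
for _ in range(5):
    p = p / (1 + p)

# ... and compare with the closed form.
print(p, recessive_gene_frequency(p0, 5))

# Frequency of affected homozygotes after one generation of selection.
print(recessive_gene_frequency(p0, 1) ** 2)
```

Because `Fraction` keeps the values exact, the stepwise and closed-form results can be compared for equality rather than to within rounding error.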
3. Selection Against a Sex-linked Recessive Trait
If there is complete elimination of homozygous-recessive females and hemizygous-recessive males, then after the first generation there will be no more homozygous-recessive females. From that time on, all the recessive phenotypes will be males, who in turn came from heterozygous females. A heterozygous female may transmit the recessive gene to a son or to a daughter. In the former case it is eliminated; in the latter it is retained in the population. The proportion of heterozygotes among the daughters of heterozygous females is 1/2. Since the affected males come from heterozygous females, they too are reduced by half each generation, the proportion of affected among males in any generation being exactly half the proportion of heterozygotes among females in the previous generation. The results that we have just discussed are shown graphically in Figure 5.1.1. Note in particular the very slow rate of decrease of homozygotes for autosomal recessives, once the gene has become rare. Figure 5.1.2 shows actual data on the decrease in frequency of a recessive lethal gene in Drosophila. In this experiment the generations were kept separated so the conditions of the model are fulfilled in this regard. In each generation the adults will be of two genotypes, AA and Aa, the aa type having died in the pre-adult stages. The two surviving types were classified by progeny
Figure 5.1.1. Selection against a recessive (R), dominant (D), and an X-chromosomal recessive (X). Selection is assumed to be complete, and in each case the starting frequency of the trait is 0.25. In the X-chromosome case it is assumed that there are no homozygous recessive females, as would be the case after one generation of selection. [Axes: frequency of the trait, 0 to 0.25 (ordinate), against time in generations, 0 to 12 (abscissa).]
1 78
AN I NTRODUCTION TO POPU LATION G E N ETICS T H E O R Y
0.6
0.5
0.4
q 0.3
0.2
- - �-�-�!>--. ..- - - -
.�
- - -
0.1
- - - - -- - - - - -
0 4---r--.r--.---.--,--' 7 5 8 4 9 10 o 6 2 3 2
3
4
5
7
6 (
8
'
9
10
11
Selection aga inst a recessi ve lethal gene. The allele frequency is given on the ordinate. In the abscissa t is the generation when adults are counted and t' is the generation for zygote gene frequency. Data from Wal lace (Amer. Natur. 97 : 65-66, 1 963). Figure 5.1 .2.
tests and the proportion of a ge n e s among these is shown in the graph for each generation. I n generation 0 the adults were aJI Aa, since the experiment was started with flies all of the same heterozygous genotype. Thus Po = 1 /2. This is also the expected proportion of a alleles among the zygotes in the next generation. At the adult stage the proportion of a alleles will have changed to 1 /3, according to the relations in 5. 1 . 1 . In the figure the generations corresponding to the adult gene frequencies are i ndicated by t ; those for zygote gene fre quencies are i ndicated by t'. The data agree approximately with the theoretical expectations, the standard errors of the points being large enough that most deviations are not significant. However, when all gene rations are considered there is some selection against the heterozygotes. 5.2 Discrete G enerations : Partial Selectio n
We define the fitness, or selective value, as the expected number of progeny per parent. The parents and progeny must be counted at the same age, of course. The effects of differential mortality and fertility are kept within the same generation if each generation is enumerated at the zygote stage. In a biparental population, half of the progeny are credited to each parent.
Asexual Population

Consider that there are many genetic types in the population, and that each reproduces its own type exactly. The model is also appropriate for a Y-chromosome factor (if only males are considered), for an X-chromosomal factor in either sex in an attached-X stock, or for cytoplasmic inheritance transmitted through one sex. Assume that the genotypes $A_1, A_2, A_3, \ldots$ have fitnesses $w_1, w_2, w_3, \ldots$ and are present in the population in frequencies $p_1, p_2, p_3, \ldots$. Then the proportion of $A_i$ genotypes next generation will be

$$p_i' = \frac{p_i w_i}{p_1 w_1 + p_2 w_2 + \cdots} = \frac{p_i w_i}{\bar{w}},$$

where $\bar{w} = \sum p_i w_i$ is the average fitness. The change in the proportion of $A_i$ in one generation is

$$\Delta p_i = \frac{p_i w_i}{\bar{w}} - p_i = \frac{p_i (w_i - \bar{w})}{\bar{w}}. \tag{5.2.1}$$

The quantity $w_i - \bar{w}$ is the average excess in fitness of the genotype $A_i$. For only two alleles, this formula is conveniently written as

$$\Delta p_1 = \frac{p_1 p_2 (w_1 - w_2)}{\bar{w}} = \frac{s p_1 p_2}{\bar{w}}, \tag{5.2.2}$$
where $s$, the selection coefficient, is $w_1 - w_2$ and $p_1 + p_2 = 1$. Notice that selection is most rapid when the two types are nearly equal in frequency and becomes slower when one is much more common than the other. For example, if $w_1$ is 1.1 and $w_2$ is .9, the frequency of $A_1$ will increase from .50 to .55 in one generation, but if the frequency is .10, it will change only to .1196 in one generation.

Diploid Sexual Population

We now let $w_{ij}$ stand for the average fitness of the genotype $A_iA_j$. As before we let $P_{ii}$ be the frequency of the homozygous genotype $A_iA_i$ and $2P_{ij}$ the frequency of the heterozygote $A_iA_j$. Then the frequency of the gene $A_i$ is (from equation 2.1.2)

$$p_i = \sum_j P_{ij}.$$

Next generation the proportion of $A_i$ genes will be

$$p_i' = \frac{\sum_j P_{ij} w_{ij}}{\bar{w}} = \frac{p_i w_i}{\bar{w}}, \tag{5.2.3}$$

where

$$\bar{w} = \sum_i \sum_j P_{ij} w_{ij} = \sum_i p_i w_i \tag{5.2.4}$$

and

$$w_i = \frac{\sum_j P_{ij} w_{ij}}{p_i}. \tag{5.2.5}$$

Hence,

$$\Delta p_i = \frac{p_i (w_i - \bar{w})}{\bar{w}}. \tag{5.2.6}$$
The formula is the same as that for asexual selection, but $w_i$ now has a more complex meaning. In this equation $w_i$ is the average fitness of the $A_i$ allele; more specifically, it is the average fitness of all genotypes containing $A_i$, weighted by the frequency of the genotype and by the number of $A_i$ alleles (1 or 2). Then $\bar{w}$, the average fitness of the population, can be expressed in either of two ways: (1) the average of all the genotypes in the population, and (2) the average fitness of all the genes at this locus. These correspond to the two expressions given in 5.2.4.

Equation 5.2.6 shows that the rate of change of the gene frequency is proportional to:

(1) The gene frequency, $p_i$. Thus a very rare gene will change slowly, regardless of how strongly it is selected.

(2) The average excess in fitness of the $A_i$ allele over the population average. If the excess, $w_i - \bar{w}$, is positive, the allele will increase; if negative, it will decrease. If this is large the frequency of $A_i$ will change rapidly; if small, slowly.
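The one-generation changes given by equations 5.2.1 and 5.2.6 can be sketched in a few lines. This is only an illustration under stated assumptions: genotype frequencies are stored as a symmetric matrix of ordered pairs, so that each heterozygote's total frequency $2P_{ij}$ is split between two entries, and the diploid fitness values are made up for the example. The function names are hypothetical.

```python
def asexual_step(p, w):
    """Equation 5.2.1: new frequencies p_i w_i / w_bar for self-reproducing types."""
    w_bar = sum(pi * wi for pi, wi in zip(p, w))
    return [pi * wi / w_bar for pi, wi in zip(p, w)]

def allele_fitnesses(P, W):
    """Equation 5.2.5: w_i = sum_j P_ij w_ij / p_i, with p_i = sum_j P_ij.

    P[i][j] is the frequency of the ordered genotype A_i A_j, so a
    heterozygote's total frequency 2 P_ij is split between P[i][j] and P[j][i].
    """
    p = [sum(row) for row in P]
    w = [sum(Pij * Wij for Pij, Wij in zip(Prow, Wrow)) / pi
         for Prow, Wrow, pi in zip(P, W, p)]
    return p, w

def diploid_delta_p(P, W):
    """Equation 5.2.6: delta p_i = p_i (w_i - w_bar) / w_bar."""
    p, w = allele_fitnesses(P, W)
    w_bar = sum(pi * wi for pi, wi in zip(p, w))
    return [pi * (wi - w_bar) / w_bar for pi, wi in zip(p, w)]

# The asexual example from the text: w1 = 1.1, w2 = 0.9.
print(asexual_step([0.5, 0.5], [1.1, 0.9])[0])
print(asexual_step([0.1, 0.9], [1.1, 0.9])[0])

# A hypothetical diploid case in random-mating proportions, p1 = 0.3.
p1, p2 = 0.3, 0.7
P = [[p1 * p1, p1 * p2], [p2 * p1, p2 * p2]]
W = [[1.0, 0.9], [0.9, 0.8]]   # made-up genotype fitnesses w11, w12, w22
print(diploid_delta_p(P, W))
```

The changes returned by `diploid_delta_p` sum to zero, as they must when the allele frequencies total 1 before and after selection.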
Notice also that the gene frequency change will be slow when the allele becomes very common ($p_i \to 1$). In this case, $w_i$ and $\bar{w}$ are not very different, since most of the population contains the $A_i$ gene. This point can be brought out by rewriting 5.2.6 in another way:

$$\Delta p_i = \frac{p_i (1 - p_i)(w_i - w_x)}{\bar{w}}, \tag{5.2.7}$$

where $w_x$ is the average fitness of all alleles other than $A_i$. This shows clearly that $\Delta p_i$ approaches 0 as $p_i$ gets near to either 0 or 1.

If the population is in random-mating proportions we can write 5.2.7 in still another way, often used by Wright (e.g., 1949). With random mating $P_{ij} = p_i p_j$ and

$$w_i = \sum_j p_j w_{ij}, \tag{5.2.8}$$
or, representing all alleles that are not $A_i$ as collectively $A_x$ with frequency $p_x = 1 - p_i$,

$$w_i = p_i w_{ii} + p_x w_{ix}, \qquad w_x = p_i w_{xi} + p_x w_{xx}, \tag{5.2.9}$$

where $w_{ix} = w_{xi}$ is the average fitness of all heterozygotes where one allele is $A_i$ and $w_{xx}$ is the average of all genotypes where neither is $A_i$. In this notation, still assuming random-mating proportions,

$$\bar{w} = p_i^2 w_{ii} + 2 p_i (1 - p_i) w_{ix} + (1 - p_i)^2 w_{xx}. \tag{5.2.10}$$
In this formulation $w_{ix}$ and $w_{xx}$ are not constants, but depend on the relative frequencies of the non-$A_i$ alleles (except when there are only two alleles). Notice that the partial derivative of $\bar{w}$ with respect to $p_i$ is

$$\frac{\partial \bar{w}}{\partial p_i} = 2 p_i w_{ii} - 2 p_i w_{ix} + 2(1 - p_i) w_{ix} - 2(1 - p_i) w_{xx}. \tag{5.2.11}$$

The two quantities $w_{ix}$ and $w_{xx}$ are treated as constants in this differentiation. Substituting this into 5.2.7 gives Wright's formula

$$\Delta p_i = \frac{p_i (1 - p_i)}{2 \bar{w}} \frac{\partial \bar{w}}{\partial p_i}. \tag{5.2.12}$$

In analogy with physical theory we can regard $\bar{w}$ as a potential surface, in which case $\partial \bar{w} / \partial p_i$ becomes the slope of the surface with respect to $p_i$. Treating $w_{ix}$ and $w_{xx}$ as constants is equivalent to treating all allele frequencies except $p_i$ as constant fractions of $1 - p_i$ (see 5.2.9). Thus $\partial \bar{w} / \partial p_i$ is the slope of $\bar{w}$ in the direction where the relative frequencies of the other alleles do not change. The gene frequency moves over the surface at a rate proportional to the slope, but governed by the term $p_i(1 - p_i)$. For extensions to populations not in random proportions, see Wright (1942, 1949).

In more complicated situations there may not be a fitness surface, $\bar{w}$, that is a function of the gene frequencies. For example, the fitnesses of the individual genotypes may not be constants, or there may be complications from linkage and epistasis. In such cases the fitness may not even increase at all. For a discussion, see Wright (1942, 1955, 1967) and Moran (1964).
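The equivalence of equation 5.2.6 and Wright's formula 5.2.12 can be checked numerically. The sketch below is an illustration only: it assumes random-mating proportions, uses a made-up symmetric fitness matrix for three alleles, and estimates the slope of $\bar{w}$ by a central difference taken in the direction that keeps the other alleles in fixed proportion to $1 - p_i$. All function names are hypothetical.

```python
def mean_fitness(p, W):
    """w_bar = sum_i sum_j p_i p_j w_ij under random mating."""
    n = len(p)
    return sum(p[i] * p[j] * W[i][j] for i in range(n) for j in range(n))

def delta_p_direct(p, W, i):
    """Equation 5.2.6 with w_i = sum_j p_j w_ij (equation 5.2.8)."""
    w_i = sum(pj * W[i][j] for j, pj in enumerate(p))
    w_bar = mean_fitness(p, W)
    return p[i] * (w_i - w_bar) / w_bar

def shifted(p, i, new_pi):
    """Set p_i to new_pi, keeping the other alleles in fixed proportion to 1 - p_i."""
    scale = (1.0 - new_pi) / (1.0 - p[i])
    return [new_pi if k == i else pk * scale for k, pk in enumerate(p)]

def delta_p_wright(p, W, i, eps=1e-6):
    """Equation 5.2.12, with the slope estimated by a central difference."""
    slope = (mean_fitness(shifted(p, i, p[i] + eps), W)
             - mean_fitness(shifted(p, i, p[i] - eps), W)) / (2 * eps)
    return p[i] * (1 - p[i]) * slope / (2 * mean_fitness(p, W))

# Hypothetical three-allele example; the fitness matrix must be symmetric.
p = [0.2, 0.3, 0.5]
W = [[1.00, 0.95, 0.90],
     [0.95, 0.85, 1.05],
     [0.90, 1.05, 0.80]]
for i in range(3):
    print(i, delta_p_direct(p, W, i), delta_p_wright(p, W, i))
```

Because $\bar{w}$ is quadratic in $p_i$ along this direction, the central difference is exact apart from rounding, so the two columns should agree closely.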
When there are only two alleles, we can use equation 5.2.7 (for example) with $i = 1$ and $x = 2$. We shall now write explicit formulae for the 2-allele situation with random mating in several special cases. The fitnesses and frequencies of the three genotypes are given below:

GENOTYPE     $A_1A_1$    $A_1A_2$     $A_2A_2$
FITNESS      $w_{11}$    $w_{12}$     $w_{22}$
FREQUENCY    $p_1^2$     $2p_1p_2$    $p_2^2$

Then, from 5.2.9 and 5.2.7,

$$w_1 = w_{11} p_1 + w_{12} p_2, \qquad w_2 = w_{12} p_1 + w_{22} p_2,$$

$$\Delta p_1 = \frac{p_1 p_2 [(w_{11} - w_{12}) p_1 + (w_{21} - w_{22}) p_2]}{\bar{w}}. \tag{5.2.13}$$

a. No dominance, $w_{12} = (w_{11} + w_{22})/2$.

$$\Delta p_1 = \frac{p_1 p_2 (w_{11} - w_{22})}{2 \bar{w}} = \frac{s p_1 p_2}{2 \bar{w}}, \tag{5.2.14}$$

where $s = w_{11} - w_{22}$.

b. Dominant favored, $w_{11} = w_{12}$, $A_1$ dominant.

$$\Delta p_1 = \frac{s p_1 p_2^2}{\bar{w}}. \tag{5.2.15}$$

c. Recessive favored, $w_{12} = w_{22}$, $A_1$ recessive.

$$\Delta p_1 = \frac{s p_1^2 p_2}{\bar{w}}. \tag{5.2.16}$$

d. Asexual, haploid, or gametic selection (5.2.2).

$$\Delta p_1 = \frac{s p_1 p_2}{\bar{w}}. \tag{5.2.17}$$
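The four special cases can be evaluated from the general two-allele formula 5.2.13 by choosing the fitnesses accordingly. The following is a minimal sketch; the values of $s$ and $p_1$ are illustrative and not taken from the text, and the less favored homozygote is arbitrarily given fitness 1.

```python
def delta_p1(p1, w11, w12, w22):
    """Equation 5.2.13 for two alleles in random-mating proportions."""
    p2 = 1.0 - p1
    w_bar = p1 * p1 * w11 + 2 * p1 * p2 * w12 + p2 * p2 * w22
    return p1 * p2 * ((w11 - w12) * p1 + (w12 - w22) * p2) / w_bar

s, p1 = 0.01, 0.25          # illustrative values, not from the text
p2 = 1.0 - p1
cases = {
    "a. no dominance":      (1 + s, 1 + s / 2, 1.0),
    "b. dominant favored":  (1 + s, 1 + s,     1.0),
    "c. recessive favored": (1 + s, 1.0,       1.0),
}
for name, (w11, w12, w22) in cases.items():
    print(name, delta_p1(p1, w11, w12, w22))

# Case d (asexual or haploid), equation 5.2.2 with w1 = 1 + s and w2 = 1.
w_bar_haploid = p1 * (1 + s) + p2
print("d. asexual", s * p1 * p2 / w_bar_haploid)
```

Changing `p1` shows how the ranking of cases b and c depends on which allele is common.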
These four formulae illustrate several important facts about selection. One is that asexual selection is more effective than sexual when the whole range of gene frequencies is considered. A diploid sexual population with no dominance evolves half as fast as an asexual population in which the two types differ by the same amount as the two homozygotes in the sexual population. Haploid or gametic selection is equivalent to asexual for a single locus.

Comparison of b and c shows that selection is most effective when both the dominant and recessive genes are of intermediate frequency. In fact, the maximum rate of change is when the recessive allele is twice as frequent as the dominant. Selection becomes decreasingly effective as the recessive gene becomes rare and the $p^2$ term gets closer to 0. Notice, by comparison with Table 4.1.1, that the situation where selection is inefficient is that where the heritability is low, as expected.

A way of expressing partial dominance in a convenient manner is to use the terminology of Wright (1931 and later). We assign fitnesses and symbols to the genotypes as follows:

GENOTYPE              $AA$     $Aa$        $aa$
FREQUENCY             $p^2$    $2pq$       $q^2$
FITNESS (RELATIVE)    1        $1 - hs$    $1 - s$

In these terms,

$$\Delta q = \frac{-spq[q + h(p - q)]}{1 - sq(2hp + q)}. \tag{5.2.18}$$
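Equation 5.2.18 can be checked against a direct calculation from the genotype frequencies and fitnesses in the table above. This is a sketch with illustrative parameter values ($q = 0.01$, $s = 0.1$), chosen here only to show how much a small value of $h$ matters for a rare allele; the function names are hypothetical.

```python
def delta_q(q, s, h):
    """Equation 5.2.18: change in frequency of allele a in one generation.

    Relative fitnesses are 1 (AA), 1 - h*s (Aa) and 1 - s (aa), with
    genotypes in random-mating proportions p^2, 2pq, q^2.
    """
    p = 1.0 - q
    return -s * p * q * (q + h * (p - q)) / (1.0 - s * q * (2 * h * p + q))

def delta_q_from_genotypes(q, s, h):
    """The same change, computed directly from the genotype frequencies."""
    p = 1.0 - q
    w_bar = p * p + 2 * p * q * (1 - h * s) + q * q * (1 - s)
    q_next = (p * q * (1 - h * s) + q * q * (1 - s)) / w_bar
    return q_next - q

q, s = 0.01, 0.1   # illustrative values: a rare, mildly deleterious allele
for h in (0.0, 0.05, 0.5):
    print(h, delta_q(q, s, h), delta_q_from_genotypes(q, s, h))
```

With these values, $h = 0.05$ makes the change per generation nearly six times as large as for a complete recessive ($h = 0$).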
The quantity $s$ often is referred to as the selection coefficient. The quantity $h$ is a measure of dominance. When the $a$ allele is recessive, $h = 0$. When $a$ is dominant, $h = 1$. When $h$ is negative there is overdominance.

This formula brings out the importance of partial dominance when there is selection on a rare gene. If $a$ is completely recessive and $q$ is small the rate of change is proportional to $q^2$. When $q$ is small and $h$ larger than $q$, the corresponding term is $hq$ (since $p$ is very nearly unity). Thus, even a small amount of partial dominance (say, $h = .05$ or less) may make a great deal of difference in the nature of selection involving rare genes. This becomes especially important in the consideration of equilibrium gene frequencies when selection against a recessive genotype is balanced by new mutations; this will be discussed in the next chapter.

As mentioned earlier, the number of interesting special cases is almost endless. Many of them were worked out by Haldane and many more have been done since. We shall at this point mention only two.

It is probable that in nature much of selection is based on fertility. It is likely to be the rule, rather than the exception, that a gene has different fertility effects in the two sexes. Such differences are frequent in mortality rates as well. So, if we are to have a model that is at all realistic, it should take into account the possibility of different selection coefficients in the two sexes. Fortunately, there is an easy solution; the considerations of equation 1.1.4 are applicable, and the combined fitness is the average of the two sexes. Furthermore, it is the simple average, since each individual is derived from one sperm and one egg. Therefore, except for X- and Y-linked genes, the contribution of the two sexes is the same, and the overall fitness of a genotype is the unweighted average of that in the two sexes. This is not exact, however, for the resulting differences in allele frequencies in the two sexes at the time of mating may lead to departures from Hardy-Weinberg proportions (see 2.5.1); but the formulae are usually satisfactory as an approximation.

X- and Y-linked genes cause no particular difficulty. Genes that are on the Y chromosome are found only in the heterogametic sex. The situation is entirely equivalent to asexual or gametic selection where only males (or females if they are XY in this species) are counted.

For an X-linked trait we note that the gene in a male is derived from his mother whereas those in a female are derived equally from the two parents. Therefore the frequency in males is given by the formula applied to the fitness and frequency in females of the previous generation. The gene frequency in females is given by the average for the two sexes in the previous generation. Specifically, if $q_m$ and $q_f$ stand for the frequencies of the gene of interest in males and females, and we use primes to designate the next generation,

$$q_m' = \frac{q_f w_f}{\bar{w}_f} \tag{5.2.19}$$

and

$$q_f' = \frac{1}{2}\left(\frac{q_m w_m}{\bar{w}_m} + \frac{q_f w_f}{\bar{w}_f}\right), \tag{5.2.20}$$

where $\bar{w}_m$ and $\bar{w}_f$ denote the average fitnesses of males and females.
The symbols $w_m$ and $w_f$ stand for the average fitness of the allele in males and females, respectively. As mentioned above, these formulae are not exact because of departures from Hardy-Weinberg proportions; this may become important if selection is very intense.

Throughout this section we have spoken of gene frequency changes rather than changes of genotype frequencies. Formulae can be written for changes in the genotypes directly, but they are much more cumbersome. We effect a great simplification by working directly with the gene frequencies. Furthermore, the zygotic types are put together and taken apart every generation by the Mendelian processes of segregation and recombination, so that a zygote type (when many loci are considered) may never again be reconstituted. For these reasons, almost all of selection theory deals with changes in gene frequencies.

This has a price, however. There is usually some inaccuracy in going from gene frequencies to zygotic frequencies. We need to know something about the mating system. Even if mating is completely at random, there will be departures from Hardy-Weinberg ratios in all stages after mortality begins. We therefore regard the procedures as useful approximations rather than exact formulae. We follow the changes in gene frequencies; then we get the genotype frequencies by the Hardy-Weinberg principle, or some modification thereof to include nonrandom-mating effects. If possible, we count the population at the zygote stage. Fortunately, much selection of evolutionary interest is relatively slow, and the Hardy-Weinberg ratios are very good approximations even for adult populations.

If we try to take into account the various complexities of populations in the real world the formulae naturally become more complicated. One obvious extension of what we have been doing is to consider survival and fertility as separate aspects of fitness. We illustrate with a special case. Consider a locus with two alleles, $A_1$ and $A_2$, and with viabilities and fertilities as given below.

GENOTYPE     $A_1A_1$    $A_1A_2$    $A_2A_2$
VIABILITY    $v_{11}$    $v_{12}$    $v_{22}$
FERTILITY    $f_{11}$    $f_{12}$    $f_{22}$
The total fitness of a genotype will be the product of its viability and fertility. The part of the life cycle in which survival is important (from the standpoint of natural selection) is that prior to reproduction, so we let $v_{ij}$ be the survival to the time of reproduction. We are still assuming that generations are discrete and that matings take place at random among the adults.

If the proportions of the three genotypes are $P_{11}$, $2P_{12}$, and $P_{22}$, and the enumeration is made at the zygote stage, then the combined survival and fertility (or expected number of progeny, crediting half to each parent) of $A_iA_j$ is $v_{ij} f_{ij}$. Letting $v_{ij} f_{ij} = w_{ij}$, the equations of the earlier parts of this section (e.g., 5.2.6) are applicable. Equation 5.2.6 gives the changes in gene frequency, and the proportions of the three zygotic types are $p_1^2$, $2p_1p_2$, and $p_2^2$, where $p_1$ and $p_2$ are the new gene frequencies.

If, on the other hand, the population is enumerated at the adult stage the situation is more complicated. For one thing, the genotypes at this stage are no longer in Hardy-Weinberg ratios. Suppose the population is censused just before reproduction. Then $v_{ij}$ is the viability up to this stage and $f_{ij}$ is the fertility. (Deaths that occur during the reproductive period can be accommodated by regarding them as reducing $f$.) Let $P_{11}$, $2P_{12}$, and $P_{22}$ stand for the proportions of the three genotypes at the stage of enumeration. We can obtain the proportion of zygotes next generation as follows. The $A_1$ genes contributed will be proportional to $P_{11} f_{11} + P_{12} f_{12}$ and the $A_2$ genes proportional to $P_{12} f_{12} + P_{22} f_{22}$. With random mating the three zygotic types will be in the ratio

$$(P_{11} f_{11} + P_{12} f_{12})^2 : 2(P_{11} f_{11} + P_{12} f_{12})(P_{12} f_{12} + P_{22} f_{22}) : (P_{12} f_{12} + P_{22} f_{22})^2.$$

The adult frequencies next generation, indicated by primes, are

$$P_{11}' = \frac{(P_{11} f_{11} + P_{12} f_{12})^2 v_{11}}{K},$$

$$2P_{12}' = \frac{2(P_{11} f_{11} + P_{12} f_{12})(P_{12} f_{12} + P_{22} f_{22}) v_{12}}{K}, \tag{5.2.21}$$

$$P_{22}' = \frac{(P_{12} f_{12} + P_{22} f_{22})^2 v_{22}}{K},$$

where $K$ is the sum of the numerators and is introduced to make the frequencies total 1.

Successive application of these formulae gives the frequencies in later generations, but the equations are no longer simple functions of the allele frequencies. In most such cases and in more complicated ones the only way to get the results is to grind them out generation by generation. Of course high speed computers are a godsend for numerical results.

In some cases a simplifying transformation can be found. One such appears when one class is lethal or sterile (Teissier, 1944; Crow and Chung, 1967; Anderson, 1969). Suppose the $A_2A_2$ class dies before the age of enumeration. Then, since we are interested in relative rather than absolute frequencies, let us arbitrarily choose 1.0 as the viability and fertility of one class. Accordingly, let

$$v_{11} = 1, \quad v_{12} = v, \quad v_{22} = 0, \quad f_{11} = 1, \quad f_{12} = f.$$

Then

$$P_{11}' = \frac{(P_{11} + P_{12} f)^2}{K}, \qquad 2P_{12}' = \frac{2(P_{11} + P_{12} f) P_{12} f v}{K}, \qquad P_{22}' = 0. \tag{5.2.22}$$
Note that the frequency, say $q$, of allele $A_2$ at the stage of enumeration is

$$q = P_{12},$$

since $2P_{12}$ is the proportion of heterozygotes, and $P_{11} + 2P_{12} = 1$. We can simplify things by letting

$$Y = \frac{P_{11}}{2P_{12}} = \frac{1 - 2q}{2q}. \tag{5.2.23}$$
Then the value of $Y$ next generation is

$$Y' = \frac{P_{11}'}{2P_{12}'} = \frac{P_{11} + P_{12} f}{2 P_{12} f v} = \frac{1}{fv} Y + \frac{1}{2v}, \tag{5.2.24}$$

and we can write the simple recurrence relation

$$Y_{t+1} = aY_t + b, \tag{5.2.25}$$

where $a = 1/vf$ and $b = 1/2v$.

This gives the ratio of normal homozygotes to lethal heterozygotes at the adult stage. If we want the lethal-allele frequency, it is given each generation by

$$q_t = \frac{1}{2(1 + Y_t)}. \tag{5.2.26}$$
An approximate expression for $Y_t$ for any $t$ can be obtained by writing an expression for the rate of change of $Y$ and treating this as a differential equation.

$$\Delta Y = Y_{t+1} - Y_t = \frac{1 - vf}{vf} Y_t + \frac{1}{2v} \approx \frac{dY}{dt}.$$

This integrates into

$$Y_t = (Y_0 + C)e^{At} - C, \tag{5.2.27}$$

where

$$A = \frac{1 - vf}{vf}, \qquad C = \frac{f}{2(1 - vf)}.$$
Figure 5.2.1 shows some data from Drosophila population cages. The flies were of three genotypes, +/+ (normal), +/Sb (Stubble bristles), and Sb/Sb, which is lethal in the larval stages and is not observed in the adult population. The data points are from weekly censuses and record the proportion of the Sb gene in the adult population. The data are an average of four populations, from each of which a sample of 200 adults was classified each week. The populations numbered several hundred. The average generation length under these circumstances is estimated to be about 2.5 weeks.

The dotted line is the expected proportion of Sb chromosomes if the gene is completely recessive and the situation is treated as if the discrete model were appropriate (equation 5.1.1). The starting frequency, $q_0$, is taken as 0.3.
Figure 5.2.1. Selection against a recessive lethal gene that produces Stubble bristles when heterozygous. Abscissa: time in generations. Ordinate: Sb gene frequency at the adult stage. The dotted line shows expectation for fully recessive lethal; solid line, 12% disadvantage of heterozygote. (Data from W. Y. Chung.)
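The recurrence 5.2.25, the conversion 5.2.26, and the continuous approximation 5.2.27 are simple enough to evaluate directly. The sketch below is an illustration, not the authors' computation; the function names are hypothetical, and the parameter values are those quoted in the text for the Stubble example, with a starting frequency of 0.30.

```python
import math

def q_from_Y(Y):
    """Equation 5.2.26: lethal-allele frequency among adults."""
    return 1.0 / (2.0 * (1.0 + Y))

def Y_from_q(q):
    """Equation 5.2.23: ratio of normal homozygotes to heterozygotes."""
    return (1.0 - 2.0 * q) / (2.0 * q)

def q_discrete(q0, v, f, generations):
    """Iterate the recurrence 5.2.25, Y_{t+1} = Y_t / (v f) + 1 / (2 v)."""
    Y = Y_from_q(q0)
    for _ in range(generations):
        Y = Y / (v * f) + 1.0 / (2.0 * v)
    return q_from_Y(Y)

def q_continuous(q0, v, f, t):
    """The differential-equation approximation 5.2.27 (requires v * f != 1)."""
    A = (1.0 - v * f) / (v * f)
    C = f / (2.0 * (1.0 - v * f))
    return q_from_Y((Y_from_q(q0) + C) * math.exp(A * t) - C)

# Heterozygote viability v and fertility f relative to the normal homozygote.
for v, f in [(0.88, 1.00), (1.00, 0.88), (0.970, 0.907)]:
    print(v, f, round(q_discrete(0.30, v, f, 10), 3),
          round(q_continuous(0.30, v, f, 10), 3))
```

After 10 generations the discrete recurrence gives about 0.035 when all the heterozygote disadvantage is in viability and about 0.039 when it is all in fertility.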
The solid line is obtained from 5.2.25 and 5.2.26, using $v = .970$, $f = .907$, $fv = .880$, and $Y_0 = .667$ (corresponding to $q_0 = .30$). As can be seen, although the population is changing continuously and the generations overlap, the data fit the expectation very well. The approximation 5.2.27, gotten by treating the process as if continuous, gives results that are almost indistinguishable from 5.2.22.

Furthermore, it is not very important in this example to separate viability from fertility. For example, if we let $v = 0.88$ and $f = 1.00$, starting with a frequency of 0.30 we have after 10 generations a frequency of 0.035; with $v = 1.00$ and $f = 0.88$ we have 0.039, not very different. The product of $v$ and $f$ is more important than either of the components. For a detailed discussion of the practical procedures for measuring $v$ and $f$ in actual populations, see Anderson (1969).

We can summarize this discussion of the time of enumeration by rewriting the formulae, and at the same time extending them to include multiple alleles. We are still assuming random mating.

Enumeration at the zygote stage:

$$P_{ij}' = \frac{\left(\sum_j P_{ij} v_{ij} f_{ij}\right)\left(\sum_i P_{ij} v_{ij} f_{ij}\right)}{C}. \tag{5.2.28}$$

Enumeration at the adult stage:

$$P_{ij}' = \frac{\left(\sum_j P_{ij} f_{ij}\right)\left(\sum_i P_{ij} f_{ij}\right) v_{ij}}{K}. \tag{5.2.29}$$

$C$ and $K$ are chosen to make the frequencies total to 1.
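Note that, for zygote enumeration, $v_{ij} f_{ij} = w_{ij}$ and from 5.2.5

$$p_i w_i = \sum_j P_{ij} v_{ij} f_{ij}. \tag{5.2.30}$$

From the definition of gene frequency (2.1.2),

$$p_i' = \sum_j P_{ij}' = \frac{p_i w_i \sum_j p_j w_j}{C} = \frac{p_i w_i \bar{w}}{C}, \tag{5.2.31}$$

since $\sum p_j w_j = \bar{w}$. If we sum both sides of 5.2.31, and note that $\sum p_i' = 1$ and $\sum p_i w_i = \bar{w}$, we see that $C = \bar{w}^2$, so

$$p_i' = \frac{p_i w_i}{\bar{w}}, \tag{5.2.32}$$

corresponding to 5.2.3.

For adult enumeration, noting that $p_i f_i = \sum_j P_{ij} f_{ij}$,

$$p_i' = \frac{p_i f_i v_i}{K}, \tag{5.2.33}$$

where

$$v_i = \sum_j p_j f_j v_{ij}. \tag{5.2.34}$$

Summing, as before, we discover that $K = \bar{w}$, where

$$\bar{w} = \sum_i p_i f_i v_i. \tag{5.2.35}$$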
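Putting all this together, we have:

ZYGOTE ENUMERATION

$$\Delta p_i = \frac{p_i (w_i - \bar{w})}{\bar{w}}, \qquad \bar{w} = \sum_i p_i w_i$$

ADULT ENUMERATION

$$\Delta p_i = \frac{p_i (v_i f_i - \bar{w})}{\bar{w}}, \qquad \bar{w} = \sum_i \sum_j p_i p_j f_i f_j v_{ij} \tag{5.2.36}$$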
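For adult enumeration with any number of alleles, one generation of equation 5.2.29 can be sketched as below. This is an illustration under stated assumptions: genotype frequencies, fertilities and viabilities are stored as symmetric matrices over ordered genotypes, the function names are hypothetical, and the example reuses the lethal-homozygote case with a heterozygote viability of 0.88 and a starting frequency of 0.30.

```python
def adult_census_step(P, f, v):
    """One generation with enumeration at the adult stage (equation 5.2.29).

    P[i][j] is the frequency of the ordered genotype A_i A_j among adults,
    f[i][j] its fertility and v[i][j] its viability; all are symmetric.
    """
    n = len(P)
    gametes = [sum(P[i][j] * f[i][j] for j in range(n)) for i in range(n)]
    new = [[gametes[i] * gametes[j] * v[i][j] for j in range(n)] for i in range(n)]
    K = sum(map(sum, new))          # normalizing constant
    return [[x / K for x in row] for row in new]

def allele_freqs(P):
    """p_i = sum_j P_ij (equation 2.1.2)."""
    return [sum(row) for row in P]

# Lethal-homozygote example: heterozygote viability 0.88, full fertility.
vh, fh, q0 = 0.88, 1.00, 0.30
P = [[1 - 2 * q0, q0], [q0, 0.0]]   # adults: no A2A2, heterozygotes 2*q0
v = [[1.0, vh], [vh, 0.0]]
f = [[1.0, fh], [fh, 1.0]]          # fertility of the absent A2A2 class is irrelevant
for _ in range(10):
    P = adult_census_step(P, f, v)
print(round(allele_freqs(P)[1], 3))
```

In this two-allele lethal case the general step reduces to the recurrence 5.2.25, so after 10 generations it gives the same frequency of about 0.035.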
In these formulae, $w_i$ is the weighted average of the product of $v_{ij}$ and $f_{ij}$, while $v_i f_i$ is the product of the weighted averages of $v_{ij}$ and $f_{ij}$. The gene frequencies refer, of course, to the frequencies at the time of enumeration.

As we saw in the numerical example, which involved rather strong selection (a lethal homozygote and a heterozygote with about 12% disadvantage), the results were approximated rather well by a model assuming that all the selection is through fertility differences and also by one assuming that all the selection is through viability differences of the heterozygote. Furthermore, a continuous approximation arrived at by treating the change by a differential equation also gave a good approximation to the results. This confirms our intuitive judgment that, unless selection is quite intense, there is not very much difference between the various models.

We shall sometimes use a discontinuous model and sometimes a continuous one, making the choice on the basis of which seems more natural or which is more manageable. We turn now to a discussion of the continuous model.

5.3 Continuous Model with Overlapping Generations
Many populations in nature have births and deaths occurring more or less continuously, with both reproduction and mortality at various ages. Under these circumstances the continuous models (models 2 and 4 of Chapter 1) are more appropriate. We shall develop formulae analogous to those of the preceding section.

Fitness is measured in terms of the Malthusian parameter, $m$. This is the rate of geometric increase such that the contribution of a class to the next generation is proportional to $e^m$. We expect equations of the general type discussed in Section 1.6.

Again, as in the last section, we concentrate on gene frequency changes. Because of mortality selection the Hardy-Weinberg (or other specified) ratios at birth will be changed as each cohort gets older, so there will ordinarily be no stage in which the entire population is in these ratios. One procedure that would at least partially mitigate this difficulty is to enumerate the population at birth, then get the proportion at different ages from life-table information; in other words, use genetic information only to predict the number of each genotype born at a particular instant. However, we are mainly concerned with approximate results. So we assume that the conditions of model 2 are reasonably well met and that we are interested primarily in gene frequency changes.

As in Section 1.2 we let $b$ and $d$ stand for birth and death rates. For example, the genotype $A_iA_j$ would have a probability $b_{ij} \Delta t$ of giving birth and $d_{ij} \Delta t$ of dying during the infinitesimal time interval $\Delta t$. We let $m_{ij} = b_{ij} - d_{ij}$.
Let $2N$ stand for the total number of genes at the $A$ locus in the diploid population and $2n_i$ for the number of $A_i$ alleles. Then $p_i = n_i/N$. During the time interval $\Delta t$ the increase in population number due to the contribution from $A_iA_i$ parents is $N P_{ii} m_{ii} \Delta t$, which is also a measure of increase in $A_i$ genes due to $A_iA_i$ parents, since each parent contributes one $A_i$ gene to each progeny. Likewise for any genotype $A_iA_j$, the increase in $A_i$ genes due to contributions from this genotype is $N P_{ij} m_{ij} \Delta t$ (only half the total frequency of $A_iA_j$ is used since only half the contributed genes are $A_i$). Thus, when $\Delta t$ becomes small,

$$\frac{dn_i}{dt} = \sum_j N P_{ij} m_{ij} = m_i n_i, \tag{5.3.1}$$

$$m_i = \frac{\sum_j N P_{ij} m_{ij}}{n_i} = \frac{\sum_j P_{ij} m_{ij}}{p_i}, \tag{5.3.2}$$

where $m_i$ is the average fitness of gene $A_i$ measured in Malthusian parameters. Likewise,

$$\frac{dN}{dt} = \sum_i \sum_j N P_{ij} m_{ij} = \bar{m} N, \tag{5.3.3}$$

$$\bar{m} = \sum_i \sum_j P_{ij} m_{ij} = \sum_i p_i m_i, \tag{5.3.4}$$

where $\bar{m}$ is the average fitness of the population, again measured in Malthusian parameters. A justification that the arithmetic mean of the individual $m$'s is appropriate was given in equation 1.2.4.

From the ordinary rules of differentiation,

$$\frac{dp_i}{dt} = \frac{d}{dt}\left(\frac{n_i}{N}\right) = \frac{N \dfrac{dn_i}{dt} - n_i \dfrac{dN}{dt}}{N^2} = \frac{N n_i m_i - n_i N \bar{m}}{N^2} = p_i (m_i - \bar{m}), \tag{5.3.5}$$
where $m_i - \bar{m}$ is the average excess in Malthusian parameters of the allele $A_i$.

The similarity of 5.3.5 and 5.2.6 is apparent. The continuous and discontinuous models become more nearly equivalent as the selective differences among the genotypes become smaller. If the $w$'s are regarded as relative fitnesses and one genotype is assigned the value 1, then $\bar{w}$ is very nearly 1. Since $m_{ij} = \log_e w_{ij}$ (cf. Section 1.2), $m_{ij}$ is then approximately $w_{ij} - 1$, and the two equations

$$\frac{dp_i}{dt} = p_i (m_i - \bar{m}) \quad \text{and} \quad \Delta p_i = \frac{p_i (w_i - \bar{w})}{\bar{w}}$$

become nearly equivalent. We shall find that for some purposes one formula is more suitable than the other. Usually the qualitative results are very much the same, so we shall often choose whichever model leads to the simplest results.

For two alleles and random mating, 5.3.5 can be written (in analogy with 5.2.7) approximately for slow selection as

$$\frac{dp_1}{dt} = p_1 p_2 (m_1 - m_2). \tag{5.3.6}$$
This is only approximate because the adults may depart from Hardy-Weinberg ratios. Now, consider the same special cases as before. With two alleles and random mating we can write the approximate formulae

$$m_1 = \frac{m_{11} p_1^2 + m_{12} p_1 p_2}{p_1} = m_{11} p_1 + m_{12} p_2, \qquad m_2 = m_{12} p_1 + m_{22} p_2.$$

Substituting these into 5.3.6 gives

$$\frac{dp_1}{dt} = p_1 p_2 [(m_{11} - m_{12}) p_1 + (m_{21} - m_{22}) p_2]. \tag{5.3.7}$$

a. No dominance, $m_{12} = (m_{11} + m_{22})/2$.

$$\frac{dp_1}{dt} = \frac{s p_1 p_2}{2}, \tag{5.3.8}$$

where $s = m_{11} - m_{22}$.

b. Dominant favored, $m_{11} = m_{12}$, $A_1$ dominant.

$$\frac{dp_1}{dt} = s p_1 p_2^2. \tag{5.3.9}$$

c. Recessive favored, $m_{12} = m_{22}$, $A_1$ recessive.

$$\frac{dp_1}{dt} = s p_1^2 p_2. \tag{5.3.10}$$

d. Asexual, haploid, or gametic selection.

$$\frac{dp_1}{dt} = s p_1 p_2. \tag{5.3.11}$$
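The closeness of the discrete and continuous models for case d can be seen numerically. The sketch below is only an illustration: it iterates equation 5.2.1 for two asexual types and compares the result with the standard closed-form solution of $dp/dt = sp(1 - p)$; the values of $p_0$, $s$ and $t$ are made up, and the function names are hypothetical.

```python
import math

def discrete_asexual(p0, w1, w2, generations):
    """Iterate equation 5.2.1 for two types with constant fitnesses w1 and w2."""
    p = p0
    for _ in range(generations):
        p = p * w1 / (p * w1 + (1.0 - p) * w2)
    return p

def logistic(p0, s, t):
    """Solution of dp/dt = s p (1 - p) (equation 5.3.11) starting from p0."""
    e = math.exp(s * t)
    return p0 * e / (1.0 - p0 + p0 * e)

p0, s, t = 0.01, 0.01, 500     # illustrative values, not from the text

# With w1 / w2 = e^s, i.e. m1 - m2 = s exactly, the two models coincide at
# whole generations; with w1 = 1 + s they differ slightly.
print(logistic(p0, s, t))
print(discrete_asexual(p0, math.exp(s), 1.0, t))
print(discrete_asexual(p0, 1.0 + s, 1.0, t))
```

The exact agreement in the first comparison reflects the relation $m = \log_e w$; the small discrepancy in the second shrinks as $s$ is made smaller.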
The same general observations can be made as were made for the discontinuous model. The relative rates of gene frequency change are as before. Notice that 5.3.11 is the equation of the logistic curve. This is apparent if we note the correspondence of $p_1$ with $N/K$ and $p_2$ with $1 - N/K$ in equation 1.6.2. In integrated form, 5.3.11 becomes

$$p_t = \frac{p_0}{p_0 + (1 - p_0) e^{-st}}, \tag{5.3.12}$$

where $p_t$ is the frequency of the favored gene at time $t$ and $p_0$ is the initial frequency.

We can proceed as we did with equation 5.3.11 and write all the equations in integrated form. For our purposes it is more convenient to write them in the form $t = f(p)$ than with $p$ as a function of time. Integrating and letting $p_0$ stand for the initial proportion (when $t = 0$) and $p_t$ for the value of $p_1$ at time $t$, equations a, b, c, and d become:
a'. No dominance.

$$t = \frac{2}{s} \ln \frac{p_t (1 - p_0)}{p_0 (1 - p_t)}. \tag{5.3.13}$$

b'. Dominant favored.

$$t = \frac{1}{s}\left[\ln \frac{p_t (1 - p_0)}{p_0 (1 - p_t)} + \frac{1}{1 - p_t} - \frac{1}{1 - p_0}\right]. \tag{5.3.14}$$

c'. Recessive favored.

$$t = \frac{1}{s}\left[\ln \frac{p_t (1 - p_0)}{p_0 (1 - p_t)} - \frac{1}{p_t} + \frac{1}{p_0}\right]. \tag{5.3.15}$$

d'. Asexual or haploid.

$$t = \frac{1}{s} \ln \frac{p_t (1 - p_0)}{p_0 (1 - p_t)}. \tag{5.3.16}$$
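The integrated forms 5.3.13 to 5.3.16 translate directly into functions of $p_0$, $p_t$ and $s$. The following is a minimal sketch; the function names are hypothetical, and the frequency ranges used in the loop are chosen only for illustration.

```python
import math

def log_odds_ratio(p0, pt):
    """ln[pt (1 - p0) / (p0 (1 - pt))], the term common to all four cases."""
    return math.log(pt * (1.0 - p0) / (p0 * (1.0 - pt)))

def t_asexual(p0, pt, s):          # equation 5.3.16
    return log_odds_ratio(p0, pt) / s

def t_no_dominance(p0, pt, s):     # equation 5.3.13
    return 2.0 * log_odds_ratio(p0, pt) / s

def t_dominant(p0, pt, s):         # equation 5.3.14
    return (log_odds_ratio(p0, pt) + 1.0 / (1.0 - pt) - 1.0 / (1.0 - p0)) / s

def t_recessive(p0, pt, s):        # equation 5.3.15
    return (log_odds_ratio(p0, pt) - 1.0 / pt + 1.0 / p0) / s

s = 0.001
for name, fn in [("asexual", t_asexual), ("no dominance", t_no_dominance),
                 ("dominant favored", t_dominant), ("recessive favored", t_recessive)]:
    print(name, round(fn(0.01, 0.5, s)), round(fn(0.5, 0.99, s)))
```

With $s = 0.001$, the recessive-favored time from 0.01 to 0.5 comes to about 102,595 generations. The functions also make the symmetry between cases b' and c' explicit: the time for a favored dominant to go from $p_0$ to $p_t$ equals the time for a favored recessive to go from $1 - p_t$ to $1 - p_0$.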
In each case $p$ designates the frequency of the favored gene, and $t$ is the number of generations required to change the frequency from $p_0$ to $p_t$. The value of $s$ is assumed to be small enough that the departure from Hardy-Weinberg proportions does not introduce serious errors.

The reason for writing the equations in this form is apparent. It emphasizes the fact that $t$ is always inversely proportional to $s$. This is true, or approximately true, as long as $s$ is not large enough to appreciably upset the Hardy-Weinberg proportions. It is also true for slow selection with a discrete model. This can be seen from equations 5.2.14 to 5.2.17 by noting that when $s$ is small $\Delta p$ has about the same meaning as $dp/dt$, and $\bar{w}$ is approximately 1.

Table 5.3.1 shows the number of generations required to change the gene or genotype frequency when $s = 0.001$. For any other value, say $s'$, simply divide the numbers in the table by $s'/.001$. For example, with $s' = 0.01$, the times would be only 1/10 as large.

Table 5.3.1. The rate at which gene and genotype frequencies change under selection.

                     from .00001   from .01   from .5    from .99
                     to .01        to .5      to .99     to .99999

NUMBER OF GENERATIONS REQUIRED TO CHANGE GENE FREQUENCY WHEN s = .001

Asexual              6,921         4,592      4,592      6,921
No dominance         13,842        9,184      9,184      13,842
Dominant favored     12,563        5,595      102,595    99,896,918
Recessive favored    99,896,918    102,595    5,595      12,563

NUMBER OF GENERATIONS REQUIRED TO CHANGE THE PROPORTION OF DOMINANT (OR RECESSIVE) PHENOTYPES

Dominant favored     6,920         4,819      11,664     309,780
Recessive favored    309,780       11,664     4,819      6,920
Asexual              6,921         4,592      4,592      6,921

The values in this table are taken mostly from Haldane. For more extensive results and a variety of other cases, see his The Causes of Evolution (1932, 1966).

These equations are written as if $s$ remains constant throughout the entire period of gene frequency change. That this should be strictly true is of course highly unlikely in any real case. But it gives us a general idea of the times involved in evolutionary change and the qualitative effects of dominance and recessivity. Furthermore, as emphasized in Section 1.6, equations of this type apply to changes in components of a population even though the whole population may be increasing, decreasing, or constant and under a variety of regulatory mechanisms. Again we see why it is usually convenient in population genetics to discuss proportions rather than numbers of individuals or of genes.

Obviously, some of the large values are unrealistic. 99,896,918 generations is probably longer than the life of the species, and certainly $s$ is not going to be constant for that length of time. Furthermore, when the frequency of the gene is very near to 0 or to 1, random fluctuations in gene frequency can carry the gene to loss or fixation. A treatment of this problem taking chance factors into account has been given by Ewens (1967d). As might be expected, the values in the table are quite good for moderately large populations in the range of gene frequencies from 0.01 to 0.99, but the numbers at the tails of the distribution are often in serious error even in quite large populations. In particular, the largest values in the table are much too large. Later, in Section 8.9, we shall consider the related problem of the length of time required for fixation of a mutant gene in a finite population.

5.4 The Effects of Linkage and Epistasis
When linkage and epistasis enter the problem, the situation immediately gets complicated. For one thing, whereas the Hardy-Weinberg ratios within each locus are attained within a single generation, gametic phase equilibrium is approached only asymptotically, as was discussed in Section 2.6. So we cannot be as free with the assumption of between-locus equilibrium as within a locus. We can circumvent this to some extent by using the more generalized form of the Hardy-Weinberg principle that we discussed in Section 2.6. This states that the array of zygotic frequencies can be written as the square of the array of gametic frequencies. So we can deal with the problem by treating the chromosome, or the entire gamete, as the unit instead of the gene. We will discuss the amount of linkage disequilibrium that is produced by selection with linkage and epistasis; in fact we shall find that there is "linkage" disequilibrium, even when there is no linkage. We shall also discuss the effect of epistasis on the rate of change by selection.

Many of the problems are still unsolved. It is not difficult to get reasonably good answers when linkage is loose and epistasis is weak. We can also treat the situation with very tight linkage as if the linked genes were a single gene. But the intermediate area, a small amount of crossing over and strong epistasis, is very difficult and no general theory exists. Individual examples have sometimes been worked out by computer. With more than two loci, the situation naturally gets still more complex.
Generation of Gametic Phase Disequilibrium with Epistasis
Assume that there are two loci, each with two alleles. There are therefore four gamete types. For the moment we shall treat the loci as if they were completely linked. We can then regard each chromosome as the formal equivalent of a gene and assign symbols in the same way. It is as if there were a single locus with four alleles.

CHROMOSOME         ab       Ab       aB       AB
FREQUENCY          $p_1$    $p_2$    $p_3$    $p_4$
AVERAGE FITNESS    $m_1$    $m_2$    $m_3$    $m_4$
Fitness is measured in Malthusian parameters; therefore the model is a continuous one. As before, we let $b_{ij}$ stand for the birth rate and $d_{ij}$ stand for the death rate of a particular genotype. The time, $t$, is measured in generations. The genotype Ab/ab would have a frequency $2p_1p_2$. It would have a probability $b_{12}\Delta t$ of giving birth and $d_{12}\Delta t$ of dying during the small time interval $\Delta t$. Then the fitness of Ab/ab is $m_{12} = b_{12} - d_{12}$. In the absence of recombination we can use equation 5.3.5 of the previous section and write (assuming that there are Hardy-Weinberg proportions)
$$\frac{dp_i}{dt} = p_i(m_i - \bar{m}), \tag{5.4.1}$$

where

$$m_i = \sum_j p_j m_{ij}$$

and

$$\bar{m} = \sum_i p_i m_i = \sum_i \sum_j p_i p_j m_{ij}.$$
With crossing over the double heterozygote will contribute some gametes that are different from those it received from its parents. We measure crossing over by $c$, the recombination fraction. Unlinked genes will be treated as the special case when $c = 1/2$. Consider the production of gametes of type ab. With homozygotes and single heterozygotes crossing over makes no difference, so their production of ab gametes is independent of $c$. There are $2p_2p_3$ Ab/aB double heterozygotes and a fraction $c/2$ of their gametes will be ab. There are $2p_1p_4$ AB/ab double heterozygotes and a fraction $(1 - c)/2$ of their gametes will be ab. Putting all this together, the change in the number of ab chromosomes in the time interval $\Delta t$ will be
$$2N\left[p_1p_1b_{11} + p_1p_2b_{12} + p_1p_3b_{13} + p_1p_4b_{14}(1 - c) + p_2p_3b_{23}c - p_1p_1d_{11} - p_1p_2d_{12} - p_1p_3d_{13} - p_1p_4d_{14}\right]\Delta t = 2(Np_1m_1 - NbcD)\Delta t, \tag{5.4.2}$$
where

$$D = p_1p_4 - p_2p_3 \tag{5.4.3}$$

and $b = b_{14} = b_{23}$, on the entirely reasonable assumption that the two kinds of double heterozygotes have the same birth rate. Meanwhile, the change in the number of all four kinds of gametes together in the same time interval is $2N\bar{m}\Delta t$. We then use the same method used in deriving 5.3.5 to get the rate of change in the proportions of the chromosome types, writing $dp_1/dt = d(n_1/N)/dt$, etc. This leads to
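$$\begin{aligned}
\frac{dp_1}{dt} &= p_1(m_1 - \bar{m}) - cbD, \\
\frac{dp_2}{dt} &= p_2(m_2 - \bar{m}) + cbD, \\
\frac{dp_3}{dt} &= p_3(m_3 - \bar{m}) + cbD, \\
\frac{dp_4}{dt} &= p_4(m_4 - \bar{m}) - cbD.
\end{aligned} \tag{5.4.4}$$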
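Equations 5.4.4 are easy to integrate numerically. The following is only a minimal sketch under stated assumptions, not a procedure taken from the text: it uses a plain Euler step, the chromosome ordering ab, Ab, aB, AB implied by the derivation above, and a hypothetical fitness matrix of the form $m_{ij} = u_i + u_j$. The function names, the step size, and the numerical values are all chosen for illustration.

```python
def step(p, m, c, b=1.0, dt=0.01):
    """Advance chromosome frequencies by one Euler step of equations 5.4.4.

    p  : [p1, p2, p3, p4], frequencies of ab, Ab, aB, AB
    m  : 4x4 symmetric list of lists of genotype fitnesses m_ij
    c  : recombination fraction
    b  : birth rate of the double heterozygotes
    dt : step size in generations (an arbitrary choice)
    """
    m_i = [sum(p[j] * m[i][j] for j in range(4)) for i in range(4)]
    m_bar = sum(p[i] * m_i[i] for i in range(4))
    D = p[0] * p[3] - p[1] * p[2]            # equation 5.4.3
    sign = (-1.0, 1.0, 1.0, -1.0)            # sign of the cbD term in 5.4.4
    return [p[i] + dt * (p[i] * (m_i[i] - m_bar) + sign[i] * c * b * D)
            for i in range(4)]


def z_ratio(p):
    """Ratio p1*p4 / (p2*p3) of the two double-heterozygote frequencies."""
    return p[0] * p[3] / (p[1] * p[2])


def run(p, m, c, steps, b=1.0, dt=0.01):
    """Apply `steps` Euler steps and return the final frequencies."""
    for _ in range(steps):
        p = step(p, m, c, b, dt)
    return p


# Hypothetical fitnesses m_ij = u_i + u_j, for which
# m_1 - m_2 - m_3 + m_4 = u_1 - u_2 - u_3 + u_4 = 0.01 at all frequencies.
u = [0.0, 0.02, 0.02, 0.05]
m_epi = [[u[i] + u[j] for j in range(4)] for i in range(4)]
p_end = run([0.25, 0.25, 0.25, 0.25], m_epi, c=0.5, steps=2000)
print(z_ratio(p_end))
```

With all fitnesses equal, the same routine reduces to the familiar decay of $D$ at rate $cb$ per generation, which gives a quick way to check an implementation.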
Equations 5.4.4 were first derived by Kimura (1956). There is another way of measuring departure from linkage equilibrium that is more useful than $D$ for our purposes. Whereas $D$ is the difference in frequency between the two types of heterozygotes, we define $Z$ as the ratio of the two. Thus
$$Z = \frac{p_1p_4}{p_2p_3}, \tag{5.4.5}$$
and the relation between D and Z is given by
$$D = p_2p_3(Z - 1). \tag{5.4.6}$$
$Z$ has the property, first shown by Kimura (1965b), of approaching a nearly constant value when mating is at random and recombination is large relative to the amount of epistasis. Such a slowly moving equilibrium is called quasi-equilibrium. The natural logarithm of $Z$ is

$$\log_e Z = \ln Z = \ln p_1 - \ln p_2 - \ln p_3 + \ln p_4.$$
We are interested in determining what happens to the linkage disequilibrium as selection proceeds. To do this we inquire into the rate of change of $\log Z$ with time.
The time derivative of $\ln Z$ is

$$\frac{1}{Z}\frac{dZ}{dt} = \frac{1}{p_1}\frac{dp_1}{dt} - \frac{1}{p_2}\frac{dp_2}{dt} - \frac{1}{p_3}\frac{dp_3}{dt} + \frac{1}{p_4}\frac{dp_4}{dt}. \tag{5.4.7}$$
Substituting from 5.4.4 and simplifying leads to

$$\frac{1}{Z}\frac{dZ}{dt} = m_1 - m_2 - m_3 + m_4 - cbD\left(\frac{1}{p_1} + \frac{1}{p_2} + \frac{1}{p_3} + \frac{1}{p_4}\right) = E - cbDP, \tag{5.4.8}$$

where

$$E = m_1 - m_2 - m_3 + m_4 \tag{5.4.9}$$

and is a measure of epistasis, and

$$P = \sum_i \frac{1}{p_i}. \tag{5.4.10}$$
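The step from 5.4.7 to 5.4.8 can be checked numerically. The sketch below is illustrative only: the function names are our own, and the frequencies and fitness matrix in the usage lines are arbitrary assumed values. It evaluates the right side of 5.4.7 term by term from 5.4.4 and compares it with $E - cbDP$.

```python
def chromosome_rates(p, m, c, b=1.0):
    """Right-hand sides of 5.4.4, plus the m_i and D used to build them."""
    m_i = [sum(p[j] * m[i][j] for j in range(4)) for i in range(4)]
    m_bar = sum(p[i] * m_i[i] for i in range(4))
    D = p[0] * p[3] - p[1] * p[2]
    sign = (-1.0, 1.0, 1.0, -1.0)
    dp = [p[i] * (m_i[i] - m_bar) + sign[i] * c * b * D for i in range(4)]
    return dp, m_i, D


def dlnz_from_rates(p, m, c, b=1.0):
    """(1/Z) dZ/dt computed term by term, as in 5.4.7."""
    dp, _, _ = chromosome_rates(p, m, c, b)
    return dp[0] / p[0] - dp[1] / p[1] - dp[2] / p[2] + dp[3] / p[3]


def dlnz_closed_form(p, m, c, b=1.0):
    """(1/Z) dZ/dt written as E - cbDP, as in 5.4.8."""
    _, m_i, D = chromosome_rates(p, m, c, b)
    E = m_i[0] - m_i[1] - m_i[2] + m_i[3]    # 5.4.9
    P = sum(1.0 / x for x in p)              # 5.4.10
    return E - c * b * D * P


# Arbitrary assumed frequencies and a symmetric fitness matrix.
p_demo = [0.1, 0.2, 0.3, 0.4]
m_demo = [[0.00, 0.01, 0.02, 0.05],
          [0.01, 0.03, 0.05, 0.04],
          [0.02, 0.05, 0.01, 0.06],
          [0.05, 0.04, 0.06, 0.08]]
difference = dlnz_from_rates(p_demo, m_demo, 0.2) - dlnz_closed_form(p_demo, m_demo, 0.2)
```

The two routes should agree to rounding error for any admissible frequencies, since the $\bar{m}$ terms cancel in 5.4.7.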
If $c$ is larger than $|E|$ (more specifically, if $cb$ is larger than $|E|$, but $b$ is not far from 1 for a population of stable size), then the value of $Z$ tends toward a value which is relatively stable. We start by writing 5.4.8 again and substituting for $D$ from 5.4.6. This gives
$$\frac{1}{Z}\frac{dZ}{dt} = E - cb\,p_2p_3P(Z - 1).$$

Since $0 <$ … 1). Model 5 is the opposite; all $E_i$'s are negative. Therefore it would build up linkage disequilibrium in the opposite direction.
[Figure 5.4.3: $Z$ (vertical axis, 1.00 to 1.30) plotted against time (horizontal axis, 0 to 500) for $c = 0.2$.]

Figure 5.4.3. Another illustration of the fast approach to quasi-linkage equilibrium and the slow change in $Z$ thereafter when epistasis is small and linkage is loose. This is a diploid model with recombination of 0.2 between the loci. The fitness and epistatic parameters are given in the figure.
Models 7 and 9 are mixed and the direction of departure depends on the gene frequencies. Table 5.4.2 shows the change in $Z$ when $E$ is large and linkage is tight.

Table 5.4.2. Changes in chromosome frequencies and gametic phase unbalance ($Z$) in a diploid model with close linkage, $c = .01$. Fitnesses are: $w_{aabb} = 1.10$, $w_{A\text{-}bb} = w_{aaB\text{-}} = .95$, $w_{A\text{-}B\text{-}} = 1.00$.

              CHROMOSOME FREQUENCIES
GENERATION    AB       Ab = aB    ab        Z
 0            .250     .250       .250      1.00
10            .278     .218       .268      1.67
20            .294     .185       .336      2.87
40            .263     .114       .508     10.02
80            .025     .007       .961    483.

Notice that in this case there is no quasi-equilibrium and $Z$ continues to change.
The Effect of Epistasis on the Rate of Natural Selection

Unless linkage is close or epistasis strong (unless $E/c$ is appreciable), the considerations that we have just discussed lead us to expect that epistasis would have rather little effect on selection. Although it is hard to make such a statement quantitative, we can give some indication of the direction of the effect. We shall return to the question of the magnitude of the effect in Section 5.7. To be concrete, assume that the large letter genes generally increase fitness. Thus in Table 5.4.1, fitness tends to increase as we move from upper left to lower right. We have seen that if the $E_i$'s are positive then $Z$ tends to be greater than 1, that is, an excess of ab and AB chromosomes tends to develop. Those chromosomes that have the lowest fitness and those that have the highest, on the average, are the ones that are increased by epistasis. Thus the effect of positive epistasis is to make the population more variable. Since the rate of change by selection increases with the variability of the population, the effect of positive epistasis is to increase the speed of selection. Conversely, if epistasis is negative (the $E_i$'s are negative), the mediocre chromosomes accumulate in excess. The result is a less variable population and slower selection.

Linkage and the Establishment of Beneficial Mutant Combinations

It is sometimes true that two or more genes that are individually deleterious interact to produce a beneficial effect. If these genes are newly arising mutants, and therefore rare, the population is in a troublesome situation. The essential situation is clear with a haploid model, so we shall consider this. Let the original population be ab. Suppose the fitnesses, $W_{ab}$, $W_{Ab}$, $W_{aB}$, and $W_{AB}$ are in the ratio $1 : (1 - s_1) : (1 - s_2) : 1 + t$, where the $s$'s and $t$ are positive. If ab is the prevailing type, Ab and aB will be present in small numbers determined by their mutation rates and the magnitude of the $s$'s. Very infrequently the AB type will arise, either by mutation or by recombination between Ab and aB. This individual will ordinarily mate with the most common type, ab. With free recombination less than half of the progeny will be AB, and unless this type possesses an enormous selective advantage it will not increase in frequency. Only if the fitness of the AB type is great enough to compensate for the loss of AB types through recombination will the genotype increase. However, if for any reason AB becomes common so that many of the matings are with individuals like themselves or with Ab and aB, then the genotype can increase. This type of situation has been extensively discussed by Haldane (1931), Wright (1959), and Bodmer and Parsons (1962). One way in which the rare mutant combination might increase is when there is strong assortative mating. However, it is quite unlikely that the
mutant combination that was favorable for some reason would also happen to predispose the individuals to mate assortatively. The genes might also happen to be linked. If the rare AB individual mates with an ab type, which will usually be the case, the proportion of AB progeny will be proportional to $1 - c$, where $c$, as before, is the recombination fraction between the two loci. However, the AB type will increase from these matings if the extra fitness of the AB type is enough to compensate for this; that is, if $(1 + t)(1 - c) > 1$, or $t > c/(1 - c)$. The conditions are actually a little less stringent, because some of the matings will be with AB, aB, or Ab types. Furthermore there is a small addition by recombination from Ab x aB matings. These do not change the direction of the inequality, however, so we can say that linkage of such intensity as to give a recombination of $t$ or less is sufficient to insure the incorporation of the double mutant. Actually the same algebra works for diploids, where $t$ is the advantage of the double heterozygote over the prevailing type (Bodmer and Parsons, 1962). Whether such mutant combinations are important enough for this to be an important reason for having linkage is doubtful; but it does illustrate one situation where linkage introduces a qualitative change in the outcome, not just a change in rate.

5.5 Fisher's Fundamental Theorem of Natural Selection: Single Locus with Random Mating
Up to this point we have concentrated on the rates at which selection changes the genic and genotypic composition of a population. We are also interested in the way in which quantitative phenotypes are changed by selection. The selection may be natural or man-made. The character may be yield of a cereal, rate of gain in meat livestock, or whatever trait is of interest. In natural selection the trait of greatest significance is fitness, the capacity to survive and reproduce in the existing environment. It is clear that natural selection tends to preserve those genes which, on the average, increase the fitness of their carriers, and therefore to increase the fitness itself. It is also clear that this is happening to all competing species at the same time, so that with increasing time it requires greater and greater intrinsic fitness for a species to survive the steadily increasing competition. We are looking for an expression describing the rate at which fitness is increasing, while realizing that this is not likely to be reflected in increasing rate of population growth because of limitations of the environment and competing species. Intuition tells us that the rate at which selection changes the fitness will be related to the variability of the population. We shall show, as Fisher
(1930) first did, that the rate of increase in fitness is equal to the genic variance in fitness. If equations of the form of 5.3.5 are applicable, the theorem is precise. But we are primarily interested in it as an approximation applicable to a wide variety of population models. We start first with a simple case: a single locus, two alleles, no environmental effect, and random mating.

GENOTYPE       $A_1A_1$                       $A_1A_2$                        $A_2A_2$
FREQUENCY      $p_1^2$                        $2p_1p_2$                       $p_2^2$
FITNESS        $m_{11} = \bar{m} + a_{11}$    $m_{12} = \bar{m} + a_{12}$     $m_{22} = \bar{m} + a_{22}$
GENIC VALUE    $\bar{m} + 2\alpha_1$          $\bar{m} + \alpha_1 + \alpha_2$ $\bar{m} + 2\alpha_2$
The average of the deviations from the mean must be 0; hence, as shown in 4.1.3,

$$2p_1^2\alpha_1 + 2p_1p_2(\alpha_1 + \alpha_2) + 2p_2^2\alpha_2 = 0$$

or, since $p_1 + p_2 = 1$,

$$p_1\alpha_1 + p_2\alpha_2 = 0. \tag{5.5.1}$$
The genic variance was shown earlier (4.1.5) to be

$$V_g = 2(p_1\alpha_1^2 + p_2\alpha_2^2) \tag{5.5.2}$$

with the obvious extension to multiple alleles,

$$V_g = 2\sum_i p_i\alpha_i^2. \tag{5.5.3}$$
The quantity $\alpha_i$ is called the average effect of the allele $A_i$. The term was introduced by Fisher (1930). Alternatively, we can define $\alpha$ as the average effect of substituting $A_1$ for $A_2$, in which case $\alpha = \alpha_1 - \alpha_2$ and

$$V_g = 2p_1p_2\alpha^2. \tag{5.5.4}$$

However, we prefer 5.5.2 because it leads so naturally to an extension to multiple alleles. In the model we are considering,

$$\bar{m} = p_1^2m_{11} + 2p_1p_2m_{12} + p_2^2m_{22}.$$
The rate of change in the average fitness will be

$$\frac{d\bar{m}}{dt} = 2(p_1m_{11} + p_2m_{12})\frac{dp_1}{dt} + 2(p_1m_{12} + p_2m_{22})\frac{dp_2}{dt} = 2m_1\frac{dp_1}{dt} + 2m_2\frac{dp_2}{dt}, \tag{5.5.5}$$
where $m_1$ is the average fitness of the allele $A_1$ and $m_2$ is that of $A_2$ (5.3.7). But from 5.3.5,

$$\frac{dp_1}{dt} = p_1(m_1 - \bar{m}). \tag{5.5.6}$$

Hence

$$\frac{d\bar{m}}{dt} = 2p_1m_1(m_1 - \bar{m}) + 2p_2m_2(m_2 - \bar{m}) = 2p_1(m_1 - \bar{m})^2 + 2p_2(m_2 - \bar{m})^2. \tag{5.5.7}$$

The equivalence of the two expressions above may be shown by expanding the second to give (ignoring the factor 2)

$$p_1(m_1 - \bar{m})(m_1 - \bar{m}) + p_2(m_2 - \bar{m})(m_2 - \bar{m}) = p_1m_1(m_1 - \bar{m}) + p_2m_2(m_2 - \bar{m}) - \bar{m}(p_1m_1 + p_2m_2 - \bar{m}).$$

But, from 5.3.4, $\bar{m} = p_1m_1 + p_2m_2$; so the quantity in the last parentheses is 0. Now, note from 5.3.7 that $m_1 = p_1m_{11} + p_2m_{12}$, and therefore

$$m_1 - \bar{m} = p_1m_{11} + p_2m_{12} - \bar{m} = p_1a_{11} + p_2a_{12} = \alpha_1 \tag{5.5.8}$$

from 4.1.9. Substituting into 5.5.7 gives

$$\frac{d\bar{m}}{dt} = 2p_1\alpha_1^2 + 2p_2\alpha_2^2 = V_g \tag{5.5.9}$$

from 5.5.2. This example illustrates a special case of Fisher's Fundamental Theorem of Natural Selection. In his words: "The rate of increase in fitness of any organism at any time is equal to its genetic variance at that time." Fisher's genetic variance is what we are calling the genic or additive genetic variance. When the genotypes are in random proportions, as in the model we have been discussing, the genic variance in fitness can be written in another form that is sometimes useful. Combining 5.3.5, 5.5.5, and 5.5.7, we can write
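$$V_g = 2\left[\frac{1}{p_1}\left(\frac{dp_1}{dt}\right)^2 + \frac{1}{p_2}\left(\frac{dp_2}{dt}\right)^2\right], \tag{5.5.10}$$

or for multiple alleles,

$$V_g = 2\sum_i \frac{1}{p_i}\left(\frac{dp_i}{dt}\right)^2. \tag{5.5.11}$$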
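The single-locus result can be verified with a few lines of arithmetic. The sketch below is a check under the assumptions of this section (random mating, constant Malthusian fitnesses); the function name and the numerical fitnesses are illustrative choices, not values from the text. It computes $d\bar{m}/dt$ from 5.5.5 and compares it with the genic variance in the forms 5.5.2 and 5.5.10.

```python
def fisher_single_locus(p1, m11, m12, m22):
    """Return (d m_bar/dt, V_g from 5.5.2, V_g from 5.5.10) for one locus."""
    p2 = 1.0 - p1
    m_bar = p1 * p1 * m11 + 2.0 * p1 * p2 * m12 + p2 * p2 * m22
    m1 = p1 * m11 + p2 * m12                     # average fitness of A1 (5.3.7)
    m2 = p1 * m12 + p2 * m22                     # average fitness of A2
    alpha1, alpha2 = m1 - m_bar, m2 - m_bar      # average effects (5.5.8)
    dp1, dp2 = p1 * alpha1, p2 * alpha2          # rates of change (5.5.6)
    dm_bar = 2.0 * m1 * dp1 + 2.0 * m2 * dp2     # 5.5.5
    vg = 2.0 * (p1 * alpha1 ** 2 + p2 * alpha2 ** 2)       # 5.5.2
    vg_rates = 2.0 * (dp1 ** 2 / p1 + dp2 ** 2 / p2)       # 5.5.10
    return dm_bar, vg, vg_rates


# Assumed values: p1 = 0.3 with fitnesses 0.10, 0.05, -0.02.
dm_bar, vg, vg_rates = fisher_single_locus(0.3, 0.10, 0.05, -0.02)
```

All three returned numbers should coincide to rounding error, whatever the degree of dominance, because the dominance deviations do not enter the genic variance.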
The discrete generation analog of 5.5.9, related to 5.2.6 in the same way that 5.3.5 is related to the continuous formula, is

$$\Delta\bar{w} = \frac{V_g}{\bar{w}}\left[1 + \frac{p_1p_2}{2\bar{w}}(w_{11} - 2w_{12} + w_{22})\right]. \tag{5.5.12}$$

This can be derived easily as follows (Li, 1967a, b). We note first that with random mating ($p_2 = 1 - p_1$),

$$\frac{d\bar{w}}{dp_1} = 2(p_1w_{11} + p_2w_{12}) - 2(p_1w_{12} + p_2w_{22}) = 2(\alpha_1 - \alpha_2) = 2\alpha$$

by 5.5.8 when $a_{11} = w_{11} - \bar{w}$, etc.,

$$\frac{d^2\bar{w}}{dp_1^2} = 2(w_{11} - 2w_{12} + w_{22}),$$

and

$$\Delta p_1 = \frac{p_1p_2}{2\bar{w}}\frac{d\bar{w}}{dp_1}$$

from 5.2.13. We now expand $\Delta\bar{w} = \bar{w}' - \bar{w}$ into a Taylor series,

$$\Delta\bar{w} = \Delta p_1\frac{d\bar{w}}{dp_1} + \frac{(\Delta p_1)^2}{2}\frac{d^2\bar{w}}{dp_1^2},$$

which is exact because $\bar{w}$ is a quadratic function of the $p_i$'s and all derivatives beyond the second are 0. Substituting,

$$\Delta\bar{w} = \frac{2p_1p_2\alpha^2}{\bar{w}} + \frac{p_1^2p_2^2\alpha^2}{\bar{w}^2}(w_{11} - 2w_{12} + w_{22}),$$

which, since $V_g = 2p_1p_2\alpha^2$, is 5.5.12, as was to be shown. Unless selection is strong the quantity in brackets is very nearly 1. This is especially true if several loci contribute to the trait and dominance is in different directions, for then the quantity in parentheses is sometimes positive and sometimes negative with considerable cancellation resulting. So, to a good approximation in many instances,

$$\Delta\bar{w} = \frac{V_g}{\bar{w}}, \tag{5.5.13}$$
or perhaps more meaningfully,

$$\frac{\Delta\bar{w}}{\bar{w}} = \frac{V_g}{\bar{w}^2}.$$

That $\Delta\bar{w}$ is always positive may be shown by noting that

$$p_1p_2(w_{11} - 2w_{12} + w_{22}) = p_1w_{11} + p_2w_{22} - p_1^2w_{11} - 2p_1p_2w_{12} - p_2^2w_{22} = w_I - \bar{w},$$

where $w_I$ and $\bar{w}$ are the means of a completely inbred population and a randomly mating population. Substituting this into 5.5.12 gives

$$\Delta\bar{w} = \frac{V_g}{\bar{w}}\left(\frac{\bar{w} + w_I}{2\bar{w}}\right). \tag{5.5.14}$$
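The exactness of the discrete formula is easy to confirm numerically. The following sketch is an illustrative check, with a function name and fitness values of our own choosing. It advances the gene frequency by one generation of random mating and selection, computes $\Delta\bar{w}$ directly, and compares it with 5.5.12, taken here in the form $(V_g/\bar{w})[1 + p_1p_2(w_{11} - 2w_{12} + w_{22})/2\bar{w}]$ that the Taylor expansion yields, and with 5.5.14.

```python
def delta_mean_fitness(p1, w11, w12, w22):
    """Return (direct change in w_bar, value from 5.5.12, value from 5.5.14)."""
    p2 = 1.0 - p1
    w_bar = p1 * p1 * w11 + 2.0 * p1 * p2 * w12 + p2 * p2 * w22
    w1 = p1 * w11 + p2 * w12                  # average fitness of A1
    w2 = p1 * w12 + p2 * w22                  # average fitness of A2

    # One generation of selection: new frequency of A1.
    q1 = p1 * w1 / w_bar
    q2 = 1.0 - q1
    w_bar_next = q1 * q1 * w11 + 2.0 * q1 * q2 * w12 + q2 * q2 * w22
    direct = w_bar_next - w_bar

    alpha = w1 - w2                           # effect of substituting A1 for A2
    vg = 2.0 * p1 * p2 * alpha ** 2           # genic variance (5.5.4)
    bracket = 1.0 + p1 * p2 * (w11 - 2.0 * w12 + w22) / (2.0 * w_bar)
    by_5512 = vg / w_bar * bracket

    w_inbred = p1 * w11 + p2 * w22            # mean of a completely inbred population
    by_5514 = vg / w_bar * (w_bar + w_inbred) / (2.0 * w_bar)
    return direct, by_5512, by_5514


# Assumed values: p1 = 0.3 with fitnesses 1.0, 0.9, 0.7.
direct, by_5512, by_5514 = delta_mean_fitness(0.3, 1.0, 0.9, 0.7)
```

The three values should agree to rounding error, and none is negative when the fitnesses are positive constants.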
Since all the quantities in 5.5.14 are positive, or 0, the fitness must always increase or be at an equilibrium. We shall not extend this to multiple alleles but the conclusion is still correct: the fitness with random mating and constant $w_{ij}$'s can never decrease. For references, see Mulholland and Smith (1959), Scheuer and Mandel (1959), and Kingman (1961, 1961a).

The interpretation of Fisher's theorem has been a matter of considerable discussion. Clearly, if the fitness of a genotype is defined as the intrinsic rate of increase, the average fitness cannot increase indefinitely, as the theorem would seem to say. The population growth rate may be 0, or its size may even decrease. One way in which this might happen is if the linkage relations change through recombination so that less favorable chromosomes increase in frequency despite natural selection. Another is if the fitnesses of individual genotypes change. For example, the fitness of a genotype may depend on its frequency or the frequency of other genotypes, or the environment may be deteriorating so that all genotypes become less fit.

One interpretation of the theorem is to say that it measures the rate of increase of fitness that would occur if the gene frequency changes took place, but nothing else changed. The theorem thus gives the effect of gene frequency changes alone, isolated from the other things that are happening. Fitness becomes a rather abstract quantity that continually increases while the population size is roughly stabilized by the various factors that cause the environment to change. This is what Fisher seems to be saying in at least one passage:

For the majority of organisms, therefore, the physical environment may be regarded as constantly deteriorating, whether the climate, for example, is becoming warmer or cooler, moister or drier, and this will tend, in the majority
of species, constantly to lower the average value of $m$, the Malthusian parameter of the population increase. Probably more important than the changes in climate will be the evolutionary changes in progress in associated organisms. As each organism increases in fitness, so will its enemies and competitors increase in fitness; and this will have the same effect, perhaps in a much more important degree, in impairing the environment, from the point of view of each organism concerned. Against the action of Natural Selection in constantly increasing the fitness of every organism, at a rate equal to the genetic variance in fitness which that population maintains, is to be set off the very considerable item of the deterioration of its inorganic and organic environment.
Alternatively, we can interpret fitness more concretely as the actual rates of change and introduce corrective terms to include the effects of linkage and epistasis, nonrandom mating, changes in the fitnesses of individual genotypes, and the effects of overcrowding and deterioration of the environment. This we shall do in the next section.

5.6 The Fundamental Theorem: Nonrandom Mating and Variable Fitnesses
To derive the principle in its most general form, including the effects of epistasis and multiple alleles, is beyond the scope of this book. However, we shall give some indication of the nature of the extension to more complex situations at the end of this section. Fisher's (1930, 1958) treatment of the subject is recondite. A clearer discussion of the circumstances under which the principle works was given in Fisher (1941). For a straightforward, general derivation, see Kimura (1958). We shall remove the assumptions of random mating and constant fitness for each genotype. However, we shall continue to consider only a single locus with two alleles. We assign values as follows:

GENOTYPE       $A_1A_1$                         $A_1A_2$                                   $A_2A_2$
FREQUENCY      $P_{11}$                         $2P_{12}$                                  $P_{22}$
FITNESS        $m_{11} = \bar{m} + a_{11}$      $m_{12} = \bar{m} + a_{12}$                $m_{22} = \bar{m} + a_{22}$
GENIC VALUE    $G_{11} = \bar{m} + 2\alpha_1$   $G_{12} = \bar{m} + \alpha_1 + \alpha_2$   $G_{22} = \bar{m} + 2\alpha_2$
We follow the same procedure as was used in Sections 4.1 and 4.2. There we measured departure from random-mating proportions by the inbreeding coefficient, $f$. Here we find it more convenient to measure it in another way, by a measure $\theta$ to be defined later. Although the inbreeding coefficient is a natural measure of the effects of inbreeding, random gene frequency drift, and some types of assortative mating, it is not the most natural measure to use when relating fitness change to the genic variance. For the rate of increase
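in fitness to be equal to the genic variance it is necessary, not that $f$ be constant, but that $\theta$ be constant. The phenotypic variance in fitness is

$$V = P_{11}a_{11}^2 + 2P_{12}a_{12}^2 + P_{22}a_{22}^2. \tag{5.6.1}$$

However, we would expect selection to be more closely related to the genic variance, as defined in Section 4.1. In this case it is

$$V_g = 4P_{11}\alpha_1^2 + 2P_{12}(\alpha_1 + \alpha_2)^2 + 4P_{22}\alpha_2^2. \tag{5.6.2}$$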
The $\alpha$'s are to be determined by least squares, following the same procedures as before. We choose the $\alpha$'s so as to minimize the quantity

$$\begin{aligned}
Q &= P_{11}(m_{11} - G_{11})^2 + 2P_{12}(m_{12} - G_{12})^2 + P_{22}(m_{22} - G_{22})^2 \\
  &= P_{11}(a_{11} - 2\alpha_1)^2 + 2P_{12}(a_{12} - \alpha_1 - \alpha_2)^2 + P_{22}(a_{22} - 2\alpha_2)^2.
\end{aligned}$$
To minimize $Q$, we differentiate with respect to $\alpha_1$ and $\alpha_2$ and equate to 0, giving

$$\frac{\partial Q}{\partial\alpha_1} = -4P_{11}(a_{11} - 2\alpha_1) - 4P_{12}(a_{12} - \alpha_1 - \alpha_2) = 0,$$

$$\frac{\partial Q}{\partial\alpha_2} = -4P_{12}(a_{12} - \alpha_1 - \alpha_2) - 4P_{22}(a_{22} - 2\alpha_2) = 0.$$

Rewriting, we obtain
$$P_{11}a_{11} + P_{12}a_{12} = (P_{11} + P_{12})\alpha_1 + P_{11}\alpha_1 + P_{12}\alpha_2, \tag{5.6.3}$$

$$P_{12}a_{12} + P_{22}a_{22} = (P_{12} + P_{22})\alpha_2 + P_{12}\alpha_1 + P_{22}\alpha_2. \tag{5.6.4}$$
But $P_{11}a_{11} + P_{12}a_{12} = p_1a_1$, where $a_1$ is the average excess of the allele $A_1$. Likewise $P_{12}a_{12} + P_{22}a_{22} = p_2a_2$. After making these substitutions in the left sides of the equations above, we then multiply the first by $\alpha_1$ and the second by $\alpha_2$. Adding the two equations we get

$$p_1a_1\alpha_1 + p_2a_2\alpha_2 = 2P_{11}\alpha_1^2 + P_{12}(\alpha_1 + \alpha_2)^2 + 2P_{22}\alpha_2^2.$$

The quantity on the right is half the genic variance (cf. 5.6.2). Therefore

$$V_g = 2p_1a_1\alpha_1 + 2p_2a_2\alpha_2. \tag{5.6.5}$$
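The least-squares calculation can be carried out directly. The sketch below is illustrative: the function name is our own, and the genotype frequencies in the usage lines are generated from an assumed inbreeding coefficient $F$ purely to have a nonrandom-mating example. It solves the two normal equations 5.6.3 and 5.6.4 for the average effects, computes the average excesses, and evaluates the genic variance both as the weighted sum of squared genic deviations and as $2p_1a_1\alpha_1 + 2p_2a_2\alpha_2$.

```python
def average_effects(P11, P12, P22, m11, m12, m22):
    """Solve 5.6.3 and 5.6.4; P12 is half the heterozygote frequency.

    Returns (alpha1, alpha2, a1, a2, Vg_squares, Vg_products).
    """
    m_bar = P11 * m11 + 2.0 * P12 * m12 + P22 * m22
    a11, a12, a22 = m11 - m_bar, m12 - m_bar, m22 - m_bar
    p1, p2 = P11 + P12, P12 + P22              # gene frequencies

    # Normal equations written as a 2x2 linear system in alpha1, alpha2.
    c11, c12 = p1 + P11, P12
    c21, c22 = P12, p2 + P22
    r1 = P11 * a11 + P12 * a12                 # equals p1 * a1
    r2 = P12 * a12 + P22 * a22                 # equals p2 * a2
    det = c11 * c22 - c12 * c21
    alpha1 = (r1 * c22 - c12 * r2) / det       # Cramer's rule
    alpha2 = (c11 * r2 - c21 * r1) / det

    a1, a2 = r1 / p1, r2 / p2                  # average excesses
    vg_squares = (4.0 * P11 * alpha1 ** 2
                  + 2.0 * P12 * (alpha1 + alpha2) ** 2
                  + 4.0 * P22 * alpha2 ** 2)
    vg_products = 2.0 * p1 * a1 * alpha1 + 2.0 * p2 * a2 * alpha2   # 5.6.5
    return alpha1, alpha2, a1, a2, vg_squares, vg_products


# Assumed example: p1 = 0.4 with inbreeding coefficient F = 0.25.
p1, F = 0.4, 0.25
p2 = 1.0 - p1
result = average_effects(p1 * p1 + F * p1 * p2, p1 * p2 * (1.0 - F),
                         p2 * p2 + F * p1 * p2, 1.0, 0.8, 0.5)
```

In this example the two expressions for the genic variance coincide, and each average excess is $(1 + F)$ times the corresponding average effect; with $F = 0$ excess and effect are equal.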
The extension to multiple alleles and to multiple loci is direct, so we can write, in general,

$$V_g = 2\sum\sum p_ia_i\alpha_i, \tag{5.6.6}$$

where the inner sum is over the alleles at a locus and the outer is over the loci. A comparison of 5.6.5 with 5.5.2, which was derived on the assumption of random mating, illustrates the fact that with Hardy-Weinberg ratios there
is no distinction between $a$ and $\alpha$. With inbreeding, $a = \alpha(1 + f)$, where $f$ is the inbreeding coefficient. This is clear from the comparison of 4.1.6 with 4.2.2. The difference between average excess and average effect of a gene is discussed in a different way by Fisher (1958, pp. 34-35). We now proceed to write an expression for the rate of change in fitness,

$$\bar{m} = P_{11}m_{11} + 2P_{12}m_{12} + P_{22}m_{22}. \tag{5.6.7}$$
If the $m$'s are no longer regarded as constant,

$$\frac{d\bar{m}}{dt} = P_{11}\frac{dm_{11}}{dt} + 2P_{12}\frac{dm_{12}}{dt} + P_{22}\frac{dm_{22}}{dt} + m_{11}\frac{dP_{11}}{dt} + 2m_{12}\frac{dP_{12}}{dt} + m_{22}\frac{dP_{22}}{dt}. \tag{5.6.8}$$
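The first three terms to the right of the equality sign are the average of the rate of change of the fitnesses of the individual genotypes weighted by the frequency of each genotype. Thus

$$\overline{\frac{dm}{dt}} = P_{11}\frac{dm_{11}}{dt} + 2P_{12}\frac{dm_{12}}{dt} + P_{22}\frac{dm_{22}}{dt}. \tag{5.6.9}$$

Letting $m_{ij} = \bar{m} + a_{ij}$, the last three terms in 5.6.8 are

$$\bar{m}\left(\frac{dP_{11}}{dt} + 2\frac{dP_{12}}{dt} + \frac{dP_{22}}{dt}\right) + a_{11}\frac{dP_{11}}{dt} + 2a_{12}\frac{dP_{12}}{dt} + a_{22}\frac{dP_{22}}{dt}. \tag{5.6.10}$$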
But the quantity in parentheses must be 0, since it is equal to the total rate of change of the frequency of all genotypes. Because an increase in the proportion of one genotype must always be accompanied by a compensating decrease in the others, the net change is 0. We measure the departure from random-mating proportions by the quantity

$$\theta_{ij} = \frac{P_{ij}}{p_ip_j},$$

which is 1 for Hardy-Weinberg ratios. Substituting $\theta_{ij}p_ip_j$ for $P_{ij}$ in the least-squares equations 5.6.3 and 5.6.4 gives

$$p_1^2\theta_{11}a_{11} + p_1p_2\theta_{12}a_{12} = p_1\alpha_1 + \theta_{11}p_1^2\alpha_1 + \theta_{12}p_1p_2\alpha_2,$$

or

$$p_1\theta_{11}a_{11} + p_2\theta_{12}a_{12} = \alpha_1 + \theta_{11}p_1\alpha_1 + \theta_{12}p_2\alpha_2,$$

and in the same way

$$p_1\theta_{12}a_{12} + p_2\theta_{22}a_{22} = \alpha_2 + \theta_{12}p_1\alpha_1 + \theta_{22}p_2\alpha_2. \tag{5.6.11}$$
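Likewise, since $P_{ij} = \theta_{ij}p_ip_j$,

$$\frac{dP_{11}}{dt} = 2\theta_{11}p_1\frac{dp_1}{dt} + p_1^2\frac{d\theta_{11}}{dt},$$

$$\frac{dP_{12}}{dt} = \theta_{12}\left(p_1\frac{dp_2}{dt} + p_2\frac{dp_1}{dt}\right) + p_1p_2\frac{d\theta_{12}}{dt},$$

$$\frac{dP_{22}}{dt} = 2\theta_{22}p_2\frac{dp_2}{dt} + p_2^2\frac{d\theta_{22}}{dt}.$$

Substituting these into 5.6.8 and making use of equations 5.6.9, 5.6.10, and 5.6.11 leads to

$$\begin{aligned}
\frac{d\bar{m}}{dt} = \overline{\frac{dm}{dt}} &+ 2(\alpha_1 + \theta_{11}p_1\alpha_1 + \theta_{12}p_2\alpha_2)\frac{dp_1}{dt} + 2(\alpha_2 + \theta_{12}p_1\alpha_1 + \theta_{22}p_2\alpha_2)\frac{dp_2}{dt} \\
&+ a_{11}p_1^2\frac{d\theta_{11}}{dt} + 2a_{12}p_1p_2\frac{d\theta_{12}}{dt} + a_{22}p_2^2\frac{d\theta_{22}}{dt}.
\end{aligned} \tag{5.6.12}$$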
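Now

$$\theta_{11}\frac{dp_1}{dt} + \theta_{12}\frac{dp_2}{dt} = -p_1\frac{d\theta_{11}}{dt} - p_2\frac{d\theta_{12}}{dt} \tag{5.6.13}$$

and

$$\theta_{12}\frac{dp_1}{dt} + \theta_{22}\frac{dp_2}{dt} = -p_1\frac{d\theta_{12}}{dt} - p_2\frac{d\theta_{22}}{dt}. \tag{5.6.14}$$

This can be shown by noting that $p_1\theta_{11} + p_2\theta_{12} = 1$, which on differentiation leads to 5.6.13. Substituting 5.6.13 and 5.6.14 into 5.6.12 leads to

$$\frac{d\bar{m}}{dt} = \overline{\frac{dm}{dt}} + 2\alpha_1\frac{dp_1}{dt} + 2\alpha_2\frac{dp_2}{dt} + D_{11}P_{11}\frac{d\phi_{11}}{dt} + 2D_{12}P_{12}\frac{d\phi_{12}}{dt} + D_{22}P_{22}\frac{d\phi_{22}}{dt},$$

where $D_{ij} = a_{ij} - \alpha_i - \alpha_j$, $\theta_{ij}p_ip_j = P_{ij}$, and $\phi_{ij} = \log_e\theta_{ij}$ (i.e., $d\theta/\theta = d\phi$). More concisely,

$$\frac{d\bar{m}}{dt} = 2\alpha_1\frac{dp_1}{dt} + 2\alpha_2\frac{dp_2}{dt} + \overline{\frac{dm}{dt}} + \overline{D\frac{d\phi}{dt}}.$$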
Finally, from 5.3.5,

$$\frac{dp_i}{dt} = p_i(m_i - \bar{m}) = p_ia_i,$$

so that, using 5.6.5,

$$\frac{d\bar{m}}{dt} = V_g + \overline{\frac{dm}{dt}} + \overline{D\frac{d\phi}{dt}}. \tag{5.6.15}$$
Although our derivation has considered only a single locus with two alleles, the formula is readily extended as was shown by Kimura (1958). The three terms on the right can be interpreted as follows:
1. $V_g$ is the genic variance.

2. $\overline{dm/dt}$ is the average rate of change in the fitness of the individual genotypes. If these are constant, as is frequently assumed, this term drops out. In a natural population the environment is continually deteriorating, primarily because of the evolutionary improvement of competing species. This term can be thought of as a measure of such deterioration.

3. The third term measures the effect of gene interaction and departure from Hardy-Weinberg ratios. The quantity $D_{ij}$ is the difference between the fitness of a genotype and its best linear estimate. It is thus a measure of the effect of dominance. The quantity $\phi = \log_e\theta$ is a measure of departure from Hardy-Weinberg proportions, being the log of the ratio of the actual frequency of a genotype to its Hardy-Weinberg expectation.
Therefore, the term $\overline{D\,d\phi/dt}$ will be 0 if there is no dominance or if the genotypes are in Hardy-Weinberg ratios. It will also be true under more general conditions. It is not necessary that $\phi$ be 0, only that its derivative be 0. In other words, there must be a constant amount of departure from random proportions, measured by $\phi$. Numerical examples where the mean fitness decreases because the third term in 5.6.15 (extended to include epistasis) is negative have been given by Kojima and Kelleher (1961) and Kimura (1965). Notice that when the gene frequencies are changing, constant inbreeding coefficient ($f$) is not the same as constant $\phi$. It is readily verified that when $f$ is a constant other than 0, the third term in 5.6.15 is not 0, except in the absence of dominance (Crow and Kimura, 1956; Turner, 1967).

We have mentioned before that, even with random mating, there will generally be departures from random genotypic proportions if there is selection. The Hardy-Weinberg proportions will be found at the zygote stage, but if there is differential mortality this will ordinarily lead to deviations from random expectations. We should not expect this to alter greatly the
fundamental theorem however. If gene frequency changes are slow, the proportions of each genotype in the population will be a constant multiple of the random-mating zygotic proportions; that is to say, the $\theta_{ij}$'s will be approximately constants. Thus the condition $d\log\theta/dt = 0$ will be approximately correct with random mating, so the rate of fitness change is still given approximately by the genic variance.

For discrete generations the formula is similar. With random mating, the equation is

$$\Delta\bar{w} = \frac{V_g}{\bar{w}} + \overline{\Delta w} \tag{5.6.16}$$

approximately. For a general discussion, see Wright (1955) and Li (1967). Somewhat more generally, removing the assumption of random mating, we have

(5.6.17)
For a derivation of this, see Kempthorne (1957, p. 358). The close relationship of this to 5.6.15 can be seen by noting that

$$\overline{D\frac{d\phi}{dt}} = \sum D_{ij}P_{ij}\frac{d}{dt}\log\theta_{ij} = \sum D_{ij}\frac{d}{dt}P_{ij}, \tag{5.6.18}$$
which is equivalent to the middle right term in 5.6.17 for the continuous case.

The extension of 5.6.15 to multiple alleles and multiple loci with epistasis is not difficult in principle, but involves considerable algebra. It is given by Kimura (1958). The equation may be written

$$\frac{d\bar{m}}{dt} = V_g + \overline{\frac{dm}{dt}} + \overline{\epsilon\frac{d\phi}{dt}}, \tag{5.6.19}$$

where $\epsilon$ is a measure of the fitness of a genotype, expressed as a deviation from the linear least-squares expectation, and therefore is a measure of both dominance and epistasis, and $\phi = \log\theta$ is a measure of the departure from random-mating expectations of the frequency of each genotype. An explicit expression for this term with two loci with linkage and epistasis, but with random mating, is given in the next section. It will be shown that with random mating the effects of linkage disequilibrium and epistasis effectively cancel, as long as epistasis is weak and linkage not too close, so that the last term in 5.6.19 becomes unimportant.
There has been considerable discussion in recent years as to whether Fisher's theorem is "exact." It is clear from the derivation as we have given it that the rate of change in fitness is equal to the genic variance if (1) the assumptions of population model 2 apply, that is, equations of the form $dp_i/dt = p_i(m_i - \bar m)$ are appropriate, (2) the genotypic fitness coefficients are constant, and (3) the departures from random-mating proportions, as measured by $\theta$, are constant. However, no model can ever be an exact description of nature. We have not considered such complexities as arise when one pays careful attention to the pattern of deaths at different ages, the various ways in which mating combinations occur, different fitnesses of the genotypes in the two sexes, and such. We are mainly concerned with rather slow selection in which the details are of less influence. Fisher's principle, we believe, summarizes a great deal of biological complexity in a simple and elegant statement that relates population genetics to statistical theory in a very useful way. But of course, as a description of nature it (like all other descriptions) is only an approximation.

The effect of overcrowding can be included in the formulation. We let the actual rate of increase in a population of size $N$ be given by $M$, so that

$$\frac{1}{N}\frac{dN}{dt} = M = \bar m - \psi(N),$$
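where $\psi(N)$ is a function describing the reduction in population growth rate with overcrowding.

$$\begin{aligned}
\frac{dM}{dt} &= \frac{d\bar m}{dt} - \frac{d\psi}{dt} = \frac{d\bar m}{dt} - \frac{d\psi}{dN}\frac{dN}{dt} = \frac{d\bar m}{dt} - MN\frac{d\psi}{dN} \\
&= V_g + \overline{\frac{dm}{dt}} + \overline{D\frac{d\phi}{dt}} - MN\frac{d\psi}{dN},
\end{aligned} \tag{5.6.20}$$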
where $d\psi/dN$ describes the increase in resistance to population growth with overcrowding. Fisher (1930, 1958, p. 46) writes

$$\frac{dM}{dt} = W - D - \frac{M}{C}. \tag{5.6.21}$$

Our formulation is similar. $W$ is $V_g$, $D$ is our $\overline{dm/dt}$, and $1/C$ is equivalent to $N(d\psi/dN)$, treated as a constant.
5.7 The Fundamental Theorem: Effects of Linkage and Epistasis

In most natural populations the assumption of random mating is a good approximation to reality. Furthermore, most pairs of genes are unlinked or loosely linked. In so far as a single locus is concerned, regardless of the number of alleles, the Hardy-Weinberg ratios are attained in a single generation; so this is a reasonable assumption. On the other hand, gametic phase equilibrium is approached only asymptotically, and with epistatic interactions there is permanent "linkage" disequilibrium even for unlinked genes, as was discussed in Section 5.4. For a discussion of this problem, see Bodmer and Parsons (1962), Kojima and Schaffer (1964), Lewontin (1964), Wright (1965), and Felsenstein (1965).

Therefore, it is necessary to consider the effects of linkage disequilibrium when there is epistasis, even when there is random mating. It is not immediately clear what this does to the term $D\,d\phi/dt$ (or rather, the corresponding term in the more general equation when epistasis is included, 5.6.19). However, we shall now demonstrate that there is a remarkable property of randomly mating populations that is true unless linkage is very close or epistasis is very strong. This is that the amount of linkage disequilibrium maintained by selection is just enough to nullify the epistatic contribution to the variance, so that the genic variance remains the correct measure of the rate of change in population fitness.

We are assuming the same model as in Section 5.4, but we shall repeat the basic elements now for convenience. Assume that there are two loci, each with two alleles. There are thus four gamete types. We shall assume that the loci are linked with recombination frequency $c$. If the loci are on independent chromosomes, then we regard this as the special case where $c = .5$. We assign symbols for the chromosomes, frequency, and average fitnesses as follows:

$$\begin{array}{lcccc}
\text{CHROMOSOME (GAMETE) TYPE} & ab & Ab & aB & AB \\
\text{FREQUENCY} & p_1 & p_2 & p_3 & p_4 \\
\text{AVERAGE FITNESS} & m_1 & m_2 & m_3 & m_4
\end{array}$$
Fitnesses are measured in Malthusian parameters. For example $m_1$ is the average fitness of all genotypes containing chromosome ab, weighted by their frequency and the number of ab chromosomes carried (1 or 2). We make use of the gametic or marginal values in Table 5.7.1. All the necessary equations can be expressed in terms of these values because, under random mating, the zygotic frequencies are given by the square of the gametic frequencies. In this way we effect a considerable saving in algebraic manipulations.
We now show that under quasi-linkage equilibrium the Fisher theorem is correct. We would expect this to be the case, for quasi-equilibrium provides a nearly constant departure from gametic phase balance; in other words, the third term on the right side in equation 5.6.19 should disappear. However, we shall demonstrate this explicitly to illustrate how the epistatic-variance component may be isolated.

Table 5.7.1. Frequencies and fitnesses of the various zygotic combinations of two linked loci. The gametic values are given along the margins, and the zygotic values in the main body of the table. Each cell gives the frequency followed by the fitness.

$$\begin{array}{l|cccc}
 & ab & Ab & aB & AB \\ \hline
ab & p_1^2,\; m_{11} & p_1p_2,\; m_{12} & p_1p_3,\; m_{13} & p_1p_4,\; m_{14} \\
Ab & p_2p_1,\; m_{21} & p_2^2,\; m_{22} & p_2p_3,\; m_{23} & p_2p_4,\; m_{24} \\
aB & p_3p_1,\; m_{31} & p_3p_2,\; m_{32} & p_3^2,\; m_{33} & p_3p_4,\; m_{34} \\
AB & p_4p_1,\; m_{41} & p_4p_2,\; m_{42} & p_4p_3,\; m_{43} & p_4^2,\; m_{44} \\ \hline
\text{Totals (frequency)} & p_1 & p_2 & p_3 & p_4 \\
\text{Averages (fitness)} & m_1 & m_2 & m_3 & m_4 \\
\text{Genic value} & m & m+\alpha & m+\beta & m+\alpha+\beta
\end{array}$$

$p_A = p_2 + p_4$, $p_B = p_3 + p_4$ (gene frequencies)
From Table 5.7.1 the mean fitness is

$$\bar m = \sum_i\sum_j p_ip_jm_{ij} = \sum_i p_im_i \tag{5.7.1}$$
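and the average fitness of the $i$th chromosome is

$$m_i = \sum_j p_jm_{ij}. \tag{5.7.2}$$

The rate of change in fitness is

$$\begin{aligned}
\frac{d\bar m}{dt} &= \sum_i\sum_j m_{ij}\left(p_i\frac{dp_j}{dt} + p_j\frac{dp_i}{dt}\right) \\
&= \sum_j\Big(\sum_i p_im_{ij}\Big)\frac{dp_j}{dt} + \sum_i\Big(\sum_j p_jm_{ij}\Big)\frac{dp_i}{dt} \\
&= \sum_j m_j\frac{dp_j}{dt} + \sum_i m_i\frac{dp_i}{dt} \\
&= 2\sum_i m_i\frac{dp_i}{dt},
\end{aligned} \tag{5.7.3}$$

since the two expressions on the right are the same. Substituting into 5.7.3 from 5.4.4 and rearranging gives

$$\frac{1}{2}\frac{d\bar m}{dt} = \sum_i p_im_i(m_i - \bar m) - cbD(m_1 - m_2 - m_3 + m_4). \tag{5.7.4}$$

But, from 5.7.1 and 5.7.2,

$$\sum_i p_im_i(m_i - \bar m) = \sum_i p_im_i^2 - \bar m^2. \tag{5.7.5}$$

Recalling 5.4.9 and using 5.7.5, 5.7.4 becomes

$$\frac{1}{2}\frac{d\bar m}{dt} = \sum_i p_im_i^2 - \bar m^2 - cbDE. \tag{5.7.6}$$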
However, $\sum p_im_i^2 - \bar m^2$ is the variance of the $m_i$'s, which we shall call the marginal or gametic variance, $V_{gam}$. Refer to Table 5.7.1. This includes components from epistasis, but not from dominance. Thus

$$\frac{d\bar m}{dt} = 2(V_{gam} - cbDE). \tag{5.7.7}$$
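A small numerical check may make 5.7.7 more concrete. The following Python sketch is not part of the original text. It evaluates $d\bar m/dt$ directly from 5.7.3 and again from the right side of 5.7.7. It assumes that equations 5.4.4 have the form $dp_i/dt = p_i(m_i - \bar m) \mp cbD$ with $D = p_1p_4 - p_2p_3$ (minus for ab and AB, plus for Ab and aB), which is not restated in this section; the function names are hypothetical and any fitness matrix or frequencies passed in are purely illustrative.

```python
# Sketch only: compares two expressions for the rate of change of mean fitness.
# Chromosome order throughout: ab, Ab, aB, AB (indices 0..3).
# ASSUMPTION: equations 5.4.4 are dp_i/dt = p_i (m_i - mbar) -/+ c*b*D.

def marginal_fitnesses(p, m):
    """m_i = sum_j p_j m_ij (equation 5.7.2)."""
    return [sum(p[j] * m[i][j] for j in range(4)) for i in range(4)]


def mean_fitness(p, m):
    """mbar = sum_i sum_j p_i p_j m_ij (equation 5.7.1)."""
    return sum(p[i] * p[j] * m[i][j] for i in range(4) for j in range(4))


def dp_dt(p, m, cb):
    """Assumed form of 5.4.4; cb is the product c*b."""
    mi = marginal_fitnesses(p, m)
    mbar = mean_fitness(p, m)
    d = p[0] * p[3] - p[1] * p[2]
    sign = (-1.0, 1.0, 1.0, -1.0)
    return [p[i] * (mi[i] - mbar) + sign[i] * cb * d for i in range(4)]


def rate_two_ways(p, m, cb):
    """Return (2 * sum m_i dp_i/dt, 2 * (V_gam - cb*D*E))."""
    mi = marginal_fitnesses(p, m)
    mbar = mean_fitness(p, m)
    dp = dp_dt(p, m, cb)
    direct = 2.0 * sum(mi[i] * dp[i] for i in range(4))            # 5.7.3
    v_gam = sum(p[i] * mi[i] ** 2 for i in range(4)) - mbar ** 2    # gametic variance
    d = p[0] * p[3] - p[1] * p[2]
    e = mi[0] - mi[1] - mi[2] + mi[3]
    formula = 2.0 * (v_gam - cb * d * e)                            # 5.7.7
    return direct, formula
```

For any symmetric matrix of constant genotypic fitnesses the two returned values should agree to rounding error, which is simply the algebra of 5.7.4 through 5.7.6 done numerically.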
We now use the familiar least-squares procedure to estimate the additive component of the gametic variance. The quantity to be minimized is, from Table 5.7.1,

$$Q = p_1(m_1 - m)^2 + p_2(m_2 - m - \alpha)^2 + p_3(m_3 - m - \beta)^2 + p_4(m_4 - m - \alpha - \beta)^2. \tag{5.7.8}$$
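Taking derivatives and equating to 0 to minimize gives:

$$\begin{aligned}
(1)\quad \frac{1}{2}\frac{\partial Q}{\partial m} &= -p_1(m_1 - m) - p_2(m_2 - m - \alpha) - p_3(m_3 - m - \beta) - p_4(m_4 - m - \alpha - \beta) = 0, \\
(2)\quad \frac{1}{2}\frac{\partial Q}{\partial \alpha} &= -p_2(m_2 - m - \alpha) - p_4(m_4 - m - \alpha - \beta) = 0, \\
(3)\quad \frac{1}{2}\frac{\partial Q}{\partial \beta} &= -p_3(m_3 - m - \beta) - p_4(m_4 - m - \alpha - \beta) = 0.
\end{aligned}$$

Let

$$p_1(m_1 - m) = K.$$

Then, after subtracting (3) from (1),

$$-p_2(m_2 - m - \alpha) = K.$$

Likewise, after subtracting (2) from (1),

$$-p_3(m_3 - m - \beta) = K.$$

Therefore, also

$$p_4(m_4 - m - \alpha - \beta) = K. \tag{5.7.9}$$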
Dividing these four equations by $p_1$, $p_2$, $p_3$, and $p_4$, respectively, and adding, we obtain

$$m_1 - m_2 - m_3 + m_4 = K\sum_i\frac{1}{p_i}, \tag{5.7.10}$$

or, using 5.4.9 and 5.4.10,

$$E = KP. \tag{5.7.11}$$

$Q$, which is the nonadditive component of the marginal or gametic variance, is, from 5.7.8 and 5.7.9,

$$Q = K^2\sum_i\frac{1}{p_i} = K^2P. \tag{5.7.12}$$
Thus, $V_{gam} = V_a + K^2P$, where $V_a$ is the additive component of the gametic variance. Substituting into 5.7.7 yields

$$\begin{aligned}
\frac{d\bar m}{dt} &= 2(V_a + K^2P - cbDE) \\
&= 2\left(V_a + \frac{E^2}{P} - cbDE\right) \\
&= 2V_a + \frac{2E}{PZ}\frac{dZ}{dt},
\end{aligned} \tag{5.7.13}$$

from 5.4.8.
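The least-squares partition in 5.7.8 through 5.7.12 can be carried out numerically. The sketch below (Python, with hypothetical function names, not part of the original text) uses the closed-form solution $K = E/P$ to recover the fitted values $m$, $\alpha$, $\beta$ and splits $V_{gam}$ into $V_a$ and $Q = K^2P$. As an illustration it may be applied to marginal fitnesses computed from the genotypic fitnesses quoted for Table 5.7.2, taking the starting frequencies as 0.3, 0.2, 0.3, 0.2 for ab, Ab, aB, AB; that assignment is an assumed reading of the table, and those fitnesses are on the $w$ scale rather than Malthusian parameters, which should matter little when selection is this weak.

```python
# Sketch only: partition of the gametic variance by weighted least squares.
# p, m: chromosome frequencies and marginal fitnesses in the order ab, Ab, aB, AB.

def partition_gametic_variance(p, m):
    """Return (V_gam, V_a, Q, (m_fit, alpha, beta)) using K = E/P (5.7.11)."""
    mbar = sum(pi * mi for pi, mi in zip(p, m))
    v_gam = sum(pi * (mi - mbar) ** 2 for pi, mi in zip(p, m))
    e = m[0] - m[1] - m[2] + m[3]            # epistatic measure E
    big_p = sum(1.0 / pi for pi in p)         # P = sum of 1/p_i
    k = e / big_p
    m_fit = m[0] - k / p[0]                   # from p1 (m1 - m) = K
    alpha = m[1] - m_fit + k / p[1]           # from -p2 (m2 - m - alpha) = K
    beta = m[2] - m_fit + k / p[2]            # from -p3 (m3 - m - beta) = K
    fitted = [m_fit, m_fit + alpha, m_fit + beta, m_fit + alpha + beta]
    q = sum(pi * (mi - fi) ** 2 for pi, mi, fi in zip(p, m, fitted))  # 5.7.8
    return v_gam, v_gam - q, q, (m_fit, alpha, beta)


if __name__ == "__main__":
    # Assumed illustration: marginal fitnesses for ab, Ab, aB, AB worked out by
    # hand from the Table 5.7.2 genotypic fitnesses at the assumed start.
    freqs = [0.3, 0.2, 0.3, 0.2]
    marg = [0.9995, 0.995, 0.991, 1.0]
    v_gam, v_a, q, params = partition_gametic_variance(freqs, marg)
    print("2 V_gam x 1e5 =", 2 * v_gam * 1e5)
    print("2 V_a   x 1e5 =", 2 * v_a * 1e5)
    print("Q       x 1e5 =", q * 1e5)
```

Under these assumptions the doubled gametic variance comes out close to the 2.91 entry for generation 0 in Table 5.7.2, while the additive part is only about a quarter of it.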
We showed earlier (Section 5.4) that with loose linkage ($cb \gg |E|$) $Z$ attains quasi-equilibrium. So, when there is quasi-equilibrium the last term in 5.7.13 vanishes. We can write 5.7.13 in two ways, recalling that the genic variance $V_g$ is twice the additive component of the gametic variance, $V_a$, since we are assuming random mating, and that $2V_{gam} = V_g + \tfrac{1}{2}V_{AA}$ (see 4.1.19).

$$\frac{d\bar m}{dt} = V_g + \frac{2E}{PZ}\frac{dZ}{dt}, \tag{5.7.14}$$

$$\frac{d\bar m}{dt} = V_g + \tfrac{1}{2}V_{AA} - 2cbDE. \tag{5.7.15}$$
The first equation is appropriate when there is quasi-equilibrium and the second is more useful for strong epistasis and tight linkage, although both are correct and in fact fully equivalent. However, under the one circumstance the last term of the first equation tends to 0 and in the other the last term in the second equation does.

We therefore see that with free recombination natural selection operates in such a way that just enough gametic phase disequilibrium is generated ($-cbDE$) to cancel exactly the epistatic variance ($K^2P$). Therefore the rate of change in fitness is given by the genic component of the variance, as the Fisher principle says.

From this, it would appear that parent-offspring correlations do not involve any significant epistatic terms if the population is near quasi-linkage equilibrium. In fact, the assumption of quasi-equilibrium is probably closer to the truth than the conventional simple assumption of gametic phase equilibrium. It is therefore quite possible that the epistatic components of variance that are sometimes added to the expressions for parent-offspring correlations (e.g., Kempthorne, 1957) may be making the expressions less accurate than when only the genic variance is used.

A numerical example is shown in Table 5.7.2. The slow change in linkage disequilibrium, as measured by $Z$, is evident. Notice how, after the first few generations, the change in fitness is given very closely by the genic variance, despite great changes in the genic and the total gametic variance. However, in the very first generation, the change is given more accurately by the total gametic variance (doubled), because the proper level of linkage disequilibrium has not been attained. A similar example, but with less recombination, is shown in Figure 5.7.1. Again, as soon as quasi-equilibrium is attained, the rate of change in fitness is given by the genic variance.
With very tight linkage the rate of change is given more closely by $2V_{gam}$ than by $V_g$. The reason is obvious; with very little crossing over the
Table 5.7.2. The attainment of quasi-equilibrium with random mating and free recombination. The genotypic fitnesses are: aa bb, 1.02; aa B-, .985; A- bb, .99; A- B-, 1.00. The recombination frequency is 1/2. The population starts in gametic phase equilibrium, and disequilibrium is very slowly generated. Discrete generation model. (From Kimura, 1965.)

                                          LINKAGE      CHANGE IN   GENIC       TWICE GAMETIC
             CHROMOSOME FREQUENCIES       DISEQUILI-   FITNESS     VARIANCE    VARIANCE
GENERATION   AB      Ab      aB      ab   BRIUM Z      x 10^5      Vg x 10^5   2Vgam x 10^5
    0        .200    .200    .300    .300   1.000        2.93        0.73         2.91
   10        .201    .204    .291    .304   1.028         .66         .66         2.91
   50        .196    .224    .267    .313   1.029         .46         .45         2.85
  100        .186    .243    .244    .328   1.030         .37         .37         2.98
  200        .146    .259    .210    .385   1.035         .93         .92         4.24
  300        .073    .229    .162    .536   1.049        5.72        5.64        10.09
  400        .005    .085    .050    .860   1.081       15.29       15.42        16.78
  500        .000    .006    .003    .991   1.095        1.68        1.73         1.72
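The kind of calculation behind Table 5.7.2 can be sketched in a few lines. The Python code below is not Kimura's program; it is a minimal sketch that assumes the standard discrete-generation recursion for two loci under random mating, $p_i' = (p_iw_i \mp c\,w_{14}D)/\bar w$ with $D = p_1p_4 - p_2p_3$, applies the genotypic fitnesses quoted in the table caption, and tracks $Z = p_1p_4/(p_2p_3)$. The starting frequencies used in the demonstration (0.3, 0.2, 0.3, 0.2 for ab, Ab, aB, AB) are an assumed reading of the table.

```python
# Sketch only: two-locus selection with recombination, discrete generations.
# Gametes are coded as (A-locus, B-locus) with 1 for the capital allele.
GAMETES = [(0, 0), (1, 0), (0, 1), (1, 1)]   # ab, Ab, aB, AB


def genotype_fitness(g1, g2):
    """Fitnesses from the caption of Table 5.7.2 (A and B fully dominant)."""
    has_a = bool(g1[0] or g2[0])
    has_b = bool(g1[1] or g2[1])
    if has_a and has_b:
        return 1.00    # A- B-
    if has_a:
        return 0.99    # A- bb
    if has_b:
        return 0.985   # aa B-
    return 1.02        # aa bb


def next_generation(p, c):
    """One generation of selection then recombination (assumed standard recursion)."""
    w = [[genotype_fitness(GAMETES[i], GAMETES[j]) for j in range(4)]
         for i in range(4)]
    wi = [sum(p[j] * w[i][j] for j in range(4)) for i in range(4)]
    wbar = sum(p[i] * wi[i] for i in range(4))
    d = p[0] * p[3] - p[1] * p[2]
    w_double_het = w[0][3]               # fitness of the double heterozygote
    sign = (-1.0, 1.0, 1.0, -1.0)        # recombination removes ab and AB when D > 0
    return [(p[i] * wi[i] + sign[i] * c * w_double_het * d) / wbar
            for i in range(4)]


def disequilibrium_z(p):
    """Z = p_ab p_AB / (p_Ab p_aB)."""
    return p[0] * p[3] / (p[1] * p[2])


if __name__ == "__main__":
    freqs = [0.3, 0.2, 0.3, 0.2]         # assumed start: ab, Ab, aB, AB
    for generation in range(1, 11):
        freqs = next_generation(freqs, 0.5)
    print([round(x, 3) for x in freqs], round(disequilibrium_z(freqs), 3))
```

Run from gametic phase equilibrium, $Z$ should rise within a few generations to a value slightly above 1 and then drift only slowly, which is the quasi-equilibrium behavior the table illustrates.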
Figure 5.7.1. The genic variance (lower line) and the genic variance plus half the additive x additive epistatic variance, $V_g + V_{AA}/2$ (upper line). Note that the rate of change in fitness, after the attainment of quasi-equilibrium, follows almost exactly the lower line. The example is the same as Figure 5.4.3. (The plotted range is 0 to 0.00020 on the ordinate and 0 to 550 on the abscissa.)
chromosomes behave as units. Notice, though, that in neither case does the dominance variance contribute to the rate of change in fitness.

Table 5.7.3 shows an example of the latter kind. The data are the same as in Table 5.4.2. Recombination is low ($c = .01$) and epistasis is large, so that $E/c$ is much larger than 1. As is apparent in the table, the doubled gametic variance is a better predictor than the genic. (The doubling is simply because the diploid individual with random mating is derived from two independent gametes.) For several generations this is the best predictor, but later this turns out to be an overestimate. Notice, though, that there is no contribution from the dominance variance to the prediction; it is already an overestimate without dominance.

Table 5.7.3. The same population as was shown in Table 5.4.2, showing that with very low recombination and high epistasis the gametic variance (x 2) is a better predictor of the rate of change in fitness than the genic variance.

             CHANGE IN     GENIC        TWICE GAMETIC
             FITNESS       VARIANCE     VARIANCE
GENERATION   x 10^5        x 10^5       x 10^5
    0          31.84          .00          31.25
   10          38.76         1.46          41.72
   20          53.56         8.79          61.31
   40         144.62        79.88         168.73
   80          82.82        76.65          99.29
  120           1.41         1.35           1.71

To return to the more usual situation (after all, most pairs of genes are not closely linked), we should not conclude that epistasis can have no effect on the rate of fitness change just because this is governed by the genic component. Epistasis, by altering the total variability in the population, can change all components of the variance, including the genic. This change could be in either direction. Wright (1965 and many earlier papers) has emphasized that the most important kind of epistasis is probably the kind that arises from the fact that intermediate values for most metrical traits are optimum from the standpoint of fitness. For example, individuals far from the mean for size in either direction have lower survival and fertility rates. This would produce, generally, epistasis such as to make $E$ negative. Hence the variance would be
reduced. This would then make the population slower to respond to directional selection; so we can say that in general the effect of epistasis on such metrical traits is probably such as to slow somewhat the rate of progress by selection, progress being measured by the rate at which the mean for the trait is changed.

We have considered only that part of the total epistasis that is caused by interaction of pairs of genes. There are of course possible higher order interactions. It is generally believed that for quantitative traits these are not very important. Indeed, for one particular model, studied by all three pioneers (Haldane, Fisher, and Wright), they make no contribution. This is a model in which the fitness decreases in proportion to the square of the distance of the metrical trait from the mean. Wright (1935) showed that with this model all of the epistatic variance is contained in 2-locus interactions and that higher order interactions make no contribution. This is not true for most other models, but this case suggests that the magnitude of the contribution of higher order effects may be small.

One other possibility is being studied currently by several investigators. This is the effect of several linked genes on the chromosome. Even though the outer members may be only loosely linked with each other, there may be associations because they are both linked with intermediate genes. So far there is no general theory for this, although computer simulations suggest that there may be such effects, and that departures from gametic phase balance may thereby be generated (Lewontin, 1964).

Two final points connect this section with earlier discussions. One is that twice the gametic variance, as we have used it here, includes half the additive x additive component of the epistatic variance, $V_{AA}$ (see Section 4.1).
The other is that, because of inevitable departures from Hardy-Weinberg proportions after selection, the genic variance is not exactly the sum of the additive portions of the two gametic variances. However, for the same reason as given before, if gene frequencies are changing slowly the departure from random proportions is roughly constant and the genic variance gives the rate of change in fitness.

Note on Terminology

We have used the word fitness as a synonym for the selective or reproductive value of a genotype. It may be either an absolute value, measured by the number of progeny per parent, or it may be relative to some reference genotype. For a discontinuous model we have used $w$ and for a continuous model, $m$, as measures of fitness. The average fitness, $\bar w$ or $\bar m$, is a function of the individual genotypic fitnesses and the frequencies of the various genotypes, and again may be absolute or relative.

The word is also used more restrictively. In his recent discussions Wright (1955, 1969) used the word fitness for that population function that
increases with time according to Fisher's Fundamental Theorem of Natural Selection. It is not in general the same as $\log_e \bar w$ or $\bar m$. As we have seen, $d\bar m/dt = V_g$ only under certain conditions such as constant individual $m$'s or $w$'s, random mating or constant departure therefrom, loose linkage, and quasi-linkage equilibrium. In fact $d\bar m/dt$ may even be negative. There is value in defining a quantity which always increases and which measures the theoretical evolutionary improvement in a population brought about by gene-frequency change, despite the fact that this is hardly ever experienced as a corresponding increase in population numbers because of such things as competing species, overcrowding, frequency-dependent selective values, and changes in the mating system and linkage relations. Such a quantity, called fitness by Fisher and the fitness function by Wright, always increases at a rate equal to the genic variance. For a discussion of this quantity and the way in which it can be defined, see Wright (1969, p. 121).

5.8 Thresholds and Truncation Selection for a Quantitative Trait
Some situations in nature approach a model in which nothing happens until a certain quantity is attained and then all values above this show the phenomenon. For example, it may be that doses of a drug may be harmless up to a point and that beyond this point they become harmful. There may be in some instances similar kinds of gene action; an effect of some kind appears only when a certain number of harmful genes are present. This has been suggested by Lerner as the basis for some congenital anomalies in chicks.

The existence of sharp thresholds in nature is open to discussion, but it is clear that this model is approached rather closely in some kinds of breeding experiments. All the individuals above a certain level are saved and reproduce; the rest are culled. Since mass directional selection must be very similar to natural selection, we should be able to adapt the theory developed in these chapters to this subject.

We assume that the trait under consideration is determined by a large number of factors, genetic and environmental, the number being large enough that the effect of any one locus is small relative to the total variability. The quantitative trait (e.g., yield, $Y$) is assumed to be normally distributed (see A.5). If the data are not normally distributed they can often be transformed to be approximately normal. For example, one might work with the logs, or the square roots, or some other function of the original measurements that would have a symmetrical distribution approximating the normal.

The breeder saves for reproduction a certain fraction, $S$, of the population. All of these lie above the cutoff or truncation point, $C$, on the abscissa.
This is illustrated in Figure 5.8.1. This method of truncation selection is equivalent to a threshold at point $C$. We assume that all the animals or plants that are saved are equally fertile, or more generally that there is no correlation between fertility and the yield beyond the truncation point. Let $z$ be the ordinate of the normal curve at the truncation point.

Figure 5.8.1. Truncation selection. The individuals in the shaded area are saved for breeding; the ones to the left are culled. (The abscissa is the yield $Y$, with the mean $\bar Y$ and the truncation point $C$ marked.)
Assuming a normal distribution, the frequency $f(Y)$ of individuals with yield $Y$ is given by

$$f(Y) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{(Y - \bar Y)^2}{2\sigma^2}\right], \tag{5.8.1}$$

where $\sigma^2$ ($= V_t$) is the total variance in yield from all causes, genetic and environmental. The proportion saved, $S$, is related to the cutoff point, $C$, by

$$S = \int_C^\infty f(Y)\,dY.$$
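The relation between the proportion saved and the truncation point is the upper-tail area of the normal curve, and it is easy to evaluate. The sketch below (Python standard library only, with hypothetical function names, not part of the original text) computes $S$ for a given cutoff $C$ together with the ordinate $z$ of the normal curve at $C$, assuming the normal density of 5.8.1 with mean $\bar Y$ and standard deviation $\sigma$.

```python
import math

# Sketch only: truncation selection on a normally distributed trait.

def normal_density(y, mean, sd):
    """Ordinate of the normal curve at y (equation 5.8.1)."""
    return math.exp(-((y - mean) ** 2) / (2.0 * sd ** 2)) / (sd * math.sqrt(2.0 * math.pi))


def proportion_saved(cutoff, mean, sd):
    """S = integral of f(Y) from the cutoff C to infinity, via the complementary error function."""
    return 0.5 * math.erfc((cutoff - mean) / (sd * math.sqrt(2.0)))


if __name__ == "__main__":
    mean, sd = 100.0, 10.0               # illustrative values only
    for cutoff in (100.0, 110.0, 120.0):
        print(cutoff, proportion_saved(cutoff, mean, sd), normal_density(cutoff, mean, sd))
```

A cutoff at the mean saves half the population, and a cutoff about 1.96 standard deviations above the mean saves roughly 2.5 percent.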
... 1% dominance has the same kinetics as if it were completely dominant, as the graph shows.

Figure 6.2.1. The equilibrium value of $qhs/u$ as a function of $h$, where $q$ is the mutant-gene frequency, $s$ is its homozygous disadvantage, $hs$ its heterozygous disadvantage, and $u$ the mutation rate. When most of the elimination of mutants is by selection against heterozygotes, $qhs/u \approx 1$ and the values lie along the horizontal line. If eliminations are partly through homozygotes the curve dips down and becomes 0 if all eliminations are this way. (The plotted range is $h$ from 0 to 0.50 on the abscissa and $qhs/u$ from 0 to 1.0 on the ordinate.)
Furthermore, there is strong evidence in Drosophila that $h$ is larger for mutants with small $s$ than when $s$ is large (for a summary and references, see Crow, 1970). In general, the best evidence is that almost all the impact of natural selection on mutant genes is through their heterozygous effects.

If $h$ is large enough that most selection is in the heterozygous state, then the difficulty in finite populations discussed above is no longer so important. With partial dominance the mean frequency of mutants is almost independent
of the effective population number, although the variance is much greater in small populations of course.

4. Nonrandom Mating, Partial Dominance

In this case, let $P_{11} = p^2(1 - f) + pf$, $P_{12} = pq(1 - f)$, and $P_{22} = q^2(1 - f) + qf$. If $h$, $f$, and $u$ are all small enough that products of three of them can be neglected, the solution of 6.2.2 becomes

$$q \approx \frac{u}{s(h + f + q)}. \tag{6.2.11}$$
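Because $q$ appears on both sides of 6.2.11, it is convenient to treat the relation as a quadratic, $q^2 + (h + f)q - u/s = 0$, and take the positive root. The sketch below (Python, hypothetical function name, not part of the original text) does this in a numerically stable form; it solves the approximate relation 6.2.11 itself, not the full equation 6.2.2, and the parameter values in the demonstration are illustrative only.

```python
import math

# Sketch only: positive root of q^2 + (h + f) q - u/s = 0, i.e. relation 6.2.11.

def equilibrium_mutant_frequency(u, s, h=0.0, f=0.0):
    """Approximate equilibrium frequency q of a deleterious mutant."""
    x = u / s
    b = h + f
    # Written as 2x / (b + sqrt(b^2 + 4x)) to avoid cancellation when x << b^2.
    return 2.0 * x / (b + math.sqrt(b * b + 4.0 * x))


if __name__ == "__main__":
    u, s = 1e-5, 1.0                     # illustrative values only
    for h, f in ((0.0, 0.0), (0.02, 0.0), (0.0, 0.02), (0.05, 0.0)):
        print(h, f, equilibrium_mutant_frequency(u, s, h, f))
```

With $h = f = 0$ the root reduces to $\sqrt{u/s}$, and when $h + f$ is large relative to $q$ it approaches $u/[s(h + f)]$, so even a small amount of heterozygous selection or inbreeding lowers the equilibrium frequency considerably.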
If $f = 0$, we have the situation just discussed. If $f > h$ then homozygous selection is most important. If $h > f$ (unless both are very small, or $s$ is very small) most of the selection is in heterozygotes.

Reverse mutation is not likely to be important for two reasons. One is that the rate is usually less. The other is that the mutant gene, especially if $hs$ is large, is very rare so there are few opportunities for reverse mutation.

X-linked Locus

The case of primary interest is that where the mutant is deleterious and recessive. In contrast to the situation for autosomal recessives where a small amount of heterozygous selection may be the most important effect, a small amount of heterozygous expression has only a slight effect on the frequency of an X-linked recessive. The reason is that a single recessive gene is expressed in males and thereby exposed to the full effect of selection. If the fitness of mutant males is $1 - s$, relative to normal males, and heterozygous females are not greatly impaired, then the rate of elimination of mutant genes per generation is about $s/3$, since one-third of the X chromosomes are in males. If this is balanced by new mutations, $u \approx qs/3$ or

$$q \approx \frac{3u}{s} \tag{6.2.12}$$
at equilibrium. As expected, the frequency is higher than for autosomal dominants, but much less than for complete recessives. If $s$ is large the selective value of homozygous females is almost totally irrelevant as far as determination of gene frequencies is concerned. Very few such females exist; with a lethal there would be almost none at all. Likewise, the selection against heterozygous females is unimportant unless it gets to be comparable to $s$ in magnitude.

6.3 Equilibrium Under Mutation Pressure
Ordinarily one regards selection as the strongest force influencing gene frequencies, with mutation providing a steady input of new variability. On the other hand, there is growing evidence that some, perhaps a large fraction,
of DNA changes are nearly enough neutral that mutation rates become an appreciable factor. With such weak forces we should expect the consequences of random gene frequency drift to be important. That we shall consider later in Section 7.2. In this section we shall treat the subject deterministically, as if the population were infinite.

Consider a series of $n$ alleles $A_1, A_2, A_3, \ldots, A_n$. The decrement in frequency of $A_i$ due to mutation from $A_i$ to other alleles is

$$p_i\sum_j u_{ij},$$

where $u_{ij}$ means the rate per generation of mutation from $A_i$ to $A_j$ and $p_i$, as usual, is the frequency of $A_i$. The increment in $A_i$ alleles will be the sum of the mutation rates from other alleles to $A_i$, or

$$\sum_j u_{ji}p_j.$$

Thus the net increase in frequency of $A_i$ per generation is

$$\Delta p_i = -p_i\sum_j u_{ij} + \sum_j u_{ji}p_j, \tag{6.3.1}$$
as given by Wright (1949, for example). The equilibrium is obtained by setting $\Delta p_i = 0$ for each value of $i$ and solving the series of simultaneous equations. For two alleles, $p_2 = 1 - p_1$ and we have

$$\Delta p_1 = -p_1u_{12} + (1 - p_1)u_{21},$$

and when $\Delta p_1 = 0$,

$$p_1 = \frac{u_{21}}{u_{12} + u_{21}} = \frac{v}{u + v}, \tag{6.3.2}$$

where $u$ and $v$ are the mutation rates from and to allele $A_1$. For three alleles we have the equilibrium equations

$$\begin{aligned}
-(u_{12} + u_{13})p_1 + u_{21}p_2 + u_{31}p_3 &= 0, \\
u_{12}p_1 - (u_{21} + u_{23})p_2 + u_{32}p_3 &= 0, \\
u_{13}p_1 + u_{23}p_2 - (u_{31} + u_{32})p_3 &= 0.
\end{aligned} \tag{6.3.3}$$

These three equations are not independent, since $p_1 + p_2 + p_3 = 1$. We can reduce this to two independent equations by replacing $p_3$ by $1 - p_1 - p_2$ in two of the equations and solving. A convenient way of getting the solution is to write the matrix of the coefficients

$$A = \begin{pmatrix}
-(u_{12} + u_{13}) & u_{21} & u_{31} \\
u_{12} & -(u_{21} + u_{23}) & u_{32} \\
u_{13} & u_{23} & -(u_{31} + u_{32})
\end{pmatrix}. \tag{6.3.4}$$
Then

$$p_1 = \frac{\Delta_{11}}{D}, \qquad p_2 = \frac{\Delta_{22}}{D}, \qquad p_3 = \frac{\Delta_{33}}{D}, \tag{6.3.5}$$

where $\Delta_{11}$ is the 2 x 2 determinant obtained by deleting row 1 and column 1 from $A$. Likewise $\Delta_{22}$ is gotten by deleting the second row and column, and so on. $D$, the denominator, is the sum of the three numerators. $D$ is also the value of the determinant obtained by replacing any row in $A$ by 1's. (Some simple rules for evaluation of determinants are given in A.7.) The extension to four or more alleles is straightforward. The determinant is larger, but the rules are the same.

With typically low mutation rates the approach to equilibrium is very slow. For example, with two alleles,

$$\Delta p_1 = -u_{12}p_1 + u_{21}p_2. \tag{6.3.6}$$

Now let

$$p_1 = \hat p_1 + \epsilon, \qquad p_2 = \hat p_2 - \epsilon,$$

where $\hat p_1$ and $\hat p_2$ are equilibrium values. Then substituting into 6.3.6 we get

$$\Delta\epsilon = \Delta p_1 = -(u_{12} + u_{21})\epsilon - u_{12}\hat p_1 + u_{21}\hat p_2.$$

But the last two terms add up to 0 (see 6.3.2), so

$$-\frac{\Delta\epsilon}{\epsilon} = u_{12} + u_{21} = \sum u. \tag{6.3.7}$$
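The determinant rule of 6.3.5 and the rate of approach in 6.3.7 can both be checked numerically. The Python sketch below is not part of the original text and its function names are hypothetical. It builds the coefficient matrix of 6.3.4 from a table of mutation rates `u[i][j]` (rate from allele $i$ to allele $j$, indices starting at 0), forms the three 2 x 2 principal minors, and returns the equilibrium frequencies. It also gives the number of generations needed to halve a deviation that shrinks by the factor $1 - \sum u$ each generation. The rates in the demonstration are illustrative only.

```python
import math

# Sketch only: mutational equilibrium for three alleles and the rate of approach.

def net_change(p, u):
    """Delta p_i of equation 6.3.1 for every allele."""
    n = len(p)
    return [-p[i] * sum(u[i][j] for j in range(n) if j != i)
            + sum(u[j][i] * p[j] for j in range(n) if j != i)
            for i in range(n)]


def equilibrium_three_alleles(u):
    """p_i = Delta_ii / D, with Delta_ii the principal 2 x 2 minors of A (6.3.5)."""
    a = [[-(u[0][1] + u[0][2]), u[1][0], u[2][0]],
         [u[0][1], -(u[1][0] + u[1][2]), u[2][1]],
         [u[0][2], u[1][2], -(u[2][0] + u[2][1])]]

    def principal_minor(k):
        r, c = [i for i in range(3) if i != k]
        return a[r][r] * a[c][c] - a[r][c] * a[c][r]

    minors = [principal_minor(k) for k in range(3)]
    total = sum(minors)                  # the denominator D
    return [x / total for x in minors]


def generations_to_halve(total_rate):
    """Generations for a deviation to fall to one-half at rate sum(u) per generation."""
    return math.log(0.5) / math.log(1.0 - total_rate)


if __name__ == "__main__":
    rates = [[0.0, 2e-5, 1e-5],          # illustrative rates only
             [4e-6, 0.0, 3e-6],
             [1e-6, 5e-6, 0.0]]
    p_hat = equilibrium_three_alleles(rates)
    print(p_hat, net_change(p_hat, rates))
    print(generations_to_halve(1e-5))
```

At the returned frequencies every $\Delta p_i$ should vanish to rounding error, and for a total rate of $10^{-5}$ the halving time is a little under 70,000 generations.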
The approach to equilibrium is at a rate equal to the total mutation rate. If $\sum u = 10^{-5}$, about 70,000 generations are required to go halfway to equilibrium. So, to nobody's surprise, we learn that the approach to mutational equilibrium is a very slow process.

6.4 Mutation and Selection with Multiple Alleles
When discussing the equilibrium between mutation and selection we considered only two alleles and ignored reverse mutation. It is expected that the wild-type gene can mutate to many different states, as we have discussed in the previous section. If the mutant states are individually rare and are selected
mainly as heterozygotes then there will be little chance for interaction, since hardly ever will more than one of a set of mutant alleles occur in the same individual. The total frequency will be governed mainly by the total mutation rate from the normal to all mutant alleles. Experience with Drosophila mutants shows that the combinations of two recessive mutants with visible effects are often intermediate between the two homozygous mutants. In this case there will be little interaction. So we suspect that in most cases multiple allelism doesn't change the picture much.

We shall consider the question only briefly. For this it will be expedient to adopt a continuous model and assume that the gene frequency change can be written in the form

$$\frac{dp_i}{dt} = p_i(m_i - \bar m) - \sum_j u_{ij}p_i + \sum_j u_{ji}p_j. \tag{6.4.1}$$

We get this by combining 5.3.5 with 6.3.1. For reasons mentioned in Section 6.2 reverse mutation can be ignored with little loss of accuracy if the deleterious effects of the mutants are large relative to the mutation rates. If the last term is 0, by setting $dp_0/dt = 0$ for equilibrium we obtain the pleasingly simple result

$$m_0 - \bar m = \sum_j u_{0j}, \tag{6.4.2}$$
where the subscript 0 designates the normal allele (which, of course, may be a population of indistinguishable isoalleles) and the $j$ indicates any mutant allele. In words: At equilibrium the average excess in fitness of the wild-type allele is equal to its total rate of mutation to all mutant alleles. This depends on the fitness being measured in Malthusian parameters. It also assumes that reverse mutation to the wild-type allele can be neglected. However, no restriction is placed on the number of alleles, on the mating system, or on the rate of mutation from one to another mutant allele. We shall use this principle to find the equilibrium frequencies of mutant genes.

1. Fully Recessive Mutants
We place no restriction on the interaction between mutant alleles, but all are recessive to the wild type, $A_0$. For algebraic simplicity we set the fitness of the wild type at 0, measured in Malthusian parameters. Let the fitness of the mutant type $A_iA_j$ be $-s_{ij}$. Then from 6.4.2,

$$\sum_i\sum_j P_{ij}s_{ij} = \sum_i u_{0i}, \tag{6.4.3}$$

where the summation extends over the $n - 1$ mutant alleles. The average reduction in fitness caused by mutant phenotypes (homozygous mutants and combinations of two mutants) is

$$\bar s = \frac{\sum_i\sum_j P_{ij}s_{ij}}{\sum_i\sum_j P_{ij}}, \tag{6.4.4}$$

from which

$$\sum_i\sum_j P_{ij} = \frac{\sum_i u_{0i}}{\bar s}. \tag{6.4.5}$$

This says that the total frequency of mutant phenotypes is the ratio of the total mutation rate (forward) divided by the (weighted) average selective disadvantage of the mutant phenotypes. The fitness is measured in Malthusian parameters. For two alleles the discrete generation analogy is 6.2.3.

2. Partially or Completely Dominant Mutants
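Once again we let the wild-type ($A_0A_0$) fitness be 0. The fitness of the mutant homozygote $A_iA_i$ is $-s_{ii}$, of the mutant heterozygote $A_iA_j$ is $-s_{ij}$, and of the normal-mutant heterozygote $A_0A_i$ is $-h_is_{ii}$. From 6.4.2,

$$m_0 - \bar m = \sum_i u_{0i}. \tag{6.4.6}$$

But,

$$m_0 = -\frac{1}{p_0}\sum_i P_{0i}h_is_{ii}, \qquad \bar m = -2\sum_i P_{0i}h_is_{ii} - \sum_i\sum_j P_{ij}s_{ij},$$

where, as before, the summation is over all mutant alleles. If there is appreciable dominance the last term is very small relative to the others. Omitting this, and noting that

$$\overline{hs} = \frac{\sum_i P_{0i}h_is_{ii}}{\sum_i P_{0i}},$$

we obtain

$$\sum_i P_{0i} = \frac{\sum_i u_{0i}}{\overline{hs}} \times \frac{p_0}{2p_0 - 1} \approx \frac{\sum_i u_{0i}}{\overline{hs}}. \tag{6.4.7}$$

With Hardy-Weinberg proportions $P_{0i} = p_0p_i$, and

$$\sum_i p_i \approx \frac{\sum_i u_{0i}}{\overline{hs}}. \tag{6.4.8}$$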
So, as a first approximation based on strong selection against heterozygotes, the total frequency of mutant alleles is the total forward mutation rate divided by the average selection against heterozygotes. With weaker selection the situation is more complex, and we shall go no further with it.

6.5 Selection and Migration
The subject of migration and population structure has become very extensive in recent years. Various models of structure ranging from a continuous population with limited dispersal to completely isolated islands have been studied. We shall consider only one model, the island model of Sewall Wright (see, for example, his 1951 paper). Later, in Sections 9.2 and 9.9, this and other models will be treated stochastically.

The species is thought of as broken into a number of subpopulations, largely isolated from each other, but with some exchange of migrants. We let $p_i$ be the frequency of an allele of interest in the $i$th subpopulation, and $\bar p$ be the frequency in the whole population.

In Wright's model a fraction $M$ of the population are replaced by migrants each generation. The migrants are assumed to have a gene frequency equal to that for the whole population, $\bar p$. The change in allele frequency in the $i$th subpopulation per generation is

$$\Delta p_i = -Mp_i + M\bar p = -M(p_i - \bar p). \tag{6.5.1}$$
Since on this model the effect of migration is linear in the gene frequencies, there should be a correspondence with mutation, which is also linear. If we rewrite 6.5.1,

$$\begin{aligned}
\Delta p_i &= -Mp_i + M\bar p - Mp_i\bar p + Mp_i\bar p \\
&= -M(1 - \bar p)p_i + M\bar p(1 - p_i),
\end{aligned} \tag{6.5.2}$$
we see that this corresponds in form to

$$\Delta p_1 = -u_{12}p_1 + u_{21}(1 - p_1). \tag{6.5.3}$$

So we can carry the same equations from mutation to migration by setting

$$M(1 - \bar p) = u_{12} = u, \qquad M\bar p = u_{21} = v. \tag{6.5.4}$$
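The correspondence in 6.5.4 is a one-line identity, and a short check makes it explicit. The Python sketch below (hypothetical function names, illustrative numbers, not part of the original text) computes the change from migration by 6.5.1 and the change from two-allele mutation by 6.5.3, with the mutation rates set as in 6.5.4.

```python
# Sketch only: island-model migration rewritten as two-allele mutation.

def change_by_migration(p_i, p_bar, big_m):
    """Delta p_i = -M (p_i - p_bar), equation 6.5.1."""
    return -big_m * (p_i - p_bar)


def change_by_mutation(p_1, u12, u21):
    """Delta p_1 = -u12 p_1 + u21 (1 - p_1), equation 6.5.3."""
    return -u12 * p_1 + u21 * (1.0 - p_1)


def equivalent_mutation_rates(p_bar, big_m):
    """u = M (1 - p_bar) and v = M p_bar, equation 6.5.4."""
    return big_m * (1.0 - p_bar), big_m * p_bar


if __name__ == "__main__":
    p_i, p_bar, big_m = 0.8, 0.3, 0.02   # illustrative values only
    u, v = equivalent_mutation_rates(p_bar, big_m)
    print(change_by_migration(p_i, p_bar, big_m), change_by_mutation(p_i, u, v))
```

The equilibrium $v/(u + v)$ of 6.3.2 then equals $\bar p$, as it should: with migration alone every subpopulation converges on the frequency of the whole population.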
We shall use this correspondence in Section 9.2. In analogy with 6.4.1 we may regard selection and migration both as continuous processes and write

$$\frac{dp_i}{dt} = p_i(m_i - \bar m) - M(p_i - \bar p). \tag{6.5.5}$$
We must keep straight that $m_i$ is the average fitness of the allele of interest in the $i$th subpopulation and $\bar{m}$ is the mean fitness of that same subpopulation, whereas $\bar{p}$ is the average allele frequency in the entire population. We are using $m$ for the Malthusian parameter of selection and $M$ for migration.

It is to be expected that the immigrants coming into a subpopulation from the outside will be less well adapted to the local conditions, since they have not had the benefit of previous selection in the local environment. This is quite analogous to an input of different genes by mutation (although the migrant genes have been pretested in an environment that is not wholly unrelated, and hence are likely to be less deleterious than new mutants).

To find equilibrium conditions we can equate 6.5.5 to 0. In general the selection term will be a cubic function of the gene frequencies. In the absence of dominance the expression is only quadratic, so we shall consider this case. Using 5.3.8 and equating $dp/dt$ to 0 we have at equilibrium

$$sp(1 - p) - M(p - \bar{p}) = 0, \tag{6.5.6}$$

where, dropping the subscripts, $p$ is the frequency of the gene (say $A$) that is locally favored, $s$ is the selective advantage of $AA$, and $s/2$ that of $Aa$, both measured in Malthusian parameters. The relevant solution is

$$p = \frac{s - M + \sqrt{(M - s)^2 + 4Ms\bar{p}}}{2s}. \tag{6.5.7}$$
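The solution 6.5.7 is easy to check numerically. The following is a minimal Python sketch, not part of the original treatment; the function name and the parameter values are illustrative assumptions. It evaluates 6.5.7 and substitutes the result back into 6.5.6.

```python
import math

def migration_selection_equilibrium(s, M, p_bar):
    """Root of s*p*(1 - p) - M*(p - p_bar) = 0 given by equation 6.5.7."""
    if s == 0:
        # Without selection, migration alone pulls p to the overall mean.
        return p_bar
    disc = (M - s) ** 2 + 4.0 * M * s * p_bar
    return (s - M + math.sqrt(disc)) / (2.0 * s)

# Illustrative values (assumed): selection and migration of equal strength.
s, M, p_bar = 0.02, 0.02, 0.25
p_hat = migration_selection_equilibrium(s, M, p_bar)

# Left side of 6.5.6 at the computed root; it should vanish.
residual = s * p_hat * (1.0 - p_hat) - M * (p_hat - p_bar)
print(p_hat, residual)
```

With strong selection and weak migration the same function returns a value close to 1, and with $s$ much smaller than $M$ it returns a value close to $\bar{p}$.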
For example, if $M = s$, $p = \sqrt{\bar{p}}$. It is clear that if migration is large and selection weak all the populations will tend to become alike. On the other hand, if $|s| \gg M$ and the selective values differ from one subpopulation to another, there will be considerable local differentiation. This model has been extensively discussed by Wright (1940, 1951).

6.6 Equilibrium Between Migration and Random Drift
If the population is broken up into subpopulations and these subpopulations are small, there will be random drift of gene frequencies among the subpopulations so that they will drift apart. Migration from one to another will counteract this effect. We shall discuss this briefly now by elementary procedures and then return to it in Section 9.2 with a more sophisticated stochastic treatment.

As in the last section we let $M$ be the amount of exchange each generation by migration. From equation 3.11.1 the increase in autozygosity in a subpopulation is given by

$$f_t = \frac{1}{2N_e} + \left(1 - \frac{1}{2N_e}\right)f_{t-1}, \tag{6.6.1}$$
where $N_e$ is the effective number in the subpopulation. However, the genes will remain identical only if the individuals carrying them have not been replaced by migrants. The probability that neither of the two uniting genes has been exchanged for a migrant gene is $(1 - M)^2$. (This is not quite exact because our model assumes that self-fertilization is possible, but unless $N_e$ is very small this is a trivial correction.) The equation, including a correction for random exchange of genes with outsiders, is

$$f_t = \left[\frac{1}{2N_e} + \left(1 - \frac{1}{2N_e}\right)f_{t-1}\right](1 - M)^2. \tag{6.6.2}$$

Letting $f_t = f_{t-1} = f$ for equilibrium,

$$f = \frac{(1 - M)^2}{2N_e - (2N_e - 1)(1 - M)^2}. \tag{6.6.3}$$

When $M$ is small so that $M^2$ can be neglected,

$$f \approx \frac{1 - 2M}{4N_eM + 1 - 2M} \approx \frac{1}{4N_eM + 1}. \tag{6.6.4}$$
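To see how close the approximation 6.6.4 is to the exact value 6.6.3, one can compute both and also iterate the recursion until it settles. The sketch below is illustrative only. It assumes the recursion is 6.6.1 multiplied by the factor $(1 - M)^2$ described above, and the function names and parameter values are not from the original text.

```python
def autozygosity_exact(Ne, M):
    """Equilibrium f from equation 6.6.3."""
    keep = (1.0 - M) ** 2          # neither uniting gene is a migrant
    return keep / (2.0 * Ne - (2.0 * Ne - 1.0) * keep)

def autozygosity_approx(Ne, M):
    """Small-M approximation 6.6.4."""
    return 1.0 / (4.0 * Ne * M + 1.0)

def iterate_autozygosity(Ne, M, generations, f0=0.0):
    """Apply the assumed recursion (6.6.1 times (1 - M)^2) repeatedly."""
    f = f0
    for _ in range(generations):
        f = (1.0 / (2.0 * Ne) + (1.0 - 1.0 / (2.0 * Ne)) * f) * (1.0 - M) ** 2
    return f

# Illustrative values (assumed): one migrant per generation, Ne * M = 1.
Ne, M = 100, 0.01
print(autozygosity_exact(Ne, M), autozygosity_approx(Ne, M))
print(iterate_autozygosity(Ne, M, 5000))
```

With $N_eM = 1$ the approximation gives $f = 1/5$, and the exact expression differs from it only slightly.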
We shall find exactly the same formula for mutation in Section 7.2. This is not surprising because of the mathematical equivalence of mutation and migration in Wright's migration model.

If $M \ll 1/4N_e$, then $f$ becomes large and there is considerable local autozygosity. Contrariwise, if $M \gg 1/4N_e$, the migration swamps the local subpopulation and the whole thing becomes effectively one panmictic unit. It is impressive that the amount of migration needn't be very great. A fraction $1/N_e$ of the population means one individual in a population of $N_e$, so if the number of migrants is much more than one per generation, there is little local differentiation. This is mitigated somewhat, however, because migrants tend to come from neighboring subpopulations, rather than being a random sample of the entire population, and neighboring genes are likely to be somewhat alike. Hence the swamping effect of migration may be less.

Equation 3.12.3 connects the inbreeding coefficient of a subpopulation with the variance in gene frequency caused by random variation among the subpopulations. From this

$$f = \frac{V_p}{\bar{p}(1 - \bar{p})}, \tag{6.6.5}$$

where $V_p$ is the variance in gene frequency among subpopulations and $\bar{p}$ is the average frequency in the whole population. Substituting this into 6.6.4 gives

$$V_p = \frac{\bar{p}(1 - \bar{p})}{4N_eM + 1}, \tag{6.6.6}$$
which is Wright's formula relating the amount of random differentiation among partially isolated subpopulations to the amount of migration. We shall derive the same formula again later (9.2.10) by a quite different procedure.

6.7 Equilibrium Under Selection: Single Locus with Two Alleles
We start with a simple case: one locus, random mating, and two alleles. We shall use a discrete generation model; in general the conclusions are the same with a continuous model. Suppose the generations are enumerated as zygotes and that the fitnesses of the three genotypes $A_1A_1$, $A_1A_2$, and $A_2A_2$ are $w_{11}$, $w_{12}$, and $w_{22}$. With random mating the zygote frequencies, counted before selection, will be $p_1^2$, $2p_1p_2$, and $p_2^2$, where $p_1$ and $p_2$ are the allele frequencies and $p_1 + p_2 = 1$. From 5.2.13,

$$\Delta p_1 = \frac{p_1p_2[p_1(w_{11} - w_{12}) + p_2(w_{12} - w_{22})]}{\bar{w}}. \tag{6.7.1}$$

Equating this to 0 to find the equilibrium values gives

$$\hat{p}_1 = \frac{w_{22} - w_{12}}{w_{11} - 2w_{12} + w_{22}}, \tag{6.7.2}$$

in addition to the two trivial values $p_1 = 0$ and $p_1 = 1$. We can notice immediately that some restrictions are imposed by the fact that $\hat{p}_1$ must lie in the range 0 to 1 and therefore cannot be negative. If we write the denominator as $(w_{11} - w_{12}) + (w_{22} - w_{12})$ we see that $0 < \hat{p}_1 < 1$ only if both terms in the denominator have the same sign. So we see that one condition for an equilibrium is

$$w_{12} > w_{11},\ w_{22} \quad \text{or} \quad w_{12} < w_{11},\ w_{22}. \tag{6.7.3}$$

To have an equilibrium other than 0 or 1 the heterozygote must either be less fit or more fit than either homozygote. Otherwise the only equilibria are $p_1 = 0$ or $p_1 = 1$.
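The equilibrium 6.7.2 and the recursion 6.7.1 can be explored numerically. The following Python sketch is illustrative only; the function names and the fitness values are assumptions, not part of the original text. It computes the interior equilibrium, when one exists, and follows the allele frequency over many generations.

```python
def delta_p1(p1, w11, w12, w22):
    """One-generation change in p1 from equation 6.7.1."""
    p2 = 1.0 - p1
    w_bar = p1 * p1 * w11 + 2.0 * p1 * p2 * w12 + p2 * p2 * w22
    return p1 * p2 * (p1 * (w11 - w12) + p2 * (w12 - w22)) / w_bar

def interior_equilibrium(w11, w12, w22):
    """Equation 6.7.2, or None when no equilibrium lies strictly between 0 and 1."""
    denom = w11 - 2.0 * w12 + w22
    if denom == 0:
        return None
    p_hat = (w22 - w12) / denom
    return p_hat if 0.0 < p_hat < 1.0 else None

def iterate(p1, w11, w12, w22, generations):
    """Follow p1 through the given number of generations of selection."""
    for _ in range(generations):
        p1 += delta_p1(p1, w11, w12, w22)
    return p1

# Assumed fitnesses: superior heterozygote, then inferior heterozygote.
print(interior_equilibrium(0.9, 1.0, 0.8), iterate(0.1, 0.9, 1.0, 0.8, 500))
print(interior_equilibrium(1.0, 0.8, 1.0), iterate(0.4, 1.0, 0.8, 1.0, 500))
```

With the superior heterozygote the frequency moves toward the interior equilibrium from either side. With the inferior heterozygote it moves away from the equilibrium toward 0 or 1, depending on the starting point.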
To investigate the stability of the equilibrium consider a small displacement $\xi$ from the equilibrium $\hat{p}_1$, so $p_1 = \hat{p}_1 + \xi$. Then $p_2$ must be $\hat{p}_2 - \xi$ and, noting that $\Delta\hat{p}_1 = 0$,

$$\Delta\xi = \Delta p_1 - \Delta\hat{p}_1 = \Delta p_1 \approx \frac{\hat{p}_1\hat{p}_2[(\hat{p}_1 + \xi)(w_{11} - w_{12}) + (\hat{p}_2 - \xi)(w_{12} - w_{22})]}{\bar{w}} = \frac{\hat{p}_1\hat{p}_2(w_{11} - 2w_{12} + w_{22})\,\xi}{\bar{w}}, \tag{6.7.4}$$

since $\hat{p}_1(w_{11} - w_{12}) + \hat{p}_2(w_{12} - w_{22}) = 0$.

When $w_{11} - 2w_{12} + w_{22} < 0$, $\Delta\xi/\xi$ is negative, and its absolute value is less than one. This means that $\hat{p}_1$ is a stable equilibrium in that a displacement from this point is followed by a tendency to return to the point. When $w_{11} - 2w_{12} + w_{22} > 0$, then $\Delta\xi/\xi$ is positive. A displacement is followed by an even larger displacement in the same direction, so the equilibrium is unstable.

Selection favoring the heterozygote leads to a stable equilibrium, with the equilibrium value of $p_1$ given by 6.7.2. If the heterozygote is selected against, the population tends to move away from the unstable equilibrium toward $p_1 = 0$ or $p_1 = 1$. Whether the $A_1$ or the $A_2$ allele is fixed depends on which side of the equilibrium point the population starts from. A locus with a superior heterozygote tends to persist in the polymorphic state so that it contributes quite disproportionately to the variability of the population. A locus with an inferior heterozygote ultimately is fixed at gene frequency 0 or 1 and does not contribute to the variability.

There is another way of testing the stability of an equilibrium. We saw in Section 5.5 that under natural selection with a single locus, random mating, and constant fitness coefficients the fitness always increases. With a continuous model this follows from the Fundamental Theorem of Natural Selection. With a discontinuous model it follows from 5.5.14. The mean fitness is

$$\bar{w} = p_1^2w_{11} + 2p_1p_2w_{12} + p_2^2w_{22}, \tag{6.7.5}$$

from which, recalling that $p_2 = 1 - p_1$,

$$\frac{d\bar{w}}{dp_1} = 2p_1(w_{11} - w_{12}) + 2p_2(w_{12} - w_{22}) = 0, \tag{6.7.6}$$

leading to the same equation as before.
So, as expected, the equilibrium point is where the fitness is stationary, either maximum or minimum. Differentiating again,

$$\frac{d^2\bar{w}}{dp_1^2} = 2(w_{11} - 2w_{12} + w_{22}). \tag{6.7.7}$$
When this is negative the equilibrium point is a relative maximum and therefore stable. When the second derivative is positive the equilibrium is unstable. When there are multiple alleles the conditions for a maximum are more involved, as we shall see in the next section. For a particularly lucid elementary exposition of the various kinds of equilibria under selection, see Li (1967b).

6.8 Selective Equilibrium with Multiple Alleles
Again we assume random mating and a discrete model. There are $n$ alleles $A_1, A_2, \ldots, A_n$ with frequencies $p_1, p_2, \ldots, p_n$. Let $w_{ij}$ be the fitness of genotype $A_iA_j$. Then, from 5.2.1, the change in frequency of allele $A_i$ in one generation is

$$\Delta p_i = \frac{p_i(w_i - \bar{w})}{\bar{w}}, \tag{6.8.1}$$

where

$$w_i = \sum_j w_{ij}p_j$$

and

$$\bar{w} = \sum_i p_iw_i = \sum_{ij} w_{ij}p_ip_j.$$

At equilibrium, when $\Delta p_i = 0$,

$$w_i = \bar{w}, \tag{6.8.2}$$

and we reach the quite reasonable conclusion that at equilibrium the average fitness of all alleles is the same and equal to the population average fitness. If this were not true, those alleles with the higher fitnesses would increase and the population would not be in equilibrium.

To investigate the stability of the equilibrium we should like to use the principle employed in the preceding section: that fitness is maximized at the point of stable equilibrium. For the continuous model we know this from 5.5.9, which extends readily to multiple alleles. We also know from 5.5.14 that this is true with a discrete generation model for random mating with two alleles.
We would expect this to be true for multiple alleles also, and this has been shown. For a discussion of this problem, see Scheuer and Mandel (1959), Mulholland and Smith (1959), and Kingman (1961, 1961a). We shall now demonstrate this. If you are willing to accept this without proof, you may wish to skip this section and proceed directly to equation 6.8.11.

Demonstration That Fitness Always Increases in the Neighborhood of the Equilibrium

If we use a caret to designate equilibrium values,

$$\hat{w}_i = \sum_j w_{ij}\hat{p}_j = \hat{\bar{w}}. \tag{6.8.3}$$

To simplify the calculations, we express the fitnesses relative to the mean fitness at equilibrium, so we let

$$W_{ij} = \frac{w_{ij}}{\hat{\bar{w}}}.$$

In this measure

$$\hat{W}_i = \sum_j W_{ij}\hat{p}_j = \hat{\bar{W}} = 1. \tag{6.8.4}$$

The formula for $\Delta p_i$ is the same in the new units;

$$\Delta p_i = \frac{p_i(W_i - \bar{W})}{\bar{W}}. \tag{6.8.5}$$

Our interest now is in the behavior of $\bar{W}$ near the equilibrium point. Let $\xi_i$ be a small deviation from $\hat{p}_i$, as in the last section, so that

$$p_i = \hat{p}_i + \xi_i, \qquad \sum_i \xi_i = 0.$$

The change in $\bar{W}$ due to these displacements is

$$\delta\bar{W} = \sum_{ij} W_{ij}(\hat{p}_i + \xi_i)(\hat{p}_j + \xi_j) - 1 = 2\sum_{ij} W_{ij}\hat{p}_j\xi_i + \sum_{ij} W_{ij}\xi_i\xi_j.$$

But since

$$\sum_{ij} W_{ij}\hat{p}_j\xi_i = \sum_i \xi_i\hat{W}_i = \sum_i \xi_i = 0,$$

we have

$$\delta\bar{W} = \sum_{ij} W_{ij}\xi_i\xi_j. \tag{6.8.6}$$
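Similarly, if we consider the change in $W_i$ due to these displacements,

$$\delta W_i = \sum_j W_{ij}(\hat{p}_j + \xi_j) - \sum_j W_{ij}\hat{p}_j, \tag{6.8.7}$$

so

$$\delta W_i = \sum_j W_{ij}\xi_j. \tag{6.8.8}$$

The change in gene frequencies in one generation near the equilibrium point, from 6.8.5, 6.8.6, and 6.8.8 is approximately

$$\Delta\xi_i = \Delta p_i \approx \hat{p}_i\,\delta W_i. \tag{6.8.9}$$

The change in mean fitness $\bar{W}$ in one generation due to natural selection is

$$\Delta\bar{W} = \sum_{ij} W_{ij}(p_i + \Delta p_i)(p_j + \Delta p_j) - \sum_{ij} W_{ij}p_ip_j,$$

which, in the neighborhood of the equilibrium point, reduces to

$$\Delta\bar{W} = 2\sum_i (\hat{W}_i + \delta W_i)\Delta\xi_i + \sum_{ij} W_{ij}\Delta\xi_i\Delta\xi_j$$

from 6.8.4, 6.8.8, and 6.8.9. Now consider two quantities,

$$A = 2\sum_i \hat{p}_i(\delta W_i)^2$$

and

$$B = \sum_{ij} W_{ij}\hat{p}_i\hat{p}_j(\delta W_i + \delta W_j)^2,$$

both of which are nonnegative. Since

$$B = 2\sum_{ij} W_{ij}\hat{p}_i\hat{p}_j(\delta W_i)^2 + 2\sum_{ij} W_{ij}\hat{p}_i\hat{p}_j\,\delta W_i\,\delta W_j,$$

then

$$\sum_{ij} W_{ij}\hat{p}_i\hat{p}_j\,\delta W_i\,\delta W_j = \tfrac{1}{2}(B - A),$$

since $\hat{W}_i = 1$.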
Using these quantities, the change of $\bar{W}$ in one generation near the equilibrium is

$$\Delta\bar{W} = A + \tfrac{1}{2}(B - A) = \tfrac{1}{2}(A + B) > 0. \tag{6.8.10}$$
This shows for multiple alleles, as 5.5.14 did for two alleles, that (with random mating and constant $w_{ij}$'s) the fitness always increases unless the $\delta W_i$'s are all 0, that is to say, unless the population is already at equilibrium. We now use this principle to derive stability conditions for the multiple allele case.

Equilibrium Allele Frequencies

The gene frequencies at equilibrium may be obtained directly by solving the simultaneous equations from 6.8.2 using the definitions of the $w_i$'s (see 5.2.5).

$$\begin{aligned} w_{11}p_1 + w_{12}p_2 + \cdots + w_{1n}p_n &= \bar{w} \\ w_{21}p_1 + w_{22}p_2 + \cdots + w_{2n}p_n &= \bar{w} \\ &\ \,\vdots \\ w_{n1}p_1 + w_{n2}p_2 + \cdots + w_{nn}p_n &= \bar{w} \end{aligned} \tag{6.8.11}$$

The solutions, from Cramer's rule, are

$$\hat{p}_i = \frac{\bar{w}\,\Delta_i}{\Delta} = \frac{\Delta_i}{\sum_j \Delta_j}, \tag{6.8.12}$$

where

$$\Delta = \begin{vmatrix} w_{11} & w_{12} & \cdots & w_{1n} \\ w_{21} & w_{22} & \cdots & w_{2n} \\ \vdots & \vdots & & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nn} \end{vmatrix} \tag{6.8.13}$$

and $\Delta_i$ is the determinant gotten by substituting 1's for all the elements in the $i$th column of the determinant $\Delta$. The latter part of 6.8.12 follows from the fact that $\sum p_i = 1$. The average fitness at equilibrium is $\bar{w} = \Delta/\sum \Delta_i$.
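The equations 6.8.11 can also be solved numerically without writing out the determinants. The sketch below is illustrative only; the function names and the fitness matrix are assumptions, not part of the original text. It solves $\sum_j w_{ij}x_j = 1$ by elimination, so that each $x_i$ is proportional to $\Delta_i$, and then normalizes the $x_i$ to sum to 1.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("fitness matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def equilibrium_frequencies(w):
    """Frequencies with equal marginal fitnesses (6.8.11) and the mean fitness."""
    x = solve_linear(w, [1.0] * len(w))   # x_i is proportional to Delta_i
    total = sum(x)
    p = [xi / total for xi in x]          # may contain negative entries
    w_bar = 1.0 / total                   # because sum_j w_ij p_j = w_bar
    return p, w_bar

# Assumed fitnesses: three alleles, all homozygotes equally inferior.
w = [[0.8, 1.0, 1.0],
     [1.0, 0.8, 1.0],
     [1.0, 1.0, 0.8]]
print(equilibrium_frequencies(w))
```

A solution with any negative frequency does not correspond to a real equilibrium with all alleles present, so the signs of the result still have to be checked.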
Stability of the Equilibrium

We have seen that fitness always increases except at the equilibrium, so the stability condition is that $\bar{W}$ is a relative maximum. This means that $\delta\bar{W}$ must always be negative for small deviations from the equilibrium. Mathematically this is assured if the quadratic form 6.8.6 is negative definite (see below for definition). Since $\sum \xi_i = 0$, then

$$\xi_n = -\sum_{i=1}^{n-1} \xi_i. \tag{6.8.14}$$
Substituting this into 6.8.6, and using the original fitness scale,

$$\delta\bar{w} = \sum_{i=1}^{n-1}\sum_{j=1}^{n-1} t_{ij}\,\xi_i\xi_j, \tag{6.8.15}$$

where

$$t_{ij} = w_{ij} - w_{in} - w_{nj} + w_{nn}. \tag{6.8.16}$$

The quadratic form on the right side of 6.8.15 is negative definite if and only if

$$t_{11} < 0, \quad \begin{vmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{vmatrix} > 0, \quad \begin{vmatrix} t_{11} & t_{12} & t_{13} \\ t_{21} & t_{22} & t_{23} \\ t_{31} & t_{32} & t_{33} \end{vmatrix} < 0, \tag{6.8.17}$$

and so on up to order $n - 1$.

The second condition is that all gene frequencies be positive. This can be done as follows. Let $T = [t_{ij}]$ be the matrix of the quadratic form. It turns out that the determinant $|T|$ is equal to the denominator in the right term of 6.8.12. So

$$|T| = \sum_{j=1}^{n} \Delta_j.$$

However, 6.8.17 requires that

$$(-1)^{n-1}|T| > 0,$$

so

$$(-1)^{n-1}\Delta_i > 0 \qquad (i = 1, 2, \ldots, n) \tag{6.8.18}$$
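in order that all the gene frequencies are positive.

To summarize: The conditions for a stable equilibrium are 6.8.17 and 6.8.18. It may be convenient mnemonically to remember that $t_{ij}$ may be written symbolically as $(i - n)(j - n)$, which gives $ij - in - jn + nn$.

Exactly the same arguments apply to the continuous model. The necessary and sufficient conditions are the same, except that the $w_{ij}$'s are replaced by Malthusian parameters. In fact the first report, by Kimura (1956a), was for a continuous model. Mandel's (1959) formulae for the discrete case are equivalent. Those reading Mandel's papers can verify that, for example, his condition

$$\begin{vmatrix} a_{11} & a_{12} & 1 \\ a_{21} & a_{22} & 1 \\ 1 & 1 & 0 \end{vmatrix} > 0$$

is the same as $t_{11} < 0$.

For two alleles the conditions are

(1) $t_{11} < 0$,

(2) $-\Delta_1 = -\begin{vmatrix} 1 & w_{12} \\ 1 & w_{22} \end{vmatrix} = w_{12} - w_{22} > 0$, $\quad -\Delta_2 = -\begin{vmatrix} w_{11} & 1 \\ w_{21} & 1 \end{vmatrix} = w_{21} - w_{11} > 0$.

These are equivalent to the condition $w_{11} < w_{12} > w_{22}$ that we discussed earlier.

For three alleles

$$\Delta = \begin{vmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{vmatrix}, \quad \Delta_1 = \begin{vmatrix} 1 & w_{12} & w_{13} \\ 1 & w_{22} & w_{23} \\ 1 & w_{32} & w_{33} \end{vmatrix}, \quad \Delta_2 = \begin{vmatrix} w_{11} & 1 & w_{13} \\ w_{21} & 1 & w_{23} \\ w_{31} & 1 & w_{33} \end{vmatrix}, \quad \Delta_3 = \begin{vmatrix} w_{11} & w_{12} & 1 \\ w_{21} & w_{22} & 1 \\ w_{31} & w_{32} & 1 \end{vmatrix},$$

$$\begin{aligned} t_{11} &= (1 - 3)^2 = w_{11} - w_{13} - w_{31} + w_{33}, \\ t_{12} &= (1 - 3)(2 - 3) = w_{12} - w_{13} - w_{32} + w_{33}, \\ t_{22} &= (2 - 3)^2 = w_{22} - w_{23} - w_{32} + w_{33}. \end{aligned}$$

The conditions are

(1) $t_{11} < 0$, $\quad \begin{vmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{vmatrix} > 0$,

(2) $\Delta_1 > 0$, $\quad \Delta_2 > 0$, $\quad \Delta_3 > 0$.

But, since

$$\begin{vmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{vmatrix} = \Delta_1 + \Delta_2 + \Delta_3,$$

in the 3-allele case the necessary and sufficient conditions are: $t_{11} < 0$, $\Delta_1 > 0$, $\Delta_2 > 0$, $\Delta_3 > 0$.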
Wi 1 , in =O or
r
n Wi < 1.
i= O
6.9.1 3
Next, consider the case in which the frequency of $A_1$ is very high, or equivalently, the frequency of $A_2$ is very low. In this case,

$$q_{t+1} = \frac{q_t(p_t + w_tq_t)}{\bar{w}_t} \tag{6.9.14}$$

and the ratio $q_{t+1}/q_t$ is unity at the limit $q_t \to 0$. This means that the proportional increase or decrease per generation of the frequency of the recessive allele ($A_2$) is extremely small when it is rare, and the consideration of the ratio is not very useful. Also the difference $q_{t+1} - q_t$ is of the order of $q_t^2$ and therefore tends to 0 very quickly as $q_t \to 0$. On the other hand, the reciprocal of $q_t$ is very large and the difference of this quantity between two successive generations turns out to be finite and therefore more useful. That is to say we magnify the difference which actually is very small. Let $Q_t = 1/q_t$; then

$$Q_{t+1} - Q_t = \frac{(1 - w_t)(Q_t - 1)}{Q_t - 1 + w_t}, \tag{6.9.15}$$

and, at the limit of $Q_t \to \infty$, this has the limit $1 - w_t$. Therefore

$$Q_{t+1} - Q_0 \approx \sum_{i=0}^{t}(1 - w_i). \tag{6.9.16}$$

In order that the polymorphism be maintained, $A_2$ must increase in this case, or the reciprocal of its frequency must decrease, so that

$$\sum_{i=0}^{t}(1 - w_i) < 0,$$

or

$$\frac{1}{t + 1}\sum_{i=0}^{t} w_i > 1. \tag{6.9.17}$$
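The two conditions, 6.9.13 on the product of the fitnesses and 6.9.17 on their average, can be checked for any proposed sequence of fitnesses. The following sketch is illustrative only; the function names and the fitness cycle are assumptions. It uses a recessive that is 10% fitter than the dominants in ordinary generations but is eliminated entirely every twentieth generation, and it follows the frequency with the recursion 6.9.14, taking the fitness of the dominants as 1.

```python
import math

def recessive_frequency_path(q0, fitnesses):
    """Iterate 6.9.14; fitnesses[t] is the recessive's fitness in generation t."""
    q = q0
    path = [q]
    for w in fitnesses:
        p = 1.0 - q
        w_bar = 1.0 - (1.0 - w) * q * q   # dominants have fitness 1
        q = q * (p + w * q) / w_bar
        path.append(q)
    return path

def means(fitnesses):
    """Arithmetic and geometric means of a sequence of fitnesses."""
    arithmetic = sum(fitnesses) / len(fitnesses)
    if min(fitnesses) <= 0:
        geometric = 0.0
    else:
        geometric = math.exp(sum(math.log(w) for w in fitnesses) / len(fitnesses))
    return arithmetic, geometric

# Assumed cycle: 19 ordinary generations, then one epidemic generation.
cycle = [1.1] * 19 + [0.0]
arithmetic, geometric = means(cycle)
path = recessive_frequency_path(0.5, cycle * 100)
print(arithmetic, geometric, min(path), max(path))
```

Here the arithmetic mean exceeds 1 while the geometric mean is 0, so both conditions are met, and the printed minimum and maximum show the range over which the frequency moves.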
This treatment shows that the sufficient conditions for polymorphism are 6.9.13 and 6.9.17. Namely, the arithmetic mean of the fitness of the recessive is larger than unity while its geometric mean is less than unity. This elegant theorem was first proved by Haldane and Jayakar (1963). These conditions would be fulfilled, for example, if the recessives were 5% to 10% fitter than the dominants but an epidemic disease killed off all the recessives every twenty generations. The geometric mean is less than 1 while the arithmetic is larger. After each epidemic the frequency of recessive genes may decrease suddenly, followed by steady increase until the next epidemic.

Multiple-niche Polymorphism

We shall consider briefly a simple model of Levene (1953). Suppose there are two alleles $A_1$ and $A_2$ and mating is at random. The zygote frequencies are $p^2$, $2pq$, and $q^2$. These genotypes are then distributed randomly into separate niches, each of which is different as regards the selective values of the three genotypes. In the $i$th niche the relative fitnesses are:

$$\begin{array}{lccc} \text{GENOTYPE} & A_1A_1 & A_1A_2 & A_2A_2 \\ \text{FITNESS} & 1 - s_i & 1 & 1 - t_i \end{array}$$

We make one more assumption: that the fraction that matures in the $i$th niche is $k_i$ ($\sum k_i = 1$). After maturation the progeny from all niches are randomized before mating. In this model the average fitness in the $i$th niche is

$$\bar{w}_i = p^2(1 - s_i) + 2pq + q^2(1 - t_i) = 1 - p^2s_i - q^2t_i, \tag{6.9.18}$$

and the average fitness of allele $A_1$ in the $i$th niche is

$$w_{1,i} = p(1 - s_i) + q = 1 - ps_i. \tag{6.9.19}$$

From 5.2.6 the gene frequency change in the $i$th niche is

$$\Delta p_i = \frac{p(w_{1,i} - \bar{w}_i)}{\bar{w}_i} = \frac{pq(qt_i - ps_i)}{\bar{w}_i}.$$

Averaging over all niches,

$$\Delta p = \sum_i k_i\,\Delta p_i = pq\sum_i \frac{k_i(qt_i - ps_i)}{\bar{w}_i}. \tag{6.9.20}$$
There will be at least one stable intermediate equilibrium if $p$ increases when it is small and decreases when it is large. So we shall use this to determine sufficient conditions for a polymorphism. If $d(\Delta p)/dp > 0$ when $p \to 0$ and $d(\Delta p)/dp < 0$ when $p \to 1$ the conditions will be satisfied, since $\Delta p = 0$ at $p = 0$ or 1. From 6.9.20,

$$\lim_{p \to 0}\frac{d(\Delta p)}{dp} = \sum_i \frac{k_it_i}{1 - t_i} = \sum_i \frac{k_i}{1 - t_i} - 1 = \sum_i \frac{k_i}{w_{22,i}} - 1,$$

since $\sum k_i = 1$, where $w_{22,i} = 1 - t_i$ is the fitness of $A_2A_2$ in the $i$th niche.

So one condition for a stable equilibrium is

$$\sum_i \frac{k_i}{w_{22,i}} > 1, \quad \text{or} \quad \frac{1}{\sum_i k_i/w_{22,i}} < 1.$$
$$L \ge p_is_{ii}. \tag{6.12.3.10}$$

This follows from 6.12.3.9 a fortiori, since the right side of 6.12.3.9 contains several nonnegative terms, of which $p_is_{ii}$ is only one. If all heterozygotes have equal fitnesses and all homozygotes are inferior, but not necessarily equal to each other, then the inequality becomes an equality. This is because with equal heterozygotes $s_{ij} = 0$ for all combinations where $i \ne j$ and therefore $p_is_{ii}$ is the only nonzero term in 6.12.3.9. Writing 6.12.3.10 as

$$p_i = \frac{L}{s_{ii}},$$

and noting that $\sum p_i = 1$, we obtain

$$L\sum_i \frac{1}{s_{ii}} = 1,$$

and

$$L = \frac{\tilde{s}}{n}, \tag{6.12.3.11}$$
where $\tilde{s}$ is the harmonic mean of the homozygous disadvantages, $s_{ii}$, and $n$ is the number of alleles. This shows that for comparable fitness coefficients, the segregation load is inversely proportional to the number of alleles maintained in the population.

Equation 6.12.3.10 extends the principle previously mentioned to multiple alleles. Regardless of the individual fitnesses, if the population is at equilibrium under selective balance, the segregation load (or at least a minimum estimate thereof) can be gotten from information on a single allele and its homozygous effect on fitness. The minimum estimate of the load is the product of the frequency of that allele and its homozygous selective disadvantage relative to the best genotype.

For example, the frequency ($q$) of the recessive gene for phenylketonuria is about 0.01. The fitness of persons homozygous for this gene is very nearly 0 (or at least it has been until very recently), so $s = 1$. Thus the minimum segregation load if this allele is maintained by selective balance is $sq$ or 0.01. The population fitness is at least 1% less than that of the heterozygote, or the best heterozygote if there are multiple alleles. In contrast, if this is a fully recessive mutant maintained by recurrent mutation, the necessary mutation rate is 0.0001. The mutation load is only 0.0001. So the genetic load in this case is 100 times as large if the abnormal gene is maintained by heterozygote advantage as if it were determined by recurrent mutation.

A group of independent loci will have a collective segregation load that is roughly the sum of the individual loads until the number gets large. Suppose there were 100 loci, independent in inheritance and in their effect on fitness, and each with a load as large as that just mentioned. The total load would then be 100 × 0.01, or 1.
This means that, with independence, the average fitness of the population relative to the best heterozygote is $(1 - 0.01)^{100}$, or roughly $e^{-1} = 0.37$. With 400 loci, $e^{-4} = 0.02$. In general, the average fitness and load are

$$\bar{w} = e^{-\sum l}, \qquad L = 1 - e^{-\sum l}, \tag{6.12.3.12}$$

where $l$ is the load for an individual locus.

The load can quickly get to be very nearly one if the number of polymorphic loci is large. That this creates problems in accounting for large numbers of segregating loci in a population has been discussed repeatedly. There are several ways in which a large amount of polymorphism can be maintained without a large segregation load.

One possibility is that the selective differences are very small. If these are much less than the reciprocal of the effective population size, the allele frequencies will be largely determined by random drift and mutation as we shall discuss in Chapter 8. Of course, with neutral genes there is no load.
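The arithmetic behind 6.12.3.12 can be reproduced directly. The sketch below is illustrative only; the function names are assumptions. It compares the product of the single-locus fitnesses with the exponential approximation for 100 and 400 independent loci, each with a load of 0.01.

```python
import math

def fitness_exact(load_per_locus, n_loci):
    """Mean fitness relative to the best genotype with independent loci."""
    return (1.0 - load_per_locus) ** n_loci

def fitness_approx(load_per_locus, n_loci):
    """Exponential approximation used in 6.12.3.12."""
    return math.exp(-load_per_locus * n_loci)

for n in (100, 400):
    exact = fitness_exact(0.01, n)
    approx = fitness_approx(0.01, n)
    # Columns: number of loci, exact fitness, approximate fitness, load.
    print(n, round(exact, 3), round(approx, 3), round(1.0 - approx, 3))
```

The two columns of fitness agree to about two decimal places, which is the sense in which the load for many loci is one minus the exponential of the summed single-locus loads.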
Whether any large number of neutral or nearly neutral mutants exist is an open question, but the evidence for them is increasing (e.g., King and Jukes, 1969).

A second possibility is that some polymorphic loci are maintained by frequency-dependent selection. If each allele is favored when rare, but not when common, there is a stable equilibrium when each is present in intermediate frequencies. The selective differences are minimized at or close to the equilibrium point, so that a population at equilibrium can be polymorphic with very little load. However, in any real population there will be drift away from the equilibrium point because of random processes and the population is returned to the equilibrium only by selection. So the load is not 0.

Linkage can reduce the segregation load by holding together a group of heterotic genes, at least under some circumstances. An extreme example is the case where there are several loci, each occupied by two alleles, and with both homozygotes lethal at every locus. Under this model the segregation load is 1/2 at each locus, and for $n$ loci will be $1 - (0.5)^n$. However, if all the loci were linked into two chromosomes, complementary at each locus, the load would be reduced to 1/2.

Sved, Reed, and Bodmer (1967), King (1967), and Milkman (1967) have all suggested that a threshold or truncation-selection model can greatly decrease the segregation load. The model assumes that beyond a certain level of heterozygosity additional heterozygous loci make no increased contribution to the average fitness. In an extreme form, all individuals beyond a number $x$ of heterozygous loci have fitness 1 and those below this number have fitness 0. An alternative model with much the same consequences is that a certain fraction $p$ of the individuals are selected and the rest are culled. Those individuals that are selected are the ones with the largest number of heterozygous loci.
It is doubtful, we think, that natural selection acts by counting the number of heterozygous loci and then sharply dividing the population into two groups based on the number of such loci. But the animal breeder practices truncation selection with respect to phenotypes, and it may be argued that natural selection approximates this pattern sufficiently well to alter substantially the number of polymorphisms maintained by a certain amount of selection. The question is one that requires empirical answers; clearly not enough is known about gene interactions to judge the realism of such models at present.

The introduction of linkage into threshold models leads to mathematical difficulties, but computer simulations have shown that it is possible to devise systems in which a great amount of polymorphism is maintained. Wills, Crenshaw, and Vitale (1969) have studied one such model. They assume truncation selection of a certain intensity (e.g., 10% selective elimination, where eliminated individuals are those with the smallest number of heterozygous loci). With close linkage the population tends to retain particular chromosomes (generally those that are mutually complementary) and a moderate amount of selection can retain a large number of polymorphisms. Again, how realistic the assumptions are is unknown.

Which of these mechanisms are the more important in determining natural polymorphisms and whether additional mechanisms play a role are among the most intriguing questions of population genetics. The combined efforts of experimental studies, natural population censuses, and computer simulation studies will probably be required for any real understanding.

The inbred load for a locus with $k$ alleles is

$$L_I = \sum p_is_{ii} \le kL, \tag{6.12.3.13}$$
since $\sum p_is_{ii}$ is the sum of $k$ terms, each of which (by 6.12.3.10) is not greater than the random-mating load. If all heterozygotes are the same the inbred load is simply $k$ times the random load. Thus, for a locus maintained by selective balance, or for a group of such loci if they are independent, the inbred load is not greater than the random load multiplied by the number of alleles maintained in the equilibrium population.

In principle, it should be possible to determine whether inbreeding effects in man are due mainly to mutationally maintained loci or loci maintained by selective balance, provided that the average dominance, $h$, and the number of alleles, $k$, are both small. But the uncertainty about these and other assumptions and the absence of reproducible data have kept this from being an informative approach to the problem.

6.12.4 The Incompatibility Load
The only well-understood cause of an incompatibility load is maternal-fetal incompatibility for antigenic factors. For example, an A child with an O mother has a certain risk of dying as an embryo or neonatally due to anti-A agglutinins of the mother. Because of the rule that an individual can produce antibodies only against antigens he does not possess, and because antigens are (with perhaps a few exceptions) the result of dominant genes, it follows that any increased death rate will always be in heterozygotes.

We can write the possibly incompatible types by writing maternal genotypes and the allele contributed to the embryo by the father. With random mating, they occur with the following frequencies, letting $p$, $q$, and $r$ stand for the frequencies of $A$, $B$, and $O$ alleles.
$$\begin{array}{cccccc} \text{MOTHER'S} & \text{FREQUENCY} & \text{SPERM} & \text{FREQUENCY} & & \text{PROBABILITY} \\ \text{GENOTYPE} & (1) & \text{GENOTYPE} & (2) & (1) \times (2) & \text{OF DEATH} \\ OO & r^2 & A & p & pr^2 & d_A \\ OO & r^2 & B & q & qr^2 & d_B \\ AA & p^2 & B & q & p^2q & d_B \\ AO & 2pr & B & q & 2pqr & d_B \\ BB & q^2 & A & p & pq^2 & d_A \\ BO & 2qr & A & p & 2pqr & d_A \end{array}$$

The total incompatibility load due to this locus is equal to

$$L = d_A(pr^2 + pq^2 + 2pqr) + d_B(qr^2 + p^2q + 2pqr) = d_Ap(1 - p)^2 + d_Bq(1 - q)^2. \tag{6.12.4.1}$$
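The simplification in 6.12.4.1 rests on $q + r = 1 - p$ and $p + r = 1 - q$. As a check, the sketch below sums the six rows of the table and compares the total with the closed form. It is illustrative only; the function names, allele frequencies, and death probabilities are assumed values, not data.

```python
def abo_incompatibility_load(p, q, r, d_a, d_b):
    """Sum the table row by row: mother frequency x sperm frequency x risk."""
    rows = [
        # (mother's genotype frequency, sperm frequency, probability of death)
        (r * r, p, d_a),        # OO mother, A sperm
        (r * r, q, d_b),        # OO mother, B sperm
        (p * p, q, d_b),        # AA mother, B sperm
        (2 * p * r, q, d_b),    # AO mother, B sperm
        (q * q, p, d_a),        # BB mother, A sperm
        (2 * q * r, p, d_a),    # BO mother, A sperm
    ]
    return sum(mother * sperm * risk for mother, sperm, risk in rows)

def closed_form(p, q, d_a, d_b):
    """Right-hand side of 6.12.4.1."""
    return d_a * p * (1 - p) ** 2 + d_b * q * (1 - q) ** 2

# Assumed values for illustration only.
p, q, r = 0.3, 0.1, 0.6
print(abo_incompatibility_load(p, q, r, 0.05, 0.02), closed_form(p, q, 0.05, 0.02))
```

The two numbers agree for any frequencies that sum to 1, since each group of rows factors into a perfect square.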
More generally, if $p_i$ represents the frequency of $A_i$ and $d_i$ is the probability of death due to the antigen resulting from allele $A_i$, we can write the load as

$$L = \sum_i d_ip_i(1 - p_i)^2, \tag{6.12.4.2}$$

where $d_i$ may, of course, be 0 for some alleles as is probably the case for $O$ in the ABO system. This assumes that $d_i$ is the same irrespective of the mother's genotype (assuming she has no $A_i$ allele) and of the other (non-$A_i$) allele in the child, which is a reasonable a priori assumption, but not always true. If it is not, a separate $d$ for each maternal-fetal genotype combination has to be introduced.

If either the mother or the child is inbred the incompatibility load is changed.

1. Mother Inbred

The load due to the antigen produced by allele $A_i$ is now

$$d_ip_i\Big[(1 - f_m)\Big(\sum_{j \ne i} p_j\Big)^2 + f_m\sum_{j \ne i} p_j\Big] = d_ip_i[(1 - p_i)^2(1 - f_m) + f_m(1 - p_i)] = d_ip_i(1 - p_i)(1 - p_i + p_if_m),$$

where $f_m$ is the inbreeding coefficient of the mother. Summed over all alleles, this is

$$L = \sum_i d_ip_i(1 - p_i)(1 - p_i + p_if_m). \tag{6.12.4.3}$$

Thus the incompatibility load increases in proportion to the inbreeding coefficient of the mother.
2. Child Inbred

Consider the diagram in Figure 6.12.4.1 where $M$ is the mother and $C$ is the child. The letters $a$ and $b$ are the two gametes that united to form the mother while $c$ is the gamete contributed to the child by the father, $F$.

Figure 6.12.4.1. A diagram of gametic connections between mother (M), father (F), and child (C). The solid circles are gametes, designated by a, b, and c.
Let $E$ be the probability that $c$ is identical to either $a$ or $b$, in which case there can be no incompatibility. Then, letting $\equiv$ stand for identity by descent,

$$E = \text{prob}(a \equiv c) + \text{prob}(b \equiv c) - \text{prob}(a \equiv b \equiv c).$$

But the probability that $a \equiv c$ is twice the contribution to the inbreeding coefficient of the child made by the path going through the grandparent that contributed $a$, and likewise for the probability that $b \equiv c$. Therefore

$$E = 2f_c - \text{prob}(a \equiv b \equiv c), \tag{6.12.4.4}$$

where $f_c$ is the inbreeding coefficient of the child. The prob$(a \equiv b \equiv c)$ is 0, unless there is an ancestor to which all three gametes, $a$, $b$, and $c$, can be traced back. There seems to be no simple general rule for computing this, but it is easily done for individual cases. In most human pedigrees this term is 0, of course. The incompatibility load is then

$$L = D(1 - E). \tag{6.12.4.5}$$
Hence, except for the possibility of a common ancestor of $a$, $b$, and $c$, we can say that the incompatibility load increases in proportion to the inbreeding coefficient of the mother and decreases in proportion to the inbreeding coefficient of the child. For further discussion, see Crow and Morton (1960).

6.12.5 The Load Due to Meiotic Drive
As the final example of a genetic load, consider the effect of meiotic drive (Sandler and Novitski, 1957). Typical examples are the t-alleles in mice and the SD factor in Drosophila. We shall consider only a simple example. Assume that the homozygote is lethal or sterile (which is often true). Let $h$ be the selection against heterozygotes. Assume further that the ratio of $a$ to $A$ genes contributed by males is $K : 1 - K$, but that the contribution from heterozygous females is the regular 1 : 1 of Mendelian heredity. Meiotic drive is typically found in one sex only, so this is a good assumption. The genotypes and fitnesses can be designated as

$$\begin{array}{lccc} \text{GENOTYPE} & AA & Aa & aa \\ \text{FITNESS} & 1 & 1 - h & 0 \\ \text{FREQUENCY} & p_fp_m & p_mq_f + p_fq_m & q_mq_f \end{array} \tag{6.12.5.1}$$

where $p_f$ and $p_m$ are the frequency of allele $A$ in the gametes of females and males; $q_m = 1 - p_m$ and $q_f = 1 - p_f$. With meiotic drive we cannot make the simplifying assumption of Hardy-Weinberg zygote ratios, since the gamete frequencies are different in the two sexes (recall Section 2.5). The allele frequencies next generation are given by

$$q_m' = \frac{K(1 - h)(p_mq_f + p_fq_m)}{1 - h(p_mq_f + p_fq_m) - q_mq_f}, \tag{6.12.5.2}$$

$$q_f' = \frac{q_m'}{2K}. \tag{6.12.5.3}$$

To specify the equilibrium conditions, let $q_m' = q_m = \hat{q}_m$ and $q_f' = q_f = \hat{q}_f = \hat{q}_m/2K$. This leads immediately to the quadratic

$$\hat{q}_m^2(1 - 2h) + \hat{q}_m[h(1 + 4K) - 2K] + K[2K - 1 - h(1 + 2K)] = 0 \tag{6.12.5.4}$$

or

$$A\hat{q}_m^2 + B\hat{q}_m + C = 0, \tag{6.12.5.5}$$

where

$$A = 1 - 2h, \qquad B = h(1 + 4K) - 2K, \qquad C = K[2K - 1 - h(1 + 2K)].$$
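The relevant solution is

$$\hat{q}_m = \frac{-B - \sqrt{B^2 - 4AC}}{2A}.$$

The load is

$$L = h(p_m\hat{q}_f + p_f\hat{q}_m) + \hat{q}_m\hat{q}_f \tag{6.12.5.6}$$

$$= \frac{h\hat{q}_m}{2K}(1 + 2K - 2\hat{q}_m) + \frac{\hat{q}_m^2}{2K}. \tag{6.12.5.7}$$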
The value of the meiotic drive load for several representative values of $K$ and $h$ are given in Table 6.12.5.1.

Table 6.12.5.1. The load due to meiotic drive. The gene favored by segregation distortion is assumed to be lethal when homozygous and to have a disadvantage of h relative to the normal homozygote. K is the proportion of gametes with the driven gene in one sex; the other sex is assumed to have normal Mendelian segregation. (From Crow, 1970.)

                                  h
   K      0.00   0.01   0.02   0.05   0.10   0.20   0.30   0.50
  .5      0      0      0      0      0      0      0      0
  .6      .010   .010   .010   .007   0      0      0      0
  .7      .042   .042   .041   .039   .029   0      0      0
  .8      .100   .100   .100   .097   .087   .033   0      0
  .9      .200   .200   .200   .197   .187   .129   0      0
  .95     .282   .282   .282   .279   .267   .205   .031   0
  .98     .360   .360   .359   .356   .342   .270   .083   0
  .99     .401   .400   .400   .396   .379   .298   .103   0
 1.00     .500   .495   .490   .472   .438   .333   .125   0

When $h = 0$, as shown by Bruck (1957),

$$\hat{q}_m = K - \sqrt{K(1 - K)}, \tag{6.12.5.8}$$

$$L = \frac{\hat{q}_m^2}{2K} = \frac{1}{2} - \sqrt{K(1 - K)}. \tag{6.12.5.9}$$
This system has the interesting and unusual property that the load is decreased when the lethal gene is partially dominant. This property is brought out in the table, which also gives an idea of the range of values for K and h that maintain the polymorphism.
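As a numerical check on 6.12.5.2 through 6.12.5.7, the following Python sketch solves the quadratic for the equilibrium frequency and evaluates the load. It is an illustration only, not part of the original treatment. The function names are our own, the sketch is restricted to 0 <= h < 0.5 (so that A > 0), and we assume that a nonpositive root means the driven allele is not maintained, in which case the load is taken to be zero.

```python
import math


def meiotic_drive_equilibrium(K, h):
    """Return (q_m, q_f, load) at equilibrium, following 6.12.5.4-6.12.5.7.

    Assumption of this sketch: 0 <= h < 0.5, so that A = 1 - 2h > 0.
    """
    if not (0.0 <= h < 0.5):
        raise ValueError("this sketch only covers 0 <= h < 0.5")
    A = 1.0 - 2.0 * h
    B = h * (1.0 + 4.0 * K) - 2.0 * K
    C = K * (2.0 * K - 1.0 - h * (1.0 + 2.0 * K))
    disc = max(B * B - 4.0 * A * C, 0.0)       # guard against rounding below zero
    q_m = (-B - math.sqrt(disc)) / (2.0 * A)   # the relevant root, 6.12.5.6
    if q_m <= 0.0:
        # Assumed interpretation: no polymorphism is maintained, so no load.
        return 0.0, 0.0, 0.0
    q_f = q_m / (2.0 * K)                      # 6.12.5.3 at equilibrium
    load = h * q_m * (1.0 + 2.0 * K - 2.0 * q_m) / (2.0 * K) + q_m ** 2 / (2.0 * K)
    return q_m, q_f, load


def next_generation(q_m, q_f, K, h):
    """One generation of the recurrence 6.12.5.2 and 6.12.5.3."""
    p_m, p_f = 1.0 - q_m, 1.0 - q_f
    het = p_m * q_f + p_f * q_m                # frequency of Aa zygotes
    mean_fitness = 1.0 - h * het - q_m * q_f
    new_q_m = K * (1.0 - h) * het / mean_fitness
    return new_q_m, new_q_m / (2.0 * K)


if __name__ == "__main__":
    for K, h in [(0.9, 0.0), (1.0, 0.2), (0.7, 0.2)]:
        q_m, q_f, load = meiotic_drive_equilibrium(K, h)
        print(f"K={K}, h={h}: q_m={q_m:.4f}, q_f={q_f:.4f}, L={load:.3f}")
```

With K = 0.9 and h = 0 the load comes out close to 0.200, and with K = 1 and h = 0.20 close to 0.333, in agreement with the corresponding entries of Table 6.12.5.1. Feeding the equilibrium values back into `next_generation` returns them unchanged, which is a convenient check on the algebra.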
6.13 Evolutionary Advantages of Mendelian Inheritance
The ubiquity of Mendelian inheritance attests to its evolutionary value. For such an elaborate mechanism to be contrived implies that it must confer a great advantage on the population possessing it. It is not at all apparent that sexual reproduction is of any selective advantage to the individual. Its value clearly lies in gene-shuffling, the value of which is more likely to be for the long-time benefit of the population as a whole than for the individual. In fact, as we noted before, Fisher (1930, 1958) went so far as to suggest that sexual reproduction is perhaps the only trait that has evolved by intergroup competition. We shall not try to discuss ways in which Mendelian inheritance may have evolved, but we shall discuss briefly some of the ideas about its value to the population.

It is true that the Mendelian system is capable of producing an enormous number of genotypes by recombination of a relatively small number of genes. The number of potential combinations is indeed great, but the number produced in any single generation is limited by the population size, and gene combinations are broken up by recombination just as effectively as they are produced by it. Furthermore, for a given amount of variability, the efficiency of selection is greater in an asexual population, for here the rate of progress is determined by the genotypic variance rather than by the genic variance, as it is in a sexual system. However, if the environment changes so that drastic change in phenotype is needed, an asexual population (in the absence of new mutants) is limited to the best genotype in the current population. Selection of Mendelian recombinants can produce strains that far transgress the former variability of the population. Numerous selection experiments have demonstrated that in a few generations the mean can come to lie outside the range of what were the most extreme deviants in the population before selection began.
The evolutionary advantage of recombination has often been discussed. The two principal ideas are: (1) that recombination makes it possible for favorable mutants that arose in different individuals to get into a single individual, and (2) that recombination permits the species to respond more effectively to an ever-changing environment. The first idea was developed originally by Fisher (1930) and by Muller (1932). The second was most clearly stated among early writers by Wright (1931 and later) and by Sturtevant and Mather (1938).

To consider the first idea, we shall rely on a model first proposed by Muller. In an asexual population two beneficial mutants can be incorporated into the population only if the second occurs in a descendant of the individual in which the first mutation took place. The limiting factor will be the time required for the descendants of the first mutant to increase to such numbers that a second mutant becomes reasonably probable. In a Mendelian population all the favorable mutants that occur during this interval can be incorporated. (We are ignoring the loss of a new mutant by random processes, to be discussed later in Chapter 8, for this is not essentially different in the two systems.) An asexual system will be as efficient as a sexual system only if the mutation rate is so low, the selective advantage of the mutant so great, or the population so small that the first mutant is established before another favorable mutant occurs.

We have discussed this situation quantitatively (Crow and Kimura, 1965) but we shall give only a qualitative summary here. The situation is illustrated in Figure 6.13.1, which is adapted from Muller's 1932 paper. The three mutants, A, B, and C, are all beneficial. In the asexual population when all three arise at roughly the same time only one can prevail. In this case A is more fit than B or C; or A may simply be luckier in happening to occur in an individual that was for other reasons unusually well adapted. B can be incorporated only when it occurs in an individual that already has mutant A, and this will not happen on the average until the descendants of the original A mutant have grown to numbers roughly the reciprocal of the mutation rate. C is finally incorporated, but only after AB individuals are in appreciable numbers. So, in an asexual population the mutants are incorporated in series. The sexual population, on the other hand, permits the incorporation of mutants in parallel.

Figure 6.13.1. Evolution in sexual and asexual populations. For explanation, see the text. (Panels: Large Population and Small Population, each comparing Asexual and Sexual; horizontal axis: Time.)

The ratio of the evolutionary rate in a Mendelian to that in an asexual population is the number of mutants that occur in the interval of time between the occurrence of a mutant and the occurrence of a second mutant in a descendant of the first mutant. Numerical calculations have been presented in the paper referred to above. The qualitative conclusions are clear from the diagrams, however. The relative advantage of sexuality will be greatest when the population is large, when the mutation rate is high, and when the selective advantage of the mutant is small, for each of these favors the occurrence of more mutants than can be incorporated in series. In other words, the advantage of sexuality is greatest when the system is evolving by very minute steps in a large population.

On the other hand, the incorporation of new mutants is not the only evolutionary factor of importance. Evolution may also occur by the shifting of frequencies of genes that are already present in fairly high frequencies in the population, as Wright has emphasized. In this case the argument of Sturtevant and Mather is relevant.
We can do no better than quote directly from their article:

The simplest system that we have been able to devise, having the required property of favoring recombination, is as follows: Two gene pairs, A, a and B, b, exist in a population subjected frequently to three different sets of environmental conditions, D, E, and F. Condition D favors A but acts unfavorably on B, whereas E favors B but lowers the frequency of A. These, if properly adjusted as to intensity of selection and number of generations over which they operate, will insure the perpetuation of both allelomorphs at each locus. Recombination will not, however, be of importance. For this it seems necessary to introduce the third condition, F, favoring the combination AB (with or without a similar effect on ab), but acting adversely on the single types A and B. Under such conditions it would appear that there will be a selective action favoring recombination, of the order of magnitude of the selection of the AB type, as opposed to Ab or aB, under condition F. Other combinations of two or more loci may be exerting similar action in similar or dissimilar environmental conditions, and so the net effect will probably be to favor recombination for the majority of loci under the majority of condition changes.
Similar arguments apply when the population is divided into partially isolated subpopulations with different gene combinations favored in different subpopulations and occasional migration between subpopulations (Wright, 1931 and later). Virtually the same argument has been stated recently (and independently) by Maynard Smith (1968).

We do not know which of these arguments, the Fisher-Muller or the Wright-Sturtevant, is the more important. Very likely, both are valid, and which is more important depends on whether the incorporation of new mutants or the adaptation of existing variants to variable environments is more often the limiting factor in evolutionary advance. It is also possible, of course, that neither of these has identified the really crucial advantage, and that some third reason is still more important.

The Advantages of Diploidy. The evolutionary advantages of recombination can be obtained in species that spend most of their lives as haploids as well as in those that are predominantly diploid. Yet there has often been evolution toward diploidy. What is the reason?

At first glance it would appear that there is an obvious advantage of diploidy in that dominant alleles from one haploid set can prevent the expression of deleterious alleles in the other. However, as soon as a new equilibrium is reached the mutation load will be the same; in fact, it will be twice as large unless the mutants are completely recessive. From this standpoint diploidy is a disadvantage, not an advantage. However, when a population that was previously haploid suddenly changes to diploid there is an immediate advantage. To be sure, when the equilibrium is reached the advantage is gone; but by this time the diploid condition may be established and there is no way of going back to haploidy without all the deleterious effects of exposing recessive genes. So it may be that diploidy is not conferring a lasting benefit, but is the result of a temporary advantage that cannot easily be gotten rid of.

There are two other possible advantages of diploidy. One very obvious one is overdominance. If such loci are common, there is a genuine advantage to diploidy provided such effects are important enough to compensate for the greater mutation load from partially dominant loci. A second possibility is the protection it affords from the effects of somatic mutation. The zygote in a diploid species or the gametophyte in a haploid plant may have approximately the same equilibrium fitness, but the effects of somatic mutation would be quite different. Diploidy would protect against recessive, or partially recessive, somatic mutants. If the soma were large and complicated, as in higher plants and especially animals, a diploid soma may provide a significant protection against the effects of recessive mutants in critical cells.
6.14 Problems

1. A mutant has a selective disadvantage hs in the heterozygote. Assume that hs is large enough that selection in mutant homozygotes can be ignored. Show that the mean number of generations that a mutant persists is 1/hs.
2. Assume that X-linked hemophilia has an incidence of one in 20,000 males and a selective disadvantage, s, of about 0.8. Estimate the mutation rate of the gene.
3. Give two or more reasons why the mutation rate estimates for autosomal dominant and X-linked recessive mutants are more accurate if s is large.
4. What is the genic variance of an overdominant locus at equilibrium under selection?
5. Show that for equilibrium under selection the equilibrium gene frequencies remain the same when $w_{ij}$ is replaced by $a + bw_{ij}$, where a and b are constants.
6. Show that the segregation load when f = 1 for one of a series of multiple alleles is the same as the load for all alleles when f = 0. (Assume that all heterozygotes are equal in fitness.)
7. Prove that in equation 6.7.4, when $w_{11} - 2w_{12} + w_{22} < 0$, 16�j� 1 < 1.
8. Show, for the model of equation 6.5.6, that the migration load is M(p p)/p, where p is the equilibrium frequency in the subpopulation and p the gene frequency in the immigrants. Is the load the same when the favored gene is completely dominant?
9. Derive the characteristic equation of 6.9.6 by letting $y'/y = y''/y' = \lambda$.
10. Show how 6.12.2.8 may be obtained.
11. What would be the effect of polygamy on the zygotic sex ratio? Of infanticide where one sex is preferentially killed? Would you expect a different zygotic sex ratio for mice than for monkeys?
12. Assume that weight in mice is determined by a large number of loci, that there is no dominance or epistasis, and no environmental variance. Assume further that the fitness of a mouse is proportional to the square of its deviation from the average weight in the population. Show that the homozygous load is twice the randomly mating load.
13. Given the array of fitnesses

             A1      A2      A3
      A1     0.90    1.00    1.15
      A2     1.00    1.00    1.10
      A3     1.15    1.10    0.80

    will this lead to a stable equilibrium? Is there a stable equilibrium if allele A3 is lost?
14. What frequency of the Rh- gene (d) will maximize the incidence of hemolytic disease in Rh+ embryos with Rh- mothers?
15. What frequency of the A, B, and O alleles will maximize the ABO incompatibility load? Are these frequencies independent of $d_A$ and $d_B$?
7 PROPERTIES OF A FINITE POPULATION

In preceding chapters we have considered mutation, migration, and selection as factors causing deterministic changes in gene frequencies and in such population properties as average fitness or performance. There was the implicit assumption that the population is large enough that random sampling of gametes does not introduce an appreciable amount of noise into the system and that the migration and selection coefficients are either constant or change in a predictable way.

Actual populations are finite, so there is some random drift in gene frequencies, as we mentioned briefly in Chapter 3. Also there may be fortuitous changes in the other factors, particularly in selection coefficients. In this chapter we will continue the discussion begun in Chapter 3 on the effects of a finite population number, ignoring the influence of migration and selection. Later, in Chapters 8 and 9, we consider how these factors interact with random processes.
7.1 Increase of Homozygosity Due to Random Gene Frequency Drift

As was pointed out in Chapter 3, Section 11, there is a fluctuation in gene frequency from generation to generation in a finite population. We can regard each generation of N diploid offspring as being derived from a sample of 2N gametes from the parental generation. As time goes on the gene frequency will tend to depart more and more from its original value. Finally one of the alleles is fixed while all others are lost and the population becomes homozygous for this locus. Since the fixation of genes is an irreversible process (if we exclude mutation or migration) the number of fixed loci will tend to increase with time.

We also emphasized that, although the consequences of random gene frequency drift and consanguineous mating are both such as to lead to an increase in homozygosity and we can measure either by Wright's inbreeding coefficient, in one regard the two are quite different. In a large population with some consanguineous matings there is a departure from Hardy-Weinberg ratios, as given by equations 3.2.2 and 3.2.3, and the proportion by which the heterozygosity is reduced is equal to f. In a finite population within which mating is at random the population remains in approximate Hardy-Weinberg ratios. The reduction in heterozygosity comes from the random changes in gene frequency, the net effect of which is to reduce $\sum p_i p_j$ (the proportion of heterozygotes, $i \neq j$) by a fraction f.

In a population of N diploid individuals mating completely at random, including the possibility of self-fertilization, the heterozygosity decreases at a rate 1/2N where N is the effective population number (see 3.11.2). In terms of the inbreeding coefficient

$$f_t = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right) f_{t-1}, \tag{7.1.1}$$
as was shown earlier (3.11.1). Also, when there is no self-fertilization the change is given by

$$f_t = g_{t-1}, \tag{7.1.2}$$

$$g_t = \frac{1}{N}\left(\frac{1 + f_{t-1}}{2}\right) + \left(\frac{N - 1}{N}\right) g_{t-1}, \tag{7.1.3}$$

where $f_t$ is the inbreeding coefficient in generation t and $g_t$ is the coefficient of consanguinity of two randomly chosen individuals in generation t (see 3.11.3).
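As a numerical aside, the recurrences 7.1.1 through 7.1.3 are easily iterated. The Python sketch below does so for the two mating systems; it is an illustration only, the function names are our own, and the starting values $f_0 = g_0 = 0$ (an initially unrelated, non-inbred population) are an assumption of the example.

```python
def inbreeding_with_selfing(N, generations, f0=0.0):
    """Iterate 7.1.1: random mating including self-fertilization."""
    f = [f0]
    for _ in range(generations):
        f.append(1.0 / (2 * N) + (1.0 - 1.0 / (2 * N)) * f[-1])
    return f


def inbreeding_without_selfing(N, generations, f0=0.0, g0=0.0):
    """Iterate 7.1.2 and 7.1.3: random mating with self-fertilization excluded."""
    f, g = [f0], [g0]
    for _ in range(generations):
        f_next = g[-1]                                          # 7.1.2
        g_next = (1.0 + f[-1]) / (2 * N) + (N - 1) / N * g[-1]  # 7.1.3
        f.append(f_next)
        g.append(g_next)
    return f, g


if __name__ == "__main__":
    N = 10
    with_selfing = inbreeding_with_selfing(N, 5)
    without_selfing, _ = inbreeding_without_selfing(N, 5)
    for t in range(6):
        print(t, round(with_selfing[t], 4), round(without_selfing[t], 4))
```

With N = 10 the inbreeding coefficient without self-fertilization lags one generation behind at the start (it is still zero in generation 1), after which the two series rise at nearly the same rate.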
From 7.1.2 and 7.1.3,

$$f_t = f_{t-1} + \frac{1 - 2f_{t-1} + f_{t-2}}{2N}. \tag{7.1.4}$$

If there are two sexes of unequal numbers, $N_m$ males and $N_f$ females,

$$N = \frac{4N_m N_f}{N_m + N_f} \tag{7.1.5}$$

in 7.1.4 (see 3.13.1). We also introduced Wright's concept of effective population number in Section 3.13, an idea that will be developed further later in this chapter.

Random gene frequency drift, either because of a finite population or because the selection and migration coefficients vary, plays a central role in the "shifting balance" theory of evolution proposed by Sewall Wright. As we discussed in Chapter 5, deterministic selection tends to increase the frequency of those genes which enhance the fitness of the individual. So the population fitness tends to increase (subject to some qualification if there are complicating factors such as frequency-dependent selection and linkage disequilibrium). This was discussed in Sections 5.6, 5.7, and 5.9.

If the mean relative fitness is plotted as ordinate and the various gene frequencies as the various abscissae in multidimensional space, we can think of the fitness as a hypersurface. In this metaphor we can speak of the surface, as Wright has, as an adaptive surface. See Wright (1967 and earlier) for a discussion of this concept, and for the relation of the "existence" of such a surface to quasi-linkage equilibrium. If the surface has multiple peaks and valleys a population, which can be thought of as a point on the surface, will tend to climb the nearest peak, which is not necessarily the highest. There may be no deterministic way in which a population can change from one peak to a higher one. Wright suggests that such evolutionary bottlenecks may be frequent and that it is important that there be ways by which a population can move away from the stable equilibrium represented by one peak in order to come under the sphere of influence of another, higher peak. This might happen if, for any of the reasons mentioned before, there is some random shifting of gene frequencies.
Such a change might be sufficient to permit the population to wander randomly over the surface (still speaking metaphorically) and come under the influence of a higher peak. Of course, there is a loss in average fitness as the population drifts from the highest point. But Wright argues that this is a necessary price for letting evolution find new gene combinations. Multiple peaks and valleys will be prevalent if there are complex dominance and epistatic relations among the various loci, especially those of a type where two or more alleles that are individually deleterious are collectively beneficial.

The views of Fisher and Wright contrast strongly on the evolutionary significance of random changes in the population. Whereas, to Fisher, random change is essentially noise in the system that renders the deterministic processes somewhat less efficient than they would otherwise be, Wright thinks of such random fluctuations as one aspect whereby evolutionary novelty can come about by permitting novel gene combinations.

Whether random gene frequency drift is a way of creating new, favorable epistatic combinations or is more like background noise, there is increasing evidence that it is prevalent. Molecular biology has shown dramatically the wide range of mutational possibility at a single gene locus with its several hundred nucleotides. The possibility that many nucleotide substitutions may cause inconsequential changes has become increasingly apparent. Some, or perhaps many, such changes may alter fitness by an amount that is of the same order as the mutation rate or the reciprocal of the effective population number, or less. The fate of such mutants is determined largely by random processes. The possibility that such random changes may account for a substantial part of the amino acid changes observed in the evolution of hemoglobin and other proteins has been discussed by several authors recently (see Kimura, 1968, 1969b; King and Jukes, 1969; Crow, 1969). In the next chapter we shall discuss the rate of evolution that would be expected from neutral or near-neutral mutations.

The other aspect of near-neutral genes and a great multiplicity of potential mutations at each locus is the possibility that this may account for polymorphisms, particularly those having no overt effect and detected only by electrophoresis or other chemical trickery. This is the subject of the next section.

7.2 Amount of Heterozygosity and Effective Number of Neutral Alleles in a Finite Population
It has often been suggested in the past that the wild-type allele is not a single entity, but rather a population of different isoalleles that are indistinguishable by any ordinary procedure. Since each gene consists of several hundred, or perhaps thousands, of nucleotide pairs, the range of mutational possibility is enormous, especially when one considers combinations. That some of these are essentially equivalent seems reasonable and is reinforced by chemical studies showing that amino acid substitutions often do not affect in any detectable way the function of certain enzymes. In any case the probability that such alleles may exist in substantial numbers seems great enough to warrant an inquiry into the population consequences. The procedure given here is an extension of one presented earlier (Kimura and Crow, 1964).

In this chapter we shall consider only neutral alleles. Those with a slight selective advantage or disadvantage (especially those that are overdominant) are of interest, but require more advanced methods. These will be discussed in Chapter 9. To isolate the essential question, we consider an extreme situation in which the number of possible isoallelic states at a locus is large enough that each new mutant is of a type not preexisting in the population. This provides an estimate of the upper limit for the number of different alleles actually maintained by mutation.

Let u be the average rate of mutation of the alleles existing in the population of effective size $N_e$. If the population consists of N individuals, there will be 2Nu new mutant genes introduced per generation. The probability of two uniting gametes carrying identical alleles is given by 7.1.1. The two genes which are identical in state will remain so in the next generation only if neither of them has mutated since the previous generation, the probability of which is $(1 - u)^2$. Thus we can write

$$f_t = \left[\frac{1}{2N_e} + \left(1 - \frac{1}{2N_e}\right) f_{t-1}\right](1 - u)^2. \tag{7.2.1}$$
When equilibrium is reached, $f_t = f_{t-1}$. The solution is

$$f = \frac{(1 - u)^2}{2N_e - (2N_e - 1)(1 - u)^2}. \tag{7.2.2}$$

Ignoring terms containing $u^2$, we obtain

$$f = \frac{1 - 2u}{4N_e u - 2u + 1} \approx \frac{1}{4N_e u + 1}. \tag{7.2.3}$$

The proportion of heterozygous loci, H, is $1 - f$, or

$$H \approx \frac{4N_e u}{4N_e u + 1}. \tag{7.2.4}$$
This gives the probability that an individual chosen at random will be heterozygous for this locus. If the effective population number is much smaller than the reciprocal of 4u, the average individual will be homozygous for most such loci; on the other hand, if $N_e$ is much larger than 1/4u the individuals in the population will be largely heterozygous for neutral alleles.

It is convenient to define the effective number of alleles ($n_e$) maintained in the population by the reciprocal of the sum of the squares of allelic frequencies. In the present model, $n_e = 1/f$ (since the proportion of homozygotes $= f = \sum p_i^2$) and therefore

$$n_e = 4N_e u + 1. \tag{7.2.5}$$

Note that this will be less than the actual number of alleles, unless all alleles are of the same frequency. If $p_i$ is the frequency of the ith allele, then the effective number of alleles is $1/\sum p_i^2$. On the other hand, the actual number of alleles is $1/\bar{p}$, where $\bar{p}$ is the mean allele frequency. For example, with three alleles with frequencies 2/3, 1/6, and 1/6, $\bar{p} = 1/3$ (of course, since there are three alleles), but $\sum p_i^2 = 1/2$. Thus the number of alleles is three, but the amount of heterozygosity is the same as would be found in a population with two alleles of equal frequency.

Most of the time in population genetics we are more interested in the effective number of alleles than in the actual number. Many of the alleles will be represented only once or twice in the population and contribute very little to the average heterozygosity or genetic variance. Since it is ordinarily the latter quantities that we are interested in, the effective number is the more useful. Fortunately, the effective number is much more easily computed, as we have just shown. The actual number requires a knowledge of the distribution of allele frequencies; this will be deferred until Chapter 9.

The derivation of 7.2.3 assumed that each mutant is new; that is, that it is an allele not currently represented in the population. We can remove this restriction. Assume that there are k possible alleles and the rate of mutation from one to one of the k - 1 others is u/(k - 1); that is, the total mutation rate is u. Then we modify 7.2.1 to become
$$f_t = \left[\frac{1}{2N_e} + \left(1 - \frac{1}{2N_e}\right) f_{t-1}\right](1 - u)^2 + \left(1 - \frac{1}{2N_e}\right)(1 - f_{t-1})\,\frac{2u}{k - 1}\,(1 - u), \tag{7.2.6}$$

ignoring the possibility that both alleles mutate simultaneously to the same new mutant state. Ignoring terms in $u^2$ and letting $f_t = f_{t-1} = f$, we get for the equilibrium value,

$$f = \frac{\dfrac{4N_e u}{k - 1} + 1}{4N_e u\left(\dfrac{k}{k - 1}\right) + 1} \tag{7.2.7}$$

or

$$H = \frac{4N_e u}{4N_e u\left(\dfrac{k}{k - 1}\right) + 1}. \tag{7.2.8}$$
As expected, when $k \to \infty$, 7.2.8 becomes the same as 7.2.4.

We have been assuming that the population includes self-fertilization. We can readily remove this restriction. Including the possibility of mutation and returning to our model where each mutant is new to the population, 7.1.2 and 7.1.3 are modified to become

$$f_t = g_{t-1}(1 - u)^2, \tag{7.2.9}$$

$$g_t = \left[\frac{1}{2N_e}(1 + f_{t-1}) + \left(1 - \frac{1}{N_e}\right) g_{t-1}\right](1 - u)^2. \tag{7.2.10}$$

Putting these together and eliminating the g's gives

$$f_t = (1 - u)^2\left[\left(1 - \frac{1}{N_e}\right) f_{t-1} + \frac{(1 - u)^2}{2N_e}(1 + f_{t-2})\right]. \tag{7.2.11}$$
Table 7.2.1. The average proportion of homozygosity, f (upper figure), and the effective number of neutral alleles (lower figure) in a randomly mating population of effective number $N_e$ and mutation rate u. (From Kimura and Crow, 1964.)

  MUTATION                 EFFECTIVE POPULATION NUMBER, N_e
  RATE, u       10^2       10^3       10^4       10^5       10^6       10^7
  10^-4         .96        .71        .20        .024       .0025      .00025
                1.04       1.4        5.0        41         401        4001
  10^-5         .996       .96        .71        .20        .024       .0025
                1.004      1.04       1.4        5.0        41         401
  10^-6         .9996      .996       .96        .71        .20        .024
                1.0004     1.004      1.04       1.4        5.0        41
  10^-7         .99996     .9996      .996       .96        .71        .20
                1.00004    1.0004     1.004      1.04       1.4        5.0
At equilibrium, $f_t = f_{t-1} = f_{t-2} = f$, leading to

$$f \approx \frac{1}{4N_e u + 1} \tag{7.2.12}$$

and

$$H \approx \frac{4N_e u}{4N_e u + 1}, \tag{7.2.13}$$

just as before.

Figure 7.2.1. Average homozygosity, f, and the effective number of neutral alleles, $n_e$, as a function of the effective population number, $N_e$, for various mutation rates, u. The range of mutational possibility is assumed to be large enough that each mutant is an allele not currently in the population. Populations where f is between 0.5 and 1 are homozygous for the majority of loci; those below 0.5 are heterozygous for the majority. (Vertical scales: f from 0.5 down to 0.001 and the corresponding $n_e = 1/f$ from 2 to 1000; horizontal axis: $N_e$.)
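The equilibrium results of this section lend themselves to a quick numerical check. The Python sketch below, which is an illustration and not part of the original treatment, iterates 7.2.1 to equilibrium, compares the result with 7.2.2 and with the approximation 7.2.3, and computes the effective number of alleles from a set of allele frequencies. The function names and the particular values of $N_e$ and u are our own choices for the example.

```python
def iterate_homozygosity(Ne, u, generations, f0=0.0):
    """Iterate recurrence 7.2.1 for the stated number of generations."""
    f = f0
    for _ in range(generations):
        f = (1.0 / (2 * Ne) + (1.0 - 1.0 / (2 * Ne)) * f) * (1.0 - u) ** 2
    return f


def homozygosity_exact(Ne, u):
    """Equilibrium solution 7.2.2."""
    v = (1.0 - u) ** 2
    return v / (2 * Ne - (2 * Ne - 1) * v)


def homozygosity_approx(Ne, u):
    """Approximation 7.2.3, which ignores terms in u squared."""
    return 1.0 / (4 * Ne * u + 1.0)


def effective_allele_number(freqs):
    """Reciprocal of the sum of squared allele frequencies."""
    return 1.0 / sum(p * p for p in freqs)


if __name__ == "__main__":
    Ne, u = 100, 1e-3
    print(iterate_homozygosity(Ne, u, 5000))   # converges to the value below
    print(homozygosity_exact(Ne, u))
    print(homozygosity_approx(Ne, u))          # 1/(4*Ne*u + 1)
    print(effective_allele_number([2 / 3, 1 / 6, 1 / 6]))
```

For the three-allele example of the text (frequencies 2/3, 1/6, and 1/6) the last function returns an effective number of 2, up to rounding. Note that 7.2.2 involves the difference of two nearly equal quantities when $N_e$ is very large and u very small, so for such values the approximate form is the safer one to evaluate in floating-point arithmetic.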
Table 7.2.1 gives the average proportion of homozygosity, f, and the effective number of alleles, $n_e = 1/f$, for populations of various effective number and mutation rate. If $4N_e u = 1$ the effective number of alleles is 2 and increases as $4N_e u$ gets larger. As $4N_e u$ gets smaller the population becomes more nearly monomorphic. For populations with effective numbers of order $10^5$ or larger, it should not be surprising to find considerable neutral polymorphism, provided that neutral mutations are occurring at rates of the order $10^{-5}$ or higher. This is shown graphically in Figure 7.2.1.

Later in this chapter we shall discuss effective population number in more detail. At that time we will show that there are two ways of defining the effective population number. For many cases these lead to the same consequences, but not always. To anticipate, the effective number that we have been discussing in this section is the inbreeding effective number.

7.3 Change of Mean and Variance in Gene Frequency Due to Random Drift
Consider a population of N breeding individuals in which the frequencies of a pair of alleles $A_1$ and $A_2$ are p and 1 - p respectively. We assume that mating is such that the next generation is produced from union of N male and N female gametes, each extracted as a random sample from an infinitely large hypothetical population of gametes produced by the parents. Thus the probabilities that the number of gene $A_1$ will be 0, 1, 2, ..., i, ..., 2N in the pooled sample of 2N gametes are given by

$$(1 - p)^{2N},\quad 2Np(1 - p)^{2N-1},\quad \frac{2N(2N - 1)}{2!}\,p^2(1 - p)^{2N-2},\ \ldots,\ \frac{2N(2N - 1)\cdots(2N - i + 1)}{i!}\,p^i(1 - p)^{2N-i},\ \ldots,\ p^{2N}.$$
In other words, the probabilities follow the binomial distribution which is obtained by expanding $[p + (1 - p)]^{2N}$. Since the mean and variance of this distribution are respectively 2Np and 2Np(1 - p), the mean and variance of the gene frequency x will be given by

$$\bar{x} = E(x) = p, \tag{7.3.1}$$

$$V_x = E[(x - p)^2] = E(x^2) - p^2 = \frac{p(1 - p)}{2N}, \tag{7.3.2}$$

where E stands for the operation of taking expectations with respect to the number (i) of gene $A_1$ and x = i/2N. For a derivation of the binomial variance formula, p(1 - p)/2N, see A.5.3-A.5.5.

In the previous chapters we used the letter p, or sometimes q, to designate allele frequencies. Now, as we shift the emphasis to random effects and stochastic processes, we shall designate the allele frequency by x to emphasize that this is a random variable. Usually, we use p for the initial value, which is what the frequency would be in the absence of random effects.

We will call the newly formed generation the first generation. Formulae 7.3.1 and 7.3.2 show that in going to the first generation, the mean of the gene frequency is unchanged, while the variance increases by an amount inversely proportional to the population number N. Suppose that the second generation is produced again from N male and N female gametes taken as random samples from the first generation. Let x' be the frequency of $A_1$ in the second generation; then
$$\bar{x}' = E(x') = E_1 E_2(x') = E_1[E_2(x')] = E_1(x) = p,$$

where $E_1$ and $E_2$ designate expected value operators in the first and the second generation. It will be seen that the mean is again unchanged. Similarly, the variance is calculated as follows: Since

$$V_{x'} = E[(x' - p)^2] = E_1 E_2\{[(x' - x) + (x - p)]^2\} = E_1 E_2[(x' - x)^2] + 2E_1[(x - p)E_2(x' - x)] + E_1[(x - p)^2],$$

in which

$$E_2[(x' - x)^2] = \frac{x(1 - x)}{2N} = \frac{1}{2N}\left[x - p^2 - (x^2 - p^2)\right]$$

and

$$E_2(x' - x) = 0,$$

we obtain

$$V_{x'} = \frac{1}{2N}\left[p - p^2 - \frac{p(1 - p)}{2N}\right] + \frac{p(1 - p)}{2N} = \frac{p(1 - p)}{2N}\left(1 - \frac{1}{2N}\right) + \frac{p(1 - p)}{2N}.$$
If the same pattern of reproduction is continued until the $t$th generation, the mean and the variance become respectively

$$\bar{x}^{(t)} = p, \tag{7.3.3}$$

$$V_x^{(t)} = \frac{p(1-p)}{2N}\left[\left(1 - \frac{1}{2N}\right)^{t-1} + \cdots + \left(1 - \frac{1}{2N}\right) + 1\right] = p(1-p)\left[1 - \left(1 - \frac{1}{2N}\right)^{t}\right]. \tag{7.3.4}$$

After a large number of generations, the mean is still unchanged, i.e.,

$$\bar{x}^{(\infty)} = p, \tag{7.3.5}$$

but the variance increases with time and finally becomes

$$V_x^{(\infty)} = p(1-p), \tag{7.3.6}$$

since

$$\lim_{t \to \infty}\left(1 - \frac{1}{2N}\right)^{t} = 0.$$
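Formulae 7.3.3 and 7.3.4 can be checked numerically by carrying the whole distribution of the gene frequency forward under the binomial sampling scheme described above. The following is only a minimal sketch: the function names and the values $N = 10$, six initial copies of $A_1$, and 25 generations are illustrative assumptions, not part of the original treatment.

```python
from math import comb

def wf_step(dist, two_n):
    """Apply one generation of binomial gamete sampling to a distribution
    over the number of A1 copies (0..2N)."""
    new = [0.0] * (two_n + 1)
    for i, prob in enumerate(dist):
        if prob == 0.0:
            continue
        x = i / two_n
        for j in range(two_n + 1):
            new[j] += prob * comb(two_n, j) * x**j * (1 - x) ** (two_n - j)
    return new

def mean_and_variance(dist, two_n):
    """Mean and variance of the gene frequency x = i/2N under dist."""
    mean = sum(prob * i / two_n for i, prob in enumerate(dist))
    second = sum(prob * (i / two_n) ** 2 for i, prob in enumerate(dist))
    return mean, second - mean**2

N, start_copies, generations = 10, 6, 25      # illustrative values only
two_n = 2 * N
p = start_copies / two_n
dist = [0.0] * (two_n + 1)
dist[start_copies] = 1.0                      # every population starts at p
for _ in range(generations):
    dist = wf_step(dist, two_n)

mean, variance = mean_and_variance(dist, two_n)
predicted = p * (1 - p) * (1 - (1 - 1 / two_n) ** generations)   # closed form of 7.3.4
print(mean, variance, predicted)
```

Under these assumptions the computed mean should stay at $p$ and the computed variance should agree with the closed form of 7.3.4 up to rounding error.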
This suggests that at this limit, the gene frequency will become either 1 or 0 with probabilities $p$ and $1 - p$. The process leading to this state starting from an arbitrary gene frequency will be studied in detail in Chapter 8. The state $x = 1$ corresponds to fixation of gene $A_1$ in the population and $x = 0$ corresponds to its loss. The increase of variance with time as shown in 7.3.4 shows that the probability of belonging to one of these states increases with time.

We are referring here to the variance in gene frequency between a group of populations, each starting with the same gene frequency. This variance increases with time as the populations diverge from their initial value. At the same time the average genic variance within each population decreases as the individual gene frequencies move toward 0 or 1. In parallel with this the average heterozygosity of an individual decreases with time, as we shall now show.

Let $H_t$ be the probability that a randomly chosen individual in the population at the $t$th generation be heterozygous. Since this probability is $2x(1 - x)$ for a given gene frequency $x$, we have

$$H_t = E[2x(1 - x)] = 2E[x - (x^2 - p^2) - p^2] = 2\left[\bar{x}^{(t)} - V_x^{(t)} - p^2\right],$$

which reduces to

$$H_t = 2p(1-p)\left(1 - \frac{1}{2N}\right)^{t} \qquad (t = 1, 2, \ldots), \tag{7.3.7}$$

if we apply 7.3.3 and 7.3.4. We see that the average heterozygosity decreases at a constant rate of $1/2N$ per generation. If the initial population (generation 0) is produced by random mating, the expected frequency of heterozygotes at that time is $2p(1 - p)$, so we have

$$H_t = H_0\left(1 - \frac{1}{2N}\right)^{t} \qquad (t = 0, 1, 2, \ldots), \tag{7.3.8}$$

as we obtained earlier by another method (see 3.11.2).
When $N$ is sufficiently large, this is approximated by

$$H_t = H_0e^{-t/2N} \tag{7.3.9}$$

(see also formula 8.4.11 in the next chapter).

At this point, a few remarks are in order. First it should be noted that the homozygosity or heterozygosity of an individual within a population is a concept distinct from genetic homogeneity or heterogeneity of the population itself. For the latter Wright used the terms homallelic and heterallelic; a population is homallelic if it contains only one allele, but is heterallelic if it contains two or more alleles. Secondly, as shown above, the probability of heterozygosity decreases at the rate of exactly $1/2N$ per generation under random mating, while the probability of coexistence of both alleles within a population, though it diminishes each generation, does not decrease at a constant rate. Its rate approaches $1/2N$ only asymptotically. This will be demonstrated at the end of the next section and in more detail in Chapter 8. Wright (1931) gave an approximate formula,

$$L_t = L_0e^{-t/2N}, \tag{7.3.10}$$

for the number of unfixed loci at the $t$th generation. The approximation is satisfactory only for a large $t$ and an intermediate initial gene frequency.

The number of alleles introduces another complication into the process of decrease in genetic heterogeneity, but not in the decrease of heterozygosity: Formulae 7.3.8 and 7.3.9 are correct not only for two alleles but for any number. Let $x_1, x_2, \ldots, x_n$ be respectively the frequencies of alleles $A_1, A_2, \ldots, A_n$ in the random-mating population at the $t$th generation ($x_1 + x_2 + \cdots + x_n = 1$). The distribution of the frequencies of these alleles in the next generation produced by $N$ male and $N$ female gametes taken as random samples from generation $t$ follows the multinomial distribution with means, variances, and covariances given by

$$E(x_i') = x_i, \tag{7.3.11}$$

$$E(x_i'^2) - x_i^2 = \frac{x_i(1 - x_i)}{2N}, \tag{7.3.12}$$

$$E(x_i'x_j') - x_ix_j = -\frac{x_ix_j}{2N} \qquad (i \neq j). \tag{7.3.13}$$

The expected total frequency of heterozygotes in generation $t + 1$ is

$$H_{t+1} = E\Big(\sum_{i \neq j}x_i'x_j'\Big) = \sum_{i \neq j}E(x_i'x_j'),$$

and if we apply 7.3.13 in the right side of the above expression, we obtain

$$H_{t+1} = \left(1 - \frac{1}{2N}\right)\sum_{i \neq j}x_ix_j$$

or

$$H_{t+1} = \left(1 - \frac{1}{2N}\right)H_t. \tag{7.3.14}$$
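The one-generation relation 7.3.14 can be verified for any number of alleles without simulation, because each allele count in a multinomial sample has a binomial marginal distribution. The sketch below is an illustration under assumed values (three alleles with frequencies 0.5, 0.3, 0.2 and $2N = 40$); the function name is hypothetical.

```python
from math import comb

def expected_heterozygosity_next(freqs, two_n):
    """E[1 - sum_i x_i'^2] after one generation of sampling 2N gametes.
    Each x_i' has a binomial(2N, x_i)/2N marginal distribution."""
    sum_second_moments = 0.0
    for x in freqs:
        sum_second_moments += sum(
            (j / two_n) ** 2 * comb(two_n, j) * x**j * (1 - x) ** (two_n - j)
            for j in range(two_n + 1)
        )
    return 1.0 - sum_second_moments

freqs = [0.5, 0.3, 0.2]          # illustrative allele frequencies
two_n = 40                       # illustrative number of gametes (N = 20)
h_now = 1.0 - sum(x * x for x in freqs)      # equals the sum over i != j of x_i x_j
h_next = expected_heterozygosity_next(freqs, two_n)
print(h_next / h_now, 1 - 1 / two_n)
```

The two printed numbers should coincide, whatever allele frequencies are supplied, as long as they sum to one.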
The expected total frequency of heterozygotes in a randomly mating population decreases at the rate of $1/2N$, regardless of the number of alleles or the allele frequencies. If we note that the expected frequency of heterozygotes is proportional to $1 - f_t$ (see Chapter 3), we obtain

$$1 - f_{t+1} = \left(1 - \frac{1}{2N}\right)(1 - f_t)$$

or

$$f_{t+1} = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right)f_t, \tag{7.3.15}$$

which is equivalent to equation 7.1.1 and also 3.11.1. The above relation may also be written as

$$\frac{\Delta f_t}{1 - f_t} = \frac{f_{t+1} - f_t}{1 - f_t} = \frac{1}{2N}. \tag{7.3.16}$$

Starting from $f_0 = 0$, the inbreeding coefficient in generation $t$ is given by

$$f_t = 1 - \left(1 - \frac{1}{2N}\right)^{t}. \tag{7.3.17}$$

When $N$ is large we have, with good approximation,

$$f_t = 1 - e^{-t/2N}. \tag{7.3.18}$$
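To see how close the recurrence, its exact solution, and the exponential approximation are, one may iterate 7.3.15 directly. This is a small sketch with assumed values ($N = 50$, 100 generations) and a hypothetical function name.

```python
from math import exp

def inbreeding_by_recursion(n_ind, generations):
    """Iterate f_{t+1} = 1/2N + (1 - 1/2N) f_t starting from f_0 = 0."""
    f = 0.0
    history = [f]
    for _ in range(generations):
        f = 1 / (2 * n_ind) + (1 - 1 / (2 * n_ind)) * f
        history.append(f)
    return history

N, T = 50, 100                                   # illustrative values
rec = inbreeding_by_recursion(N, T)
closed = [1 - (1 - 1 / (2 * N)) ** t for t in range(T + 1)]   # 7.3.17
approx = [1 - exp(-t / (2 * N)) for t in range(T + 1)]        # 7.3.18
print(rec[T], closed[T], approx[T])
```

For these values the recursion and 7.3.17 agree to rounding error, while 7.3.18 differs from them only in the third decimal place.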
7.4 Change of Gene Frequency Moments with Random Drift
This section is an extension of the previous one. Here we will study more generally the change of gene frequency moments. First, a discrete model will be employed to derive a few moments and the results will be compared with the corresponding values obtained by a continuous model. The latter model will then be used with advantage to derive a general formula for the moments when $N$ is large. Some of the results obtained will be found useful in later sections. In the next section, we will find that the first four moments are required to describe the change of variance in quantitative characters in finite populations. Furthermore, by knowing all the moments for large $N$, we get extensive knowledge of the process of random genetic drift which will be treated in detail in Chapter 8, Section 4. Actually, it was through this method in conjunction with the method of partial differential equations that the entire process of random genetic drift was first constructed (Kimura, 1955a).

Let us consider an isolated population of $N$ breeding diploid individuals. Let $A_1$ and $A_2$ be a pair of alleles with frequencies $x$ and $1 - x$ respectively. We assume that mating is random and that the mode of reproduction is such that $N$ male gametes and $N$ female gametes are drawn as a random sample from the population to form the next generation. Here $x$ may take any one of the sequence of values:

$$0,\ \frac{1}{2N},\ \frac{2}{2N},\ \ldots,\ 1 - \frac{1}{2N},\ 1.$$

We designate by $f(x; t)$ the probability that the gene frequency is $x$ at the $t$th generation ($t = 0, 1, 2, \ldots$). Let $x_t$ be the frequency of $A_1$ in the $t$th generation and let $\delta x_t$ be the change of $x_t$ due to random sampling of gametes such that

$$x_{t+1} = x_t + \delta x_t. \tag{7.4.1}$$

Let

$$\mu_n^{(t+1)} = E(x_{t+1}^n) \tag{7.4.2}$$

be the $n$th moment of the gene frequency distribution about 0 in the generation $t + 1$. Conventionally the $n$th moment around 0 is designated by $\mu_n'$, but since we often use the prime ($'$) to designate values in the next generation, we will save the prime for this purpose.

Now, we note that the distribution in the $(t + 1)$th generation is the result of convolution of the distribution in the $t$th generation and the sampling "error" in reproduction. Thus, in calculating the expectation of $x_{t+1}^n$ in terms of $(x_t + \delta x_t)^n$, we take the expectation in two steps: first taking the expectation for a random change $\delta x_t$, which we shall denote by $E_\delta$, and then taking the expectation for the existing distribution of gene frequency, which we shall denote by $E_\phi$, such that

$$E_\phi(x_t^n) = \sum_{x=0}^{1}x^nf(x; t) = \mu_n^{(t)} \tag{7.4.3}$$

and

$$E_\delta[(\delta x_t)^n] = \sum_{i=0}^{2N}\left(\frac{i}{2N} - x_t\right)^{n}\binom{2N}{i}x_t^i(1 - x_t)^{2N-i}, \tag{7.4.4}$$

since $\delta x_t$ can be assumed to follow the binomial distribution.
7.4.4
P R O P E RTIES O F A FINITE P O P U LATI O N
For
I , 2, 3, and 4, 7.4.4 becomes
n =
E,,(fJ x ,) = 0 ,
=
E,, [(fJ x, ) 2J
and
333
3 EA [(fJ x,) J
=
4 E [(fJ x,) J "
=
7.4.5
x,( 1 - x, ) 2N
7.4.'
x,( 1 - x,)( 1 - 2 x,) ( 2N) 2
7.4.7
3x:( 1 - X,) 2 ( 2N) 2
7.4.8
+
x,( 1 - x,)( 1 - 6x, + 6x; ) (2N)3 .
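The four expressions 7.4.5-7.4.8 are the central moments of a binomial proportion, and they can be confirmed exactly by evaluating the sum 7.4.4 in rational arithmetic. The sketch below assumes illustrative values ($2N = 12$, $x_t = 1/4$); the function name is ours.

```python
from fractions import Fraction
from math import comb

def central_moment(n, x, two_n):
    """Evaluate the sum in 7.4.4 exactly: E_delta[(delta x)^n] for frequency x."""
    return sum(
        (Fraction(i, two_n) - x) ** n * comb(two_n, i) * x**i * (1 - x) ** (two_n - i)
        for i in range(two_n + 1)
    )

two_n = 12                     # illustrative 2N
x = Fraction(1, 4)             # illustrative current gene frequency
m1, m2, m3, m4 = (central_moment(n, x, two_n) for n in (1, 2, 3, 4))

expected_m2 = x * (1 - x) / two_n
expected_m3 = x * (1 - x) * (1 - 2 * x) / two_n**2
expected_m4 = (3 * x**2 * (1 - x) ** 2 / two_n**2
               + x * (1 - x) * (1 - 6 * x + 6 * x**2) / two_n**3)
print(m1, m2 == expected_m2, m3 == expected_m3, m4 == expected_m4)
```

Because `Fraction` arithmetic is exact, the comparisons are equalities rather than approximate matches.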
The mean, variance, and higher moments of the distribution can be obtained as follows. From 7.4.1,

$$E(x_{t+1}) = E(x_t + \delta x_t).$$

The left side is $\mu_1^{(t+1)}$ by 7.4.2, while the right side is

$$E(x_t + \delta x_t) = E_\phi[x_t + E_\delta(\delta x_t)] = E_\phi(x_t) = \mu_1^{(t)},$$

since $E_\delta(\delta x_t) = 0$ from 7.4.5. Therefore

$$\mu_1^{(t+1)} = \mu_1^{(t)},$$

i.e., the mean stays constant:

$$\mu_1^{(t)} = \mu_1^{(0)}. \tag{7.4.9}$$

To obtain the second moment we square both sides of 7.4.1 and take expectations.

$$E(x_{t+1}^2) = E[x_t^2 + 2x_t\,\delta x_t + (\delta x_t)^2].$$

The left side is $\mu_2^{(t+1)}$ by definition, while the right side is

$$E_\phi[x_t^2 + 2x_tE_\delta(\delta x_t) + E_\delta(\delta x_t)^2] = E_\phi\left[x_t^2 + \frac{x_t(1 - x_t)}{2N}\right],$$

which becomes

$$\left(1 - \frac{1}{2N}\right)E_\phi(x_t^2) + \frac{1}{2N}E_\phi(x_t) = \left(1 - \frac{1}{2N}\right)\mu_2^{(t)} + \frac{1}{2N}\mu_1^{(t)},$$

by noting 7.4.5, 7.4.6, as well as 7.4.3. Since $\mu_1^{(t)} = \mu_1^{(0)}$, the above equation gives

$$\mu_2^{(t+1)} = \left(1 - \frac{1}{2N}\right)\mu_2^{(t)} + \frac{1}{2N}\mu_1^{(0)}. \tag{7.4.10}$$
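The solution of this finite difference equation is

$$\mu_2^{(t)} = \left(\mu_2^{(0)} - \mu_1^{(0)}\right)\left(1 - \frac{1}{2N}\right)^{t} + \mu_1^{(0)}. \tag{7.4.11}$$

Let $V_x^{(t)}$ be the variance of gene frequency in generation $t$, i.e.,

$$V_x^{(t)} = \mu_2^{(t)} - \left(\mu_1^{(t)}\right)^2;$$

then

$$V_x^{(t)} = \mu_1^{(0)}\left(1 - \mu_1^{(0)}\right) - \left(\mu_1^{(0)} - \mu_2^{(0)}\right)\left(1 - \frac{1}{2N}\right)^{t}. \tag{7.4.12}$$

To find the third moment, we use a similar procedure; cubing both sides of 7.4.1 and taking expectations,

$$E(x_{t+1}^3) = E_\phi\left[x_t^3 + 3x_t^2E_\delta(\delta x_t) + 3x_tE_\delta(\delta x_t)^2 + E_\delta(\delta x_t)^3\right] = E_\phi\left[x_t^3 + 3x_t\frac{x_t(1 - x_t)}{2N} + \frac{x_t(1 - x_t)(1 - 2x_t)}{(2N)^2}\right],$$

where the last term is obtained from 7.4.7. Thus, we obtain the recurrence relation,

$$\mu_3^{(t+1)} = \left(\frac{1}{2N}\right)^{2}\mu_1^{(t)} + \frac{3}{2N}\left(1 - \frac{1}{2N}\right)\mu_2^{(t)} + \left(1 - \frac{1}{2N}\right)\left(1 - \frac{2}{2N}\right)\mu_3^{(t)}, \tag{7.4.13}$$

the solution of which is

$$\mu_3^{(t)} = \mu_1^{(0)} + \frac{3}{2}\left(\mu_2^{(0)} - \mu_1^{(0)}\right)\left(1 - \frac{1}{2N}\right)^{t} + \left(\mu_3^{(0)} - \frac{3}{2}\mu_2^{(0)} + \frac{1}{2}\mu_1^{(0)}\right)\left[\left(1 - \frac{1}{2N}\right)\left(1 - \frac{2}{2N}\right)\right]^{t}. \tag{7.4.14}$$

Carrying out the same procedure, we obtain the following recurrence formula for the fourth moment:

$$\mu_4^{(t+1)} = \left(\frac{1}{2N}\right)^{3}\mu_1^{(t)} + \frac{7}{(2N)^2}\left(1 - \frac{1}{2N}\right)\mu_2^{(t)} + \frac{6}{2N}\left(1 - \frac{1}{2N}\right)\left(1 - \frac{2}{2N}\right)\mu_3^{(t)} + \left(1 - \frac{1}{2N}\right)\left(1 - \frac{2}{2N}\right)\left(1 - \frac{3}{2N}\right)\mu_4^{(t)}, \tag{7.4.15}$$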
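the solution of which is

$$\mu_4^{(t)} = \mu_1^{(0)} + \frac{18N - 11}{10N - 6}\left(\mu_2^{(0)} - \mu_1^{(0)}\right)\left(1 - \frac{1}{2N}\right)^{t} + 2\left(\mu_3^{(0)} - \frac{3}{2}\mu_2^{(0)} + \frac{1}{2}\mu_1^{(0)}\right)\left[\left(1 - \frac{1}{2N}\right)\left(1 - \frac{2}{2N}\right)\right]^{t} + \left[\mu_4^{(0)} - 2\mu_3^{(0)} + \frac{12N - 7}{10N - 6}\mu_2^{(0)} - \frac{2N - 1}{10N - 6}\mu_1^{(0)}\right]\left[\left(1 - \frac{1}{2N}\right)\left(1 - \frac{2}{2N}\right)\left(1 - \frac{3}{2N}\right)\right]^{t}. \tag{7.4.16}$$

If we start from a population with gene frequency $p$, we have

$$\mu_1^{(0)} = p, \quad \mu_2^{(0)} = p^2, \quad \mu_3^{(0)} = p^3,$$

etc., and the above results are expressed as follows:

$$\mu_1^{(t)} = p, \tag{7.4.17}$$

$$\mu_2^{(t)} = p - p(1 - p)\left(1 - \frac{1}{2N}\right)^{t}, \tag{7.4.18}$$

$$V_x^{(t)} = p(1 - p)\left[1 - \left(1 - \frac{1}{2N}\right)^{t}\right], \tag{7.4.19}$$

$$\mu_3^{(t)} = p - \frac{3}{2}p(1 - p)\left(1 - \frac{1}{2N}\right)^{t} - \frac{1}{2}p(1 - p)(2p - 1)\left[\left(1 - \frac{1}{2N}\right)\left(1 - \frac{2}{2N}\right)\right]^{t}, \tag{7.4.20}$$

$$\mu_4^{(t)} = p - \frac{18N - 11}{10N - 6}p(1 - p)\left(1 - \frac{1}{2N}\right)^{t} - p(1 - p)(2p - 1)\left[\left(1 - \frac{1}{2N}\right)\left(1 - \frac{2}{2N}\right)\right]^{t} - p(1 - p)\left[\frac{2N - 1}{10N - 6} - p(1 - p)\right]\left[\left(1 - \frac{1}{2N}\right)\left(1 - \frac{2}{2N}\right)\left(1 - \frac{3}{2N}\right)\right]^{t}. \tag{7.4.21}$$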
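The closed forms for the second, third, and fourth moments can be compared with moments computed directly from the binomial sampling scheme. The sketch below does this in exact rational arithmetic for small populations; the function names and the chosen values ($N = 4$ with three initial copies, $N = 3$ with two) are illustrative assumptions, and the closed forms are coded from 7.4.18, 7.4.20, and 7.4.21 as written above.

```python
from fractions import Fraction
from math import comb

def wf_moments(two_n, start_copies, generations, orders=(1, 2, 3, 4)):
    """Moments of the gene frequency after the given number of generations,
    obtained by iterating the exact binomial transition probabilities."""
    dist = [Fraction(0)] * (two_n + 1)
    dist[start_copies] = Fraction(1)
    for _ in range(generations):
        new = [Fraction(0)] * (two_n + 1)
        for i, prob in enumerate(dist):
            x = Fraction(i, two_n)
            for j in range(two_n + 1):
                new[j] += prob * comb(two_n, j) * x**j * (1 - x) ** (two_n - j)
        dist = new
    return [sum(prob * Fraction(i, two_n) ** n for i, prob in enumerate(dist))
            for n in orders]

def closed_form_moments(n_ind, p, t):
    """mu_1..mu_4 from 7.4.17, 7.4.18, 7.4.20 and 7.4.21 (p is a Fraction)."""
    a = Fraction(1, 2 * n_ind)
    l1 = (1 - a) ** t
    l2 = ((1 - a) * (1 - 2 * a)) ** t
    l3 = ((1 - a) * (1 - 2 * a) * (1 - 3 * a)) ** t
    q = p * (1 - p)
    c = Fraction(18 * n_ind - 11, 10 * n_ind - 6)
    d = Fraction(2 * n_ind - 1, 10 * n_ind - 6)
    mu2 = p - q * l1
    mu3 = p - Fraction(3, 2) * q * l1 - Fraction(1, 2) * q * (2 * p - 1) * l2
    mu4 = p - c * q * l1 - q * (2 * p - 1) * l2 - q * (d - q) * l3
    return [p, mu2, mu3, mu4]

exact = wf_moments(8, 3, 5)                          # N = 4, p = 3/8, t = 5
formula = closed_form_moments(4, Fraction(3, 8), 5)
print(exact == formula)
```

The comparison is an exact equality because no floating-point rounding is involved; larger $N$ works the same way but runs more slowly in rational arithmetic.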
The result for variance agrees with 7.3.4; see also Wright (1942) and Crow (1954) for alternative derivations. The moment formulae are important in finding the process of change in the genotypic variance within lines, between lines, and the additive component of the variance within lines in the case of inbreeding due to restricted population size. This was first worked out for a completely recessive gene by Robertson (1952). Though his expressions are less explicit, the present results seem to be in complete agreement with his, which were obtained by the use of matrix algebra.

In order to find the general formula for the moments of the distribution, we shall now make the assumption that the population size $N$ is large enough that terms of the order $1/N^2$, $1/N^3$, etc., can be omitted without serious error. The recurrence formula for the $n$th moment is obtained as follows:

$$E(x_{t+1}^n) = E[(x_t + \delta x_t)^n] = E\left[x_t^n + \binom{n}{1}x_t^{n-1}\delta x_t + \binom{n}{2}x_t^{n-2}(\delta x_t)^2 + \cdots\right],$$

$$E_\phi(x_{t+1}^n) = E_\phi\left[x_t^n + \frac{n(n - 1)}{2}\,\frac{x_t^{n-1}(1 - x_t)}{2N} + O\!\left(\frac{1}{N^2}\right)\right],$$

where $O(1/N^2)$ denotes a term of the order of $1/N^2$. Neglecting higher order terms, we have

$$\mu_n^{(t+1)} = \mu_n^{(t)} + \frac{n(n - 1)}{4N}\mu_{n-1}^{(t)} - \frac{n(n - 1)}{4N}\mu_n^{(t)} \qquad (n = 1, 2, \ldots). \tag{7.4.22}$$
For a large $N$, the moments change very slowly with generations, and we can replace the above system of finite difference equations by the following system of differential equations.

$$\frac{d\mu_n^{(t)}}{dt} = -\frac{n(n - 1)}{4N}\left[\mu_n^{(t)} - \mu_{n-1}^{(t)}\right] \qquad (n = 1, 2, 3, \ldots). \tag{7.4.23}$$

If the population starts from the gene frequency $p$, we get the following results for $n = 1, 2, 3$, and 4:

$$\mu_1^{(t)} = p, \tag{7.4.24}$$

$$\mu_2^{(t)} = p - p(1 - p)e^{-\frac{1}{2N}t}, \tag{7.4.25}$$

$$V_x^{(t)} = p(1 - p)\left(1 - e^{-\frac{1}{2N}t}\right), \tag{7.4.26}$$

$$\mu_3^{(t)} = p - \frac{3}{2}p(1 - p)e^{-\frac{1}{2N}t} - \frac{1}{2}p(1 - p)(2p - 1)e^{-\frac{3}{2N}t}, \tag{7.4.27}$$

$$\mu_4^{(t)} = p - \frac{9}{5}p(1 - p)e^{-\frac{1}{2N}t} - p(1 - p)(2p - 1)e^{-\frac{3}{2N}t} - \frac{1}{5}p(1 - p)(5p^2 - 5p + 1)e^{-\frac{6}{2N}t}. \tag{7.4.28}$$

If we compare these formulae with 7.4.17, 7.4.18, 7.4.19, 7.4.20, and 7.4.21, it will be seen that even for a rather small $N$ such as $N = 10$, they give very good approximations to the formulae based on the discrete model.
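The closeness of the continuous approximation at $N = 10$ can be examined by tabulating both sets of formulae. In the sketch below the discrete closed forms are coded from 7.4.18, 7.4.20, and 7.4.21 and the continuous ones from 7.4.25, 7.4.27, and 7.4.28; the initial frequency $p = 0.3$, the range of 200 generations, and the function names are assumptions chosen only for illustration.

```python
from math import exp

def discrete_moments(n_ind, p, t):
    """(mu_2, mu_3, mu_4) from the discrete model, 7.4.18, 7.4.20, 7.4.21."""
    a = 1 / (2 * n_ind)
    q = p * (1 - p)
    l1 = (1 - a) ** t
    l2 = ((1 - a) * (1 - 2 * a)) ** t
    l3 = ((1 - a) * (1 - 2 * a) * (1 - 3 * a)) ** t
    c = (18 * n_ind - 11) / (10 * n_ind - 6)
    d = (2 * n_ind - 1) / (10 * n_ind - 6)
    return (p - q * l1,
            p - 1.5 * q * l1 - 0.5 * q * (2 * p - 1) * l2,
            p - c * q * l1 - q * (2 * p - 1) * l2 - q * (d - q) * l3)

def continuous_moments(n_ind, p, t):
    """(mu_2, mu_3, mu_4) from the continuous model, 7.4.25, 7.4.27, 7.4.28."""
    q = p * (1 - p)
    e1, e3, e6 = (exp(-k * t / (2 * n_ind)) for k in (1, 3, 6))
    return (p - q * e1,
            p - 1.5 * q * e1 - 0.5 * q * (2 * p - 1) * e3,
            p - 1.8 * q * e1 - q * (2 * p - 1) * e3
            - 0.2 * q * (5 * p * p - 5 * p + 1) * e6)

N, p = 10, 0.3                     # illustrative values
worst = max(abs(d - c)
            for t in range(0, 201)
            for d, c in zip(discrete_moments(N, p, t), continuous_moments(N, p, t)))
print(worst)
```

The printed value is the largest absolute discrepancy among the three moments over the tabulated generations, which gives a concrete sense of what "very good approximation" means for this particular choice of $p$.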
The above formulae may also be expressed in terms of the inbreeding coefficient, since

$$e^{-\frac{1}{2N}t} = 1 - f_t,$$

as shown by 7.3.18. Thus

$$\mu_1^{(t)} = p, \tag{7.4.29}$$

$$\mu_2^{(t)} = p - p(1 - p)(1 - f_t), \tag{7.4.30}$$

$$V_x^{(t)} = p(1 - p)f_t, \tag{7.4.31}$$

$$\mu_3^{(t)} = p - \frac{3}{2}p(1 - p)(1 - f_t) - \frac{1}{2}p(1 - p)(2p - 1)(1 - f_t)^3, \tag{7.4.32}$$

$$\mu_4^{(t)} = p - \frac{9}{5}p(1 - p)(1 - f_t) - p(1 - p)(2p - 1)(1 - f_t)^3 - \frac{1}{5}p(1 - p)[1 - 5p(1 - p)](1 - f_t)^6. \tag{7.4.33}$$
From the above process of calculation, it may be inferred that the solution of 7.4.23 for an arbitrary $n$ ($n \geq 2$), with the initial condition $\mu_n^{(0)} = p^n$, must have the form

$$\mu_n^{(t)} = p + \sum_{i=1}^{n-1}C_n^{(i)}e^{-\lambda_it}, \tag{7.4.34}$$

where

$$\lambda_i = \frac{i(i + 1)}{4N}, \tag{7.4.35}$$

and the $C_n^{(i)}$'s are constants which do not depend on $t$. For a large value of $t$, only the first few of these constants are important and they are rather easily determined from 7.4.23. For example, substituting 7.4.34 into 7.4.23 and comparing the terms involving $\lambda_1$ in both sides of the resulting equation, we get

$$C_n^{(1)} = \frac{n(n - 1)}{(n + 1)(n - 2)}C_{n-1}^{(1)},$$

from which we obtain

$$C_n^{(1)} = \frac{3(n - 1)}{n + 1}C_2^{(1)}, \tag{7.4.36}$$

where $C_2^{(1)} = -p(1 - p)$ from 7.4.25. General terms are much more difficult to derive so we shall only give the result:

$$\mu_n^{(t)} = p + \sum_{i=1}^{n-1}(2i + 1)p(1 - p)(-1)^iF(1 - i, i + 2, 2, p) \times \frac{(n - 1)(n - 2)\cdots(n - i)}{(n + 1)(n + 2)\cdots(n + i)}\,e^{-\frac{i(i+1)}{4N}t}, \tag{7.4.37}$$
where $F$ represents the hypergeometric function, that is,

$$F(1 - i, i + 2, 2, p) = 1 + \frac{(1 - i)(i + 2)}{1 \times 2}p + \frac{(1 - i)(2 - i)(i + 2)(i + 3)}{1 \times 2 \times 2 \times 3}p^2 + \cdots \qquad (i = 1, 2, 3, \ldots). \tag{7.4.38}$$
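Because the first argument $1 - i$ is a non-positive integer, the series 7.4.38 terminates, so it can be summed directly. The following sketch evaluates it and then uses it in 7.4.37; the function names `hyper_f` and `moment` and the numerical values are hypothetical choices for illustration, not notation from the text.

```python
from fractions import Fraction
from math import exp

def hyper_f(i, p):
    """F(1 - i, i + 2, 2, p) of 7.4.38; the series has i terms in all."""
    a, b, c = 1 - i, i + 2, 2
    term = Fraction(1)
    total = Fraction(1)
    for k in range(i - 1):
        # ratio of successive terms of the hypergeometric series
        term *= Fraction((a + k) * (b + k), (c + k) * (k + 1)) * p
        total += term
    return total

def moment(n, p, t, n_ind):
    """mu_n^(t) from 7.4.37 (large-N approximation); p must be a Fraction."""
    total = float(p)
    for i in range(1, n):
        ratio = 1.0
        for j in range(1, i + 1):
            ratio *= (n - j) / (n + j)
        coeff = (2 * i + 1) * float(p * (1 - p)) * (-1) ** i * float(hyper_f(i, p))
        total += coeff * ratio * exp(-i * (i + 1) * t / (4 * n_ind))
    return total

p = Fraction(3, 10)                      # illustrative initial frequency
print(hyper_f(1, p), hyper_f(2, p), hyper_f(3, p))
print(moment(3, p, 20, 50))              # third moment at t = 20 with N = 50
```

With these definitions, `moment(3, ...)` and `moment(4, ...)` can be compared against 7.4.27 and 7.4.28 as a consistency check on the general formula.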
For $i = 1$, 2, and 3, the right-hand side of the above expression reduces to 1, $1 - 2p$, and $1 - 5p + 5p^2$, respectively.

As the process of random drift proceeds, the probability that the gene is either fixed in the population or lost increases gradually. It is possible to find this probability from the moment formula 7.4.37 by the following device: For the probability of fixation,

$$f(1, t) = \lim_{n \to \infty}\sum_{x=0}^{1}x^nf(x, t) = \lim_{n \to \infty}\mu_n^{(t)}. \tag{7.4.39}$$

This makes use of the fact that for $n = \infty$ and $0 \leq x \leq 1$, $x^n = 1$ only when $x = 1$; otherwise $x^n = 0$. The right side of 7.4.39 is readily evaluated from 7.4.37. Let us denote the above probability by $f(p, 1; t)$, meaning that this is the probability of the gene $A_1$ reaching fixation by the $t$th generation given that its initial frequency is $p$. Thus we have

$$f(p, 1; t) = p + \sum_{i=1}^{\infty}(2i + 1)p(1 - p)(-1)^iF(1 - i, i + 2, 2, p)\,e^{-\frac{i(i+1)}{4N}t}$$

$$= p - 3p(1 - p)e^{-\frac{1}{2N}t} + 5p(1 - p)(1 - 2p)e^{-\frac{3}{2N}t} - 7p(1 - p)(1 - 5p + 5p^2)e^{-\frac{6}{2N}t} + \cdots. \tag{7.4.40}$$

This can also be expressed as

$$f(p, 1; t) = p + \sum_{i=1}^{\infty}\frac{(-1)^i}{2}\left[P_{i-1}(r) - P_{i+1}(r)\right]e^{-\frac{i(i+1)}{4N}t}, \tag{7.4.41}$$

where $r = 1 - 2p$ and the $P_i(r)$'s represent the Legendre polynomials: $P_0(r) = 1$, $P_1(r) = r$, $P_2(r) = \frac{1}{2}(3r^2 - 1)$, $P_3(r) = \frac{1}{2}(5r^3 - 3r)$, etc. The probability, denoted by $f(p, 0; t)$, of $A_1$ being lost or its allele $A_2$ being fixed by the $t$th generation is obtained simply by replacing $p$ with $1 - p$ and $r$ with $-r$ in the above expressions. Therefore the probability that both $A_1$ and $A_2$ coexist in the population at the $t$th generation is obtained from

$$\Omega_t = 1 - f(p, 1; t) - f(p, 0; t).$$
If we use the relation $P_i(-r) = (-1)^iP_i(r)$, we get

$$\Omega_t = \sum_{j=0}^{\infty}\left[P_{2j}(r) - P_{2j+2}(r)\right]e^{-\frac{(2j+1)(2j+2)}{4N}t} = \frac{3}{2}(1 - r^2)e^{-\frac{1}{2N}t} + \cdots. \tag{7.4.42}$$

Thus for $t \to \infty$, we have the asymptotic formula

$$\Omega_t \sim 6p(1 - p)e^{-\frac{1}{2N}t}. \tag{7.4.43}$$
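That the probability of coexistence declines at a rate which only gradually approaches $1/2N$ can be illustrated with the exact discrete sampling scheme rather than the large-$N$ formulae. The sketch below is an assumption-laden illustration ($N = 10$, $p = 0.5$, 200 generations, hypothetical function name); it tracks the probability that neither allele has been fixed and prints the per-generation rate of decrease early and late in the process.

```python
from math import comb

def coexistence_probabilities(two_n, start_copies, generations):
    """Probability that both alleles are still present after 1, 2, ... generations,
    computed from the exact binomial transition matrix."""
    matrix = [[comb(two_n, j) * (i / two_n) ** j * (1 - i / two_n) ** (two_n - j)
               for j in range(two_n + 1)] for i in range(two_n + 1)]
    dist = [0.0] * (two_n + 1)
    dist[start_copies] = 1.0
    omega = []
    for _ in range(generations):
        dist = [sum(dist[i] * matrix[i][j] for i in range(two_n + 1))
                for j in range(two_n + 1)]
        omega.append(sum(dist[1:two_n]))      # interior states: neither allele fixed
    return omega

N = 10                                        # illustrative population number
omega = coexistence_probabilities(2 * N, N, 200)       # p = 0.5
rates = [1 - omega[t + 1] / omega[t] for t in range(len(omega) - 1)]
print(rates[0], rates[-1], 1 / (2 * N))
```

In the early generations the rate is far below $1/2N$, since almost no population has yet lost an allele; only later does it settle at $1/2N$, in contrast with the heterozygosity, which declines at that rate from the start.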
Note that this is different from the formula for heterozygosity, i.e., 7.3.9. A similar method for calculating moments can readily be extended to the case of three alleles $A_1$, $A_2$, and $A_3$, though the situation is more complicated (see Kimura, 1955b, 1956).

7.5 The Variance of a Quantitative Character Within and Between Subdivided Populations

When an infinitely large population is subdivided into isolated subgroups of finite size $N$, within each of which mating is at random, the process of random genetic drift will go on in each of them until the frequency of a particular allele ultimately becomes either 0 or 1. We will designate the subpopulations as lines.

First, consider a character determined by additive genes. Let $\alpha$ be the average effect of substituting $A_1$ for $A_2$, so that we can express the genotypic values of $A_1A_1$, $A_1A_2$, and $A_2A_2$ as $2\alpha$, $\alpha$, and 0 respectively. If the frequency of $A_1$ in a particular line is $x_t$, the genotypic mean and the variance within the line are respectively $2\alpha x_t$ and $2\alpha^2x_t(1 - x_t)$. Thus the genotypic variance within a line at generation $t$ is

$$V_w^{(t)} = 2\alpha^2E_\phi[x_t(1 - x_t)] = 2\alpha^2\left(\mu_1^{(t)} - \mu_2^{(t)}\right),$$

which reduces to

$$V_w^{(t)} = 2\alpha^2p(1 - p)(1 - f_t), \tag{7.5.1}$$

if we apply 7.4.29 and 7.4.30. The variance between line means at the $t$th generation is

$$V_b^{(t)} = E_\phi[(2\alpha x_t)^2] - [E_\phi(2\alpha x_t)]^2 = 4\alpha^2\left[\mu_2^{(t)} - \left(\mu_1^{(t)}\right)^2\right],$$

which similarly reduces to

$$V_b^{(t)} = 2\alpha^2p(1 - p)\,2f_t. \tag{7.5.2}$$
The total genotypic variance (see Chapter 4) is obtained from

$$V_h^{(t)} = E_\phi[(2\alpha)^2x_t^2 + \alpha^2\,2x_t(1 - x_t)] - [E_\phi(2\alpha x_t)]^2 = 2\alpha^2\left[\mu_1^{(t)} - 2\left(\mu_1^{(t)}\right)^2 + \mu_2^{(t)}\right].$$

In terms of $f_t$, this becomes

$$V_h^{(t)} = 2\alpha^2p(1 - p)(1 + f_t). \tag{7.5.3}$$

This total genotypic variance may also be derived by noting that the frequencies of $A_1A_1$, $A_1A_2$, and $A_2A_2$ in the total population with inbreeding coefficient $f$ are

$$P_{11} = pf + p^2(1 - f), \qquad 2P_{12} = 2p(1 - p)(1 - f), \qquad P_{22} = (1 - p)f + (1 - p)^2(1 - f), \tag{7.5.4}$$

and that the total genotypic variance for such population is given by

$$V_h = (2\alpha)^2P_{11} + \alpha^2\,2P_{12} - (2\alpha P_{11} + 2\alpha P_{12})^2,$$

and therefore

$$V_h = 2\alpha^2p(1 - p)(1 + f),$$

as was discussed earlier (see 3.10.6). The total genotypic variance (7.5.3) is of course equal to the sum of the preceding two variances $V_w^{(t)}$ and $V_b^{(t)}$. Since $V_0 = 2\alpha^2p(1 - p)$ is the genotypic variance expected when the entire population is a panmictic unit, we may write

$$V_w^{(t)} = V_0(1 - f_t), \tag{7.5.5}$$

$$V_b^{(t)} = V_0\,2f_t, \tag{7.5.6}$$

$$V_h^{(t)} = V_0(1 + f_t). \tag{7.5.7}$$
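The partition 7.5.5-7.5.7 can be checked against the genotype frequencies 7.5.4 with a few lines of arithmetic. The sketch below uses hypothetical function names and illustrative values of $\alpha$, $p$, and $f$.

```python
def additive_variances(alpha, p, f):
    """Within-line, between-line and total genotypic variance, 7.5.5-7.5.7."""
    v0 = 2 * alpha**2 * p * (1 - p)
    return v0 * (1 - f), 2 * v0 * f, v0 * (1 + f)

def total_variance_from_genotypes(alpha, p, f):
    """Total variance computed directly from the genotype frequencies 7.5.4."""
    p11 = p * f + p * p * (1 - f)
    het = 2 * p * (1 - p) * (1 - f)          # this is 2*P12 in 7.5.4
    mean = 2 * alpha * p11 + alpha * het
    return (2 * alpha) ** 2 * p11 + alpha**2 * het - mean**2

alpha, p, f = 1.5, 0.3, 0.4                  # illustrative values
within, between, total = additive_variances(alpha, p, f)
direct = total_variance_from_genotypes(alpha, p, f)
print(within + between, total, direct)
```

All three printed numbers should be equal, which is simply the statement that the within-line and between-line components add up to the total.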
These results for additive genes were given by Wright (1951) and were shown to hold under quite general conditions (Wright, 1952). They hold not only when lines are completely isolated and drifting toward fixation but also for cases when a steady state has been reached under mutation, crossbreeding, and selection, as long as $f_t$ represents the inbreeding coefficient of individuals relative to the total population.

Returning to the case of random genetic drift in completely isolated lines, if the lines are started as random samples from a very large parental stock for which $f_0 = 0$, we have approximately

$$V_w^{(t)} = V_0e^{-t/2N} \tag{7.5.8}$$

and

$$V_b^{(t)} = 2V_0\left(1 - e^{-t/2N}\right). \tag{7.5.9}$$
To summarize, the variance within lines decreases at the rate of $1/2N$ per generation and finally becomes 0, while the variance between lines increases with time and finally becomes $2V_0$.

The situation becomes much more complicated if there is dominance. Let $Y_{11}$, $Y_{12}$, and $Y_{22}$ be respectively the genotypic values of $A_1A_1$, $A_1A_2$, and $A_2A_2$. For a particular line in which the frequency of $A_1$ is $x_t$, the mean genotypic value is

$$M(x_t) = Y_{11}x_t^2 + 2Y_{12}x_t(1 - x_t) + Y_{22}(1 - x_t)^2$$

and the genotypic variance within the line is

$$V(x_t) = Y_{11}^2x_t^2 + 2Y_{12}^2x_t(1 - x_t) + Y_{22}^2(1 - x_t)^2 - M^2(x_t),$$

of which

$$2x_t(1 - x_t)\left[(Y_{11} - Y_{12})x_t + (Y_{12} - Y_{22})(1 - x_t)\right]^2$$

is the additive or genic component and

$$x_t^2(1 - x_t)^2(Y_{11} - 2Y_{12} + Y_{22})^2$$

is the dominance component (cf. Section 4.1). To make the expressions simpler, we will choose a scale such that

$$Y_{11} = 1, \qquad Y_{12} = h, \qquad Y_{22} = 0.$$

The mean genotypic value averaged over the whole population in the $t$th generation is

$$M_h^{(t)} = E_\phi[M(x_t)] = E_\phi[2hx_t + (1 - 2h)x_t^2]$$

or

$$M_h^{(t)} = 2h\mu_1^{(t)} + (1 - 2h)\mu_2^{(t)}. \tag{7.5.10}$$

In terms of the inbreeding coefficient, this becomes

$$M_h^{(t)} = p - (1 - 2h)p(1 - p)(1 - f_t). \tag{7.5.11}$$

This result is also obtained from

$$M_h^{(t)} = P_{11} + h\,2P_{12},$$

by using 7.5.4.
The total genotypic variance is obtained from

$$V_h^{(t)} = E_\phi[x_t^2 + h^2\,2x_t(1 - x_t)] - \{E_\phi[x_t^2 + h\,2x_t(1 - x_t)]\}^2,$$

and this reduces to

$$V_h^{(t)} = 2h^2\mu_1^{(t)} + (1 - 2h^2)\mu_2^{(t)} - \left[2h\mu_1^{(t)} + (1 - 2h)\mu_2^{(t)}\right]^2. \tag{7.5.12}$$

In terms of $f_t$, it is written as

$$V_h^{(t)} = p(1 - p)\left[1 - (1 - 2p - 2h^2 + 4ph)(1 - f_t) - (1 - 2h)^2p(1 - p)(1 - f_t)^2\right]. \tag{7.5.13}$$

It may be noted that this is also obtained from

$$V_h^{(t)} = P_{11} + h^2\,2P_{12} - \left[M_h^{(t)}\right]^2$$

by using 7.5.4 and 7.5.11.

If we regard the total genotypic mean and variance as functions of the inbreeding coefficient rather than functions of $t$, writing them as $M_h(f)$ and $V_h(f)$, we have

$$M_h(f) = p - (1 - 2h)p(1 - p)(1 - f),$$

$$V_h(f) = p(1 - p) - p(1 - p)(1 - 2p - 2h^2 + 4ph)(1 - f) - p^2(1 - p)^2(1 - 2h)^2(1 - f)^2. \tag{7.5.14}$$

Since these two relations can be derived directly from 7.5.4, it may be seen that they hold for any distribution as long as $p$ represents the average gene frequency and $f$ is defined such that 7.5.4 holds for the whole population. The above relations are also written

$$M_h(f) = M_h(0)(1 - f) + M_h(1)f,$$

$$V_h(f) = V_h(0)(1 - f) + V_h(1)f + [M_h(0) - M_h(1)]^2f(1 - f), \tag{7.5.15}$$

as was done earlier (3.10.1 and 3.10.3).

The genotypic variance within lines is

$$V_{w(h)}^{(t)} = E_\phi[V(x_t)] = 2h^2\mu_1^{(t)} + (1 - 6h^2)\mu_2^{(t)} - 4h(1 - 2h)\mu_3^{(t)} - (1 - 2h)^2\mu_4^{(t)}, \tag{7.5.16}$$

of which the additive component or the genic variance within lines is

$$V_{w(g)}^{(t)} = E_\phi\{2x_t(1 - x_t)[(1 - h)x_t + h(1 - x_t)]^2\} = 2h^2\mu_1^{(t)} + 2h(2 - 5h)\mu_2^{(t)} + 2(1 - 2h)(1 - 4h)\mu_3^{(t)} - 2(1 - 2h)^2\mu_4^{(t)}, \tag{7.5.17}$$
and the dominance component is

$$V_{w(d)}^{(t)} = E_\phi[x_t^2(1 - x_t)^2(1 - 2h)^2] = (1 - 2h)^2\left[\mu_2^{(t)} - 2\mu_3^{(t)} + \mu_4^{(t)}\right]. \tag{7.5.18}$$

The variance between line means is obtained from

$$V_b^{(t)} = E_\phi[M^2(x_t)] - \{E_\phi[M(x_t)]\}^2,$$

which leads to

$$V_b^{(t)} = 4h^2\mu_2^{(t)} + 4h(1 - 2h)\mu_3^{(t)} + (1 - 2h)^2\mu_4^{(t)} - \left[2h\mu_1^{(t)} + (1 - 2h)\mu_2^{(t)}\right]^2. \tag{7.5.19}$$

If the random genetic drift proceeds within each of the completely isolated lines, the above results are expressed, by applying 7.4.29-7.4.33, in terms of the inbreeding coefficient as follows:

$$V_{w(h)}^{(t)} = \tfrac{2}{5}(2 - 3h + 3h^2)p(1 - p)(1 - f_t) + (1 - 2h)p(1 - p)(2p - 1)(1 - f_t)^3 + \tfrac{1}{5}(1 - 2h)^2p(1 - p)[1 - 5p(1 - p)](1 - f_t)^6, \tag{7.5.20}$$

$$V_{w(g)}^{(t)} = \tfrac{1}{5}(3 - 2h + 2h^2)p(1 - p)(1 - f_t) + (1 - 2h)p(1 - p)(2p - 1)(1 - f_t)^3 + \tfrac{2}{5}(1 - 2h)^2p(1 - p)[1 - 5p(1 - p)](1 - f_t)^6, \tag{7.5.21}$$

$$V_{w(d)}^{(t)} = \tfrac{1}{5}(1 - 2h)^2p(1 - p)(1 - f_t) - \tfrac{1}{5}(1 - 2h)^2p(1 - p)[1 - 5p(1 - p)](1 - f_t)^6, \tag{7.5.22}$$

$$V_b^{(t)} = p(1 - p) + \tfrac{1}{5}(-9 + 6h + 4h^2 + 10p - 20hp)p(1 - p)(1 - f_t) - (1 - 2h)^2p^2(1 - p)^2(1 - f_t)^2 - (1 - 2h)p(1 - p)(2p - 1)(1 - f_t)^3 - \tfrac{1}{5}(1 - 2h)^2p(1 - p)[1 - 5p(1 - p)](1 - f_t)^6. \tag{7.5.23}$$
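Since the reduction from moments to expressions in $f_t$ is tedious, a numerical cross-check is useful. The sketch below evaluates the four variance components once from the moments (7.5.16-7.5.19 together with 7.4.29-7.4.33) and once from the closed forms 7.5.20-7.5.23 as written above; the function names and the test values of $h$, $p$, and $f$ are illustrative assumptions.

```python
def moments(p, f):
    """mu_1..mu_4 in terms of the inbreeding coefficient, 7.4.29-7.4.33."""
    q, g = p * (1 - p), 1 - f
    mu2 = p - q * g
    mu3 = p - 1.5 * q * g - 0.5 * q * (2 * p - 1) * g**3
    mu4 = p - 1.8 * q * g - q * (2 * p - 1) * g**3 - 0.2 * q * (1 - 5 * q) * g**6
    return p, mu2, mu3, mu4

def components_from_moments(h, p, f):
    """(within, genic, dominance, between) from 7.5.16-7.5.19."""
    mu1, mu2, mu3, mu4 = moments(p, f)
    d = 1 - 2 * h
    within = 2 * h * h * mu1 + (1 - 6 * h * h) * mu2 - 4 * h * d * mu3 - d * d * mu4
    genic = (2 * h * h * mu1 + 2 * h * (2 - 5 * h) * mu2
             + 2 * d * (1 - 4 * h) * mu3 - 2 * d * d * mu4)
    dominance = d * d * (mu2 - 2 * mu3 + mu4)
    between = (4 * h * h * mu2 + 4 * h * d * mu3 + d * d * mu4
               - (2 * h * mu1 + d * mu2) ** 2)
    return within, genic, dominance, between

def components_closed_form(h, p, f):
    """(within, genic, dominance, between) from 7.5.20-7.5.23."""
    q, g, d = p * (1 - p), 1 - f, 1 - 2 * h
    s = 1 - 5 * q
    cubic = d * q * (2 * p - 1) * g**3
    within = 0.4 * (2 - 3 * h + 3 * h * h) * q * g + cubic + 0.2 * d * d * q * s * g**6
    genic = 0.2 * (3 - 2 * h + 2 * h * h) * q * g + cubic + 0.4 * d * d * q * s * g**6
    dominance = 0.2 * d * d * q * g - 0.2 * d * d * q * s * g**6
    between = (q + 0.2 * (-9 + 6 * h + 4 * h * h + 10 * p - 20 * h * p) * q * g
               - d * d * q * q * g * g - cubic - 0.2 * d * d * q * s * g**6)
    return within, genic, dominance, between

print(components_from_moments(0.0, 0.1, 0.3))
print(components_closed_form(0.0, 0.1, 0.3))
```

The two printed tuples should agree, and the first entry of each should equal the sum of the second and third, because the within-line variance is the sum of its genic and dominance components.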
These calculations are tedious but straightforward.

One of the most interesting situations arises when the gene $A_1$ is completely recessive and occurs in the original stock at a very low frequency. In this case, the frequency of homozygous-recessive individuals will increase with time and genotypic variance within lines increases up to a certain value of inbreeding coefficient, as shown by Robertson (1952).

For a completely recessive and rare gene ($h = 0$, $p \approx 0$), we have the following results:

$$V_{w(h)}^{(t)} = \tfrac{1}{5}p(1 - f_t)\left[4 - 5(1 - f_t)^2 + (1 - f_t)^5\right], \tag{7.5.24}$$

$$V_{w(g)}^{(t)} = \tfrac{1}{5}p(1 - f_t)\left[3 - 5(1 - f_t)^2 + 2(1 - f_t)^5\right], \tag{7.5.25}$$

$$V_{w(d)}^{(t)} = \tfrac{1}{5}p(1 - f_t)\left[1 - (1 - f_t)^5\right], \tag{7.5.26}$$

$$V_b^{(t)} = p\left[1 - \tfrac{9}{5}(1 - f_t) + (1 - f_t)^3 - \tfrac{1}{5}(1 - f_t)^6\right], \tag{7.5.27}$$

$$V_h^{(t)} = pf_t. \tag{7.5.28}$$

Figure 7.5.1 shows change of these variances in terms of change in the inbreeding coefficient. As will be seen in the figure, the genotypic variance within lines first increases with $f$ and reaches its maximum value when $f$ is roughly 1/2 and decreases thereafter to become 0 when $f$ reaches unity. Its additive component behaves somewhat similarly. Generally, such a pattern of change in $V_w$, i.e., increase followed by decrease, occurs when $6p^2 < 1$ or $p < 0.41$. For a higher gene frequency, the variance within lines always decreases with increasing $f$, as in the case of no dominance. For details, see Robertson (1952).
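The location of the maximum of the within-line variance for a rare recessive gene can be found numerically from 7.5.24. The following is a rough sketch using a simple grid search; the function name, the value $p = 0.01$, and the grid spacing of 0.001 are assumptions made for illustration.

```python
def within_line_variance_rare_recessive(p, f):
    """Genotypic variance within lines for h = 0 and small p, formula 7.5.24."""
    g = 1 - f
    return p * g * (4 - 5 * g**2 + g**5) / 5

p = 0.01                                        # illustrative rare-allele frequency
grid = [i / 1000 for i in range(1001)]          # f from 0 to 1 in steps of 0.001
f_max = max(grid, key=lambda f: within_line_variance_rare_recessive(p, f))
v_max = within_line_variance_rare_recessive(p, f_max)
print(f_max, v_max)
```

Because $p$ enters 7.5.24 only as a multiplying factor, the position of the maximum on the $f$ axis does not depend on the value chosen for $p$ within this approximation.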
Figure 7.5.1. Change of total genotypic variance ($V_h$), variance between lines ($V_b$), variance within lines ($V_{w(h)}$), and its additive component ($V_{w(g)}$) of a character governed by very rare recessive genes (Robertson, 1952). Abscissa: inbreeding coefficient $f$, from 0 to 1.0.
An essentially different case in which lines are partially isolated and random drift toward fixation is counterbalanced by occasional crossing between lines so that there is a steady-state gene frequency distribution has been worked out by Wright (1952). He has shown that the result for the variance within lines, assuming a completely recessive character, does not differ very much in terms of $f$ and $p$ from the above case of complete isolation.
7.6 Effective Population Number

7.6.1 Introduction

In the preceding section of this chapter, we assumed an ideal population of $N$ breeding individuals which are produced each generation by random union of $N$ male and $N$ female gametes regarded as random samples from the population of the previous generation. We then studied the change of mean, variance, and higher moments in gene frequency due to this random sampling of gametes. We also discussed here and in Chapter 3 decrease in heterozygosity of an individual and increase of genetic homogeneity within a finite population. For a given initial gene frequency, these quantities are expressed solely in terms of population number $N$.

On the other hand, the breeding structure of an actual population is likely to be much more complicated and may differ from this ideal population in many respects. Thus it is desirable to have formulae through which such complicated situations are reduced to the equivalent ideal case which we understand and for which we have formulae. A few of the simpler formulae were given in Section 3.13 where the idea of an effective population number was first mentioned.

The very useful concept of effective population number was introduced by Wright (1931) to meet this need. In a finite population, as we have discussed, there is a decrease in heterozygosity (inbreeding effect) and a random drift in gene frequencies because of sampling variance (variance effect). In simpler cases a population has the same effective number for either effect, and for this reason Wright used them more or less interchangeably. But for more complex situations it is necessary to make a distinction (Crow, 1954; Crow and Morton, 1955; Kimura and Crow, 1963). Our treatment in this section follows the latter paper.
7.6.2 Inbreeding Effective Number

We first consider a monoecious diploid population, mating at random and including the possibility of self-fertilization. We assume that each individual has an equal chance of contributing to the next generation. Then the number, $k$, of gametes contributed by an individual to the next generation will follow a binomial distribution. A particular gamete has a probability $1/N$ of coming from a particular one of the $N$ parents and there are $2N$ gametes in all. So the probability of $k$ gametes coming from a particular parent is

$$\binom{2N}{k}\left(\frac{1}{N}\right)^{k}\left(1 - \frac{1}{N}\right)^{2N-k}. \tag{7.6.2.1}$$

In a population of stable size,

$$\bar{k} = 2 \tag{7.6.2.2}$$

and the variance is

$$V_k = 2N\left(\frac{1}{N}\right)\left(1 - \frac{1}{N}\right) = 2\left(1 - \frac{1}{N}\right). \tag{7.6.2.3}$$

More generally, if the average number of contributed gametes is $\bar{k}$, the variance of $k$ will be

$$V_k = N\bar{k}\left(\frac{1}{N}\right)\left(1 - \frac{1}{N}\right) = \bar{k}\left(1 - \frac{1}{N}\right). \tag{7.6.2.4}$$

Under this circumstance, each parent has the same expected number of offspring and the probability of two randomly chosen gametes coming from the same parent is $1/N$. For a large $N$ the frequency distribution will approach the Poisson distribution for which

$$V_k = \bar{k}, \tag{7.6.2.5}$$

as discussed in A.5.

When such an ideal situation is not realized, we define the effective population number by the reciprocal of the probability that two randomly chosen gametes come from the same parent. To see this point, let us consider first a population of monoecious diploids. We assume that mating is at random, though the expected number of offspring is not necessarily the same for each individual. Let $P_t$ be the probability that two uniting gametes (or equivalently, under the assumption of random mating, two randomly chosen gametes) come from the same individual of the previous generation, $t - 1$. Then, using the same reasoning as we have previously employed in Section 3.11, we obtain

$$f_t = P_t\left(\frac{1 + f_{t-1}}{2}\right) + (1 - P_t)f_{t-1}$$

or

$$f_t = \frac{P_t}{2} + \left(1 - \frac{P_t}{2}\right)f_{t-1}, \tag{7.6.2.6}$$
where $f_t$ is the inbreeding coefficient in generation $t$. On the other hand, in our ideal population consisting of $N_{t-1}$ individuals in generation $t - 1$, the corresponding relation is

$$f_t = \frac{1}{2N_{t-1}} + \left(1 - \frac{1}{2N_{t-1}}\right)f_{t-1}. \tag{7.6.2.7}$$

The above two expressions agree with each other if

$$P_t = \frac{1}{N_{t-1}}.$$

The heterozygosity decreases at the rate

$$-\frac{\Delta H_{t-1}}{H_{t-1}} = \frac{\Delta f_{t-1}}{1 - f_{t-1}} = \frac{P_t}{2}$$

(cf. 7.3.16). Thus two monoecious populations with equal $P_t$ are equivalent with respect to the inbreeding effect and we can define the inbreeding effective number by the relation:

$$N_{e(f)} = \frac{1}{P_t}. \tag{7.6.2.8}$$
Note that the effective number thus defined is determined by the number of individuals in the parental generation.

From the above definition, we can derive a concrete formula for the effective number when the distribution of the number of contributed gametes ($k$) is known. Let $k_i$ be the number of successful gametes from the $i$th parent in generation $t - 1$ ($i = 1, 2, \ldots, N_{t-1}$). The number of ways in which two gametes can be chosen out of the total number of $N_{t-1}\bar{k}$ gametes is

$$\binom{N_{t-1}\bar{k}}{2} = \frac{N_{t-1}\bar{k}(N_{t-1}\bar{k} - 1)}{2},$$

of which

$$\sum_i\binom{k_i}{2} = \sum_i\frac{k_i(k_i - 1)}{2}$$

is the number of cases in which two gametes come from the same parent. Thus

$$P_t = \frac{\sum_i k_i(k_i - 1)}{N_{t-1}\bar{k}(N_{t-1}\bar{k} - 1)}, \tag{7.6.2.9}$$
where and
I( is the average number of contri buted gametes per parental i ndividual ,
We will define
Vk by Vk L i (kNi' --l 1()2 = LN'i -lkf _ 1(2 , =
7.6 . 2.10
the variance of the number of gametes contributed per individual in the parent generation. With this definition,
Li kl =
N, _IC V" + 1(2 ),
and, if we note that
N' _ I K 2N" =
7.6.2.1 1
the above probability may be expressed as follows :
P = ,
1( 1( - I ) .
K(2N, - 1 )
VI< +
Thus the in breedi ng effective number is given by
N
e(/) =
2N - 1 I( + V,,/ K ' _
l'
7.6.2.1 2
N,
is the number of individuals in the tth generation. where For the ideal case i n which the distribution of k is binomial with mean and variance
I(
1 ) ( 1 - -) V" = N' - I I( (' tf,-I N, - I 1 ), = (1 N' _ I 1(
-
_
_
the inbreeding effective number reduces to the actual number in generation t I;
as it should.
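The step from the counting argument of 7.6.2.9 to the closed form 7.6.2.12 can be checked numerically. The following is a minimal sketch, assuming the gamete counts are supplied as a plain Python list `k` with one entry per parent; the function names are hypothetical and are not taken from the text.

```python
def prob_same_parent(k):
    """P_t of 7.6.2.9: chance that two gametes drawn without
    replacement from the pool come from the same parent."""
    total = sum(k)                      # N_{t-1} * k_bar gametes in all
    same = sum(ki * (ki - 1) for ki in k)
    return same / (total * (total - 1))


def inbreeding_effective_number(k):
    """N_e(f) of 7.6.2.12, computed from the gamete counts k."""
    n_parents = len(k)
    total = sum(k)                      # equals 2 * N_t
    k_bar = total / n_parents
    v_k = sum((ki - k_bar) ** 2 for ki in k) / n_parents
    return (total - 1) / (k_bar - 1 + v_k / k_bar)


equal = [2] * 10            # every parent contributes two gametes
single = [20] + [0] * 9     # one parent contributes everything
print(inbreeding_effective_number(equal))    # 2N - 1 = 19 for N = 10
print(inbreeding_effective_number(single))   # a single effective parent
print(1 / prob_same_parent(equal))           # same value from 7.6.2.9
```

The two routes agree for any list of counts, since 7.6.2.12 is only an algebraic rearrangement of $1/P_t$.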
When $k$ does not follow the binomial distribution but the population number, $N$, remains constant ($\bar{k} = 2$), the formula for the inbreeding effective number becomes

$$N_{e(f)} = \frac{4N - 2}{V_k + 2}, \tag{7.6.2.13}$$

as first shown by Wright (1938b). If all individuals contribute equally to the next generation ($V_k = 0$), the above formula reduces to

$$N_{e(f)} = 2N - 1; \tag{7.6.2.14}$$

the effective size is approximately twice as large as the actual size, as we mentioned in Section 3.13. On the other hand, if only one individual contributes the entire next generation,

$$V_k = \frac{(\bar{k}N_{t-1})^2}{N_{t-1}} - \bar{k}^2 = \bar{k}^2 (N_{t-1} - 1),$$

and 7.6.2.12 reduces to

$$N_{e(f)} = 1,$$

as it should. The assumption that all individuals have an equal expectation of offspring is unlikely to be met in nature, and the effective number in natural conditions is usually considerably smaller than the actual number, as pointed out repeatedly by Wright.

Next we will consider a population with separate sexes, still assuming random mating. Here the situation is somewhat complicated because we must consider three consecutive generations. We will define $P_t$ as the probability that two homologous genes in two individuals in generation $t$ came from the same individual in the previous generation, $t-1$. Then we have

$$f_{t+1} = P_t\,\frac{1 + f_{t-1}}{2} + (1 - P_t) f_t$$

or

$$f_{t+1} = \frac{P_t}{2} + (1 - P_t) f_t + \frac{P_t}{2} f_{t-1},$$

as shown in Section 3.11.
For an ideal population consisting of $N^*_{t-1}$ males and $N^{**}_{t-1}$ females in which each individual has the same expected number of offspring as the others of the same sex, the probability $P_t$ is given by

$$P_t = \frac{1}{4}\left(\frac{1}{N^*_{t-1}} + \frac{1}{N^{**}_{t-1}}\right) \tag{7.6.2.15}$$

(see 3.11.5). If the numbers of males and females are equal,

$$N^*_{t-1} = N^{**}_{t-1} = \frac{N_{t-1}}{2},$$

and we get

$$P_t = \frac{1}{N_{t-1}}.$$

It has been shown already (see 3.11.9, 3.11.5) that, for a constant number of males and females, heterozygosity decreases at a rate of approximately $1/(2N_e + 1)$ per generation with $N_e$ given by

$$N_e = \frac{4N^*N^{**}}{N^* + N^{**}}.$$
This is again the reciprocal of $P_t$, though $P_t$ for this case is slightly different from the monoecious case. Thus any two populations (with separate sexes) having equal $P_t$ are equivalent with respect to the inbreeding effect. The formula corresponding to 7.6.2.12 may be derived as follows: Let $k_i$ be the number of gametes contributed by the $i$th individual in generation $t-1$. Then

$$P_t = \frac{\sum_i k_i(k_i - 1)/2}{N_{t-1}\bar{k}\,(N_{t-1}\bar{k} - 1)/2 - N_t}. \tag{7.6.2.16}$$

The term $N_t$ in the denominator, which did not appear in the case of a monoecious population, comes from the fact that only alleles that did not enter into the same individual in generation $t$ can unite to form generation $t+1$. We now proceed as in 7.6.2.9-7.6.2.12. Noting that $N_t = N_{t-1}\bar{k}/2$, the above expression is simplified and we get

$$P_t = \frac{V_k + \bar{k}^2 - \bar{k}}{\bar{k}(N_{t-1}\bar{k} - 2)}.$$

The effective number is then given by

$$N_{e(f)} = \frac{1}{P_t} = \frac{N_{t-1}\bar{k} - 2}{\bar{k} - 1 + V_k/\bar{k}}$$

or

$$N_{e(f)} = \frac{2N_t - 2}{\bar{k} - 1 + V_k/\bar{k}}. \tag{7.6.2.17}$$

Note that the numerator is now $2N_t - 2$ rather than $2N_t - 1$ as in the case of a monoecious population. The difference, however, is important only when $N_t$ is very small.

When the numbers of males and females, $N^*_{t-1}$ and $N^{**}_{t-1}$, are different it is sometimes convenient to calculate the mean and variance of $k$ separately for each sex and then combine them to get $\bar{k}$ and $V_k$ from

$$\bar{k} = m\bar{k}^* + (1 - m)\bar{k}^{**}, \qquad V_k = mV_k^* + (1 - m)V_k^{**} + m(1 - m)(\bar{k}^* - \bar{k}^{**})^2, \tag{7.6.2.18}$$

where $\bar{k}^*$ and $\bar{k}^{**}$ refer to the number of gametes from males and females and $m$ is the proportion of males (cf. 3.10.3). Thus if $k_i^*$ and $k_j^{**}$ are the numbers of gametes from the $i$th male and $j$th female respectively, then

$$\bar{k}^* = \frac{\sum_i k_i^*}{N^*_{t-1}} \quad\text{and}\quad V_k^* = \frac{\sum_i (k_i^* - \bar{k}^*)^2}{N^*_{t-1}}.$$

Also, since $m$ and $1 - m$ represent the proportion of males and females in generation $t-1$,

$$m = \frac{N^*_{t-1}}{N_{t-1}}, \qquad 1 - m = \frac{N^{**}_{t-1}}{N_{t-1}}.$$

If the number of gametes contributed per individual follows the binomial distribution,

$$V_k^* = \bar{k}^*\left(1 - \frac{1}{N^*_{t-1}}\right), \qquad V_k^{**} = \bar{k}^{**}\left(1 - \frac{1}{N^{**}_{t-1}}\right),$$
and we obtain

$$N_{e(f)} = 4m(1 - m)N_{t-1}, \tag{7.6.2.19}$$

or

$$N_{e(f)} = \frac{4N^*_{t-1}N^{**}_{t-1}}{N^*_{t-1} + N^{**}_{t-1}}. \tag{7.6.2.20}$$
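The effect of an unequal sex ratio in 7.6.2.19 and 7.6.2.20 is easiest to appreciate with numbers. The sketch below evaluates both forms; the function names and the sample herd sizes are assumptions chosen for illustration only.

```python
def ne_two_sexes(n_males, n_females):
    """Inbreeding effective number 7.6.2.20 for an ideal
    population with separate sexes."""
    return 4 * n_males * n_females / (n_males + n_females)


def ne_from_proportion(m, n_total):
    """Equivalent form 7.6.2.19, with m the proportion of males."""
    return 4 * m * (1 - m) * n_total


print(ne_two_sexes(50, 50))            # equal sexes: the actual number, 100
print(ne_two_sexes(1, 99))             # one male, 99 females: close to 4
print(ne_from_proportion(0.01, 100))   # same case through 7.6.2.19
```

With one male and 99 females the effective number is just under four, however many females there are; the rarer sex dominates.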
This agrees with our earlier result (3.11.5), first given by Wright (1931). On the other hand, if the population consists of a single pair, a male and female,

$$N_{t-1} = 2, \qquad V_k = 0,$$

and 7.6.2.17 reduces to

$$N_{e(f)} = 2,$$

as it should. Notice that the effective number related to an autozygosity increase in generation $t+1$ is a function of the population number, $N_{t-1}$, two generations earlier. This is as expected, since with separate sexes two homologous genes could not come from a common ancestor more recent than a grandparent.

7.6.3 Variance Effective Number
We have just developed the concept of effective population number as this relates to the change in the probability of identity by descent. We now consider a definition of effective number that renders different populations comparable as regards the sampling variance of the gene frequency. This we call the variance effective number. In many cases the two concepts lead to the same consequence, but not in general.

In an ideal population the sampling variance of the gene frequency drift from parent to offspring generation is $V_{\delta p} = p(1 - p)/2N$. So the natural definition of the variance effective number is obtained by setting the actual variance, $V_{\delta p}$, equal to $p(1 - p)/2N_{e(v)}$, where $N_{e(v)}$ is the variance effective number.

Consider first a population of monoecious diploids and let $N_{t-1}$ be the number of individuals in generation $t-1$. As in the previous calculation, we will denote the number of gametes contributed per individual by $k$ and define its mean and variance by

$$\bar{k} = \frac{\sum_i k_i}{N_{t-1}}, \qquad V_k = \frac{\sum_i (k_i - \bar{k})^2}{N_{t-1}}, \tag{7.6.3.1}$$

where $k_i$ is the number of gametes contributed by the $i$th individual and the summation is over all the individuals in generation $t-1$ ($i = 1, 2, \ldots, N_{t-1}$). In our retrospective approach of defining the effective number from the observed distribution of contributing gametes, the value of $k$ for a specified individual is fixed and not a random variable. However, if we pick out, conceptually, an individual at random from the population, $k$ is a random variable with mean $\bar{k}$ and variance $V_k$:

$$E(k) = \bar{k}, \qquad E[(k - \bar{k})^2] = V_k. \tag{7.6.3.2}$$
Since each individual can contribute both male and female gametes, we will denote the number of the two types by $k^*$ and $k^{**}$ such that for the $i$th individual

$$k_i = k_i^* + k_i^{**}.$$

Their average is equal but they have their own variances:

$$\bar{k}^* = \frac{\sum_i k_i^*}{N_{t-1}} = \bar{k}^{**} = \frac{\sum_i k_i^{**}}{N_{t-1}} = \frac{\bar{k}}{2}, \tag{7.6.3.3}$$

$$V_k^* = \frac{\sum_i k_i^{*2}}{N_{t-1}} - \bar{k}^{*2}, \qquad V_k^{**} = \frac{\sum_i k_i^{**2}}{N_{t-1}} - \bar{k}^{**2}. \tag{7.6.3.4}$$

We will assume that the population in generation $t-1$ contains a pair of alleles $A_1$ and $A_2$ with frequencies $p$ and $1 - p$. The number of individuals with genotypes $A_1A_1$, $A_1A_2$, and $A_2A_2$ within the population will be denoted by $n_{11}$, $n_{12}$, and $n_{22}$ ($n_{11} + n_{12} + n_{22} = N_{t-1}$). While all the gametes from an $A_1A_1$ individual contain allele $A_1$, half of those from a heterozygote ($A_1A_2$) do, so we will designate by $l^*$ and $l^{**}$ the number of $A_1$ gametes among male and female gametes produced by a heterozygote. For given values of $k^*$ and $k^{**}$, $l^*$ and $l^{**}$ are random variables which follow the binomial distribution with means and variances given by

$$E(l^*) = \frac{k^*}{2}, \qquad E(l^{**}) = \frac{k^{**}}{2}, \tag{7.6.3.5}$$

$$E\left[\left(l^* - \frac{k^*}{2}\right)^2\right] = \frac{k^*}{4}, \qquad E\left[\left(l^{**} - \frac{k^{**}}{2}\right)^2\right] = \frac{k^{**}}{4}. \tag{7.6.3.6}$$

Let us first consider a collection of male gametes which are produced by individuals of generation $t-1$ and which are destined to form generation $t$. The total number of $A_1$ genes contained in them may be expressed as

$$\sum_{11} k^* + \sum_{12} l^*,$$

where $\sum_{11}$ and $\sum_{12}$ denote respectively summation over $A_1A_1$ and $A_1A_2$ individuals in generation $t-1$.
Similarly, the corresponding quantity for female gametes is $\sum_{11} k^{**} + \sum_{12} l^{**}$. Thus the change of gene frequency between generations $t-1$ and $t$ is given by

$$\delta p = \frac{1}{N_{t-1}\bar{k}}\left[\left(\sum_{11} k^* + \sum_{12} l^*\right) + \left(\sum_{11} k^{**} + \sum_{12} l^{**}\right)\right] - p, \tag{7.6.3.7}$$

from which we have

$$N_{t-1}\bar{k}\,\delta p = \left(\sum_{11} k^* + \sum_{12} l^*\right) + \left(\sum_{11} k^{**} + \sum_{12} l^{**}\right) - N_{t-1}\bar{k}p. \tag{7.6.3.8}$$

Let us define random variables $X^*$ and $X^{**}$ by

$$X^* = \sum_{11}(k^* - \bar{k}^*) + \sum_{12}\frac{k^* - \bar{k}^*}{2} + \sum_{12}\left(l^* - \frac{k^*}{2}\right) \tag{7.6.3.9}$$

and

$$X^{**} = \sum_{11}(k^{**} - \bar{k}^{**}) + \sum_{12}\frac{k^{**} - \bar{k}^{**}}{2} + \sum_{12}\left(l^{**} - \frac{k^{**}}{2}\right). \tag{7.6.3.10}$$

Then 7.6.3.8 may be expressed as

$$N_{t-1}\bar{k}\,\delta p = X^* + X^{**}, \tag{7.6.3.11}$$

which may be derived by noting that

$$\left(\sum_{11}\bar{k}^* + \sum_{12}\frac{\bar{k}^*}{2}\right) + \left(\sum_{11}\bar{k}^{**} + \sum_{12}\frac{\bar{k}^{**}}{2}\right) = N_{t-1}p\bar{k}.$$

Now, from 7.6.3.2 and 7.6.3.5, we have

$$E(X^*) = E(X^{**}) = 0.$$

Furthermore, assuming independence of male and female gametes, we have $E(X^*X^{**}) = 0$. Thus, squaring both sides of 7.6.3.11 and taking expectations, we have

$$(N_{t-1}\bar{k})^2 V_{\delta p} = E(X^{*2}) + E(X^{**2}), \tag{7.6.3.12}$$

where

$$V_{\delta p} = E[(\delta p)^2].$$

In order to evaluate $E(X^{*2})$ we note that in the right side of 7.6.3.9 the first two terms are independent of the last term, because variation in the number of contributed gametes is independent of the variation in the number of $A_1$ genes within the gametes produced by heterozygotes.
Thus

$$E(X^{*2}) = E\left[\sum_{11}(k^* - \bar{k}^*) + \sum_{12}\frac{k^* - \bar{k}^*}{2}\right]^2 + E\left[\sum_{12}\left(l^* - \frac{k^*}{2}\right)\right]^2. \tag{7.6.3.13}$$

The first term in the right side of this equation may be evaluated by using the relation

$$C^*_{kk'} = -\frac{V_k^*}{N_{t-1} - 1}, \tag{7.6.3.14}$$

where $C^*_{kk'}$ is the covariance between the numbers of gametes contributed by two different males, that is,

$$C^*_{kk'} = \frac{\sum_{i \neq j}(k_i^* - \bar{k}^*)(k_j^* - \bar{k}^*)}{N_{t-1}(N_{t-1} - 1)}. \tag{7.6.3.15}$$

The relation 7.6.3.14 is a direct consequence of the identity

$$\sum_i (k_i^* - \bar{k}^*) = 0,$$

because, by squaring both sides, we have

$$\sum_i (k_i^* - \bar{k}^*)^2 + \sum_{i \neq j}(k_i^* - \bar{k}^*)(k_j^* - \bar{k}^*) = 0$$

or

$$N_{t-1}V_k^* + N_{t-1}(N_{t-1} - 1)C^*_{kk'} = 0,$$

which is equivalent to 7.6.3.14. If we pick at random two individuals from the population, then the expected value of the cross product $(k_i^* - \bar{k}^*)(k_j^* - \bar{k}^*)$ is equal to $C^*_{kk'}$, i.e.,

$$E[(k_i^* - \bar{k}^*)(k_j^* - \bar{k}^*)] = C^*_{kk'} \qquad (i \neq j). \tag{7.6.3.16}$$

Noting the above and using 7.6.3.14, we obtain

$$E\left[\sum_{11}(k^* - \bar{k}^*) + \sum_{12}\frac{k^* - \bar{k}^*}{2}\right]^2 = \left(n_{11} + \frac{n_{12}}{4}\right)V_k^* + \left[\left(n_{11} + \frac{n_{12}}{2}\right)^2 - \left(n_{11} + \frac{n_{12}}{4}\right)\right]C^*_{kk'}$$

$$= \frac{V_k^*}{N_{t-1} - 1}\left[N_{t-1}\left(n_{11} + \frac{n_{12}}{4}\right) - \left(n_{11} + \frac{n_{12}}{2}\right)^2\right]. \tag{7.6.3.17}$$

The second term in the right side of 7.6.3.13 may easily be evaluated if we note that $l^* - (1/2)k^*$ follows the binomial distribution with mean 0 and variance $k^*/4$. The distributions are independent between two different individuals. Thus

$$E\left[\sum_{12}\left(l^* - \frac{k^*}{2}\right)\right]^2 = \frac{n_{12}}{4}\bar{k}^*. \tag{7.6.3.18}$$

Combining 7.6.3.17 and 7.6.3.18, we get

$$E(X^{*2}) = \frac{V_k^*}{N_{t-1} - 1}\left[N_{t-1}\left(n_{11} + \frac{n_{12}}{4}\right) - \left(n_{11} + \frac{n_{12}}{2}\right)^2\right] + \frac{n_{12}}{4}\bar{k}^*. \tag{7.6.3.19}$$
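Similarly, for female gametes we have

$$E(X^{**2}) = \frac{V_k^{**}}{N_{t-1} - 1}\left[N_{t-1}\left(n_{11} + \frac{n_{12}}{4}\right) - \left(n_{11} + \frac{n_{12}}{2}\right)^2\right] + \frac{n_{12}}{4}\bar{k}^{**}. \tag{7.6.3.20}$$

Adding these two equations, we obtain

$$(N_{t-1}\bar{k})^2 V_{\delta p} = \frac{V_k}{N_{t-1} - 1}\left[N_{t-1}\left(n_{11} + \frac{n_{12}}{4}\right) - \left(n_{11} + \frac{n_{12}}{2}\right)^2\right] + \frac{n_{12}}{4}\bar{k}, \tag{7.6.3.21}$$

because

$$V_k = V_k^* + V_k^{**}, \qquad \bar{k} = \bar{k}^* + \bar{k}^{**}. \tag{7.6.3.22}$$

Let us define a coefficient $\alpha_{t-1}$ by the relation

$$\frac{n_{11}}{N_{t-1}} = (1 - \alpha_{t-1})p^2 + \alpha_{t-1}p,$$

$$\frac{n_{12}}{N_{t-1}} = (1 - \alpha_{t-1})\,2p(1 - p), \tag{7.6.3.23}$$

$$\frac{n_{22}}{N_{t-1}} = (1 - \alpha_{t-1})(1 - p)^2 + \alpha_{t-1}(1 - p).$$

The coefficient $\alpha_{t-1}$ is a measure of departure from Hardy-Weinberg proportions, whether because of inbreeding or other factors. Then the right side of 7.6.3.21 is much simplified, giving

$$(N_{t-1}\bar{k})^2 V_{\delta p} = \frac{N_{t-1}^2 V_k\, p(1 - p)(1 + \alpha_{t-1})}{2(N_{t-1} - 1)} + \frac{N_{t-1}\bar{k}\,p(1 - p)(1 - \alpha_{t-1})}{2}$$

or

$$V_{\delta p} = \frac{p(1 - p)}{2N_{t-1}\bar{k}}\left[\frac{N_{t-1}}{N_{t-1} - 1}\,\frac{V_k}{\bar{k}}(1 + \alpha_{t-1}) + (1 - \alpha_{t-1})\right]. \tag{7.6.3.24}$$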
The variance effective number is defined by

$$N_{e(v)} = \frac{p(1 - p)}{2V_{\delta p}}, \tag{7.6.3.25}$$

since the sampling variance for the ideal situation of $N$ monoecious individuals mating at random is

$$V_{\delta p} = \frac{p(1 - p)}{2N}.$$

From 7.6.3.24 and 7.6.3.25, we obtain

$$N_{e(v)} = \frac{p(1 - p)}{2V_{\delta p}} = \frac{N_{t-1}\bar{k}}{\dfrac{s_k^2}{\bar{k}}(1 + \alpha_{t-1}) + (1 - \alpha_{t-1})}$$

or

$$N_{e(v)} = \frac{2N_t}{\dfrac{s_k^2}{\bar{k}}(1 + \alpha_{t-1}) + (1 - \alpha_{t-1})}, \tag{7.6.3.26}$$

which is the required formula for the variance effective number. In this formula

$$s_k^2 = \frac{N_{t-1}}{N_{t-1} - 1}V_k,$$

which includes the Gaussian correction, $N_{t-1}/(N_{t-1} - 1)$, and $N_t$ is the number of individuals in generation $t$, $N_t = N_{t-1}\bar{k}/2$.

Notice that, whereas the inbreeding effective number is naturally related to the number in the parent (or with separate sexes, the grandparent) generation, the variance effective number is related to the number in the progeny generation. This is to be expected, since the probability of identity by descent depends on the number of ancestors whereas the sampling variance depends on the size of the sample, i.e., the number of offspring.

We now consider some special cases. When $k$ follows the binomial distribution

$$V_k = \bar{k}\left(1 - \frac{1}{N_{t-1}}\right)$$

or

$$s_k^2 = \bar{k},$$

and 7.6.3.26 reduces to

$$N_{e(v)} = N_t$$

irrespective of the coefficient $\alpha_{t-1}$. When the population keeps a constant number ($N_{t-1} = N_t = N$, $\bar{k} = 2$), and if $\alpha_{t-1} = 0$, 7.6.3.26 becomes

$$N_{e(v)} = \frac{4N}{s_k^2 + 2}. \tag{7.6.3.27}$$
However, in a finite population under random mating, the expected value of $\alpha_{t-1}$ is not 0 but $-1/(2N_{t-1} - 1)$ (cf. 2.10.1). Thus if the parent generation were derived from random mating, but $\alpha_{t-1}$ is not known, we substitute the above expected value for $\alpha_{t-1}$ in formula 7.6.3.26. This gives

$$N_{e(v)} = \frac{(2N_{t-1} - 1)\bar{k}}{2\left(1 + \dfrac{V_k}{\bar{k}}\right)}. \tag{7.6.3.28}$$
For example, with self-fertilization, $N_{t-1} = 1$, $\bar{k} = 2$, $V_k = 0$, and $N_{e(v)}$ becomes unity as expected. If all the individuals contribute equally to the next generation ($V_k = 0$) and if $\bar{k} = 2$, we have

$$N_{e(v)} = 2N - 1, \tag{7.6.3.29}$$

namely, the effective number is about twice the actual number.

When sexes are separate, the formula for the effective number may be derived from the following consideration: The gene frequency in the next generation is the average of gene frequencies of male and female gametes, i.e.,

$$p' = \frac{p^{*\prime} + p^{**\prime}}{2}.$$

Therefore

$$V_{\delta p} = \frac{1}{4}\left[\frac{p(1 - p)}{2N_e^*} + \frac{p(1 - p)}{2N_e^{**}}\right].$$

Thus we obtain

$$N_{e(v)} = \frac{p(1 - p)}{2V_{\delta p}} = \frac{1}{\dfrac{1}{4}\left(\dfrac{1}{N_e^*} + \dfrac{1}{N_e^{**}}\right)}$$

or

$$N_{e(v)} = \frac{4N_e^*N_e^{**}}{N_e^* + N_e^{**}}, \tag{7.6.3.30}$$
where $N_e^*$ is given by 7.6.3.31, and the expression for $N_e^{**}$ is similar. If the males and females are equal in number and have equal progeny distributions and if $\alpha_{t-1}$ is not known, we obtain, noting that $\alpha^*_{t-1} = \alpha^{**}_{t-1} = -1/(N_{t-1} - 1)$,

$$N_e = \frac{(N_{t-1} - 1)\bar{k}}{1 + \dfrac{V_k}{\bar{k}}}.$$

In the special case of sib mating, $N_{t-1} = 2$, $\bar{k} = 2$, $V_k = 0$, and we get $N_e = 2$ as expected.

When the effective number changes from generation to generation, the representative effective number over $T$ generations may also be obtained from the consideration of variance as we did for inbreeding (Section 3.13). Let $N^{(t)}_{e(v)}$ be the effective number in the $t$th generation. Then, as shown in 7.3.4, the gene frequency variances in two consecutive generations are related by

$$V_p^{(t)} = \left(1 - \frac{1}{2N^{(t)}_{e(v)}}\right)V_p^{(t-1)} + \frac{p(1 - p)}{2N^{(t)}_{e(v)}}. \tag{7.6.3.32}$$

Starting with $V_p^{(0)} = 0$, the variance after $T$ generations is

$$V_p^{(T)} = p(1 - p)\left[1 - \prod_{t=1}^{T}\left(1 - \frac{1}{2N^{(t)}_{e(v)}}\right)\right] \qquad (T = 1, 2, \ldots).$$

If $N_{e(v)}$ is the representative effective size, then

$$\left(1 - \frac{1}{2N_{e(v)}}\right)^T = \prod_{t=1}^{T}\left(1 - \frac{1}{2N^{(t)}_{e(v)}}\right)$$

or

$$T\log\left(1 - \frac{1}{2N_{e(v)}}\right) = \sum_{t=1}^{T}\log\left(1 - \frac{1}{2N^{(t)}_{e(v)}}\right). \tag{7.6.3.33}$$
Thus for large $N^{(t)}_{e(v)}$'s and a relatively small $T$, we have

$$\frac{T}{N_{e(v)}} = \sum_{t=1}^{T}\frac{1}{N^{(t)}_{e(v)}}$$

or

$$N_{e(v)} = \frac{T}{\displaystyle\sum_{t=1}^{T}\frac{1}{N^{(t)}_{e(v)}}}. \tag{7.6.3.34}$$
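A short calculation shows how strongly 7.6.3.34 is weighted toward the smallest generations. The sketch below compares the harmonic-mean approximation with the product form 7.6.3.33 for a population cycling through sizes 10, 10², ..., 10⁶; the function names are hypothetical, and the cycle is an assumed input chosen to match the geometric series mentioned in the text.

```python
def harmonic_effective_number(sizes):
    """Approximation 7.6.3.34: harmonic mean of the per-generation
    effective numbers."""
    return len(sizes) / sum(1.0 / n for n in sizes)


def product_effective_number(sizes):
    """Representative effective number from 7.6.3.33, without the
    large-N approximation."""
    product = 1.0
    for n in sizes:
        product *= 1.0 - 1.0 / (2.0 * n)
    per_generation = product ** (1.0 / len(sizes))
    return 1.0 / (2.0 * (1.0 - per_generation))


cycle = [10 ** exponent for exponent in range(1, 7)]   # 10, 100, ..., 10**6
print(harmonic_effective_number(cycle))   # about 54
print(product_effective_number(cycle))    # about 53
```

The arithmetic mean of the same six sizes exceeds 185,000, so the representative number is set almost entirely by the generation of size 10.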
Thus, as for the inbreeding effective number, the representative variance effective number is approximately the harmonic mean of the individual effective numbers over the whole time, as pointed out by Wright (1938a). He gives an example in which the breeding population increases in five generations in geometric series from 10 to $10^6$ and then returns to 10 and repeats the cycle. For this case, the representative effective size turns out to be 54, which is much nearer to the minimum number than to the maximum.

In many natural populations, the breeding number may stay fairly constant with small fluctuations around a certain mean. If it changes fortuitously from generation to generation with mean $\bar{N}$ and standard deviation $\sigma_N$, the effective number is given by

$$N_{e(v)} \approx \bar{N} - \frac{\sigma_N^2}{\bar{N}}, \tag{7.6.3.35}$$

as long as $\sigma_N$ is much smaller than $\bar{N}$. This is derived as follows: Let $\delta N$ be the deviation of $N$ from its mean $\bar{N}$, such that

$$N = \bar{N} + \delta N$$

and

$$E(\delta N) = 0, \qquad E[(\delta N)^2] = \sigma_N^2.$$

Since

$$V_{\delta p} = \frac{p(1 - p)}{2N} = \frac{p(1 - p)}{2\bar{N}\left(1 + \dfrac{\delta N}{\bar{N}}\right)} = \frac{p(1 - p)}{2\bar{N}}\left[1 - \frac{\delta N}{\bar{N}} + \frac{(\delta N)^2}{\bar{N}^2} - \cdots\right],$$

neglecting small terms, we obtain

$$E(V_{\delta p}) = \frac{p(1 - p)}{2\bar{N}}\left[1 + \frac{\sigma_N^2}{\bar{N}^2}\right].$$

Thus

$$N_{e(v)} = \frac{p(1 - p)}{2E(V_{\delta p})} = \frac{\bar{N}}{1 + \dfrac{\sigma_N^2}{\bar{N}^2}} \approx \bar{N}\left(1 - \frac{\sigma_N^2}{\bar{N}^2}\right) = \bar{N} - \frac{\sigma_N^2}{\bar{N}},$$

as was to be shown.

7.6.4 Comparison of the Two Effective Numbers
To make our comparison simpler, we will consider a population of monoecious organisms mating at random. From 7.6.2.12 and 7.6.3.28, the inbreeding and variance effective numbers are respectively

$$N_{e(f)} = \frac{N_{t-1}\bar{k} - 1}{\bar{k} - 1 + \dfrac{V_k}{\bar{k}}} \tag{7.6.4.1}$$

and

$$N_{e(v)} = \frac{(2N_{t-1} - 1)\bar{k}}{2\left(1 + \dfrac{V_k}{\bar{k}}\right)}. \tag{7.6.4.2}$$

As noted already, the inbreeding effective number is more naturally related to the number of the parents, while the variance effective number is related to that of the offspring. The former is usually much smaller than the latter if a large number of offspring is produced out of a small number of parents. In an extreme case of $N_{t-1} = 1$ and $\bar{k} \to \infty$, the inbreeding effective number becomes unity while the variance effective number is infinity. On the other hand, if each parent produces just one offspring ($\bar{k} = 1$, $V_k = 0$) the inbreeding effective number becomes infinite but the variance effective number stays finite and equal to twice the number of offspring. However, these are extreme examples. If the population size is constant ($N_{t-1} = N_t = N$, $\bar{k} = 2$) both formulae reduce to

$$N_{e(f)} = N_{e(v)} = \frac{4N - 2}{V_k + 2} \tag{7.6.4.3}$$

with a slight correction if there are separate sexes; the 2 in the numerator is replaced by 4.
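The contrast between the two effective numbers can be made concrete by evaluating 7.6.4.1 and 7.6.4.2 side by side. The sketch below does so for the two extreme cases and for a constant-size population; the function names and the particular parameter values are assumptions for illustration.

```python
import math


def ne_inbreeding(n_parents, k_bar, v_k):
    """7.6.4.1; infinite when the denominator vanishes
    (k_bar = 1 and V_k = 0)."""
    denominator = k_bar - 1 + v_k / k_bar
    if denominator == 0:
        return math.inf
    return (n_parents * k_bar - 1) / denominator


def ne_variance(n_parents, k_bar, v_k):
    """7.6.4.2, with parents assumed to be in random-mating proportions."""
    return (2 * n_parents - 1) * k_bar / (2 * (1 + v_k / k_bar))


cases = {
    "one parent, 1000 gametes": (1, 1000, 0.0),
    "100 parents, one gamete each": (100, 1, 0.0),
    "constant size N = 50, V_k = 3": (50, 2, 3.0),
}
for label, args in cases.items():
    print(label, ne_inbreeding(*args), ne_variance(*args))
```

In the first case the inbreeding number stays at 1 while the variance number grows with the progeny count; in the second the roles are reversed; in the third both equal $(4N - 2)/(V_k + 2)$.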
Table 7.6.4.1 A comparison of the inbreeding and variance effective numbers for monoecious and bisexual populations. In the monoecious population self-fertilization is permitted. The values for separate sexes are for the case where both sexes have the same progeny distribution. $N_t$ = number of individuals in generation $t$, $\bar{k}$ = mean number of progeny per parent, $V_k$ = variance in number of progeny per parent, $\alpha$ = measure of departure from Hardy-Weinberg proportions; symbols with an asterisk refer to one sex only. (From Kimura and Crow, 1963a.)

Columns: inbreeding effective number, monoecious (equation 7.6.2.12) and separate sexes (7.6.2.17); variance effective number, monoecious (7.6.3.26) and separate sexes (7.6.3.31).

Rows: general; ideal population ($V_k = \bar{k}(N - 1)/N$, $V_k^* = \bar{k}(N - 2)/N$); parents in random-mating proportions ($\alpha_{t-1} = -1/(2N_{t-1} - 1)$, $\alpha^*_{t-1} = -1/(N_{t-1} - 1)$); constant population size ($\bar{k} = 2$); equal progeny numbers ($V_k = s_k^2 = 0$); decreasing population ($\bar{k} = 1$); homozygous parents ($\alpha = 1$); self-fertilization; sib mating.
1. The actual distribution of the number of alleles maintained in this way is more difficult and will be discussed in Chapter 9, along with the complicating effects of selection.

A similar situation arises when we look at the process in time. A newly arisen mutation is very likely to be lost from the population within a few generations because of accidents of the Mendelian process and variations in the number of progeny from different individuals in the population. On the other hand, if the population is finite, a minority of mutants may be lucky enough to persist in the population and ultimately become the prevailing type. To be sure, the likelihood of this is extremely small in a population of moderate size; but the probability is not zero and in the long time of evolutionary history events with small probabilities do occur. We shall consider now the simple problem of how frequently such neutral-gene replacement is expected to occur. Later in the chapter (Sections 8.8 and 8.9) we deal with the process in more detail, including the effects of selection.

Consider a population of size $N$ (actual number, not effective number). If we look sufficiently long into the future the population of genes at a particular locus will all be descended from a single allele in the present generation. In the vocabulary of Chapter 3, they all will be identical by descent and the population will be autozygous for this locus. This is the result of the inexorable process of random gene frequency drift. If, in the present generation, an allele $A_1$ exists in frequency $p$, the probability is simply $p$ that the lucky allele from which the whole population of genes is descended is $A_1$ rather than some other allele.

Now, if mutation occurs at a rate $u$ per gene per generation, then the number of new mutants at this locus in the present generation is $2Nu$. The probability that a particular gene will eventually be fixed in the population is $1/2N$. So the probability of a mutant gene arising in this generation and eventually being incorporated into the population is $2Nu(1/2N) = u$. We have the remarkably simple result that

$$\text{Rate of neutral gene substitution} = u. \tag{8.1.1}$$

That is to say that, viewed over a long time period, the rate of evolution by fixation of neutral mutants is equal to the mutation rate. Stated in another way, the average interval between the occurrence of successful mutants is $1/u$.

The observed rate of evolution of amino acids in mammalian hemoglobins is about one replacement per codon per $10^9$ years. This could be accounted for entirely by neutral substitutions if the mutation rate to such alleles were $10^{-9}$ per codon per year. This is not to assert that this is necessarily the major mechanism by which amino acids evolve, but we would not be surprised if it turns out that an appreciable fraction of nucleotide replacement in evolution is of this type. For a discussion, see Kimura (1968), King and Jukes (1969), and Crow (1969).

We have not considered the time required for a successful mutant to go from a single representative to complete fixation. Later, in Section 8.9, it will be shown that the average time required for those mutants which are successful to change frequency from $1/2N$ to 1 is

$$\bar{t} = 4N_e. \tag{8.1.2}$$

This does not depend on the mutation rate. This is as expected; since only one representative will be fixed, the time required does not depend on how many mutants there are. Note also that the relevant population number this time is the effective number, not the actual number.

These points are illustrated in Figure 8.1.1. The rate of gene substitution, when we consider a time period that is long with respect to the time required for a single substitution to occur, is $u$. This is given by the reciprocal of the time interval between the occurrence of successive successful mutants, as shown on the graph. The time required for a particular gene to be substituted is $\bar{t}$, also shown. If $\bar{t}$ is small relative to $1/u$, that is, if $4N_e u \ll 1$, then the population is monomorphic most of the time, as illustrated by situation b in the figure. On the other hand, if $\bar{t}$ is comparable to $1/u$, as might be the case in a large population with a high rate of mutation to neutral alleles, the population will have considerable transient polymorphism. At any one time it is likely to have more than one allele, although these will be different alleles at different times. This is illustrated by situation c.

Figure 8.1.1. Gene substitution by random drift and mutation. The abscissa is time over a very long period; the ordinate is the number of mutant genes descended from a single mutant. The upper figure (a) is intended to illustrate that, while most mutants persist a few generations and then are lost, an occasional one increases to eventual fixation. The second drawing (b) shows only the mutants that eventually become incorporated into the population. The time scale is therefore much longer. The third (c) shows a situation in a larger population or one with a higher rate of occurrence of neutral mutants. In this case the time required for a replacement ($\bar{t}$) is comparable to that between such events ($1/u$), with the result that there is considerable transient polymorphism.

Later in the chapter (Section 8.8), it will be shown that the probability of fixation of a single gene that is slightly favorable is approximately $2s$. So, even favorable mutants are lost most of the time. However, if they occur with any appreciable frequency they are substituted considerably more rapidly than neutral alleles, as expected.
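The fixation probability $1/2N$ that underlies 8.1.1 can be checked by direct simulation. The following is a minimal sketch, assuming an ideal monoecious population in which the $2N$ genes of each generation are drawn binomially from the previous one (so that the effective and actual numbers coincide); the function name, the population size, and the number of replicates are all assumptions for illustration.

```python
import random


def fixation_probability(n_diploid, replicates, seed=0):
    """Fraction of single new neutral mutants that reach fixation
    under binomial sampling of 2N genes per generation."""
    rng = random.Random(seed)
    two_n = 2 * n_diploid
    fixed = 0
    for _ in range(replicates):
        count = 1                          # one new mutant gene
        while 0 < count < two_n:
            p = count / two_n
            # draw the 2N genes of the next generation one by one
            count = sum(1 for _ in range(two_n) if rng.random() < p)
        if count == two_n:
            fixed += 1
    return fixed / replicates


n = 10
estimate = fixation_probability(n, 5000)
print(estimate, 1 / (2 * n))     # estimate should lie near 1/2N = 0.05

u = 1e-6                          # assumed mutation rate per gene per generation
print(2 * n * u * estimate, u)    # substitution rate should lie near u
```

The estimate fluctuates with the seed and the number of replicates, but the product $2Nu \times$ (fixation probability) stays close to $u$ whatever value of $N$ is chosen, which is the content of 8.1.1.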
8.2 Change of Gene Frequencies as a Stochastic Process

In many mathematical problems arising in population genetics, the process of change in gene frequency may be treated as deterministic, as we have done in Chapters 1 through 6. This approach was extensively developed by Haldane (1924 and later), especially for single-locus problems. It is still useful. There are many circumstances where such an approach is sufficiently realistic to yield interesting and reliable information, as we already illustrated. Yet, when we consider that actual populations are all finite, that many mutant genes may be represented only once at the moment of their occurrence, and that organic evolution has proceeded over an enormous period of time in an ever-fluctuating environment, we realize the necessity of an approach that can take indeterminacy into account.

In this chapter we treat the process of change in gene frequencies as a stochastic process. By this we mean a mathematical formulation of chance events in a process that proceeds with time.

The pioneering work in this field has been done by Fisher (1922, 1930) and Wright (1931 and later). These authors have been mainly concerned with the state of statistical equilibrium that is reached when the form of the distribution becomes constant. The problem of constructing the entire history of change in gene frequencies starting from arbitrary initial frequencies is more complicated. Several practically important cases have been solved by one of the authors and he has reviewed the historical development of the subject elsewhere (Kimura, 1964).

A mathematical approach which has proven to be very powerful makes use of "diffusion" models, in which two diffusion equations, the Kolmogorov forward and backward equations, play a central role.

8.3 The Diffusion Equation Method

In population genetics the fundamental quantity used for describing the genetic composition of a Mendelian population is gene frequency rather than genotype frequency. The main reason for this, as was discussed in Section 2.1, is that each gene is a self-reproducing entity and its frequency changes almost continuously with time as long as the population is reasonably large. On the other hand, genotypes are produced anew in each generation by recombination of genes and therefore do not have the continuity that genes have.

Also, we note that in actual evolution the gene frequency changes are typically very slow. To be sure, there are some exceptions. One is the rapid increase in the melanic gene in some Lepidoptera in industrial areas. Another is the development of resistance to insecticides and antibiotics. But, as pointed out by Haldane (1949d), the typical rate of evolution shown by fossil records is of the order of one-tenth of a darwin, where one darwin represents a change by a factor $e$ in a million years, or $10^{-6}$ per year. On the ordinary scales by which we consider time, this is exceedingly slow.

Therefore, in the following treatment we will regard the process of change of gene frequency as a continuous stochastic process. Roughly speaking, this means that as the time interval becomes smaller, the amount of change in gene frequency during that interval is expected to be smaller. More strictly, the process is called a continuous stochastic process if for any given positive value $\varepsilon$, however small, the probability that the change in gene frequency $x$ during time interval (t , t + ( x , t)} ox
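0, we obtain

$$\frac{\partial \phi(x, t)}{\partial t} = \frac{1}{2}\frac{\partial^2}{\partial x^2}\{V(x, t)\phi(x, t)\} - \frac{\partial}{\partial x}\{M(x, t)\phi(x, t)\}, \tag{8.3.7}$$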
which is equivalent to 8.3.1, $V(x, t)$ and $M(x, t)$ in this equation corresponding respectively to $V_{\delta x}$ and $M_{\delta x}$ in equation 8.3.1. We may note here that these two sets of quantities are defined in a slightly different way. Namely, in the above derivation $V(x, t)$ represents the variance per infinitesimal time of the random component of change for which the mean is 0. Any systematic component of change is included in $M(x, t)$. On the other hand, $V_{\delta x}$ and $M_{\delta x}$ in equation 8.3.1 represent the variance and the mean of the change of gene frequency per generation. In practice, quantities such as mutation rates, rate of migration, intensity of selection, and effect of random sampling of gametes which determine the rate of change in gene frequency are all measured or expressed with one generation as the time unit. So, for practical purposes, expressions $V_{\delta x}$ and $M_{\delta x}$ might be more convenient than $V(x, t)$ and $M(x, t)$.

In the above derivation leading to 8.3.7, we assumed that the systematic pressure pushes the gene frequency toward the right, but no essential change is required for the argument if it pushes toward the left, in which case we simply use $m(x + h, t)\delta t$ . . .

$$\phi(x; t + \delta t) = \int \left\{\phi g - \xi\frac{\partial(\phi g)}{\partial x} + \frac{\xi^2}{2!}\frac{\partial^2(\phi g)}{\partial x^2} - \frac{\xi^3}{3!}\frac{\partial^3(\phi g)}{\partial x^3} + \cdots\right\} d\xi, \tag{8.3.9}$$

where $\phi$ and $g$ stand for $\phi(x; t)$ and $g(x, \xi; t, \delta t)$. In the following treatment we will assume that the order between summation, integration, and differentiation may be interchanged freely. From 8.3.9, neglecting terms involving $\xi^3$ and higher powers of $\xi$, we have

$$\phi(x; t + \delta t) = \phi\int g\, d\xi - \frac{\partial}{\partial x}\left\{\phi\int \xi g\, d\xi\right\} + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left(\phi\int \xi^2 g\, d\xi\right). \tag{8.3.10}$$

Noting that

$$\int g(x, \xi; t, \delta t)\, d\xi = 1,$$

transferring the first term in the right-hand side of 8.3.10 to the left, and then dividing both sides by $\delta t$, we get

$$\frac{\phi(x; t + \delta t) - \phi(x, t)}{\delta t} = -\frac{\partial}{\partial x}\left\{\phi(x, t)\frac{1}{\delta t}\int \xi g(x, \xi; t, \delta t)\, d\xi\right\} + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left\{\phi(x, t)\frac{1}{\delta t}\int \xi^2 g(x, \xi; t, \delta t)\, d\xi\right\}. \tag{8.3.11}$$

At the limit as $\delta t \to 0$, if we define $M(x, t)$ and $V(x, t)$ by

$$M(x, t) = \lim_{\delta t \to 0}\frac{1}{\delta t}\int \xi g(x, \xi; t, \delta t)\, d\xi \tag{8.3.12}$$

and

$$V(x, t) = \lim_{\delta t \to 0}\frac{1}{\delta t}\int \xi^2 g(x, \xi; t, \delta t)\, d\xi, \tag{8.3.13}$$

equation 8.3.11 yields the Kolmogorov forward equation given before as 8.3.7. The above derivation may still be unsatisfactory from the standpoint of mathematical rigor. For more rigorous derivations, readers may refer to the mathematical literature, for example, Kolmogorov (1931).
378
AN I NT R O D U CTI O N TO P O P U LATI O N G EN ETICS T H EO R Y
Going back to equation 8.3. 1 and substituting 4>( x, t) for 4>(p, x ; t), the Kolmogorov forward equation may most conveniently be expressed for our purpose as a4>(x, t) - 1 a2 -,h(x t) } - 2 ax2 { V/J.t 'P at ,
8.3.1 4
a - - { MlJx 4>(x, t) } . ax
In applying this equation to population genetics, it is often very useful to keep in mind that the quantity

$$-\frac{1}{2}\frac{\partial}{\partial x}\{ V_{\delta x}\,\phi(x, t) \} + M_{\delta x}\,\phi(x, t),$$

which we will denote by $P(x, t)$ and which enters the right-hand side of the above equation as $-\partial P(x, t)/\partial x$, represents the rate per generation of net flow of probability mass across the point $x$. With the help of Figure 8.3.1 we will again try to show this using a geometrical interpretation. The net flow of probability mass across the point $x + h/2$ during the short time interval $(t, t + \delta t)$, which we denote by $P(x + \tfrac{1}{2}h, t)\,\delta t$, is given by

$$P(x + \tfrac{1}{2}h, t)\,\delta t = m(x, t)\,\delta t\,\phi(x, t)h + \tfrac{1}{2}v(x, t)\,\delta t\,\phi(x, t)h - \tfrac{1}{2}v(x + h, t)\,\delta t\,\phi(x + h, t)h. \tag{8.3.15}$$
Here we consider only the exchange of frequencies between the classes having gene frequency $x$ and $x + h$. Substituting $m(x, t)h = M(x, t)$ and $v(x, t)h = V(x, t)/h$ in the above equation, we get

$$P(x + \tfrac{1}{2}h, t) = M(x, t)\phi(x, t) - \frac{1}{2}\,\frac{V(x + h, t)\phi(x + h, t) - V(x, t)\phi(x, t)}{h},$$

which gives

$$P(x, t) = M(x, t)\phi(x, t) - \frac{1}{2}\frac{\partial}{\partial x}\{ V(x, t)\phi(x, t) \} \tag{8.3.16}$$

at the limit of $h \to 0$. Again, for practical purposes, it is convenient to use the mean and the variance of the gene frequency change per generation, that is, $M_{\delta x}$ and $V_{\delta x}$ for $M(x, t)$ and $V(x, t)$, to give

$$P(x, t) = -\frac{1}{2}\frac{\partial}{\partial x}\{ V_{\delta x}\,\phi(x, t) \} + M_{\delta x}\,\phi(x, t). \tag{8.3.17}$$
With this expression for the probability flux, the forward equation may be written as

$$\frac{\partial \phi(x, t)}{\partial t} = -\frac{\partial P(x, t)}{\partial x}. \tag{8.3.18}$$
As pointed out following equation 8.3.3, our fundamental equation 8.3.1 is valid only for gene frequencies in the interval $0 < x < 1$ (unfixed classes). Therefore, separate treatments are required to obtain probabilities for $x = 0$ and $x = 1$ (terminal classes). In order to obtain the rate of change in the frequencies of these terminal classes, we make use of the fact just established, that is, $P(x, t)$ in 8.3.17 gives the probability flux across the point $x$ at time $t$. Here we will consider a special but important case in which the change of these frequencies is entirely due to inflow of probability mass from the unfixed classes. In the terminology of the mathematical theory of probability, the boundaries ($x = 0$ and $x = 1$) act as absorbing barriers. In such a case, we have

$$\frac{df(0, t)}{dt} = -P(0, t), \tag{8.3.19}$$

$$\frac{df(1, t)}{dt} = P(1, t), \tag{8.3.20}$$
where $f(0, t)$ and $f(1, t)$ are the frequencies of classes having gene frequency 0 and 1 at the $t$th generation. In the particularly important case in which the random fluctuation is solely due to random sampling of gametes such that $V_{\delta x} = x(1 - x)/2N_e$, and the systematic pressure is solely due to selection such that $M_{\delta x} = x(1 - x)s(x, t)$, where $s(x, t)$ is the selection coefficient,

$$-P(0, t) = \lim_{x \to 0}\left[ \frac{1}{2}\frac{\partial}{\partial x}\left\{ \frac{x(1 - x)}{2N_e}\,\phi(x, t) \right\} - x(1 - x)s(x, t)\phi(x, t) \right] = \frac{1}{4N_e}\,\phi(0, t).$$

Thus, from 8.3.19, we have

$$\frac{df(0, t)}{dt} = \frac{1}{4N_e}\,\phi(0, t) = \frac{1}{2}\,\phi(0, t)\,\frac{1}{2N}\left( \frac{N}{N_e} \right). \tag{8.3.21}$$
In the right-hand side of the above equation, $\phi(0, t)$ is approximately equal to $\phi(1/2N, t)$. Since $\phi(1/2N, t)(1/2N)$ is our approximation for the frequency of the subterminal class $f(1/2N, t)$, we have

$$\frac{df(0, t)}{dt} = \frac{1}{2}\, f\!\left( \frac{1}{2N}, t \right)\left( \frac{N}{N_e} \right). \tag{8.3.22}$$

That is, the contribution from the unfixed classes to the rate of change in the terminal class with $x = 0$ is half the frequency of the subterminal class with $x = 1/2N$ multiplied by the ratio $N/N_e$. In the special case in which the actual size of the population is equal to its effective size, this ratio reduces to unity. In a like manner, we have

$$\frac{df(1, t)}{dt} = \frac{1}{2}\,\phi(1, t)\,\frac{1}{2N}\left( \frac{N}{N_e} \right) \approx \frac{1}{2}\, f\!\left( 1 - \frac{1}{2N}, t \right)\left( \frac{N}{N_e} \right) \tag{8.3.23}$$

for the terminal class with $x = 1$ under a similar condition.

The diffusion equation method can readily be extended to treat the cases of two or more random variables. For example, for two independently segregating loci, with a pair of alleles $A_1$ and $A_2$ in the first locus and $B_1$ and $B_2$ in the second, the corresponding Kolmogorov forward equation becomes

8.3.24
where $\phi = \phi(p, q; x, y; t)$ stands for the probability density that the frequencies of $A_1$ and $B_1$ are $x$ and $y$ at the $t$th generation, given that their frequencies are $p$ and $q$ at $t = 0$. Some of the applications of the forward equation to more concrete problems of population genetics, such as constructing the process of random genetic drift owing to small population number, will be presented in following sections.

Next, we will attempt to derive the Kolmogorov backward equation given as 8.3.2, in which we consider $x$ fixed and $p$ a variable. In this formulation we reverse the time sequence and look at the process retrospectively. Also, in order to make our treatment simpler, we will restrict our consideration to the cases in which the process is time homogeneous; that is, we consider only those cases in which, if $x_{t_1}$ and $x_{t_2}$ are frequencies of a gene at times $t_1$ and $t_2$, then the probability distribution of $x_{t_2}$, given $x_{t_1}$, which in general is a function of $t_1$ and $t_2$, depends only on the time difference $t_2 - t_1$.
For such a time-homogeneous Markov process, we have

$$\phi(p, x; t + \delta t) = \int g(p, \xi; \delta t)\,\phi(p + \xi, x; t)\, d\xi. \tag{8.3.25}$$
The above relation, which is analogous to relation 8.3.8, may be derived by considerations similar to those by which 8.3.8 was derived. Note here, however, that $g$ in the above relation depends on three variables, $p$, $\xi$, and $\delta t$, but not on $t$. This is due to the assumption of time homogeneity, that is, the probability that the gene frequency changes from $p$ to $p + \xi$ during a time interval of length $\delta t$ is the same for any $t$ (generation). Expanding $\phi(p + \xi, x; t)$ inside the integral in terms of $\xi$ but neglecting terms involving $\xi^3$ and higher powers of $\xi$, we obtain, at the limit of $\delta t \to 0$, the following equation:

$$\frac{\partial \phi(p, x; t)}{\partial t} = \frac{V(p)}{2}\frac{\partial^2 \phi(p, x; t)}{\partial p^2} + M(p)\frac{\partial \phi(p, x; t)}{\partial p}, \tag{8.3.26}$$

where

$$M(p) = \lim_{\delta t \to 0} \frac{1}{\delta t}\int \xi g(p, \xi; \delta t)\, d\xi, \tag{8.3.27}$$

$$V(p) = \lim_{\delta t \to 0} \frac{1}{\delta t}\int \xi^2 g(p, \xi; \delta t)\, d\xi. \tag{8.3.28}$$
Thus, substituting the mean and the variance of the amount of change per generation, $M_{\delta p}$ and $V_{\delta p}$, for $M(p)$ and $V(p)$ in the above equation, we obtain 8.3.2, the Kolmogorov backward equation as applied to population genetics.

One of the very important uses of this equation is its application to the problem of gene fixation. If we take $x = 1$, $\phi$ in the backward equation may be interpreted as the probability that the gene becomes fixed (established) in the population by the $t$th generation, given that its frequency is $p$ at the start. We will denote this probability by $u(p, t)$, for which we have

$$\frac{\partial u(p, t)}{\partial t} = \frac{V_{\delta p}}{2}\frac{\partial^2 u(p, t)}{\partial p^2} + M_{\delta p}\frac{\partial u(p, t)}{\partial p}. \tag{8.3.29}$$

The probability of fixation will then be obtained by solving the above equation with the boundary conditions

$$u(0, t) = 0, \qquad u(1, t) = 1; \tag{8.3.30}$$

that is, the probability is 0 if $p = 0$ and is 1 if $p = 1$.
Of special interest in population genetics is the ultimate probability of gene fixation defined by

$$u(p) = \lim_{t \to \infty} u(p, t), \tag{8.3.31}$$

for which

$$\frac{\partial u}{\partial t} = 0.$$

Thus, $u(p)$ satisfies the ordinary differential equation

$$\frac{V_{\delta p}}{2}\frac{d^2 u(p)}{dp^2} + M_{\delta p}\frac{du(p)}{dp} = 0 \tag{8.3.32}$$

with boundary conditions

$$u(0) = 0, \qquad u(1) = 1. \tag{8.3.33}$$
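The boundary-value problem 8.3.32 with 8.3.33 can be integrated once $M_{\delta p}$ and $V_{\delta p}$ are specified. The following is only a minimal sketch under an assumption introduced here for illustration: genic selection with a constant coefficient $s$, so that $M_{\delta p} = s\,p(1-p)$ and $V_{\delta p} = p(1-p)/2N_e$. Then $2M_{\delta p}/V_{\delta p} = 4N_e s$, and the solution is $u(p) = \int_0^p G(x)\,dx \big/ \int_0^1 G(x)\,dx$ with $G(x) = e^{-4N_e s x}$. The function names are hypothetical.

```python
import math

def fixation_probability(p, ne, s):
    """Closed-form u(p) for 8.3.32 with u(0)=0, u(1)=1, assuming constant s."""
    a = 4.0 * ne * s
    if abs(a) < 1e-12:
        return p  # neutral case: u(p) = p
    # (1 - exp(-a p)) / (1 - exp(-a)), written with expm1 for accuracy
    return math.expm1(-a * p) / math.expm1(-a)

def fixation_probability_numeric(p, ne, s, steps=20000):
    """Same quantity by midpoint quadrature of G(x) = exp(-4 Ne s x)."""
    a = 4.0 * ne * s
    h = 1.0 / steps
    total = 0.0    # integral of G over (0, 1)
    partial = 0.0  # integral of G over (0, p)
    for k in range(steps):
        x = (k + 0.5) * h
        g = math.exp(-a * x) * h
        total += g
        if x < p:
            partial += g
    return partial / total
```

For example, `fixation_probability(1 / 2000, 1000, 0.01)` gives the chance of fixation of a single new mutant in a population of 1000 when $N_e = N$; under these assumptions it is close to $2s$.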
The problem of gene fixation is important in the theory of evolution and also for the study of breeding. We will elaborate the application of equation 8.3.32 to this problem later in this chapter. In the next few sections, we will apply the Kolmogorov forward equation to solve some concrete problems arising in population genetics.

8.4 The Process of Random Genetic Drift Due to Random Sampling of Gametes
We will first consider the simplest situation in which a pair of alleles $A_1$ and $A_2$ are segregating with respective frequencies $x$ and $1 - x$ in a random mating population of $N$ monoecious individuals and the only factor which causes gene frequency change is the random sampling of gametes in reproduction. As time goes on, the gene frequencies tend to deviate from their initial values and eventually one of the two alleles becomes fixed in the population. This is the simplest important stochastic process in the change of gene frequencies in a Mendelian population. Since Wright's work in 1931, the process has been known by the term "drift" or, more adequately, "random genetic drift." In the previous chapter, we studied the law of change in the mean, variance, and the higher moments. The mean and the variance of the change in gene frequency $x$ per generation are, respectively,

$$M_{\delta x} = 0 \quad \text{and} \quad V_{\delta x} = \frac{x(1 - x)}{2N_e},$$

as shown in 7.3.1 and 7.3.2. For simplicity, we assume in this section that the effective population number $N_e$ is equal to $N$. Thus the forward equation 8.3.1 becomes

$$\frac{\partial \phi}{\partial t} = \frac{1}{4N}\frac{\partial^2}{\partial x^2}\{ x(1 - x)\phi \}, \qquad (0 < x < 1), \tag{8.4.1}$$
where $\phi \equiv \phi(p, x; t)$ is the probability density that the gene frequency becomes $x$ in the $t$th generation, given that it is $p$ at $t = 0$. In terms of the Dirac delta function $\delta(\cdot)$, the initial condition may be expressed in the form

$$\phi(p, x; 0) = \delta(x - p). \tag{8.4.2}$$

The required solution of 8.4.1 that satisfies the initial condition 8.4.2 was first obtained by Kimura (1955a). It is expressed in terms of the hypergeometric function as follows:

$$\phi(p, x; t) = \sum_{i=1}^{\infty} p(1 - p)\, i(i + 1)(2i + 1)\, F(1 - i, i + 2, 2, p)\, F(1 - i, i + 2, 2, x)\, e^{-\frac{i(i+1)}{4N}t}, \tag{8.4.3}$$

where $F(\cdot, \cdot, \cdot, \cdot)$ stands for the hypergeometric function so that

$$F(1 - i, i + 2, 2, x) = 1 + \frac{(1 - i)(i + 2)}{1 \times 2}\,x + \frac{(1 - i)(2 - i)(i + 2)(i + 3)}{1 \times 2 \times 2 \times 3}\,x^2 + \cdots. \tag{8.4.4}$$
The above solution, 8.4.3, may also be expressed in terms of the Gegenbauer polynomial (see Korn and Korn, 1968), defined by

$$T^1_{i-1}(z) = \frac{i(i + 1)}{2}\, F\!\left( i + 2, 1 - i, 2, \frac{1 - z}{2} \right), \tag{8.4.5}$$

as follows:

$$\phi(p, x; t) = \sum_{i=1}^{\infty} \frac{(2i + 1)(1 - r^2)}{i(i + 1)}\, T^1_{i-1}(r)\, T^1_{i-1}(z)\, e^{-\frac{i(i+1)}{4N}t}, \tag{8.4.6}$$

where $r = 1 - 2p$, $z = 1 - 2x$, and $T^1_0(z) = 1$, $T^1_1(z) = 3z$, $T^1_2(z) = (3/2)(5z^2 - 1)$, $T^1_3(z) = (5/2)(7z^3 - 3z)$, $T^1_4(z) = (15/8)(21z^4 - 14z^2 + 1)$, etc.
The right-hand side of equation 8.4.3 or 8.4.6 is an infinite series, but for a large value of $t$, only the first few terms are of any significance in determining the actual form of the distribution. Thus, for a large $t$, we have

$$\phi(p, x; t) = 6p(1 - p)e^{-\frac{1}{2N}t} + 30p(1 - p)(1 - 2p)(1 - 2x)e^{-\frac{3}{2N}t} + \cdots. \tag{8.4.7}$$
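As a numerical illustration of how quickly the series 8.4.3 is dominated by its leading terms, the sketch below sums a truncated version of 8.4.3 and compares it with the two terms written out in 8.4.7. It is a rough sketch rather than a production routine: the function names and the truncation at 30 terms are choices made here, and the power-series evaluation of the hypergeometric polynomial is only trustworthy when $t$ is not small relative to $N$, so that high-order terms are strongly damped.

```python
import math

def hyp_poly(i, x):
    """F(1-i, i+2, 2, x) as in 8.4.4; the series terminates after i terms."""
    a, b, c = 1 - i, i + 2, 2
    term, total = 1.0, 1.0
    for k in range(i - 1):  # (a + k) reaches zero after i - 1 factors
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * x
        total += term
    return total

def phi(p, x, t, n, terms=30):
    """Truncated series 8.4.3 for the density of unfixed classes."""
    total = 0.0
    for i in range(1, terms + 1):
        total += (p * (1 - p) * i * (i + 1) * (2 * i + 1)
                  * hyp_poly(i, p) * hyp_poly(i, x)
                  * math.exp(-i * (i + 1) * t / (4.0 * n)))
    return total

def phi_leading(p, x, t, n):
    """The two leading terms, as written in 8.4.7."""
    return (6 * p * (1 - p) * math.exp(-t / (2.0 * n))
            + 30 * p * (1 - p) * (1 - 2 * p) * (1 - 2 * x)
            * math.exp(-3 * t / (2.0 * n)))
```

With $N = 100$ and $t = 4N$, `phi(0.5, 0.5, 400, 100)` and `phi_leading(0.5, 0.5, 400, 100)` differ by less than $10^{-4}$ in this sketch.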
In particular, at the limit of $t \to \infty$, we obtain the asymptotic formula

$$\phi \sim C e^{-\frac{1}{2N}t}, \tag{8.4.8}$$

in which $C$ is a constant. This means that after a large number of generations, the probability distribution for unfixed classes ($0 < x < 1$) becomes flat and decays at the rate of $1/(2N)$ per generation. This is called the state of steady decay, and $1/(2N)$ corresponds to the smallest eigenvalue of the partial differential equation 8.4.1. The relation 8.4.8 was first obtained by Wright (1931). Figure 8.4.1 illustrates such a state of steady decay, when fixation or loss of an allele occurs at a constant rate.
Figure 8.4.1. The distribution after many generations (roughly, any time after t = 2N) when the distribution is of steady form. All frequencies between 0 and 1 exclusive are equally probable and are decreasing at the same rate, 1/2N. Fixation or loss of the allele proceeds at a constant rate, 1/4N. (Adapted from Wright, 1931.)
The complete solution, 8.4.3, enables us to construct the more detailed process of change in the frequency distribution of unfixed classes as shown in Figures 8.4.2a and 8.4.2b. In Figure 8.4.2a, the initial gene frequency ($p$) is 0.5. It may be seen from the figure that after $2N$ generations the distribution curve becomes almost flat and the genes are still unfixed in about 50% of the cases. In Figure 8.4.2b, the initial gene frequency is assumed to be 0.1 and it takes $4N$ or $5N$ generations before the distribution curve becomes practically flat. By that time, however, gene $A_1$ is either fixed in the population or lost from it in more than 90% of the cases. So the asymptotic formula 8.4.8 may not be as useful for $p = 0.1$ as in the case $p = 0.5$. Actually, from 8.4.7, $C = 6p(1 - p)$ and this constant is small if $p$ is near 0 or 1.

Figures 8.4.2a,b. The process of random genetic drift due to small population number, in which it is assumed that mutation, migration, and selection are absent, and the random change in the gene frequency from generation to generation is caused by random sampling of gametes. In Figure 8.4.2a, the initial frequency of A1 is 0.5, while in Figure 8.4.2b, the initial frequency is 0.1. In both figures, t stands for time and N stands for the effective population number. The abscissa is the frequency of A1 in the population and the ordinate is the corresponding probability density. (From Kimura, 1955a.)

Going back to the complete solution 8.4.6, the probability, $\Omega_t$, of both $A_1$ and $A_2$
co-existing in the population in the $t$th generation may be obtained by integrating $\phi(p, x; t)$ with respect to $x$ from 0 to 1. This gives

$$\Omega_t = \sum_{j=0}^{\infty} \{ P_{2j}(r) - P_{2j+2}(r) \}\, e^{-\frac{(2j+1)(2j+2)}{4N}t}, \tag{8.4.9}$$

where $P(\cdot)$ represents the Legendre polynomials; $P_0(r) = 1$, $P_1(r) = r$, $P_2(r) = (1/2)(3r^2 - 1)$, etc. The above formula is an infinite series, but for a large $t$, we may use the following formula to compute the value of $\Omega_t$:

$$\Omega_t = 6p(1 - p)e^{-\frac{1}{2N}t} + 14p(1 - p)(1 - 5p + 5p^2)e^{-\frac{6}{2N}t} + \cdots. \tag{8.4.10}$$
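The series 8.4.9 is straightforward to evaluate numerically. The sketch below is an illustration only: it computes the Legendre polynomials by the standard three-term recurrence and sums 8.4.9, with the two-term approximation 8.4.10 alongside for comparison. The function names and the default number of terms are choices made here, and the sum is meant for $t$ of the order of $N$ or larger, where it converges rapidly.

```python
import math

def legendre(n, r):
    """Legendre polynomial P_n(r) via (k+1)P_{k+1} = (2k+1) r P_k - k P_{k-1}."""
    p_prev, p_curr = 1.0, r
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * r * p_curr - k * p_prev) / (k + 1)
    return p_curr

def prob_unfixed(p, t, n, terms=50):
    """Omega_t from 8.4.9: probability that both alleles are still present."""
    r = 1.0 - 2.0 * p
    total = 0.0
    for j in range(terms):
        rate = (2 * j + 1) * (2 * j + 2) / (4.0 * n)
        total += (legendre(2 * j, r) - legendre(2 * j + 2, r)) * math.exp(-rate * t)
    return total

def prob_unfixed_two_terms(p, t, n):
    """The two leading terms written out in 8.4.10."""
    return (6 * p * (1 - p) * math.exp(-t / (2.0 * n))
            + 14 * p * (1 - p) * (1 - 5 * p + 5 * p * p) * math.exp(-6 * t / (2.0 * n)))
```

For $N = 100$, $p = 0.5$, and $t = 2N = 200$, `prob_unfixed(0.5, 200, 100)` is about 0.55, so slightly more than half of a large collection of such populations would still be segregating.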
The frequency of heterozygotes, or the probability that an individual in the population is heterozygous at a given generation, can also be obtained by using 8.4.6 as follows:

$$H_t = \int_0^1 2x(1 - x)\phi(p, x; t)\, dx = 2p(1 - p)e^{-\frac{1}{2N}t}. \tag{8.4.11}$$
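The decay of heterozygosity can be checked against the underlying discrete model without any diffusion approximation. The sketch below, offered as an illustration, builds the exact Markov chain for binomial sampling of $2N$ gametes (the Wright-Fisher model is an assumption stated here, as are the function names) and tracks the expected heterozygosity. In that discrete chain the expected heterozygosity is multiplied by exactly $1 - 1/2N$ each generation, which $e^{-t/2N}$ in 8.4.11 approximates.

```python
import math

def wright_fisher_step(dist, two_n):
    """One generation of binomial sampling of 2N gametes (exact Markov chain).

    dist[i] is the probability that the population holds i copies of A1.
    """
    new = [0.0] * (two_n + 1)
    for i, mass in enumerate(dist):
        if mass == 0.0:
            continue
        x = i / two_n
        for j in range(two_n + 1):
            new[j] += mass * math.comb(two_n, j) * x**j * (1 - x)**(two_n - j)
    return new

def expected_heterozygosity(dist, two_n):
    """E[2x(1-x)] over the current distribution of gene frequencies."""
    return sum(m * 2 * (i / two_n) * (1 - i / two_n) for i, m in enumerate(dist))
```

Starting from `dist[10] = 1.0` with `two_n = 20`, successive values of `expected_heterozygosity` fall by the factor 0.95 per generation.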
This shows that the frequency of heterozygotes decreases exactly at the rate of $1/(2N)$ per generation. It agrees with 7.3.7, which was obtained by an elementary method. Actually, this holds also for multiallelic cases and is independent of the number of alleles involved.

The above treatment should have made it clear that the genetic heterogeneity of a population and the heterozygosity of an individual are not only distinct conceptually, but their probabilities $\Omega_t$ and $H_t$ are also different.

The processes of change in the probability distribution of fixed classes may be obtained by using relations 8.3.21 and 8.3.23. The frequency of the class in which gene $A_1$ is fixed, or the probability that $A_1$ becomes fixed by the $t$th generation, is as follows:

$$f(p, 1; t) = \frac{1}{4N}\int_0^t \phi(p, 1; \tau)\, d\tau = p + \frac{1}{2}\sum_{i=1}^{\infty} \frac{(2i + 1)(1 - r^2)}{i(i + 1)}\, T^1_{i-1}(r)\,(-1)^i e^{-\frac{i(i+1)}{4N}t} = p + \sum_{i=1}^{\infty} (2i + 1)\, p(1 - p)\, F(i + 2, 1 - i, 2, p)\,(-1)^i e^{-\frac{i(i+1)}{4N}t}. \tag{8.4.12}$$

Similarly, the frequency of the class in which $A_1$ is lost may be obtained by integrating $\phi(p, 0; \tau)/(4N)$, or more simply, by replacing $p$ with $(1 - p)$ in the above expression for $f(p, 1; t)$. With these expressions, it can be shown that

$$f(p, 0; t) + \Omega_t + f(p, 1; t) = 1$$

and, at the limit of $t = \infty$, we have

$$f(p, 0; \infty) = 1 - p, \qquad \Omega_\infty = 0, \qquad f(p, 1; \infty) = p. \tag{8.4.13}$$
We might very roughly characterize the above process of random genetic drift by the following example. If we start out with 1000 populations, each of size 100 individuals and containing 50% of gene $A_1$, then after about 200 generations $A_1$ is either fixed or lost in roughly 500 populations. In the remaining 500, the distribution of the frequency of $A_1$ is practically flat. When this state is reached, 1/200 of the unfixed populations become fixed for $A_1$ or its allele each generation from that time until eventually every population will be homogeneous for either $A_1$ or its allele.

In his first treatment of the process of random genetic drift, Fisher (1922) used the transformed gene frequency rather than the gene frequency itself. A main reason for this is that if the gene frequency is transformed from $x$ to $\theta$ by the relation (angular transformation)

$$\theta = \cos^{-1}(1 - 2x), \tag{8.4.14}$$
the sampling variance of gene frequency per generation becomes roughly independent of the gene frequency, where $\theta$ in radians changes from 0 to $\pi$ as $x$ changes from 0 to 1. This may be seen as follows: From the above relation, we have

$$\delta\theta = \frac{1}{\sqrt{x(1 - x)}}\,\delta x - \frac{1 - 2x}{4\{ x(1 - x) \}^{3/2}}\,(\delta x)^2 + \cdots, \tag{8.4.15}$$

where $\delta x$ is the amount of change in $x$ per generation and $\delta\theta$ is the corresponding change in $\theta$. Then, if we note that

$$M_{\delta x} = E(\delta x) = 0, \qquad V_{\delta x} = E\{ (\delta x)^2 \} = \frac{x(1 - x)}{2N}, \tag{8.4.16}$$

we obtain, after neglecting higher order terms,

$$M_{\delta\theta} = -\frac{1}{4N}\cot\theta, \qquad V_{\delta\theta} = \frac{1}{2N}. \tag{8.4.17}$$
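The approximations in 8.4.17 can be compared with the exact one-generation moments of $\delta\theta$ under binomial sampling. The following sketch is illustrative only: binomial sampling of $2N$ gametes is assumed, the function names are hypothetical, and agreement is expected only to first order in $1/N$.

```python
import math

def angular_moments(x, n):
    """Exact mean and variance of the one-generation change in
    theta = arccos(1 - 2x) when 2N gametes are sampled binomially."""
    two_n = 2 * n
    theta0 = math.acos(1 - 2 * x)
    mean = 0.0
    second = 0.0
    for k in range(two_n + 1):
        prob = math.comb(two_n, k) * x**k * (1 - x)**(two_n - k)
        d = math.acos(1 - 2 * k / two_n) - theta0
        mean += prob * d
        second += prob * d * d
    return mean, second - mean * mean

def diffusion_moments(x, n):
    """The approximations of 8.4.17: (-cot(theta)/4N, 1/2N)."""
    theta = math.acos(1 - 2 * x)
    return -1.0 / (4 * n * math.tan(theta)), 1.0 / (2 * n)
```

For $x = 0.3$ and $N = 100$ the exact variance is within a few percent of $1/2N = 0.005$, while the exact variance of $\delta x$ itself, $x(1-x)/2N$, would change by a factor of about two between $x = 0.3$ and $x = 0.1$.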
It follows then that if a population starts from a fixed gene frequency $p$, the variance of $\theta$ after $t$ generations is given approximately by

$$V_\theta = \frac{t}{2N}. \tag{8.4.18}$$
We should note here that the above formula for $V_\theta$ is valid only when $t$ is much smaller than $N$. Also, we should note that the expected value of $\delta\theta$ now depends on $\theta$, since in calculating $E(\delta\theta)$ from 8.4.15, the second term on the right-hand side is nonzero and cannot be neglected even if the first term is zero. Fisher (1922) neglected $M_{\delta\theta}$ and wrote the differential equation for the probability distribution using only $V_{\delta\theta} = 1/2N$. This led him to the incorrect result of $1/4N$ as the rate of steady decay. Later (1930) he incorporated $M_{\delta\theta} = -(\cot\theta)/4N$ into the equation to obtain the correct result. Nevertheless, this type of transformation, which makes the sampling variance nearly constant, is rather convenient for treating data on random genetic drift over a relatively short period or if the gene frequency is restricted to a range not very far from 0.5.

So far we have considered a pair of alleles $A_1$ and $A_2$. With more than two alleles the situation is of course more complex, but the same principles apply. Figure 8.4.3 shows the steady-state situation for three alleles. At this state the rate of change from populations with three alleles to populations with two alleles is $3/2N$ per generation. Then each of the populations with two alleles changes to a population with one allele at a rate of $1/2N$, as shown above. At the same time the rate of fixation is $1/2N$ for each of the three alleles, so that every generation $3/2N$ of the populations becomes fixed for one of the three alleles. The extension to more than three alleles follows naturally. The rate of change, when a steady state has been reached, from $k$ alleles to $k - 1$ alleles is $k(k - 1)/4N$ per generation, where $N$ is the effective population number. This result is from Kimura (1955b).

Figure 8.4.3. The distribution for three alleles at a steady state under random drift. (From Kimura, 1955b.)

The problem becomes much more difficult if we consider two linked loci that are segregating simultaneously. Although a theory comparable to that of a single locus has not been developed, the amount of linkage disequilibrium caused by random sampling of gametes in a finite population has been clarified by Hill and Robertson (1968) and also by Ohta and Kimura (1969). Let us assume that a pair of alleles $A_1$ and $A_2$ are segregating in the first locus, and $B_1$ and $B_2$ in the second locus. If we denote by $X_1$, $X_2$, $X_3$, and $X_4$ the respective frequencies of the 4 types of chromosomes $A_1B_1$, $A_1B_2$, $A_2B_1$, and $A_2B_2$, then $D = X_1X_4 - X_2X_3$ represents the amount of linkage disequilibrium. Hill and Robertson (1968) showed that in a small population $E(D^2)$, that is, the mean square of $D$, may become large even if $E(D)$, the mean value of $D$, is 0. Using the method of the moment generating matrix, they obtained analytical expressions for the 3 quantities $E\{x(1 - x)y(1 - y)\}$, $E\{D(1 - 2x)(1 - 2y)\}$, and $E(D^2)$ in the case of no crossing-over, where $x$ and $y$ are respectively the frequencies of $A_1$ and $B_1$ in the first and second loci. Ohta and Kimura (1969) obtained more general expressions for an arbitrary recombination fraction $c$, based on the diffusion models. An interesting property first discovered by Hill and Robertson is that the expectation of $r^2 = D^2/\{x(1 - x)y(1 - y)\}$ settles down quickly in the process of random drift to a constant value which depends only on $N_e c$. Note that $r$ is the correlation coefficient between gene frequencies at two segregating loci. They also inferred from simulation studies that $E(r^2)$ approaches $1/(4N_e c)$ as $N_e c$ increases.
Ohta and Kimura (1969) considered a quantity $\sigma_d^2 = E(D^2)/E\{x(1 - x)y(1 - y)\}$ and showed that it takes a value similar to $E(r^2)$. They obtained an analytical expression for $\sigma_d^2$ and showed that

$$\sigma_d^2 \approx \frac{1}{4N_e c}$$

for a large $N_e c$. It is interesting to note that a relation similar to this holds for the case in which a steady state is reached with recurrent mutations and random genetic drift, as shown by Ohta and Kimura (1969a).

8.5 Change of Gene Frequency Under Linear Pressure and Random Sampling of Gametes
In the previous section, we have considered the process of random drift caused by random sampling of gametes alone. We will now investigate the process in which the effects of mutation and migration are also included. Since the effects of mutation and migration on the rate of change in gene frequency are linear functions of the gene frequency, we may call them collectively linear pressure.

Let us consider a random-mating population of effective size $N_e$ in which a pair of alleles $A_1$ and $A_2$ are segregating with respective frequencies $x$ and $1 - x$. If we suppose that this population exchanges individuals with another population at the rate $m$ per generation, then the rate of change in $x$ due to this cause is

$$m(x_I - x)$$

per generation, where $x_I$ is the frequency of $A_1$ in the immigrants (cf. 6.5.1). Here we will assume that $x_I$ is a constant. This may be a good approximation if the immigrants represent a random sample from the entire species. If mutation rates are not negligible, we may replace $m$ by $m + u + v$ and $mx_I$ by $mx_I + v$, where $u$ and $v$ are respectively the mutation rates of $A_1$ to and from its allele $A_2$. Though the pressure of selection is intrinsically nonlinear, in certain cases, like selection acting in the neighborhood of the equilibrium gene frequency, it may be treated as if it were linear with good approximation. However, the range of applicability is quite restricted.

The change of mean, variance, and the higher moments of the distribution of gene frequency under linear pressure and random sampling of gametes can be worked out by applying the method we used in Chapter 7, Section 4 for the case of random sampling of gametes alone. In the present case we note that

$$E_\delta(\delta x_t) = m(x_I - x_t) \tag{8.5.1}$$

rather than $E_\delta(\delta x_t) = 0$ as in 7.4.5, where $x_t$ is the value of $x$ at the $t$th generation. For the mean, we obtain

$$\mu_1'^{(t+1)} - \mu_1'^{(t)} = -m\left( \mu_1'^{(t)} - x_I \right),$$

which leads to

$$\frac{d\mu_1'^{(t)}}{dt} = -m\left( \mu_1'^{(t)} - x_I \right)$$

for the continuous model. The solution of the above equation gives the mean gene frequency at the $t$th generation:

$$\mu_1'^{(t)} = x_I + (p - x_I)e^{-mt}, \tag{8.5.2}$$
where $p$ is the value of $x$ at $t = 0$. Similarly, we can work out the second and the higher moments step by step. The general formula for the $n$th moment of the gene frequency around the origin is as follows (Crow and Kimura, 1956):

$$\mu_n'^{(t)} = \sum_{i=0}^{\infty} \binom{n}{i} \frac{\Gamma(B + n)\,\Gamma(A + 2i)\,\Gamma(A - B + i)\,\Gamma(A + i - 1)}{\Gamma(A + n + i)\,\Gamma(B + i)\,\Gamma(A - B)\,\Gamma(A + 2i - 1)} \times F(A + i - 1, -i, A - B, 1 - p)\,\exp\left\{ -i\left( m + \frac{i - 1}{4N_e} \right)t \right\}. \tag{8.5.3}$$
In the above equation, $A = 4N_e m$, $B = 4N_e m x_I$ ($1 > x_I > 0$), $p$ is the initial frequency of $A_1$, and $F(\cdot, \cdot, \cdot, \cdot)$ denotes the hypergeometric function.

Since in this case the mean and variance of the rate of change in gene frequency $x$ are respectively

$$M_{\delta x} = m(x_I - x) \tag{8.5.4}$$

and

$$V_{\delta x} = \frac{x(1 - x)}{2N_e}, \tag{8.5.5}$$

the forward equation 8.3.1 becomes

$$\frac{\partial \phi}{\partial t} = \frac{1}{4N_e}\frac{\partial^2}{\partial x^2}\{ x(1 - x)\phi \} - m\frac{\partial}{\partial x}\{ (x_I - x)\phi \}, \tag{8.5.6}$$

where $\phi \equiv \phi(p, x; t)$ is the probability density that the frequency of $A_1$ becomes $x$ at the $t$th generation, given that it is $p$ at $t = 0$. The initial condition for the equation is

$$\phi(p, x; 0) = \delta(x - p). \tag{8.5.7}$$
The moment formula 8.5.3 suggests that the solution to the above equation must have the form

$$\phi(p, x; t) = \sum_{i=0}^{\infty} X_i(x)\,\exp\left\{ -i\left( m + \frac{i - 1}{4N_e} \right)t \right\}. \tag{8.5.8}$$

By comparing

$$\mu_n'^{(t)} = \int_0^1 x^n \phi(p, x; t)\, dx$$

with 8.5.3 we can get the appropriate expression for $\phi$, which turns out to be the pertinent solution of 8.5.6. It is given by 8.5.8, in which

$$X_i(x) = x^{B-1}(1 - x)^{(A - B) - 1}\, F(A + i - 1, -i, A - B, 1 - x) \times F(A + i - 1, -i, A - B, 1 - p) \times \frac{\Gamma(A - B + i)\,\Gamma(A + 2i)\,\Gamma(A + i - 1)}{i!\,\Gamma^2(A - B)\,\Gamma(B + i)\,\Gamma(A + 2i - 1)}. \tag{8.5.9}$$
At $t \to \infty$, our formula 8.5.8 converges to Wright's well-known formula for the steady-state gene frequency distribution under migration:

$$\phi(x) = \frac{\Gamma(A)}{\Gamma(B)\,\Gamma(A - B)}\, x^{B-1}(1 - x)^{A - B - 1}. \tag{8.5.10}$$
Figures 8.5.1a, 8.5.1b, and 8.5.1c show the asymptotic behavior of the distribution curve for three different cases: $4N_e m = 0.2$, $4N_e m = 2$, and $4N_e m = 6$. In all the three cases illustrated the gene frequency, $x_I$, of the immigrants is 0.5 and the initial gene frequency, $p$, of the population is assumed to be 0.2. We will study the nature of the steady-state distribution (8.5.10) in some detail in the next chapter.

In the above treatment, it has been assumed that $x_I$ is neither 0 nor 1. In terms of mutation pressure alone, this corresponds to the case of reversible mutation for which $u > 0$ and $v > 0$. Next, we will investigate the case of irreversible mutation. Let us assume that $A_2$ mutates to $A_1$ at the rate $v$ per generation ($v > 0$) but there is no mutation in the reverse direction ($u = 0$). If $x_t$ is the frequency of $A_1$ at the $t$th generation, then the amount of change in $x_t$ in one generation is

$$\delta x_t = v(1 - x_t) + \xi_t, \tag{8.5.11}$$

where $\xi_t$ is the amount of change due to random sampling of gametes with mean and variance

$$E_\delta(\xi_t) = 0, \qquad E_\delta(\xi_t^2) = \frac{x_t(1 - x_t)}{2N_e}. \tag{8.5.12}$$

Using the same procedure we used to derive 8.5.3, as well as 7.4.37 in the previous chapter, we obtain the following formula for the $n$th moment of the gene frequency distribution about the origin in the $t$th generation:

$$\mu_n'^{(t)} = 1 - (1 - p)\sum_{i=1}^{\infty} (-1)^{i-1}\,\frac{\Gamma(c + n)\,\Gamma(c + i - 1)\,(c + 2i - 1)}{\Gamma(c)\,\Gamma(c + n + i)}\binom{n}{i} \times F(1 - i, i + c, c, p)\,\exp\left\{ -i\left( v + \frac{i - 1}{4N_e} \right)t \right\}, \tag{8.5.13}$$

where $c = 4N_e v$, and $\exp\{\cdot\}$, $\Gamma(\cdot)$, and $F(\cdot, \cdot, \cdot, \cdot)$, respectively, denote the exponential, the gamma, and the hypergeometric functions. In particular, the first two moments are

$$\mu_1'^{(t)} = 1 - (1 - p)e^{-vt} \tag{8.5.14}$$

and

$$\mu_2'^{(t)} = 1 - (1 - p)\,\frac{2c + 2}{c + 2}\,e^{-vt} - (1 - p)\left( p - \frac{c}{c + 2} \right)e^{-2\left( v + \frac{1}{4N_e} \right)t}. \tag{8.5.15}$$
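A useful consistency check on a moment formula of this kind is that at $t = 0$ the $n$th moment must reduce to $p^n$, and that the first moment must follow $1 - (1-p)e^{-vt}$. The sketch below evaluates the series in the form $1 - (1-p)\sum_i (-1)^{i-1}\binom{n}{i}\,\Gamma(c+n)\Gamma(c+i-1)(c+2i-1)/[\Gamma(c)\Gamma(c+n+i)]\;F(1-i, i+c, c, p)\,e^{-i(v+(i-1)/4N_e)t}$. That form is our reading of 8.5.13 and should be treated as an assumption, as are the function names; the binomial coefficient makes the sum terminate at $i = n$.

```python
import math

def hyp_terminating(a, b, c, x):
    """F(a, b, c, x) for a = 0, -1, -2, ... (a finite sum)."""
    term, total = 1.0, 1.0
    for k in range(-a):
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * x
        total += term
    return total

def moment(n, p, t, ne, v):
    """nth moment about the origin under irreversible mutation (c = 4*Ne*v > 0)."""
    c = 4.0 * ne * v
    total = 0.0
    for i in range(1, n + 1):  # binomial(n, i) vanishes for i > n
        log_coef = (math.lgamma(c + n) + math.lgamma(c + i - 1)
                    - math.lgamma(c) - math.lgamma(c + n + i))
        coef = ((-1) ** (i - 1) * math.exp(log_coef)
                * (c + 2 * i - 1) * math.comb(n, i))
        decay = math.exp(-i * (v + (i - 1) / (4.0 * ne)) * t)
        total += coef * hyp_terminating(1 - i, i + c, c, p) * decay
    return 1.0 - (1.0 - p) * total
```

For instance, with $N_e = 100$, $v = 0.0025$ (so $c = 1$) and $p = 0.5$, `moment(3, 0.5, 0.0, 100, 0.0025)` returns $p^3 = 0.125$ up to rounding.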
Figures 8.5.1a,b,c. Asymptotic behavior of distribution curves for a finite population with migration or other linear pressures: (a) 4Nm = 0.2, (b) 4Nm = 2, (c) 4Nm = 6. In all three drawings, the gene frequency of the immigrants is assumed to be 0.5, and the initial frequency in the population 0.2. The abscissa is the gene frequency x; the ordinate is the probability density φ. N: population number. m: rate of migration. (From Crow and Kimura, 1956.)
The gene frequency distribution … However, in the present case ($m = 1$, $l = 0$ in the spheroidal function), the above series gives the right answer to three significant figures even for $c = 3$. This agrees with the statement of Wright and Kerr (1954). Figure 8.6.1.1 gives the relation between $c$ and $2N\lambda_0$. More exact values are listed in Table 8.6.1.1. From Figure 8.6.1.1 it looks as if $2N\lambda_0$ increases linearly with $c$ for large values of $c$, though, having no proof, we are not certain about it. The eigenfunctions $V^{(1)}_{1k}(z)$ corresponding to the $\lambda_k$'s are given by 8.6.1.8. The coefficients $f^k_n$ corresponding to the first three eigenvalues are found in the tables of Stratton et al. (1941, pp. 116, 118, and 120). It will be noted here that for $c = 0$, all the formulae given above reduce to the ones for the case of random drift studied in 8.4.

Figure 8.6.1.1. Relations between rate of steady decay (λ0) and intensity of selection (s) in the process of genic selection in finite populations. N is effective size of population. Abscissa: Ns; ordinate: 2Nλ0. (Kimura, 1955.)
Table 8.6.1.1. Relation between c (= Ne s) and 2Ne λ0. (From Kimura, 1955.)

  c       2Ne λ0          c       2Ne λ0
  0.0     1.00000         3.5     5.43183
  0.5     1.09985         4.0     6.54540
  1.0     1.39765         4.5     7.66121
  1.5     1.88771         5.0     8.75330
  2.0     2.55927         6.0    10.85728
  2.5     3.39445         7.0    12.89983
  3.0     4.36529         8.0    14.91989

Note: In the new table of spheroidal wave functions by Stratton et al. (1956), t is tabulated for c (denoted as g in the table) up to c = 8.0 (pp. 506-508), from which 2Nλ0 can be calculated by the relation
The eigenfunction $V^{(1)}_{10}(z)$ corresponding to the smallest eigenvalue $\lambda_0$ is of special significance, since it gives the frequency distribution of unfixed classes at the state of steady decay when it is multiplied by $e^{c(1 - z)}$. It is expressed by

$$V^{(1)}_{10}(z) = f^0_0\, T^1_0(z) + f^0_2\, T^1_2(z) + f^0_4\, T^1_4(z) + \cdots. \tag{8.6.1.14}$$

The coefficients $f^0_0$, $f^0_2$, $f^0_4$, etc., depend on $c$. When $c = 1.0$, for example, $f^0_0 = 1.0208$, $f^0_2 = 0.013980$, $f^0_4 = 0.000096$, etc. Figure 8.6.1.2 illustrates the distribution at steady decay of unfixed classes for some values of $c$. The area under each curve is adjusted so that it is unity. The case $c = 1.7$ corresponds to the case experimentally studied by Wright and Kerr (1954), and the present result agrees quite well with theirs (including the rate of decay). The rate of fixation of $A_1$ may be calculated from $df(p, 1; t)$ …

$$\int_0^1 x\phi(x)\, dx = x_I. \tag{9.2.6}$$
This agrees with the assumption that in the island model migrants represent a random sample from the whole species. The variance of gene frequency among subgroups is

$$\sigma_x^2 = \frac{\bar{x}(1 - \bar{x})}{4N_e m + 1}, \tag{9.2.7}$$

as first given by Wright (1931). This was derived earlier by elementary methods (see 6.6.6). This can also be derived from the following procedure: Let $x_t$ be the frequency of $A_1$ in the $t$th generation; then

$$x_{t+1} = x_t + \delta x_t, \tag{9.2.8}$$

where $\delta x_t$ has mean and variance given by 9.2.2 and 9.2.3. Using the same procedure we used in the study of pure random drift (compare Section 7.4), we get

$$\mu_1'^{(t+1)} = E(x_{t+1}) = E(x_t + \delta x_t) = \mu_1'^{(t)} + m\left( \bar{x} - \mu_1'^{(t)} \right),$$

so that $\mu_1'^{(\infty)} = \bar{x}$. In addition, we get

$$\mu_2'^{(t+1)} = E(x_{t+1}^2) = E\{ (x_t + \delta x_t)^2 \} \tag{9.2.9}$$

or, neglecting terms in $m^2$,

$$\Delta\mu_2'^{(t)} = 2m\left( \bar{x}\,\mu_1'^{(t)} - \mu_2'^{(t)} \right) + \frac{1}{2N_e}\left( \mu_1'^{(t)} - \mu_2'^{(t)} \right).$$
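For $t \to \infty$, $\Delta\mu_2'^{(\infty)} = 0$, $\mu_1'^{(\infty)} = \bar{x}$, and we obtain

$$\left( 2m\bar{x}^2 + \frac{\bar{x}}{2N_e} \right) - \left( 2m + \frac{1}{2N_e} \right)\mu_2'^{(\infty)} = 0.$$

Thus

$$\sigma_x^2 = \mu_2'^{(\infty)} - \left( \mu_1'^{(\infty)} \right)^2 = \frac{4N_e m\bar{x}^2 + \bar{x}}{4N_e m + 1} - \bar{x}^2 = \frac{\bar{x}(1 - \bar{x})}{4N_e m + 1}, \tag{9.2.10}$$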
which agrees with Wright's formula (9.2.7).

To derive the above formula for variance, we have assumed that the change in gene frequency by migration is small so that in calculating the sampling variance $x(1 - x)/2N_e$, the gene frequency before migration may be used for $x$. On the other hand, if the amount of migration per generation is very large and if the sampling of gametes occurs after migration, it is more accurate to use the frequency after migration for $x$ in calculating the variance. Thus, denoting by $\Delta x$ the amount of deterministic change by migration and by $\xi$ the amount of stochastic change by random sampling of gametes, we have

$$E(\xi_t) = 0 \quad \text{and} \quad E(\xi_t^2) = \frac{X_t(1 - X_t)}{2N_e}, \tag{9.2.11}$$

where $X_t = x_t + \Delta x_t = x_t + m(\bar{x} - x_t) = m\bar{x} + (1 - m)x_t$ is the frequency after migration but before the sampling. Since $x_{t+1} = X_t + \xi_t$, formula 9.2.9 for $\mu_2'$ is modified as

$$\mu_2'^{(t+1)} = E_\phi(X_t^2) + E_\phi\left\{ \frac{X_t(1 - X_t)}{2N_e} \right\} = \left( 1 - \frac{1}{2N_e} \right)E(X_t^2) + \frac{1}{2N_e}E(X_t), \tag{9.2.12}$$

where $E(X_t) = m\bar{x} + (1 - m)\mu_1'^{(t)}$ and $E(X_t^2) = (m\bar{x})^2 + 2m\bar{x}(1 - m)\mu_1'^{(t)} + (1 - m)^2\mu_2'^{(t)}$. At equilibrium, in which $\mu_1'^{(t)} = \bar{x}$ and $\mu_2'^{(t+1)} = \mu_2'^{(t)} = \mu_2'^{(\infty)} = \sigma_x^2 + \bar{x}^2$, we have $E(X_t) = \bar{x}$ and $E(X_t^2) = \bar{x}^2 + (1 - m)^2\sigma_x^2$, and therefore 9.2.12 yields

$$\sigma_x^2 + \bar{x}^2 = \left( 1 - \frac{1}{2N_e} \right)\left[ \bar{x}^2 + (1 - m)^2\sigma_x^2 \right] + \frac{\bar{x}}{2N_e}$$

or

$$\sigma_x^2 = \frac{\bar{x}(1 - \bar{x})}{2N_e\left[ 1 - (1 - m)^2\left( 1 - \dfrac{1}{2N_e} \right) \right]}. \tag{9.2.13}$$
D I ST R I B UTION OF G EN E F R EQUENCIES IN P O P U LATIO N S 441
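The relation between the two variance formulas can be checked numerically. The following is a minimal sketch, not part of the original treatment: it iterates the moment recursion 9.2.12 until it settles and compares the result with 9.2.10 and 9.2.13. The function names, the starting frequency `x0`, and the number of generations are illustrative assumptions.

```python
def island_moments(x_bar, m, n_e, generations=20000, x0=0.1):
    """Iterate the first two moments under 9.2.12 (sampling after migration)."""
    mu1, mu2 = x0, x0 * x0  # assumed start: every subgroup at frequency x0
    for _ in range(generations):
        # Moments of X_t, the frequency after migration but before sampling
        e_x = m * x_bar + (1 - m) * mu1
        e_x2 = ((m * x_bar) ** 2
                + 2 * m * x_bar * (1 - m) * mu1
                + (1 - m) ** 2 * mu2)
        mu1 = e_x
        mu2 = (1 - 1 / (2 * n_e)) * e_x2 + e_x / (2 * n_e)  # formula 9.2.12
    return mu1, mu2 - mu1 ** 2  # mean and variance among subgroups


def variance_approx(x_bar, m, n_e):
    """Formula 9.2.10 (small migration rate)."""
    return x_bar * (1 - x_bar) / (4 * n_e * m + 1)


def variance_improved(x_bar, m, n_e):
    """Formula 9.2.13 (sampling variance evaluated after migration)."""
    return x_bar * (1 - x_bar) / (
        2 * n_e * (1 - (1 - m) ** 2 * (1 - 1 / (2 * n_e))))


if __name__ == "__main__":
    for m in (0.01, 0.5):
        _, var = island_moments(0.3, m, 100)
        print(m, var, variance_improved(0.3, m, 100), variance_approx(0.3, m, 100))
```

With a small migration rate the two formulas should differ by about one percent or less, while a rate such as m = 0.5 separates them clearly.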
Unless an unusually large value is assigned to $m$, the difference between this improved formula, 9.2.13, and the approximate formula, 9.2.10, is negligible.

Although we have so far assumed migration only, the above treatment may be adapted immediately to include reversible mutations. If there is mutation from $A_1$ to its allele $A_2$ at the rate of $u$ per generation and also reverse mutation from $A_2$ to $A_1$ at the rate $v$, then the rate of change in $x$ due to these mutations is $-ux + v(1-x)$. Thus 9.2.2 becomes

$$M_{\delta x} = -[m(1 - x_I) + u]x + (m x_I + v)(1 - x). \tag{9.2.14}$$

This means that, in order to include mutations, we simply replace $m$ by $m + u + v$ and $m x_I$ by $m x_I + v$ in the various formulae.

Going back to distribution 9.2.5, let us now investigate the frequencies of fixed classes, or the probabilities that allele $A_1$ is temporarily fixed or lost in the subgroup. For this, we note first that although the constant $C$ in 9.2.4 was determined by the condition

$$\int_0^1 \phi(x)\,dx = 1, \tag{9.2.15}$$
from which 9.2.5 was derived, the study of the stochastic process involved (see formula 8.5.9) shows that $C$ is intrinsically determined by the process and is not a constant determined by the arbitrary statistical procedure of making its integral equal to unity. That is to say, condition 9.2.15 follows directly from the nature of the stochastic process. Thus it looks as if in the present treatment no probability mass is left to represent the fixed classes. It turns out, however (cf. Kimura, 1968), that the frequency $f(0)$ that $A_1$ is temporarily lost from the population is given by

$$f(0) = \int_0^{1/2N} \phi(x)\,dx. \tag{9.2.16}$$

Thus,

$$f(0) \approx \frac{\Gamma(4N_e m)}{\Gamma(4N_e m x_I + 1)\,\Gamma(4N_e m(1 - x_I))}\left(\frac{1}{2N}\right)^{4N_e m x_I} \tag{9.2.17}$$

approximately. This may also be derived by considering the balance between mutation and random extinction at the subterminal class, as suggested by Wright (1931). Similarly, the frequency $f(1)$ that $A_1$ is temporarily fixed in the population ($A_2$ temporarily lost) is

$$f(1) = \int_{1 - 1/2N}^{1} \phi(x)\,dx \approx \frac{\Gamma(4N_e m)}{\Gamma(4N_e m x_I)\,\Gamma(4N_e m(1 - x_I) + 1)}\left(\frac{1}{2N}\right)^{4N_e m(1 - x_I)}. \tag{9.2.18}$$
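As an illustration of formulas 9.2.17 and 9.2.18, the following minimal sketch evaluates both approximations with log-gamma functions, which avoids overflow when $4N_e m$ is large. The function name and the argument names are assumptions made for this example only.

```python
import math


def fixed_class_frequencies(n_e, m, x_i, n):
    """Approximate f(0) and f(1) from formulas 9.2.17 and 9.2.18.

    n_e: effective size, m: migration rate, x_i: gene frequency among
    immigrants, n: actual size of the subgroup (all hypothetical inputs).
    """
    a = 4 * n_e * m * x_i          # 4 N_e m x_I
    b = 4 * n_e * m * (1 - x_i)    # 4 N_e m (1 - x_I)
    log_eps = -math.log(2 * n)     # log(1/2N)
    log_f0 = (math.lgamma(a + b) - math.lgamma(a + 1) - math.lgamma(b)
              + a * log_eps)
    log_f1 = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b + 1)
              + b * log_eps)
    return math.exp(log_f0), math.exp(log_f1)


if __name__ == "__main__":
    print(fixed_class_frequencies(n_e=100, m=0.001, x_i=0.2, n=100))
```

A simple sanity check is the case $4N_e m x_I = 4N_e m(1 - x_I) = 1$, where distribution 9.2.5 is uniform and both frequencies reduce to $1/2N$.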
9.3 Distribution of Gene Frequencies Under Selection and Reversible Mutation

Let us assume a pair of alleles $A_1$ and $A_2$ with respective frequencies $x$ and $1 - x$ in a random-mating population of effective size $N_e$. Let $u$ be the mutation rate from $A_1$ to $A_2$ and let $v$ be the mutation rate in the reverse direction. Then the rate of frequency change of $A_1$ by mutation is

$$-ux + v(1 - x).$$

If $\bar{a}$ is the average fitness of the population measured in Malthusian parameters, then the rate of change in gene frequency owing to natural selection is given by

$$\frac{x(1-x)}{2}\frac{d\bar{a}}{dx}.$$

Note that if the fitness is measured in selective values ($W$), the corresponding rate of change is given by Wright's formula

$$\frac{x(1-x)}{2\bar{W}}\frac{d\bar{W}}{dx},$$

where $\bar{W}$ is the mean selective value of the population (cf. 5.2.12). Combining the expressions for mutation and selection, the mean rate of change in $x$ per generation is

$$M_{\delta x} = -ux + v(1 - x) + \frac{x(1-x)}{2}\frac{d\bar{a}}{dx}. \tag{9.3.1}$$

For a population of effective size $N_e$, the variance in $\delta x$ due to random sampling of gametes is

$$V_{\delta x} = \frac{x(1-x)}{2N_e}.$$

Therefore

$$2\int \frac{M_{\delta x}}{V_{\delta x}}\,dx = 2N_e\bar{a} + 4N_e v\log_e x + 4N_e u\log_e(1 - x) + \text{const.},$$

and we obtain from 9.1.3 the distribution formula for the frequency of $A_1$:

$$\phi(x) = C e^{2N_e\bar{a}}\,x^{4N_e v - 1}(1 - x)^{4N_e u - 1}. \tag{9.3.2}$$

In the simplest case of genic selection in which $A_1$ has selective advantage $s$ over $A_2$ we have

$$\bar{a} = 2s x^2 + s \cdot 2x(1 - x) = 2sx,$$

if we assign relative fitnesses $2s$, $s$, and $0$ to $A_1A_1$, $A_1A_2$, and $A_2A_2$ respectively. Then 9.3.2 reduces to

$$\phi(x) = C e^{4N_e s x}\,x^{4N_e v - 1}(1 - x)^{4N_e u - 1}. \tag{9.3.3}$$

In Figures 9.3.1a, b, and c the frequency distribution of a deleterious gene ($s < 0$) in a small population (a), i.e., $N_e = 1/40u$, an intermediate population (b), i.e., $N_e = 10/40u$, and a large population (c), i.e., $N_e = 100/40u$, is illustrated for three levels of selection intensities, assuming $u = v$. In each figure, curves with solid, broken, and dotted lines represent cases with $s = -u/10$, $-u$, and $-10u$, respectively. As will be seen from Figure 9.3.1a, fixation or loss of alleles predominates in a small population and selection is not very effective in determining the gene frequency distribution. On the other hand, in a large population (c) the gene frequency tends to cluster around the equilibrium value and a small change in selection intensity may lead to a marked change in the distribution.

Figures 9.3.1a,b,c. Graphs showing the frequency distribution of a deleterious gene ($s < 0$) in a small (a), an intermediate (b), and a large (c) population, assuming equal mutation rates in both directions ($u = v$). In each figure, curves with solid, broken, and dotted lines represent cases with $s = -u/10$, $-u$, and $-10u$, respectively. (From Wright, 1937.)

In the more general case of zygotic selection, if the relative fitnesses of three genotypes $A_1A_1$, $A_1A_2$, and $A_2A_2$ are $s$, $sh$, and $0$, respectively, measured in Malthusian parameters, then

$$\bar{a} = s x^2 + 2shx(1 - x)$$

and the distribution of $x$ is given by

$$\phi(x) = C e^{2N_e s x^2 + 4N_e s h x(1 - x)}\,x^{4N_e v - 1}(1 - x)^{4N_e u - 1}. \tag{9.3.4}$$

Figures 9.3.2a,b,c. Graphs showing the distribution of a completely recessive deleterious gene ($s < 0$, $h = 0$, $u = v$). Effective size $N_e = 1/40u$, $10/40u$, and $100/40u$ in a, b, and c, respectively. In these figures the solid line represents the least selection ($s = -u/5$), the broken line selection 10 times as intense, and the dotted line selection 100 times as intense. (From Wright, 1937.)
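Distribution 9.3.4 is easy to evaluate on a grid. The sketch below is an illustration rather than the procedure used for the published figures: it normalizes the density numerically with a midpoint rule, which is adequate when $4N_e u$ and $4N_e v$ exceed 1 but only rough for the U-shaped cases, where the density is unbounded near 0 and 1. The function names and the grid size are assumptions. The genic case 9.3.3 with advantage $s$ corresponds to passing `2 * s` and `h = 0.5`.

```python
import math


def log_phi_unnormalized(x, n_e, s, h, u, v):
    """Logarithm of formula 9.3.4 without the constant C."""
    selection = 2 * n_e * s * x * x + 4 * n_e * s * h * x * (1 - x)
    return (selection
            + (4 * n_e * v - 1) * math.log(x)
            + (4 * n_e * u - 1) * math.log(1 - x))


def density_on_grid(n_e, s, h, u, v, points=2000):
    """Return grid midpoints and the numerically normalized density."""
    xs = [(i + 0.5) / points for i in range(points)]  # avoids x = 0 and x = 1
    logs = [log_phi_unnormalized(x, n_e, s, h, u, v) for x in xs]
    peak = max(logs)                                   # guards against overflow
    weights = [math.exp(value - peak) for value in logs]
    total = sum(weights) / points                      # midpoint-rule integral
    return xs, [w / total for w in weights]


if __name__ == "__main__":
    u = 1e-5
    # Large population of Figure 9.3.1c with genic selection s = -u
    xs, dens = density_on_grid(n_e=100 / (40 * u), s=2 * (-u), h=0.5, u=u, v=u)
    print(xs[dens.index(max(dens))])  # location of the mode
```

For this large-population example the mode falls somewhat below one half, reflecting selection against the gene.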
Figures 9.3.2a, b, and c illustrate the distribution of completely recessive deleterious genes ($s < 0$, $h = 0$) for three different population sizes and for three levels of selection intensity.

If there is overdominance between alleles, it is more convenient to express the relative fitnesses of the three genotypes $A_1A_1$, $A_1A_2$, and $A_2A_2$ as $-s_1$, $0$, and $-s_2$, so that

$$\bar{a} = -s_1 x^2 - s_2(1 - x)^2.$$

Then the distribution formula for $x$ becomes

$$\phi(x) = C e^{-2N_e s_1 x^2 - 2N_e s_2(1 - x)^2}\,x^{4N_e v - 1}(1 - x)^{4N_e u - 1}. \tag{9.3.5}$$

Using a distribution formula equivalent to 9.3.4 or 9.3.5, Nei and Imaizumi (1966a) investigated the amount of genetic variation expected in a finite population at equilibrium.

9.4 Distribution of Lethal Genes
If gene $A_1$ is lethal, its distribution will be restricted to the range of very low frequencies and mutation from the lethal gene $A_1$ to its normal allele $A_2$ may be neglected. Let $x$ be the frequency of the lethal gene $A_1$. Suppose that $A_1$ is completely recessive to its allele ($A_2$) so that the selective values of three genotypes are $W_{11} = 0$ and $W_{12} = W_{22} = 1$. Then the rate of change in $x$ by selection is

$$-\frac{x^2}{1 + x}.$$

The rate of change by mutation from $A_2$ to $A_1$ is $v(1 - x)$. Thus, we can take

$$M_{\delta x} = v(1 - x) - \frac{x^2}{1 + x} \tag{9.4.1}$$

as the mean of $\delta x$ per generation. Using the binomial variance as before,

$$V_{\delta x} = \frac{x(1 - x)}{2N_e}, \tag{9.4.2}$$

9.1.3 yields

$$\phi(x) = C(1 - x^2)^{2N_e}\,x^{4N_e v - 1}(1 - x)^{-1}, \tag{9.4.3}$$

where constant $C$ is determined from 9.1.4 as

$$C = \frac{2}{B(2N_e,\, 2N_e v) + B(2N_e,\, 2N_e v + \tfrac{1}{2})}, \tag{9.4.4}$$

where $B(\cdot\,,\cdot)$ designates the beta function. Figure 9.4.1 shows the probability distribution of $x$ among populations containing at least one lethal gene for various effective population sizes, $N_e$.

Figure 9.4.1. Probability distribution of the frequency of a lethal gene for various population sizes. The mutation rate ($v$) is assumed to be $10^{-5}$ per generation. Abscissa: Frequency of the lethal gene in the population. Ordinate: Probability density. (From Wright, 1937.)

Since $x$ is practically restricted to a very small positive value, we may substitute

$$\phi(x) = C e^{-2N_e x^2}\,x^{4N_e v - 1} \tag{9.4.5}$$

for 9.4.3. This formula can also be obtained by using

$$M_{\delta x} = v - x^2 \tag{9.4.6}$$

instead of formula 9.4.1. The constant $C$ is then

$$C = \frac{2(2N_e)^{2N_e v}}{\Gamma(2N_e v)}, \tag{9.4.7}$$

approximately. The mean of the distribution is easily found and we obtain

$$\bar{x} = \frac{\Gamma(2N_e v + \tfrac{1}{2})}{\sqrt{2N_e}\;\Gamma(2N_e v)}. \tag{9.4.8}$$

At the limit of $N_e \to \infty$, $\bar{x}$ becomes $\sqrt{v}$. This is the value expected in an infinitely large random-mating population and it can also be obtained from

$$M_{\delta x} = v - x^2 = 0.$$

On the other hand, if $N_e v$ tends to 0,

$$\bar{x} \approx v\sqrt{2\pi N_e}. \tag{9.4.9}$$

It may be seen easily that the expected frequency of lethal genes is much lower in a small than in a large population. In Figure 9.4.2, mean lethal frequency in a population is plotted as a function of the effective population number, assuming $v = 10^{-5}$. The figure shows clearly that for the well-known relation $\bar{x} = \sqrt{v}$ to be valid, the population number has to be hundreds of thousands, as pointed out by Robertson (1962).

Figure 9.4.2. Average frequency of a completely recessive lethal gene in populations of various sizes where the mutation rate, $v$, is $10^{-5}$. Abscissa: effective population number $N_e$.
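The dependence of the mean lethal frequency on population size can be tabulated directly from 9.4.8. The following minimal sketch assumes the gamma-function form of the mean given there and uses log-gamma values so that very large $N_e$ causes no overflow; the function name is an illustrative choice.

```python
import math


def mean_lethal_frequency(n_e, v):
    """Mean of distribution 9.4.5, as given by formula 9.4.8."""
    z = 2 * n_e * v
    ratio = math.exp(math.lgamma(z + 0.5) - math.lgamma(z))  # Gamma(z+1/2)/Gamma(z)
    return ratio / math.sqrt(2 * n_e)


if __name__ == "__main__":
    v = 1e-5
    for n_e in (1e2, 1e3, 1e4, 1e5, 1e6, 1e9):
        small_limit = v * math.sqrt(2 * math.pi * n_e)   # formula 9.4.9
        print(n_e, mean_lethal_frequency(n_e, v), small_limit, math.sqrt(v))
```

The small-population values should follow 9.4.9 closely, and the mean should approach $\sqrt{v}$ only for very large $N_e$, in line with Robertson's remark.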
If the lethal gene ($A_1$) is not completely recessive in fitness but is slightly deleterious in the heterozygote, as is the case for the majority of lethal genes in Drosophila, we should take

$$M_{\delta x} = v - hx(1 - f) - \{x^2(1 - f) + xf\} \approx v - hx - x^2 - xf \tag{9.4.10}$$

as the mean change of the lethal frequency per generation, still assuming that $x$ is small. Here $f$ denotes the inbreeding coefficient and we assume that it is small and constant. Using the approximate expression $V_{\delta x} = x/2N_e$ for variance, the formula for the frequency distribution becomes

$$\phi(x) = C e^{-2N_e x^2 - 4N_e(h + f)x}\,x^{4N_e v - 1}. \tag{9.4.11}$$

In this case, if the mutation rate and the heterozygous disadvantage are around $v = 10^{-5}$ and $h = 1/40$, respectively, as observed for the majority of lethal genes in Drosophila, the mean gene frequency is little influenced by the effective population number and is roughly equal to $v/(h + f)$.

More generally, if $(h + f) \gg \sqrt{v}$, the selection against lethal genes is primarily exerted in the heterozygous state so that 9.4.11 may be replaced by a simpler formula,

$$\phi(x) = C e^{-4N_e(h + f)x}\,x^{4N_e v - 1}, \tag{9.4.12}$$

where

$$C = \frac{[4N_e(h + f)]^{4N_e v}}{\Gamma(4N_e v)};$$

that is, the distribution is approximated by the gamma distribution with mean and variance

$$\bar{x} = \frac{v}{h + f} \qquad\text{and}\qquad \sigma_x^2 = \frac{v}{4N_e(h + f)^2}.$$

In experimental studies of lethal genes, the directly observable quantity is the frequency of lethal-bearing chromosomes rather than individual lethal genes. So, let $X_1$ be the proportion of chromosomes carrying one or more lethal genes, and let $X_0 = 1 - X_1$. Then, assuming independent distribution of lethal genes at different loci,

$$X_0 = e^{-\sum_i x_i},$$

where $x_i$ is the frequency of the lethal gene at the $i$th locus. Thus,

$$Q_1 = -\log_e X_0 = \sum_i x_i$$

has mean and variance

$$\bar{Q}_1 = \frac{U}{h + f} \qquad\text{and}\qquad \sigma_{Q_1}^2 = \frac{U}{4N_e(h + f)^2},$$

if we assume that $(h + f)$ is the same for all relevant loci. In the above formulae, $U = \sum_i v_i$ is the lethal mutation rate per chromosome. As pointed out by Nei (1968a), since the frequency of lethal genes at each locus follows a gamma distribution, their sum $Q_1$ is again distributed as a gamma distribution, so that we have

$$\phi(Q_1) = C e^{-4N_e(h + f)Q_1}\,Q_1^{4N_e U - 1},$$

where

$$C = \frac{\{4N_e(h + f)\}^{4N_e U}}{\Gamma(4N_e U)}.$$

Figure 9.4.3 illustrates this frequency distribution for various population sizes, assuming $U = 0.005$, $h = 0.025$, and $f = 0$.

Figure 9.4.3. Frequency distribution of lethal chromosomes ($Q_1$). Here it is assumed that each lethal gene has a heterozygous selective disadvantage of 2.5% and that the lethal mutation rate per chromosome is 0.5% ($U = 0.005$, $h = 0.025$). The number beside each curve represents the effective population size. (From Nei, 1968.)
For more details on the distribution of lethal genes, readers may refer to Nei (1968a).

9.5 Effect of Random Fluctuation in Selection Intensity on the Distribution of Gene Frequencies

In the preceding sections, we have considered only those cases in which the sole factor producing random fluctuation of gene frequencies is the random sampling of gametes. However, as pointed out in Section 8.7, it is probable that random fluctuation in selection intensity is also at work in determining the gene frequency distribution in natural populations. In this section, we will investigate this effect for the case of no dominance, that is, the heterozygote having fitness midway between the two homozygotes. For a more detailed treatment of the subject, see Kimura (1955).

We will express the selective advantage of $A_1$ over $A_2$ by $s$, and assume that $s$ is a random variable with mean $\bar{s}$ and variance $V_s$. Let $u$ and $v$ be the mutation rates as defined in 9.3. Then the mean and variance of the rate of change in gene frequency $x$ are given by

$$M_{\delta x} = \bar{s}x(1 - x) - ux + v(1 - x),$$
$$V_{\delta x} = V_s x^2(1 - x)^2 + \frac{x(1 - x)}{2N_e}. \tag{9.5.1}$$

Substituting these in 9.1.3, we get

$$\phi(x) = C x^{4N_e v - 1}(1 - x)^{4N_e u - 1}(\lambda_1 - x)^{4N_e A - 1}(x - \lambda_2)^{4N_e B - 1}, \tag{9.5.2}$$

where

$$A = \frac{-\bar{s}/(2N_e V_s) - u\lambda_1 + v\lambda_2}{\lambda_1 - \lambda_2}, \qquad B = \frac{\bar{s}/(2N_e V_s) + u\lambda_2 - v\lambda_1}{\lambda_1 - \lambda_2},$$

and

$$\lambda_1 = \frac{1}{2}\left(1 + \sqrt{1 + \frac{2}{N_e V_s}}\right)\ (> 1) \qquad\text{and}\qquad \lambda_2 = \frac{1}{2}\left(1 - \sqrt{1 + \frac{2}{N_e V_s}}\right)\ (< 0).$$
In order to compare the relative effect resulting from the two different random factors, i.e., the random sampling of gametes and random fluctuation of selection intensities, we will study in some detail the symmetrical case where $\bar{s} = 0$ and $u = v$. With these assumptions, 9.5.2 reduces to

$$\phi(x) = C\,\frac{[x(1 - x)]^{4N_e u - 1}}{[(\lambda_1 - x)(x - \lambda_2)]^{4N_e u + 1}}, \tag{9.5.3}$$

where $(\lambda_1 - x)(x - \lambda_2) = x(1 - x) + 1/(2N_e V_s)$.

If the population size is small such that $4N_e u$ is much less than 1, the genes are fixed most of the time and the distribution curves for unfixed classes are U-shaped, as shown in Figure 9.5.1a, where $N_e = 10^3$. If the variance of $s$ is $0.1/2N_e$ or $5 \times 10^{-5}$, the effect of the fluctuation is so small that, when drawn on the graph, the curve is indistinguishable from the one with no fluctuation. With $V_s = 10/2N_e$ or $5 \times 10^{-3}$ the effect is still not striking; the frequencies of the subterminal classes merely rise by about 33% compared with the case of $V_s = 0$. With $V_s = 100/2N_e$ or 0.05, the subterminal classes become about 2.2 times as high as in the case where $V_s = 0$. $V_s = 0.05$ is a high fluctuation since $\sigma_s = \sqrt{V_s} = 0.2236$. Thus for small populations the effect of random fluctuation of selection is rather unimportant.

Figure 9.5.1b shows the distribution of gene frequencies in intermediate populations such that $4N_e u = 1$. If $N_e = N$, this means that one mutation appears every two generations on the average. For example, if we assume a mutation rate of one in one hundred thousand ($10^{-5}$), $N_e$ should be 25,000, that is, 25 times as large as before. In such populations, if there is no fluctuation, there will be a flat distribution. That is, all the heterallelic classes are equally probable. Here the effect of the fluctuation may not be negligible. With $V_s = 10/2N_e$ or $2 \times 10^{-4}$, the distribution curve becomes U-shaped and the heights of the subterminal classes are about 4.9 times the heights of those where $V_s = 0$. With $V_s = 100/2N_e$ or 0.002, the distribution resembles that of a very small population ($4N_e u \approx 0$), the frequency of the subterminal classes rising about 48 times as high as when $V_s = 0$.

In Figure 9.5.1c the population size is assumed to be twice as large as before ($4N_e u = 2$), so that, if $N = N_e$, one new mutation is expected per generation. With a mutation rate of $u = 10^{-5}$, $N$ should be 50,000. In this case it can be shown that if $V_s$ does not exceed $1/N_e$ or $2 \times 10^{-5}$, the distribution curve is unimodal. On the other hand, if $V_s$ exceeds this value, the curve becomes bimodal. As is shown in the figures, if $V_s = 10/2N$ or $10^{-4}$, the gene frequencies giving the two modes of the distribution are about 0.053 and 0.947, respectively. With $V_s = 100/2N$ or $10^{-3}$, the modal gene frequencies are about 0.5% and 99.5%, and the modal classes are about 104 times as frequent as the class with 50% gene frequency. Therefore in the last case the distribution curve looks as if it were U-shaped on the figure. However, the class frequencies fall sharply outside these modes and the frequencies of terminal classes are 0.
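The modal frequencies quoted for the symmetrical case can be reproduced from 9.5.3. The sketch below is an illustration under stated assumptions: it writes the density as a function of $y = x(1-x)$ with $k = 1/(2N_e V_s)$, so that $\phi \propto y^{\theta - 1}/(y + k)^{\theta + 1}$ with $\theta = 4N_e u$, and locates the interior maximum by setting the derivative of $\log\phi$ with respect to $y$ to zero, which gives $y^* = (\theta - 1)k/2$. The function name and return convention are hypothetical.

```python
import math


def symmetric_modes(n_e, u, v_s):
    """Modes of 9.5.3 for s_bar = 0, u = v, assuming 4*N_e*u > 1 and V_s > 0.

    Returns (x_low, x_high, ratio), where ratio is the density at a mode
    divided by the density at x = 1/2, or None when the curve is unimodal.
    """
    theta = 4 * n_e * u
    k = 1 / (2 * n_e * v_s)
    y_star = (theta - 1) * k / 2          # stationary point in y = x(1 - x)
    if theta <= 1 or y_star >= 0.25:      # y cannot exceed 1/4 on (0, 1)
        return None

    def phi(y):                           # unnormalized density as a function of y
        return y ** (theta - 1) / (y + k) ** (theta + 1)

    root = math.sqrt(1 - 4 * y_star)
    return (1 - root) / 2, (1 + root) / 2, phi(y_star) / phi(0.25)


if __name__ == "__main__":
    print(symmetric_modes(50_000, 1e-5, 1e-4))   # V_s = 10/2N
    print(symmetric_modes(50_000, 1e-5, 1e-3))   # V_s = 100/2N
```

For $4N_e u = 2$ the condition $y^* < 1/4$ is equivalent to $V_s > 1/N_e$, which is the threshold between unimodal and bimodal curves stated in the text.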
Figures 9.5.1a,b,c. Distribution of gene frequencies under random fluctuation of selection intensity, with $u = v$ and $\bar{s} = 0$: (a) $4N_e u \ll 1$, $N_e = 10^3$; (b) $4N_e u = 1$; (c) $4N_e u = 2$. Curves are drawn for $V_s = 0$, $10/2N_e$, and $100/2N_e$. Abscissa: gene frequency (0 to 1.0).

and let $X$ be its frequency after mutation. Then

$$X = x + u_1(1 - Kx),$$

where $u_1 = u/(K - 1)$. If we assume that random sampling of gametes is carried out after the deterministic change of gene frequency, then the change of gene frequency due to random sampling, which we denote by $\xi$, has mean and variance

$$E(\xi) = 0, \qquad E(\xi^2) = \frac{X(1 - X)}{2N_e}.$$
In the above formula, $N_e$ is the "variance" effective number of the population. Since the frequency of $A_1$ in the next generation is

$$x' = X + \xi = u_1 + (1 - u_1K)x + \xi,$$

if we denote by $\mu_1'$ and $\mu_2'$ the first and second moments of the frequency distribution of $A_1$ at equilibrium, we have

$$\mu_1' = E(x') = u_1 + (1 - u_1K)\mu_1', \tag{9.6.4}$$

$$\mu_2' = E(x'^2) = E\{(X + \xi)^2\} = E\left\{X^2 + \frac{X(1 - X)}{2N_e}\right\} = \frac{1}{2N_e}\{u_1 + (1 - u_1K)\mu_1'\} + \left(1 - \frac{1}{2N_e}\right)\{u_1^2 + 2u_1(1 - u_1K)\mu_1' + (1 - u_1K)^2\mu_2'\}. \tag{9.6.5}$$

Thus, from 9.6.4, we obtain

$$\mu_1' = \frac{1}{K}. \tag{9.6.6}$$

Substituting $1/K$ for $\mu_1'$ in 9.6.5, we obtain

$$\mu_2' = \frac{1 + (2N_e - 1)(2u_1 - u_1^2K)}{K\{2N_e - (2N_e - 1)(1 - u_1K)^2\}}, \tag{9.6.7}$$

where $u_1 = u/(K - 1)$. The average homozygosity or the sum of squares of the allelic frequencies may be expressed in terms of $\mu_2'$ as follows:

$$H_0 = E\left(\sum_i x_i^2\right) = K\mu_2', \tag{9.6.8}$$

in agreement with 9.6.3 when very small quantities are ignored. The effective number of alleles $n_e$ is equal to the reciprocal of $H_0$ and therefore $n_e = 1/(K\mu_2')$. If the number of allelic states is indefinitely large, $K = \infty$ so that $u_1 = 0$, $u_1K = u$, and $u_1^2K = 0$ in the above formulae and we obtain

$$n_e = 2N_e - (2N_e - 1)(1 - u)^2. \tag{9.6.9}$$

Note that $N_e$ here represents the variance effective number of the population rather than the inbreeding effective number that appears in 9.6.2. Since the mutation rate is usually very much smaller than unity ($u \ll 1$), 9.6.9 reduces to

$$n_e = 4N_e u + 1, \tag{9.6.10}$$

in good agreement with 9.6.2.
The problem of finding the distribution of the allelic frequencies is more difficult but was solved by Kimura and Crow (1964). For neutral alleles, the steady-state distribution has a rather simple form:

$$\Phi(x) = 4M(1 - x)^{4M - 1}x^{-1} \tag{9.6.11}$$

(Kimura and Crow, 1964), where $M = N_e u$. $N_e$ stands for the effective population number. In Chapter 7 (cf. Table 7.6.5.1) we showed that the two definitions of effective population number are equivalent if the population is neither increasing nor decreasing. In much of the discussion to follow, this assumption is made, so it is not necessary to distinguish between the two effective numbers.

In this distribution $\Phi(x)\,dx$ gives an approximation to the expected number of alleles whose frequencies in the population lie within the range $x$ to $x + dx$ ($0 < x < 1$). It is important to note here that we are considering a frequency distribution, within a single population, of various alleles having a different number of representatives, but we are not considering a probability distribution for any particular allele.

Using the above distribution, $f$ may be obtained by computing the sum of squares of allelic frequencies

$$f = \int_0^1 x^2\Phi(x)\,dx = \frac{1}{4N_e u + 1},$$

which agrees with the result obtained in 9.6.1. Thus

$$n_e = \frac{1}{f} = 4N_e u + 1. \tag{9.6.12}$$

This shows that the effective number of alleles is determined solely by the effective population number and is independent of the actual population number.

The average (actual) number of alleles ($n_a$) can be obtained by summing the expected number of alleles in each frequency class from $1/2N$ to 1. Thus

$$n_a = \int_{1/2N}^{1}\Phi(x)\,dx = 4N_e u\int_{1/2N}^{1}(1 - x)^{4N_e u - 1}x^{-1}\,dx. \tag{9.6.13}$$
This shows that the average number of alleles within a population depends both on the effective and the actual population numbers. Therefore $n_a$ may be much more difficult to estimate from a sample than $n_e$, and less informative as to genetic variability.

Figure 9.6.1 illustrates the result of a Monte Carlo experiment on the number of neutral isoalleles in a small population consisting of 50 males and 50 females ($N = 100$), of which only 25 males and 25 females actually participate in breeding ($N_e = 50$). The simulation experiment was carried out using the IBM 7090 computer by generating pseudo-random numbers. In each generation, 100 male and 100 female gametes were randomly chosen from 25 breeding males and 25 breeding females to form the next generation. Mutation to a new (not preexisting) allele was induced in each gamete with probability 0.005 prior to the formation of zygotes ($u = 0.005$). The initial population was set up such that it contained 200 different alleles. Outputs of both average and effective numbers of alleles were given at 50-generation intervals starting with generation 100, and the experiment was continued until generation 1200. The balance between mutation and random extinction of alleles was reached well before generation 100. The averages of 23 outputs are $\bar{n}_a = 5.522$ and $\bar{f} = 0.520$ or $n_e = 1.923$. These should be compared with the theoretical values obtained from equations 9.6.13 and 9.6.12, setting $N = 100$, $N_e = 50$, and $u = 0.005$, that is,

$$n_a = \int_{1/200}^{1}x^{-1}\,dx = \log_e 200 \approx 5.298$$

and

$$n_e = 4N_e u + 1 = 2.$$

Figure 9.6.1. A result of Monte Carlo experiments on the number of neutral isoalleles. In this experiment actual population number $N = 100$, effective population number $N_e = 50$, and mutation rate $u = 0.005$. The average number of alleles ($n_a$) and the effective number of alleles ($n_e$) are plotted by round and square dots respectively. Horizontal lines, solid and broken, represent corresponding theoretical values. Abscissa: generation (0 to 1200). For details, see text.
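The theoretical values quoted above follow from 9.6.12 and 9.6.13 and can be recomputed for other parameter choices. The sketch below is only an illustration: it evaluates the integral in 9.6.13 after the substitution $t = \log_e x$, using a midpoint rule. The function names and the step count are assumptions, and for $4N_e u < 1$ the integrand is unbounded near $x = 1$, so the numerical value is then only approximate.

```python
import math


def effective_number_of_alleles(n_e, u):
    """Formula 9.6.12: n_e(alleles) = 4 N_e u + 1."""
    return 4 * n_e * u + 1


def average_number_of_alleles(n, n_e, u, steps=200_000):
    """Formula 9.6.13 by midpoint integration in t = log x."""
    theta = 4 * n_e * u
    lower = -math.log(2 * n)            # t runs from log(1/2N) up to 0
    width = -lower / steps
    total = 0.0
    for i in range(steps):
        t = lower + (i + 0.5) * width
        total += (-math.expm1(t)) ** (theta - 1)   # (1 - x)^(theta - 1) with x = e^t
    return theta * total * width


if __name__ == "__main__":
    # Parameters of the Monte Carlo experiment described in the text
    print(average_number_of_alleles(100, 50, 0.005))   # compare with log_e 200
    print(effective_number_of_alleles(50, 0.005))
```

With $4N_e u = 1$ the integrand is constant, so the first value should match $\log_e 200$ to rounding error.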
Thus, the agreement between the results of a Monte Carlo experiment and the theoretical predictions based on the diffusion approximation is satisfactory. For more extensive simulation experiments, readers may refer to Kimura (1968).

9.7 The Number of Overdominant Alleles in a Finite Population

Since Fisher's paper of 1922 it has been known that in an infinite population, overdominance leads unconditionally to stable polymorphism for a pair of alleles. For more than two alleles, however, a more delicate condition has to be satisfied at stable equilibrium (see 6.8). The complexity of the condition, however, does not vitiate the general conclusion that overdominance is a potent factor for maintaining selective polymorphism in an infinite population.

As pointed out by Robertson (1962), selection for the heterozygote is a factor retarding fixation if the equilibrium gene frequency lies inside the range of 0.2-0.8, when a pair of alleles are involved (see 8.6.4). On the other hand, for an equilibrium gene frequency outside this range, there is a range of values of $N_e(s_1 + s_2)$ for which heterozygote advantage in fact accelerates fixation. $N_e$ is the effective population number and $s_1$ and $s_2$ are selection coefficients against both homozygotes. This might suggest that if there are a large number of overdominant alleles in a population, they will be lost by random drift as readily as neutral alleles, and overdominance is rather ineffective in keeping a large number of them in a finite population.

In this section we will investigate quantitatively the maximum number of overdominant alleles that can be maintained in a finite population. For this purpose, we will consider an ideal situation in which every mutation produces a novel allele which is different from the preexisting ones and every allele is heterotic with any other allele. Furthermore, we assume that all heterozygotes have equal fitness and all homozygotes also have equal fitness which is lower by $s$ compared with that of the heterozygotes. Any asymmetry with respect to fitness within homozygotes or heterozygotes will reduce the number of alleles.

Consider a random-mating population of effective size $N_e$. We will designate by $x$ the relative frequency of an allele in the population and let $\Phi(x)\,dx$ be the expected number of alleles whose frequency is in the range $x$ to $x + dx$. The relative frequency of each allele may change from generation to generation by mutation, selection, and random drift, but at statistical equilibrium there will be a stable distribution in $x$ given by Wright's formula (cf. 9.1.3),

$$\Phi(x) = \frac{C}{V_{\delta x}}\exp\left\{2\int\frac{M_{\delta x}}{V_{\delta x}}\,dx\right\}, \tag{9.7.1}$$
where $C$ is a constant and $M_{\delta x}$ and $V_{\delta x}$ are the mean and variance of the rate of change in $x$. Let $u$ be the mutation rate per gene per generation and denote by $f$ the sum of squares of allelic frequencies that are contained in the population

$$f = \sum_i x_i^2, \tag{9.7.2}$$

where $x_i$ is the frequency of the $i$th allele $A_i$. As in the previous section, we will denote by $n_e = 1/f$ the effective number of alleles maintained in the population. Since the rate of change of the frequency of a particular allele by mutation is $-ux$ and that by selection is

$$-sx(x - f), \tag{9.7.3}$$

we have

$$M_{\delta x} = -ux - sx(x - f). \tag{9.7.4}$$

In the above expression, $s$ ($> 0$) is the selective disadvantage of a homozygote compared with a heterozygote measured in Malthusian parameters. For a discrete model, in which the selective values of a homozygote and a heterozygote are $1 - s$ and 1, respectively, the amount of change in $x$ per generation is given by

$$-\frac{sx(x - f)}{1 - sf}, \tag{9.7.5}$$

but since we are concerned with cases in which $sf$ is much smaller than unity ($0 < sf \ll 1$), this also leads to the same expression as 9.7.3 if we neglect this small quantity in the denominator. The variance in the rate of change in $x$ is given by

$$V_{\delta x} = \frac{x(1 - x)}{2N_e}. \tag{9.7.6}$$

Since we are considering a situation in which a large number of alleles are maintained in a population and the effective range of $x$ is essentially restricted to a very low value ($x \ll 1$), we will use the approximate expression

$$V_{\delta x} = \frac{x}{2N_e}. \tag{9.7.7}$$

Using 9.7.4 and 9.7.7, we obtain from 9.7.1 the distribution formula

$$\Phi(x) = C e^{-2S(x - f)^2 - 4Mx}\,x^{-1}, \tag{9.7.8}$$

where

$$S = N_e s \qquad\text{and}\qquad M = N_e u. \tag{9.7.9}$$
In deriving this distribution formula, $f$ is assumed to be a constant which is interpreted as the expected value of the sum of squares of allelic frequencies or, more simply, as the probability of allelism, the reciprocal of which is the effective number of alleles in a population. Such a treatment regarding $f$ as constant (for a given $S$ and $M$) is again an approximation, but as will be shown later, this turns out to be satisfactory for our purpose. Constant $C$ in 9.7.8 is determined by the condition that the frequencies of all the alleles in the population add up to unity,

$$\sum_i x_i = 1,$$

or

$$\int_0^1 x\,\Phi(x)\,dx = 1. \tag{9.7.10}$$

Note that usually $\phi$ in Wright's formula (9.1.3) represents probability density of gene frequencies rather than the expected number of alleles as we use in this section. Thus, usually, $C$ is determined by condition 9.1.4 rather than 9.7.10. From 9.7.8 and 9.7.10, we obtain

$$\frac{1}{C} = \int_0^1 e^{-2S(x - f)^2 - 4Mx}\,dx. \tag{9.7.11}$$

This can also be expressed as

$$\frac{1}{C} = \frac{e^{-4Mf + 2M^2/S}}{2\sqrt{S}}\int_{-X}^{2\sqrt{S} - X} e^{-\lambda^2/2}\,d\lambda, \tag{9.7.12}$$

where

$$X = 2\sqrt{S}\left(f - \frac{M}{S}\right). \tag{9.7.13}$$

At the equilibrium state in which random extinction is exactly balanced by mutational production of new alleles, we have the following condition at the subterminal class:

$$\underbrace{2Nu}_{\text{number of new mutants}} = \underbrace{\frac{1}{2}\,\Phi\!\left(\frac{1}{2N}\right)\frac{1}{2N}\left(\frac{N}{N_e}\right)}_{\text{number of extinctions}} \tag{9.7.14}$$

(cf. 8.3.22). This gives

$$4M = C e^{-2S\left(\frac{1}{2N} - f\right)^2 - \frac{4M}{2N}}. \tag{9.7.16}$$
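The two expressions for $1/C$ can be compared numerically. The following sketch is an illustration only: it assumes that 9.7.12 is read with the integration limits $-X$ and $2\sqrt{S} - X$ (the images of $x = 0$ and $x = 1$ under $\lambda = 2\sqrt{S}(x - f + M/S)$), evaluates the Gaussian integral with the error function, and checks the result against a direct midpoint-rule integration of 9.7.11. The function names, the argument names `s_cap` and `m_cap` (standing for $S$ and $M$), and the step count are assumptions.

```python
import math


def inv_c_closed_form(s_cap, m_cap, f):
    """1/C from 9.7.12 and 9.7.13, using the error function."""
    x_cap = 2 * math.sqrt(s_cap) * (f - m_cap / s_cap)       # formula 9.7.13
    upper = 2 * math.sqrt(s_cap) - x_cap
    # Integral of exp(-lambda^2 / 2) from -X to 2*sqrt(S) - X
    gauss = math.sqrt(math.pi / 2) * (
        math.erf(upper / math.sqrt(2)) - math.erf(-x_cap / math.sqrt(2)))
    prefactor = math.exp(-4 * m_cap * f + 2 * m_cap ** 2 / s_cap)
    return prefactor * gauss / (2 * math.sqrt(s_cap))


def inv_c_numeric(s_cap, m_cap, f, steps=200_000):
    """1/C from 9.7.11 by the midpoint rule on (0, 1)."""
    width = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * width
        total += math.exp(-2 * s_cap * (x - f) ** 2 - 4 * m_cap * x)
    return total * width


if __name__ == "__main__":
    print(inv_c_closed_form(100.0, 0.5, 0.2), inv_c_numeric(100.0, 0.5, 0.2))
```

The two printed values should agree closely; when $S$ is large the upper limit is effectively infinite, so the closed form depends on $f$ mainly through $X$.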
Since $u$ is very small compared with unity, the term $(4M/2N)$ may be neglected in the actual calculation of the exponential term in the above formula. Furthermore, we assume that the effective number of alleles is much smaller than the total number of genes in a population, i.e., $n_e = 1/f$ ...

... an operator for taking the expectation with respect to the existing gene frequency distribution, so that the operator $E$ for taking the overall expectation is given by $E = E_\phi E_\delta$. Let us first consider the mean $\bar{p}$ of the gene frequency distribution at equilibrium, i.e., $E(p_i) = \bar{p}$. Taking expectations of both sides of 9.9.3 and noting 9.9.2 and 9.9.4, we obtain

$$E(p_i') = (1 - m_\infty)\bar{p} + m_\infty p_I.$$

Since $E(p_i') = \bar{p}$ at equilibrium, we have

$$\bar{p} = p_I.$$
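So, we will substitute $\bar{p}$ for $p_I$ in the following treatment. In order to calculate the variance and the correlation coefficients of gene frequencies at equilibrium, we will let

$$\tilde{p}_i = p_i - \bar{p}$$

and

$$\tilde{P}_i = P_i - \bar{p}.$$

Then 9.9.2 becomes

$$\tilde{P}_i = \alpha\tilde{p}_i + \beta(\tilde{p}_{i-1} + \tilde{p}_{i+1}), \tag{9.9.6}$$

where $\alpha = (1 - m_1)(1 - m_\infty)$ and $\beta = m_1(1 - m_\infty)/2$. Also, 9.9.3 becomes

$$\tilde{p}_i' = \tilde{P}_i + \xi_i, \tag{9.9.7}$$

where $\tilde{p}_i' = p_i' - \bar{p}$. We will denote by $V_p$ the variance of the gene frequency distribution among colonies. Squaring both sides of 9.9.7 and taking expectations, the variance in the next generation is

$$V_p' = E(\tilde{p}_i'^2) = E_\phi(\tilde{P}_i^2) + E_\phi E_\delta(\xi_i^2).$$

If we apply 9.9.4 and 9.9.5 to the right-hand side of the above equation and note that $P_i = \tilde{P}_i + \bar{p}$, we obtain

$$V_p' = \left(1 - \frac{1}{2N_e}\right)E_\phi(\tilde{P}_i^2) + \frac{\bar{p}(1 - \bar{p})}{2N_e}. \tag{9.9.8}$$

Now, from 9.9.6

$$E_\phi(\tilde{P}_i^2) = \alpha^2V_p + 4\alpha\beta V_p r_1 + 2\beta^2V_p(1 + r_2), \tag{9.9.9}$$

where $r_j$ is the correlation coefficient of gene frequencies between two colonies which are $j$ steps apart, i.e.,

$$r_j = \frac{E_\phi(\tilde{p}_i\tilde{p}_{i+j})}{V_p}.$$

In particular, $r_0 = 1$, and also we assume $r_{-1} = r_1$. Substituting 9.9.9 for $E_\phi(\tilde{P}_i^2)$ in the right-hand side of 9.9.8, we have

$$V_p' = \left(1 - \frac{1}{2N_e}\right)\{\alpha^2 + 4\alpha\beta r_1 + 2\beta^2(1 + r_2)\}V_p + \frac{\bar{p}(1 - \bar{p})}{2N_e}. \tag{9.9.10}$$

Thus at equilibrium in which $V_p' = V_p$, we obtain

$$V_p = \frac{\bar{p}(1 - \bar{p})/2N_e}{1 - \left(1 - \dfrac{1}{2N_e}\right)\{\alpha^2 + 4\alpha\beta r_1 + 2\beta^2(1 + r_2)\}}, \tag{9.9.11}$$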
where r 1 and r are correlation coefficients of gene frequencies b etween 2 colonies one and two steps apart. In order to obtain the correlation coefficients rij 1 , 2, . . . ), let us con sider the covariance (C) in the next generation, =
Cj
=
E(pi pi + j) ,
in which P� = P i + e i and P� + j = Pl + j + e i +j ' Since for j 1 , e i and e i +j are mutually independent random variables, each with mean 0, we have
Cj = E{(Pi + ei)(Pi + j + e i + j) } or
Cj
=
E",(PJ Si +j) '
Then, if we use relation 9.9.6, we get =
E", {[api + P(Pi - l + Pi + l)] [api+ j + P(Pi+j - l + Pi+j + l)]} + + = a 2 Cj + 2ap(Cj + 1 + Cj- 1) + P2 ( Cj + 2 2C) CJ- 2)' At equilibrium i n which Cj = Cj , noting rj C) Vp , we obtain Cj
=
(a 2 + 2p2 - 1)rj + 2ap(rj + 1 + rj- l) + p 2(rj + 2 + rj- 2 )
=
0,
(j
1). 9.9.1 2
Equation 9.9.12 holds for $j > 1$. However, for $j = 1$, $r_{-1}$ should be replaced by $r_1$ to give

$$(\alpha^2 + 2\beta^2 - 1)r_1 + 2\alpha\beta(r_2 + 1) + \beta^2(r_3 + r_1) = 0. \tag{9.9.13}$$

The essential part of the mathematical treatment of the stepping-stone model is to find the solution ($r_j$) for the system of equations 9.9.12 that satisfies the boundary conditions, $r_0 = 1$ and $r_\infty = 0$. This was done by Kimura and Weiss (1964) and also by Weiss and Kimura (1965). Here we will present an elementary treatment which was given in Kimura and Weiss (1964). Let $r_j = \lambda^j$ and substitute in 9.9.12; then we have

$$(\alpha^2 + 2\beta^2 - 1) + 2\alpha\beta(\lambda + \lambda^{-1}) + \beta^2(\lambda^2 + \lambda^{-2}) = 0,$$

or

$$[\beta\lambda^2 - (1 - \alpha)\lambda + \beta][\beta\lambda^2 + (1 + \alpha)\lambda + \beta] = 0,$$

from which we obtain the four roots,

$$\lambda_1 = \frac{(1 - \alpha) + \sqrt{(1 - \alpha)^2 - (2\beta)^2}}{2\beta}, \qquad \lambda_2 = \frac{(1 - \alpha) - \sqrt{(1 - \alpha)^2 - (2\beta)^2}}{2\beta},$$
$$\lambda_3 = \frac{-(1 + \alpha) - \sqrt{(1 + \alpha)^2 - (2\beta)^2}}{2\beta}, \qquad \lambda_4 = \frac{-(1 + \alpha) + \sqrt{(1 + \alpha)^2 - (2\beta)^2}}{2\beta}. \tag{9.9.14}$$

Then the required solution may be expressed in the form
$$r_j = \sum_{i=1}^{4} K_i\lambda_i^j, \tag{9.9.15}$$

where the $K_i$'s are constants. However, since $\lambda_1 > 1 > \lambda_2 > 0$, $\lambda_3 < -1$, and $-1 < \lambda_4 < 0$, we must have $K_1 = K_3 = 0$ in order that $r_j$ vanishes at $j = \infty$. Furthermore, in order that $r_0 = 1$, we must have $K_2 + K_4 = 1$. Thus, writing $K$ for $K_2$, 9.9.15 becomes

$$r_j = K\lambda_2^j + (1 - K)\lambda_4^j. \tag{9.9.16}$$

In order to determine $K$, we substitute $r_1 = K\lambda_2 + (1 - K)\lambda_4$, $r_2 = K\lambda_2^2 + (1 - K)\lambda_4^2$, and $r_3 = K\lambda_2^3 + (1 - K)\lambda_4^3$ in 9.9.13. This yields

$$2\alpha\beta + K\{(\alpha^2 + 3\beta^2 - 1)\lambda_2 + 2\alpha\beta\lambda_2^2 + \beta^2\lambda_2^3\} + (1 - K)\{(\alpha^2 + 3\beta^2 - 1)\lambda_4 + 2\alpha\beta\lambda_4^2 + \beta^2\lambda_4^3\} = 0. \tag{9.9.17}$$

Then, if we use the relationships

$$\lambda_2^2 = \frac{1 - \alpha}{\beta}\lambda_2 - 1$$

and

$$\lambda_4^2 = -\frac{1 + \alpha}{\beta}\lambda_4 - 1,$$

9.9.17 is reduced to

$$(1 + \alpha) + 2\beta\lambda_4 - 2K\{1 - \beta(\lambda_2 - \lambda_4)\} = 0.$$

This is further reduced, if we substitute 9.9.14 for $\lambda_2$ and $\lambda_4$, to give

$$R_1 - K(R_1 + R_2) = 0,$$

or

$$K = \frac{R_1}{R_1 + R_2}, \tag{9.9.18}$$

where

$$R_1 = \sqrt{(1 + \alpha)^2 - (2\beta)^2}$$

and

$$R_2 = \sqrt{(1 - \alpha)^2 - (2\beta)^2},$$

in which $\alpha = (1 - m_1)(1 - m_\infty)$ and $\beta = m_1(1 - m_\infty)/2$. Therefore, the correlation of gene frequencies between two colonies which are $j$ steps apart is given by 9.9.16 with $\lambda_2$ and $\lambda_4$ given by 9.9.14 and $K$ given by 9.9.18. Then, applying this formula to calculate $r_1$ and $r_2$ in 9.9.11, the formula for variance becomes
$$V_p = \frac{\bar{p}(1 - \bar{p})}{2N_e\left\{1 - \left(1 - \dfrac{1}{2N_e}\right)\left(1 - \dfrac{2R_1R_2}{R_1 + R_2}\right)\right\}}. \tag{9.9.19}$$

In the special case of no migration between adjacent colonies (island model), $m_1 = 0$ and therefore $R_1 = 1 + \alpha$, $R_2 = 1 - \alpha$. Thus, 9.9.19 agrees with 9.2.13 except that $m_\infty$ rather than $m$ is used to represent long-range migration.

Weiss and Kimura (1965) developed a more sophisticated method which can also treat higher dimensional cases. Using this method, the correlation in the one-dimensional model is expressed as

$$r_j = \frac{\dfrac{1}{2\pi}\displaystyle\int_0^{2\pi}\frac{\cos j\theta\,d\theta}{1 - H^2(\cos\theta)}}{\dfrac{1}{2\pi}\displaystyle\int_0^{2\pi}\frac{d\theta}{1 - H^2(\cos\theta)}}, \qquad (j > 0), \tag{9.9.20}$$

where

$$H(\cos\theta) = \alpha + 2\beta\cos\theta, \tag{9.9.21}$$

in which $\alpha = (1 - m_1)(1 - m_\infty) = 1 - m_\infty - m_1(1 - m_\infty)$ and $2\beta = m_1(1 - m_\infty)$. Formula 9.9.20 may also be expressed in the form

$$r_j = \frac{A_1(j) + A_2(j)}{A_1(0) + A_2(0)}, \tag{9.9.22}$$

where

$$A_1(j) = \frac{1}{4\pi}\int_0^{2\pi}\frac{\cos j\theta\,d\theta}{1 - H(\cos\theta)} = \frac{1}{4\pi}\int_0^{2\pi}\frac{\cos j\theta\,d\theta}{(1 - \alpha) - 2\beta\cos\theta} \tag{9.9.23}$$

and

$$A_2(j) = \frac{1}{4\pi}\int_0^{2\pi}\frac{\cos j\theta\,d\theta}{1 + H(\cos\theta)} = \frac{1}{4\pi}\int_0^{2\pi}\frac{\cos j\theta\,d\theta}{(1 + \alpha) + 2\beta\cos\theta}. \tag{9.9.24}$$

To evaluate the above integrals the following formula will be found useful:

$$\frac{1}{2\pi}\int_0^{2\pi}\frac{\cos n\theta\,d\theta}{x + \cos\theta} = \begin{cases} \dfrac{1}{\sqrt{x^2 - 1}}\left(\sqrt{x^2 - 1} - x\right)^n, & x > 1, \\[2ex] \dfrac{(-1)^{n+1}}{\sqrt{x^2 - 1}}\left(x + \sqrt{x^2 - 1}\right)^n, & x < -1. \end{cases} \tag{9.9.25}$$

Then,

$$A_1(j) = \frac{\lambda_2^j}{2R_2} \tag{9.9.26}$$

and

$$A_2(j) = \frac{\lambda_4^j}{2R_1}, \tag{9.9.27}$$

where $\lambda_2$ and $\lambda_4$ are given by 9.9.14, and $R_1$ and $R_2$ are as defined for 9.9.18. Thus 9.9.20 agrees with 9.9.16.
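The agreement between the closed form 9.9.16 and the integral form 9.9.20 can be checked numerically. The following Python sketch is not part of the original text: the function names, the midpoint-rule integration, and the number of integration steps are illustrative assumptions, and the constant K is taken to be R1/(R1 + R2), the value that makes r1, r2, and r3 satisfy 9.9.13.

```python
import math

def stepping_stone(m1, m_inf):
    """Return (alpha, beta, lam2, lam4, K) for the one-dimensional model."""
    alpha = (1 - m1) * (1 - m_inf)
    beta = m1 * (1 - m_inf) / 2
    R1 = math.sqrt((1 + alpha) ** 2 - (2 * beta) ** 2)
    R2 = math.sqrt((1 - alpha) ** 2 - (2 * beta) ** 2)
    lam2 = ((1 - alpha) - R2) / (2 * beta)    # root with 0 < lam2 < 1
    lam4 = (-(1 + alpha) + R1) / (2 * beta)   # root with -1 < lam4 < 0
    K = R1 / (R1 + R2)                        # assumed form of 9.9.18
    return alpha, beta, lam2, lam4, K

def r_closed(j, m1, m_inf):
    """Correlation j steps apart from the closed form 9.9.16."""
    _, _, lam2, lam4, K = stepping_stone(m1, m_inf)
    return K * lam2 ** j + (1 - K) * lam4 ** j

def r_integral(j, m1, m_inf, steps=20000):
    """Correlation j steps apart from 9.9.20, by the midpoint rule."""
    alpha, beta, _, _, _ = stepping_stone(m1, m_inf)
    num = den = 0.0
    for k in range(steps):
        theta = 2 * math.pi * (k + 0.5) / steps
        h = alpha + 2 * beta * math.cos(theta)   # H(cos theta), 9.9.21
        weight = 1.0 / (1.0 - h * h)
        num += math.cos(j * theta) * weight
        den += weight
    return num / den

if __name__ == "__main__":
    # Migration rates used for Figure 9.9.2.
    for j in (1, 5, 20):
        print(j, r_closed(j, 0.1, 4e-5), r_integral(j, 0.1, 4e-5))
```

The two columns printed for each distance should coincide to many decimal places; a periodic integrand of this kind is handled very accurately by an evenly spaced rule, which is why no special quadrature is used here.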
Though the above solution for $r_j$ given in 9.9.16 is perfectly general, a simple approximation formula is available for an interesting and important special case in which $m_\infty \ll m_1$. In this case, $A_2(j)$ is small in comparison to $A_1(j)$ so that

$$r_j = e^{-\sqrt{2m_\infty/m_1}\,j}, \tag{9.9.28}$$

approximately. If in addition $m_1$ is small so that $1 \gg m_1 \gg m_\infty$, we have, from 9.9.19,

$$V_p = \frac{\bar{p}(1 - \bar{p})}{1 + 4N_e\sqrt{2m_1m_\infty}}, \tag{9.9.29}$$

approximately.

Figure 9.9.2. Decrease of genetic correlation with distance when $m_1 = 0.1$ and $m_\infty = 4 \times 10^{-5}$. (From Kimura and Weiss, 1964.)
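How close the approximations 9.9.28 and 9.9.29 come to the general formulae can be seen by direct calculation. The sketch below is an illustration only: the function names are invented here, the effective size of 100 and mean gene frequency of 0.5 are arbitrary choices, and the exact values use 9.9.16 and 9.9.19 with K assumed equal to R1/(R1 + R2).

```python
import math

def exact_r_and_variance(j, m1, m_inf, Ne, p_bar):
    """Exact correlation (9.9.16) and variance (9.9.19)."""
    alpha = (1 - m1) * (1 - m_inf)
    beta = m1 * (1 - m_inf) / 2
    R1 = math.sqrt((1 + alpha) ** 2 - (2 * beta) ** 2)
    R2 = math.sqrt((1 - alpha) ** 2 - (2 * beta) ** 2)
    lam2 = ((1 - alpha) - R2) / (2 * beta)
    lam4 = (-(1 + alpha) + R1) / (2 * beta)
    K = R1 / (R1 + R2)                       # assumed form of 9.9.18
    r_j = K * lam2 ** j + (1 - K) * lam4 ** j
    shrink = (1 - 1 / (2 * Ne)) * (1 - 2 * R1 * R2 / (R1 + R2))
    v_p = p_bar * (1 - p_bar) / (2 * Ne * (1 - shrink))
    return r_j, v_p

def approx_r(j, m1, m_inf):
    """Approximate correlation, 9.9.28 (valid when m_inf << m1)."""
    return math.exp(-math.sqrt(2 * m_inf / m1) * j)

def approx_variance(m1, m_inf, Ne, p_bar):
    """Approximate variance, 9.9.29 (valid when 1 >> m1 >> m_inf)."""
    return p_bar * (1 - p_bar) / (1 + 4 * Ne * math.sqrt(2 * m1 * m_inf))

if __name__ == "__main__":
    m1, m_inf, Ne, p_bar = 0.1, 4e-5, 100, 0.5   # Ne and p_bar are hypothetical
    r_exact, v_exact = exact_r_and_variance(10, m1, m_inf, Ne, p_bar)
    print(r_exact, approx_r(10, m1, m_inf))
    print(v_exact, approx_variance(m1, m_inf, Ne, p_bar))
```

With the migration rates of Figure 9.9.2 the approximate and exact values should differ by well under one percent, even though a value of 0.1 for the short-range rate is only moderately small.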
In Section 9.2 we derived formula 9.2.5, giving the gene frequency distribution among subgroups in the "island model." In this model, if the entire species is subdivided into colonies, each with effective size $N_e$, and each colony exchanges individuals at the rate $m$ with a random sample taken from the entire species, the frequency distribution is given by

$$\phi(p) = Cp^{4N_em\bar{p} - 1}(1 - p)^{4N_em(1 - \bar{p}) - 1}, \tag{9.9.30}$$

where $C$ is a constant. In the stepping-stone model which we are considering in this section, there is a correlation in gene frequency between immigrants and the receiving colony so that $m_1(1 - r_1)$ gives the effective rate of exchange comparable to $m$ in the island model. So, the gene frequency distribution among colonies in the stepping-stone model may be approximated by 9.9.30 if we substitute

$$m = m_1(1 - r_1). \tag{9.9.31}$$

When $m_\infty \ll m_1$, we have $m \approx \sqrt{2m_1m_\infty}$ since $r_1 \approx 1 - \sqrt{2m_\infty/m_1}$ from 9.9.28. So, from 9.2.7, the variance of gene frequency among colonies is

$$\sigma_p^2 = \frac{\bar{p}(1 - \bar{p})}{4N_em + 1} \approx \frac{\bar{p}(1 - \bar{p})}{4N_e\sqrt{2m_1m_\infty} + 1},$$

which is in good agreement with 9.9.29.

The mathematical treatment of the two- and three-dimensional models is more difficult, but it is given in Kimura and Weiss (1964) and in more detail in Weiss and Kimura (1965). Figure 9.9.2 illustrates the decrease of genetic correlation with distance for one-, two-, and three-dimensional cases, assuming $m_\infty = 4 \times 10^{-5}$ and $m_1 = 0.1$. The figure shows that it depends very much on the number of dimensions.
APPENDIX SOME STATISTICAL AND MATHEMATICAL METHODS FREQUENTLY USED IN POPULATION GENETICS
The purpose of this appendix is to supply some of the methods that the reader may not have encountered or has forgotten. There is no attempt at mathematical rigor, although we have tried to supply enough of the background to make clear the general nature and limitations of the methods. This is written for biologists, not mathematicians, and if one of the latter chances to be reading this he is invited to look the other way or at least be tolerant of some of the intuitive arguments and cookbook attitudes. No knowledge of statistics, matrices, or higher mathematics is assumed, but the reader is expected to know the elementary theory of probability and to be familiar with the differential and integral calculus. This appendix is intended to be sufficient for any procedures used in the first six chapters, and most of Chapter 7. However, the last two chapters involve subjects of considerably greater mathematical difficulty and it seems to be impractical to include all the procedures here. The reader will either have to accept some results without derivation or look up the methods elsewhere. We have tried to provide references.

Among the various books that might be mentioned as sources of additional information we call particular attention to two that are by pioneers in the field of population genetics. One is the first volume of Sewall Wright's Evolution and the Genetics of Populations, which is devoted to the biometrical and statistical foundations of population genetics. The other is R. A. Fisher's Statistical Methods for Research Workers. There is one method of great importance in population and biometrical genetics which we have omitted. This is Sewall Wright's method of path coefficients. Although we have used alternative methods in this book, the reader should realize that many of the results were first obtained by this method. The procedures are derived and explained in Wright's first volume (1968). A clear elementary exposition is given by Li (1955a).
A.1 Various Kinds of Averages

If there is a large number of observations or measurements it is usually difficult, if not impossible, to make much sense of them by examination of the individual values. We therefore make use of various derived quantities that extract from the data the information in which we are especially interested. We usually desire some measure of the central or typical value and some measure of the amount of variability. We might wish, in addition, to know other things, such as whether the values are symmetrically distributed about the central value. Usually, we are especially interested in two quantities, the mean and the variance. The ordinary average, or arithmetic mean, is represented by $M_x$ or $\bar{X}$ and is defined by

$$\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i = \frac{X_1 + X_2 + \cdots + X_N}{N}, \tag{A.1.1}$$

where $X_1, X_2, \ldots, X_N$ are the successive measurements, $N$ is the number of measurements, and the Greek $\Sigma$ means to add all the $X$'s from $X_1$ to $X_N$.
The convention of indicating the mean by a superior bar is widely used. The geometric mean is the $N$th root of the product of the $N$ values. It is frequently useful for data that are not symmetrically distributed. In population genetics the geometric mean is useful in measuring average population growth over a period of generations, because of the geometric nature of population increase. The geometric mean is defined symbolically by

$$G_x = (X_1X_2\cdots X_N)^{1/N}. \tag{A.1.2}$$

For computation, it is more conveniently defined as the antilog of

$$\bar{x} = \frac{1}{N}\sum x_i, \tag{A.1.3}$$

where $x_i = \log X_i$. In other words, the geometric mean is the antilog of the arithmetic mean of the logarithms of the values.

Another mean that is used in population genetics, for example in the study of effects of chance in small populations, is the harmonic mean. The harmonic mean is the reciprocal of the mean of the reciprocals, or

$$\frac{1}{H_x} = \frac{1}{N}\sum\left(\frac{1}{X_i}\right). \tag{A.1.4}$$

For some purposes, the most appropriate summarizing value is not any of the mean values, but the median. The median is the middle value; that is, the value chosen so that there are an equal number of observations above and below it. If the total number of observations is even, the median is defined as the mean of the two central values. A familiar example of the use of the median is in the half-life of a radioactive element. An exactly analogous problem arises in measuring the number of generations through which a harmful mutant gene persists in the population; it is sometimes convenient to measure its half-life, or median persistence, rather than its mean persistence. These various averages are illustrated with a simple numerical example in Table A.1.

Table A.1. Some hypothetical data to illustrate computational methods.

  i     X_i    1/X_i    log10 X_i    X_i - mean    (X_i - mean)^2
  1     12     .083      1.079           2              4
  2      9     .111       .954          -1              1
  3     13     .077      1.114           3              9
  4      7     .143       .845          -3              9
  5     11     .091      1.041           1              1
  6     11     .091      1.041           1              1
  7      9     .111       .954          -1              1
  8     11     .091      1.041           1              1
  9      7     .143       .845          -3              9
 Sum    90     .941      8.914           0             36

$N = 9$
$M_x = \bar{X} = 90/9 = 10$
$H_x = 9/.941 = 9.56$
$G_x = \text{antilog}\,(8.914/9) = \text{antilog}\ .990 = 9.77$
Median $= 11$
$V_x = 36/9 = 4$
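The averages of Table A.1 can be recomputed in a few lines. The following Python sketch is an illustration added here, with variable names chosen for this purpose; it works from unrounded reciprocals and logarithms, so the harmonic and geometric means may differ in the third figure from the values obtained with the rounded table entries.

```python
import math
import statistics

# The nine hypothetical measurements of Table A.1.
data = [12, 9, 13, 7, 11, 11, 9, 11, 7]
n = len(data)

arithmetic = sum(data) / n
# Geometric mean: antilog of the mean of the logarithms (A.1.3).
geometric = math.exp(sum(math.log(x) for x in data) / n)
# Harmonic mean: reciprocal of the mean of the reciprocals (A.1.4).
harmonic = n / sum(1 / x for x in data)
median = statistics.median(data)
# Variance as the mean squared deviation from the arithmetic mean.
variance = sum((x - arithmetic) ** 2 for x in data) / n

print(arithmetic, geometric, harmonic, median, variance)
```

For any set of positive values that are not all equal, the harmonic mean is smaller than the geometric mean, which in turn is smaller than the arithmetic mean, as the output should confirm.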
A.2 Measures of Variability: The Variance and Standard Deviation

There are also several measures of variability, but we shall consider only two, the variance and the standard deviation. The variance, $V$, usually reflects the properties of greatest genetic interest. It is defined as the mean of the squares of the deviations of the individual items from their mean. Symbolically,

$$V_x = \frac{1}{N}\sum (X_i - \bar{X})^2. \tag{A.2.1}$$

This is illustrated with a numerical example in Table A.1. For computation, it is more convenient to write the variance formula as

$$V_x = \overline{X^2} - \bar{X}^2, \tag{A.2.2}$$

that is, the mean of the squared value minus the square of the mean value. This can be derived as follows. From A.2.1,

$$V_x = \frac{1}{N}\left[\sum X_i^2 - 2\bar{X}\sum X_i + N\bar{X}^2\right]$$
$$= \frac{1}{N}\left[\sum X_i^2 - N\bar{X}^2\right] \qquad (\text{since } \sum X_i = N\bar{X})$$
$$= \overline{X^2} - \bar{X}^2.$$
Data are often grouped into classes with the same or similar measurements. In this case the mean and variance are computed by weighting each measurement by the number of individuals with that measurement. If there are $n_1$ individuals with measurement $X_1$, $n_2$ with measurement $X_2$, and so on, then

$$\bar{X} = \frac{n_1X_1 + n_2X_2 + \cdots + n_kX_k}{n_1 + n_2 + \cdots + n_k} = \frac{\sum n_iX_i}{\sum n_i} \tag{A.2.3}$$

and

$$V_x = \frac{\sum n_i(X_i - \bar{X})^2}{\sum n_i} = \frac{\sum n_iX_i^2 - N\bar{X}^2}{N}, \tag{A.2.4}$$

where $N = \sum n_i$. Notice that, as before, the variance formula can be written

$$V_x = \overline{X^2} - \bar{X}^2,$$

where the superior bar now indicates the weighted mean.

In population genetics the frequencies of individuals with each measurement or attribute are usually expressed as a proportion of the total. If $p_i = n_i/N$, then $\sum p_i = 1$,

$$\bar{X} = \sum p_iX_i, \tag{A.2.5}$$

and

$$V_x = \sum p_i(X_i - \bar{X})^2 = \sum p_iX_i^2 - \bar{X}^2. \tag{A.2.6}$$
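The weighted formulae can be tried on the measurements of Table A.1 after grouping equal values into classes. The sketch below is an added illustration with names chosen here; it computes the mean from class proportions and the variance both from the definition and from the shortcut form (mean of squares minus square of the mean).

```python
from collections import Counter

# The nine hypothetical measurements of Table A.1, grouped into classes.
data = [12, 9, 13, 7, 11, 11, 9, 11, 7]
counts = Counter(data)                   # n_i for each distinct value X_i
N = sum(counts.values())
freqs = {x: n / N for x, n in counts.items()}   # p_i = n_i / N

mean = sum(p * x for x, p in freqs.items())
# Variance from the definition: weighted mean squared deviation.
var_definition = sum(p * (x - mean) ** 2 for x, p in freqs.items())
# Variance from the shortcut: weighted mean of squares minus squared mean.
var_shortcut = sum(p * x * x for x, p in freqs.items()) - mean ** 2

print(mean, var_definition, var_shortcut)
```

Both variance calculations should return the value 4 found in Table A.1, apart from rounding error in the floating-point arithmetic.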
The variance is always measured in units that are the square of the units of the original measurements. This is sometimes inconvenient, as when the variance in height of a group of persons is given as square inches, or variance in the yield of wheat is in square bushels. In order to return to the original dimensions, it is customary to take the square root of the variance, and call the resulting quantity the standard deviation. Thus, the standard deviation, $\sigma$, is defined by

$$\sigma = \sqrt{V}. \tag{A.2.7}$$

The population variance is often denoted by $\sigma^2$.

Despite these dimensional difficulties, the variance is almost always the more useful quantity in population genetics. There are two principal reasons. One is that the variance has properties of additivity and subdivisibility whereas the standard deviation does not. This means that if a compound measure is the sum of two independent measures, the variance of the compound is the sum of the variances of the two parts. For example, if we can think of the yield of maize, $Y$, as being the sum of a genetic component, $G$, and an independent environmental component, $E$, then $V_Y = V_G + V_E$.
The second property of the variance that makes it especially useful in population genetics is that the rate of evolutionary change is more closely related to the variance than to other measures of population variability.

The variance of the sum of two measurements may be derived as follows. Let $X$ and $Y$ be the two measurements.

$$V_{X+Y} = \frac{1}{N}\sum (X_i + Y_i - \overline{X + Y})^2$$
$$= \frac{1}{N}\left[\sum (X_i - \bar{X})^2 + \sum (Y_i - \bar{Y})^2 + 2\sum (X_i - \bar{X})(Y_i - \bar{Y})\right] \qquad (\text{since } \overline{X + Y} = \bar{X} + \bar{Y})$$
$$= V_X + V_Y + 2\,\mathrm{Cov}_{XY}, \tag{A.2.8}$$

where

$$\mathrm{Cov}_{XY} = \frac{1}{N}\sum (X_i - \bar{X})(Y_i - \bar{Y}). \tag{A.2.9}$$
$\mathrm{Cov}_{XY}$ is called the covariance of $X$ and $Y$. If the measurements are of several quantities, then the variance of the sum is

$$V_{\Sigma X} = \sum V_X + 2\sum \mathrm{Cov}_{XX'}, \tag{A.2.10}$$

where the last sum is the sum of all possible pairs of covariances. If there are measurements on $n$ objects then there are $n(n - 1)/2$ covariances and $n$ variances. For example, if we had measurements on hand length, forearm length, and upper arm length for each of $N$ persons, the variance of the total arm length would be given by the sum of the three variances and twice the sum of the three covariances (six covariance terms in all).

If the quantities $X$ and $Y$ are independent of each other they will tend to deviate from their means in opposite directions just as often as in the same direction. In the former case, the value of the product will be negative, in the latter, positive; so, on the average, $\sum (X_i - \bar{X})(Y_i - \bar{Y})$ will be 0. Therefore, if the quantities $X$ and $Y$ are independent,

$$V_{X+Y} = V_X + V_Y, \tag{A.2.11}$$

or, if there are more than two quantities, and they are independent,

$$V_{\Sigma X} = \sum V_X. \tag{A.2.12}$$
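The rule for the variance of a sum can be verified on a small data set. In the Python sketch below the three lists of arm measurements are invented purely for illustration, as are the helper names; the variance of the total is computed directly and then again as the sum of the variances plus twice the sum of the covariances between pairs.

```python
from itertools import combinations

def mean(values):
    return sum(values) / len(values)

def cov(x, y):
    """Covariance with divisor N; cov(x, x) is the variance of x."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

# Hypothetical measurements (cm) on five persons.
hand = [18.0, 19.5, 17.0, 20.0, 18.5]
forearm = [25.0, 27.0, 24.5, 28.0, 26.0]
upper_arm = [30.0, 33.0, 29.0, 34.5, 31.0]
parts = [hand, forearm, upper_arm]

# Total arm length for each person, and its variance computed directly.
total = [sum(person) for person in zip(*parts)]
direct = cov(total, total)

# Sum of the three variances plus twice the three pairwise covariances.
by_parts = sum(cov(p, p) for p in parts) + 2 * sum(
    cov(a, b) for a, b in combinations(parts, 2)
)

print(direct, by_parts)
```

Because the invented measurements are positively correlated, the variance of the total is considerably larger than the sum of the three variances alone.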
If there are $n$ equally variable and independent quantities, then the variance of their sum is

$$V_{\Sigma X} = nV_X. \tag{A.2.13}$$

If $K$ is a constant,

$$V_{KX} = \frac{1}{N}\sum (KX_i - \overline{KX})^2 = K^2V_X \qquad (\text{since } \overline{KX} = K\bar{X}). \tag{A.2.14}$$

From the foregoing, we derive a very important formula, the variance of a mean. If $\bar{X}$ is the mean of $N$ independent observations, using A.2.12 we obtain

$$V_{\bar{X}} = \left(\frac{1}{N}\right)^2V_{\Sigma X} = \frac{1}{N}V_X \qquad (\text{from A.2.13}). \tag{A.2.15}$$

A.3 Population Values and Sample Values

In the practical use of statistics we are often interested in drawing inferences about some population on the basis of a sample of observations from the population. In such circumstances we have little interest in the sample values themselves except in so far as they provide information about the population. If the sample is representative of the population, we can use it as the basis for estimates of the unknown population values. If we wish to estimate the mean of a population, the mean of the sample is on the average a correct estimate. It is unbiased, in the sense that the expected value of $\bar{X}$ is $\mu$, the population value. On the other hand, the variance of a sample is a biased estimate of the variance of the population. However, there is a simple correction that removes the bias; simply divide by $N - 1$ instead of $N$.
We estimate $\sigma^2$, the population variance, by measuring the squared deviations from the true mean, $\mu$, rather than the sample mean $\bar{X}$. Using $E$ to stand for the expected or mean value, we have

$$V = E\left\{\frac{1}{N}\sum (X_i - \mu)^2\right\} = E\left\{\frac{1}{N}\sum [(X_i - \bar{X}) + (\bar{X} - \mu)]^2\right\}$$
$$= E\left\{\frac{1}{N}\left[\sum (X_i - \bar{X})^2 + 2(\bar{X} - \mu)\sum (X_i - \bar{X}) + \sum (\bar{X} - \mu)^2\right]\right\}.$$

But $\sum (X_i - \bar{X}) = 0$ and $E\{(1/N)\sum (\bar{X} - \mu)^2\}$ is the variance of the mean of $X$, which is $V/N$. Hence

$$V = E\left\{\frac{1}{N}\sum (X_i - \bar{X})^2\right\} + \frac{1}{N}V,$$

so that

$$V = \frac{1}{N - 1}E\left\{\sum (X_i - \bar{X})^2\right\},$$

and an unbiased estimate of the population variance is

$$V = \frac{\sum (X_i - \bar{X})^2}{N - 1}. \tag{A.3.1}$$

Most of the time in this book we are dealing with theoretical populations and their variances, so formulae A.2.1 and A.2.2 are appropriate; but whenever one is dealing with actual data and wants to estimate the population values from these data, A.3.1 should be used. The same correction applies to the covariance if the purpose is to estimate the population parameter from sample measurements. Thus,

$$\mathrm{Cov}_{XY} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{N - 1}. \tag{A.3.2}$$

The variance of the mean, as estimated from measurements on a sample of $N$ observations, is

$$V_{\bar{X}} = \frac{\sum (X_i - \bar{X})^2}{N(N - 1)}. \tag{A.3.3}$$
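The bias of the sample variance, and its removal by the divisor N - 1, can be demonstrated without any random sampling by listing every possible sample. In the sketch below the population (the six faces of a die) and the sample size of three are arbitrary choices made for illustration; every ordered sample drawn with replacement is equally likely, so averaging over all of them gives exact expectations.

```python
from itertools import product

# Hypothetical population: the six faces of a die, each equally likely.
population = [1, 2, 3, 4, 5, 6]
mu = sum(population) / len(population)
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

N = 3  # sample size
samples = list(product(population, repeat=N))  # all 6**3 ordered samples

biased_total = unbiased_total = 0.0
for sample in samples:
    xbar = sum(sample) / N
    ss = sum((x - xbar) ** 2 for x in sample)  # squared deviations from xbar
    biased_total += ss / N
    unbiased_total += ss / (N - 1)

expected_biased = biased_total / len(samples)
expected_unbiased = unbiased_total / len(samples)

print(sigma2, expected_biased, expected_unbiased)
```

The average with divisor N should equal (N - 1)/N times the population variance, while the average with divisor N - 1 should reproduce the population variance itself.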
A.4 Correlation and Regression

The covariance is measured in units which are the product of the units of the two measurements (the square of the original units when both are measured in the same units). So is the variance. It is frequently desirable to have a measurement of association that is dimensionless. For this purpose the conventional measurement is the Coefficient of Correlation. It is the covariance divided by the geometric mean of the two variances. Thus the correlation coefficient, $r$, is

$$r_{XY} = \frac{\mathrm{Cov}_{XY}}{\sqrt{V_XV_Y}} = \frac{1}{N}\sum\left(\frac{X - \bar{X}}{\sigma_X}\right)\left(\frac{Y - \bar{Y}}{\sigma_Y}\right). \tag{A.4.1}$$

For computation, the formula is conveniently written as

$$r_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2\sum (Y - \bar{Y})^2}} \tag{A.4.2}$$

$$= \frac{\overline{XY} - \bar{X}\bar{Y}}{\sqrt{(\overline{X^2} - \bar{X}^2)(\overline{Y^2} - \bar{Y}^2)}}. \tag{A.4.3}$$
The value of the correlation coefficient ranges from $-1$ to $+1$. If there is no association between the variables (if $X$ and $Y$ are independent) the value is 0. With perfect linear association of $X$ and $Y$, that is to say, if every increase in $X$ is associated with a proportional increase in $Y$, the correlation is 1. Perfect negative association gives a correlation of $-1$.

The coefficient of correlation is closely related to the regression coefficient. The traditional way to express a dependent variable $Y$ in terms of an independent variable $X$ is to determine the curve of relationship by the method of least squares. By this procedure the curve is chosen so that the sum of the squares of the vertical deviations of the actual points from the curve is minimized. If the relationship is assumed to be linear, the regression coefficient is the slope of the line. Suppose that there is a series of points, $(X_i, Y_i)$, to which a straight line is to be fitted, as shown in Figure A.4.1. We wish to find the line for which the sum of the squared deviations, $\sum D_i^2$, is a minimum. Let the equation of the line be

$$Y' = f(X) = a + bX,$$

where $a$ and $b$ are to be determined. For any value $X$, the corresponding value on the line is $Y' = a + bX$, so we minimize the quantity

$$Q = \sum (Y - Y')^2 = \sum (Y - a - bX)^2. \tag{A.4.4}$$

To do this, we follow the usual procedure and differentiate $Q$ with respect to $a$ and to $b$ and set the resulting quantities equal to 0. This gives

$$\frac{\partial Q}{\partial a} = -2\sum (Y - a - bX) = 0,$$
$$\frac{\partial Q}{\partial b} = -2\sum X(Y - a - bX) = 0,$$

from which we have the two equations,

$$aN + b\sum X - \sum Y = 0,$$
$$a\sum X + b\sum X^2 - \sum XY = 0,$$

where $N$ is the number of pairs of measurements and $\sum a = Na$. The solutions are

$$a = \bar{Y} - b\bar{X}, \tag{A.4.5}$$

where

$$\bar{X} = \frac{\sum X}{N} \qquad \text{and} \qquad \bar{Y} = \frac{\sum Y}{N},$$

and

$$b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{\mathrm{Cov}_{XY}}{V_X}. \tag{A.4.6}$$

Figure A.4.1. The procedure for fitting a straight line by the least-squares method. $X$ is the independent variable and $Y$ the dependent. The line is determined so that the sum of the squares of the deviations from the line, $\sum (Y_i - Y_i')^2$, is minimized.
The line of best fit (by the least-squares criterion) is

$$Y' = a + bX = \bar{Y} + b(X - \bar{X}), \tag{A.4.7}$$

where $b$ (usually written as $b_{YX}$) is given by A.4.6. The slope, $b_{YX}$, is called the regression of $Y$ on $X$. It gives the amount by which $Y$ changes for a unit change in $X$. Notice that if the two variables are measured as deviations from their means, say $y = Y' - \bar{Y}$ and $x = X - \bar{X}$, the equation takes the simple form

$$y = bx. \tag{A.4.7a}$$

This also shows that the regression line passes through the two means, $\bar{X}$ and $\bar{Y}$. Another way of writing the regression, one that shows its relation to the correlation coefficient, is

$$b_{YX} = r_{XY}\frac{\sigma_Y}{\sigma_X}. \tag{A.4.8}$$
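The least-squares calculation can be followed on a small example. The data and variable names in this Python sketch are invented for illustration; the slope is taken as the sum of cross-products of deviations divided by the sum of squared deviations of X (the standard least-squares solution), the intercept from the condition that the line passes through the two means, and the slope is then compared with the correlation coefficient multiplied by the ratio of standard deviations.

```python
import math

# Hypothetical paired measurements.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
N = len(X)

xbar, ybar = sum(X) / N, sum(Y) / N
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))  # cross-products
sxx = sum((x - xbar) ** 2 for x in X)
syy = sum((y - ybar) ** 2 for y in Y)

b = sxy / sxx                  # regression of Y on X (slope)
a = ybar - b * xbar            # intercept: line passes through the means
r = sxy / math.sqrt(sxx * syy) # correlation coefficient

sigma_x, sigma_y = math.sqrt(sxx / N), math.sqrt(syy / N)

# The two conditions obtained by differentiating the sum of squares.
residuals = [y - a - b * x for x, y in zip(X, Y)]
condition_a = sum(residuals)
condition_b = sum(x * e for x, e in zip(X, residuals))

print(a, b, r, r * sigma_y / sigma_x)
```

The last two printed values should be identical, and both `condition_a` and `condition_b` should be zero apart from rounding error.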
The regression coefficient is concrete and is expressed in the appropriate units of measurement for the dependent and independent variables. For example, if $X$ is the pounds of fertilizer applied per acre and $Y$ is the yield of corn in bushels per acre, $b_{YX}$ is expressed as bushels/pound. On the other hand, the correlation coefficient is dimensionless.

The correlation coefficient has two properties that are especially useful in interpreting genetic data. By substituting A.4.5 and A.4.8 into A.4.4, we see (after some algebraic rearrangement) that

$$\sum (Y - Y')^2 = (1 - r^2)\sum (Y - \bar{Y})^2. \tag{A.4.9}$$

This tells us that $r^2$ is the fraction by which the squared deviations from the regression line are less than the squared deviations from the general mean, $\bar{Y}$. In this sense, a fraction $r^2$ of the variance of $Y$ is associated with (or explained by) a linear change in $X$; the remainder, $1 - r^2$, represents the fraction of the variance due to random deviations from the linear association with $X$. In Chapter 4 we use $r^2$, where $r$ is the correlation between genotypic and phenotypic measurements, as a measurement of heritability, that is to say, the extent to which phenotypic variance is accounted for by genotypic differences.

The other interpretation is this: If the quantitative trait or measurement can be regarded as the sum of a large number of components of which a fraction, $r$, are common to the measurements while the remaining fraction, $1 - r$, are independent, then the expected correlation between such measurements is $r$. For example, a mother and daughter are identical for half their genes, the other half being independent (assuming that the father and mother are unrelated). Thus the expected correlation between mother and child for a trait determined by a large number of additively acting, independent genes would be 1/2.

The correlation coefficient is useful in another, somewhat related way. Suppose that a quantitative trait or measurement, $Y_{ij}$, of the $j$th member of the $i$th group is made up of three independent and additive components: (1) an overall mean, $\mu$, (2) a component common to the group, $g_i$, and (3) a component special to the individual, $s_{ij}$. Then

$$Y_{ij} = \mu + g_i + s_{ij}, \tag{A.4.10}$$

and since they are deviations from the mean,

$$\sum_i g_i = \sum_i\sum_j s_{ij} = 0. \tag{A.4.11}$$
For example, the groups might be families. Then $g_i$ is the component, genetic or environmental or both, common to all members of the $i$th family and $s_{ij}$ is the additional component special to the $j$th member of the $i$th family. Since the group and special factors are independent, the variance is

$$V_Y = V_g + V_s. \tag{A.4.12}$$

The covariance between the measurements, $Y_{ij}$ and $Y_{ik}$, of two members of the same group is

$$\mathrm{Cov}(Y_{ij}, Y_{ik}) = E\{(g_i + s_{ij})(g_i + s_{ik})\} = E(g_i^2) + E(g_is_{ij}) + E(g_is_{ik}) + E(s_{ij}s_{ik}). \tag{A.4.13}$$

The last three terms are all 0, because $s_{ij}$ and $s_{ik}$ are independent of $g_i$ and of each other. Since $E(g_i^2) = V_g$,

$$\mathrm{Cov}(Y_{ij}, Y_{ik}) = V_g \tag{A.4.14}$$

and the correlation between two members of a group is

$$r = \frac{V_g}{V_g + V_s} = \frac{V_g}{V_Y}. \tag{A.4.15}$$

For example, since the correlation between mother and daughter for independent and additive genes is 1/2, the variance within mother-daughter groups, $V_s$, is 1/2 the population variance, $V_Y$.

If $s_{ij}$ and $s_{ik}$ are not independent, then A.4.14 is not correct. An example would be competition between litter mates, pre- or postnatally. The death of one may enhance the probability of survival of others; or if one is stunted and eats less, the others may have more. In extreme cases this might create negative values of the last term in A.4.13 large enough to offset the first and make the covariance negative. For many purposes, however, the assumption that the $s_{ij}$'s are independent is a reasonable one.
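The relation between the correlation within groups and the share of the variance due to the common component can be illustrated by simulation. The sketch below is an added example under stated assumptions: the group and special components are drawn independently from normal distributions (the argument in the text requires only independence, not normality), each group has two members, and the function name, seed, and number of groups are arbitrary.

```python
import math
import random

def simulate_pair_correlation(v_g, v_s, n_groups, seed=1):
    """Correlation between two members of a group, estimated by simulation."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_groups):
        g = rng.gauss(0.0, math.sqrt(v_g))                   # common component
        first.append(g + rng.gauss(0.0, math.sqrt(v_s)))     # member 1
        second.append(g + rng.gauss(0.0, math.sqrt(v_s)))    # member 2
    m1, m2 = sum(first) / n_groups, sum(second) / n_groups
    cov = sum((a - m1) * (b - m2) for a, b in zip(first, second)) / n_groups
    v1 = sum((a - m1) ** 2 for a in first) / n_groups
    v2 = sum((b - m2) ** 2 for b in second) / n_groups
    return cov / math.sqrt(v1 * v2)

if __name__ == "__main__":
    # Equal common and special variances: expected correlation 1/2.
    print(simulate_pair_correlation(1.0, 1.0, 20000))
    # Common variance three times the special variance: expected 3/4.
    print(simulate_pair_correlation(3.0, 1.0, 20000))
```

With 20,000 simulated pairs the estimated correlations should fall within about 0.01 of the ratio of the common variance to the total variance.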
A.5 Binomial, Poisson, and Normal Distributions

If the probability of an event is $p$, the probability that in $N$ independent trials the event will occur exactly $n$ times is given by

$$\mathrm{prob}(n) = \frac{N!}{n!(N - n)!}p^n(1 - p)^{N-n}. \tag{A.5.1}$$

Because this is the formula for a term in the expansion of $[p + (1 - p)]^N$ this is called the binomial distribution.

The extension to more than two kinds of events is straightforward and is given by the multinomial distribution. If $p_1$ is the probability of event 1, $p_2$ the probability of event 2, $p_3$ the probability of event 3, and so on, then the probability that in $N$ trials event 1 will occur $n_1$ times, event 2 $n_2$ times, event 3 $n_3$ times, and so on, is

$$\mathrm{prob}(n_1, n_2, \ldots) = \frac{N!}{n_1!\,n_2!\,n_3!\cdots}\,p_1^{n_1}p_2^{n_2}p_3^{n_3}\cdots. \tag{A.5.2}$$

The theoretical mean and variance of the binomial distribution are easily obtained. If $p$ is the probability of an event, then the expected number of occurrences in $N$ trials is $Np$. In a single trial there are two possible numbers of occurrences of the event, 0 or 1, with probability $(1 - p)$ and $p$. The mean number of occurrences is, of course, $p$. Thus, by A.2.6 the variance is

$$V = p(1 - p)^2 + (1 - p)(0 - p)^2 = p(1 - p). \tag{A.5.3}$$

Therefore, in $N$ trials the variance of the number of occurrences is (by A.2.13) $N$ times as large, or

$$V_N = Np(1 - p). \tag{A.5.4}$$

Likewise, from A.2.15, the variance of the proportion of occurrences in $N$ trials of an event with probability $p$ is

$$V_p = \frac{p(1 - p)}{N}. \tag{A.5.5}$$
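The mean and variance of the binomial distribution can be confirmed by summing over the individual terms. In this Python sketch the choice of 20 trials and a probability of 0.3 is arbitrary, and the function name is invented for the example; the mean and variance are computed from the probabilities themselves, in the manner of the weighted formulae of Section A.2.

```python
import math

def binomial_probability(n, N, p):
    """Probability of exactly n occurrences in N independent trials."""
    return math.comb(N, n) * p ** n * (1 - p) ** (N - n)

N, p = 20, 0.3   # arbitrary illustrative values
probs = [binomial_probability(n, N, p) for n in range(N + 1)]

# Mean and variance of the number of occurrences, weighting by probability.
mean = sum(n * q for n, q in enumerate(probs))
variance = sum(q * (n - mean) ** 2 for n, q in enumerate(probs))

# Variance of the proportion n/N: dividing by N scales the variance by 1/N**2.
variance_of_proportion = variance / N ** 2

print(sum(probs), mean, variance, variance_of_proportion)
```

The printed values should agree with a total probability of 1, a mean of Np, a variance of Np(1 - p), and a variance of the proportion of p(1 - p)/N.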
A case of special interest arises when $p$ is allowed to approach 0 at the same time that $N$ becomes indefinitely large in such a way that the product $Np$ remains of moderate value. The limiting form approached in this manner is the Poisson distribution. The probability of exactly $n$ occurrences of the event when the mean number is $\mu$ ($= Np$) is given by

$$\mathrm{prob}(n) = \frac{e^{-\mu}\mu^n}{n!}. \tag{A.5.6}$$

In particular, the probability of no occurrence is $e^{-\mu}$. The variance formula is easily obtained from formula A.5.4. As $p$ approaches 0 while $Np = \mu$ remains finite, the variance approaches

$$V = \mu. \tag{A.5.7}$$
That is to say, the variance is the same as the mean for the Poisson distribution.

The distribution of measurements of many biological materials, and of a great many other things, is often approximated by a symmetrical, bell-shaped curve such as is shown in Figure A.5.1. This is the curve of the normal distribution and has the equation

$$y = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(X - \mu)^2/2\sigma^2}, \tag{A.5.8}$$

where $\mu$ is the mean and $\sigma^2$ is the variance. The normal distribution is also the limit of the binomial distribution as $N$ gets large while $p$ remains finite. As can be seen from the figure, approximately 68% of the observations are within one standard deviation of the mean and about 96% are within two standard deviations. The proportion of observations lying between any two multiples of the standard deviation may be obtained by numerical integration of A.5.8 between the appropriate limits.

Figure A.5.1. The normal distribution curve. The horizontal axis runs from $-3\sigma$ to $3\sigma$; 68% of the area lies within $\pm\sigma$ of the mean, 96% within $\pm 2\sigma$, and 99.7% within $\pm 3\sigma$.

If the observations are not normally distributed, it is often possible to transform them to yield a distribution that is approximately normal. For example, data that are skewed are often rendered roughly normal by replacing the numbers with their logarithms or square roots.

A convenient property of the normal distribution is that with large samples the mean tends to be normally distributed even when the parent distribution is not normal. Therefore, if the standard deviation of the mean (often called standard error) is known, it is possible to use the normal distribution to give an idea of the precision of the estimate. For example, with an appropriately drawn sample, the mean of the sample is expected to lie within two standard errors of the population mean about 96% of the time.

A.6 Significance Tests and Confidence Limits

1. Significance Tests for Enumeration Data: The Chi-square Test
We frequently want to compare observed results with those values that are predicted on the basis of some hypothesis. How far from the expectation must the observations be before they can be regarded as real differences and not simply statistical accidents ? One widely used approach is to ask the question : What is the probability that, if the expectations are correct, results would be obtained that deviate from the expectations by as much as or more than those which were observed ? An approximate answer to this question is given by the Chi-square test. The value of X 2 is given by 2 (ObSerVed - expect ed) 2 = A.6.1 X L expected From the value of X 2 and one other quantity, the number of degrees of
[
]
.
freedom, the desired probabil ity can be read from a table or chart. A chart is given at the end of this chapter. The n umber of degrees of freedom is the number of classes of observa tions minus one, minus the number of parameters estimated from the data. To read the chart, look along the lower horizontal axis until the value of X 2 i s reached ; then go up from this poi nt to the line corresponding to the number of degrees of freedom. The probability i s directly to the left.
494 A P P E N DI X
As an example, suppose a plant breeder who expected a 9 : 3 : 3 : I ratio in an F2 cross observed 98, 22, 25, and 1 5 i n the four categories. Since the total number is 1 60, the expected n umbers on the 9 : 3 : 3 : I hypothesis are 90, 30, 30, and 1 0. He would compute l by 2
2
2 (22 - 30) (25 30) ( 1 5 - 1 0) 2 (98 - 90) = X3 + 30 30 90 + 10 + =
6. 1 8,
prob
=
2
. 1 1.
There are three degrees of freedom, indicated in the subscript to $\chi^2_3$. There are four categories of plants. In this case no parameters were estimated from the data, so the number of degrees of freedom is 4 - 1 = 3. The probability, 0.11, is obtained from the chart as the probability associated with a $\chi^2$ of 6.18 and 3 degrees of freedom. The interpretation is that, if one were to repeat this experiment 100 times with 9:3:3:1 expectations, 11 times one would expect to get results that deviate from the expected 90, 30, 30, and 10 by as much as or more than the observed set of results did. Stated another way, and somewhat more precisely, the probability of obtaining a value of $\chi^2$ equal to or greater than that observed, if the hypothesis is correct, is .11. This is useful practically because the probability given by the $\chi^2$ procedure is a good approximation to that obtained by an exact calculation based on the multinomial distribution.

As a second example, consider the data in Table A.6.1. These are for a double backcross in mink where the theoretical expectation with equal viabilities and no linkage is 1:1:1:1. The data are from R. M. Shackleford. Again, there are 3 degrees of freedom, but this time the value of $\chi^2$ is much larger. From the chart it can be seen that the probability is less than the smallest value on the chart, .0001. With probabilities as small as this we know that either (1) the hypothesis was wrong, or (2) a very improbable event has happened. In this case, we conclude that something is wrong with the hypothesis; to a geneticist it is obvious that what is wrong is that the genes are linked.

By convention, if the probability given by the chart is less than .05 we say that the difference between the observations and the expectations is significant. If the probability is less than .01 we say that the difference is highly significant.
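The first example can be checked numerically. The following is a minimal sketch, not part of the original text: the function names are hypothetical, and the closed-form tail probability used here holds only for exactly 3 degrees of freedom (it stands in for reading the chart).

```python
import math

def chi_square(observed, expected):
    """Pearson chi-square: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def upper_tail_prob_3df(x):
    """P(chi-square >= x) for exactly 3 degrees of freedom (closed form)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

observed = [98, 22, 25, 15]
expected = [90, 30, 30, 10]          # 9:3:3:1 applied to 160 plants
stat = chi_square(observed, expected)
prob = upper_tail_prob_3df(stat)
print(round(stat, 2), round(prob, 2))
```

The computed probability comes out close to 0.1, in line with the value of about 0.11 read from the chart.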
The additivity property of $\chi^2$ is illustrated by the further analysis in Table A.6.1. Comparison number 2 tests the segregation ratio for the E, e locus. Here, $\chi^2 = 6.05$ for 1 degree of freedom. This corresponds to a probability a little larger than 0.01. This is significant, but not highly significant; we conclude that the hypothesis of equality is doubtful, the departure probably being attributable to viability differences. Comparison number 3 tests the segregation ratio of the B, b locus. Here $\chi^2 = 0.80$, corresponding to a probability of about 0.4; there is no reason to doubt the correctness of this expectation. Comparison number 4 tests the hypothesis of nonlinkage.
Table A.6.1. An example of the calculation of $\chi^2$.

COMPARISON   PHENOTYPE           GENOTYPE   OBSERVED (O)   EXPECTED (E)   (O - E)^2 / E
1            Ebony               Ee Bb      22             20              0.200
             Palomino            Ee bb       7             20              8.450
             Dark                ee Bb      14             20              1.800
             Pastel              ee bb      37             20             14.450
             TOTAL                          80             80             24.900 = $\chi^2_3$

2            Ebony + Palomino               29             40              3.025
             Dark + Pastel                  51             40              3.025
             TOTAL                          80             80              6.050 = $\chi^2_1$

3            Ebony + Dark                   36             40              0.400
             Palomino + Pastel              44             40              0.400
             TOTAL                          80             80              0.800 = $\chi^2_1$

4            Ebony + Pastel                 59             40              9.025
             Palomino + Dark                21             40              9.025
             TOTAL                          80             80             18.050 = $\chi^2_1$

TOTAL for 2, 3, and 4                                                     24.900 = $\chi^2_3$
Here there is a large $\chi^2$, corresponding to a probability of less than .0001. So we conclude that the two loci are definitely not independent. By this analysis the total $\chi^2$ for 3 degrees of freedom has been broken up into 3 components, each with 1 degree of freedom and each testing a separate aspect of the observations. The additivity principle is illustrated by the fact that these three values, each of 1 degree of freedom, add up to the original $\chi^2$ with 3 degrees of freedom.
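The partition in Table A.6.1 can be reproduced with a few lines of arithmetic. This is an illustrative sketch only; the variable names are assumptions introduced here, and the counts are those of the table.

```python
def chi_square(observed, expected):
    """Pearson chi-square: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Observed counts from Table A.6.1
ebony, palomino, dark, pastel = 22, 7, 14, 37
total = ebony + palomino + dark + pastel

# Comparison 1: all four classes against 1:1:1:1 (3 degrees of freedom)
chi_total = chi_square([ebony, palomino, dark, pastel], [total / 4] * 4)
# Comparisons 2, 3, 4: one degree of freedom each
chi_E = chi_square([ebony + palomino, dark + pastel], [total / 2] * 2)     # E, e segregation
chi_B = chi_square([ebony + dark, palomino + pastel], [total / 2] * 2)     # B, b segregation
chi_link = chi_square([ebony + pastel, palomino + dark], [total / 2] * 2)  # nonlinkage
print(chi_total, chi_E + chi_B + chi_link)
```

The three single-degree-of-freedom components sum to the overall value, which is the additivity property described above.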
As a final example, consider the data in Table 2.2.2 in Chapter 2. In this case we have

            MM       MN       NN       TOTAL
OBSERVED    363      634      282      1279
EXPECTED    361.5    636.9    280.5    1278.9

$$\chi^2_1 = 0.027, \qquad \text{probability} = 0.87.$$
There is no reason to question the assumptions that led to these expectations, namely, correct estimation of the gene frequencies, random mating, equal viabilities, no segregation bias, etc. This differs from the earlier examples in one regard, however. The number of degrees of freedom is not 2, as might have been expected, but 1. This is because the data themselves were used to estimate the gene frequency on which the expectations are based. Therefore one more degree of freedom has been removed; the number of degrees of freedom, then, is 3 - 1 - 1 = 1. The manner in which the allele frequency was computed is given in Table 2.1.1.

There are two cautions that should be mentioned. The first is that one must use numbers, not proportions or percentages, in the calculations. The second is that the method is only approximate, and the degree of approximation gets worse as the numbers become small. A conservative rule is not to use the $\chi^2$ method if any expected number is less than 5.

2. Confidence Limits for Enumeration Data
A somewhat similar problem arises when statistical procedures are used for estimation of some population parameter, rather than to test a hypothesis. This is done by determining fiducial or confidence limits. (Actually there are subtle differences in these two concepts, but they do not enter into the computations for the kinds of examples that we are discussing.) If the sample is large enough that the normal distribution can be invoked, we can assign confidence limits as follows. If p is the estimate and s is the standard deviation of the estimate, then the approximate upper and lower confidence limits are

$$\text{upper limit} = L_u = p + ts, \qquad \text{lower limit} = L_l = p - ts, \tag{A.6.2}$$

where t is chosen according to the probability level desired. The value of t is obtained from the t chart at the end of this appendix, using infinite degrees of freedom and a probability corresponding to the complement of the confidence level desired. For 95% "confidence," we choose t corresponding to a probability of .05.
For convenience, here are t values corresponding to several frequently used confidence levels.

CONFIDENCE LEVEL    t
.50                 0.67
.90                 1.64
.95                 1.96
.98                 2.33
.99                 2.58
Here is a simple numerical illustration. Suppose in a count of 400 persons, 160 are found to have blue eyes. These are a random sample of a large population (theoretically infinite) within which we should like to estimate the true frequency of blue eyes. On the basis of the sample, the estimate is p = 160/400 = .40. The variance of this proportion, from A.5.5, is (.4)(1 - .4)/400 = .0006. The standard deviation, s, is the square root of this, or .0245. If we desire a 98% confidence statement we choose t = 2.33. Thus the upper limit is approximately

$$L_u = .40 + (2.33)(.0245) = .457$$

and the lower limit is

$$L_l = .40 - (2.33)(.0245) = .343.$$
We therefore have 98% "confidence" that the value lies between .343 and .457.

We can get an improvement in precision, especially when p is far from 0.5, by asking the following questions: What value of the true proportion would lead to a probability $\alpha$ of getting the observed number or something larger, and what value would lead to a probability $\alpha$ of getting the observed number or something less? These two values would then be the limits for the confidence value of $1 - 2\alpha$.

Suppose that p is the observed proportion and $\pi$ is the true value. The probability of equalling or exceeding p in a sample of N is gotten by computing $t = (p - \pi)/s$, where $s = \sqrt{\pi(1 - \pi)/N}$, and looking up the probability in a table or chart of the normal integral. The upper limit is gotten the same way. Considering both upper and lower limits, we solve the quadratic,

$$t^2 = \frac{N(p - \pi)^2}{\pi(1 - \pi)},$$

or

$$(N + t^2)\pi^2 - (2Np + t^2)\pi + Np^2 = 0, \tag{A.6.3}$$

and the two solutions for $\pi$ are the upper and lower limits.
Thus

$$L_u = \frac{(2Np + t^2) + \sqrt{(2Np + t^2)^2 - 4Np^2(N + t^2)}}{2(N + t^2)}, \qquad L_l = \frac{(2Np + t^2) - \sqrt{(2Np + t^2)^2 - 4Np^2(N + t^2)}}{2(N + t^2)}. \tag{A.6.4}$$
For example, suppose we observe 8 successes in 40 observations. Then p = 8/40 = 0.2 and N = 40. If we desire 99% confidence limits, we choose t = 2.58, leading to 0.086 and 0.400 as the lower and upper 99% limits. If we had used equations A.6.2 we would have gotten the less accurate values 0.037 and 0.363.

This procedure is only as accurate as the normal approximation to the binomial. The exact answers may be obtained by computing the appropriate binomial values. Charts giving binomial limits are available in standard handbooks (Beyer, 1966; Pearson and Hartley, 1958). From these charts the exact limits to the problem above are 0.07 and 0.41, so A.6.4 gave a very good approximation.

The exact meaning of fiducial and confidence limits has been a matter of a great deal of controversy. Some have questioned whether it is proper to speak of the probability that the "true" value lies between certain limits. For an interesting and characteristically polemic discussion of the deeper meanings, see Fisher (1956).
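The two sets of limits for the example of 8 successes in 40 observations can be compared directly. The sketch below is illustrative; the function names are assumptions made here, and the formulas are those of A.6.2 and A.6.4.

```python
import math

def simple_limits(p, n, t):
    """Limits p - t*s and p + t*s of A.6.2, with s = sqrt(p(1 - p)/n)."""
    s = math.sqrt(p * (1 - p) / n)
    return p - t * s, p + t * s

def quadratic_limits(p, n, t):
    """Roots of (n + t^2) pi^2 - (2np + t^2) pi + n p^2 = 0, as in A.6.4."""
    b = 2 * n * p + t * t
    root = math.sqrt(b * b - 4 * n * p * p * (n + t * t))
    denom = 2 * (n + t * t)
    return (b - root) / denom, (b + root) / denom

print(simple_limits(0.2, 40, 2.58))
print(quadratic_limits(0.2, 40, 2.58))
```

Unlike the symmetric limits of A.6.2, the quadratic limits are not centered on p, which is why they track the exact binomial limits more closely when p is far from 0.5.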
3. Significance Tests for Measurement Data
If the data are from measurements rather than counts, the appropriate test of significance is based on the t distribution. If we have measurements on two groups and would like to know if their means are significantly different we compute t as follows:

$$t = \frac{\bar{X}_a - \bar{X}_b}{\sqrt{\dfrac{(N_a + N_b)[(N_a - 1)V_a + (N_b - 1)V_b]}{N_a N_b (N_a + N_b - 2)}}}, \tag{A.6.5}$$

where $\bar{X}_a$ and $\bar{X}_b$ are the means of the two samples and $V_a$ and $V_b$ the two variances, computed from A.3.1,

$$V = \frac{\sum (X_i - \bar{X})^2}{N - 1}.$$

$N_a$ and $N_b$ are the numbers of measurements in the two samples.
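As a rough sketch of how A.6.5 is computed in practice, the following uses made-up measurements; the data and the function names are hypothetical and serve only to show the arithmetic.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    """Sample variance with divisor N - 1, as in A.3.1."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def two_sample_t(a, b):
    """t of A.6.5 together with its degrees of freedom, Na + Nb - 2."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (na + nb) / (na * nb))
    return t, na + nb - 2

# Hypothetical measurements, for illustration only
group_a = [12.1, 11.4, 13.0, 12.6, 11.9]
group_b = [10.8, 11.1, 10.2, 11.6]
t, df = two_sample_t(group_a, group_b)
print(round(t, 3), df)
```

The resulting t and degrees of freedom would then be looked up in the t chart.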
This value of t is introduced into the t chart exactly as $\chi^2$ was in the $\chi^2$ chart. The number of degrees of freedom is $N_a + N_b - 2$. A probability less than .05 is interpreted as a significant difference and less than .01 is conventionally regarded as highly significant.

If the two sets of measurements are paired in some way (for example, if they are measures of blood pressure of the same person before and after administration of a drug) it is usually more appropriate to test the hypothesis that the two means do not differ by treating each of the differences as a variable. If $d_1 = X_{a1} - X_{b1}$, the difference between the first measurements in each group, and so on,

$$t = \frac{\bar{d}}{\sqrt{V_{\bar{d}}}}, \tag{A.6.6}$$

where

$$V_{\bar{d}} = \frac{\sum (d_i - \bar{d})^2}{N(N - 1)} = \frac{\sum d_i^2 - N\bar{d}^2}{N(N - 1)}.$$
This value of t is entered into the chart with N - 1 degrees of freedom. The probability is that of obtaining an absolute value of $\bar{d}$ as large as or larger than the observed value if the true population difference is 0.

4. The Significance of a Correlation Coefficient

When a correlation coefficient has been measured and we desire to know whether it is significantly different from 0, the t test is again appropriate. In this case

$$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{N - 2}}}, \tag{A.6.7}$$

where r is the observed correlation coefficient and N is the number of pairs of measurements. The appropriate number of degrees of freedom is N - 2.

5. Confidence Limits for the Mean with Measurement Data
The confidence limits for the mean of a series of measurements are obtained in the same way as in A.6.2. The upper and lower limits are

$$L_u = \bar{X} + t s_{\bar{X}}, \qquad L_l = \bar{X} - t s_{\bar{X}}, \tag{A.6.8}$$

where $\bar{X}$ is the observed mean, $s_{\bar{X}}$ is the standard deviation of the mean (the square root of its variance), and t is chosen to correspond to the probability desired with degrees of freedom equal to N - 1.
For example, if the mean of 11 sample items is 25 and the standard deviation of the mean is computed to be 4, we might desire the 95% confidence limits. Looking for the value of t corresponding to P = .05 and 10 degrees of freedom, we find from the $\chi^2$ and t chart that the value is between 2.2 and 2.3 (actually the value is 2.23). Thus the 95% confidence limits for the population mean are 25 ± (2.23)(4), or 16.1 and 33.9.

The t values in the $\chi^2$ and t chart and in the table in Section A.6.2 are for two-tailed tests. That is, they give combined probabilities for deviations in both directions. If deviations in only one direction are considered, then the probability of a deviation as large or larger than that observed is only half as large. If confidence intervals are desired where a deviation in only one direction makes sense, then choose a t value corresponding to a probability twice as large as that given in the chart, e.g., choose t corresponding to 0.02 instead of 0.01.

A.7 Matrices and Determinants
A matrix, A, is a set of quantities arranged in rectangular form, such as

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}, \tag{A.7.1}$$

where the first subscript indicates the row and the second the column. The element in row i and column j is thus designated by $a_{ij}$. The dimensions of a matrix are specified by the numbers of rows and columns; a matrix with m rows and n columns is called an m x n matrix.

If two matrices have the same dimensions, the processes of addition and subtraction are direct. The corresponding elements are simply added (or subtracted). If matrices A and B are added to produce matrix C, then $c_{ij} = a_{ij} + b_{ij}$. For example,

$$\begin{pmatrix} 3 & 6 & 1 \\ 2 & 1 & 0 \end{pmatrix} + \begin{pmatrix} 2 & -4 & 2 \\ 3 & 5 & 6 \end{pmatrix} = \begin{pmatrix} 5 & 2 & 3 \\ 5 & 6 & 6 \end{pmatrix}.$$
The rule for the multiplication of two matrices is more complicated. The element in row i and column j of the product matrix is the sum of the products of the elements of row i of the first matrix and column j of the second. That is

$$c_{ij} = \sum_k a_{ik} b_{kj}. \tag{A.7.2}$$

This makes it necessary that the first matrix have the same number of columns as the second has rows. For example, if the first row of the first matrix is (3, -2, 6) and the second column of the second matrix has the elements 1, 1, and 4, the element in the first row and second column of the product matrix is (3 x 1) + (-2 x 1) + (6 x 4) = 25. The product of an m x k and a k x n matrix is of dimension m x n.

Notice that with matrices AB is not necessarily equal to BA. In fact, unless special restrictions are placed on the dimensions of A and B one of the products may not even be defined. For example if A is m x k and B is k x n, A x B is possible, but B x A is not (unless n = m).

A matrix with only one row is called a row vector; a matrix with only one column is called a column vector. In population genetics, the most usual multiplication is (m x m)(m x 1); that is, a square matrix by a column vector.
Multiplication of a matrix by a constant yields a matrix in which every element is multiplied by that number: e.g.

$$kA = \begin{pmatrix} ka_{11} & ka_{12} & ka_{13} \\ ka_{21} & ka_{22} & ka_{23} \\ ka_{31} & ka_{32} & ka_{33} \end{pmatrix}. \tag{A.7.3}$$
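Note that this is equivalent to multiplication by the matrix

$$\begin{pmatrix} k & 0 & 0 \\ 0 & k & 0 \\ 0 & 0 & k \end{pmatrix}.$$

When k = 1 we have the unit or identity matrix, usually designated by I:

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = I. \tag{A.7.4}$$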
The unit matrix plays the same role in the algebra of matrices as the number 1 does in ordinary algebra. Notice that multiplication of a square matrix by the unit matrix of the same dimensions leads to the original matrix.

A square matrix has a corresponding determinant, written as

$$|A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{vmatrix}. \tag{A.7.5}$$
The minor of an element in a determinant is the determinant that remains when the row and column of the element are deleted. The cofactor of an element is the minor, prefixed by a sign which is determined by the position of the element. If the sum of the subscripts is even the sign is positive; if the sum is odd the sign is negative. We shall designate the cofactor of the element $a_{ij}$ by $A_{ij}$. The value of a determinant is given by taking any row or column and summing the product of each element in the row or column by its cofactor. Thus, in the determinant above the cofactor of $a_{32}$ is

$$A_{32} = (-1)^{3+2} \begin{vmatrix} a_{11} & a_{13} & a_{14} \\ a_{21} & a_{23} & a_{24} \\ a_{41} & a_{43} & a_{44} \end{vmatrix}, \tag{A.7.6}$$

and the value of the determinant is

$$|A| = a_{11}A_{11} + a_{21}A_{21} + a_{31}A_{31} + a_{41}A_{41}. \tag{A.7.7}$$

In this case the first column was used as the basis for expansion, but any other row or column could have been; for example

$$|A| = a_{31}A_{31} + a_{32}A_{32} + a_{33}A_{33} + a_{34}A_{34}.$$
Each cofactor may itself be evaluated as a determinant, and so on until the cofactors are single elements.

The inverse of a square matrix is the analog of the reciprocal of a number. The inverse of a square matrix A is denoted by $A^{-1}$ and satisfies the relation $AA^{-1} = A^{-1}A = I$. The process of obtaining the inverse is rather troublesome. To obtain an element, which is designated $a^{ij}$, we consider the corresponding element of the original matrix with rows and columns transposed, $a_{ji}$. We take the cofactor of this element and divide by the determinant of the entire original matrix. In symbols, the element $a^{ij}$ in the inverse matrix $A^{-1}$ is given by $A_{ji}/|A|$ in the original matrix. If the value of the determinant is 0, the matrix has no inverse. As an example, consider the square matrix,

$$A = \begin{pmatrix} 3 & 1 & 1 \\ 2 & 2 & 1 \\ 2 & 2 & 3 \end{pmatrix}. \tag{A.7.8}$$
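The determinant of this matrix is

$$|A| = 3\begin{vmatrix} 2 & 1 \\ 2 & 3 \end{vmatrix} - 2\begin{vmatrix} 1 & 1 \\ 2 & 3 \end{vmatrix} + 2\begin{vmatrix} 1 & 1 \\ 2 & 1 \end{vmatrix} = 3 \times 4 - 2 \times 1 + 2(-1) = 8.$$

The cofactors are

$$A_{11} = (-1)^2 \begin{vmatrix} 2 & 1 \\ 2 & 3 \end{vmatrix} = 4, \qquad A_{12} = (-1)^3 \begin{vmatrix} 2 & 1 \\ 2 & 3 \end{vmatrix} = -4,$$

and so on, so the matrix of cofactors is

$$\begin{pmatrix} 4 & -4 & 0 \\ -1 & 7 & -4 \\ -1 & -1 & 4 \end{pmatrix}.$$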
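The inverse matrix is obtained by dividing by |A| and interchanging rows and columns,

$$A^{-1} = \begin{pmatrix} \frac{4}{8} & -\frac{1}{8} & -\frac{1}{8} \\ -\frac{4}{8} & \frac{7}{8} & -\frac{1}{8} \\ 0 & -\frac{4}{8} & \frac{4}{8} \end{pmatrix}. \tag{A.7.8a}$$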
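The cofactor procedure can be written out in a few lines. This is a sketch for small matrices only, with hypothetical function names; it uses exact fractions and takes the matrix of the worked example to be the one with rows (3, 1, 1), (2, 2, 1), and (2, 2, 3), which is an assumption consistent with the determinant of 8 and the cofactors 4 and -4 quoted in the text.

```python
from fractions import Fraction

def minor(m, i, j):
    """Matrix left after deleting row i and column j (zero-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def inverse(m):
    """Element (i, j) of the inverse is the cofactor of element (j, i) over |m|."""
    d = det(m)
    if d == 0:
        raise ValueError("matrix has no inverse")
    n = len(m)
    return [[Fraction((-1) ** (i + j) * det(minor(m, j, i)), d) for j in range(n)]
            for i in range(n)]

A = [[3, 1, 1], [2, 2, 1], [2, 2, 3]]
A_inv = inverse(A)
# Product of A and its inverse, which should be the identity matrix
product = [[sum(A[i][k] * A_inv[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]
print(det(A), A_inv, product)
```

Cofactor expansion grows factorially with matrix size, so this approach is only sensible for the small matrices met in examples of this kind.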
It can easily be verified in this example that the product of A and its inverse is the identity matrix; $AA^{-1} = I$.

Although the procedure given here always works, it is not the best for computational efficiency. Much more rapid procedures for large matrices are given in textbooks on numerical methods (e.g., Faddeev and Faddeeva, 1963).

As an example of the use of matrices in population genetics, consider successive generations of self-fertilization. We let D, H, and R stand for the proportions of dominant homozygotes, heterozygotes, and recessive homozygotes. Using the subscript t to indicate time measured in generations we can write the frequencies in generation t in terms of the corresponding frequencies in the previous generation. With self-fertilization (see Table 3.1.1),

$$D_t = D_{t-1} + \tfrac{1}{4}H_{t-1}, \qquad H_t = \tfrac{1}{2}H_{t-1}, \qquad R_t = \tfrac{1}{4}H_{t-1} + R_{t-1}. \tag{A.7.9}$$

This can be written in matrix form

$$\begin{pmatrix} D_t \\ H_t \\ R_t \end{pmatrix} = \begin{pmatrix} 1 & \frac{1}{4} & 0 \\ 0 & \frac{1}{2} & 0 \\ 0 & \frac{1}{4} & 1 \end{pmatrix} \begin{pmatrix} D_{t-1} \\ H_{t-1} \\ R_{t-1} \end{pmatrix}. \tag{A.7.10}$$
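Since there is the same relation between the genotype frequencies at times t - 1 and t - 2 as between those at times t and t - 1, we can write

$$\begin{pmatrix} D_t \\ H_t \\ R_t \end{pmatrix} = \begin{pmatrix} 1 & \frac{1}{4} & 0 \\ 0 & \frac{1}{2} & 0 \\ 0 & \frac{1}{4} & 1 \end{pmatrix}^2 \begin{pmatrix} D_{t-2} \\ H_{t-2} \\ R_{t-2} \end{pmatrix} = \begin{pmatrix} 1 & \frac{1}{4} & 0 \\ 0 & \frac{1}{2} & 0 \\ 0 & \frac{1}{4} & 1 \end{pmatrix}^t \begin{pmatrix} D_0 \\ H_0 \\ R_0 \end{pmatrix}. \tag{A.7.11}$$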
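If A is the matrix in A.7.11, then by the rule for matrix multiplication

$$A^2 = \begin{pmatrix} 1 & \frac{3}{8} & 0 \\ 0 & \frac{1}{4} & 0 \\ 0 & \frac{3}{8} & 1 \end{pmatrix}, \qquad A^3 = \begin{pmatrix} 1 & \frac{7}{16} & 0 \\ 0 & \frac{1}{8} & 0 \\ 0 & \frac{7}{16} & 1 \end{pmatrix}.$$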
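As a numerical example, consider a population in which the initial frequencies of the three types are $D_0 = 1/4$, $H_0 = 1/2$, and $R_0 = 1/4$. What are the frequencies after three generations of self-fertilization? This is given by

$$\begin{pmatrix} D_3 \\ H_3 \\ R_3 \end{pmatrix} = \begin{pmatrix} 1 & \frac{7}{16} & 0 \\ 0 & \frac{1}{8} & 0 \\ 0 & \frac{7}{16} & 1 \end{pmatrix} \begin{pmatrix} \frac{1}{4} \\ \frac{1}{2} \\ \frac{1}{4} \end{pmatrix}, \tag{A.7.12}$$

from which we can write

$$D_3 = 1 \times \tfrac{1}{4} + \tfrac{7}{16} \times \tfrac{1}{2} = \tfrac{15}{32}, \qquad H_3 = \tfrac{1}{8} \times \tfrac{1}{2} = \tfrac{1}{16}, \qquad R_3 = \tfrac{7}{16} \times \tfrac{1}{2} + 1 \times \tfrac{1}{4} = \tfrac{15}{32}. \tag{A.7.13}$$

These are the same values obtained by another method in Table 3.1.1.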
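The same result follows from applying the one-generation matrix three times in succession. A minimal sketch, with a hypothetical helper name and exact fractions:

```python
from fractions import Fraction as F

def mat_vec(m, v):
    """Product of a square matrix and a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]

# One generation of self-fertilization, as in A.7.10
A = [[F(1), F(1, 4), F(0)],
     [F(0), F(1, 2), F(0)],
     [F(0), F(1, 4), F(1)]]

freqs = [F(1, 4), F(1, 2), F(1, 4)]   # D0, H0, R0
for _ in range(3):                     # three generations of selfing
    freqs = mat_vec(A, freqs)
print(freqs)
```

Repeated multiplication of this kind is what the eigenvalue method is designed to avoid when the number of generations is large.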
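A.8 Eigenvalues and Eigenvectors

We would like to have a way to obtain quantities after several generations without the necessity for repeated matrix multiplication. This can often be done. Consider the recurrence relations

$$\begin{aligned} x_t &= a_{11}x_{t-1} + a_{12}y_{t-1} + a_{13}z_{t-1}, \\ y_t &= a_{21}x_{t-1} + a_{22}y_{t-1} + a_{23}z_{t-1}, \\ z_t &= a_{31}x_{t-1} + a_{32}y_{t-1} + a_{33}z_{t-1}, \end{aligned} \tag{A.8.1}$$

where, as before, the subscript t measures time in generations. In matrix form

$$\begin{pmatrix} x_t \\ y_t \\ z_t \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \begin{pmatrix} x_{t-1} \\ y_{t-1} \\ z_{t-1} \end{pmatrix}. \tag{A.8.2}$$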
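We consider the related quantities l, m, and n, chosen so that

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \begin{pmatrix} l \\ m \\ n \end{pmatrix} = \lambda \begin{pmatrix} l \\ m \\ n \end{pmatrix}. \tag{A.8.3}$$

The column vector is called an eigenvector. Equations A.8.3 may be written

$$\begin{aligned} a_{11}l + a_{12}m + a_{13}n &= \lambda l, \\ a_{21}l + a_{22}m + a_{23}n &= \lambda m, \\ a_{31}l + a_{32}m + a_{33}n &= \lambda n, \end{aligned} \tag{A.8.4}$$

which is the same as

$$\begin{aligned} (a_{11} - \lambda)l + a_{12}m + a_{13}n &= 0, \\ a_{21}l + (a_{22} - \lambda)m + a_{23}n &= 0, \\ a_{31}l + a_{32}m + (a_{33} - \lambda)n &= 0. \end{aligned} \tag{A.8.4a}$$

From the rules of algebra, we know that this system of homogeneous equations has a nontrivial solution for l, m, and n if and only if

$$\begin{vmatrix} a_{11} - \lambda & a_{12} & a_{13} \\ a_{21} & a_{22} - \lambda & a_{23} \\ a_{31} & a_{32} & a_{33} - \lambda \end{vmatrix} = 0, \tag{A.8.5}$$

or, in abbreviated form,

$$|A - \lambda I| = 0, \tag{A.8.5a}$$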
where I is the identity matrix.

A.8.5a is the characteristic equation of the matrix A and has three roots. We assume that the three roots, $\lambda_1$, $\lambda_2$, and $\lambda_3$, are distinct; that is, that no two have the same value. These roots are called eigenvalues (also characteristic roots, latent roots, characteristic values, and proper values). When $\lambda_1$ is substituted into A.8.4a, the equations can be solved for the ratios of $l_1$, $m_1$, and $n_1$. Likewise, $l_2$, $m_2$, and $n_2$ are obtained by substituting $\lambda_2$; and so on for $l_3$, $m_3$, and $n_3$.

We note here that l, m, and n are not uniquely determined, since (for example) we could divide each equation by n and have three equations in two variables, l/n and m/n. However, the ratio l : m : n can be uniquely determined if condition A.8.5 is met; so one of the three can be chosen arbitrarily.

The relation between $x_{t-1}$ and $x_{t-2}$ is the same as that between $x_t$ and $x_{t-1}$. Therefore, from A.8.2, we can write
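$$\begin{pmatrix} x_t \\ y_t \\ z_t \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}^2 \begin{pmatrix} x_{t-2} \\ y_{t-2} \\ z_{t-2} \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}^t \begin{pmatrix} x_0 \\ y_0 \\ z_0 \end{pmatrix}. \tag{A.8.6}$$

The initial values $x_0$, $y_0$, and $z_0$ are known, and are treated as constants. They can be expressed in terms of the l, m, and n's by

$$\begin{aligned} x_0 &= c_1 l_1 + c_2 l_2 + c_3 l_3, \\ y_0 &= c_1 m_1 + c_2 m_2 + c_3 m_3, \\ z_0 &= c_1 n_1 + c_2 n_2 + c_3 n_3. \end{aligned} \tag{A.8.7}$$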
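If the three eigenvalues are distinct, these equations can be solved and the c's thereby determined. Then a general solution is possible. Substituting A.8.7 into A.8.6 gives

$$\begin{pmatrix} x_t \\ y_t \\ z_t \end{pmatrix} = c_1 A^t \begin{pmatrix} l_1 \\ m_1 \\ n_1 \end{pmatrix} + c_2 A^t \begin{pmatrix} l_2 \\ m_2 \\ n_2 \end{pmatrix} + c_3 A^t \begin{pmatrix} l_3 \\ m_3 \\ n_3 \end{pmatrix}, \tag{A.8.8}$$

where A is the matrix of the $a_{ij}$. From A.8.3, each eigenvector satisfies $A\mathbf{x} = \lambda\mathbf{x}$. Notice that, by the rules of matrix multiplication, if $\mathbf{x}$ is such a column vector,

$$A^2\mathbf{x} = A(A\mathbf{x}) = A(\lambda\mathbf{x}) = \lambda(A\mathbf{x}) = \lambda^2\mathbf{x}.$$

Continuing this, $A^t\mathbf{x} = \lambda^t\mathbf{x}$.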
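Substituting in A.8.8, we obtain

$$\begin{pmatrix} x_t \\ y_t \\ z_t \end{pmatrix} = c_1\lambda_1^t \begin{pmatrix} l_1 \\ m_1 \\ n_1 \end{pmatrix} + c_2\lambda_2^t \begin{pmatrix} l_2 \\ m_2 \\ n_2 \end{pmatrix} + c_3\lambda_3^t \begin{pmatrix} l_3 \\ m_3 \\ n_3 \end{pmatrix},$$

which may be written as

$$\begin{aligned} x_t &= c_1 l_1 \lambda_1^t + c_2 l_2 \lambda_2^t + c_3 l_3 \lambda_3^t, \\ y_t &= c_1 m_1 \lambda_1^t + c_2 m_2 \lambda_2^t + c_3 m_3 \lambda_3^t, \\ z_t &= c_1 n_1 \lambda_1^t + c_2 n_2 \lambda_2^t + c_3 n_3 \lambda_3^t, \end{aligned} \tag{A.8.9}$$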
where the c's, l's, m's, n's, and $\lambda$'s are all constants which have been determined or can be determined. This gives us an explicit expression for x, y, and z in any generation without requiring a series of matrix multiplications.

One of the eigenvalues, say $\lambda_1$, will be the largest and, as t gets large, $\lambda_1^t$ is much larger than $\lambda_2^t$ and $\lambda_3^t$. Thus after a few generations the structure of the population will be determined entirely by the largest or dominant eigenvalue. Hence, for many purposes, only the largest root need be found.

In this book we are mainly concerned with the cases in which the elements of a matrix are probabilities, which are necessarily non-negative. Therefore the largest eigenvalue, $\lambda_1$, is positive. Furthermore, the absolute value of any eigenvalue is not greater than one. As the other roots become small, we see from A.8.9 that

$$x_t \approx c_1 l_1 \lambda_1^t, \qquad y_t \approx c_1 m_1 \lambda_1^t, \qquad z_t \approx c_1 n_1 \lambda_1^t \qquad \text{as } t \to \infty. \tag{A.8.10}$$

This tells us that the frequencies of the three types represented by x, y, and z eventually reach a constant ratio measured by the values of the eigenvector corresponding to the largest eigenvalue. If some of the roots are repeated there may not be a corresponding number of independent eigenvectors, but this situation does not arise in the examples that we are considering.

As a numerical example, consider the recurrence relations for repeated sib mating as given in equations 3.8.6,

$$h_t = k_{t-1}, \qquad k_t = \tfrac{1}{4}h_{t-1} + \tfrac{1}{2}k_{t-1}, \tag{A.8.11}$$
where the quantities are defined in Section 3.8. The quantity of greatest interest is $h_t$, the relative heterozygosity at generation t. The characteristic equation is

$$\begin{vmatrix} -\lambda & 1 \\ \frac{1}{4} & \frac{1}{2} - \lambda \end{vmatrix} = 0. \tag{A.8.12}$$

On expansion of the determinant, we have

$$\lambda^2 - \tfrac{1}{2}\lambda - \tfrac{1}{4} = 0,$$

giving the roots

$$\lambda_1 = \frac{1 + \sqrt{5}}{4} = .809, \qquad \lambda_2 = \frac{1 - \sqrt{5}}{4} = -.309. \tag{A.8.13}$$

The general solution for $h_t$ is given by

$$h_t = a\lambda_1^t + b\lambda_2^t, \tag{A.8.14}$$
where a and b can be determined by the value of h for the first two generations. Putting t = 0 and t = 1, we obtain respectively

$$h_0 = a + b, \qquad h_1 = a\lambda_1 + b\lambda_2.$$

These can be solved for a and b, giving

$$a = \frac{h_1 - \lambda_2 h_0}{\lambda_1 - \lambda_2}, \qquad b = \frac{h_1 - \lambda_1 h_0}{\lambda_2 - \lambda_1}. \tag{A.8.15}$$
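As a numerical check of A.8.13 through A.8.15, the closed-form solution can be compared with direct iteration of the recurrence A.8.11. The sketch below assumes a start from random mating, so that the first two values of h are both 1; the function names are hypothetical.

```python
import math

lam1 = (1 + math.sqrt(5)) / 4      # dominant eigenvalue, about .809
lam2 = (1 - math.sqrt(5)) / 4      # second eigenvalue, about -.309

h0 = h1 = 1.0                      # assumed starting values (random mating)
a = (h1 - lam2 * h0) / (lam1 - lam2)
b = (h1 - lam1 * h0) / (lam2 - lam1)

def h_closed(t):
    """h_t from the eigenvalue solution A.8.14."""
    return a * lam1 ** t + b * lam2 ** t

def h_recursive(t):
    """h_t by iterating h_t = k_(t-1), k_t = h_(t-1)/4 + k_(t-1)/2."""
    h, k = h0, h1                  # k_0 equals h_1
    for _ in range(t):
        h, k = k, 0.25 * h + 0.5 * k
    return h

for t in range(6):
    print(t, round(h_closed(t), 4), round(h_recursive(t), 4))
```

After a few generations the ratio of successive values settles close to the dominant eigenvalue, so the contribution of the second root quickly becomes negligible.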
From the definition, $h_t$ is the amount of heterozygosity expressed as a fraction of the original amount. If we start with a randomly mating population, $h_0 = h_1 = 1$. For example, $h_3 = 5/8$. Notice that when t becomes large

$$h_t \approx a\lambda_1^t, \qquad \frac{h_t}{h_{t-1}} \approx \lambda_1 = .809. \tag{A.8.16}$$

Thus, after a few generations the heterozygosity decreases by a fraction 0.191 each generation.

A.9 The Method of Maximum Likelihood
The principle of this method is to choose as the estimate of the parameter the value that maximizes the probability of the observed results. In addition to its intuitive appeal, this method has been shown by Fisher to be an optimum procedure by several criteria (Fisher, 1958).

As an illustration consider a problem where the answer is already known from other considerations, the estimation of gene frequency in a 2-allele locus with the heterozygote recognizable. We assume random mating.

GENOTYPE               AA      Aa           aa           TOTAL
EXPECTED PROPORTION    p^2     2p(1 - p)    (1 - p)^2    1
OBSERVED NUMBER        A       B            C            N

The probability of the observed numbers, A, B, and C, as a function of the unknown parameter, p, is

$$\text{prob} = \frac{N!}{A!\,B!\,C!}\, p^{2A} [2p(1 - p)]^B (1 - p)^{2C}. \tag{A.9.1}$$
The same value of p that maximizes the probability will also maximize the logarithm of the probability and the algebra is thereby greatly simplified.

$$L = \log \text{prob} = (2A + B)\log p + (B + 2C)\log(1 - p) + K, \tag{A.9.2}$$

where K is a constant. To find the value of p that maximizes L, we equate the derivative of L with respect to p to 0.

$$\frac{dL}{dp} = \frac{2A + B}{p} - \frac{B + 2C}{1 - p} = 0,$$

which has the solution

$$p = \frac{2A + B}{2N} \tag{A.9.3}$$
as expected.

The theoretical variance of a maximum-likelihood estimate, appropriate when the sample is large, is given by the negative reciprocal of the estimated value of the second derivative,

$$\frac{1}{V_p} = -E\left(\frac{d^2L}{dp^2}\right). \tag{A.9.4}$$

The estimated value symbol "E" is taken to mean "replace the observed quantities with their maximum-likelihood estimates."

$$\frac{d^2L}{dp^2} = -\frac{2A + B}{p^2} - \frac{B + 2C}{(1 - p)^2}.$$

Replacing (2A + B) by 2Np and (B + 2C) by 2N(1 - p), we obtain

$$V_p = \frac{p(1 - p)}{2N}, \tag{A.9.5}$$
which is as expected, for this is equivalent to a binomial sample of 2N genes.

When more than one parameter is to be estimated, as with multiple alleles, the procedure is a straightforward extension, though naturally more complicated. Consider the 3-allele model with dominance in Table A.9.1.

Table A.9.1. A 3-allele model with dominance. $A_1$ is dominant to $A_2$ and $A_3$, while $A_2$ is dominant to $A_3$.

PHENOTYPE               A1                         A2                A3       TOTAL
GENOTYPES               A1A1, A1A2, A1A3           A2A2, A2A3        A3A3
OBSERVED NUMBERS        A                          B                 C        N
EXPECTED PROPORTIONS    p1^2 + 2p1p2 + 2p1p3       p2^2 + 2p2p3      p3^2     1
To simplify the calculations we make use of a very convenient property of maximum-likelihood estimates, that of functional invariance. This means that any function of a maximum-likelihood estimate is the maximum-likelihood estimate of that function. Therefore, in this example, we let

$$y = p_3^2, \qquad x = (p_2 + p_3)^2. \tag{A.9.6}$$

Inversely,

$$p_3 = \sqrt{y}, \qquad p_2 = \sqrt{x} - \sqrt{y}, \qquad p_1 = 1 - \sqrt{x}. \tag{A.9.6a}$$

Then the expected proportions for the phenotypes $A_1$, $A_2$, and $A_3$ are 1 - x, x - y, and y. The probability of the observed numbers in terms of x and y is

$$\text{prob} = \frac{N!}{A!\,B!\,C!}\,(1 - x)^A (x - y)^B y^C \tag{A.9.7}$$

and, taking logarithms as before,

$$L = A\log(1 - x) + B\log(x - y) + C\log y + K, \tag{A.9.8}$$
APPENDIX
where K is a constant.
oL
-= ax oL
B = 0, --Ax + -x 1
-
-
-
C
-B
x
A.9.9
y
- = -- + - = 0. oy
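Solving for x and y, we obtain

$$x = \frac{B + C}{N}, \qquad y = \frac{C}{N} \tag{A.9.10}$$

and, by the principle of functional invariance (see A.9.6a),

$$p_1 = 1 - \sqrt{\frac{B + C}{N}}, \qquad p_2 = \sqrt{\frac{B + C}{N}} - \sqrt{\frac{C}{N}}, \qquad p_3 = \sqrt{\frac{C}{N}}. \tag{A.9.11}$$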
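A quick way to convince oneself that A.9.10 really locates the maximum is to evaluate the log likelihood A.9.8 at the estimates and at nearby points. The counts below are hypothetical and the function names are assumptions made for this sketch.

```python
import math

def estimates(A, B, C):
    """Closed-form estimates of A.9.10 and A.9.11 from the phenotype counts."""
    N = A + B + C
    x, y = (B + C) / N, C / N
    p1 = 1 - math.sqrt(x)
    p2 = math.sqrt(x) - math.sqrt(y)
    p3 = math.sqrt(y)
    return x, y, (p1, p2, p3)

def log_likelihood(A, B, C, x, y):
    """L of A.9.8, omitting the constant K."""
    return A * math.log(1 - x) + B * math.log(x - y) + C * math.log(y)

A, B, C = 50, 30, 20               # hypothetical counts, for illustration only
x, y, (p1, p2, p3) = estimates(A, B, C)
best = log_likelihood(A, B, C, x, y)
# The likelihood at slightly displaced values of x or y should be lower
nearby = [log_likelihood(A, B, C, x + dx, y + dy)
          for dx, dy in [(0.01, 0), (-0.01, 0), (0, 0.01), (0, -0.01)]]
print(p1, p2, p3, best, max(nearby))
```

When the likelihood equations have no closed-form solution, the same comparison of neighboring values is the basis of the successive approximations mentioned in the text.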
The same principle applies to more involved cases. The difficulty is that the likelihood equations are often impossible to solve except by successive approximations. The formulae for the variances are also more involved, but straightforward. We shall give the arithmetic procedures later in this section.

One more very useful property of maximum-likelihood estimates (and many other large sample estimates as well) is that if X is a quantity whose variance is known and Y is a function of X, the variance of Y is given by

$$V_Y = V_X \left(\frac{dY}{dX}\right)^2. \tag{A.9.12}$$

For example, if there are n people with genotype aa and N - n with genotype AA or Aa, then n/N is an estimate of $p^2$, where p is the frequency of the a allele. This is the maximum-likelihood estimate. The variance of the estimate of $p^2$ is $p^2(1 - p^2)/N$. Then, by the principle of functional invariance, the estimate of the gene frequency p is $\sqrt{n/N}$. The variance of this estimate is

$$V_p = \frac{p^2(1 - p^2)}{N}\left(\frac{1}{2p}\right)^2 = \frac{1 - p^2}{4N}. \tag{A.9.13}$$
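If three variables are involved, and Z is a function of X and Y,

$$V_Z = V_X\left(\frac{\partial Z}{\partial X}\right)^2 + V_Y\left(\frac{\partial Z}{\partial Y}\right)^2 + 2\,\text{cov}_{XY}\,\frac{\partial Z}{\partial X}\frac{\partial Z}{\partial Y}. \tag{A.9.14}$$

In order to get the variance of the estimates when more than one parameter is being estimated, we make use of matrix methods. We first define the information matrix as

$$\mathbf{A} = \begin{pmatrix} I_{xx} & I_{xy} \\ I_{xy} & I_{yy} \end{pmatrix}, \tag{A.9.15}$$

where

$$I_{xx} = -E\left(\frac{\partial^2 L}{\partial x^2}\right), \qquad I_{xy} = -E\left(\frac{\partial^2 L}{\partial x\,\partial y}\right), \qquad I_{yy} = -E\left(\frac{\partial^2 L}{\partial y^2}\right).$$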
In each case the maximum-likelihood estimates are substituted into the formulae. From the information matrix the variances can be found. The extension to more than two parameters is straightforward, but will not be given here.

Consider again the example of Table A.9.1. Further differentiation of A.9.9 and A.9.9a, followed by substitution of the maximum-likelihood estimates from A.9.10 (and recalling that A + B + C = N), leads to

$$\begin{aligned} I_{xx} &= \frac{A}{(1 - x)^2} + \frac{B}{(x - y)^2} = \frac{N^2(A + B)}{AB}, \\ I_{xy} &= -\frac{B}{(x - y)^2} = -\frac{N^2}{B}, \\ I_{yy} &= \frac{B}{(x - y)^2} + \frac{C}{y^2} = \frac{N^2(B + C)}{BC}. \end{aligned} \tag{A.9.16}$$
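The inverse of the information matrix is the variance-covariance or covariance matrix,

$$\mathbf{C} = \begin{pmatrix} V_x & \text{cov}_{xy} \\ \text{cov}_{xy} & V_y \end{pmatrix} = \mathbf{A}^{-1}. \tag{A.9.17}$$

In this case

$$\mathbf{A} = N^2 \begin{pmatrix} \dfrac{A + B}{AB} & -\dfrac{1}{B} \\ -\dfrac{1}{B} & \dfrac{B + C}{BC} \end{pmatrix}, \tag{A.9.18}$$

which inverts to

$$\mathbf{C} = \mathbf{A}^{-1} = \frac{1}{N^3} \begin{pmatrix} A(B + C) & AC \\ AC & C(A + B) \end{pmatrix}. \tag{A.9.19}$$

Thus

$$V_x = \frac{A(B + C)}{N^3}, \qquad \text{cov}_{xy} = \frac{AC}{N^3}, \qquad V_y = \frac{C(A + B)}{N^3}. \tag{A.9.20}$$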
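To obtain the variance of the gene frequency estimates we again use formulae A.9.12 and A.9.14. The estimates again are

$$p_1 = 1 - \sqrt{x} = 1 - \sqrt{\frac{B + C}{N}}, \qquad p_2 = \sqrt{x} - \sqrt{y} = \sqrt{\frac{B + C}{N}} - \sqrt{\frac{C}{N}}, \qquad p_3 = \sqrt{y} = \sqrt{\frac{C}{N}}.$$

Using these, along with A.9.6 and A.9.10, we obtain

$$\begin{aligned} V_{p_3} &= V_y\left(\frac{\partial p_3}{\partial y}\right)^2 = \frac{C(A + B)}{N^3}\cdot\frac{1}{4y} = \frac{A + B}{4N^2} = \frac{1 - p_3^2}{4N}, \\ V_{p_1} &= V_x\left(\frac{\partial p_1}{\partial x}\right)^2 = \frac{A(B + C)}{N^3}\cdot\frac{1}{4x} = \frac{A}{4N^2} = \frac{1 - (1 - p_1)^2}{4N}, \\ V_{p_2} &= V_x\left(\frac{\partial p_2}{\partial x}\right)^2 + V_y\left(\frac{\partial p_2}{\partial y}\right)^2 + 2\,\text{cov}_{xy}\,\frac{\partial p_2}{\partial x}\frac{\partial p_2}{\partial y} \\ &= \frac{A(B + C)}{N^3}\cdot\frac{1}{4x} + \frac{C(A + B)}{N^3}\cdot\frac{1}{4y} - \frac{AC}{N^3}\cdot\frac{1}{2\sqrt{xy}} \\ &= \frac{2A + B}{4N^2} - \frac{A}{2N^2}\sqrt{\frac{C}{B + C}} \\ &= \frac{2 - (1 - p_1)^2 - p_3^2}{4N} - \frac{p_3[1 - (1 - p_1)^2]}{2N(1 - p_1)}. \end{aligned} \tag{A.9.21}$$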
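The chain from information matrix to gene-frequency variances can be followed numerically. This sketch uses hypothetical counts and function names; it inverts the 2 x 2 information matrix of A.9.16 directly and then applies A.9.12 and A.9.14.

```python
import math

def info_matrix(A, B, C):
    """Information matrix of A.9.16, evaluated at the estimates A.9.10."""
    N = A + B + C
    x, y = (B + C) / N, C / N
    i_xx = A / (1 - x) ** 2 + B / (x - y) ** 2
    i_xy = -B / (x - y) ** 2
    i_yy = B / (x - y) ** 2 + C / y ** 2
    return [[i_xx, i_xy], [i_xy, i_yy]]

def invert_2x2(m):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A, B, C = 50, 30, 20               # hypothetical counts, for illustration only
N = A + B + C
x, y = (B + C) / N, C / N

cov = invert_2x2(info_matrix(A, B, C))
v_x, cov_xy, v_y = cov[0][0], cov[0][1], cov[1][1]

# p1 = 1 - sqrt(x), p3 = sqrt(y): one variable each (A.9.12)
v_p1 = v_x / (4 * x)
v_p3 = v_y / (4 * y)
# p2 = sqrt(x) - sqrt(y): two variables, with a negative cross term (A.9.14)
v_p2 = v_x / (4 * x) + v_y / (4 * y) - 2 * cov_xy / (4 * math.sqrt(x * y))
print(v_p1, v_p2, v_p3)
```

The numerical inverse agrees with the closed forms of A.9.20, and the resulting variances agree with the count forms in A.9.21.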
A.10 Lagrange Multipliers

In maximum or minimum problems it is often required to find a stationary value (maximum or minimum) subject to certain side conditions. The Lagrange method of undetermined multipliers often effects a great simplification in the algebra. We shall not attempt to explain why the procedure works, but will illustrate its use.

Suppose there is a function f(x, y, z, ...) in which there are k relations among the variables, $\phi_1(x, y, z, \ldots) = 0$, $\phi_2 = 0$, ..., $\phi_k = 0$. To find the values of x, y, z, ... which maximize or minimize f, we equate to 0 the partial derivatives of the function

$$\psi = f + \lambda_1\phi_1 + \lambda_2\phi_2 + \cdots + \lambda_k\phi_k, \tag{A.10.1}$$
where the $\lambda_i$'s are treated as constants. In taking partial derivatives we treat x, y, z, ..., as if they were independent.

For example, suppose we wish to find the rectangle of maximum area inscribed in a circle of radius r. The equation of the circle is $x^2 + y^2 = r^2$. The area of the rectangle is 4xy. So we write

$$\psi(x, y) = 4xy + \lambda(x^2 + y^2 - r^2).$$

Differentiating,

$$\frac{\partial\psi}{\partial x} = 4y + 2\lambda x = 0, \qquad \frac{\partial\psi}{\partial y} = 4x + 2\lambda y = 0.$$
Solving these two equations, $\lambda = -2$ and x = y; so the figure is a square. Of course we could have substituted $\sqrt{r^2 - x^2}$ for y in 4xy and differentiated with respect to x only. In this problem the saving of algebra is trivial, but in many problems this procedure effects a great simplification.

To consider a genetic example, what frequency of each of k alleles will maximize the proportion of heterozygotes with random mating? It is easier to ask what minimizes the proportion of homozygotes. So we ask what values of the $p_i$'s minimize $\sum p_i^2$ subject to the side condition that $\sum p_i = 1$. We write

$$\psi = \sum p_i^2 + \lambda\left(\sum p_i - 1\right), \qquad \frac{\partial\psi}{\partial p_i} = 2p_i + \lambda = 0.$$

If we add all k equations, $2\sum p_i + k\lambda = 0$; but since $\sum p_i = 1$, $\lambda = -2/k$. So $p_i = 1/k$ for all k alleles, and the heterozygosity is maximized when the alleles are equally frequent.
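The conclusion of the genetic example can be spot-checked by comparing the heterozygosity at equal frequencies with that at randomly chosen frequencies. This is only an empirical illustration, not a proof; the number of alleles, the seed, and the function name are arbitrary choices made here.

```python
import random

def heterozygosity(freqs):
    """Proportion of heterozygotes with random mating: 1 - sum of p_i^2."""
    return 1 - sum(p * p for p in freqs)

k = 4                                  # number of alleles (arbitrary choice)
best = heterozygosity([1 / k] * k)     # equal frequencies give 1 - 1/k

rng = random.Random(0)                 # fixed seed so the check is repeatable
trials = []
for _ in range(1000):
    raw = [rng.random() for _ in range(k)]
    total = sum(raw)
    trials.append(heterozygosity([r / total for r in raw]))
print(best, max(trials))
```

No randomly drawn set of frequencies exceeds the value at equal frequencies, in agreement with the Lagrange solution.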
[Chart of $\chi^2$ and t: probability plotted against the value of $\chi^2$ or t for various numbers of degrees of freedom.]
BIBLIOGRAPHY
This list of articles and books includes many that are not referred to in the text; in fact, the majority are in this category. There are also many that we have not studied, but we thought it would be more useful to have a list that is extensive rather than selective. We hope, by so doing, to call attention to the richness and diversity of the literature in this field. We have not tried to make the list exhaustive, but we have attempted to include most of the papers in the theory of population genetics, particularly those that are in English and which have appeared in readily available journals. Mostly these papers deal with theoretical and mathematical aspects, but we have included several experimental and observational studies, particularly if these include or bear on theoretical points.

Adke, S. R. 1964. A multi-dimensional birth and death process. Biometrics 20: 212-216.
Ali, M., and H. H. Hadley. 1955. Theoretical proportion of heterozygosity in populations with various proportions of self- and cross-fertilization. Agron. J. 47: 589-590.
Allan, J. S., and A. Robertson. 1964. The effect of initial reverse upon total selection response. Genet. Res. 5: 68-79.
Allard, R. W., and J. Adams. 1969. The role of intergenotypic interactions in plant breeding. Proc. XII Intern. Cong. Genet. 3: 349-370.
Allard, R. W., and P. E. Hansche. 1964. Some parameters of population variability and their implications in plant breeding. Adv. Agron. 16: 281-325.
Allard, R. W., and P. E. Hansche. 1965. Population and biometrical genetics in plant breeding. Proc. XI Int. Cong. Genet. 3: 665-679.
Allard, R. W., S. K. Jain, and P. L. Workman. 1968. The genetics of inbreeding populations. Adv. Genet. 14: 55-131.
Allard, R. W., and C. Wehrhahn. 1963. A theory which predicts stable equilibrium for inversion polymorphism in the grasshopper, Moraba scurra. Evolution 18: 129-130.
Allen, G. 1965. Random and nonrandom inbreeding. Eugen. Quart. 12: 181-198.
Allen, G. 1966. On the estimation of random inbreeding. Eugen. Quart. 13: 67-69.
Allen, R., and A. Fraser. 1968. Simulation of genetic systems. XI. Normalizing selection. Theor. & Appl. Genet. 38: 223-225.
Allison, A. C. 1955. Aspects of polymorphism in man. Cold Spring Harbor Symp. Quant. Biol. 20: 239-255.
Anderson, F. S. 1960. Competition in populations consisting of one age group. Biometrics 16: 19-27.
Anderson, V. L., and O. Kempthorne. 1954. A model for the study of quantitative inheritance. Genetics 39: 883-898.
Anderson, W. 1969. Selection in experimental populations. I. Lethal genes. Genetics 62: 653-672.
Andrewartha, H. G. 1957. The use of conceptual models in population ecology. Cold Spring Harbor Symp. Quant. Biol. 22: 219-232.
Andrewartha, H. G., and L. C. Birch. 1954. The Distribution and Abundance of Animals. Univ. Chicago Press, Chicago, Ill.
Andrewartha, H. G., and T. O. Browning. 1961. An analysis of the idea of "resources" in animal ecology. J. Theor. Biol. 1: 83-97.
Arellanao, O., T. Ricotti, and A. Diaz. 1964. Über ein Problem der Genetik. Naturwiss. 51: 567.
Armitage, P. 1952. The statistical theory of bacterial populations subject to mutation. J. Roy. Stat. Soc. B 14: 1-40.
Arnheim, N., and C. E. Taylor. 1969. Non-Darwinian evolution: Consequences for neutral allelic variation. Nature 223: 900-903.
Arthur, J. A., and H. Abplanalp. 1964. Studies using computer simulation of reciprocal recurrent selection. Genetics 50: 233.
Atkinson, F. V., G. A. Watterson, and P. A. P. Moran. 1960. A matrix inequality. Quart. J. Math. 11: 137-140.
Ayala, F. 1969. Evolution of fitness. IV. Genetic evolution of interspecific competitive ability in Drosophila. Genetics 61: 737-747.
Bailey, N. T. J. 1961. Introduction to the Mathematical Theory of Genetic Linkage. Oxford University Press, Oxford, England.
Baker, G. A., J. Christy, and G. A. Baker. 1964. Analysis of genetic changes in finite populations composed of mixtures of pure lines. J. Theor. Biol. 7: 68-85.
Baker, G. A., J. Christy, and G. A. Baker. 1964a. Stochastic processes and genotypic frequencies under mixed selfing and random mating. J. Theor. Biol. 7: 86-97.
Baker, H. G., and G. L. Stebbins. 1965. The Genetics of Colonizing Species. Academic Press, New York.
Barber, H. N. 1965. Selection in natural populations. Heredity 20: 551-572.
Barker, J. S. F. 1958. Simulation of genetic systems by automatic digital computers. III. Selection between alleles at an autosomal locus. Aust. J. Biol. Sci. 11: 603-612.
Barker, J. S. F. 1958a. Simulation of genetic systems by automatic digital computers. IV. Selection between alleles at a sex linked locus. Aust. J. Biol. Sci. 11: 613-625.
Barker, J. S. F., and J. C. Butcher. 1966. A simulation study of quasi-fixation of genes due to random fluctuations of selection intensities. Genetics 53: 261-268.
Barnett, V. D. 1962. The Monte Carlo solution of a competing species problem. Biometrics 18: 76-103.
Barrai, I., M. P. Mi, N. E. Morton, and N. Yasuda. 1965. Estimation of prevalence under incomplete selection. Amer. J. Hum. Genet. 17: 221-236.
Barricelli, N. A. 1962, 1963. Numerical testing of evolution theories. Acta Biotheor. 16: 70-126.
Bartko, J. J., and G. A. Watterson. 1963. Inference on a genetic model of the Markov chain type. Biometrika 50: 251-264.
Bartlett, M. S. 1937. Deviations from expected frequency in the theory of inbreeding. J. Genet. 35: 83-87.
Bartlett, M. S. 1955. Stochastic Processes. Cambridge Univ. Press, Cambridge.
Bartlett, M. S. 1957. On theoretical models for competitive and predatory biological systems. Biometrika 44: 27-42.
Bartlett, M. S. 1960. Stochastic Population Models in Ecology and Epidemiology. Methuen, London.
Bartlett, M. S., J. C. Gower, and P. H. Leslie. 1960. A comparison of theoretical and empirical results for some stochastic population models. Biometrika 47: 1-12.
Bartlett, M. S., and J. B. S. Haldane. 1934. The theory of inbreeding in autotetraploids. J. Genet. 29: 175-180.
Bartlett, M. S., and J. B. S. Haldane. 1935. The theory of inbreeding with forced heterozygosis. J. Genet. 31: 327-340.
Bateman, A. J. 1950. Is gene-dispersion normal? Heredity 4: 353-364.
Bateman, A. J. 1952. Self-incompatibility systems in angiosperms. I. Theory. Heredity 6: 285-310.
Bazykin, A. D. 1965. On the possibility of sympatric species formation (in Russian). Bull. Moscow Soc. Nat., Biol. Ser. 70: 161-165.
Bellman, R., and R. Kalaba. 1960. Some mathematical aspects of optimal predation in ecology and boviculture. Proc. Nat. Acad. Sci. 46: 718-720.
Bennett, J. H. 1953. Junctions in inbreeding. Genetica 26: 392-406.
Bennett, J. H. 1953a. Linkage in hexasomic inheritance. Heredity 7: 265-284.
Bennett, J. H. 1954. Panmixia with tetrasomic and hexasomic inheritance. Genetics 39: 150-158.
Bennett, J. H. 1954a. On the theory of random mating. Ann. Eugen. 18: 311-317.
Bennett, J. H. 1954b. The distribution of heterogeneity upon inbreeding. J. Roy. Stat. Soc. B 16: 88-99.
Bennett, J. H. 1956. Lethal genes in inbred lines. Heredity 10: 263-270.
Bennett, J. H. 1957. Selectively balanced polymorphism at a sex-linked locus. Nature 180: 1363-1364.
Bennett, J. H. 1957a. The enumeration of genotype-phenotype correspondences. Heredity 11: 403-409.
Bennett, J. H. 1958. The existence and stability of selectively balanced polymorphism at a sex-linked locus. Aust. J. Biol. Sci. 11: 598-602.
Bennett, J. H. 1963. Random mating and sex linkage. J. Theor. Biol. 4: 28-36.
Bennett, J. H. 1968. Mixed self- and cross-fertilization in a tetrasomic species. Biometrics 24: 485-500.
Bennett, J. H., and F. E. Binet. 1956. Association between Mendelian factors with mixed selfing and random mating. Heredity 10: 51-56.
Bennett, J. H., and C. R. Oertel. 1965. The approach to a random association of genotypes with random mating. J. Theor. Biol. 9: 67-76.
Bernstein, F. 1925. Zusammenfassende Betrachtungen über die erblichen Blutstrukturen des Menschen. Zeit. ind. Abst. Vererb. 37: 237-269.
Bernstein, F. 1930. Fortgesetzte Untersuchungen aus der Theorie der Blutgruppen. Zeit. ind. Abst. Vererb. 56: 233-273.
Beyer, W. H. 1966. Handbook of Tables for Probability and Statistics. Chemical Rubber Company, Cleveland.
Binet, F. E., A. M. Clark, and H. T. Clifford. 1959. Correlation due to linkage in certain wild plants. Genetics 44: 5-13.
Binet, F. E., and R. T. Leslie. 1960. The coefficient of inbreeding in case of repeated full-sib-matings. J. Genet. 57: 127-130.
Binet, F. E., and J. A. Morris. 1962. On total hereditary variance in the case of certain mating systems. J. Genet. 58: 108-121.
Birch, L. C. 1948. The intrinsic rate of natural increase of an insect population. J. Anim. Ecol. 17: 15-26.
Birch, L. C. 1960. The genetic factor in population ecology. Amer. Natur. 94: 5-24.
Bodewig, E. 1936. Mathematische Untersuchungen zum Mendelismus und zur Eugenik. Genetica 18: 116-186.
Bodewig, E. 1936a. Mathematische Untersuchen zum Mendelismus. Zeit. ind. Abst. Vererb. 71: 84-119.
Bodmer, W. F. 1960. Discrete stochastic processes in population genetics. J. Roy. Stat. Soc. B 22: 218-244.
Bodmer, W. F. 1963. Natural selection for modifiers of heterozygote fitness. J. Theor. Biol. 4: 86-97.
Bodmer, W. F. 1965. Differential fertility in population genetics models. Genetics 51: 411-424.
Bodmer, W. F., and L. L. Cavalli-Sforza. 1968. A migration matrix model for the study of random genetic drift. Genetics 59: 565-592.
Bodmer, W. F., and A. W. F. Edwards. 1960. Natural selection and the sex ratio. Ann. Hum. Genet. 24: 239-244.
Bodmer, W. F., and J. Felsenstein. 1967. Linkage and selection: Theoretical analysis of the deterministic two locus random mating model. Genetics 57: 237-265.
Bodmer, W. F., and P. A. Parsons. 1960. The initial progress of new genes with various genetic systems. Heredity 15: 283-299.
Bodmer, W. F., and P. A. Parsons. 1962. Linkage and recombination in evolution. Adv. Genet. 11: 1-100.
Bogyo, T. P. 1965. Some population parameters as affected by truncation selection. Genetics 52: 429.
Bogyo, T. P., and S. W. Ting. 1968. Effect of selection and linkage on inbreeding. Austr. J. Biol. Sci. 21: 45-58.
Bohidar, N. R. 1961. Monte Carlo investigations of the effect of linkage on selection. Biometrics 17: 506-507.
Bohidar, N. R. 1964. Derivation and estimation of variance and covariance components associated with covariance between relatives under sex-linked transmission. Biometrics 20: 505-521.
Bohidar, N. R., and D. G. Patel. 1964. A Monte Carlo investigation of interaction between linkage and selection under stochastic models. Biometrics 20: 660.
Bonnier, G. 1947. The genetic effects of breeding in small populations. A demonstration for use in genetic teaching. Hereditas 33: 143-151.
Bosso, J. A., O. M. Sorrain, and E. E. A. Favret. 1969. Application of finite absorbent Markov chains to sib mating populations with selection. Biometrics 25: 17-26.
Brace, C. L. 1964. The probable mutation effect. Amer. Natur. 98: 453-455.
Bradshaw, A. D. 1966. Gene flow and natural selection in closely adjacent populations - a theoretical analysis. Heredity 21: 171.
Breese, E. L. 1956. The genetical consequences of assortative mating. Heredity 10: 323-343.
Brown, J. L. 1966. Types of group selection. Nature 211: 870.
Brown, S. W. 1964. Automatic frequency response in the evolution of male haploidy and other coccid chromosome systems. Genetics 49: 797-817.
Bruce, A. B. 1910. The Mendelian theory of heredity and the augmentation of vigor. Science 32: 627-628.
Bruck, D. 1957. Male segregation ratio advantage as a factor in maintaining lethal alleles in wild populations of house mice. Proc. Nat. Acad. Sci. 43: 152-158.
Brues, Alice M. 1964. The cost of evolution vs. the cost of not evolving. Evolution 18: 379-383.
Brues, A. M. 1969. Genetic load and its varieties. Science 164: 1130-1136.
Bunak, V. V. 1937. Changes in the mean values of characters in a mixed population. Ann. Eugen. 7: 195-206.
Buri, P. 1956. Gene frequency in small populations of mutant Drosophila. Evolution 10: 367-402.
Buzzati-Traverso, A. A. 1953. On the role of mutation rate in evolution. Atti del IX Cong. Intern. di Genet. 1: 450-462.
Cain, A. J., and P. M. Sheppard. 1954. The theory of adaptive polymorphism. Amer. Natur. 88: 321-326.
Campos Rasado, J. M., and A. Robertson. 1966. The genetic control of sex ratio. J. Theor. Biol. 13: 324-329.
Cannings, C. 1967. Equilibrium, convergence and stability at a sex-linked locus under natural selection. Genetics 56: 613-617.
Cannings, C. 1968. Equilibrium under selection at a multi-allelic sex-linked locus. Biometrics 24: 187-189.
Cannings, C. 1969. Unisexual selection at an autosomal locus. Genetics 62: 225-229.
Cannings, C., and A. W. F. Edwards. 1969. Expected genotype frequencies in a small sample: Deviation from the Hardy-Weinberg equilibrium. Amer. J. Hum. Genet. 21: 245-247.
Carson, H. L. 1955. The genetic characteristics of marginal populations in Drosophila. Cold Spring Harbor Symp. Quant. Biol. 20: 276-286.
Carson, H. L. 1959. Genetic conditions which promote or retard the formation of species. Cold Spring Harbor Symp. Quant. Biol. 24: 87-105.
Caspari, E., G. S. Watson, and W. Smith. 1966. The influence of cytoplasmic pollen sterility on gene exchange between populations. Genetics 53: 741-746.
Castle, W. E. 1903. The law of heredity of Galton and Mendel and some laws governing race improvement by selection. Proc. Amer. Acad. Sci. 39: 233-242.
Cavalli-Sforza, L. L. 1950. The analysis of selection curves. Biometrics 6: 208-220.
Cavalli-Sforza, L. L. 1952. An analysis of linkage in quantitative inheritance. In Quantitative Inheritance. Ed. by E. C. R. Reeve and C. H. Waddington. His Majesty's Stationery Office, London. Pp. 135-144.
Cavalli-Sforza, L. L. 1958. Some data on the genetic structure of human populations. Proc. X Inter. Cong. Genet. 1: 389-407.
Cavalli-Sforza, L. L. 1963. The distribution of migration distances: Models, and applications to genetics. Entretien de Monaco en Sciences Humaines: Les Deplacements Humains. Ed. J. Sutter. Pp. 139-158.
Cavalli-Sforza, L. L. 1965. Population structure and human evolution. Proc. Roy. Soc. B. 164: 362-379.
Cavalli-Sforza, L. L. 1969. Human diversity. Proc. XII Intern. Cong. Genet. 3: 405-416.
Cavalli-Sforza, L. L., I. Barrai, and A. W. F. Edwards. 1964. Analysis of human evolution under random genetic drift. Cold Spring Harbor Symp. Quant. Biol. 29: 9-20.
Cavalli-Sforza, L. L., and A. W. F. Edwards. 1967. Phylogenetic analysis: Models and estimation procedures. Evolution 21: 550-570.
Cavalli-Sforza, L. L., M. Kimura, and I. Barrai. 1966. The probability of consanguineous marriages. Genetics 54: 37-60.
Ceppellini, R., M. Siniscalco, and C. A. B. Smith. 1955. The estimation of gene frequencies in a random-mating population. Ann. Hum. Genet. 20: 97-115.
Chapman, A. B. 1946. Genetic and nongenetic sources of variation in the weight response of the immature rat ovary to a gonadotropic hormone. Genetics 31: 494-507.
Chia, A. B. 1968. Random mating in a population of cyclic size. J. Appl. Prob. 5: 21-30.
Chia, A. B., and G. A. Watterson. 1969. Demographic effects on the rate of genetic evolution. I. Constant size populations with two genotypes. J. Appl. Prob. 6: 231-248.
Chigusa, S., and T. Mukai. 1964. Linkage disequilibrium and heterosis in experimental populations of Drosophila melanogaster with particular reference to the sepia gene. Jap. J. Genet. 39: 289-305.
Chung, C. S., and A. B. Chapman. 1958. Comparisons of the predicted with actual gains from selection of parents of inbred progeny of rats. Genetics 43: 594-600.
Chung, C. S., O. W. Robison, and N. E. Morton. 1959. A note on deaf mutism. Ann. Hum. Genet. 23: 357-366.
Chung, Y. J. 1967. Persistence of a mutant gene in populations of different genetic backgrounds. Genetics 57: 957-967.
Clarke, B. C. 1964. Frequency-dependent selection for the dominance of rare polymorphic genes. Evolution 18: 364-369.
Clarke, B. C. 1966. The evolution of morph-ratio clines. Amer. Natur. 100: 389-402.
Clarke, B. C., and P. O'Donald. 1964. Frequency-dependent selection. Heredity 19: 201-206.
Clayton, G. A., G. R. Knight, J. A. Morris, and A. Robertson. 1957. An experimental check on quantitative genetical theory. III. Correlated Responses. J. Genet. 55: 171-180.
Clayton, G. A., J. A. Morris, and A. Robertson. 1957. An experimental check on quantitative genetical theory. I. Short-term responses to selection. J. Genet. 55: 131-151.
Clayton, G. A., and A. Robertson. 1955. Mutation and quantitative variation. Amer. Nat. 89: 151-158.
Clayton, G. A., and A. Robertson. 1957. An experimental check on quantitative genetical theory. II. The long-term effects of selection. J. Genet. 55: 152-170.
Coale, A. J., and P. Demeny. 1966. Regional Model Life Tables and Stable Populations. Princeton Univ. Press, Princeton, N.J.
Coale, A. J., and C. Y. Tye. 1961. The significance of age patterns of fertility in high-fertility populations. Milbank Memor. Fund Quart. 34: 631-646.
Cochran, W. G. 1951. Improvement by means of selection. Proc. Second Berkeley Symp. Math. Stat. Prob. Pp. 449-470.
Cockerham, C. C. 1954. An extension of the concept of partitioning hereditary variance for analysis of covariances among relatives when epistasis is present. Genetics 39: 859-882.
Cockerham, C. C. 1956. Analysis of quantitative gene action. Genetics in plant breeding. Brookhaven Symp. Biol. 9: 53-68.
Cockerham, C. C. 1956a. Effects of linkage on the covariances between relatives. Genetics 41: 138-141.
Cockerham, C. C. 1959. Partitions of hereditary variance for various genetic models. Genetics 44: 1141-1148.
Cockerham, C. C. 1961. Implications of genetic variances in a hybrid breeding program. Crop Sci. 1: 47-52.
Cockerham, C. C. 1963. Estimation of genetic variances. Statistical Genetics and Plant Breeding. Ed. by W. D. Hanson and H. F. Robinson. Nat. Acad. Sci. Nat. Res. Counc. Publ. 982: 53-93.
Cockerham, C. C. 1967. Group inbreeding and coancestry. Genetics 56: 89-104.
Cockerham, C. C. 1969. Variance of gene frequencies. Evolution 23: 72-84.
Cockerham, C. C., and B. S. Weir. 1968. Sib mating with two linked loci. Genetics 60: 629-640.
Cole, L. C. 1954. The population consequences of life history phenomena. Quart. Rev. Biol. 29: 103-137.
Cole, L. C. 1957. Sketches of general and comparative demography. Cold Spring Harbor Symp. Quant. Biol. 22: 1-5.
Collins, G. N. 1921. Dominance and the vigor of first generation hybrids. Amer. Natur. 55: 116-133.
Collins, R. L. 1967. A general nonparametric theory of genetic analysis. Genetics 56: 551.
Comstock, R. E. 1955. Theory of quantitative genetics: Synthesis. Cold Spring Harbor Symp. Quant. Biol. 20: 93-109.
Comstock, R. E., and H. F. Robinson. 1952. Estimation of average dominance of genes. In Heterosis. Ed. by J. W. Gowen. Iowa State Coll. Press, Ames, Iowa. Pp. 494-516.
Comstock, R. E., H. F. Robinson, and P. H. Harvey. 1949. A breeding procedure designed to make maximum use of both general and specific combining ability. J. Amer. Soc. Agron. 41: 360-367.
Connell, J. H., and E. Orias. 1964. The ecological regulation of species diversity. Amer. Natur. 98: 399-414.
Constantino, R. F. 1968. The genetical structure of populations and developmental time. Genetics 60: 409-418.
Cook, L. M. 1961. The edge effect in population genetics. Amer. Natur. 95: 295-307.
Cook, L. M. 1965. A note on apostasy. Heredity 20: 631-636.
Cormack, R. M. 1964. A boundary problem arising in population genetics. Biometrics 20: 785-793.
Cotterman, C. W. 1940. A Calculus for Statistico-genetics. Unpublished thesis, Ohio State Univ., Columbus, Ohio.
Cotterman, C. W. 1941. Relatives and human genetic analysis. Scient. Monthly 53: 227-234.
Cotterman, C. W. 1953. Regular two-allele and three-allele phenotype systems. Amer. J. Hum. Genet. 5: 193-235.
Cotterman, C. W. 1954. Estimation of gene frequencies in nonexperimental populations. In Statistics and Mathematics in Biology. Ed. by O. Kempthorne, T. A. Bancroft, J. W. Gowen, and J. L. Lush. Iowa State Coll. Press, Ames, Iowa. Pp. 449-465.
Cotterman, C. W. 1969. Factor-union phenotype systems. Computer Applications in Genetics. Ed. by N. E. Morton. Univ. Hawaii Press, Honolulu. Pp. 1-18.
Courant, R., and D. Hilbert. 1962. Methods of Mathematical Physics. Vol. I. Interscience Pub., New York.
Cress, C. E. 1966. Heterosis of the hybrid related to gene frequency differences between two populations. Genetics 53: 269-274.
Crick, F. H. C. 1967. Origin of the genetic code. Nature 213: 119.
Crosby, J. L. 1949. Selection of an unfavorable gene complex. Evolution 3: 212-230.
Crosby, J. L. 1960. The use of electronic computation in the study of random fluctuations in rapidly evolving populations. Proc. Roy. Soc. B 242: 551-573.
Crosby, J. L. 1961. Teaching genetics with an electronic computer. Heredity 16: 255-273.
Crosby, J. L. 1963. The evolution and nature of dominance. J. Theor. Biol. 5: 35-51.
Crosby, J. L. 1966. The population genetics of speciation processes. Heredity 21: 168.
Crosby, J. L. 1966a. Self-incompatibility alleles in the population of Oenothera organensis. Evolution 20: 567-579.
Crow, J. F. 1945. A chart of the χ² and t distributions. J. Amer. Stat. Assn. 40: 376.
Crow, J. F. 1948. Alternative hypotheses of hybrid vigor. Genetics 33: 477-487.
Crow, J. F. 1952. Dominance and overdominance. In Heterosis. Ed. by J. W. Gowen. Iowa State Coll. Press, Ames, Iowa. Pp. 282-297.
Crow, J. F. 1954. Random mating with linkage in polysomics. Amer. Natur. 88: 431-434.
Crow, J. F. 1954a. Breeding structure of populations. II. Effective population number. Statistics and Mathematics in Biology. Iowa State Coll. Press, Ames, Iowa. Pp. 543-556.
Crow, J. F. 1955. General theory of population genetics: Synthesis. Cold Spring Harbor Symp. Quant. Biol. 20: 54-59.
Crow, J. F. 1958. Some possibilities for measuring selection intensities in man. Hum. Biol. 30: 1-13.
Crow, J. F. 1959. Ionizing radiation and evolution. Scient. Amer. 201, Sept. Pp. 138-160.
Crow, J. F. 1961. Population genetics. Amer. J. Hum. Genet. 13: 137-150.
Crow, J. F. 1963. The concept of genetic load: A reply. Amer. J. Hum. Genet. 15: 310-315.
Crow, J. F. 1966. The quality of people: Human evolutionary changes. BioScience 16: 863-867.
Crow, J. F. 1969. Molecular genetics and population genetics. Proc. XII Int. Cong. Genet. 3: 105-113.
Crow, J. F. 1970. Genetic loads and the cost of natural selection. Mathematical Topics in Population Genetics. Ed. by K. J. Kojima. Springer Verlag, Heidelberg.
Crow, J. F., and Y. J. Chung. 1967. Measurement of effective generation length in Drosophila population cages. Genetics 57: 951-955.
Crow, J. F., and J. Felsenstein. 1968. The effect of assortative mating on the genetic composition of a population. Eugen. Quart. 15: 85-97.
Crow, J. F., and M. Kimura. 1956. Some genetic problems in natural populations. Proc. Third Berkeley Symp. Math. Stat. and Prob. 4: 1-22.
Crow, J. F., and M. Kimura. 1965. The theory of genetic loads. Proc. XI Int. Cong. Genet. 3: 495-505.
Crow, J. F., and M. Kimura. 1965a. Evolution in sexual and asexual populations. Amer. Natur. 99: 439-450.
Crow, J. F., and A. P. Mange. 1965. Measurement of inbreeding from the frequency of marriages between persons of the same surname. Eugen. Quart. 12: 199-203.
Crow, J. F., and N. E. Morton. 1955. Measurement of gene frequency drift in small populations. Evolution 9: 202-214.
Crow, J. F., and N. E. Morton. 1960. The genetic load due to mother-child incompatibility. Amer. Natur. 94: 413-419.
Crow, J. F., and W. C. Roberts. 1950. Inbreeding and homozygosis in bees. Genetics 35: 612-621.
Crow, J. F., and Rayla G. Temin. 1964. Evidence for the partial dominance of recessive lethal genes in natural populations of Drosophila. Amer. Natur. 98: 21-33.
Cruden, Dorothy. 1949. The computation of inbreeding coefficients in closed populations. J. Hered. 40: 248-251.
Curnow, R. N. 1964. The effect of continued selection of phenotypic intermediates on gene frequency. Genet. Res. 5: 341-353.
Dahlberg, G. 1928. Inbreeding in man. Genetics 14: 421-454.
Dahlberg, G. 1938. On rare defects in human populations with particular regard to inbreeding and isolate effects. Proc. Roy. Soc. Edinb. 58: 213-232.
Dahlberg, G. 1947. Selection in human populations. Zool. Bidr. Uppsala 25: 21-32.
Dahlberg, G. 1947a. Mathematical Methods for Population Genetics. S. Karger, Basel and New York.
D'Ancona, U. 1954. The struggle for existence. Bibliotheca Biotheoretica 6: 1-274.
Daniel, L. 1964. A szelekció biometriai alapja. (The biometrical basis of selection.) Növénytermelés 13: 369-380.
Dansereau, P. 1952. The varieties of evolutionary opportunity. Rev. Canad. Biol. 11: 305-388.
Darwin, C. 1859. The Origin of Species. John Murray, London.
Deakin, M. A. B. 1966. Sufficient conditions for genetic polymorphism. Amer. Natur. 100: 690-692.
De Finetti, B. 1926. Considerazioni matematiche sull'ereditarietà mendeliana. Metron 6: 1-41.
Dempster, E. R. 1955. Genetic models in relation to animal breeding problems. Biometrics 11: 535-536.
Dempster, E. R. 1955a. Maintenance of genetic heterogeneity. Cold Spring Harbor Symp. Quant. Biol. 20: 25-32.
Dempster, E. R. 1956. Some genetic problems in controlled populations. Proc. Third Berkeley Symp. Math. Stat. Prob. 4: 23-40.
Dempster, E. R. 1956a. Comments on Professor Lewontin's article. Amer. Natur. 90: 385-386.
Dempster, E. R. 1960. The question of stability with positive feedback. Biometrics 16: 481-483.
Dempster, E. R., and I. M. Lerner. 1947. The optimum structure of breeding flocks. II. Methods of determination. Genetics 32: 567-579.
Dempster, E. R., and I. M. Lerner. 1950. Heritability of threshold characters. Genetics 35: 212-236.
Denniston, C. 1967. Probability and Genetic Relationship. Unpublished thesis, University of Wisconsin.
Dethier, V. G., and R. H. MacArthur. 1964. A field's capacity to support a butterfly population. Nature 201: 728-729.
Dewey, W. J., I. Barrai, N. E. Morton, and M. P. Mi. 1965. Recessive genes in severe mental defect. Amer. J. Human Genet. 17: 237-256.
Dickerson, G. E. 1955. Genetic slippage in response to selection for multiple objectives. Cold Spring Harbor Symp. Quant. Biol. 20: 213-224.
Dickinson, A. G., and J. L. Jinks. 1956. A generalized analysis of diallel crosses. Genetics 41: 65-78.
Dobzhansky, Th. 1951. Genetics and the Origin of Species. 3rd Ed. Columbia Univ. Press, N.Y.
Dobzhansky, Th. 1955. A review of some fundamental concepts and problems of population genetics. Cold Spring Harbor Symp. Quant. Biol. 20: 1-15.
Dobzhansky, Th. 1956. What is an adaptive trait? Amer. Natur. 90: 337-347.
Dobzhansky, Th. 1957. Genetic loads in natural populations. Science 126: 191-194.
Dobzhansky, Th. 1957a. Mendelian populations as genetic systems. Cold Spring Harbor Symp. Quant. Biol. 22: 385-394.
Dobzhansky, Th. 1959. Evolution of genes and genes in evolution. Cold Spring Harbor Symp. Quant. Biol. 24: 15-30.
Dobzhansky, Th. 1959a. Variation and evolution. Proc. Amer. Philos. Soc. 103: 252-263.
Dobzhansky, Th. 1961. Man and natural selection. Amer. Scientist 49: 285-299.
Dobzhansky, Th. 1964. How do genetic loads affect the fitness of their carriers in Drosophila populations? Amer. Natur. 98: 151-166.
Dobzhansky, Th. 1967. Genetic diversity and diversity of environments. Proc. Fifth Berkeley Symp. Math. Stat. and Prob. 4: 295-304.
Dobzhansky, Th., and O. Pavlovsky. 1957. An experimental study of interaction between genetic drift and natural selection. Evolution 11: 311-319.
Dobzhansky, Th., and O. Pavlovsky. 1959. How stable is balanced polymorphism? Proc. Nat. Acad. Sci. 46: 41-47.
Dobzhansky, Th., and B. Wallace. 1953. The genetics of homeostasis in Drosophila. Proc. Nat. Acad. Sci. 39: 162-171.
Dobzhansky, Th., and S. Wright. 1941. Genetics of natural populations. V. Relations between mutation rate and accumulation of lethals in a population of Drosophila pseudoobscura. Genetics 26: 23-51.
Dodson, E. O. 1962. Note on the cost of natural selection. Amer. Natur. 96: 123-126.
Doring, H., and E. Walter. 1959. Über die Berechnung von Inzucht- und Verwandtschaftskoeffizienten. Biom. Zeit. 1: 150.
Dowdeswell, W. H. 1955. The Mechanism of Evolution. (The Scholarship Series in Biology.) Heinemann, London.
Drobnik, J., and J. Dlouha. 1966. Statistical model of evolution of haploid organisms during simple vegetative reproduction. J. Theor. Biol. 11: 418-435.
Dubinin, N. P., and D. D. Romaschoff. 1932. Die genetische Struktur der Art und ihre Evolution. Biol. Zhur. 1: 52-95.
Dunbar, M. J. 1960. The evolution of stability in marine environments. Natural selection at the level of the ecosystem. Amer. Natur. 94: 129-136.
Dunn, L. C. 1957. Evidences of evolutionary forces leading to the spread of lethal genes in wild populations of house mice. Proc. Nat. Acad. Sci. 43: 158-163.
East, E. M. 1936. Heterosis. Genetics 21: 375-397.
Eberhart, S. A. 1964. Theoretical relations among single, three-way, and double cross hybrids. Biometrics 20: 522-539.
Eberhart, S. A., W. A. Russell, and L. H. Penny. 1964. Double cross hybrid prediction in maize when epistasis is present. Crop Sci. 4: 363-366.
Edwards, A. W. F. 1960. On the method of estimating frequencies using the negative binomial distribution. Ann. Hum. Gen. 24: 313-318.
Edwards, A. W. F. 1961. The population genetics of "Sex Ratio" in Drosophila pseudoobscura. Heredity 16: 291-304.
Edwards, A. W. F. 1963. Natural selection and the sex ratio: The approach to equilibrium. Amer. Natur. 97: 397-400.
Edwards, A. W. F. 1963a. The limitations of population models. Proc. Second Int. Cong. Hum. Genet. Pp. 222-223.
Edwards, A. W. F. 1967. Fundamental theorem of natural selection. Nature 215: 537-538.
Ellison, B. E. 1965. Limits of infinite populations under random mating. Proc. Nat. Acad. Sci. 53: 1266-1272.
Ellison, B. E. 1966. Limit theorems for random mating in infinite populations. J. Appl. Prob. 3: 94-114.
Emlen, J. M. 1968. A note on natural selection and the sex ratio. Amer. Natur. 102: 94-95.
Emlen, J. M. 1968a. Selection for the sex ratio. Amer. Natur. 102: 589-591.
Ewens, W. J. 1963. Numerical results and diffusion approximations in a genetic process. Biometrika 50: 241-249.
Ewens, W. J. 1963a. Diploid populations with selection depending on gene frequency. J. Aust. Math. Soc. 3: 359-374.
Ewens, W. J. 1963b. The mean time for absorption in a process of genetic type. J. Aust. Math. Soc. 3: 375-383.
Ewens, W. J. 1963c. The diffusion equation and a pseudo-distribution in genetics. J. Roy. Stat. Soc. B 25: 405-412.
Ewens, W. J. 1964. The maintenance of alleles by mutations. Genetics 50: 891-898.
Ewens, W. J. 1964a. On the problem of self-sterility alleles. Genetics 50: 1433-1438.
Ewens, W. J. 1964b. The pseudo-transient distribution and its uses in genetics. J. Appl. Prob. 1: 141-156.
Ewens, W. J. 1965. A note on Fisher's theory of the evolution of dominance. Ann. Hum. Genet. 29: 85-88.
Ewens, W. J. 1965a. The adequacy of the diffusion approximation to certain distributions in genetics. Biometrics 21: 386-394.
Ewens, W. J. 1966. Further notes on the evolution of dominance. Heredity 20: 443-450.
Ewens, W. J. 1966a. Linkage and the evolution of dominance. Heredity 21: 363-370.
Ewens, W. J. 1967. A note on the mathematical theory of the evolution of dominance. Amer. Natur. 101: 35-40.
Ewens, W. J. 1967a. The probability of survival of a mutant. Heredity 22: 307-310.
Ewens, W. J. 1967b. The probability of survival of a new mutant in a fluctuating environment. Heredity 22: 438-443.
Ewens, W. J. 1967c. The probability of fixation of a mutant: The two-locus case. Evolution 21: 532-540.
Ewens, W. J. 1967d. Random sampling and the rate of gene replacement. Evolution 21: 657-663.
Ewens, W. J. 1968. A genetic model having complex linkage behavior. Theor. & Appl. Genet. 38: 140-143.
Ewens, W. J. 1969. Population Genetics. Methuen, London.
Ewens, W. J., and P. M. Ewens. 1966. The maintenance of alleles by mutation - Monte Carlo results for normal and self-fertility populations. Heredity 21: 371-378.
Fadeev, D. K., and V. N. Fadeeva. 1963. Computation Methods in Linear Algebra. Freeman Pub. Co., San Francisco.
Falconer, D. S. 1960. Introduction to Quantitative Genetics. The Ronald Press Co., New York.
Falconer, D. S. 1967. The inheritance of liability to diseases with variable age of onset, with particular reference to diabetes mellitus. Ann. Hum. Genet. 31: 1-20.
Falk, C., and C. C. Li. 1969. Negative assortative mating: Exact solution to a simple model. Genetics 62: 215-223.
Falk, H., and C. T. Falk. 1969. Stability of solutions to certain nonlinear difference equations of population genetics. Biometrics 25: 27-37.
Feldman, M. W. 1966. On the offspring number distribution in a genetic population. J. Appl. Prob. 3: 129-141.
Feldman, M. W., M. Nabholz, and W. F. Bodmer. 1969. Evolution of the Rh polymorphism: A model for the interaction of incompatibility, reproductive compensation, and heterozygote advantage. Amer. J. Hum. Genet. 21: 171-193.
Feller, W. 1950. An Introduction to Probability Theory and its Applications. Vol. I. 3rd Ed. 1968. Wiley, New York.
Feller, W. 1951. Diffusion processes in genetics. Proc. Second Berkeley Symp. Math. Stat. Prob. Pp. 227-246.
Feller, W. 1952. The parabolic differential equations and the associated semi-group of transformations. Ann. Math. 55: 468-519.
Feller, W. 1954. Diffusion processes in one dimension. Trans. Amer. Math. Soc. 77: 1-31.
Feller, W. J. 1966. On the influence of natural selection on population size. Proc. Nat. Acad. Sci. 55: 733-737.
Feller, W. 1966a. An Introduction to Probability Theory and Its Applications. Vol. II. Wiley, New York.
Feller, W. 1967. On fitness and the cost of natural selection. Genet. Res. 9: 1-15.
Felsenstein, J. 1965. The effect of linkage on directional selection. Genetics 52: 349-363.
Finney, D. J. 1952. The equilibrium of a self-incompatible polymorphic species. Genetica 26: 33-64.
Finney, D. J. 1962. Genetic gains under three methods of selection. Genet. Res. 3: 417-423.
Fish, H. D. 1914. On the progressive increase of homozygosis in brother-sister matings. Amer. Natur. 48: 759-761.
Fisher, R. A. 1918. The correlation between relatives on the supposition of Mendelian inheritance. Trans. Roy. Soc. Edinb. 52: 399-433.
Fisher, R. A. 1922. On the dominance ratio. Proc. Roy. Soc. Edinb. 52: 321-341.
Fisher, R. A. 1925. Statistical Methods for Research Workers. 13th Ed. 1958. Oliver and Boyd, London.
Fisher, R. A. 1928. The possible modification of the response of the wild type to recurrent mutations. Amer. Natur. 62: 115-126.
Fisher, R. A. 1928a. Two further notes on the origin of dominance. Amer. Natur. 62: 571-574.
Fisher, R. A. 1929. The evolution of dominance; reply to Professor Sewall Wright. Amer. Natur. 63: 553-556.
Fisher, R. A. 1930. The Genetical Theory of Natural Selection. Clarendon Press, Oxford.
Fisher, R. A. 1930a. The evolution of dominance in certain polymorphic species. Amer. Natur. 64: 385-406.
Fisher, R. A. 1930b. Mortality among plants and its bearing on natural selection. Nature 125: 972-973.
Fisher, R. A. 1930c. Biometry and evolution. Nature 126: 246-247.
Fisher, R. A. 1930d. Genetics, mathematics, and natural selection. Nature 126: 805-806.
Fisher, R. A. 1930e. The distribution of gene ratios for rare mutations. Proc. Roy. Soc. Edinb. 50: 205-220.
Fisher, R. A. 1931. The evolution of dominance. Biol. Rev. 6: 345-368.
Fisher, R. A. 1932. Inheritance of acquired characters. Nature 130: 579.
Fisher, R. A. 1932a. The evolutionary modification of genetic phenomena. Proc. Sixth Int. Cong. Genet. 1: 165-172.
Fisher, R. A. 1932b. The bearing of genetics on theories of evolution. Sci. Prog. Twent. Cent. 26: 273-287.
Fisher, R. A. 1933. Selection in the production of ever-sporting stocks. Ann. Bot. 188: 727-733.
Fisher, R. A. 1933a. Number of Mendelian factors in quantitative inheritance. Nature 131: 400-401.
Fisher, R. A. 1933b. Protective adaptations of animals, especially insects. Proc. Entom. Soc. Lond. 7: 87-89.
Fisher, R. A. 1934. Professor Wright on the theory of dominance. Amer. Natur. 68: 370-374.
Fisher, R. A. 1934a. Indeterminism and natural selection. Phil. Sci. 1: 99-117.
Fisher, R. A. 1934b. Adaptation and mutations. School Sci. Rev. 59: 294-301.
Fisher, R. A. 1935. The sheltering of lethals. Amer. Natur. 69: 446-455.
Fisher, R. A. 1935a. On the selective consequences of East's (1927) theory of heterostylism in Lythrum. J. Genet. 30: 369-382.
Fisher, R. A. 1936. The measurement of selective intensity. Proc. Roy. Soc. Lond. B 121: 58-62.
Fisher, R. A. 1937. The wave of advance of advantageous genes. Ann. Eugen. 7: 355-369.
Fisher, R. A. 1939. Selective forces in wild populations of Paratettix texanus. Ann. Eugen. 9: 109-122.
Fisher, R. A. 1939a. Stage of enumeration as a factor influencing the variance in the number of progeny, frequency of mutants and related quantities. Ann. Eugen. 9: 406-408.
Fisher, R. A. 1940. Non-lethality of the mid factor in Lythrum salicaria. Nature 146: 521.
Fisher, R. A. 1941. The theoretical consequence of polyploid inheritance for the mid style form of Lythrum salicaria. Ann. Eugen. 11: 31-38.
Fisher, R. A. 1941a. Average excess and average effect of a gene substitution. Ann. Eugen. 11: 53-63.
Fisher, R. A. 1942. The polygene concept. Nature 150: 154.
Fisher, R. A. 1943. Allowance for double reduction in the calculation of genotypic frequencies with polysomic inheritance. Ann. Eugen. 12: 169-171.
Fisher, R. A. 1947. Number of self-sterility alleles. Nature 160: 797-798.
Fisher, R. A. 1947a. The theory of linkage in polysomic inheritance. Phil. Trans. Roy. Soc. B 233: 55-87.
Fisher, R. A. 1949. The Theory of Inbreeding. 2nd Ed. 1965. Oliver and Boyd, London.
Fisher, R. A. 1949a. A theoretical system of selection for homostyle Primula. Sankhya 9: 325-342.
Fisher, R. A. 1950. A class of enumerations of importance in genetics. Proc. Roy. Soc. B 136: 509-520.
Fisher, R. A. 1950a. Gene frequencies in a cline determined by selection and diffusion. Biometrics 6: 353-361.
Fisher, R. A. 1952. Statistical methods in genetics. The Bateson Lecture, 1951. Heredity 6: 1-12.
Fisher, R. A. 1953. Population genetics. Proc. Roy. Soc. B 141: 510-523.
Fisher, R. A. 1954. A fuller theory of "junctions" in inbreeding. Heredity 8: 187-199.
Fisher, R. A. 1956. Statistical Methods and Scientific Inference. Oliver and Boyd, London.
Fisher, R. A. 1958. The Genetical Theory of Natural Selection. 2nd ed. Dover Press, New York.
Fisher, R. A. 1958a. Polymorphism and natural selection. J. Ecol. 46: 289-293.
Fisher, R. A. 1959. Natural selection from the genetical standpoint. Austr. J. Sci. 22: 16-17.
Fisher, R. A. 1959a. An algebraically exact examination of junction formation and transmission in parent-offspring inbreeding. Heredity 13: 179-186.
Fisher, R. A. 1961. A model for the generation of self-sterility alleles. J. Theor. Biol. 1: 411-414.
Fisher, R. A. 1962. Enumeration and classification in polysomic inheritance. J. Theor. Biol. 2: 309-311.
Fisher, R. A., and E. B. Ford. 1928. The variability of species in the Lepidoptera, with reference to abundance and sex. Trans. Entom. Soc. London 2: 367-384.
Fisher, R. A., and E. B. Ford. 1947. The spread of a gene in natural conditions in a colony of the moth Panaxia dominula L. Heredity 1: 143-174.
Fisher, R. A., and E. B. Ford. 1950. The "Sewall Wright" effect. Heredity 4: 117-119.
Fisher, R. A., F. R. Immer, and O. Tedin. 1932. The genetical interpretation of statistics of the third degree in the study of quantitative inheritance. Genetics 17: 107-124.
Fisher, R. A., and F. Yates. 1963. Statistical Tables for Biological, Agricultural, and Medical Research. 6th Ed. Hafner Pub. Co., New York.
Fitch, W. M. 1966. An improved method for testing for evolutionary homology. J. Mol. Biol. 16: 9-16.
Fitch, W. M., and E. Margoliash. 1967. Construction of phylogenetic trees. Science 155: 279-284.
Ford, E. B. 1964. Ecological Genetics. Methuen, London; John Wiley, New York.
Ford, E. B. 1965. Genetic Polymorphism. Faber & Faber, London.
Ford, E. B., and P. M. Sheppard. 1965. Natural selection and the evolution of dominance. Heredity 21: 139-146.
Frank, P. W. 1960. Prediction of population growth form in Daphnia pulex cultures. Amer. Natur. 94: 357-372.
Fraser, A. S. 1957. Simulation of genetic systems by automatic digital computers. I. Introduction. Aust. J. Biol. Sci. 10: 484-491.
Fraser, A. S. 1957a. Simulation of genetic systems by automatic digital computers. II. Effects of linkage on rates of advance under selection. Aust. J. Biol. Sci. 10: 492-499.
Fraser, A. S. 1958. Monte Carlo analyses of genetic models. Nature 181: 208-209.
Fraser, A. S. 1960. Simulation of genetic systems by automatic digital computers. V. Linkage, dominance, and epistasis. Biometrical Genetics. Ed. by O. Kempthorne, Pergamon Press, New York. Pp. 70-83.
Fraser, A. S. 1960a. Simulation of genetic systems by automatic digital computers. VI. Epistasis. Aust. J. Biol. Sci. 13: 150-162.
Fraser, A. S. 1960b. Simulation of genetic systems by automatic digital computers. VII. Effects of reproductive rate and intensity of selection on genetic structure. Aust. J. Biol. Sci. 13: 344-350.
Fraser, A. S. 1962. Simulation of genetic systems. J. Theor. Biol. 2: 329-346.
Fraser, A. S. 1967. Gametic disequilibrium in multigenic systems under normalizing selection. Genetics 55: 507-512.
Fraser, A. S., and D. Burnell. 1967. Simulation of genetic systems. XI. Inversion polymorphism. Amer. J. Hum. Genet. 19: 270-287.
Fraser, A. S., D. Burnell, and D. Miller. 1966. Simulation of genetic systems. X. Inversion polymorphism. J. Theor. Biol. 13: 1-14.
Fraser, A. S., and P. E. Hansche. 1964. Simulation of genetic systems. Major and minor loci. Genetics Today. Pergamon Press, New York. Pp. 507-516.
Fraser, A. S., D. Miller, and D. Burnell. 1965. Polygenic balance. Nature 206: 114.
Freese, E. 1962. On the evolution of base composition of DNA. J. Theor. Biol. 3: 82-101.
Freire-Maia, N. 1964. On the methods available for estimating the load of mutations disclosed by inbreeding. Cold Spring Harbor Symp. Quant. Biol. 29: 31-39.
Frota-Pessoa, O. 1957. The estimation of the size of isolates based on census data. Amer. J. Hum. Genet. 2: 9-16.
Frydenberg, O. 1963. Population studies of a lethal mutant in Drosophila melanogaster. I. Behaviour in populations with discrete generations. Hereditas 50: 89-116.
Gabriel, M. L. 1965. Primitive genetic mechanisms and the origin of chromosomes. Amer. Natur. 94: 257-269.
Gale, J. S. 1964. Some applications of the theory of junctions. Biometrics 20: 85-117.
Galton, F. 1889. Natural Inheritance. Macmillan & Co., London.
Garber, M. J. 1951. Approach to genotypic equilibrium with varying percentage of self-fertilization. J. Hered. 42: 299-300.
Garfinkel, David. 1962. Digital computer simulation of ecological systems. Nature 194: 856-857.
Gates, C. E., R. E. Comstock, and H. F. Robinson. 1957. Generalized genetic variance and covariance formulae for self-fertilized crops assuming linkage. Genetics 42: 749-763.
Gause, G. F. 1934. The Struggle for Existence. Williams and Wilkins, Baltimore.
Geiringer, H. 1944. On the probability theory of linkage in Mendelian heredity. Ann. Math. Stat. 15: 25-57.
Geiringer, H. 1945. Further remarks on linkage theory in Mendelian heredity. Ann. Math. Stat. 16: 390-393.
Geiringer, H. 1947. Contribution to the heredity theory of multivalents. J. Math. Phys. 26: 246-278.
Geiringer, H. 1948. On the mathematics of random mating in case of different recombination values for males and females. Genetics 33: 548-564.
Geiringer, H. 1949. On some mathematical problems arising in the development of Mendelian genetics. J. Amer. Stat. Assoc. 44: 526-547.
Geiringer, H. 1949a. Chromatid segregation in tetraploids and hexaploids. Genetics 34: 665-684.
Geiringer, H. 1949b. Contribution to the linkage theory of autopolyploids. Bull. Math. Biophys. 11: 59-82, 197-219.
Ghai, G. L. 1964. The genotypic composition and variability in plant populations under mixed self-fertilization and random mating. J. Indian Soc. Agri. Stat. 16: 94-125.
Ghai, G. L. 1967. Loss of heterozygosity in populations under mixed random mating and selfing. J. Indian Soc. Agri. Stat. 18: 73-81.
Gilbert, N. E. G. 1960. Predicting performance in F1 and F2 generations. Heredity 13: 146-149.
Gilbert, N. E. G. 1960a. Polygene analysis. Genet. Res. 2: 96-105.
Gilbert, N. E. G. 1961. Polygene analysis. II. Selection. Genet. Res. 2: 456-460.
Gill, J. L. 1965. Effects of finite size on selection advances in simulated genetic populations. Aust. J. Biol. Sci. 18: 599-617.
Gill, J. L. 1965a. A Monte Carlo evaluation of predicted selection response. Aust. J. Biol. Sci. 18: 999-1007.
Gill, J. L. 1965b. Selection and linkage in simulated genetic populations. Aust. J. Biol. Sci. 18: 1171-1187.
Gill, J. L., and B. A. Clemmer. 1966. Effects of selection and linkage on degree of inbreeding. Aust. J. Biol. Sci. 19: 307-317.
Gillois, M. 1964. La Relation d'Identité en Génétique. Unpublished thesis, Faculty of Science, Univ. of Paris.
Gillois, M. 1966. Le concept d'identité et son importance en génétique. Annales de Génétique 9: 58-65.
Goldberg, S. 1950. On a Singular Diffusion Equation. Ph.D. thesis. Cornell University, Ithaca, N.Y.
Goodhart, C. B. 1963. The Sewall Wright effect. Amer. Natur. 97: 407-409.
Goodman, L. A. 1967. The probabilities of extinction for birth-and-death processes that are age-dependent or phase-dependent. Biometrika 54: 579-596.
Goodman, L. A. 1968. Stochastic models for the population growth of sexes. Biometrika 55: 469-487.
Gowe, R. S. A., A. Robertson, and B. D. H. Latter. 1959. Environment and poultry breeding problems. 5. The design of poultry control strains. Poult. Sci. 38: 462-471.
Gowen, J. W., Ed. 1952. Heterosis. Iowa State Coll. Press, Ames, Iowa.
Grant, V. 1963. The Origin of Adaptations. Columbia University Press, New York.
Griffing, B. 1950. Analysis of quantitative gene action by constant parent regression and related techniques. Genetics 35: 303-321.
Griffing, B. 1956. Concept of general and specific combining ability in relation to diallel crossing systems. Aust. J. Biol. Sci. 9: 463-493.
Griffing, B. 1956a. A generalized treatment of the use of diallel crosses in quantitative inheritance. Heredity 10: 31-50.
Griffing, B. 1957. Linkage in trisomic inheritance. Heredity 11: 67-92.
Griffing, B. 1960. Theoretical consequences of truncation selection based on the individual phenotype. Aust. J. Biol. Sci. 13: 309-343.
Griffing, B. 1960a. Accommodation of linkage in mass selection theory. Aust. J. Biol. Sci. 13: 501-526.
Griffing, B. 1961. Accommodation of gene-chromosome configuration effects in quantitative inheritance and selection theory. Aust. J. Biol. Sci. 14: 402-414.
Griffing, B. 1962. Consequences of truncation selection based on combinations of individual performance and general combining ability. Aust. J. Biol. Sci. 15: 333-351.
Griffing, B. 1962a. Prediction formulae for general combining ability selection methods utilizing one or two random-mating populations. Aust. J. Biol. Sci. 15: 650-665.
Griffing, B. 1963. Comparisons of potentials for general combining ability selection methods utilizing one or two random-mating populations. Aust. J. Biol. Sci. 16: 838-862.
Griffing, B. 1965. Influence of sex on selection. I. Contribution of sex-linked genes. Aust. J. Biol. Sci. 18: 1157-1170.
Griffing, B. 1966. Influence of sex on selection. II. Contribution of autosomal genotypes having different values in the two sexes. Aust. J. Biol. Sci. 19: 593-606.
Griffing, B. 1966a. Influence of sex on selection. III. Joint contributions of sex-linked and autosomal genes. Aust. J. Biol. Sci. 19: 775-794.
Griffing, B. 1967. Selection in reference to biological groups. I. Individual and group selection applied to populations of unordered groups. Aust. J. Biol. Sci. 20: 127-139.
Griffing, B. 1968. Selection in reference to biological groups. II. Consequences of selection in groups of one size when evaluated in groups of a different size. Aust. J. Biol. Sci. 21: 1163-1170.
Hagedoorn, A. L., and A. C. Hagedoorn. 1921. The Relative Value of the Processes Causing Evolution. Martinus Nijhoff, The Hague.
Haigh, J. 1969. An enumeration problem in self-sterility. Biometrics 25: 39-47.
Hairston, N. G. 1959. Species abundance and community organization. Ecology 40: 404-416.
Hairston, N. G., F. E. Smith, and L. B. Slobodkin. 1960. Community structure, population control, and competition. Amer. Natur. 94: 421-425.
Hajnal, J. 1963. Concepts of random mating and the frequency of consanguineous marriages. Proc. Roy. Soc. B 159: 125-177.
Haldane, J. B. S. 1919. The combination of linkage values, and the calculation of distance between loci of linked factors. J. Genet. 8: 299-309.
Haldane, J. B. S. 1924. A mathematical theory of natural and artificial selection. Part I. Trans. Camb. Phil. Soc. 23: 19-41.
Haldane, J. B. S. 1924a. A mathematical theory of natural and artificial selection. Part II. Biol. Proc. Camb. Phil. Soc., Biol. Sci. 1: 158-163.
Haldane, J. B. S. 1924b. A mathematical theory of natural and artificial selection. Part III. Proc. Camb. Phil. Soc. 23: 363-372.
Haldane, J. B. S. 1924c. A mathematical theory of natural and artificial selection. Part IV. Proc. Camb. Phil. Soc. 23: 235-243.
Haldane, J. B. S. 1927. A mathematical theory of natural and artificial selection. Part V. Selection and mutation. Proc. Camb. Phil. Soc. 28: 838-844.
Haldane, J. B. S. 1930. A mathematical theory of natural and artificial selection. VI. Isolation. Proc. Camb. Phil. Soc. 26: 220-230.
Haldane, J. B. S. 1930a. A mathematical theory of natural and artificial selection. VII. Selection intensity as a function of mortality rate. Proc. Camb. Phil. Soc. 27: 131-136.
Haldane, J. B. S. 1930b. A mathematical theory of natural and artificial selection. VIII. Metastable populations. Proc. Camb. Phil. Soc. 27: 137-142.
Haldane, J. B. S. 1930c. A note on Fisher's theory of the origin of dominance and on a correlation between dominance and linkage. Amer. Natur. 64: 87-90.
Haldane, J. B. S. 1930d. The theoretical genetics of autopolyploids. J. Genet. 22: 359-372.
Haldane, J. B. S. 1932. A mathematical theory of natural and artificial selection. IX. Rapid selection. Proc. Camb. Phil. Soc. 28: 244-248.
Haldane, J. B. S. 1932a. The Causes of Evolution. Harper & Row, New York.
Haldane, J. B. S. 1936. The amount of heterozygosis to be expected in an approximately pure line. J. Genet. 32: 375-391.
Haldane, J. B. S. 1937. The effect of variation on fitness. Amer. Natur. 71: 337-349.
Haldane, J. B. S. 1937a. Some theoretical results of continued brother-sister mating. J. Genet. 34: 265-274.
Haldane, J. B. S. 1938. Indirect evidence for the mating system in natural populations. J. Genet. 36: 213-220.
Haldane, J. B. S. 1939. The spread of harmful autosomal recessive genes in human populations. Ann. Eugen. 9: 232-237.
Haldane, J. B. S. 1939a. The equilibrium between mutation and random extinction. Ann. Eugen. 9: 400-405.
Haldane, J. B. S. 1939b. The theory of the evolution of dominance. J. Genet. 37: 365-374.
Haldane, J. B. S. 1940. The conflict between selection and mutation of harmful recessive genes. Ann. Eugen. 10: 417-421.
Haldane, J. B. S. 1941. Selection against heterozygosis in man. Ann. Eugen. 11: 333-340.
Haldane, J. B. S. 1946. The interaction of nature and nurture. Ann. Eugen. 13: 197-205.
Haldane, J. B. S. 1947. The dysgenic effect of induced recessive mutations. Ann. Eugen. 14: 35-43.
Haldane, J. B. S. 1948. The theory of a cline. J. Genet. 48: 277-284.
Haldane, J. B. S. 1948a. The number of genotypes which can be formed with a given number of genes. J. Genet. 49: 117-119.
Haldane, J. B. S. 1949. Human evolution: past and future. Genetics, Paleontology, and Evolution. Ed. by G. L. Jepsen, E. Mayr, and G. G. Simpson. Princeton Univ. Press, Princeton, N.J. Pp. 405-418.
Haldane, J. B. S. 1949a. Disease and evolution. La Ricerca Scient. 19: 1-11.
Haldane, J. B. S. 1949b. Parental and fraternal correlations in fitness. Ann. Eugen. 14: 288-292.
Haldane, J. B. S. 1949c. The association of characters as a result of inbreeding and linkage. Ann. Eugen. 15: 15-23.
Haldane, J. B. S. 1949d. Suggestions as to quantitative measurement of rates of evolution. Evolution 3: 51-56.
Haldane, J. B. S. 1949e. Some statistical problems arising in genetics. J. Roy. Stat. Soc. B 11: 1-14.
Haldane, J. B. S. 1949f. The rate of mutation of human genes. Proc. Eighth Int. Cong. Genet., Stockholm. Pp. 267-273.
Haldane, J. B. S. 1951. The mathematics of biology. Sci. J. Roy. Coll. Sci. 22: 1-11.
Haldane, J. B. S. 1953. Some animal life tables. J. Inst. Actuaries 79: 83-89.
Haldane, J. B. S. 1953a. Animal populations and their regulation. New Biology 15: 9-24.
Haldane, J. B. S. 1954. The measurement of natural selection. Caryologia 6: 480-487 (Suppl.).
Haldane, J. B. S. 1954a. The statics of evolution. Evolution as a Process. Ed. by J. Huxley, A. C. Hardy, and E. B. Ford. Allen and Unwin, London. Pp. 109-121.
Haldane, J. B. S. 1954b. An exact test for randomness of mating. J. Genet. 52: 631-635.
Haldane, J. B. S. 1955. Population genetics. New Biology 18: 34-51.
Haldane, J. B. S. 1955a. The complete matrices for brother-sister and alternate parent-offspring mating involving one locus. J. Genet. 53: 315-324.
Haldane, J. B. S. 1955b. On the biochemistry of heterosis, and the stabilization of polymorphism. Proc. Roy. Soc. B 144: 217-220.
Haldane, J. B. S. 1956. The estimation of viabilities. J. Genet. 54: 294-296.
Haldane, J. B. S. 1956a. The relation between density regulation and natural selection. Proc. Roy. Soc. London B 145: 306-308.
Haldane, J. B. S. 1956b. The conflict between inbreeding and selection. I. Self-fertilization. J. Genet. 54: 56-63.
Haldane, J. B. S. 1956c. The theory of selection for melanism in Lepidoptera. Proc. Roy. Soc. B 145: 303-308.
Haldane, J. B. S. 1957. The cost of natural selection. J. Genet. 55: 511-524.
Haldane, J. B. S. 1957a. The conditions for co-adaptation in polymorphism for inversions. J. Genet. 55: 218-225.
Haldane, J. B. S. 1958. The theory of evolution before and after Bateson. J. Genet. 56: 11-27.
Haldane, J. B. S. 1960. More precise expressions for the cost of natural selection. J. Genet. 57: 351-360.
Haldane, J. B. S. 1961. Some simple systems of artificial selection. J. Genet. 56: 345-350.
Haldane, J. B. S. 1961a. Natural selection in man. Prog. Med. Gen. 1: 27-37.
Haldane, J. B. S. 1962. Conditions for stable polymorphism at an autosomal locus. Nature 193: 1108.
Haldane, J. B. S. 1962a. Natural selection in a population with annual breeding but overlapping generations. J. Genet. 58: 122-124.
Haldane, J. B. S. 1962b. The selection of double heterozygotes. J. Genet. 58: 125-128.
Haldane, J. B. S. 1963. Tests for sex-linked inheritance on population samples. Ann. Hum. Genet. 27: 107-111.
Haldane, J. B. S. 1964. A defense of beanbag genetics. Persp. in Biol. and Med. 7: 343-359.
Haldane, J. B. S., and S. D. Jayakar. 1963. The solution of some equations occurring in population genetics. J. Genet. 58: 291-317.
Haldane, J. B. S., and S. D. Jayakar. 1963a. Polymorphism due to selection of varying direction. J. Genet. 58: 237-242.
Haldane, J. B. S., and S. D. Jayakar. 1963b. Polymorphism due to selection depending on the composition of a population. J. Genet. 58: 318-323.
Haldane, J. B. S., and S. D. Jayakar. 1963c. The elimination of double dominants in large random mating populations. J. Genet. 58: 243-251.
Haldane, J. B. S., and S. D. Jayakar. 1964. Equilibria under natural selection at a sex-linked locus. J. Genet. 59: 29-36.
Haldane, J. B. S., and S. D. Jayakar. 1965. The nature of human genetic loads. J. Genet. 59: 143-149.
Haldane, J. B. S., and S. D. Jayakar. 1965a. Selection for a single pair of allelomorphs with complete replacement. J. Genet. 59: 171-177.
Haldane, J. B. S., and P. Moshinsky. 1939. Inbreeding in Mendelian populations with special reference to human cousin marriage. Ann. Eugen. 9: 321-340.
Haldane, J. B. S., and C. H. Waddington. 1931. Inbreeding and linkage. Genetics 16: 357-374.
Hamilton, W. D. 1963. The evolution of altruistic behavior. Amer. Natur. 97: 354-356.
Hamilton, W. D. 1964. The genetical evolution of social behavior. I. J. Theor. Biol. 7: 1-16. II. J. Theor. Biol. 7: 17-52.
Hamilton, W. D. 1967. Extraordinary sex ratios. Science 155: 477-488.
Hansche, P. E., S. K. Jain, and R. W. Allard. 1966. The effect of epistasis and gametic unbalance on genetic load. Genetics 54: 1027-1040.
Hanson, W. D. 1958. The theoretical distribution of lengths of undisturbed chromosome segments in F1 gametes. Biometrics 14: 135-136.
Hanson, W. D. 1959. The theoretical distribution of lengths of parental gene blocks in the gametes of an F1 individual. Genetics 44: 197-209.
Hanson, W. D. 1959a. Early generation analysis of lengths of heterozygous chromosome segments around a locus held heterozygous with backcrossing or selfing. Genetics 44: 833-838.
Hanson, W. D. 1959b. Theoretical distribution of the initial linkage block lengths intact in the gametes of a population intermated for n generations. Genetics 44: 839-846.
Hanson, W. D. 1959c. The breakup of initial linkage blocks under selected mating systems. Genetics 44: 857-868.
Hanson, W. D., and C. R. Weber. 1961. Resolution of genetic variability in self-pollinated species with an application to the soybean. Genetics 46: 1425-1434.
Hanson, W. D. 1962. Average recombination per chromosome. Genetics 47: 407-415.
Hanson, W. D. 1966. Effects of partial isolation (distance), migration, and different fitness requirements among environmental pockets upon steady state gene frequencies. Biometrics 22: 453-468.
Hanson, W. D., and B. I. Hayman. 1963. Linkage effects on additive genetic variance among homozygous lines arising from a cross between two homozygous parents. Genetics 48: 755-766.
Hardin, G. 1960. The competitive exclusion principle. Science 131: 1292-1298.
Hardy, G. H. 1908. Mendelian proportions in a mixed population. Science 28: 49-50.
Harris, D. L. 1964. Expected and predicted progress from index selection involving estimates of population parameters. Biometrics 20: 46-72.
Harris, D. L. 1964a. Biometrical parameters of self-fertilizing diploid populations. Genetics 50: 931-956.
Harris, D. L. 1964b. Genotypic covariances between inbred relatives. Genetics 50: 1319-1348.
Harris, D. L. 1965. Biometrical genetics in man. Methods and Goals in Human Behavior Genetics. Academic Press, New York. Pp. 81-94.
Harris, T. E. 1963. The Theory of Branching Processes. Prentice-Hall, Englewood, N.J.
Hartl, D. L., Y. Hiraizumi, and J. F. Crow. 1967. Evidence for sperm dysfunction as the mechanism of segregation distortion in Drosophila melanogaster. Proc. Nat. Acad. Sci. 58: 2240-2245.
Hartl, D. L., and T. Maruyama. 1968. Phenogram enumeration: The number of regular genotype-phenotype correspondences in genetic systems. J. Theor. Biol. 20: 129-163.
Hashiguchi, S., and H. Morishima. 1969. Estimation of genetic contribution of principal components to individual variates concerned. Biometrics 25: 9-16.
Hasofer, A. M. 1966. A continuous-time model in population genetics. J. Theor. Biol. 11: 150-163.
Hayman, B. I. 1953. Mixed selfing and random mating when homozygotes are at a disadvantage. Heredity 7: 185-192.
Hayman, B. I. 1954. The analysis of variance of diallel crosses. Biometrics 10: 235-244.
Hayman, B. I. 1954a. The theory and analysis of diallel crosses. Genetics 39: 789-809.
Hayman, B. I. 1955. Description and analysis of gene action and interaction. Cold Spring Harbor Symp. Quant. Biol. 20: 79-86.
Hayman, B. I. 1958. The theory and analysis of diallel crosses. Genetics 43: 63-85.
Hayman, B. I. 1958a. The separation of epistatic from additive and dominance variation in generation means. Heredity 12: 371-390.
Hayman, B. I. 1960. Maximum likelihood estimation of genetic components of variation. Biometrics 16: 369-381.
Hayman, B. I. 1960a. The separation of epistatic from additive and dominance variation in generation means. II. Genetica 31: 133-146.
Hayman, B. I. 1960b. The theory and analysis of diallel crosses. III. Genetics 45: 155-172.
Hayman, B. I. 1960c. Heterosis and quantitative inheritance. Heredity 15: 324-327.
Hayman, B. I. 1962. The gametic distribution in Mendelian heredity. Aust. J. Biol. Sci. 15: 166-182.
Hayman, B. I., and K. Mather. 1953. The progress of inbreeding when homozygotes are at a disadvantage. Heredity 7: 165-183.
Hayman, B. I., and K. Mather. 1955. The description of genic interactions in continuous variation. Biometrics 11: 69-82.
Hayman, B. I., and K. Mather. 1956. Inbreeding when homozygotes are at a disadvantage: A reply. Heredity 10: 271-274.
Hazel, L. N. 1943. The genetic basis for constructing selection indexes. Genetics 28: 476-490.
Hazel, L. N., and J. L. Lush. 1942. The efficiency of three methods of selection. J. Hered. 33: 393-399.
Heidhues, T. 1961. Anwendung statistischer Methoden in der modernen Tierzüchtung. Züchtungskunde 33: 1-12.
Hellwig, G. 1964. Über ein einfaches Prinzip, welches die Entropierzeugung von Lebeswesens betrifft. J. Theor. Biol. 6: 258-274.
Herbst, W. 1927. Variation, Mendelismus und Selektion in mathematischer Behandlung. Zeit. ind. Abst. Vererb. 44: 110-125.
Highton, R. 1966. The effect of mating frequency on phenotypic ratios in sibships when only one parent is known. Genetics 54: 1019-1025.
Hill, W. G. 1968. Population dynamics of linked genes in finite populations. Proc. XII Intern. Congr. Genetics 2: 146-147.
Hill, W. G. 1969. On the theory of artificial selection in finite populations. Genet. Res. 13: 143-163.
Hill, W. G. 1969a. The rate of selection advance for non-additive loci. Genet. Res. 13: 165-173.
Hill, W. G., and A. Robertson. 1966. The effect of linkage on limits to artificial selection. Genet. Res. 8: 269-294.
Hill, W. G., and A. Robertson. 1968. Linkage disequilibrium in finite populations. Theor. and Appl. Genet. 38: 226-231.
Hill, W. G., and A. Robertson. 1968a. The effects of inbreeding at loci with heterozygote advantage. Genetics 60: 615-628.
Hiraizumi, Y., L. Sandler, and J. F. Crow. 1960. Meiotic drive in natural populations of Drosophila melanogaster. III. Populational implications of the segregation distorter locus. Evolution 14: 433-444.
Hoen, K., and A. H. E. Grandage. 1960. Calculation of inbreeding in family selection studies on the IBM 650 data processing machine. Biometrics 16: 292-296.
Hogben, L. 1932. Filial and fraternal correlations in sex-linked inheritance. Proc. Roy. Soc. Edinb. 52: 331-336.
Hogben, L. 1933. A matrix notation for Mendelian populations. Proc. Roy. Soc. Edinb. 53: 7-25.
Hogben, L. 1933a. Nature and Nurture. Norton, New York.
Hogben, L. 1946. An Introduction to Mathematical Genetics. Norton, New York.
Holgate, P. 1964. Genotype frequencies in a section of a cline. Heredity 19: 501-509.
Holgate, P. 1966. A mathematical study of the founder principle of evolutionary genetics. J. Appl. Prob. 3: 115-128.
Holgate, P. 1966a. Two limit distributions in evolutionary genetics. J. Theor. Biol. 11: 362-369.
Holgate, P. 1967. Divergent population processes and mammal outbreaks. J. Appl. Prob. 4: 1-8.
Holgate, P. 1968. Interaction between migration and breeding studied by means of genetic algebras. J. Appl. Prob. 5: 1-8.
Holgate, P. 1968a. The genetic algebra of k linked loci. Proc. London Math. Soc. (3), 18: 315-327.
Horner, T. W. 1956. Parent-offspring and full-sib correlations under a parent-offspring mating system. Genetics 41: 460-468.
Horner, T. W., and C. R. Weber. 1956. Theoretical and experimental study of self-fertilized populations. Biometrics 12: 404-414.
House, V. L. 1953. The use of the binomial expansion for a classroom demonstration of drift in small populations. Evolution 7: 84-88.
Hubby, J. L., and R. C. Lewontin. 1966. A molecular approach to the study of genic heterozygosity in natural populations. I. The number of alleles at different loci in Drosophila pseudoobscura. Genetics 54: 577-594.
Hull, P. 1964. Equilibrium of gene frequency produced by partial incompatibility of offspring with dam. Proc. Nat. Acad. Sci. 51: 461-464.
Hulse, F. S. 1957. Exogamie et heterosis. Arch. Suisses Anthro. Gener. 22: 103-125.
Hutchinson, G. E. 1948. Circular causal systems in ecology. Ann. N. Y. Acad. Sci. 50: 221-246.
Hutchinson, G. E. 1954. Notes on oscillatory populations. J. Wildl. Manag. 18: 107-109.
Hutchinson, G. E. 1959. Homage to Santa Rosalia or why are there so many kinds of animals? Amer. Natur. 93: 145-159.
Hutchinson, G. E., and R. H. MacArthur. 1959. A theoretical ecological model of size distribution among species of animals. Amer. Natur. 93: 117-125.
Huxley, J. (ed.) 1940. The New Systematics. The Clarendon Press, Oxford, England.
Ivlev, V. S. 1965. On the quantitative relationship between survival rate of larvae and their food supply. Bull. Math. Biophys. 27: 215-222.
Jacquard, A. 1969. Evolution of genetic structure of small populations. Social Biol. 16: 143-157.
Jagers, P. 1969. The proportions of individuals of different kinds in two-type populations. A branching problem arising in biology. J. Appl. Prob. 6: 249-260.
Jain, H. K., and S. K. Jain. 1961. Differential non-genetic variability in the expression of major genes and polygenes. Amer. Natur. 95: 385-387.
Jain, S. K. 1961a. On the possible adaptive significance of male sterility in predominantly inbreeding populations. Genetics 46: 1237-1240.
Jain, S. K. 1963. Sex ratios under natural selection. Nature 200: 1340-41.
Jain, S. K. 1968. Simulation of models involving mixed selfing and random mating. II. Effects of selection and linkage in finite populations. Theor. & Appl. Genet. 38: 232-242.
Jain, S. K., and R. W. Allard. 1965. The nature and stability of equilibria under optimizing selection. Proc. Nat. Acad. Sci. 54: 1436-1443.
Jain, S. K., and R. W. Allard. 1966. The effects of linkage, epistasis, and inbreeding on population changes under selection. Genetics 53: 633-659.
Jain, S. K., and P. L. Workman. 1967. Generalized F-statistics and the theory of inbreeding and selection. Nature 214: 674-678.
James, J. W. 1961. Selection in two environments. Heredity 16: 145-152.
James, J. W. 1962. The spread of genes in random mating control populations. Genet. Res. 3: 1-10.
James, J. W. 1962a. The spread of genes in populations under selection. Proc. World Poult. Cong. 12: 14-16.
James, J. W. 1962b. On Schwartz and Wearden's method of estimating heritability. Biometrics 18: 123-125.
James, J. W. 1962c. Conflict between directional and centripetal selection. Heredity 17: 487-499.
James, J. W. 1965. Simultaneous selection for dominant and recessive mutants. Heredity 20: 142-144.
James, J. W. 1965b. Response curves in selection experiments. Heredity 20: 57-63.
James, J. W., and G. McBride. 1958. The spread of genes by natural and artificial selection in a closed poultry flock. J. Genet. 56: 55-62.
Jennings, H. S. 1912. The production of pure homozygotic organisms from heterozygotes by self-fertilization. Amer. Natur. 45: 487-491.
Jennings, H. S. 1914. Formula for the results of inbreeding. Amer. Natur. 48: 693-696.
Jennings, H. S. 1916. The numerical results of diverse systems of breeding. Genetics 1: 53-89.
Jennings, H. S. 1917. The numerical results of diverse systems of breeding, with respect to two pairs of characters, linked or independent, with special relation to the effects of linkage. Genetics 2: 97-154.
Jensen, L., and E. Pollak. 1969. Random selective advantages of a gene in a finite population. J. Appl. Prob. 6: 19-37.
Jepson, G. L., Ernst Mayr, and G. G. Simpson. 1949. Genetics, Paleontology, and Evolution. Princeton Univ. Press, Princeton, N.J.
Jones, D. F. 1917. Dominance of linked factors as a means of accounting for heterosis. Genetics 2: 466-479.
Jones, L. P. 1969. Effects of artificial selection on rates of inbreeding in populations of Drosophila melanogaster. Aust. J. Biol. Sci. 22: 143-169.
Jones, R. M., and K. Mather. 1958. Interaction of genotype and environment in continuous variation. II. Analysis. Biometrics 14: 489-498.
Jones, R. M. 1960. Linkage distributions and epistacy in quantitative inheritance. Heredity 15: 153-159.
Kalmus, H., and S. Maynard Smith. 1967. Some evolutionary consequences of pegmatypic mating systems. Amer. Natur. 100: 619-633.
Kalmus, H., and C. A. B. Smith. 1948. Production of pure lines in bees. J. Genetics 49: 153-158.
Karlin, S. 1966. A First Course in Stochastic Processes. Academic Press, N.Y.
Karlin, S. 1968. Equilibrium behavior of population genetic models with non-random mating. Part I. Preliminaries and special mating systems. J. Appl. Prob. 5: 231-313.
Karlin, S. 1968a. Equilibrium behavior of population genetic models with non-random mating. Part II. Pedigrees, homozygosity, and stochastic models. J. Appl. Prob. 5: 487-566.
Karlin, S. 1968b. Rates of approach to homozygosity for finite stochastic models with variable population size. Amer. Natur. 102: 443-455.
Karlin, S., and M. W. Feldman. 1968. Analysis of models with homozygote × heterozygote matings. Genetics 59: 105-116.
Karlin, S., and M. W. Feldman. 1968a. Further analysis of negative assortative mating. Genetics 59: 117-136.
Karlin, S., and J. L. McGregor. 1961. The Hahn polynomials, formulas and an application. Scripta Mathematica 26: 33-46.
Karlin, S., and J. McGregor. 1962. On a genetics model of Moran. Proc. Camb. Phil. Soc. 58: 299-311.
Karlin, S., and J. McGregor. 1964. On some stochastic models in genetics. Stochastic Models in Medicine and Biology. Ed. by J. Gurland. Univ. of Wisconsin Press, Madison, Wisc. Pp. 245-279.
Karlin, S., and J. McGregor. 1964a. Direct product branching processes and related Markov chains. Proc. Nat. Acad. Sci. 51: 598-602.
Karlin, S., and J. L. McGregor. 1965. Direct product branching processes and related induced Markoff chains. I. Calculations of rates of approach to homozygosity. Bernoulli, Bayes, Laplace Anniversary Volume. Springer Verlag, New York. Pp. 111-145.
Karlin, S., and J. McGregor. 1967. The number of mutant forms maintained in a population. Proc. Fifth Berkeley Symp. Math. Stat. Prob. 4: 415-438.
Karlin, S., and J. McGregor. 1968. The role of the Poisson progeny distribution in population genetics models. Math. Biosciences 2: 11-17.
Karlin, S., and J. McGregor. 1968a. Rates and probabilities of fixation for two locus random mating finite populations without selection. Genetics 58: 141-159.
Karlin, S., J. McGregor, and W. F. Bodmer. 1967. The rate of production of recombinants between linked genes in finite populations. Proc. Fifth Berkeley Symp. Math. Stat. Prob. 4: 403-414.
Karlin, S., and F. M. Scudo. 1969. Assortative mating based on phenotype. II. Two autosomal alleles without dominance. Genetics 63: 499-510.
Kemp, W. B. 1929. Genetic equilibrium and selection. Genetics 14: 85-127.
Kempthorne, O. 1954. The correlation between relatives in a random mating population. Proc. Roy. Soc. B 143: 103-113.
Kempthorne, O. 1955. The theoretical values of correlations between relatives in random mating populations. Genetics 40: 153-167.
Kempthorne, O. 1955a. The correlations between relatives in random mating populations. Cold Spring Harbor Symp. Quant. Biol. 20: 60-75.
Kempthorne, O. 1955b. The correlation between relatives in a simple autotetraploid population. Genetics 40: 168-174.
Kempthorne, O. 1955c. The correlations between relatives in inbred populations. Genetics 40: 681-691.
Kempthorne, O. 1956. The theory of the diallel cross. Genetics 41: 451-459.
Kempthorne, O. 1957. An Introduction to Genetic Statistics. John Wiley and Sons, New York.
Kempthorne, O. 1960. Biometrical Genetics. Pergamon Press, New York.
Kempthorne, O. 1967. The concept of identity of genes by descent. Proc. Fifth Berkeley Symp. Math. Stat. Prob. 4: 333-348.
Kempthorne, O., and O. B. Tandon. 1953. The estimation of heritability by regression of offspring on parent. Biometrics 9: 90-100.
Kerner, E. H. 1957. A statistical mechanics of interacting biological species. Bull. Math. Biophys. 19: 121-141.
Kerner, E. H. 1959. Further considerations on the statistical mechanics of biological associations. Bull. Math. Biophys. 21: 217-255.
Kerner, E. H. 1961. On the Volterra-Lotka principle. Bull. Math. Biophys. 23: 141-157.
Kerr, W. E. 1967. Multiple alleles and genetic load in bees. J. Apicult. Res. 6: 61-64.
Kerr, W. E., and S. Wright. 1954. Experimental studies of the distribution of gene frequencies in very small populations of Drosophila melanogaster. I. Forked. Evolution 8: 172-177.
Kerr, W. E., and S. Wright. 1954a. Experimental studies of the distribution of gene frequencies in very small populations of Drosophila melanogaster. III. Aristapedia and spineless. Evolution 8: 293-301.
Kerster, Harold W. 1964. Neighborhood size in the rusty lizard, Sceloporus olivaceus. Evolution 18: 445-457.
Keyfitz, N. 1964. The population projection as a matrix operator. Demography 1: 56-73.
Keyfitz, N. 1967. Estimating the trajectory of a population. Proc. Fifth Berkeley Symp. Math. Stat. Prob. 4: 81-114.
Keyfitz, N., and E. M. Murphy. 1967. Matrix and multiple decrement in population analysis. Biometrics 23: 485-503.
Khazanie, R. G. 1968. An indication of the asymptotic nature of the Mendelian Markov process. J. Appl. Prob. 5: 350-356.
Khazanie, R. G., and H. E. McKean. 1966. A Mendelian Markov process with binomial transition probabilities. Biometrika 53: 37-48.
Kimura, M. 1954. Process leading to quasi-fixation of genes in natural populations due to random fluctuations of selection intensities. Genetics 39: 280-295.
Kimura, M. 1955. Stochastic processes and distribution of gene frequencies under natural selection. Cold Spring Harbor Symp. Quant. Biol. 20: 33-53.
Kimura, M. 1955a. Solution of a process of random genetic drift with a continuous model. Proc. Nat. Acad. Sci. 41: 144-150.
Kimura, M. 1955b. Random genetic drift in a multi-allelic locus. Evolution 9: 419-435.
Kimura, M. 1956. Random genetic drift in a tri-allelic locus; exact solution with a continuous model. Biometrics 12: 57-66.
Kimura, M. 1956a. Rules for testing stability of a selective polymorphism. Proc. Nat. Acad. Sci. 42: 336-340.
Kimura, M. 1956b. A model of a genetic system which leads to closer linkage by natural selection. Evolution 10: 278-287.
Kimura, M. 1957. Some problems of stochastic processes in genetics. Ann. Math. Stat. 28: 882-901.
Kimura, M. 1958. On the change of population fitness by natural selection. Heredity 12: 145-167.
Kimura, M. 1958a. Theoretical basis for the study of inbreeding in man (in Japanese with English summary). Jap. J. Hum. Genet. 3: 51-70.
Kimura, M. 1958b. Zygotic frequencies in a partially self-fertilizing population. Ann. Rep. Nat. Inst. Genet., Japan. 8: 104-105.
Kimura, M. 1959. Conflict between self-fertilization and outbreeding in plants. Ann. Rep. Nat. Inst. Genet. Japan 9: 87-88.
Kimura, M. 1960. Evolution of epistasis between closely linked loci (in Japanese). Jap. J. Genet. 35: 274.
Kimura, M. 1960a. Outline of Population Genetics. Baifukan, Tokyo (in Japanese).
Kimura, M. 1960b. Optimum mutation rate and degree of dominance as determined by the principle of minimum genetic load. J. Genet. 57: 21-34.
Kimura, M. 1960c. Relative applicability of the classical and the balance hypothesis to man, especially with respect to quantitative characters. J. Rad. Res. 1-2: 155-164.
Kimura, M. 1960d. Genetic load of a population and its significance in evolution (in Japanese). Jap. J. Genet. 35: 7-33.
Kimura, M. 1961. Some calculations on the mutational load. Jap. J. Genet. 36 suppl: 179-190.
Kimura, M. 1961a. Natural selection as the process of accumulating genetic information in adaptive evolution. Genet. Res. 2: 127-140.
Kimura, M. 1962. On the probability of fixation of mutant genes in a population. Genetics 47: 713-719.
Kimura, M. 1963. A probability method for treating inbreeding systems, especially with linked genes. Biometrics 19: 1-17.
Kimura, M. 1964. Diffusion models in population genetics. J. App. Prob. 1: 177-232.
Kimura, M. 1965. Some recent advances in the theory of population genetics. Jap. J. Hum. Genet. 10(2): 43-48.
Kimura, M. 1965a. A stochastic model concerning the maintenance of genetic variability in quantitative characters. Proc. Nat. Acad. Sci. 54: 731-736.
Kimura, M. 1965b. Attainment of quasi linkage equilibrium when gene frequencies are changing by natural selection. Genetics 52: 875-890.
Kimura, M. 1967. On the evolutionary adjustment of spontaneous mutation rates. Genet. Res. 9: 23-34.
Kimura, M. 1968. Genetic variability maintained in a finite population due to mutational production of neutral and nearly neutral isoalleles. Genet. Res. 11: 247-269.
Kimura, M. 1968a. Evolutionary rate at the molecular level. Nature 217: 624-626.
Kimura, M. 1969. The number of heterozygous nucleotide sites maintained in a finite population due to steady flux of mutations. Genetics 61: 893-903.
Kimura, M. 1969a. The length of time required for a selectively neutral mutant to reach fixation through random frequency drift in a finite population. Genet. Res. (in press).
Kimura, M. 1969b. The rate of molecular evolution considered from the standpoint of population genetics. Proc. Nat. Acad. Sci. 63: 1181-1188.
Kimura, M., and J. F. Crow. 1963. On the maximum avoidance of inbreeding. Genet. Res. 4: 399-415.
Kimura, M., and J. F. Crow. 1963a. The measurement of effective population number. Evolution 17: 279-288.
Kimura, M., and J. F. Crow. 1964. The number of alleles that can be maintained in a finite population. Genetics 49: 725-738.
Kimura, M., and J. F. Crow. 1969. Natural selection and gene substitution. Genet. Res. 13: 127-142.
Kimura, M., and H. Kayano. 1961. The maintenance of supernumerary chromosomes in wild populations of Lilium callosum by preferential segregation. Genetics 46: 1699-1712.
Kimura, M., and T. Maruyama. 1966. The mutational load with epistatic gene interactions in fitness. Genetics 54: 1337-1351.
Kimura, M., and T. Maruyama. 1969. The substitutional load in a finite population. Heredity 24: 101-114.
Kimura, M., T. Maruyama, and J. F. Crow. 1963. The mutation load in small populations. Genetics 48: 1303-1312.
Kimura, M., and T. Ohta. 1969. The average number of generations until fixation of a mutant gene in a finite population. Genetics 61: 763-771.
Kimura, M., and T. Ohta. 1970. Genetic loads at a polymorphic locus which is maintained by frequency dependent selection. Genet. Res. (in press).
Kimura, M., and G. H. Weiss. 1964. The stepping stone model of population structure and the decrease of genetic correlation with distance. Genetics 49: 561-576.
King, J. C. 1961. Inbreeding, heterosis, and information theory. Amer. Natur. 95: 345-364.
King, J. L. 1965. The effect of litter culling - or family planning - on the rate of natural selection. Genetics 51: 425-429.
King, J. L. 1966. The gene interaction component of the genetic load. Genetics 53: 403-413.
King, J. L. 1967. Continuously distributed factors affecting fitness. Genetics 55: 483-492.
King, J. L., and T. H. Jukes. 1969. Non-Darwinian evolution. Science 164: 788-798.
Kingman, J. F. C. 1961. A matrix inequality. Quart. J. Math. 12: 78-80.
Kingman, J. F. C. 1961a. A mathematical problem in population genetics. Proc. Camb. Phil. Soc. 57: 574-582.
Kirkman, H. N. 1966. Properties of X linked alleles during selection. Amer. J. Hum. Genet. 18: 424-432.
Klopfer, P. H., and R. H. MacArthur. 1960. Niche size and faunal diversity. Amer. Natur. 94: 293-300.
Klopfer, P. H., and R. H. MacArthur. 1961. On the causes of tropical species diversity: Niche overlap. Amer. Natur. 95: 223-226.
Knight, G. R., and A. Robertson. 1957. Fitness as a measurable character in Drosophila. Genetics 42: 524-530.
Kojima, K. 1959. Role of epistasis and overdominance in stability of equilibria with selection. Proc. Nat. Acad. Sci. 45: 984-989.
Kojima, K. 1959a. Stable equilibria for the optimum model. Proc. Nat. Acad. Sci. 45: 989-993.
Kojima, K. I. 1961. Effects of dominance and size of population on response to mass selection. Genet. Res. 2: 177-188.
Kojima, K. I. 1965. The evolutionary dynamics of two-gene systems. Computers in Biomedical Research. Vol. I. Acad. Press, New York. Pp. 197-220.
Kojima, K. I., and T. M. Kelleher. 1961. Changes of mean fitness in random mating populations when epistasis and linkage are present. Genetics 46: 527-540.
Kojima, K. I., and T. M. Kelleher. 1962. Survival of mutant genes. Amer. Natur. 96: 329-346.
Kojima, K. I., and H. E. Schaffer. 1964. Accumulation of epistatic gene complexes. Evolution 18: 127-129.
Kojima, K. I., and H. E. Schaffer. 1967. Survival process of linked mutant genes. Evolution 21: 518-531.
Kolman, W. 1961. The mechanism of natural selection for the sex ratio. Amer. Natur. 95: 373-377.
Kolmogorov, A. 1931. Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung. Math. Ann. 104: 415-458.
Kolmogorov, A. 1935. Deviations from Hardy's formula in partial isolation. C. R. (Doklady) Acad. Sci. URSS. 3(63): 129-132.
Kolmogorov, A. N. 1959. Transition of branching processes into diffusion processes and some problems in genetics (in Russian). Teor. Veroy. i. Primenen 4: 233-236.
Komai, T., M. Chino, and Y. Hosino. 1950. Contributions to the evolutionary genetics of the lady-beetle, Harmonia. I. Geographical and temporal variations in the relative frequency of the elytral pattern types and in the frequency of elytral ridge. Genetics 35: 589-601.
Korde, V. T. 1960. The correlations between relatives for a sex-linked character under inbreeding. Heredity 14: 401-409.
Korn, G. A., and T. M. Korn. 1968. Mathematical Handbook for Scientists and Engineers. 2nd Ed. McGraw-Hill, New York.
Kosambi, D. D. 1943. The estimation of map distances from recombination values. Ann. Eugen. 12: 172-175.
Kosswig, C. 1960. Genetische Analyse stammesgeschichtlicher Einheiten. Verh. deutsch. Zool. Ges. 1959: 42-73. Zool. Anz., Suppl. 23: 42-73.
Kostitzin, V. A. 1939. Mathematical Biology. Harrap, London.
Krieger, H., and N. Freire-Maia. 1961. Estimate of the load of mutations in homogeneous populations from data on mixed samples. Genetics 46: 1565-1566.
Kudo, A. 1962. A method for calculating the inbreeding coefficient. Am. J. Hum. Genet. 14: 426-432.
Kudo, A., and K. Sakaguchi. 1963. A method for calculating the inbreeding coefficient. II. Sex-linked genes. Amer. J. Hum. Genet. 15: 476-480.
Kyle, W. H., and A. B. Chapman. 1953. Experimental check of the effectiveness of selection for a quantitative character. Genetics 38: 421-433.
Lack, D. 1954. The evolution of reproductive rates. In Evolution as a Process. Ed. by J. Huxley, A. C. Hardy, and E. B. Ford. Allen and Unwin, London. Pp. 172-187.
Lagervall, P. M. 1960. Quantitative inheritance and dominance. I. The coefficient of relationship caused by dominance. Hereditas 46: 481-496.
Lagervall, P. M. 1961. Quantitative inheritance and dominance. II. The genic and the dominance variance under inbreeding. Hereditas 47: 111-130.
Lagervall, P. M. 1961a. Quantitative inheritance and dominance. III. The genetic covariance of relatives in inbred populations. Hereditas 47: 131-159.
Lagervall, P. M. 1961b. Quantitative inheritance and dominance. IV. The average degree of dominance. Hereditas 47: 197-202.
Laidlaw, H. H., F. P. Gomes, and W. E. Kerr. 1956. Estimation of the number of lethal alleles in a panmictic population of Apis mellifera L. Genetics 41: 179-188.
Latter, B. D. H. 1959. Genetic sampling in a random mating population of constant size and sex ratio. Aust. J. Biol. Sci. 12: 500-505.
Latter, B. D. H. 1960. Natural selection for an intermediate optimum. Aust. J. Biol. Sci. 13: 30-35.
Latter, B. D. H. 1964. The evolution of non-additive genetic variance under artificial selection. I. Modification of dominance at a single autosomal locus. Aust. J. Biol. Sci. 17: 427-435.
Latter, B. D. H. 1965. The response to artificial selection due to autosomal genes of large effect. I. Changes in gene frequency at an additive locus. Aust. J. Biol. Sci. 18: 585-598.
Latter, B. D. H. 1965a. The response to artificial selection due to autosomal genes of large effect. II. The effects of linkage on limits to selection in finite populations. Aust. J. Biol. Sci. 18: 1009-1024.
Latter, B. D. H. 1965b. Quantitative genetic analysis in Phalaris tuberosa. I. The statistical theory of open-pollinated progeny. Genet. Res. 6: 360-370.
Latter, B. D. H. 1966. The response to artificial selection due to autosomal genes of large effect. III. The effects of linkage on the rate of advance and approach to fixation in finite populations. Aust. J. Biol. Sci. 19: 131-146.
Latter, B. D. H. 1966a. The interaction between effective population size and linkage intensity under artificial selection. Genet. Res. 7: 313-323.
Latter, B. D. H., and C. E. Novitski. 1969. Selection in finite populations with multiple alleles. I. Limits to directional selection. Genetics 62: 859-876.
Latter, B. D. H., and A. Robertson. 1962. The effects of inbreeding and artificial selection on reproductive fitness. Genet. Res. 3: 110-138.
Lea, D. E., and C. A. Coulson. 1949. The distribution of the numbers of mutants in bacterial populations. J. Genet. 49: 264-285.
Lee, B. T. O., and P. A. Parsons. 1968. Selection, prediction and response. Biol. Rev. 43: 139-174.
Lefkovitch, L. P. 1965. The study of population growth in organisms grouped by stages. Biometrics 21: 1-18.
Lefkovitch, L. P. 1966. A population model incorporating delayed responses. Bull. Math. Biophys. 28: 219-233.
Leigh, E. G. 1965. On the relation between the productivity, biomass, diversity, and stability of a community. Proc. Nat. Acad. Sci. 53: 777-783.
Leigh, E. 1966. Ecological Aspects of Population Genetics. Unpublished thesis, Yale University.
Lerner, I. M. 1950. Population Genetics and Animal Improvement. The University Press, Cambridge.
Lerner, I. M. 1954. Genetic Homeostasis. Oliver and Boyd, London.
Lerner, I. M. 1958. The Genetic Basis of Selection. Wiley, New York.
Lerner, I. M., and E. R. Dempster. 1962. Indeterminism and interspecific competition. Proc. Nat. Acad. Sci. 48: 821-826.
Le Roy, H. L. 1960. Statistische Methoden der Populationsgenetik. Birkhausen Verlag, Basel.
Leslie, P. H. 1945. On the use of matrices in certain population mathematics. Biometrika 33: 183-212.
Leslie, P. H. 1948. Some further remarks on the use of matrices in population mathematics. Biometrika 35: 213-245.
Leslie, P. H. 1958. A stochastic model for studying the properties of certain biological systems by numerical methods. Biometrika 45: 16-31.
Leslie, P. H., and J. C. Gower. 1958. The properties of a stochastic model for two competing species. Biometrika 45: 316-330.
Leslie, P. H., and J. C. Gower. 1960. The properties of a stochastic model for the predator-prey type of interaction between two species. Biometrika 47: 219-234.
Levene, H. 1949. On a matching problem arising in genetics. Ann. Math. Stat. 20: 91-94.
Levene, H. 1949a. A new measure of sexual isolation. Evolution 3: 315-321.
Levene, H. 1953. Genetic equilibrium when more than one ecological niche is available. Amer. Natur. 87: 311-313.
Levene, H. 1963. Inbred genetic loads and the determination of population structure. Proc. Nat. Acad. Sci. 50: 587-592.
Levene, H. 1967. Genetic diversity and diversity of environment: Mathematical aspects. Proc. Fifth Berkeley Symp. Math. Stat. Prob. 4: 305-316.
Levings, C. S. 1964. Genetic relationships among autotetraploid relatives. J. Hered. 55: 262-266.
Levins, R. 1962. Theory of fitness in a heterogeneous environment. I. The fitness set and adaptive function. Amer. Natur. 96: 361-373.
Levins, R. 1963. Theory of fitness in a heterogeneous environment. II. Developmental flexibility and niche selection. Amer. Natur. 97: 75-90.
Levins, R. 1964. The theory of fitness in a heterogeneous environment. IV. The adaptive significance of gene flow. Evolution 18: 635-638.
Levins, R. 1965. Theory of fitness in a heterogeneous environment. III. The response to selection. J. Theor. Biol. 7: 224-240.
Levins, R. 1965a. Theory of fitness in a heterogeneous environment. V. Optimal genetic systems. Genetics 52: 891-904.
Levins, R. 1965b. Genetic consequences of natural selection. Theoretical and Mathematical Biology. Ed. by T. H. Waterman and H. J. Morowitz. Blaisdell, Waltham, Mass. Pp. 388-397.
Levins, R. 1966. The strategy of model building in population biology. Amer. Sci. 54: 421-431.
Levins, R. 1967. Theory of fitness in a heterogeneous environment. VI. The adaptive significance of mutation. Genetics 56: 163-178.
Levins, R. 1968. Evolution in Changing Environments. Princeton University Press, Princeton, N.J.
Lewis, E. G. 1942. On the generation and growth of a population. Sankhya 6: 93-96.
Lewontin, R. C. 1953. The effect of compensation on populations subject to natural selection. Amer. Natur. 87: 375-381.
Lewontin, R. C. 1957. The adaptation of populations to varying environments. Cold Spring Harbor Symp. Quant. Biol. 22: 395-408.
Lewontin, R. C. 1958. A general method for investigating the equilibrium of gene frequency in a population. Genetics 43: 421-433.
Lewontin, R. C. 1961. Evolution and the theory of games. J. Theor. Biol. 1: 382-403.
Lewontin, R. C. 1962. Interdeme selection controlling a polymorphism in the house mouse. Amer. Natur. 96: 65-78.
Lewontin, R. C. 1964. The role of linkage in natural selection. Genetics Today. Pergamon Press, New York. Pp. 517-525.
Lewontin, R. C. 1964a. The interaction of selection and linkage. I. General considerations of heterotic models. Genetics 49: 49-67.
Lewontin, R. C. 1964b. The interaction of selection and linkage. II. Optimum models. Genetics 50: 757-782.
Lewontin, R. C. 1965. Selection for colonizing ability. The Genetics of Colonizing Species. Ed. by H. G. Baker and G. L. Stebbins. Academic Press, New York. Pp. 77-94.
Lewontin, R. C. 1965a. Selection in and of populations. Ideas in Modern Biology. Proc. XVI Intern. Cong. Zool.
Lewontin, R. C. 1967. The genetics of complex systems. Proc. Fifth Berkeley Symp. Math. Stat. Prob. 4: 439-456.
Lewontin, R. C. 1967a. Population genetics. Ann. Rev. Genet. 1: 37-70.
Lewontin, R. C. 1968. The effect of differential viability on the population dynamics of t alleles in the house mouse. Evolution 22: 262-273.
Lewontin, R. C., and C. C. Cockerham. 1959. The goodness-of-fit test for detecting selection in random mating populations. Evolution 13: 561-564.
Lewontin, R. C., and L. C. Dunn. 1960. The evolutionary dynamics of a polymorphism in the house mouse. Genetics 45: 705-722.
Lewontin, R. C., and J. L. Hubby. 1966. A molecular approach to the study of genic heterozygosity in natural populations. II. Amount of variation and degree of heterozygosity in natural populations of Drosophila pseudoobscura. Genetics 54: 595-609.
Lewontin, R. C., and P. Hull. 1967. The interaction of selection and linkage. III. Synergistic effect of blocks of genes. Der Züchter 37: 93-98.
Lewontin, R. C., and K. Kojima. 1960. The evolutionary dynamics of complex polymorphisms. Evolution 14: 458-472.
Lewontin, R. C., and M. J. D. White. 1960. Interaction between inversion polymorphisms of two chromosome pairs in the grasshopper, Moraba scurra. Evolution 14: 116-129.
Li, C. C. 1953. Some general properties of recessive inheritance. Amer. J. Hum. Genet. 5: 269-279.
Li, C. C. 1953a. Is Rh facing a crossroad? A critique of the compensation effect. Amer. Natur. 87: 257-261.
Li, C. C. 1953b. A direct proof of the relation between genotypic mating correlation and the gametic uniting correlation in equilibrium populations. J. Hered. 44: 39-40.
Li, C. C. 1955. The stability of an equilibrium and the average fitness of a population. Amer. Natur. 89: 281-296.
Li, C. C. 1955a. Population Genetics. Univ. Chicago Press, Chicago, Ill.
Li, C. C. 1957. Repeated linear regression and variance components of a population with binomial frequencies. Biometrics 13: 225-233.
Li, C. C. 1957a. The genetic variance of autotetraploids with two alleles. Genetics 42: 583-592.
Li, C. C. 1959. Notes on relative fitness of genotypes that form a geometric progression. Evolution 13: 564-567.
Li, C. C. 1961. Human Genetics: Principles and Methods. McGraw-Hill, New York.
Li, C. C. 1962. On "reflexive selection." Science 136: 1055-1056.
Li, C. C. 1963. Decrease of population fitness upon inbreeding. Proc. Nat. Acad. Sci. 49: 439-445.
Li, C. C. 1963a. Genetic aspects of consanguinity. Amer. J. Med. 34: 702-714.
Li, C. C. 1963b. The way the load ratio works. Amer. J. Human Gen. 15: 316-321.
Li, C. C. 1963c. Equilibrium under differential selection in the sexes. Evolution 17: 493-496.
Li, C. C. 1967. The maximization of average fitness by natural selection for a sex-linked locus. Proc. Nat. Acad. Sci. 57: 1260-1261.
Li, C. C. 1967a. Fundamental theorem of natural selection. Nature 214: 505-506.
Li, C. C. 1967b. Genetic equilibrium under selection. Biometrics 23: 397-484.
Li, C. C., and D. G. Horvitz. 1953. Some methods of estimating the inbreeding coefficient. Amer. J. Human Genet. 5: 107-117.
Li, C. C., and L. Sacks. 1954. The derivation of joint distribution and correlation between relatives by the use of stochastic matrices. Biometrics 10: 347-360.
Lillestøl, J. 1968. Another approach to some Markov chain models in population genetics. J. Appl. Prob. 5: 9-20.
Lloyd, M. 1964. Weighting individuals by reproductive value in calculating species diversity. Amer. Natur. 98: 190-192.
Lloyd, M., and R. J. Ghelardi. 1964. A table for calculating the "equitability" component of species diversity. J. Anim. Ecol. 33: 217-225.
Lotka, A. J. 1922. The stability of the normal age distribution. Proc. Nat. Acad. Sci. 8: 339-345.
Lotka, A. J. 1925. Elements of Physical Biology. Williams and Wilkins, Baltimore.
Lotka, A. J. 1931. The extinction of families. J. Wash. Acad. Sci. 21: 377.
Lotka, A. J. 1945. Population analysis as a chapter in the mathematical theory of evolution. Essays in Growth and Form. Ed. by W. E. Le Gros Clark and P. B. Medawar. Oxford, England. Pp. 355-385.
Lotka, A. J. 1956. Elements of Mathematical Biology. Dover Publications, New York. (Revised edition of Elements of Physical Biology, 1925.)
Ludwig, W., and H. V. Schelling. 1948. Der Inzuchtgrad in endlichen panmiktischen Populationen. Zool. Zent. 67: 268-275.
Lush, J. L. 1940. Intrasire correlations or regressions of offspring on dam as a method of estimating heritability of characteristics. Proc. Amer. Soc. Animal Prod. 1940: 293-301.
Lush, J. L. 1945. Animal Breeding Plans. 3rd Ed. Iowa State College Press, Ames, Iowa.
Lush, J. L. 1946. Chance as a cause of changes in gene frequency within pure breeds of livestock. Amer. Natur. 80: 318-342.
Lush, J. L. 1947. Family merit and individual merit as bases for selection. Amer. Natur. 81: 241-261, 362-379.
MacArthur, R. H. 1955. Fluctuations of animal populations, and a measure of community stability. Ecology 36: 533-536.
MacArthur, R. H. 1957. On the relative abundance of bird species. Proc. Nat. Acad. Sci. 43: 293-295.
MacArthur, R. H. 1958. A note on stationary age distributions in single species populations and stationary species populations in a community. Ecology 39: 146-147.
MacArthur, R. H. 1960. On the relative abundance of species. Amer. Natur. 94: 25-36.
MacArthur, R. H. 1960a. On the relation between reproductive value and optimal predation. Proc. Nat. Acad. Sci. 46: 143-145.
MacArthur, R. H. 1961. Population effects of natural selection. Amer. Natur. 95: 195-199.
MacArthur, R. H. 1962. Some generalized theorems of natural selection. Proc. Nat. Acad. Sci. 48: 1893-1897.
MacArthur, R. H. 1964. Environmental factors affecting bird species diversity. Amer. Natur. 98: 387-397.
MacArthur, R. H. 1965. Patterns of species diversity. Biol. Rev. 40: 510-533.
MacArthur, R. H. 1965a. Ecological consequences of natural selection. Theoretical and Mathematical Biology. Ed. by T. H. Waterman and H. J. Morowitz. Blaisdell, Waltham, Mass. Pp. 388-397.
MacArthur, R., and R. Levins. 1964. Competition, habitat selection, and character displacement in a patchy environment. Proc. Nat. Acad. Sci. 51: 1207-1210.
MacArthur, R. H., and E. O. Wilson. 1963. An equilibrium theory of insular zoogeography. Evolution 17: 373-387.
MacArthur, R. H., and E. O. Wilson. 1967. The Theory of Island Biogeography. Princeton University Press, Princeton, N.J.
McBride, G., and A. Robertson. 1963. Selection using assortative mating in D. melanogaster. Genet. Res. 4: 356-369.
McPhee, H. C., and S. Wright. 1925. Mendelian analysis of the pure breeds of livestock. III. The shorthorns. J. Hered. 16: 205-215.
McPhee, H. C., and S. Wright. 1926. Mendelian analysis of the pure breeds of livestock. IV. The British dairy shorthorns. J. Hered. 17: 397-401.
Malécot, G. M. 1944. Sur un problème de probabilités en chaîne que pose la génétique. Compt. Rend. de l'Acad. des Sci. 219: 379-381.
Malécot, G. 1948. Les mathématiques de l'hérédité. Masson et Cie, Paris.
Malécot, G. 1955. Decrease of relationship with distance. Cold Spring Harbor Symp. Quant. Biol. 20: 52-53.
Malécot, G. 1959. Les modèles stochastiques en génétique de population. Pub. Inst. Stat. Univ. de Paris 8(3): 173-210.
Malécot, G. 1966. Probabilités et Hérédité. Presses Universitaires de France.
Malécot, G. 1967. Identical loci and relationship. Proc. Fifth Berkeley Symp. Math. Stat. Prob. 4: 317-332.
Mandel, S. P. H. 1959. The stability of a multiple allelic system. Heredity 13: 289-302.
Mandel, S. P. H. 1959a. Stable equilibrium at a sex-linked locus. Nature 183: 1347-1348.
Mandel, S. P. H., and J. M. Hughes. 1958. Change in mean viability at a multi-allelic locus in a population under random mating. Nature 182: 63-64.
Margalef, R. 1957. Information theory in ecology. General Systems 3: 36-71.
Margalef, R. 1963. On certain unifying principles in ecology. Amer. Natur. 97: 357-374.
Martin, F. G., and C. C. Cockerham. 1960. High speed selection studies. Biometrical Genetics. Ed. by O. Kempthorne. Pergamon Press, London. Pp. 35-45.
Maruyama, T. 1969. Genetic correlation in the stepping stone model with non-symmetrical migration rates. Jour. Appl. Prob. 6: 463-477.
Maruyama, T. 1970. On the fixation probability of mutant genes in a subdivided population. Genet. Res. (in press).
Maruyama, T. 1970a. Rate of decrease of genetic variability in a subdivided population. Biometrika (in press).
Mather, K. 1941. Variation and selection of polygenic characters. J. Genet. 41: 159-193.
Mather, K. 1942. The balance of polygenic combinations. J. Genet. 43: 309-336.
Mather, K. 1943. Polygenic inheritance and natural selection. Biol. Rev. 18: 32-64.
Mather, K. 1946. Dominance and heterosis. Amer. Natur. 80: 91-96.
Mather, K. 1949. Biometrical Genetics. Dover Pub., New York.
Mather, K. 1953. The genetical structure of populations. Symp. Soc. Exp. Biol. 7: 66-95.
Mather, K. 1955. Response to selection. Cold Spring Harbor Symp. Quant. Biol. 20: 158-165.
Mather, K. 1955a. Polymorphism as an outcome of disruptive selection. Evolution 9: 52-61.
Mather, K. 1963. Genetical demography. Proc. Roy. Soc. B 159: 106-125.
Mather, K. 1966. Variability and selection. Proc. Roy. Soc. B. 164: 328-340.
Mather, K. 1967. Complementary and duplicate gene interactions in biometrical genetics. Heredity 22: 97-103.
Matsunaga, E. 1966. Possible genetic consequences of family planning. J. Amer. Med. Assn. 198: 533-540.
Matzinger, D. F., and O. Kempthorne. 1956. The modified diallel table with partial inbreeding and interactions with environment. Genetics 41: 822-833.
Maynard Smith, J. 1958. The Theory of Evolution. Penguin Books, London.
Maynard Smith, J. 1962. Disruptive selection, polymorphism, and sympatric speciation. Nature 195: 60-62.
Maynard Smith, J. 1964. Kin selection and group selection. Nature 201: 1145-1147.
Maynard Smith, J. 1965. The evolution of alarm calls. Amer. Natur. 99: 59-63.
Maynard Smith, J. 1966. Sympatric speciation. Amer. Natur. 100: 637-650.
Maynard Smith, J. 1968. Evolution in sexual and asexual populations. Amer. Natur. 102: 469-473.
Maynard Smith, J. 1968a. "Haldane's dilemma" and the rate of evolution. Nature 219: 1114-1116.
Maynard Smith, J. 1968b. Mathematical Ideas in Biology. Cambridge Univ. Press, Cambridge, England.
Mayo, O. 1966. On the problem of self-incompatibility alleles. Biometrics 22: 111-120.
Mayr, E. 1947. Ecological factors in speciation. Evolution 1: 263-287.
Mayr, E. 1954. Change of genetic environment and evolution. Evolution as a Process. Ed. by J. Huxley, A. C. Hardy, and E. B. Ford. Allen and Unwin, London. Pp. 157-180.
Mayr, E. 1956. Geographical gradients and climatic adaptation. Evolution 10: 105-108.
Mayr, E. 1963. Animal Species and Evolution. Harvard Univ. Press, Cambridge, Mass.
Mérat, P. 1967. Les gènes influant sur la variance d'un caractère quantitatif et leurs répercussions possibles sur la sélection. Ann. Génétique 10: 212-230.
Mettler, L. E., and T. G. Gregg. 1969. Population Genetics and Evolution. Prentice-Hall, Englewood Cliffs, N.J.
Milkman, R. D. 1967. Heterosis as a major cause of heterozygosity in nature. Genetics 55: 493-495.
Miller, G. F. 1962. The evaluation of eigenvalues of a differential equation arising in a problem in genetics. Proc. Camb. Phil. Soc. 58: 588-593.
Mode, C. J. 1958. A mathematical model for the co-evolution of obligate parasites and their hosts. Evolution 12: 158-165.
Mode, C. J. 1960. A model of a host-pathogen system with particular reference to the rusts of cereals. Biometrical Genetics. Ed. by O. Kempthorne. Pergamon Press, New York. Pp. 84-96.
Mode, C. J. 1961. A generalized model of a host-pathogen system. Biometrics 17: 386-404.
Mode, C. J. 1962. Some multi-dimensional birth and death processes and their applications in population genetics. Biometrics 18: 543-567; 19: 667.
Mode, C. J. 1966. Some multi-dimensional branching processes as motivated by a class of problems in mathematical genetics. Bull. Math. Biophys. 28: 25-50; 28: 181-190.
Mode, C. J. 1966a. A stochastic calculus and its application to some fundamental theorems of natural selection. J. Appl. Prob. 3: 327-352.
Mode, C. J. 1967. On the probability a line becomes extinct before a favorable mutation appears. Bull. Math. Biophys. 29: 343-348.
Mode, C. J. 1968. A multidimensional age-dependent branching process with applications to natural selection I. Math. Bioscience 3: 1-18.
Mode, C. J. 1968a. A multidimensional age-dependent branching process with applications to natural selection II. Math. Bioscience 3: 231-247.
Mode, C. J., and H. F. Robinson. 1959. Pleiotropism and the genetic variance and covariance. Biometrics 15: 518-537.
Moment, G. B. 1962. Reflexive selection: A possible answer to an old puzzle. Science 136: 262-263.
Moment, G. B. 1962a. On "reflexive selection." Science 136: 1056.
Moody, P. A. 1947. A simple model of "drift" in small populations. Evolution 3: 217-218.
Moran, P. A. P. 1958. Random processes in genetics. Proc. Camb. Phil. Soc. 54: 60-71.
Moran, P. A. P. 1958a. The effect of selection in a haploid genetic population. Proc. Camb. Phil. Soc. 54: 463-467.
Moran, P. A. P. 1958b. The distribution of gene frequency in a bisexual diploid population. Proc. Camb. Phil. Soc. 54: 468-474.
Moran, P. A. P. 1958c. A general theory of the distribution of gene frequencies. I. Overlapping generations. II. Non-overlapping generations. Proc. Roy. Soc. B 149: 102-112, 113-116.
Moran, P. A. P. 1958d. The rate of approach to homozygosity. Ann. Hum. Genet. 23: 1-5.
Moran, P. A. P. 1959. The theory of some genetical effects of population subdivision. Aust. J. Biol. Sci. 12: 109-116.
Moran, P. A. P. 1959a. The survival of a mutant gene under selection. J. Aust. Math. Soc. 1: 121-126.
Moran, P. A. P. 1960. The survival of a mutant gene under selection. II. J. Aust. Math. Soc. 1: 485-491.
Moran, P. A. P. 1961. The survival of a mutant under general conditions. Proc. Camb. Phil. Soc. 57: 304-314.
Moran, P. A. P. 1962. The Statistical Processes of Evolutionary Theory. The Clarendon Press, Oxford.
Moran, P. A. P. 1963. On the measurement of natural selection dependent on several loci. Evolution 17: 182-186.
Moran, P. A. P. 1963a. Some general results on random walks, with genetic applications. J. Aust. Math. Soc. 3: 468-479.
Moran, P. A. P. 1963b. Balanced polymorphisms with unlinked loci. Aust. J. Biol. Sci. 16: 1-5.
Moran, P. A. P. 1964. On the nonexistence of adaptive topographies. Ann. Human Genet. 27: 383-393.
Moran, P. A. P. 1967. Unsolved problems in evolutionary biology. Proc. Fifth Berkeley Symp. Math. Stat. Prob. 4: 457-480.
Moran, P. A. P., and C. A. B. Smith. 1966. Commentary on R. Fisher's Paper on the Correlation Between Relatives on the Supposition of Mendelian Inheritance. Cambridge Univ. Press, Cambridge, England.
Moran, P. A. P., and G. A. Watterson. 1958. The genetic effects of family structure in natural populations. Aust. J. Biol. Sci. 12: 1-15.
Morishima, H. 1969. Phenetic similarity and phylogenetic relationships among strains of Oryza perennis, estimated by method of numerical taxonomy. Evolution 17: 170-181.
Morris, R. F. 1959. Single-factor analysis in population dynamics. Ecology 40: 580-588.
Morse, P. M., and H. Feshbach. 1953. Methods of Theoretical Physics. Part I and II. McGraw-Hill, New York.
Morton, N. E. 1955. Non-randomness in consanguineous marriage. Ann. Hum. Genet. 20: 116-124.
Morton, N. E. 1960. The mutational load due to detrimental genes in man. Amer. J. Hum. Genet. 12: 348-364.
Morton, N. E. 1965. Models and evidence in human population genetics. Proc. XI Int. Cong. Genet. Pp. 935-950.
Morton, N. E. 1969. Human population structure. Ann. Rev. Genet. 3: 53-74.
Morton, N. E., Ed. 1969a. Computer Applications in Genetics. Univ. Hawaii Press, Honolulu.
Morton, N. E., C. S. Chung, and M. P. Li. 1967. Genetics of Interracial Crosses in Hawaii. S. Karger, New York.
Morton, N. E., J. F. Crow, and H. J. Muller. 1956. An estimate of the mutational damage in man from data on consanguineous marriages. Proc. Nat. Acad. Sci. 42: 855-863.
Morton, N. E., and S. Wright. 1968. Genetic studies of cystic fibrosis in Hawaii. Amer. J. Hum. Genet. 20: 157-169.
Morton, N. E., and N. Yasuda. 1962. The genetical structure of human populations. Entretien de Monaco en Sciences Humaines: Les Déplacements Humains. Ed. by J. Sutter. Pp. 185-202.
Morton, N. E., N. Yasuda, C. Miki, and S. Yee. 1968. Bioassay of population structure under isolation by distance. Amer. J. Human Genet. 20: 411-419.
Moser, H. 1958. The dynamics of bacterial populations maintained in the chemostat. Carnegie Inst. Pub. 614.
Mukai, T. 1964. The genetic structure of natural populations of Drosophila melanogaster. I. Spontaneous mutation rate of polygenes controlling viability. Genetics 50: 1-19.
Mukai, T. 1969. The genetic structure of natural populations of Drosophila melanogaster. VII. Synergistic interaction of spontaneous mutant polygenes controlling viability. Genetics 61: 749-761.
Mukai, T., and A. B. Burdick. 1959. Single gene heterosis associated with a second chromosome recessive lethal in Drosophila melanogaster. Genetics 44: 211-232.
Mulholland, H. P., and C. A. B. Smith. 1959. An inequality arising in genetical theory. Amer. Math. Mon. 66: 673-683.
Muller, H. J. 1914. The bearing of the selection experiments of Castle and Phillips on the variability of genes. Amer. Nat. 48: 567-576.
Muller, H. J. 1925. Why polyploidy is rarer in animals than in plants. Amer. Nat. 59: 346-353.
Muller, H. J. 1929. The gene as the basis of life. Proc. Int. Congr. Plant. Sci. 1: 897-921.
Muller, H. J. 1932. Some genetic aspects of sex. Amer. Natur. 68: 118-138.
Muller, H. J. 1936. On the variability of mixed races. Amer. Nat. 70: 409-442.
Muller, H. J. 1939. Reversibility in evolution considered from the standpoint of genetics. Biol. Rev. 14: 261-280.
Muller, H. J. 1942. Isolating mechanisms, evolution, and temperature. Biol. Symp. 6: 71-125.
Muller, H. J. 1949. The Darwinian and modern conceptions of natural selection. Proc. Amer. Philos. Soc. 93: 459-470.
Muller, H. J. 1950. Evidence of the precision of genetic adaptation. The Harvey Lectures 18: 165-229.
Muller, H. J. 1950a. Our load of mutations. Amer. J. Human Genet. 2: 111-176.
Muller, H. J. 1958. Evolution by mutation. Bull. Amer. Math. Soc. 64: 137-160.
Muller, H. J. 1964. The relation of recombination to mutational advance. Mutation Res. 1: 2-9.
Muller, H. J. 1967. What genetic course will man steer? Proc. Third Int. Cong. Hum. Genet. Pp. 521-543.
Murray, M. 1964. Multiple mating and effective population size in Cepaea nemoralis. Evolution 18: 283-291.
Nair, K. R. 1954. The fitting of growth curves. Statistics and Mathematics in Biology. Ed. by O. Kempthorne, T. A. Bancroft, J. W. Gowen, and J. L. Lush. Iowa State College, Ames, Iowa. Pp. 119-132.
Narain, P. 1963. On mathematical representation of gene action and interaction. J. Indian Soc. Agri. Stat. 15: 270.
Narain, P. 1965. The description of gene action and interaction with multiple alleles in continuous variation. Genetics 52: 43-53.
Narain, P. 1965a. Homozygosity in a selfed population with an arbitrary number of linked loci. J. Genet. 59: 1-13.
Narain, P. 1966. Effect of linkage on homozygosity of a population under mixed selfing and random mating. Genetics 54: 303-314.
Narain, P. 1969. A note on the diffusion approximation for the variance of the number of generations until fixation of a neutral mutant gene. Submitted to Genet. Res.
Nassar, R. F. 1965. Effect of correlated gene distribution due to sampling on diallel analysis. Genetics 52: 9-20.
Nassar, R. F. 1969. Distribution of gene frequencies under the case of random genetic drift with and without selection. Theoret. App. Genetics 39: 145-149.
Naylor, A. F. 1962. Mating systems which could increase heterozygosity for a pair of alleles. Amer. Natur. 96: 51-60.
Naylor, A. F. 1963. A theorem on possible kinds of mating systems which tend to increase heterozygosity. Evolution 17: 369-370.
Naylor, A. F. 1964. Natural selection through maternal influence. Heredity 19: 509-511.
Neal, N. P. 1935. The decrease in yielding capacity in advanced generations of hybrid corn. J. Amer. Soc. Agron. 27: 666-670.
Neel, J. V., and W. J. Schull. 1954. Human Heredity. The Univ. of Chicago Press, Chicago.
Nei, M. 1964. Effects of linkage and epistasis on the equilibrium frequencies of lethal genes. I. Linkage equilibrium. Jap. J. Genet. 39: 1-6.
Nei, M. 1964a. Effects of linkage and epistasis on the equilibrium frequencies of lethal genes. II. Numerical solutions. Jap. J. Genet. 39: 7-25.
Nei, M. 1965. Effect of linkage on the genetic load manifested under inbreeding. Genetics 51: 679-688.
Nei, M. 1965a. Variation and covariation of gene frequencies in subdivided populations. Evolution 19: 256-258.
Nei, M. 1967. Modification of linkage intensity by natural selection. Genetics 57: 625-641.
Nei, M. 1968. Evolutionary change of linkage intensity. Nature 218: 1160-1161.
Nei, M. 1968a. The frequency distribution of lethal chromosomes in finite populations. Proc. Nat. Acad. Sci. 60: 517-524.
Nei, M., and Y. Imaizumi. 1966. Genetic structure of human populations. I. Local differentiation of blood group gene frequencies in Japan. Heredity 21: 9-35.
Nei, M., and Y. Imaizumi. 1966a. Genetic structure of human populations. II. Differentiation of blood group gene frequencies among isolated populations. Heredity 21: 183-190, 344.
Nei, M., and Y. Imaizumi. 1966b. Effects of restricted population size and increase in mutation rate on the genetic variation of quantitative characters. Genetics 54: 763-782.
Nei, M., K. I. Kojima, and H. E. Schaffer. 1967. Frequency changes of new inversions in populations under mutation-selection equilibria. Genetics 57: 741-750.
Nei, M., and M. Murata. 1966. Effective population size when fertility is inherited. Genet. Res. 8: 257-260.
Nelder, J. A. 1952. Some genotypic frequencies and variance components occurring in biometrical genetics. Heredity 6: 387-394.
Nelder, J. A. 1953. Statistical methods in biometrical genetics. Heredity 7: 111-119.
Nicholson, A. J. 1933. The balance of animal populations. J. Anim. Ecol. 2(suppl.): 132-178.
Nicholson, A. J. 1950. Population oscillation caused by competition for food. Nature 165: 476-477.
Nicholson, A. J. 1954. An outline of the dynamics of animal population. Aust. J. Zool. 2: 9-65.
Nicholson, A. J. 1957. The self-adjustment of populations to change. Cold Spring Harbor Symp. Quant. Biol. 22: 153-173.
Nicholson, A. J., and V. A. Bailey. 1935. The balance of animal populations. Part I. Proc. Zool. Soc. London 551-598.
Nikoro, Z. S. 1964. Alteration of population structure as a result of selection in the case of overdominance (in Russian). Bull. Moscow Soc. Nat. Bio. Ser. 49: 5-21.
Norton, H. T. J. 1928. Natural selection and Mendelian variation. Proc. Lond. Math. Soc. 28: 1-45.
Novick, A., and L. Szilard. 1950. Experiments with the chemostat on spontaneous mutations in bacteria. Proc. Nat. Acad. Sci. 36: 708-719.
O'Donald, P. 1960. Inbreeding as a result of imprinting. Heredity 15: 79-85.
O'Donald, P. 1960a. Assortive mating in a population in which two alleles are segregating. Heredity 15: 389-396.
O'Donald, P. 1962. The theory of sexual selection. Heredity 17: 541-552.
O'Donald, P. 1963. Sexual selection and territorial behavior. Heredity 18: 361-364.
O'Donald, P. 1963a. Sexual selection for dominant and recessive genes. Heredity 18: 451-457.
O'Donald, P. 1967. A general model of sexual and natural selection. Heredity 22: 499-518.
O'Donald, P. 1968. Measuring the intensity of natural selection. Nature 220: 197-198.
O'Donald, P. 1968a. The evolution of dominance by selection for an optimum. Genetics 58: 451-460.
O'Donald, P. 1968b. Models of the evolution of dominance. Proc. Roy. Soc. B. 171: 127-143.
O'Donald, P. 1969. "Haldane's dilemma" and the rate of natural selection. Nature 221: 815-816.
O'Donald, P. 1969a. The selective coefficients that keep modifying genes in a population. Genetics 62: 435-444.
Ohta, T. 1967. Probability of fixation of mutant genes and the theory of limits in artificial selection. Jap. J. Genet. 42: 353-360.
Ohta, T. 1968. Effect of initial linkage disequilibrium and epistasis on fixation probability in a small population, with two segregating loci. Theor. Appl. Genet. 38: 243-248.
Ohta, T., and M. Kimura. 1969. Linkage disequilibrium due to random genetic drift. Genet. Res. 13: 47-55.
Ohta, T., and M. Kimura. 1969a. Linkage disequilibrium at steady state determined by random genetic drift and recurrent mutation. Genetics 63: 229-238.
Ohta, T., and K. Kojima. 1968. Survival probabilities of new inversions in large populations. Biometrics 24: 501-516.
Opsahl, B. 1956. The discrimination of interactions and linkage in continuous variation. Biometrics 12: 415-432.
Orians, G. H. 1962. Natural selection and ecological theory. Amer. Natur. 96: 257-263.
Orias, E., and F. J. Rohlf. 1964. Population genetics of the mating type locus in Tetrahymena pyriformis, variety 8. Evolution 18: 620-629.
Osborne, R., and W. S. B. Paterson. 1952. On the sampling variance of heritability estimates derived from variance analysis. Proc. Roy. Soc. Edinb. B. 64: 456-461.
Osborne, R. 1957. Correction for regression on a secondary trait as a method of increasing the efficiency of selective breeding. Aust. J. Biol. Sci. 10: 365-366.
Owen, A. R. G. 1952. A genetical system admitting of two stable equilibria. Nature 170: 1127.
Owen, A. R. G. 1953. A genetical system admitting of two distinct stable equilibria under natural selection. Heredity 7: 97-102.
Owen, A. R. G. 1954. Balanced polymorphism of a multiple allelic series. Caryologia 6 (suppl.): 1240-1241.
Owen, A. R. G. 1959. Mathematical models for selection. Symp. Soc. Study Hum. Biol. 2: 11-16.
Page, A. R., and B. I. Hayman. 1960. Mixed sib and random mating when homozygotes are at a disadvantage. Heredity 14: 187-196.
Parsons, P. A. 1957. Selfing under conditions favouring heterozygosity. Heredity 11: 411-421.
Parsons, P. A. 1959. Equilibria in auto-tetraploids under natural selection for a simplified model of viabilities. Biometrics 15: 20-29.
Parsons, P. A. 1961. The initial progress of new genes with viability differences between sexes and with sex linkage. Heredity 16: 103-107.
Parsons, P. A. 1962. The initial increase of a new gene under positive assortative mating. Heredity 17: 267-276.
Parsons, P. A. 1963. Complex polymorphisms where the coupling and repulsion double heterozygote viabilities differ. Heredity 18: 369-374.
Parsons, P. A. 1963a. Polymorphism and the balanced polygenic combination. Evolution 17: 564-574.
Parsons, P. A. 1963b. Migration as a factor in natural selection. Genetica 33: 184-206.
Parsons, P. A. 1964. Polymorphism and the balanced polygenic complex - a comment. Evolution 18: 512.
Parsons, P. A. 1964a. Complex polymorphisms with recombination differing between sexes. Aust. J. Biol. Sci. 17: 317-322.
Parsons, P. A. 1964b. Interactions within and between chromosomes. J. Theor. Biol. 6: 208-216.
Parsons, P. A., and W. F. Bodmer. 1961. The evolution of overdominance: Natural selection and heterozygote advantage. Nature 190: 7-12.
Patau, K. 1938. Die mathematische Analyse der Evolutionsvorgänge. Zeits. f. Abst. u. Vererb. 76: 220-228.
Patlak, C. S. 1953. The effect of the previous generations on the distribution of gene frequencies in populations. Proc. Nat. Acad. Sci. 39: 1063-1068.
Patten, B. C. 1959. An introduction to the cybernetics of the ecosystem: The trophic dynamic aspect. Ecology 40: 221-231.
Pearl, R. 1913. A contribution towards an analysis of the problem of inbreeding. Amer. Natur. 47: 577-614.
Pearl, R. 1914. On the results of inbreeding a Mendelian population; a correction and extension of previous conclusions. Amer. Natur. 48: 57-62.
Pearl, R. 1914a. On a general formula for the constitution of the nth generation of a Mendelian population in which all matings are of brother × sister. Amer. Natur. 48: 491-494.
Pearl, R. 1940. Medical Biometry and Statistics. W. B. Saunders, Philadelphia.
Pearson, E. S., and H. O. Hartley. 1958. Biometrika Tables for Statisticians. Vol. 1. The University Press, Cambridge.
Pearson, K. 1904. On a generalized theory of alternative inheritance, with special references to Mendel's laws. Phil. Trans. Roy. Soc. A 203: 53-86.
Pearson, K. 1909. The theory of ancestral contributions in heredity. Proc. Roy. Soc. B. 81: 219-224.
Pearson, K. 1909a. On the ancestral gametic correlations of a Mendelian population mating at random. Proc. Roy. Soc. B. 81: 225-229.
Pearson, K., and A. Lee. 1903. On the laws of inheritance in man. I. Inheritance of physical characters. Biometrika 2: 357-462.
Pederson, D. G. 1966. The expected degree of heterozygosity in a double-cross hybrid population. Genetics 53: 669-674.
Pederson, D. G. 1969. The prediction of selection response in a self-fertilizing species. Aust. J. Biol. Sci. 22: 117-129.
Penrose, L. S. 1949. The meaning of "fitness" in human populations. Ann. Eugen. 14: 301-304.
Penrose, L. S. 1964. Some formal consequences of genes in stable equilibrium. Ann. Hum. Genet. 28: 159-166.
Penrose, L. S., S. M. Smith, and D. A. Sprott. 1957. On the stability of allelic systems, with special reference to haemoglobins A, S, and C. Ann. Hum. Genet. 21: 90-93.
Pimentel, D. 1961. Animal population regulation by the genetic feed-back mechanism. Amer. Natur. 95: 65-79.
Pimentel, D., E. H. Feinberg, P. W. Wood, and J. T. Hayes. 1965. Selection, spatial distribution, and the coexistence of competing fly species. Amer. Natur. 99: 97-109.
Pirchner, F. 1969. Population Genetics and Animal Breeding. W. H. Freeman, San Francisco.
Planck, M. 1917. Über einen Satz der statistischen Dynamik und seine Erweiterung in der Quantentheorie. Sitz. der preuss. Akad. Pp. 324-341.
Plum, M. 1954. Computation of inbreeding and relationship coefficients. J. Hered. 45: 92-94.
Pollak, E. 1966. Some consequences of selection by culling when there is superiority of heterozygotes. Genetics 53: 977-988.
Pollak, E. 1966a. On the survival of a gene in a subdivided population. J. Appl. Prob. 3: 142-155.
Pollak, E. 1966b. Some effects of fluctuating offspring distributions on the survival of a gene. Biometrika 53: 391-396.
Pollak, E. 1968. On random genetic drift in a subdivided population. J. Appl. Prob. 5: 314-333.
Pollard, J. H. 1966. On the use of the direct matrix product in analysing certain stochastic population models. Biometrika 53: 397-415.
Pollard, J. H. 1968. The multi-type Galton-Watson process in a genetical context. Biometrics 24: 147-158.
Powers, L. 1944. An expansion of Jones's theory for the explanation of heterosis. Amer. Natur. 78: 275-280.
Preston, F. W. 1962. The canonical distribution of commonness and rarity: Part I. Ecology 43: 185-215. Part II. Ibid. 43: 410-432.
Prout, T. 1953. Some effects of the variations in segregation ratio and of selection on the frequency of alleles under random mating. Acta Genet. Stat. Med. 4: 148-151.
Prout, T. 1962. The effects of stabilizing selection on the time of development in Drosophila melanogaster. Genet. Res. 3: 364-382.
Prout, T. 1965. The estimation of fitness from genotypic frequencies. Evolution 19: 546-551.
Prout, T. 1968. Sufficient conditions for multiple niche polymorphism. Amer. Natur. 102: 493-496.
Purser, A. F. 1966. Increase in heterozygote frequency with differential fertility. Heredity 21: 322-327.
Quastler, H. 1959. Information theory of biological integration. Amer. Natur. 93: 245-254.
Qureshi, A. W., and O. Kempthorne. 1968. On the fixation of genes of large effects due to continued truncation selection in small populations of polygenic systems with linkage. Theor. & Appl. Genet. 38: 249-255.
Qureshi, A. W., O. Kempthorne, and L. N. Hazel. 1968. The role of finite size and linkage in response to continued truncation selection. 1. Additive gene action. 2. Dominance and overdominance. Theor. & Appl. Genet. 38: 256-276.
Race, R. R., and R. Sanger. 1962. Blood Groups in Man. Fourth Ed. F. A. Davis Co., Philadelphia.
Rajagoplan, M. 1958. Effect of linkage on the homozygosity of a selfed population. J. Indian Soc. Agri. Stat. 10: 64-66.
Rasmussen, D. I. 1964. Blood group polymorphism and inbreeding in natural populations of the deer mouse Peromyscus maniculatus. Evolution 18: 219-229.
Rasmuson, M. 1961. Genetics on the Population Level. Svenska Bokförlaget, Stockholm.
Rawlings, J. O., and C. C. Cockerham. 1962. Analysis of double cross hybrid populations. Biometrics 18: 229-244.
Reed, J., R. Toombs, and N. A. Barricelli. 1967. Simulation of biological evolution and machine learning. J. Theor. Biol. 17: 319-342.
Reed, T. E. 1959. The definition of relative fitness of individuals with specific genetic traits. Amer. J. Hum. Genet. 11: 137-155.
Reeve, E. C. R. 1955. Inbreeding with the homozygotes at a disadvantage. Ann. Hum. Genet. 19: 332-346.
Reeve, E. C. R. 1955a. The variance of the genetic correlation coefficient. Biometrics 11: 357-374.
Reeve, E. C. R. 1957. Inbreeding with selection and linkage. I. Selfing. Ann. Hum. Genet. 21: 277-288.
Reeve, E. C. R. 1961. A note on non-random mating in progeny tests. Genet. Res. 2: 195-203.
Reeve, E. C. R., and J. C. Gower. 1958. Inbreeding with selection and linkage. II. Sib-mating. Ann. Hum. Genet. 23: 36-49.
Rehfeld, C. E., J. W. Bacus, J. A. Pagels, and M. H. Dipert. 1967. Computer calculation of Wright's inbreeding coefficient. J. Hered. 58: 81-84.
Reiersøl, O. 1962. Genetic algebras studied recursively and by means of differential operators. Math. Scand. 10: 25-44.
Rendel, J. M. 1953. Heterosis. Amer. Natur. 87: 129-138.
Rendel, J. M. 1958. Optimum group size in half-sib family selection. Biometrics 15: 376-381.
Rendel, J. M. 1959. Evolution of dominance. The evolution of living organisms. Royal Soc. Victoria, Melbourne. Pp. 102-110.
Rhodes, E. C. 1940. Population mathematics. J. Roy. Stat. Soc. 103: 68-89, 218-245, 362-387.
Richardson, R. H., and K. I. Kojima. 1965. The kinds of genetic variability in relation to selection responses in Drosophila fecundity. Genetics 52: 583-598.
Richardson, W. H. 1964. Frequencies of genotypes of relatives as determined by stochastic matrices. Genetics 35: 323-354.
Robbins, R. B. 1917. Some applications of mathematics to breeding problems. Genetics 2: 489-504.
Robbins, R. B. 1918. Some applications of mathematics to breeding problems. II. Genetics 3: 73-92.
Robbins, R. B. 1918a. Some applications of mathematics to breeding problems. III. Genetics 3: 375-389.
Robbins, R. B. 1918b. Random mating with the exception of sister by brother mating. Genetics 3: 390-396.
Robertson, A. 1952. The effects of inbreeding on the variation due to recessive genes. Genetics 37: 189-207.
Robertson, A. 1953. A numerical description of breed structure. J. Agric. Sci. 43: 334-336.
Robertson, A. 1955. Prediction equations in quantitative genetics. Biometrics 11: 95-98.
Robertson, A. 1956. The effect of selection against extreme deviants based on deviation or on homozygosis. J. Genet. 54: 236-248.
Robertson, A. 1957. Optimum group size in progeny testing and family selection. Biometrics 13: 442-450.
Robertson, A. 1960. On optimum family size in selection programmes. Biometrics 16: 296-298.
Robertson, A. 1960a. A theory of limits in artificial selection. Proc. Roy. Soc. B 153: 234-249.
Robertson, A. 1961. Inbreeding in artificial selection programmes. Genet. Res. 2: 189-194.
Robertson, A. 1962. Selection for heterozygotes in small populations. Genetics 47: 1291-1300.
Robertson, A. 1964. The effect of non-random mating within inbred lines on the rate of inbreeding. Genet. Res. 5: 164-167.
Robertson, A. 1965. The interpretation of genotypic ratios in domestic animal populations. Animal Prod. 7: 319-324.
Robertson, A. 1967. Animal breeding. Ann. Review of Genet. 1: 295-312.
Robertson, A. 1967a. The nature of quantitative genetic variation. Heritage From Mendel. Ed. R. A. Brink. University of Wisconsin Press, Madison. Pp. 265-280.
Robertson, A. 1969. The theory of animal breeding. Proc. XII Intern. Cong. Genet. 3: 371-377.
Robertson, A., and I. M. Lerner. 1949. The heritability of all-or-none traits: viability of poultry. Genetics 34: 395-411.
Robinson, P., and D. F. Bray. 1965. Expected effects on the inbreeding coefficient and rate of gene loss of four methods of reproducing finite diploid populations. Biometrics 21: 447-458.
Rosado, J. M. C., and A. Robertson. 1966. The genetic control of sex ratio. J. Theor. Biol. 13: 324-329.
Rosenzweig, M. L., and R. H. MacArthur. 1963. Graphical representation and stability conditions of predator-prey interactions. Amer. Natur. 97: 209-223.
Ryan, F. J. 1953. Natural selection in bacterial populations. Atti del VI Cong. Int. Microbiol. 1: 1-9.
Sacks, J. M. 1967. A stable equilibrium with minimum average fitness. Genetics 56: 705-708.
Sager, R. 1966. Mendelian and non-Mendelian heredity; a reappraisal. Proc. Roy. Soc. B 164: 290-297.
Sakai, K. 1955. Competition in plants and its relation to selection. Cold Spring Harbor Symp. Quant. Biol. 20: 137-157.
Sandler, L., and E. Novitski. 1957. Meiotic drive as an evolutionary force. Amer. Natur. 91: 105-110.
Sanghvi, L. D. 1963. The concept of genetic load: a critique. Amer. J. Hum. Genet. 15: 298-309.
Schäfer, W. 1937. Über die Zunahme der Isozygotie (Gleicherbarkeit) bei fortgesetzer Bruder-Schwester-Inzucht. Zeit. ind. Abst. Vererb. 72: 50-78.
Scheuer, P. A. G., and S. P. H. Mandel. 1959. An inequality in population genetics. Heredity 13: 519-524.
Schmalhausen, I. I. 1949. Factors of Evolution. Blakiston, Philadelphia.
Schmalhausen, I. I. 1960. Evolution and cybernetics. Evolution 14: 509-524.
Schnell, F. W. 1961. Some general formulations of linkage effects in inbreeding. Genetics 46: 947-957.
Schnell, F. W. 1963. The covariances between relatives in the presence of linkage. Statistical Genetics and Plant Breeding. Ed. W. D. Hanson and H. F. Robinson. Humphrey, New York. Pp. 468-483.
Schnell, F. W. 1965. Die Covarianz zwischen Verwandten in einer gen-orthogonal Population. I. Allgemeiner Theorie. Biom. Zeit. 7:1-54.
Schull, W. J., and J. V. Neel. 1965. The Effects of Inbreeding on Japanese Children. Harper & Row, New York.
Scudo, F. M. 1964. Sex population genetics. La Ricerca Scient. 34:93-146.
Scudo, F. M. 1967. L'accoppiamento assortativo basato sul fenotipo di parenti; alcune consequenze in popolazioni. Atti. Ist. Lomb. B 101:435-455.
Scudo, F. M. 1967a. The adaptive value of sexual dimorphism. I. Anisogamy. Evolution 21:285-291.
Scudo, F. M. 1967b. Selection on both haplo and diplophase. Genetics 56:693-704.
Scudo, F. M. 1967c. Criteria for the analysis of multifactorial sex-determination. Ital. J. Zool. 1:1-21.
Scudo, F. M. 1968. On mixtures of inbreeding systems. Heredity 23:142-143.
Scudo, F. M. 1969. On the adaptive value of sexual dimorphism. II. Unisexuality. Evolution 23:36-49.
Scudo, F. M., and S. Karlin. 1969. Assortative mating based on phenotype. I. Two alleles with dominance. Genetics 63:479-498.
Searle, S. R. 1966. Matrix Algebra for the Biological Sciences. Wiley, New York.
Searle, S. R. 1961. Phenotypic, genetic, and environmental correlations. Biometrics 17:474-480.
Searle, S. R. 1965. The value of indirect selection: I. Mass selection. Biometrics 21:682-707.
Seiger, M. B. 1967. A computer simulation of the influence of imprinting on population structure. Amer. Natur. 101:47-57.
Sen, S. N. 1960. Complete selection with partial self-fertilization. J. Genet. 57:339-344.
Sen, S. N. 1964. Selection after crossing two homozygous stocks in a partially self-fertilized population. J. Genet. 59:69-76.
Sen, S. N. 1966. Selection in a mixed population subjected to apartheid. J. Genet. 59:250-253.
Seyffert, W. 1960. Theoretische Untersuchungen über die Zusammensetzung tetrasomer Population. I. Panmixie. Biom. Zeit. 2:1-44.
Seyffert, W. 1960a. Theoretische Untersuchungen über die Zusammensetzung tetrasomer Populationen. II. Selbstbefruchtung. Zeit. Vererb. 90:356-374.
Seyffert, W. 1966. Die Simulation quantitativer Merkmale durch Gene mit biochemisch definierbarer Wirkung. I. Ein einfaches Modell. Der Züchter 36:195-163.
Shaw, R. F. 1958. The theoretical genetics of the sex ratio. Genetics 43:149-163.
Shaw, R. F., and J. D. Mohler. 1953. The selective significance of the sex ratio. Amer. Natur. 87:337-342.
Sheppard, P. M. 1953. Polymorphism, linkage, and the blood groups. Amer. Natur. 87:283-294.
Sheppard, P. M. 1956. Ecology and its bearing on population genetics. Proc. Roy. Soc. B. 145:308-315.
Sheppard, P. M. 1958. Natural Selection and Heredity. Hutchinson and Co., London.
Sheppard, P. M., and E. B. Ford. 1966. Natural selection and the evolution of dominance. Heredity 21:139-147.
Shikata, M. 1963. Representation and calculation of selfed population by group ring. J. Theor. Biol. 5:142-160.
Shikata, M. 1964. Interference effect of crossovers in selfed populations. J. Theor. Biol. 7:181-223.
Shikata, M. 1965. A generalization of the inbreeding coefficient. Biometrics 21:665-681.
Shikata, M. 1966. Transformation of generalized inbreeding coefficient in components of pedigree. J. Theor. Biol. 10:11-14.
Shikata, M. 1966a. Calculations of inbreeding coefficient and gene-set probability in diploid populations. J. Theor. Biol. 10:15-27.
Shikata, M. 1966b. Cross-over effect in homozygosity by descent. J. Theor. Biol. 10:196-208.
Shikata, M. 1968. Recombination effect on consanguinity: Generalized inbreeding coefficient in first cousin mating with two linked loci. Jap. J. Human Genet. 13:19.
Shimbel, A. 1965. Information theory and genetics. Bull. Math. Biophys. 27:177-181.
Singh, M., and R. C. Lewontin. 1966. Stable equilibria under optimizing selection. Proc. Nat. Acad. Sci. 56:1345-1348.
Skellam, J. G. 1948. The probability distribution of gene-differences in relation to selection, and random extinction. Proc. Camb. Phil. Soc. 45:364-367.
Skellam, J. G. 1949. The probability distribution of gene differences in relation to selection, mutation, and random extinction. Proc. Camb. Phil. Soc. 45:364-367.
Skellam, J. G. 1951. Random dispersal in theoretical populations. Biometrika 38:196-218.
Skellam, J. G. 1951a. Gene dispersion in heterogeneous populations. Heredity 5:433-435.
Slatis, H. M. 1960. An analysis of inbreeding in the European bison. Genetics 45:275-287.
Slobodkin, L. B. 1953. An algebra of population growth. Ecology 34:513-519.
Slobodkin, L. B. 1958. Meta-models in theoretical ecology. Ecology 39:550-551.
Slobodkin, L. B. 1960. Ecological energy relationships at the population level. Amer. Natur. 94:213-236.
Slobodkin, L. B. 1961. Preliminary ideas for a predictive theory of ecology. Amer. Natur. 95:147-153.
Slobodkin, L. B. 1962. Growth and Regulation of Animal Populations. Holt, Rinehart and Winston, New York.
Slobodkin, L. B. 1964. The strategy of evolution. Amer. Sci. 52:342-357.
Smith, C. A. B. 1966. Biomathematics. Hafner, New York.
Smith, C. A. B. 1967. Notes on the gene frequency estimation with multiple alleles. Ann. Hum. Genet. 31:99-107.
Smith, C. A. B. 1969. Local fluctuations in gene frequencies. Ann. Hum. Genetics 32:251-260.
Snyder, L. H. 1947. The principles of gene distribution in human populations. Yale J. Biol. Med. 19:817-833.
Snyder, L. H., and C. W. Cotterman. 1936. Studies in human inheritance XIII. A table to determine the expected proportion of females showing a sex influenced character corresponding to any given proportion of males showing the character. Genetics 21:79-83.
Sperlich, D. 1966. Equilibria for inversions induced by X-rays in isogenic strains of Drosophila pseudoobscura. Genetics 53:835-842.
Spetner, L. M. 1964. Natural selection: an information-transmission mechanism for evolution. J. Theor. Biol. 7:412-429.
Spiegelman, S. et al. 1969. Chemical and mutational studies of a replicating RNA molecule. Proc. XII Intern. Congr. Genetics 3:127-154.
Spiess, E. 1968. Experimental population genetics. Ann. Rev. Genet. 2:165-208.
Spofford, J. B. 1969. Heterosis and the evolution of duplications. Amer. Natur. 103:407-432.
Sprott, D. A. 1957. The stability of a sex-linked allelic system. Ann. Hum. Genet. 22:1-6.
Stahl, F. W., and N. E. Murray. 1966. The evolution of gene clusters and genetic circularity. Genetics 53:569-576.
Stanton, R. G. 1946. Filial and fraternal correlations in successive generations. Ann. Eugen. 13:18-24.
Stanton, R. G. 1960. Genetic correlations with multiple alleles. Biometrics 16:235-244.
Stern, C. 1943. The Hardy-Weinberg Law. Science 97:137-138.
Stratton, J. A., P. M. Morse, L. J. Chu, and P. A. Hutner. 1941. Elliptic, Cylinder and Spheroidal Wave Functions. Wiley, New York.
Stratton, J. A., P. M. Morse, L. J. Chu, J. D. C. Little, and F. J. Corbató. 1956. Spheroidal Wave Functions. Technology Press of M.I.T. & Wiley, New York.
Streams, F. A., and D. Pimentel. 1961. Effects of immigration on the evolution of populations. Amer. Natur. 95:201-210.
Stuber, C. W., and C. C. Cockerham. 1966. Gene effects and variants in hybrid populations. Genetics 54:1279-1286.
"Student." 1929. Evolution by selection. J. Agric. Res. 39:451-476.
Sturtevant, A. H. 1918. An analysis of the effects of selection. Carn. Inst. Wash. Publ. No. 264, pp. 1-68.
Sturtevant, A. H. 1948. The evolution and function of genes. Amer. Scient. 36:225-236.
Sturtevant, A. H. 1937, 1938. Essays on evolution. I. On the effects of selection on mutation rate. II. On the effects of selection on social insects. III. On the origin of interspecific sterility. Quart. Rev. Biol. 12:464-467; 13:74-76, 333-335.
Sturtevant, A. H., and K. Mather. 1938. The interrelations of inversions, heterosis, and recombination. Amer. Natur. 72:447-452.
Sueoka, N. 1962. On the genetic basis of variation and heterogeneity of DNA base composition. Proc. Nat. Acad. Sci. 48:582-592.
Sutter, J., and L. Tabah. 1951. La mesure de l'endogamie et ses applications demographiques. J. Soc. Stat. Paris 92:243-267.
Sved, J. A. 1964. The average recombination frequency per chromosome. Genetics 49:367-371.
Sved, J. A. 1968. The stability of linked systems of loci with a small population size. Genetics 59:543-563.
Sved, J. A. 1968a. Possible rates of gene substitution in evolution. Amer. Natur. 102:283-292.
Sved, J. A., T. E. Reed, and W. F. Bodmer. 1967. The number of balanced polymorphisms that can be maintained in a natural population. Genetics 55:469-481.
Tallis, G. M. 1962. A selection index for optimum genotype. Biometrics 18:120-122.
Tallis, G. M. 1966. Equilibria under selection for k alleles. Biometrics 22:121-127.
Teissier, G. 1944. Equilibre des genes lethaux dans les populations stationnaires panmictique. Rev. Scient. 82:145-159.
Teissier, G. 1954. Conditions d'equilibre d'un couple d'alleles et superiorite des heterozygotes. Comp. Rend. Acad. Sci. 238:621-623.
Thoday, J. M. 1953. Components of fitness. Symp. Soc. Exp. Biol. 7:96-113.
Thompson, J. B., and H. Rees. 1956. Selection for heterozygotes during inbreeding. Nature 177:385-386.
Thorpe, W. H. 1945. The evolutionary significance of habitat selection. J. Anim. Ecol. 14:67-70.
Tietze, H. 1923. Über das Schicksal gemischter Populationen nach den mendelischen Vererbungsgesetzen. Zeit. angew. Math. Mech. 3:362-393.
Tomlinson, J. 1966. The advantages of hermaphroditism and parthenogenesis. J. Theor. Biol. 11:54-58.
Trustrum, G. B. 1961. The correlation between relatives in a random mating diploid population. Proc. Camb. Phil. Soc. 57:315-320.
Tukey, J. W. 1954. Causation, regression and path analysis. Statistics and Mathematics in Biology. Ed. by O. Kempthorne, T. A. Bancroft, J. W. Gowen, and J. L. Lush. Iowa State College Press, Ames, Iowa. Pp. 35-66.
Turner, J. R. G. 1967. Fundamental theorem of natural selection. Nature 215:1080.
Turner, J. R. G. 1967a. Why does the genotype not congeal? Evolution 21:645-656.
Turner, J. R. G. 1967b. Mean fitness and the equilibria in multilocus polymorphisms. Proc. Roy. Soc. London B. 169:31-58.
Turner, J. R. G. 1967c. The effect of mutation on fitness in a system of two co-adapted loci. Ann. Hum. Genet. 30:329-334.
Turner, J. R. G. 1967d. On supergenes. I. The evolution of supergenes. Amer. Natur. 101:195-221.
Turner, J. R. G. 1968. Natural selection for and against a polymorphism which interacts with sex. Evolution 22:481-495.
Turner, J. R. G. 1968a. On supergenes. II. The estimation of gametic excess in natural populations. Genetics 39:82-93.
Turner, J. R. G. 1968b. How does treating congenital diseases affect the genetic load? Eugen. Quart. 15:191-197.
Turner, J. R. G. 1969. The basic theorems of natural selection: A naive approach. Heredity 24:75-84.
Turner, J. R. G., and M. H. Williamson. 1968. Population size, natural selection, and the genetic load. Nature 218:700.
Utida, S. 1957. Population fluctuation, an experimental and theoretical approach. Cold Spring Harbor Symp. Quant. Biol. 22:139-151.
Vandermeer, J. H. 1968. Reproductive value in a population of arbitrary age distribution. Amer. Natur. 102:586-589.
Vandermeer, J. H., and M. H. MacArthur. 1966. A reformulation of alternative (b) of the broken stick model of species abundance. Ecology 47:139-140.
Van der Veen, J. H. 1960. Heterozygote superiority, selection intensity, and plateauing. Heredity 15:321-323.
Van der Veen, J. H. 1960a. Ein geinduceerd suboptimaal evenwicht bij massaselektie. Genen en Phaenen 5:49-52.
Van Valen, L. 1960. Nonadaptive aspects of evolution. Amer. Natur. 94:305-308.
Van Valen, L. 1963. Haldane's dilemma, evolutionary rates and heterosis. Amer. Natur. 97:185-190.
Van Valen, L. 1965. Selection in natural populations. III. Measurement and estimation. Evolution 19:514-528.
Van Valen, L., and R. Levins. 1968. The origins of inversion polymorphisms. Amer. Natur. 102:5-24.
Varley, G. C. 1947. The natural control of population balance in the knapweed gall-fly (Urophora jaceana). J. Anim. Ecol. 16:139-187.
Verner, J. 1965. Selection for sex ratio. Amer. Natur. 99:419-421.
Visconti, N., and M. Delbruck. 1953. The mechanism of genetic recombination in phage. Genetics 38:5-33.
Volterra, V. 1926. La Lutte Pour La Vie. Gauthier, Paris.
Von Hofstein, N. 1951. The genetic effect of negative selection in man. Hereditas 37:157-265.
Waaler, G. H. M. 1927. Über die Erblichkeitsverhältnisse der verschiedenen Arten von angeborener Rotgrünblindheit. Zeit. ind. Abst. Vererb. 45:279-333.
Waddington, C. H. 1957. The Strategy of the Genes. Allen and Unwin, London.
Wahlund, S. 1928. Zusammensetzung von Populationen und Korrelationserscheinungen vom Standpunkt der Vererbungslehre aus betrachtet. Hereditas 11:65-106.
Wallace, B. 1953. On coadaptation in Drosophila. Amer. Natur. 87:343-358.
Wallace, B. 1958. The role of heterozygosity in Drosophila populations. Proc. Xth Intern. Cong. Genet. 1:408-419.
Wallace, B. 1958a. The comparison of observed and calculated zygotic distributions. Evolution 12:113-115.
Wallace, B. 1963. Genetic diversity, genetic uniformity, and heterosis. Canad. J. Genet. Cytol. 5:239-253.
Wallace, B. 1968. Topics in Population Genetics. W. W. Norton, New York.
Wangersky, P. J., and W. J. Cunningham. 1956. On time lags in equations of growth. Proc. Nat. Acad. Sci. 42:699-702.
Wangersky, P. J., and W. J. Cunningham. 1957. Time lag in population models. Cold Spring Harbor Symp. Quant. Biol. 22:329-338.
Warburton, F. E. 1967. Increase in the variance of fitness due to selection. Evolution 21:197-198.
Warren, H. D. 1917. Numerical effects of natural selection acting on Mendelian characters. Genetics 2:305-312.
Watson, G. S., and E. Caspari. 1960. The behavior of cytoplasmic pollen sterility in populations. Evolution 14:56-63.
Watt, K. E. F. 1959. A mathematical model for the effect of densities of attacked and attacking species on the number attacked. Canad. Entom. 91:129-144.
Watt, K. E. F. 1962. Use of mathematics in population ecology. Ann. Rev. Entom. 7:243-260.
Watterson, G. A. 1959. Non-random mating, and its effect on the rate of approach to homozygosity. Ann. Hum. Genet. 23:204-220.
Watterson, G. A. 1959a. A new genetic population model, and its approach to homozygosity. Ann. Hum. Genet. 23:221-232.
Watterson, G. A. 1961. Markov chains with absorbing states: a genetic example. Ann. Math. Stat. 32:716-729.
Watterson, G. A. 1962. Some theoretical aspects of diffusion theory in population genetics. Ann. Math. Stat. 33:939-957.
Watterson, G. A. 1964. The application of diffusion theory to two population genetic models of Moran. J. Appl. Prob. 1:233-246.
Weber, E. 1959. The genetical analysis of characters with continuous variability on a mendelian basis. Genetics 44:1131-1139.
Weber, E. 1960. The genetical analysis of characters with continuous variability on a mendelian basis. II. Monohybrid segregation and linkage analysis. Genetics 45:459-466.
Weber, E. 1960a. The genetical analysis of characters with continuous variability on a mendelian basis. III. Dihybrid segregation. Genetics 45:567-572.
Weinberg, W. 1908. Über den Nachweis der Vererbung beim Menschen. Jahresh. Verein f. vaterl. Naturk. Württem. 64:368-382.
Weinberg, W. 1909. Über Vererbungsgesetze beim Menschen. Zeit. ind. Abst. Vererb. 1:277-330. 1:377-392, 440-460; 2:276-330.
Weinberg, W. 1910. Weiteres Beiträge zur Theorie der Vererbung. Arch. Rass. Ges. Biol. 7:35-49, 169-173.
Weiss, G. H. 1963. Comparison of a deterministic and a stochastic model for interaction between antagonistic species. Biometrics 19:595-602.
Weiss, G. H., and M. Kimura. 1965. A mathematical analysis of the stepping stone model of genetic correlation. J. Appl. Prob. 2:129-149.
Weldon, W. F. R. 1901. A first study of natural selection in Clausilia laminata (Montagu). Biometrika 1:109-124.
Wentworth, E. N., and B. L. Remick. 1916. Some breeding properties of the generalized Mendelian population. Genetics 1:608-616.
Wigan, L. G. 1944. Balance and potence in natural populations. J. Genet. 46:150-160.
Willham, R. L. 1963. The covariance between relatives for characters composed of components contributed by related individuals. Biometrics 19:18-27.
Williams, E. J. 1961. The growth and age-distribution of a population of insects under uniform conditions. Biometrics 17:349-358.
Williams, G. C. 1957. Pleiotropy, natural selection, and the evolution of senescence. Evolution 11:398-412.
Williams, G. C. 1966. Adaptation and Natural Selection. Princeton University Press, Princeton, N.J.
Williams, G. C. 1966a. Natural selection, the costs of reproduction, and a refinement of Lack's principle. Amer. Natur. 100:687-690.
Williams, G. C., and D. C. Williams. 1957. Natural selection of individually harmful social adaptations among sibs with special reference to social insects. Evolution 11:32-39.
Williamson, M. H. 1958. Selection, controlling factors, and polymorphism. Amer. Natur. 92:329-335.
Wills, C., J. Crenshaw, and J. Vitale. 1969. A computer model allowing maintenance of large amounts of genetic variability in mendelian populations. I. Assumptions and results for large populations. Genetics (in press).
Willson, M. F., and E. R. Pianka. 1963. Sexual selection, sex ratio and mating system. Amer. Natur. 97:405-407.
Wilson, S. P., W. H. Kyle, and A. E. Bell. 1966. The influence of mating systems on parent-offspring regression. J. Hered. 57:124-125.
Woodger, J. H. 1965. Theorems on random evolution. Bull. Math. Biophys. 27:145-150.
Workman, P. L. 1964. The maintenance of heterozygosity by partial negative assortative mating. Genetics 50:1369-1382.
Workman, P. L., and R. W. Allard. 1962. Population studies in predominantly self-pollinated species. III. A matrix model for mixed selfing and random outcrossing. Proc. Nat. Acad. Sci. 48:1318-1325.
Workman, P. L., and S. K. Jain. 1966. Zygotic selection under mixed random mating and self-fertilization: Theory and problems of estimation. Genetics 54:159-171.
Wright, S. 1917. The average correlation within subgroups of a population. J. Wash. Acad. Sci. 7:532-535.
Wright, S. 1921. Systems of mating. I. The biometric relations between parent and offspring. Genetics 6:111-123.
Wright, S. 1921a. Systems of mating. II. The effects of inbreeding on the genetic composition of a population. Genetics 6:124-143.
Wright, S. 1921b. Systems of mating. III. Assortative mating based on somatic resemblance. Genetics 6:144-161.
Wright, S. 1921c. Systems of mating. IV. The effects of selection. Genetics 6:162-166.
Wright, S. 1921d. Systems of mating. V. General considerations. Genetics 6:167-178.
Wright, S. 1921e. Correlation and causation. J. Agric. Res. 20:557-585.
Wright, S. 1922. Coefficients of inbreeding and relationship. Amer. Natur. 56:330-338.
Wright, S. 1922a. The effects of inbreeding and crossbreeding on guinea pigs. Bull. U.S. Dept. Agric. 1121:1-59.
Wright, S. 1923. The theory of path coefficients. Genetics 8:239-255.
Wright, S. 1923a. Mendelian analysis of the pure breeds of livestock. I. The measurement of inbreeding and relationship. J. Hered. 14:339-348.
Wright, S. 1923b. Mendelian analysis of the pure breeds of livestock. II. The Duchess family of shorthorns as bred by Thomas Bates. J. Hered. 14:405-422.
Wright, S. 1929. The evolution of dominance. Comment on Dr. Fisher's reply. Amer. Natur. 63:1-5.
Wright, S. 1929a. Fisher's theory of dominance. Amer. Natur. 63:274-279.
Wright, S. 1929b. The evolution of dominance. Amer. Natur. 63:556-561.
Wright, S. 1931. Evolution in Mendelian populations. Genetics 16:97-159.
Wright, S. 1932. The roles of mutation, inbreeding, cross-breeding and selection in evolution. Proc. VI Int. Cong. Genet. 1:356-366.
Wright, S. 1933. Inbreeding and recombination. Proc. Nat. Acad. Sci. 19:420-433.
Wright, S. 1933a. Inbreeding and homozygosis. Proc. Nat. Acad. Sci. 19:411-420.
Wright, S. 1934. Physiological and evolutionary theories of dominance. Amer. Natur. 68:25-53.
Wright, S. 1934a. The method of path coefficients. Ann. Math. Stat. 5:161-215.
Wright, S. 1934b. Professor Fisher on the theory of dominance. Amer. Natur. 68:562-565.
Wright, S. 1935. The analysis of variance and the correlations between relatives with respect to deviations from an optimum. J. Genet. 30:243-256.
Wright, S. 1935a. Evolution in populations in approximate equilibrium. J. Genet. 30:257-266.
Wright, S. 1937. The distribution of gene frequencies in populations. Proc. Nat. Acad. Sci. 23:307-320.
Wright, S. 1938. The distribution of gene frequencies under irreversible mutation. Proc. Nat. Acad. Sci. 24:253-259.
Wright, S. 1938a. The distribution of gene frequencies in populations of polyploids. Proc. Nat. Acad. Sci. 24:372-377.
Wright, S. 1938b. Size of population and breeding structure in relation to evolution. Science 87:430-431.
Wright, S. 1939. The distribution of self-sterility alleles in populations. Genetics 24:538-552.
Wright, S. 1939a. Statistical genetics in relation to evolution. Act. scient. et indus., 802:1-63.
Wright, S. 1940. Breeding structure of populations in relation to speciation. Amer. Natur. 74:232-248.
Wright, S. 1941. The probability of fixation of reciprocal translocations. Amer. Natur. 75:513-522.
Wright, S. 1942. Statistical genetics and evolution. Bull. Amer. Math. Soc. 48:223-246.
Wright, S. 1943. Isolation by distance. Genetics 28:114-138.
Wright, S. 1945. Tempo and mode in evolution: A critical review. Ecology 26:415-419.
Wright, S. 1945a. The differential equation of the distribution of gene frequencies. Proc. Nat. Acad. Sci. 31:382-389.
Wright, S. 1946. Isolation by distance under diverse systems of mating. Genetics 31:39-59.
Wright, S. 1948. On the roles of directed and random changes in gene frequency in the genetics of populations. Evolution 2:279-294.
Wright, S. 1948a. Evolution, organic. Encyclopedia Britannica 8:915-929.
Wright, S. 1948b. Genetics of populations. Encyclopedia Britannica 10:111-112.
Wright, S. 1949. Adaptation and selection. Genetics, Paleontology, and Evolution. Ed. by G. L. Jepson, G. G. Simpson, and E. Mayr. Princeton Univ. Press, Princeton, N.J. Pp. 365-389.
Wright, S. 1949a. Population structure and evolution. Proc. Amer. Phil. Soc. 93:471-478.
Wright, S. 1951. The genetical structure of populations. Ann. Eugen. 15:323-354.
Wright, S. 1951a. Fisher and Ford on "The Sewall Wright Effect." Amer. Scient. 39:452-479.
Wright, S. 1952. The genetics of quantitative variability. Quantitative Inheritance. Her Majesty's Stationery Office, London. Pp. 5-41.
Wright, S. 1952a. The theoretical variance within and among subdivisions of a population that is in a steady state. Genetics 37:313-321.
Wright, S. 1953. Gene and organism. Amer. Natur. 87:5-18.
Wright, S. 1954. The interpretation of multivariate systems. Statistics and Mathematics in Biology. Ed. by O. Kempthorne, T. A. Bancroft, J. W. Gowen, and J. L. Lush. Iowa State College Press, Ames, Iowa. Pp. 11-33.
Wright, S. 1955. Classification of the factors of evolution. Cold Spring Harbor Symp. Quant. Biol. 20:16-24.
Wright, S. 1956. Modes of selection. Amer. Natur. 90:5-24.
Wright, S. 1960. Physiological genetics, ecology of populations, and natural selection. The Evolution of Life. Ed. by Sol Tax. Univ. of Chicago Press, Chicago, Ill. 1:429-475.
Wright, S. 1960a. On the number of self-incompatibility alleles maintained in equilibrium by a given mutation rate in a population of given size: a reexamination. Biometrics 16:61-85.
Wright, S. 1960b. Path coefficients and path regressions: alternative or complementary concepts? Biometrics 16:189-202.
Wright, S. 1960c. The treatment of reciprocal interaction, with or without lag, in path analysis. Biometrics 16:423-445.
Wright, S. 1963. Discussion of systems of mating used in mammalian genetics. Methodology in Mammalian Genetics. Ed. by W. J. Burdette. Holden-Day, San Francisco. Pp. 41-53.
Wright, S. 1964. Pleiotropy in the evolution of structural reduction and dominance. Amer. Natur. 98:65-69.
Wright, S. 1964a. Stochastic processes in evolution. Stochastic Models in Medicine and Biology. Ed. by John Gurland. Univ. of Wisconsin Press, Madison, Wisc. Pp. 199-244.
Wright, S. 1964b. The distribution of self-incompatibility alleles in populations. Evolution 18:609-619.
Wright, S. 1965. Factor interaction and linkage in evolution. Proc. Roy. Soc. B 162:80-104.
Wright, S. 1965a. The interpretation of population structure by F-statistics with special regard to systems of mating. Evolution 19:395-420.
Wright, S. 1966. Polyallelic random drift in relation to evolution. Proc. Nat. Acad. Sci. 55:1074-1080.
Wright, S. 1967. "Surfaces" of selective value. Proc. Nat. Acad. Sci. 58:165-172.
Wright, S. 1967a. The foundations of population genetics. Heritage From Mendel, Ed. R. A. Brink. Univ. of Wisconsin Press, Madison. Pp. 245-264.
Wright, S. 1968. Evolution and the Genetics of Populations. Vol. 1. Genetic and Biometric Foundations. Univ. of Chicago Press, Chicago.
Wright, S. 1969. Evolution and the Genetics of Populations. Vol. 2. The Theory of Gene Frequencies. University of Chicago Press, Chicago.
Wright, S., and Th. Dobzhansky. 1946. Genetics of natural populations. XII. Experimental reproduction of some of the changes caused by natural selection in certain populations of Drosophila pseudoobscura. Genetics 31:125-156.
Wright, S., Th. Dobzhansky, and W. Hovanitz. 1942. Genetics of natural populations. VII. The allelism of lethals in the third chromosome of Drosophila pseudoobscura. Genetics 27:363-394.
Wright, S., and W. E. Kerr. 1954. Experimental studies of the distribution of gene frequencies in very small populations of Drosophila melanogaster. II. Bar. Evolution 8:225-240.
Wright, S., and H. C. McPhee. 1925. An approximate method of calculating coefficients of inbreeding and relationship. J. Agric. Res. 31:377-383.
Yanase, T. 1964. A note on the patterns of migration in isolated populations. Jap. J. Hum. Genet. 9:136-152.
Yasuda, N. 1968. Estimation of inbreeding coefficient from phenotype frequencies by a method of maximum likelihood scoring. Biometrics 24:915-936.
Yasuda, N. 1968a. An extension of Wahlund's principle to evaluate mating type frequency. Amer. J. Hum. Genet. 20:1-23.
Yasuda, N. 1969. The estimation of the variance effective population number based on gene frequency. Jap. J. Hum. Genet. 14:10-16.
Yasuda, N., and M. Kimura. 1968. A gene-counting method of maximum likelihood for estimating gene frequencies in ABO and ABO-like systems. Ann. Hum. Genet. 31:409-420.
Yntema, L. 1952. Mathematical Models of Demographic Analysis. J. J. Groen and Zoon, Leiden.
Young, S. S. Y. 1961. A further examination of the relative efficiency of three methods of selection for genetic gains under less restricted conditions. Genet. Res. 2:106-121.
Young, S. S. Y. 1966. Computer simulation of directional selection in large populations. I. The programme, the additive and dominance models. Genetics 53:189-205.
Young, S. S. Y., and H. Weiler. 1960. Selection for two correlated traits by independent culling levels. J. Genet. 57:329-358.
Yule, G. U. 1902. Mendel's laws and their probable relation to intra-racial heredity. New Phytol. 1:192-207, 222-238.
Zirkle, C. 1926. Some numerical results of selection upon polyhybrids. Genetics 11:531-583.
GLOSSARY AND INDEX OF SYMBOLS
This list gives the symbol, a brief identification or definition, and the page where it is defined or introduced.
1. Latin Symbols
A A
A
IAI
A-I
A t , A 2, A i, A J ai;
ancestor correction between the ge n ic value s of mates matrix de te nn i nant inverse of m atrix A allelic genes average excess, or deviation from the mean, of genotype A iA j average excess of all ele A, Beta fun ction
70 1 56 5 00 502 33 206 21 1 437 577
B 1 , B2, B", BI b b:# b,l" C
C C Cov D D
E E
f, fI
allelic genes, non alleles of genes at the
47
A locus
7
birth rate birth rate at age
regression of y on
11
x
488, 1 40
x
recombination or "crossover" fraction
1 36
covariance Haldane's cost of natural selection
246 484, 1 08
covariance deviation of heterozygote from mean of homozygotes
P I P4
-
p 2 pa, a measure of linkage dis-
equilibrium
99 1 97
epistatic parameter
198
expected value operator
1 07
inbreeding coefficient
of i ndividual I,
probab ility of autozygosity
flJ
coefficient of consanguinity or kinship of
fIB, fB'1', fI'1'
inbreeding coefficients in a subdivided
f(x) tli
47
64
I and J
population probabil ity gene rating function
fertility of ge notype A iA j
1 05 419 1 85
F         two-locus inbreeding coefficient, probability of double autozygosity    96
F         hypergeometric function    383
G         geometric mean    480
G_ij      genic or additive value of genotype A_iA_j    210
h         a measure of dominance    260
h^2       heritability, V_g/V_t    124
H^2       heritability, V_h/V_t    124
H         heritability, same as h^2    156
H         harmonic mean    481
H         proportion of heterozygotes    63
H_0       initial proportion of heterozygotes    64
H_t       proportion of heterozygotes at time t    64
I         selection differential    226
I         identity matrix    501
K         carrying capacity    23
K         number of possible allelic states    453
K         measure of segregation distortion    311
K         proportion of homozygous recessives from consanguineous matings    74
k              the number of progeny from a parent    346
k_0, k_1, k_2  Cotterman k-coefficients    132
L              genetic load    299
L_U            upper confidence limit    497
L_L            lower confidence limit    497
l(x)           probability of survival to age x    17
M              migration coefficient (same as m on p. 390)    267
arithmetic mean mean
gene
480
frequency change
m
generation m
m
m
m,
m ij
IV
Ne Ne(1) Ne(,,) Nt; n ne ne na:t P P Pi}
P P Pi PI, P2, pa, P4 pm
Q q r
r r
s
T
one 372
migration coefficient
390
proportion of males
42
fi tness in Malthusian parameters
7
fitness of allele A i
191
fitness of genotype A iA ;
191
population number
5
effective population number
109, 345
inbreeding effective population number
347
variance effective population number
357
n umber of genotype A iA; in the population
number of alleles
1 68
effective number of alleles
324
effective number of gene loci
1 54
number of individuals of age panmictic index
x
at time
t
Legendre polynomial
38
allele frequency
initial allele frequency
328
proportion of allele A i
33
frequencies of four chromosome types age
x+
1
64 3 86
proportion of genotype A iA J
probability of surviving from age
11
x
a quantity to be minimized
1 96
to 487
an allele frequency, or a probability (= 1 - p)
intrinsic rate of increase    23
correlation coefficient    487, 137
Wright's coefficient of relationship    69, 138
selective advantage    8, 182, 192
Gegenbauer polynomial    383
t    relative deviate    498
t    time    5
u    mutation rate    259
u_ij    mutation rate from A_i to A_j    263
u    probability of ultimate fixation of a mutant gene    421
u(p,t)    probability of fixation at generation t, given frequency p at generation 0    423
V    variance    10, 54, 482
V_d    dominance variance    120
V_e    environmental variance    120
V_g    genic variance    120
V_h    genotypic variance    120
V_i    interaction or epistatic variance    120
V_t    total variance    120
V_gam    gametic variance    219
V_AA    additive × additive epistatic variance    127
V_AD    additive × dominance epistatic variance    127
V_DD    dominance × dominance epistatic variance    127
V_δx    variance of the gene frequency change in one generation    372
variance in progeny number    346
mutation rate, in reverse direction to that measured by u    442
viability of genotype A_iA_j    185
v(x)    reproductive value at age x    20
w    fitness in discrete generation model    5
fitness of genotype A_iA_j    179
fitness of allele A_i    179
age    11
allele frequency varying stochastically    328
phenotypic value    77
phenotypic value at f = 0
phenotypic value at f = 1
P_1P_4/P_2P_3, a measure of linkage disequilibrium    197
2. Greek Symbols
α    proportion of double reduction    52
α    departure from Hardy-Weinberg proportions    356
average effect of allele A_i    111
gamma function    391
a determinant    215
an increment, change in one generation    6, 119
an increment    230
Dirac delta function    383
partial derivative
P_ij/p_ip_j, a measure of departure from Hardy-Weinberg proportions
eigenvalue
limiting value of H_t/H_(t-1)
largest eigenvalue, rate of population growth at stable age distribution
mean
nth moment about 0
product symbol
π
I
q,
!fJ (p,x;t)
true probability addition symbol standard deviation
F-
[2
x
,
15 486 332 480 491 4 80 483 91
at time t, given that it was p at
time 0 loge 8ij I( observed-expected) 2
3. Other Symbols
1 4, 506 88
the probability that the gene frequency is
n
212
expected
probability of coexistence of 2 alleles in a population
312 493 386
!    factorial [n! = n(n - 1)(n - 2) ... 2·1]    491
used to designate the next generation    184
mean value    480
designates an equilibrium value    146
*    designates that the quantity applies to males    45
**    designates that the quantity applies to females    45
harmonic mean    302
INDEX OF NAMES
Allard, R. W., 294
Anderson, W., 186, 188
Bennett, J. H., 41, 50, 52, 280
Bernoulli, J., 426
Beyer, W. H., 498
Bodmer, W. F., 162, 174, 204, 205, 217, 289, 307
Book, J. A., 75
Breese, E. L., 143
Castle, W. E., 35
Chu, L. J., 397-400
Chung, C. S., 148, 186, 188
Cockerham, C. C., 124
Corbató, F. J., 397-400
Cotterman, C. W., 41, 64, 65, 132, 171
Crenshaw, J., 307
Darwin, C., 237, 288
De Finetti, B., 60
Dempster, E. R., 284, 418
Denniston, C., 97, 132
Dirac, P. A. M., 383, 467
Dobzhansky, T., 297
Edwards, A. W. F., 289
Emlen, J. M., 289
Ewens, W. J., 195
Fadeev, D. K., 503
Fadeeva, V. N., 503
Falconer, D. S., 69, 121
Feller, W., 421
Felsenstein, J., 149, 199, 200, 217, 248
Feynman, R., 2
Fibonacci, L., 64
Finney, D. J., 168, 170
Fish, H. D., 63
Fisher, R. A., 2, 3, 4, 9, 10, 17, 20, 22, 32, 41, 52, 64, 89, 94, 118, 124, 127, 137, 138, 156, 158, 159, 160, 161, 167, 168, 174, 175, 200, 205, 206, 207, 209, 210, 212, 216, 218, 221, 224, 225, 237, 242, 287, 289, 292, 298, 313, 316, 322, 371, 387, 388, 414, 418, 419, 426, 434, 457, 480, 498, 509
Ford, E. B., 414
Gauss, K. F., 357
Geiringer, H., 50
Haldane, J. B. S., 3, 19, 64, 89, 94, 96, 97, 98, 174, 183, 194, 204, 224, 240, 244, 245, 249, 250, 251-252, 256, 280, 282, 297, 298, 300, 301, 366, 371, 372, 418, 419, 422, 428, 433
Hardy, G. H., 34, 35, 38, 40, 45, 47, 49, 53, 55, 59, 60, 65, 66, 105, 111, 116, 119, 165, 174, 184, 185, 190, 193, 194, 195, 196, 211, 212, 214, 217, 234, 238, 266, 311, 320, 356, 362
Harris, D. L., 58, 421
Hartl, D. L., 41
Hartley, H. O., 498
Hill, W. G., 389
Hogben, L., 55
Imaizumi, Y., 108, 445
Jain, S. K., 294
Jayakar, S. D., 280, 282
Jennings, H. S., 47, 63, 143, 146
Jukes, T. H., 250, 307, 322, 369
Karlin, S., 161, 163, 166
Kelleher, T. M., 214
Kempthorne, O., 69, 121, 124, 125, 127, 129, 215, 221
Kerr, W. E., 396, 398, 399, 400, 465
Keyfitz, N., 16
King, J. L., 240, 241, 250, 307, 322, 369
Kingman, J. F. C., 209, 273
Kojima, K. I., 214, 217, 294
Kolman, W., 289
Kolmogorov, A., 371, 372, 373, 374, 376, 377, 378, 380, 381, 382, 419, 422, 423, 425, 428, 466
Korn, G. A., 383
Korn, T. M., 383
Lagrange, J. L., 231, 286, 515
Lee, B. T. O., 160
Legendre, A. M., 338, 386
Leigh, E., 25
Lerner, I. M., 225
Leslie, P. H., 11
Levene, H., 55, 257, 282, 284, 294
Lewontin, R. C., 217, 224, 294
Li, C. C., 86, 92, 208, 215, 272, 277, 480
Little, J. D. C., 397-400
Lotka, A. J., 17
MacArthur, R. H., 29, 289
McBride, G., 143
Malecot, G., 50, 65, 69, 470
Mandel, S. P. H., 209, 273, 276, 277
Mange, A. P., 106
Markov, A. A., 372, 376, 381, 406
Maruyama, T., 41, 250
Mather, K., 313, 315-316
Mendel, G., 11, 23, 25, 31, 32, 126, 143, 174, 184, 236, 237, 293, 297, 303, 311, 312, 313, 382
Milkman, R. D., 307
Miller, G. F., 406, 409, 410, 411, 412
Mohler, J. D., 281
Moran, P. A. P., 11, 14, 15, 39, 167, 168, 170, 181, 284, 425, 426
Morse, P. M., 397-400
Morton, N. E., 77, 111, 148, 304, 311, 345, 363, 364, 426
Moshinsky, P., 64
Mukai, T., 85
Mulholland, H. P., 209, 273
Muller, H. J., 77, 244, 297, 301, 313, 314, 316
Murata, M., 365
Murphy, E. M., 16
Nair, K. R., 24
Narain, P., 432
Neal, N. P., 84
Nei, M., 99, 108, 260, 288, 365, 445, 449, 450
Newton, I., 232
Novitski, E., 257, 311
O'Donald, P., 161
Ohta, T., 389, 430, 431, 432
Parsons, P. A., 161, 204, 205, 217
Pearl, R., 24
Pearson, K., 35, 160, 498
Poisson, S. D., 346, 421, 426, 428, 491-493
Prout, T., 257, 284
Race, R. R., 33
Reed, T. E., 307
Reeve, E. C. R., 156
Remick, B. L., 143
Robbins, R. B., 50
Robertson, A., 92, 100, 143, 260, 294, 336, 343, 344, 364, 365, 389, 396, 409, 413, 418, 428, 447, 457
Robison, O. W., 148
Sandler, L., 257, 311
Sanger, R., 33
Schaffer, H. E., 217
Scheuer, P. A. G., 209, 273
Scudo, F. M., 161, 163, 166
Searle, A. G., 57
Shackleford, R. M., 494
Shaw, R. F., 289
Sheppard, P. M., 287
Slobodkin, L. B., 24
Smith, C. A. B., 209, 273
Smith, M., 251, 316
Snyder, L. H., 57
Sturtevant, A. H., 313, 315-316
Stratton, J. A., 397, 398, 399, 400
Sutter, J., 76
Sved, J. A., 251, 307
Tabah, L., 76
Teissier, G., 186
Turner, J. R. G., 214, 239
Varley, G. C., 19, 20
Vitale, J., 307
Waaler, G. H. M., 41
Wahlund, S., 54, 60, 108
Wallace, B., 178
Watterson, G. A., 161, 425
Weinberg, W., 34, 35, 38, 40, 45, 47, 49, 53, 55, 59, 60, 65, 66, 105, 111, 116, 119, 165, 174, 184, 185, 190, 193, 194, 195, 196, 211, 212, 214, 217, 234, 238, 266, 311, 320, 356, 362
Weiss, G. H., 439, 470, 473, 475, 477, 478
Wentworth, E. N., 143
Wills, C., 307
Wilson, E. O., 29
Wright, S., 2, 3, 4, 9, 62, 64, 66, 69, 73, 86, 88, 91, 92, 99, 100, 104, 105, 106, 107, 111, 124, 125, 138, 144, 149, 152, 167, 168, 173, 174, 175, 180, 181, 183, 204, 215, 217, 223, 224, 225, 236, 244, 245, 251, 260, 263, 267, 268, 269, 270, 293, 294, 313, 315, 316, 320, 321, 322, 330, 335, 340, 345, 349, 352, 360, 367, 371, 372, 382, 384, 394, 396, 398, 399, 400, 414, 418, 425, 426, 428, 434, 435, 436, 438, 439, 440, 441, 442, 443, 444, 446, 457, 459, 469, 480
Yule, G. U., 35
INDEX OF SUBJECTS
Adaptive surface, 236
Age distribution, 11-20
  stable, 15
Allozygosity, 65
Altruistic behavior, 243-244
Analysis of variance, see Variance
Assortative mating, 141-166
  compared with consanguineous mating, 141-143
  effect of dominance on, 156-158
  effect of, on correlations between relatives, 158-161
  multifactorial, 148-161
  single locus, 143-148, 161-166
  with selection, 163-166
Autozygosity, 65
Average effect, 117, 206, 211
Average excess, 179, 211
Backcrossing, repeated, 94
Baldness, 58
Bees, inbreeding in, 111, 112
Behavior, altruistic, 243-249
Bernoulli polynomials, 426
Blending inheritance, 236-237
Blood groups, 33, 36, 40, 496
Branching process, 419-423
Brother-sister mating, 64, 86-89
Carrying capacity, 23
Cats, coat color in, 57
Characteristic equation, see Equation
Chi-square, chart of, 516
Chi-square test, 493-496
Circular half-sib mating, 90-92
Coancestry, 69
Coefficient, inbreeding, see Inbreeding coefficient
  of consanguinity, 68
  of kinship, 69
  of relationship, 69, 138
  selection, see Selection coefficient
Coefficients, Cotterman, see Cotterman k-coefficients
  path, 480
Color-blindness, 41-44
Compensation, effect of, on sex ratio, 291-292
Confidence limits, 496-498, 499-500
Consanguinity, coefficient of, see Coefficient
  phenotypic effects of, 73-85
Cooperative behavior, 243-244
Corn, 84
Correlation between relatives, 136-140
  effect of assortative mating on, see Assortative mating
Correlation coefficient, 487
  relation to inbreeding coefficient, 66-68
  significance of, 499
Cotterman k-coefficients, 132-136
Cousins, double first, repeated mating of, 89-91
Covariance, 108, 484
  between relatives, 136, 139
Covariance matrix, 513
Crowding, effect of, 22-29, 216
Darwinian fitness, 5
Deafness, assortative mating for, 147-148
De Finetti diagram, 60
Degrees of freedom, 414
Determinants, 500-509
Diffusion equations, see Equations
Diploidy, evolutionary advantage of, 316-317
Disassortative mating, 166-170
Distribution, binomial, 491
  gene frequency, 383-418, 434-452
  normal, 492
  Poisson, 492
Double-first-cousin mating, 89-91
Double reduction, 52-53
Drift, genetic, see Random genetic drift
Drosophila melanogaster, 50, 85
  selection in, 178, 187-188
Dynamics, evolutionary, 256
Effective number, of alleles, 324, 453-465
  of gene loci, 154
Effective population number, 103, 109-111, 345-365
  effect of selection on, 364-365
  inbreeding, 345-352
  variance, 352-364
Eigenvalues, 14-16, 505-509
Eigenvectors, 14-16, 505-509
Epistasis, as cause of gametic phase disequilibrium, 196
  diminishing, 81
  effect of, on inbreeding, 82
  effect of, on selection, 195-205, 217-225
  reinforcing, 81
  variance component due to, 124-129
Equation, characteristic, 14
  Fokker-Planck, 372-382
  Kolmogorov backward, 372, 423-430
  Kolmogorov forward, 372-382
  logistic, 24, 25, 28, 193
Equations, diffusion, 371 ff.
Equilibrium, 255 ff.
  between migration and random drift, 268-270
  between selection and migration, 267-268
  between selection and mutation, 258-262, 264-267
  stability conditions, 275-278
  under mutation, 262-264
  under selection, 270-277
Evolution, rate of, by random drift, 368-371
  by selection, 194
Fibonacci series, 64
Fiducial limits, 496-498, 499-500
Fitness, Darwinian, 5
  definition of, 224
  Malthusian parameter of, 9, 18
  Wrightian, 5, 9
Fixation, of a mutant gene, probability of, 418-430
  time until, 430-432
Fixation index, see Inbreeding coefficient
Fokker-Planck equation, see Equation
Function, hypergeometric, 338, 383
Fundamental theorem of natural selection, see Natural selection
Gall insect, death rates in, 20
Gametic phase disequilibrium, 196-204
Gametic phase equilibrium, 47, 49, 50
  rate of approach to, 47-52
Gaussian correction, 357, 486
Gene frequency, 32
Gene frequency distribution, among subgroups, 436-441
  general formula for, 433-435
  steady state, 433-477
  with selection and mutation, 442-452
Genetic drift, random, see Random genetic drift
Genetic load, 297-312
  balanced, 303-308
  causes of, 297-299
  incompatibility, 308-311
  meiotic drive, 311-312
  mutation, 299-303
  segregation, 303-308
Geometric mean, 6, 481
Haldane cost of natural selection, 244-252
Haldane-Muller principle, 299-303
Half-sib mating, circular, 90-92
Hamilton's principle, 232
Hardy-Weinberg principle, 31 ff.
  extension to more than two loci, 50-52
  extension to two loci, 47-50
  multiple alleles, 40-41
  X-linked loci, 41-44
Harmonic mean, 481
Heritability, 124, 229
Heterozygosity, amount of, in finite population with neutral alleles, 322-327
  effect of inbreeding on, 62-68
  effect of random drift on, 101-104, 320-322
Hierarchical population structure, 104-108
Homozygosity, see Heterozygosity
Hypergeometric function, 338, 383
Inbreeding, effect of, on heterozygosity, 62-64
  effect of, on variance, 99-101
  maximum avoidance of, 89-92
  measurement of, 64-68
  phenotypic effects of, 73-85
  regular systems of, 85-95
  with two loci, 95-99
Inbreeding coefficient, 64-68
  computation of, from pedigrees, 69-73
Inbreeding effective population number, 345-352
Incompatibility, pollen, 166-168, 170
Incompatibility load, 308-311
Increase, intrinsic rate of, 23
Information matrix, see Matrix
Inheritance, blending vs. particulate, 236-237
Intergroup selection, see Selection
Intrinsic rate of increase, 23
Invariance, functional, 511
Isolation by distance, 469-478
Isonymy, 106
k-coefficients, 132-136
K-selection, 29
Kinship, coefficient of, see Coefficient
Kolmogorov equations, see Equation
Lagrange multipliers, 515
Least action principle, 232
Lethal equivalents, 77
Lethal genes, distribution of, 445-450
Likelihood, maximum, see Maximum likelihood method
Linkage disequilibrium, 196-204
Linkage equilibrium, approach to, 47-52
Load, genetic, see Genetic load
Logistic equation, see Equation
Maize, 84
Malthusian parameter, 9, 18
Markov process, 372 ff.
Mating, assortative, 143-166
  disassortative, 166-170
Mating systems, inbreeding, 85-95
Matrices, 500-509
  arithmetic operations of, 500-503
Matrix, identity, 501
  information, 513
  transition, 14, 16, 88, 91, 92
Maximum likelihood method, 41, 42, 509-514
Mean, arithmetic, 480
  geometric, 6, 480
  harmonic, 480
Meiotic drive load, 311-312
Mendelian inheritance, evolutionary advantage of, 313-316
Migration, 267-270, 436-441
  random drift and, 268-270, 469-478
Mink, 495
Model, deterministic, 4
  mathematical, 3
  stepping stone, 469-478
  stochastic, 4, 367-368
Moments, gene frequency, change of, due to random drift, 331-339
Mutation load, 299-303
Natural selection, cost of, 244-252
  fundamental theorem of, 10, 205-224
  maximum principle for, 230-236
Neutral alleles, actual number in a finite population, 453-457
  effective number in finite population, 322-327, 453-457
Neutral genes, rate of evolution of, 368-370
Nucleotide sites, heterozygous, number maintained, 466-469
Overdominant alleles, number maintained in a finite population, 457-465
Panmictic index, 64
Panmixia, 31-56
Path coefficients, 480
Polymorphism, multiple niche, 282-284
  neutral, 257-258
  sex linked, 278-281
  single locus, 270-277
  two locus, 284-288
Polyploidy, random mating with, 52-53
Population, finite, 101-104, 319-366
  evolution in, 244
  random mating proportions in, 55-56
  randomly mating, 31-56
  subdivided, 54-55
  variance of quantitative character in, 339-345
Population number, change in, 5-20
  effective, see Effective population number
  regulation of, 22-30
Population structure, 239-244
  hierarchical, 104-108
  stepping stone model of, 469-478
Quantitative traits, effect of inbreeding on, 77-85
  selection of, 225-230
  variance of, 116-132
  variance of, in a subdivided population, 339-345
Quasi-fixation, 414-418
Quasi-linkage equilibrium, 197-203, 217-224
Quasi-loss of an allele, 414-418
r-selection, 29
Random genetic drift, 101-104, 371
  as diffusion process, 382-389
  change in gene frequency moments due to, 331-339
  change in mean and variance due to, 327-331
  change of heterozygosity due to, 320-322
  migration and, 389-399, 469-478
  mutation and, 389-395
  selection and, 396-414
Recombination, evolutionary advantage of, 387-388
Regression, 486-489
Reproduction, clonal, 38
  sexual, evolutionary advantages of, 313-316
Reproductive value, 20-22
  rate of increase of, 22
Retardation factor, 413
Root, characteristic, see Eigenvalues
Segregation load, 303-308
Selection, balanced by migration, 267-268
  balanced by mutation, 259-267
  between groups, 241-243
  complete, 175-178
  continuous model of, 190 ff.
  effect of, on variance, 237-239
  effect of linkage and epistasis on, 195-205
  effect of time of enumeration on, 185-190
  fluctuating, 281
  frequency dependent, 256
  heterozygote favored, 256, 270-277
  partial, 178-190
  stabilizing, 293-296
  time required for population change by, 24, 28, 193-195
  truncation, 225-230
  within families, 240-241
  See also Natural selection
Selection coefficient, 5
Selection differential, 226
Selection intensity, 227
  random fluctuation of, 414-418
Self-fertilization, 62-63, 86, 504-505
  partial, 92-94
Self-sterility systems, 166-168
Sewall Wright effect, see Random genetic drift
Sex-linked loci, 41-47
Sex ratio, effect of compensation on, 291-292
  effect of selection on, 288-293
Sexual reproduction, evolutionary advantages of, 313-316
Sib-mating, 64, 86-89, 508-509
Significance tests, 493-500
Snyder's ratios, 57
Standard deviation, 483
Statics, evolutionary, 256, 433
Stepping-stone model, 469-478
Stochastic process, gene frequency change as a, 371
t, chart of, 516
t-test, 498-499
Tetraploidy, random mating with, 52-53
Thresholds, 225-230
Transformation, angular, 387-388
Truncation selection, see Selection
United States population, Malthusian parameter of, 19
  transition matrix of, 16
Urophora jaceana, 20
Value, reproductive, 20-22
Variability, maintenance of, 256-258
  measurement of, 482-485
Variance, 116
  additive, 117
  binomial, 491
  definition, 482
  dominance, 120
  epistatic, subdivision of, 124-129
  genic, 117
  genotypic, 117
  maximum likelihood, 510
  of a mean, 485
  of a sum, 484
  phenotypic, 120
  Poisson, 492
Variance components, 120
  effects of inbreeding on, 130-132
Variance-covariance matrix, 513
Variance effective population number, 352-361
Vectors, 501
Wahlund's principle, 54-55
Wright effect, see Random genetic drift
Wright's inbreeding coefficient, 64
  theory of evolution, 244
X-linked loci, 41-47