PENGUIN BOOKS
GENES, PEOPLES AND LANGUAGES
'There may be no contemporary scholar who has a more detailed understanding of human diversity or a more compelling vision of its unified history ... This project is an immense intellectual achievement, and Genes, Peoples and Languages is a fine way to get a sense of its scope' Edward Rothstein, The New York Times
'It challenges us to define what we understand of "race" and its author has for nearly half a century made seminal contributions to the study of evolutionary genetics, in addition to possessing a most wonderful talent for telling a great story' Gearóid Tuohy, Irish Times
'Luigi Luca Cavalli-Sforza's latest book summarizes the life work of this fascinating polymath, who for the last fifty-five years has been developing ingenious methods to understand the history of everybody ... Genes, Peoples and Languages is, among other things, an intellectual biography - a complex portrait of a scientist capable of mentally juggling the particulars about everything and everybody, while remaining continually alert to grand designs' Jared Diamond, New York Review of Books
'The author has long been at the forefront of research that uses human genetics as a way of unravelling our past. In this short, lucid book he describes the insights that have been gained about the way our species evolved and spread, leading to different races, cultures and languages' Andrew Crumey, Scotland on Sunday
ABOUT THE AUTHOR
The world's leading expert on human population genetics, Luigi Luca Cavalli-Sforza was born in Genoa in 1922 and has taught at the universities of Cambridge, Parma and Pavia. He is currently an active Professor Emeritus of Genetics at Stanford University and is the author of The History and Geography of Human Genes. He is a Foreign Member of the Royal Society.
Genes, Peoples and Languages
Luigi Luca Cavalli-Sforza
Translated by Mark Seielstad
PENGUIN BOOKS
which to view that past. We know that, with few exceptions, many characteristics such as height and skin, hair, and eye color are genetically determined, but we do not understand precisely how. Moreover, some of them are also influenced by non-genetic factors, for instance, nutrition, in the case of height, and exposure to the sun, in the case of skin tone. Our poor understanding of the hereditary mechanism of these familiar characteristics is due to their interaction with non-genetic, environmental factors, and the general complexity of the mechanisms determining all traits that involve shape. By contrast, we understand clearly the inheritance of blood groups, and of chemical polymorphisms among enzymes and other proteins, because the account of traits determined by relatively simple substances like proteins is chemically simpler and easier to understand and measure. But these traits are not directly visible, and rather sensitive laboratory methods are required to detect them. Very early on, the American scientist William Boyd showed that by using the first genetic systems discovered (ABO, RH, and MN), one could already differentiate populations from the five continents. Arthur Mourant, a British hematologist, produced the first comprehensive summary of data on human polymorphisms in 1954. The second edition of Mourant's book, appearing in 1976, contained more than one thousand pages, more than doubling the amount of data previously available.
Two major techniques are used to study polymorphisms on proteins, or genetic "markers" as they are called because they act as tags on genetic material. One, employed for almost all blood group typings, uses biological reagents, often made by humans reacting to foreign substances from bacteria, or from other sources. These reagents are special proteins called immunoglobulins or antibodies. They are made in the course of building immunity, that is, resistance to some external agent, and usually react specifically with substances called antigens, usually other proteins. The other method of genetic analysis, developed in 1948, is a direct study of physical properties of specific protein molecules, usually by measuring their mobility in an electric field. It is called electrophoresis. Both methods revealed directly or indirectly the variation in structure of specific proteins from individual to individual. The behavior of these variants could be tested in families to confirm the genetic nature of such variation. But the number of polymorphic proteins detected in this way was small and at the beginning of the 1980s only about 250 were known.
All proteins are produced by DNA, and therefore behind protein variation there must be a parallel variation of DNA, the chemical substance responsible for biological inheritance. The analytical methods necessary to chemically study DNA were developed later. In the early eighties the analysis of variation in DNA had its start. DNA is a very long filament made of a chain combining four different nucleotides, A, C, G, and T. Changes in the sequence of nucleotides of a specific DNA happen rarely, and more or less randomly, when one nucleotide is replaced by another during replication. Thus, if a DNA segment is GCAATGGCCC, it may happen that a copy of it passed by a parent to a child is changed in the fifth nucleotide, T being replaced by C. The DNA generating the child's protein will thus be GCAACGGCCC.
This is the smallest change that can happen to DNA, and is called a mutation; as DNA is inherited, descendants of the child will receive the mutated DNA. A change in DNA may cause a change in a protein, and this may cause a change visible to us.
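The single-nucleotide substitution described above can be reproduced in a few lines. The following is a minimal Python sketch; the function name `point_mutation` and its 1-based `position` argument are illustrative choices, not anything taken from the text.

```python
def point_mutation(dna: str, position: int, new_base: str) -> str:
    """Return a copy of dna with the nucleotide at a 1-based position replaced."""
    if len(new_base) != 1 or new_base not in "ACGT":
        raise ValueError("new_base must be one of A, C, G, T")
    if not 1 <= position <= len(dna):
        raise ValueError("position is outside the sequence")
    return dna[: position - 1] + new_base + dna[position:]


parent = "GCAATGGCCC"
child = point_mutation(parent, 5, "C")  # fifth nucleotide: T replaced by C

# 1-based positions at which the parent and child segments differ
differences = [i + 1 for i, (a, b) in enumerate(zip(parent, child)) if a != b]
print(child, differences)
```

Comparing the two strings position by position is, in miniature, what any method for detecting a DNA polymorphism has to accomplish.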
Restriction enzymes provided a simple way to detect differences in the DNA of two individuals. Restriction enzymes are produced by bacteria and break DNA at certain sequences of 4, 6, or 8 nucleotides, for instance CCCC. A method of multiplying DNA in a test tube with the enzyme DNA polymerase, which nature uses to duplicate DNA when cells divide, was discovered and developed in the second half of the eighties, and is called PCR, or polymerase chain reaction. This new technique has improved the power of genetic analysis in the nineties. We now know that there must exist millions of polymorphisms in DNA, and we can study them all, but the techniques for doing this at a satisfactory pace are only now beginning to be available.
The future of the analysis of genetic variation is clearly in the study of DNA, but results accumulated with the old techniques based on proteins have not lost their value. There are some specific problems which can be resolved only by DNA techniques. On the other hand, the very rich information generated by protein data on human populations includes almost 100,000 frequencies of polymorphisms. They were studied for over 100 genes in thousands of different populations all over the earth, and many of the conclusions thus made possible and discussed in this book have arisen from studies of proteins. Results with DNA have complemented but never contradicted the protein data. We are beginning to have knowledge of thousands of DNA polymorphisms, but they are almost all limited to very few populations. We will summarize the most important ones.
Studying Many Genes Allows Use of the "Law of Large Numbers"
Is it possible to reconstruct human evolution by studying the types of living populations only? We can simplify the process of doing so by concentrating most of our studies on indigenous people, when it is possible to recognize them and differentiate them from recent immigrants to a region. But we cannot learn much about human origins and evolution from a single gene like ABO.
We will introduce here the word "gene." Everybody has heard it, but few know its precise meaning. The old definition, "unit of inheritance," is still difficult to understand; in fact, it was used when we did not know what a gene was in chemical terms. Today we can give a much more concrete definition: a gene is a segment of DNA that has a specific, recognizable biological function (in practice, most frequently that of generating a particular protein). It is, therefore, part of a chromosome, a rod found in the nucleus of a cell that contains an extremely long DNA thread, coiled and organized in a complicated way. A cell usually has many chromosomes, and their distribution to daughter cells is made in such a way that a daughter cell receives a complete copy of the chromosomes of the mother cell. When studying evolution, however, we may, and often must, ignore what a gene is doing, because we don't know. But a gene remains useful for evolutionary studies (and others) if it is present in more than one form, and the more forms (alleles) of a gene that exist, the better the gene suits our purposes.
With only three alleles, ABO can hardly be very informative. In Africa, the place of origin, one finds all alleles. But this is also true of Asia and Europe. In Asia, however, the B allele is more frequent than in the other continents; group A is somewhat more common in Europe; and Native Americans are almost entirely blood group O. What conclusions can we draw? That A and B genes were probably lost in the majority of Native Americans, but why? Many have speculated about the reason, but it is impossible to provide an entirely satisfactory answer.
The first hypothesis connecting the historical origin of a people and a gene that was subsequently confirmed by independent evidence was made on the basis of the RH gene in the early forties. The simplest genetic analysis recognizes two forms: RH+ and RH-.
Globally, RH+ is predominant, but RH- reaches appreciable frequencies in Europe, with the Basques having the highest frequency. This suggests that the RH- form arose by mutation from the RH+ allele in western Europe and then spread, for unspecified reasons, toward Asia and Africa, never greatly diminishing the frequency of the RH+ gene. The highest frequencies of the negative type are generally found in the west and northeast of Europe. Frequencies steadily decline toward the Balkans, as if Europe was once entirely RH- (or at least predominantly so) before a group of RH+ people entered via the Balkans and diffused to the west and north, mixing with indigenous Europeans. This hypothesis would have remained uncertain if it had not been substantiated by the simultaneous study of many other genes. Archeology also lent support to the argument, as we shall see later.
Reconstructing the history of evolution has proved a daunting task. The accumulation of data on many genes in thousands of people from different populations has produced a dizzying amount of information that describes the frequency of the different forms of more than 100 genes, a body of knowledge that is very useful for testing evolutionary hypotheses. Experience has shown that we can never rely on a single gene for reconstructing human evolution. It might appear that a single system of genes like HLA, which today has hundreds of alleles, would be sufficient. The HLA genes play an important role in fighting infections and recently have become important in matching donors and recipients for tissue and organ transplants. They possess a great diversity of forms, as is necessary for a potential defense against the spread of tumors among unrelated individuals, but they are also subject to extreme natural selection related to their role in fighting infection. If the conclusions we reach about evolution through observations made using HLA are different from those obtained using other genes, we need to explain the reasons, because they may lead to different historical interpretations.
It is very useful, and I think essential, to examine all existing information. The broadest synthesis has the greatest chance of answering the questions we ask, and the least chance of being contradicted by later findings. Therefore, it is also worth gathering information from any discipline that can provide even a partial answer to our problems.
Within genetics itself, we want to collect as much information about as many genes as possible, which would allow us to use the "law of large numbers" in the calculation of probabilities: random events are important in evolution, but despite their capriciousness, their behavior can be accounted for through a large number of observations.
Jacques Bernoulli, in his Ars conjectandi of 1713, wrote, "Even the stupidest of men, by some instinct of nature, is convinced on his own that with more observations his risk of failure is diminished." Many studies have been invalidated because of an inadequate number of observations. When we study polymorphisms directly on DNA, there is no dearth of evidence: we can study millions. We may not need to study them all, because at a certain point additional data fail to provide new results or lead to different conclusions. Nevertheless, simply studying a large sample is not always enough. If we observe heterogeneity in our data, so that it can be divided into several categories, each implying a different history, we must further search for the source of these discrepancies. We have seen an important example in the comparison of genes transmitted by the paternal and the maternal line, as we will discuss in another chapter.
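Bernoulli's point can be seen in a small simulation. The sketch below is only an illustration under stated assumptions: the allele frequency of 0.40 is hypothetical, `estimate_frequency` is an invented helper, and sampling is modelled as independent random draws.

```python
import random


def estimate_frequency(true_freq: float, n_sampled: int, rng: random.Random) -> float:
    """Estimate an allele frequency from n_sampled independent random draws."""
    hits = sum(rng.random() < true_freq for _ in range(n_sampled))
    return hits / n_sampled


TRUE_FREQ = 0.40  # hypothetical allele frequency, not a value from the text
rng = random.Random(1713)  # fixed seed so the run is repeatable

for n in (10, 100, 1_000, 10_000, 100_000):
    est = estimate_frequency(TRUE_FREQ, n, rng)
    print(f"n={n:>7}  estimate={est:.4f}  error={abs(est - TRUE_FREQ):.4f}")
```

With few observations the estimate can land far from the true value; with many, the error is expected to shrink roughly in proportion to the inverse square root of the sample size.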
Genetic Distances
It is clear that, in order to contrast populations, we must synthesize a vast amount of genetic information. At first, to measure the "genetic distance" between populations, we simply compared pairs of populations. Only much later, when we had a very large number of genes and some new analytical techniques, were we able to study the differences among many populations, or even within individual populations. For most genes, the frequency differences between populations are nil to very slight and their contribution to the global genetic distance between populations is close to zero.
The RH gene provides interesting genetic distances in Europe, but is less useful elsewhere. For example, the frequency of RH negative individuals is 41.1 percent in England, 41.2 percent in France, 40 percent in the former Yugoslavia, and 37 percent in Bulgaria. These differences are slight, but among the Basques the frequency is 50.4 percent and among the Lapps (more appropriately called the Saami) the frequency is 18.7 percent. For this gene the genetic distance between France and England, calculated simply by taking the difference between the percentages above, is 0.1 percent. The distance between French and Bulgarians (4.2 percent) or between Bulgarians and persons from the former Yugoslavia (3 percent) is greater. But the distance between Basques and English is considerable (9.3 percent) and the difference between Basques and Lapps is dramatic (31.7 percent).
I like to explain the concept of genetic distance in the simple way that I have done above, as a difference between percentage frequencies of the form of a gene. In reality, there are now many methods for calculating genetic distances and all are fairly complicated. When I started this calculation, I asked the advice of my teacher, R. A. Fisher, one of the great geneticists and statisticians, because I could not think of a better consultant. It is pointless to give his formula here, because it is too complex. But it is still essential to average the distance between two populations over many genes if one wants reproducible conclusions. Among other formulas subsequently proposed, one developed by Masatoshi Nei, a famous Japanese-American mathematical geneticist, has become more popular than the Fisher formula I first used. But more than twenty years after he introduced it, Professor Nei is now convinced that Fisher's approach is better than his own for the study of human populations. In any case, most of the formulas currently used to calculate genetic distances provide very similar results overall. In fact, if I find substantial disagreement among results using the various distance measures, I tend to suspect there are other problems with the data, usually that the sample of genes is insufficient.
Once a genetic distance is calculated between populations for each of several genes, we can average all the distance values thus obtained. We thus synthesize the information from all the genes studied.
The more genes we have, the more likely it is that conclusions will be correct. When we have enough genes, we can subdivide them into two or more classes and use each class to test our conclusions, which should, if everything is fine, be independent of the genes employed.
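The simplified distance used in this section, a difference between percentage frequencies averaged over genes, can be written out as a short Python sketch. The RH-negative percentages are the ones quoted in the text; the helper names and the second gene (`made_up_gene`) are invented purely to show the averaging step, and this is not the Fisher or Nei formula.

```python
from itertools import combinations

# RH-negative frequencies (percent) as quoted in the text
rh_negative = {
    "England": 41.1,
    "France": 41.2,
    "former Yugoslavia": 40.0,
    "Bulgaria": 37.0,
    "Basques": 50.4,
    "Saami": 18.7,
}


def simple_distance(freqs: dict, pop_a: str, pop_b: str) -> float:
    """Absolute difference between two percentage frequencies for one gene."""
    return round(abs(freqs[pop_a] - freqs[pop_b]), 1)


def mean_distance(genes: list, pop_a: str, pop_b: str) -> float:
    """Average the single-gene distances over several genes."""
    return sum(simple_distance(g, pop_a, pop_b) for g in genes) / len(genes)


for a, b in combinations(rh_negative, 2):
    print(f"{a} - {b}: {simple_distance(rh_negative, a, b)}")

# Hypothetical second gene, invented only to illustrate averaging over genes
made_up_gene = {"Basques": 30.0, "Saami": 26.0}
print(mean_distance([rh_negative, made_up_gene], "Basques", "Saami"))
```

A single gene with an unusually large difference, such as RH between Basques and Saami, is diluted once it is averaged with genes that differ little, which is why conclusions drawn from many genes tend to be more reproducible.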
Isolation by Geographic Distance
Interesting theories developed by three mathematicians, Sewall Wright in the United States, Gustave Malécot in France, and Motoo Kimura in Japan, led, with minor differences, to the conclusion that the genetic distance between two populations generally increases in direct correlation with the geographic distance separating them. This expectation derives from the observation that while most spouses are selected from within their own village or town, or part of a city, a small proportion are chosen from neighboring ones. This proportion reflects the migration that goes on all the time everywhere because of marriage. In the simplest model, equal numbers of migrants are exchanged between neighboring villages. The first measurements of migration arising from marriage were performed by Jean Sutter and Tran Ngoc Toan, and independently by myself in collaboration with Antonio Moroni and Gianna Zei, using church wedding records, which noted the spouses' birthplaces. They confirmed the tendency of people to find spouses from a short distance away, as expected.
The first verification of the theory that genetic distance increases with geographic distance between populations was provided by Newton Morton, who studied small, homogeneous regions. Menozzi, Piazza, and I extended this analysis to the entire world in our book The History and Geography of Human Genes, from which figure 1 was taken. The increase of genetic distance with geographic distance may be linear at first, but over a greater geographic distance, the increase in genetic distance slows sharply. The two characteristics of the curve (the rate, i.e., the slope, of the initial increase, and the maximal value reached by the genetic distance over a great geographic distance) are different for the various continents. They are greatest for indigenous Americans and Australians, and slightest in Europe, which is the most homogeneous continent.
The maximal genetic distance (in Europe) is three times smaller than on the least homogeneous continents. Despite political fragmentation, migration within Europe has been sufficient to create a greater genetic homogeneity than elsewhere. The curve has not reached a maximum value (and therefore
[Figure 1: six panels (Africa, Asia, Europe, America, Australia, New Guinea), each plotting genetic distance (vertical axis, ticks from 0.02 to 0.10) against geographic distance (horizontal axis, ticks from 1000 to 5000 miles).]
Figure 1. Relationship of geographic distance (in miles, on the horizontal axis) to genetic distance (on a scale between 0 and 1, vertical axis) in the various continents. Genetic distances among pairs of populations were averaged for all available data on 110 genes tested by methods of protein analysis (blood groups, electrophoresis, etc.). Robust averages of genetic distances were calculated for all possible pairs of tribes, towns, or other human communities that share a geographic distance class. (Cavalli-Sforza, Menozzi, and Piazza 1994)
the point of genetic equilibrium) in Asia (and clearly even less in the whole world) in spite of the extensive migrations of the past millennia. Mongols, for example, began important expansions east, south, and west around 300 B.C. The Turkish advance, halted near Vienna in the seventeenth century, was their last exploit.
Figure 1 shows the remarkable precision with which the data support the theory. Naturally, individual pairs of populations would vary substantially from the theoretical curve, but the points in figure 1 are the averages of many population pairs, calculated from over a hundred genes. We observed that it matters little which genes are chosen. Only one genetic system shows a major deviation from the others: the immunoglobulin genes. These genes code for our antibodies, and the greater variation found in them is probably a response to the great geographic variation in the array of infectious diseases we encounter.
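The shape of the curve discussed in this section, a near-linear initial rise that levels off at a maximal value, can be sketched with a simple saturating function. The exponential form below is an assumption chosen for illustration, not the formula of Wright, Malécot, or Kimura, and the parameter values are hypothetical rather than fitted to figure 1.

```python
import math


def expected_genetic_distance(miles: float, d_max: float, scale_miles: float) -> float:
    """Assumed saturating curve: near-linear at first, levelling off at d_max."""
    return d_max * (1.0 - math.exp(-miles / scale_miles))


# Hypothetical parameters, chosen only to illustrate the two characteristics
# of the curve (initial slope and maximal value); they are not fitted values.
curves = {
    "more homogeneous": {"d_max": 0.03, "scale_miles": 1000.0},
    "less homogeneous": {"d_max": 0.09, "scale_miles": 1000.0},
}

for label, p in curves.items():
    initial_slope = p["d_max"] / p["scale_miles"]  # derivative at zero distance
    row = [expected_genetic_distance(x, **p) for x in (500, 1000, 2000, 5000)]
    print(label, f"slope={initial_slope:.2e}", [f"{d:.3f}" for d in row])
```

Under this assumed form the two characteristics named in the text correspond to two quantities: the initial slope `d_max / scale_miles` and the plateau `d_max`. A curve that has not yet flattened within the observed range of distances is one that has not reached equilibrium.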
What Is a Race, Then?
A race is a group of individuals that we can recognize as biologically different from others. To be scientifically "recognized," the differences between a population that we would like to call a race and neighboring populations must be statistically significant according to some defined criteria. The threshold of statistical significance is arbitrary. The probability of reaching significance for a given distance increases steadily with the number of individuals and genes tested. Our experiments have shown that even neighboring populations (villages or towns) can often be quite different from each other. There is a limit to the number of individuals in a given village population who can be tested. But the maximum number of testable genes is so high that we could in principle detect, and prove to be statistically significant, a difference between any two populations however close geographically or genetically. If we look at enough genes, the genetic distance between Ithaca and Albany in New York or Pisa and Florence in Italy is most likely to be significant, and therefore scientifically proven. The inhabitants of Ithaca and Albany might be disappointed to discover that they belong to separate races. People in Pisa and Florence might be pleased that science had validated their ancient mutual distrust by demonstrating their genetic differences. In his Divine Comedy, Dante, a Florentine, expressed his dislike of people from Pisa by wishing that God would move two islands situated at the mouth of the river Arno, thereby flooding Pisa and drowning all its people.
Classifying the world's population into several hundreds of thousands or a million different races would, of course, be completely impractical. But what level of genetic divergence would be necessary to determine boundaries for a definition of racial difference? Because genetic divergence increases in a continuous manner, it is obvious that any definition or threshold would be completely arbitrary.
It has been suggested that one might define race by the analysis of discontinuities in the surface of gene frequencies generated on a geographic map. Introduced by Guido Barbujani and Robert Sokal (1990), the method looks for local increases in the rate of change of gene frequencies per unit of geographic distance. Obstacles to migration or marriage could create these local increases. If proved for many genes, such barriers could help distinguish races. But a true discontinuity is difficult if not impossible to establish for gene frequencies, so they would rather look for regions where gene frequencies change rapidly. The particular rapidity of genetic change that could suffice as a "genetic barrier" would naturally be chosen in an arbitrary manner.
This procedure illustrates the theoretical difficulties classification by race poses. Gene frequencies are not geographic features like altitude or compass direction, which can be measured precisely at any point on the earth's surface; rather, they are properties of a population that occupies an area of finite extent. One possible solution would be to use villages and small cities as "points" in geographic space. Large cities could be subdivided into several points to take account of residential segregation. But the available data on gene frequency in villages or small cities are insufficient and they would provide an extremely detailed clustering.
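The idea of looking for local increases in the rate of change of gene frequencies per unit of geographic distance can be illustrated in one dimension. The sketch below is a deliberate simplification and not Barbujani and Sokal's actual procedure: the transect, the allele frequencies, the function names, and both thresholds are hypothetical.

```python
def rates_of_change(transect):
    """Absolute change in allele frequency per mile between adjacent localities."""
    rates = []
    for (x0, f0), (x1, f1) in zip(transect, transect[1:]):
        rates.append(((x0, x1), abs(f1 - f0) / (x1 - x0)))
    return rates


def candidate_boundaries(transect, threshold):
    """Intervals whose rate of change exceeds an arbitrarily chosen threshold."""
    return [interval for interval, rate in rates_of_change(transect) if rate > threshold]


# Hypothetical transect: (position in miles, frequency of one allele)
transect = [(0, 0.40), (50, 0.41), (100, 0.42), (150, 0.55), (200, 0.56)]

print(candidate_boundaries(transect, threshold=0.001))   # a strict threshold
print(candidate_boundaries(transect, threshold=0.0001))  # a lenient threshold
```

The two calls make the arbitrariness concrete: with the stricter threshold only the steep interval between miles 100 and 150 is flagged, while with the lenient one every interval counts as a "boundary."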
In any case, this method is still useful for identifying the geographic location of genetic "boundaries," however arbitrary these are. In Europe, for example, Barbujani and Sokal found 33 genetic boundaries that corresponded in 22 cases to geographic features (mountains, rivers, seas) and almost always (in 31 cases) to linguistic or dialect boundaries. In a country with a homogeneous language, like Italy, family names provided better results than genes. Because they're inherited, surnames can give almost the same information as genes, but are more informative because surnames are readily available in large numbers.
A more significant difficulty resulting from racial classification is that the barriers found by the method described above have rarely defined a closed space inhabited by a population enclave, even when aided by geographical features such as the Alps. Islands may be the only exceptions. The population of each island could be classified as a race, because it would be different from other islands and the nearest mainland, if there were sufficient genetic information. But would that be useful for practical purposes, like for instance taking a census in the United States? The answer is certainly no. A third problem is that a huge number of genes must be studied to distinguish closely related populations.
Scientific attempts to classify races continued through the end of the nineteenth century. The results often contradicted each other, a good indication of the difficulty of such efforts. Darwin understood that geographic continuity would frustrate any attempt at classifying human races. He noted a phenomenon that repeated itself many times in the course of history: different anthropologists come to completely different tallies of races, from 3 to over 100. But why does this compulsion to classify human races exist? The question is extremely important. Maybe it would be more useful to answer a more general question: why classify?
Why Classify Things?
When we are presented with a great number of things, we feel compelled to impose some order on potential chaos. Such is the goal of classification. It allows us to describe a complex array of objects with simple words or concepts, even at the cost of oversimplification.
Zoologists and botanists have classified thousands or even millions of species, and their work is not close to being finished. If variation were not important and complex, it would not be necessary to categorize at all. One could simply recognize the level of difference relevant to one's needs. Humans are not alone in their tendency to classify. Chimpanzees, for example, and probably most other animals, can separate several hundred leaves and fruits into edible and non-edible categories. Depending on their appetite, other categories may be used, although edibility is fundamental since many plants are potentially toxic. Chimpanzees have even been observed teaching their offspring which foods can be eaten and which cannot.
Unlike animals, humans use language to differentiate between objects. We assign a name to each object we wish to distinguish. African Pygmies recognize hundreds of tree species (Western botanists identify a similar number) and several hundred animals; but such diversity is still too little to require a terribly high order of classification. Classification and some accompanying oversimplification become necessary when variation is very high. Naturalists such as Georges Louis Leclerc Buffon and Carolus Linnaeus established valid systems of classification for the extraordinary diversity of plant and animal species. Similar systems can be found in some so-called "primitive" populations who have an undeveloped (or non-monetary) economy.
Why can classifying human races be useful? Demographers and sociologists undoubtedly have some opinion on the subject. Most practical classifications are extremely simplistic. The U.S. census recognizes Whites, Blacks (African Americans), Native Americans, Asians, and Hispanics. This last category has almost no biological meaning. In practice it refers to Mexicans, but more generally, a large number of Spanish-speaking people are assigned to it. Proposing an improved classification can only end in failure.
Observing the variation between ethnic groups should convince us of that. Visible differences lead us to believe in the existence of "pure" races, but we have seen that these are very narrow, essentially incorrect criteria. And when measured and plotted carefully, visible traits are actually far less discontinuous than is usually believed. Classification based on continental origin could furnish a first approximation of racial division, until we realize that Asia and even Africa and the Americas are very heterogeneous. Even in Europe, where the population is much more homogeneous, several subdivisions have been proposed. But it is immediately clear that all systems lack clear and satisfactory criteria for classifying. The more we pay attention to questions of statistical adequacy, the more hopeless the effort becomes. It is true that strictly inherited characteristics are more satisfactory than anthropometric measurements or observations of colors and morphology. But above all it is true that one encounters near total genetic continuity between all regions while attempting to select even the most homogeneous races. The observation has been made that almost any human group, from a village in the Pyrenees or the Alps, to a Pygmy camp in Africa ...
... Univ. 67: 1-259.
--- (1969) Skull shapes and the map: craniometric analyses in the dispersion of modern Homo. Pap. Peabody Mus. Archaeol. Ethnol. Harv. Univ. 79: 1-189.
Kimura, M. and Weiss, G. H. (1964) The stepping-stone model of population structure and the decrease of genetic correlation with distance. Genetics 49: 561-76.
Kruskal, J. B. (1971) Multi-dimensional scaling in archaeology: time is not the only dimension, in Mathematics in the Archaeological and Historical Sciences, ed. Hodson, F. R., Kendall, D. G. and Tautu, P. (Edinburgh University Press, Edinburgh), pp. 119-32.
Kruskal, J. B., Dyen, I. and Black, P. (1971) The vocabulary and method of reconstructing language trees: innovations and large scale applications, ibid. pp. 361-80.
Le Bras, H. and Todd, E. (1981) L'invention de la France: Atlas Anthropologique et Politique (Livre de Poche, Hachette, Paris).
Li, J., Underhill, P. A., Doctor, V., Davis, R. W., Shen, P., Cavalli-Sforza, L. L. and Oefner, P. (1999) Distribution of haplotypes from a chromosome 21 region distinguishes multiple prehistoric human migrations. Proc. Natl. Acad. Sci. (USA) 96: 3796-800.
Malécot, G. (1948) Les Mathématiques de l'Hérédité (Masson, Paris).
--- (1966) Probabilité et Hérédité (Presses Universitaires de France, Paris).
Mallory, J. P. (1989) In Search of the Indo-Europeans: Language, Archaeology and Myth (Thames and Hudson, London).
Menozzi, P., Piazza, A., and Cavalli-Sforza, L. L. (1978) Synthetic maps of human gene frequencies in Europe. Science 201: 786-92.
Morton, N. E., Yee, S. and Lew, R. (1971) Bioassay of kinship. Biometrics 27(1): 256.
Mountain, J. L. and Cavalli-Sforza, L. L. (1994) Inference of human evolution through cladistic analysis of nuclear DNA restriction polymorphisms. Proc. Natl. Acad. Sci. 91: 6515-19.
Mountain, J. L., Lin, A. A., Bowcock, A. M., and Cavalli-Sforza, L. L. (1992) Evolution of modern humans: Evidence from nuclear DNA polymorphisms. Phil. Trans. R. Soc. Lond. (B) 377: 15~.
Mourant, A. E. (1954) The Distribution of the Human Blood Groups (Blackwell Scientific, Oxford).
Murdock, G. P. (1967) Ethnographic Atlas (University of Pittsburgh Press, Pittsburgh, Pa.).
Nei, M. (1987) Molecular Evolutionary Genetics (Columbia University Press, New York).
Penny, D., Watson, E. E. and Steel, M. A. (1993) Trees from genes and languages are very similar. Syst. Biol. 42: 382-84.
Piazza, A., Minch, E. and Cavalli-Sforza, L. L. Unpublished manuscript on the tree of sixty-three Indo-European languages.
Piazza, A., Rendine, S., Minch, E., Menozzi, P., Mountain, J. and Cavalli-Sforza, L. L. (1995) Genetics and the origin of European languages. Proc. Natl. Acad. Sci. 92: 5836-40.
Poloni, E. S., Excoffier, L., Mountain, J. L., Langaney, A. and Cavalli-Sforza, L. L. (1995) Nuclear DNA polymorphism in a Mandenka population from Senegal: comparison with eight other human populations. Ann. Hum. Genet. 59: 43-61.
Quintana-Murci, L., Semino, O., Bandelt, H-J., Passarino, G., McElreavey, K. and Santachiara-Benerecetti, A. S. (1999) Genetic evidence of an early exit from Africa through eastern Africa. Nat. Genet. 23: 437-441.
Rendine, S., Piazza, A. and Cavalli-Sforza, L. L. (1986) Simulation and Separation by Principal Components of Multiple Demic Expansions in Europe. American Naturalist 128: 681-706.
--- (1989) The origins of Indo-European languages. Sci. Amer. 261(4): 106-14.
Renfrew, C. (1987) Archaeology and Language: The Puzzle of Indo-European Origins (Jonathan Cape, London).
Ruhlen, M. (1987) A Guide to the World's Languages (Stanford University Press, Stanford, Calif.).
--- (1991) Postscript in A Guide to the World's Languages (Stanford University Press, Stanford, Calif.), pp. 379-407.
Saitou, N. and Nei, M. (1987) The neighbour-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4(4): 406-25.
Seielstad, M. T., Minch, E. and Cavalli-Sforza, L. L. (1998) Genetic evidence for a higher female migration rate in humans. Nat. Genet. 20: 278-280.
Semino, O., Passarino, G., Brega, A., Fellows, M. and Santachiara-Benerecetti, A. S. (1996) A view of the Neolithic diffusion in Europe through two Y-chromosome-specific markers. Am. J. Hum. Genet. 59: 964-8.
Sokal, R. R., Harding, R. M. and Oden, N. L. (1989) Spatial patterns of human gene frequencies in Europe. Am. J. Phys. Anthropol. 80: 287-94.
Sokal, R. R. and Michener, C. D. (1958) A statistical method for evaluating systematic relationship. Univ. Kansas Sci. Bull. 38: 1409-38.
Stigler, S. M. (1986) The History of Statistics (Harvard University Press, Cambridge, Mass.).
Sutter, J. (1958) Recherches sur les effets de la consanguinité chez l'homme. Bio. Med. 47:~.
Tobias, P. V. (1978) The Bushmen: San Hunters and Herders of South Africa (Human and Rousseau, Cape Town).
Todd, E. (1990) L'Invention de l'Europe (Editions de Seuil, Paris).
Turner II, C. G. (1989) Teeth and prehistory in Asia. Sci. Amer. 260(2): 88-96.
Underhill, P. A., Jin, L., Lin, A. A., Medhi, S. Q., Jenkins, T., Vollrath, D., Davis, R. W., Cavalli-Sforza, L. L. and Oefner, P. J. (1997) Detection of numerous Y chromosome biallelic polymorphisms by denaturing high-performance liquid chromatography. Genome Res. 7: 998-1005.
Underhill, P. A., Jin, L., Zemans, R., Oefner, P. and Cavalli-Sforza, L. L. (1996) A pre-Columbian Y chromosome-specific transition and its implications for human evolutionary history. Proc. Natl. Acad. Sci. 93: 198-200.
Warnow, T. (1997) Mathematical approaches to comparative linguistics. Proc. Natl. Acad. Sci. 94: 6585-6590.
Wolf, A. P. (1980) Marriage and Adoption in China, 1854-1945 (Stanford University Press, Stanford, Calif.).
Zei, G., Astolfi, P. and Jayakar, S. D. (1981) Correlation between father's age and husband's age: a case of imprinting? J. Biosoc. Sci. 13: 409-18.
Zei, G., Barbujani, G., Lisa, A., Fiorani, O., Menozzi, P., Siri, E. and Cavalli-Sforza, L. L. (1993) Barriers to gene flow estimated by surname distribution in Italy. Ann. Hum. Genet. 57: 123-140.
Zuckerkandl, E. (1965) The evolution of hemoglobin. Sci. Amer. 212: 110-18.
INDEX
ABO blood group system, 16; see also blood groups
absolute genetic dating, 83-85
Academy of Inscriptions (Paris), 166
"Adam," African, 80-82
admixture, genetic, see genetic admixture
adoption studies, 189-90
Aegean Islands, 119-20
Afghanistan, 128, 161
African Americans, see Blacks
Africans, 9, 10, 72, 148; blood groups of, 19; brought to America as slaves, 74; colonization of Asia by, 61-62; eye shape of, 11; genetic admixture and, 75-77, 149; genetic distance between Australian Aborigines and, 64; languages of, 137, 141, 143, 145, 155, 168 (see also specific languages and language families); mtDNA of, 79, 80; multidimensional scaling studies of, 88-89; Paleolithic, 93; sickle cell anemia in, 47, 48; skin color of, 65; Y chromosomes of, 81; see also specific countries and ethnic groups
Afroasiatic languages, 140, 143, 158, 160, 168
agriculture, 53, 54, 93, 157; cultural transmission and, 191; demic diffusion of, 101-13; and forms of marriage, 183; language and, 118-19, 159, 160, 169-70; population expansions and, 95-101, 123, 126-29, 177; social structure and, 182
AIDS epidemic, 95, 179
Alaska, land bridge connecting Siberia and, 38
Albanian language, 163
alleles, 14, 76, 199, 201; forms of, 19; of highly variable genes, 197; HLA, 19, 50-51; mutation separating, 68; number of, and mutation rate, 50; RH, 19, 104; selective advantages or disadvantages of, 45; thalassemia, 51; variations by population in, 15
Altaic languages, 114, 125, 140, 152, 157-58
Amerindian languages, 134, 136-37, 140-42
Ammerman, Albert, 96-97, 101, 102, 107, 159
Anatolia, 118-19; agriculture in, 99
Andaman Islands, 141; inhabitants of, 171
Andean highlands, 97, 99, 126
Anglo-Saxons, 151-52, 202
animals: breeding of, 46-47; cultures of, 173, 174; domestication of, 45-46, 97, 118, 121-25, 127, 161, 170
Anthony, David, 118
anthropology: definitions of culture in, 173; genetics and, 15, 16; mathematical models and, 186
anthropometric characteristi