bioRxiv
New Results

Eight thousand years of natural selection in Europe

Iain Mathieson, Iosif Lazaridis, Nadin Rohland, Swapan Mallick, Nick Patterson, Songül Alpaslan Roodenberg, Eadaoin Harney, Kristin Stewardson, Daniel Fernandes, Mario Novak, Kendra Sirak, Cristina Gamba, Eppie R. Jones, Bastien Llamas, Stanislav Dryomov, Joseph Pickrell, Juan Luís Arsuaga, José María Bermúdez de Castro, Eudald Carbonell, Fokke Gerritsen, Aleksandr Khokhlov, Pavel Kuznetsov, Marina Lozano, Harald Meller, Oleg Mochalov, Vayacheslav Moiseyev, Manuel A. Rojo Guerra, Jacob Roodenberg, Josep Maria Vergès, Johannes Krause, Alan Cooper, Kurt W. Alt, Dorcas Brown, David Anthony, Carles Lalueza-Fox, Wolfgang Haak, Ron Pinhasi, David Reich
doi: https://doi.org/10.1101/016477
Iain Mathieson
1Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
Iosif Lazaridis
1Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
2Broad Institute of MIT and Harvard, Cambridge Massachusetts 02142, USA
Nadin Rohland
1Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
2Broad Institute of MIT and Harvard, Cambridge Massachusetts 02142, USA
Swapan Mallick
1Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
2Broad Institute of MIT and Harvard, Cambridge Massachusetts 02142, USA
3Howard Hughes Medical Institute, Harvard Medical School, Boston, Massachusetts 02115, USA
Nick Patterson
2Broad Institute of MIT and Harvard, Cambridge Massachusetts 02142, USA
Songül Alpaslan Roodenberg
4Independent researcher, Santpoort-Noord, The Netherlands
Eadaoin Harney
1Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
3Howard Hughes Medical Institute, Harvard Medical School, Boston, Massachusetts 02115, USA
Kristin Stewardson
1Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
3Howard Hughes Medical Institute, Harvard Medical School, Boston, Massachusetts 02115, USA
Daniel Fernandes
5School of Archaeology and Earth Institute, Belfield, University College Dublin, Dublin 4, Ireland
Mario Novak
5School of Archaeology and Earth Institute, Belfield, University College Dublin, Dublin 4, Ireland
6Institute for Anthropological Research, Zagreb 10000, Croatia
Kendra Sirak
5School of Archaeology and Earth Institute, Belfield, University College Dublin, Dublin 4, Ireland
7Department of Anthropology, Emory University, Atlanta, Georgia, USA
Cristina Gamba
5School of Archaeology and Earth Institute, Belfield, University College Dublin, Dublin 4, Ireland
8Current address: Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark
9Smurfit Institute of Genetics, Trinity College Dublin, Dublin 2, Ireland
Eppie R. Jones
9Smurfit Institute of Genetics, Trinity College Dublin, Dublin 2, Ireland
Bastien Llamas
10Australian Centre for Ancient DNA, School of Earth and Environmental Sciences & Environment Institute, University of Adelaide, Adelaide, South Australia 5005, Australia
Stanislav Dryomov
11Laboratory of Human Molecular Genetics, Institute of Molecular and Cellular Biology, Siberian Branch of the Russian Academy of Sciences, Novosibirsk 630090, Russia
12Department of Paleolithic Archaeology, Institute of Archaeology and Ethnography, Siberian Branch of the Russian Academy of Sciences, Novosibirsk 630090, Russia
Joseph Pickrell
1Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
13Current Address: New York Genome Center, New York NY, USA
Juan Luís Arsuaga
14Centro Mixto UCM-ISCIII de Evolución y Comportamiento Humanos, Madrid, Spain.
15Departamento de Paleontología, Facultad Ciencias Geológicas, Universidad Complutense de Madrid, Spain.
José María Bermúdez de Castro
16Centro Nacional de Investigacíon sobre Evolución Humana (CENIEH), 09002 Burgos, Spain
Eudald Carbonell
17IPHES. Institut Catalá de Paleoecologia Humana i Evolució Social, Campus Sescelades-URV, 43007. Tarragona, Spain.
18Area de Prehistoria, Universitat Rovira i Virgili (URV), 43002 Tarragona, Spain.
Fokke Gerritsen
19Netherlands Institute in Turkey, Istiklal Caddesi, Nur-i Ziya Sokak 5, Beyoğlu, Istanbul, Turkey
Aleksandr Khokhlov
20Volga State Academy of Social Sciences and Humanities, Samara 443099, Russia
Pavel Kuznetsov
20Volga State Academy of Social Sciences and Humanities, Samara 443099, Russia
Marina Lozano
17IPHES. Institut Catalá de Paleoecologia Humana i Evolució Social, Campus Sescelades-URV, 43007. Tarragona, Spain.
18Area de Prehistoria, Universitat Rovira i Virgili (URV), 43002 Tarragona, Spain.
Harald Meller
21State Office for Heritage Management and Archaeology Saxony-Anhalt and State Museum of Prehistory, D-06114 Halle, Germany
Oleg Mochalov
20Volga State Academy of Social Sciences and Humanities, Samara 443099, Russia
Vayacheslav Moiseyev
22Peter the Great Museum of Anthropology and Ethnography (Kunstkamera) RAS, St Petersburg, 199034, Russia
Manuel A. Rojo Guerra
23Department of Prehistory and Archaeology, University of Valladolid, Spain
Jacob Roodenberg
24The Netherlands Institute for the Near East, Leiden, RA-2300, The Netherlands
Josep Maria Vergès
17IPHES. Institut Catalá de Paleoecologia Humana i Evolució Social, Campus Sescelades-URV, 43007. Tarragona, Spain.
18Area de Prehistoria, Universitat Rovira i Virgili (URV), 43002 Tarragona, Spain.
Johannes Krause
25Max Planck Institute for the Science of Human History, D-07745 Jena, Germany
26Institute for Archaeological Sciences, University of Tübingen, D-72070 Tübingen, Germany
Alan Cooper
10Australian Centre for Ancient DNA, School of Earth and Environmental Sciences & Environment Institute, University of Adelaide, Adelaide, South Australia 5005, Australia
Kurt W. Alt
21State Office for Heritage Management and Archaeology Saxony-Anhalt and State Museum of Prehistory, D-06114 Halle, Germany
27Danube Private University, A-3500 Krems, Austria
28Institute for Prehistory and Archaeological Science, University of Basel, CH-4003 Basel, Switzerland
29Institute of Anthropology, Johannes Gutenberg University of Mainz, D-55128 Mainz, Germany
Dorcas Brown
30Anthropology Department, Hartwick College, Oneonta, New York 13820, USA
David Anthony
30Anthropology Department, Hartwick College, Oneonta, New York 13820, USA
Carles Lalueza-Fox
31Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Barcelona, Spain
Wolfgang Haak
10Australian Centre for Ancient DNA, School of Earth and Environmental Sciences & Environment Institute, University of Adelaide, Adelaide, South Australia 5005, Australia
25Max Planck Institute for the Science of Human History, D-07745 Jena, Germany
Ron Pinhasi
5School of Archaeology and Earth Institute, Belfield, University College Dublin, Dublin 4, Ireland
David Reich
1Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
2Broad Institute of MIT and Harvard, Cambridge Massachusetts 02142, USA
3Howard Hughes Medical Institute, Harvard Medical School, Boston, Massachusetts 02115, USA

Abstract

The arrival of farming in Europe around 8,500 years ago necessitated adaptation to new environments, pathogens, diets, and social organizations. While indirect evidence of adaptation can be detected in patterns of genetic variation in present-day people, ancient DNA makes it possible to witness selection directly by analyzing samples from populations before, during and after adaptation events. Here we report the first genome-wide scan for selection using ancient DNA, capitalizing on the largest genome-wide dataset yet assembled: 230 West Eurasians dating to between 6500 and 1000 BCE, including 163 with newly reported data. The new samples include the first genome-wide data from the Anatolian Neolithic culture, who we show were members of the population that was the source of Europe’s first farmers, and whose genetic material we extracted by focusing on the DNA-rich petrous bone. We identify genome-wide significant signatures of selection at loci associated with diet, pigmentation and immunity, and two independent episodes of selection on height.

Natural selection has left its mark on patterns of variation in our genomes1, but these patterns are echoes of past events, which are difficult to date and interpret, and are often confounded by neutral processes. Ancient DNA provides a more direct view, and should be a transformative technology for studies of selection just as it has transformed studies of history. Until now, however, the large sample sizes required to detect selection have meant that ancient DNA studies have concentrated on characterizing effects at parts of the genome already believed to have been affected by selection2-5.

We assembled genome-wide data from 230 ancient individuals who lived in West Eurasia from 6500 to 1000 BCE (Table 1, Fig. 1a, Supplementary Data Table 1, Supplementary Information section 1). To obtain this dataset, we combined published data from 67 samples from relevant periods and cultures4-6, with 163 samples for which we report new data, of which 83 have never previously been analyzed (the remaining 80 samples include 67 whose targeted single nucleotide polymorphism (SNP) coverage we triple from 390k to 1240k7; and 13 with shotgun data whose data quality we increase using our enrichment strategy3,8). The 163 samples for which we report new data are drawn from 270 distinct individuals who we screened for evidence of authentic DNA7. We used in-solution hybridization with synthesized oligonucleotide probes to enrich promising libraries for more than 1.2 million SNPs (“1240k capture”, Methods). The targeted sites include nearly all SNPs on the Affymetrix Human Origins and Illumina 610-Quad arrays, 49,711 SNPs on chromosome X and 32,681 on chromosome Y, and 47,384 SNPs with evidence of functional importance. We merged libraries from the same individual and filtered out samples with low coverage or evidence of contamination to obtain the final set of individuals. The advantage of 1240k capture is that it gives access to genome-wide data from ancient samples with small fractions of human DNA and increases efficiency by targeting sites in the human genome that will actually be analyzed. The effectiveness of the approach can be seen by comparing our results to the largest previously published ancient DNA study, which used a shotgun sequencing strategy5. Our median coverage on analyzed SNPs is ~4-times higher even while the mean number of reads generated per sample is 36-times lower (Extended Data Fig. 1).

Figure 1: Population relationships of samples.

A: Locations color-coded by date, with a random jitter added for visibility (8 Afanasievo and Andronovo samples lie further east and are not shown). B: Principal component analysis of 777 modern West Eurasian samples (grey), with 221 ancient samples projected onto the first two principal component axes and labeled by culture. Abbreviations: [E/M/L]N Early/Middle/Late Neolithic, LBK Linearbandkeramik, [E/W]HG Eastern/Western hunter-gatherer, [E]BA [Early] Bronze Age, IA Iron Age.

To learn about the history of archaeological cultures for which genome-wide data is reported for the first time here, we studied either 1,055,209 autosomal SNPs when analyzing 230 ancient individuals alone, or 592,169 SNPs when co-analyzing them with 2,345 present-day individuals genotyped on the Human Origins array4. We removed 13 samples either as outliers in ancestry relative to others of the same archaeologically determined culture, or first-degree relatives (Supplementary Data Table 1).

Our sample of 26 Anatolian Neolithic individuals represents the first genome-wide ancient DNA data from the eastern Mediterranean. Our success at analyzing such a large number of samples is likely due to the fact that at the Barcin site–the source of 21 of the working samples–we sampled from the cochlea of the petrous bone9, which has been shown to increase the amount of DNA obtained by up to two orders of magnitude relative to teeth (the next-most-promising tissue)3. Principal component analysis (PCA) and ADMIXTURE10 analysis show that the Anatolian Neolithic samples do not resemble any present-day Near Eastern populations but are shifted towards Europe, clustering with Neolithic European farmers (EEF) from Germany, Hungary, and Spain7 (Fig. 1b, Extended Data Fig. 2). Further evidence that the Anatolian Neolithic and EEF were related comes from the high frequency (47%; n=15) of Y-chromosome haplogroup G2a typical of ancient EEF samples7 (Supplementary Data Table 1), and the low FST (0.005-0.016) between Neolithic Anatolians and EEF (Supplementary Data Table 2). These results support the hypothesis7 of a common ancestral population of EEF prior to their dispersal along distinct inland/central European and coastal/Mediterranean routes. The EEF are slightly more shifted to Europe in the PCA than are the Anatolian Neolithic (Fig. 1b) and have significantly more admixture from Western hunter-gatherers (WHG), shown by f4-statistics (|Z|>6 standard errors from 0) and negative f3-statistics (|Z|>4)11 (Extended Data Table 3). We estimate that the EEF have 7-11% more WHG admixture than their Anatolian relatives (Extended Data Fig. 2, Supplementary Information section 2).
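As an illustration of the kind of f4-statistic test cited above, the following minimal Python sketch computes f4(A, B; C, D) as the mean over SNPs of (a − b)(c − d) and attaches a Z-score from a delete-one-block jackknife. It is a sketch under stated assumptions, not the authors' pipeline: the function names, the equal-sized contiguous blocks and the toy allele frequencies are all hypothetical, and this section does not specify the estimator or blocking scheme actually used.

```python
from math import sqrt


def f4(pa, pb, pc, pd):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    terms = [(a - b) * (c - d) for a, b, c, d in zip(pa, pb, pc, pd)]
    return sum(terms) / len(terms)


def f4_z(pa, pb, pc, pd, n_blocks=3):
    """Z-score of f4 from a delete-one-block jackknife.

    Assumption: SNPs are ordered along the genome and split into
    n_blocks contiguous blocks of (nearly) equal size.
    """
    n = len(pa)
    size = n // n_blocks
    full = f4(pa, pb, pc, pd)
    leave_one_out = []
    for k in range(n_blocks):
        lo = k * size
        hi = n if k == n_blocks - 1 else lo + size
        keep = [i for i in range(n) if i < lo or i >= hi]
        subset = [[pop[i] for i in keep] for pop in (pa, pb, pc, pd)]
        leave_one_out.append(f4(*subset))
    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (e - mean_loo) ** 2 for e in leave_one_out
    )
    return full / sqrt(variance)


# Toy allele frequencies for four hypothetical populations at six SNPs.
pop_a = [0.9, 0.8, 0.7, 0.6, 0.9, 0.8]
pop_b = [0.1, 0.2, 0.1, 0.2, 0.3, 0.1]
pop_c = [0.7, 0.9, 0.6, 0.8, 0.7, 0.9]
pop_d = [0.2, 0.1, 0.3, 0.2, 0.1, 0.2]
print(f4(pop_a, pop_b, pop_c, pop_d), f4_z(pop_a, pop_b, pop_c, pop_d))
```

A statistic that differs from zero by many standard errors (the text reports |Z|>6) indicates that the allele-frequency differences between the two pairs of populations are correlated, which is what admixture would produce.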

Figure 2: Genome-wide scan for selection.

GC-corrected –log10 p-value for each marker. The red dashed line represents a genome-wide significance level of 0.5 × 10−8. Genome-wide significant points filtered because there were fewer than two other genome-wide significant points within 1Mb are shown in grey. Inset: QQ plots for corrected −log10 p-values for different categories of potentially functional SNPs (Methods). Truncated at −log10(p-value)=30. All curves are significantly different from neutral expectation.

The Iberian Chalcolithic individuals from El Mirador cave are genetically similar to the Middle Neolithic Iberians who preceded them (Fig. 1b; Extended Data Fig. 2), and have more WHG ancestry than their Early Neolithic predecessors7 (|Z|>10) (Extended Data Table 3). However, they do not have a significantly different proportion of WHG ancestry (we estimate 23-28%) than the Middle Neolithic Iberians (Extended Data Fig. 2). Chalcolithic Iberians have no evidence of steppe ancestry (Fig. 1b, Extended Data Fig. 2), in contrast to central Europeans of the same period5,7. Thus, the “Ancient North Eurasian”-related ancestry that is ubiquitous across present-day Europe4,7 arrived in Iberia later than in Central Europe (Supplementary Information section 2).

To understand population transformations in the Eurasian steppe, we analyzed a time transect of 37 samples from the Samara region spanning ~5600-300 BCE and including the Eastern Hunter-gatherer (EHG), Eneolithic, Yamnaya, Poltavka, Potapovka and Srubnaya cultures. Admixture between populations of Near Eastern ancestry and the EHG7 began as early as the Eneolithic (5200-4000 BCE), with some individuals resembling EHG and some resembling Yamnaya (Fig. 1b; Extended Data Fig. 2). The Yamnaya from Samara and Kalmykia, the Afanasievo people from the Altai (3300-3000 BCE), and the Poltavka Middle Bronze Age (2900-2200 BCE) population that followed the Yamnaya in Samara, are all genetically homogeneous, forming a tight “Bronze Age steppe” cluster in PCA (Fig. 1b), sharing predominantly R1b Y-chromosomes5,7 (Supplementary Data Table 1), and having 48-58% ancestry from an Armenian-like Near Eastern source (Extended Data Table 3) without additional Anatolian Neolithic or Early European Farmer (EEF) ancestry7 (Extended Data Fig. 2). After the Poltavka period, population change occurred in Samara: the Late Bronze Age Srubnaya have ~17% Anatolian Neolithic or EEF ancestry (Extended Data Fig. 2). Previous work documented that such ancestry appeared east of the Urals beginning at least by the time of the Sintashta culture, and suggested that it reflected an eastward migration from the Corded Ware peoples of central Europe5. However, the fact that the Srubnaya also harbored such ancestry indicates that the Anatolian Neolithic or EEF ancestry could have come into the steppe from a more eastern source. 
Further evidence that migrations originating as far west as central Europe may not have had an important impact on the Late Bronze Age steppe comes from the fact that the Srubnaya possess exclusively (n=6) R1a Y-chromosomes (Extended Data Table 1), and four of them (and one Poltavka male) belonged to haplogroup R1a-Z93 which is common in central/south Asians12, very rare in present-day Europeans, and absent in all ancient central Europeans studied to date.

To study selection, we created a dataset of 1,084,781 autosomal SNPs in 617 samples by merging 213 ancient samples with genome-wide sequencing data from four populations of European ancestry from the 1000 Genomes Project13. Most present-day Europeans can be modeled as a mixture of three ancient populations related to Mesolithic hunter-gatherers (WHG), early farmers (EEF) and steppe pastoralists (Yamnaya)4,7, and so to scan for selection, we divided our samples into three groups based on which of these populations they clustered with most closely (Fig. 1b, Extended Data Table 1). We estimated mixture proportions for the present-day European ancestry populations and tested every SNP to evaluate whether its present-day frequencies were consistent with this model. We corrected for test statistic inflation by applying a genomic control correction analogous to that used to correct for population structure in genome-wide association studies. Of ~1 million non-monomorphic autosomal SNPs, the ~50,000 in the set of potentially functional SNPs were significantly more inconsistent with the model than neutral SNPs (Fig. 2), suggesting pervasive selection. Using a conservative significance threshold of p=5.0 × 10−8, and a genomic control correction of 1.38, we identified 12 loci that contained at least three SNPs achieving genome-wide significance within 1 Mb of the most associated SNP (Fig. 2, Extended Data Table 2, Extended Data Fig. 3, Supplementary Data Table 3).
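The clustering requirement described above can be sketched in a few lines of Python. This is a minimal illustration, assuming each SNP is given as a (chromosome, position, corrected p-value) tuple; the function name, the input layout and the toy values are hypothetical. The sketch applies the per-point rule from the Figure 2 caption (a significant SNP is kept only if at least two other significant SNPs lie within 1 Mb of it), which may differ in detail from the procedure the authors ran.

```python
def filter_clustered_hits(snps, threshold=5.0e-8, window=1_000_000, min_others=2):
    """Keep significant SNPs that have enough significant neighbours.

    snps: list of (chrom, pos, pvalue) tuples (hypothetical layout).
    A SNP is kept if its p-value is below `threshold` and at least
    `min_others` other significant SNPs on the same chromosome lie
    within `window` base pairs of it.
    """
    significant = [s for s in snps if s[2] < threshold]
    kept = []
    for i, (chrom, pos, pval) in enumerate(significant):
        others = sum(
            1
            for j, (chrom2, pos2, _) in enumerate(significant)
            if j != i and chrom2 == chrom and abs(pos2 - pos) <= window
        )
        if others >= min_others:
            kept.append((chrom, pos, pval))
    return kept


# Toy input: three clustered hits on one chromosome, one isolated hit
# on another, and one SNP that does not reach significance.
toy_snps = [
    ("2", 136_000_000, 1e-20),
    ("2", 136_400_000, 3e-12),
    ("2", 136_900_000, 4e-9),
    ("11", 61_000_000, 2e-9),
    ("15", 28_000_000, 1e-3),
]
clustered = filter_clustered_hits(toy_snps)
print(clustered)
```

Requiring several nearby significant SNPs guards against isolated hits caused by genotyping or mapping artifacts, at the cost of missing real signals in regions with sparse SNP coverage.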

Figure 3: Allele frequencies in different populations.

Allele frequencies for five genome-wide significant signals of selection. In each plot, the dots and solid lines show the maximum likelihood frequency estimate and a 1.9-log-likelihood support interval for the derived allele frequency in each ancient population. The four horizontal dashed lines show the allele frequencies in the four modern 1000 Genomes populations. Abbreviations for ancient populations (See Extended Data Table 1); AEN: Anatolian Neolithic; HG: hunter-gatherer; CEM: Central European Early and Middle Neolithic; INC: Iberian Neolithic and Chalcolithic; CLB: Central European Late Neolithic and Bronze Age; STP: Steppe. The Hunter-Gatherer, Early Farmer and Steppe Ancestry classifications correspond approximately to the three populations used in the genome-wide scan with some differences - for example Bell Beakers are included here with CLB but not in the selection scan (See Extended Data Table 1 for details).
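To make the idea of a 1.9-log-likelihood support interval concrete, here is a minimal Python sketch. It makes a strong simplifying assumption that the paper does not: derived and ancestral alleles are treated as directly observed counts with a binomial likelihood, whereas the estimates in the figure are made from low-coverage sequence data. The function names and the grid search are illustrative only. A drop of about 1.9 log-likelihood units roughly corresponds to a 95% interval under the usual chi-square approximation with one degree of freedom.

```python
from math import log


def log_likelihood(freq, derived, total):
    """Binomial log-likelihood (up to a constant) of a derived-allele frequency."""
    freq = min(max(freq, 1e-12), 1.0 - 1e-12)  # avoid log(0) at the boundaries
    return derived * log(freq) + (total - derived) * log(1.0 - freq)


def support_interval(derived, total, drop=1.9, grid_points=10001):
    """Return (mle, lower, upper) for the derived-allele frequency.

    The interval contains every grid value whose log-likelihood is
    within `drop` units of the maximum.
    """
    mle = derived / total
    max_ll = log_likelihood(mle, derived, total)
    inside = [
        i / (grid_points - 1)
        for i in range(grid_points)
        if log_likelihood(i / (grid_points - 1), derived, total) >= max_ll - drop
    ]
    return mle, min(inside), max(inside)


# Toy example: 5 derived alleles out of 10 sampled chromosomes.
mle, lower, upper = support_interval(5, 10)
print(mle, lower, upper)
```

With small samples the interval is wide, which is why the ancient-population estimates in the figure carry visibly larger uncertainty than the modern 1000 Genomes frequencies.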

The strongest signal of selection is at the SNP (rs4988235) responsible for lactase persistence in Europe14. Our data (Fig. 3) strengthens previous reports that an appreciable frequency of lactase persistence in Europe only dates to the last four thousand years3,5,15. The allele’s earliest appearance in our data is in a central European Bell Beaker sample (individual I0112) that lived between approximately 2300 and 2200 BCE. Two other independent signals related to diet are located on chromosome 11 near FADS1 and DHCR7. FADS1 and FADS2 are involved in fatty acid metabolism, and variation at this locus is associated with plasma lipid and fatty acid concentration16. The selected allele of the most significant SNP (rs174546) is associated with decreased triglyceride levels16. Variants at DHCR7 and NADSYN1 are associated with circulating vitamin D levels17 and our most associated SNP, rs7940244, is highly differentiated across closely related Northern European populations18, suggesting selection related to variation in dietary or environmental sources of vitamin D.

Two signals have a potential link to celiac disease. One occurs at the ergothioneine transporter SLC22A4 that is hypothesized to have experienced a selective sweep to protect against ergothioneine deficiency in agricultural diets19. Common variants at this locus are associated with increased risk for ulcerative colitis, celiac disease, and irritable bowel disease and may have hitchhiked to high frequency as a result of this sweep19-21. However the specific variant (rs1050152, L503F) that was thought to be the target did not reach high frequency until relatively recently (Extended Data Fig. 4). The signal at ATXN2/SH2B3–also associated with celiac disease20–shows a similar pattern (Extended Data Fig. 4).

Figure 4: Polygenic selection on height.

A: Estimated genetic heights for studied populations. Boxes show date ranges and .05 to .95 posterior densities for estimated population mean genetic height (Methods). Dots show the maximum likelihood point estimate of the height. Arrows show major population relationships, with dashed lines representing ancestral populations. Two labeled V’s show our hypothesis for two independent selective events. B: Z scores for the polygenic selection scan. The score is positive in each box if the column population is taller than the row population. Abbreviations; AN: Anatolian Neolithic; HG: hunter-gatherer; CEM: Central European Early and Middle Neolithic; INC: Iberian Neolithic and Chalcolithic; CLB: Central European Late Neolithic and Bronze Age; STP: Steppe; CEU: Utah residents with northern and western European ancestry; IBS: Iberian population in Spain.

The second strongest signal in our analysis is at the derived allele of rs16891982 in SLC45A2, which contributes to light skin pigmentation and is almost fixed in present-day Europeans but occurred at much lower frequency in ancient populations. In contrast, the derived allele of SLC24A5 that is the other major determinant of light skin pigmentation in modern Europe appears fixed in the Anatolian Neolithic, suggesting that its rapid increase in frequency to around 0.9 in the Early Neolithic was mostly due to migration (Extended Data Fig. 4). Another pigmentation signal is at GRM5, where SNPs are associated with pigmentation possibly through a regulatory effect on nearby TYR22. We also find evidence of selection for the derived allele of rs12913832 at HERC2/OCA2, which appears to be fixed in Mesolithic hunter-gatherers, and is the primary determinant of blue eye color in present-day Europeans. In contrast to the other loci, the range of frequencies in modern populations is within that of ancient populations (Fig. 3). The frequency increases with higher latitude, suggesting a complex pattern of environmental selection.

The TLR1-TLR6-TLR10 gene cluster is a known target of selection in Europe23, possibly related to resistance to leprosy, tuberculosis or other mycobacteria. There is also a strong signal of selection at the major histocompatibility complex (MHC) on chromosome 6. The strongest signal is at rs2269424 near the genes PPT2 and EGFL8 but there are at least six other apparently independent signals in the MHC (Extended Data Fig. 3); and the entire region is significantly more associated than the genome-wide average (residual inflation of 2.07 in the region on chromosome 6 between 29-34 Mb after genome-wide genomic control correction). This could be the result of multiple sweeps, balancing selection, or background selection in this gene-rich region.

We find a surprise in six Scandinavian hunter-gatherers (SHG) from the Motala site in southern Sweden. In three out of six samples, we observe the haplotype carrying the derived allele of rs3827760 in the EDAR gene (Extended Data Fig. 5), which affects tooth morphology and hair thickness and has been the subject of a selective sweep in East Asia24, and today is at high frequency in East Asians and Native Americans. The EDAR derived allele is largely absent in present-day Europe except in Scandinavia, plausibly due to Siberian movements into the region millennia after the date of the Motala samples. The SHG have no evidence of East Asian ancestry4,7, suggesting that the EDAR derived allele may not have originated in East Asians as previously suggested24. A second surprise is that, unlike closely related western hunter-gatherers, the Motala samples have predominantly derived pigmentation alleles at SLC45A2 and SLC24A5.

We also tested for selection on complex traits. The best-documented example of this process in humans is height, for which the differences between Northern and Southern Europe have been driven by selection25. To test for this signal in our data, we used a statistic that tests whether trait-affecting alleles are both highly correlated and more differentiated, compared to randomly sampled alleles26. We predicted genetic heights for each population and applied the test to all populations together, as well as to pairs of populations (Fig. 4). Using 180 height-associated SNPs27 (restricted to 169 where we successfully targeted at least two chromosomes in each population), we detect a significant signal of directional selection on height (p=0.002). Applying this to pairs of populations allows us to detect two independent signals. First, the Iberian Neolithic and Chalcolithic samples show selection for reduced height relative to both the Anatolian Neolithic (p=0.042) and the Central European Early and Middle Neolithic (p=0.003). Second, we detect a signal for increased height in the steppe populations (p=0.030 relative to the Central European Early and Middle Neolithic). These results suggest that the modern South-North gradient in height across Europe is due to both increased steppe ancestry in northern populations, and selection for decreased height in Early Neolithic migrants to southern Europe. We do not observe any other significant signals of polygenic selection in five other complex traits we tested: body mass index28 (p=0.20), waist-to-hip ratio29 (p=0.51), type 2 diabetes30 (p=0.37), inflammatory bowel disease21 (p=0.17) and lipid levels16 (p=0.50).
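The notion of a predicted genetic height can be illustrated with a short Python sketch. It assumes the common additive form, in which a population's genetic value is twice the sum over SNPs of effect size times allele frequency, reported relative to the mean across populations. That form, the function names, and the toy frequencies and effect sizes are assumptions for illustration; this section does not spell out the exact estimator, and the sketch does not implement the selection test itself.

```python
def genetic_value(freqs, effects):
    """Additive genetic value: 2 * sum(effect_i * frequency_i)."""
    return 2.0 * sum(p * b for p, b in zip(freqs, effects))


def centered_genetic_values(pop_freqs, effects):
    """Genetic value of each population relative to the cross-population mean.

    pop_freqs: dict mapping population label -> list of frequencies of
    the trait-increasing allele, in the same SNP order as `effects`.
    """
    values = {name: genetic_value(f, effects) for name, f in pop_freqs.items()}
    mean = sum(values.values()) / len(values)
    return {name: v - mean for name, v in values.items()}


# Toy data: two hypothetical populations, two SNPs, made-up effect sizes.
toy_effects = [0.1, -0.1]
toy_freqs = {"pop_A": [0.5, 0.5], "pop_B": [0.7, 0.3]}
centered = centered_genetic_values(toy_freqs, toy_effects)
print(centered)
```

Differences between such population-level values are only evidence of selection when they exceed what drift alone would produce, which is what the statistic described in the text is designed to assess.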

Our results show how ancient DNA can be used to perform a genome-wide scan for selection, and demonstrate selection on loci related to pigmentation, diet and immunity, painting a picture of Neolithic populations adapting to settled agricultural life at high latitudes. For most of the signals we detect, allele frequencies of modern Europeans are outside the range of any ancient populations, indicating that phenotypically, Europeans of four thousand years ago were different in important respects from Europeans today despite having overall similar ancestry. An important direction for future research is to increase the sample size for European selection scans (Extended Data Fig. 6), and to apply this approach to regions beyond Europe and to nonhuman species.

Acknowledgments

We thank Paul de Bakker, Joachim Burger, Christos Economou, Elin Fornander, Qiaomei Fu, Fredrik Hallgren, Karola Kirsanow, Alissa Mittnik, Iñigo Olalde, Adam Powell, Pontus Skoglund, Shervin Tabrizi, and Arti Tandon for discussions, suggestions about SNPs to include, or contribution to sample preparation or data curation. We thank Svante Pääbo, Matthias Meyer, Qiaomei Fu, and Birgit Nickel for collaboration in developing the 1240k capture reagent. We thank Julio Manuel Vidal Encinas and María Encina Prada for allowing us to resample La Braña 1, and the 1000 Genomes Project for allowing use of the Phase 3 data. I.M. was supported by the Human Frontier Science Program LT001095/2014-L. C.G. was supported by the Irish Research Council for Humanities and Social Sciences (IRCHSS). A.K., P.K. and O.M. were supported by RFBR No. 15-06-01916 and RFH No. 15-11-63008 and O.M. by a state grant of the Ministry of Education and Science of the Russian Federation #33.1195.2014/k. J.K. was supported by ERC starting grant APGREID and DFG grant KR 4015/1-1. K.W.A. was supported by DFG grant AL 287 / 14-1. W.H. and B.L. were supported by Australian Research Council DP130102158. R.P. was supported by ERC starting grant ADNABIOARC (263441), and an Irish Research Council ERC support grant. D.R. was supported by U.S. National Science Foundation HOMINID grant BCS-1032255, U.S. National Institutes of Health grant GM100233, and the Howard Hughes Medical Institute.

Author Information

The aligned sequences are available through the European Nucleotide Archive under accession number [to be made available on publication]. The Human Origins genotype datasets including ancient individuals can be found at (http://genetics.med.harvard.edu/reichlab/Reich_Lab/Datasets.html). Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests. Readers are welcome to comment on the online version of the paper. Correspondence and requests for materials should be addressed to I.M. (iain mathieson{at}hms.harvard.edu), W.H. (haak{at}shh.mpg.de), R.P. (ron.pinhasi{at}ucd.ie) or D.R. (reich{at}genetics.med.harvard.edu).

Author Contributions

W.H., R.P. and D.R. supervised the study. S.A.R., J.L.A., J.M.B., E.C., F.G., A.K., P.K., M.L., H.M., O.M., V.M., M.A.R., J.R., J.M.V., J.K., A.C., K.W.A., D.B., D.A., C.L., W.H., R.P. and D.R. assembled archaeological material. I.M., I.L., N.R., S.M., N.P., S.D., J.P., W.H. and D.R. analysed genetic data. N.R., E.H., K.S., D.F., M.N., K.S., C.G., E.R.J., B.L., C.L. and W.H. performed wet laboratory ancient DNA work. I.M., I.L. and D.R. wrote the manuscript with help from all co-authors.

Methods

Ancient DNA analysis

We screened 433 next generation sequencing libraries from 270 distinct samples for authentic ancient DNA using previously reported protocols7. All libraries that we included in nuclear genome analysis were treated with uracil-DNA-glycosylase (UDG) to reduce characteristic errors of ancient DNA31.

We performed in-solution enrichment for a targeted set of 1,237,207 SNPs using previously reported protocols4,7,32. The targeted SNP set merges 394,577 SNPs first reported in Ref. 7 (390k capture), and 842,630 SNPs first reported in Ref. 33 (840k capture). For 67 samples for which we newly report data in this study, there was pre-existing 390k capture data7. For these samples, we only performed 840k capture and merged the resulting sequences with previously generated 390k data. For the remaining samples, we pooled the 390k and 840k reagents together to produce a single enrichment reagent. We attempted to sequence each enriched library up to the point where we estimated that it was economically inefficient to sequence further. Specifically, we iteratively sequenced more and more from each sample and only stopped when we estimated that the expected increase in the number of targeted SNPs hit at least once would be less than one for every 100 new read pairs generated. After sequencing, we filtered out samples with <30,000 targeted SNPs covered at least once, with evidence of contamination based on mitochondrial DNA polymorphism32, an appreciable rate of heterozygosity on chromosome X despite being male34, or an atypical ratio of X to Y sequences.

Of the targeted SNPs, 47,384 are “potentially functional” sites chosen as follows (with some overlap): 1,290 SNPs identified as targets of selection in Europeans by the Composite of Multiple Signals (CMS) test1; 21,723 SNPs identified as significant hits by genome-wide association studies, or with known phenotypic effect (GWAS); 1,289 SNPs with extremely differentiated frequencies between HapMap populations35 (HiDiff); 9,116 immunochip SNPs chosen for study of immune phenotypes (Immune); 347 SNPs phenotypically relevant to South America (mostly altitude adaptation SNPs in EGLN1 and EPAS1); 5,387 SNPs which tag HLA haplotypes (HLA); and 13,672 expression quantitative trait loci36 (eQTL).

Population history analysis

We used two datasets for population history analysis. “HO” consists of 592,169 SNPs, taking the intersection of the SNP targets and the Human Origins SNP array4; we used this dataset for co-analysis of present-day and ancient samples. “HOIll” consists of 1,055,209 SNPs that additionally includes sites from the Illumina genotype array37; we used this dataset for analyses only involving the ancient samples.

On the HO dataset, we carried out principal components analysis in smartpca38 using a set of 777 West Eurasian individuals4, and projected the ancient individuals with the option “lsqproject: YES”. We carried out ADMIXTURE analysis on a set of 2,345 present-day individuals and the ancient samples after pruning for LD in PLINK 1.9 (https://www.cog-genomics.org/plink2)39 with parameters “--indep-pairwise 200 25 0.4”. We varied the number of ancestral populations between K=2 and K=20, and used cross-validation (--cv) to identify the value of K=17 to plot in Extended Data Fig. 2f.

We used ADMIXTOOLS11 to compute f-statistics, determining standard errors with a Block Jackknife and default parameters. We used the option “inbreed: YES” when computing f3-statistics of the form f3(Ancient; Ref1, Ref2) as the Ancient samples are represented by randomly sampled alleles rather than by diploid genotypes. For the same reason, we estimated FST genetic distances between populations on the HO dataset with at least two individuals in smartpca also using the “inbreed: YES” option.
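As a rough illustration of what an f4-statistic with a block jackknife standard error involves, the sketch below works on per-SNP allele frequencies. It is not ADMIXTOOLS code: the function names are hypothetical, blocks are assumed to be contiguous and of equal size (ADMIXTOOLS uses a weighted jackknife over genomic blocks), and the “inbreed: YES” correction is omitted.

```python
import math


def f4_per_snp(a, b, c, d):
    """Per-SNP contributions (a - b) * (c - d) to f4(A, B; C, D).

    Each argument is a list of allele frequencies, one entry per SNP.
    """
    return [(w - x) * (y - z) for w, x, y, z in zip(a, b, c, d)]


def block_jackknife_mean(values, n_blocks):
    """Mean of per-SNP values with a delete-one-block jackknife standard error.

    Assumption: equal-sized contiguous blocks; trailing SNPs that do not
    fill a whole block are dropped.
    """
    if n_blocks < 2:
        raise ValueError("need at least two blocks")
    size = len(values) // n_blocks
    if size == 0:
        raise ValueError("more blocks than SNPs")
    used = values[: size * n_blocks]
    total = sum(used)
    estimate = total / len(used)
    # Re-estimate the mean with each block left out in turn.
    leave_one_out = []
    for b in range(n_blocks):
        block_sum = sum(used[b * size:(b + 1) * size])
        leave_one_out.append((total - block_sum) / (len(used) - size))
    centre = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (x - centre) ** 2 for x in leave_one_out
    )
    return estimate, math.sqrt(variance)
```

The ratio of the estimate to its standard error gives the Z-score usually reported for f-statistics.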

We estimated ancestral proportions as in Supplementary Information section 9 of Ref. 7, using a method that fits mixture proportions on a Test population as a mixture of N Reference populations by using f4-statistics of the form f4(Test or Ref, O1; O2, O3) that exploit allele frequency correlations of the Test or Reference populations with triples of Outgroup populations. We used a set of 15 world outgroup populations4,7. In Extended Data Fig. 2, we added WHG and EHG as outgroups for those analyses in which they are not used as reference populations.

We determined sex by examining the ratio of aligned reads to the sex chromosomes40. We assigned Y-chromosome haplogroups to males using version 9.1.129 of the nomenclature of the International Society of Genetic Genealogy (www.isogg.org), restricting analysis using samtools41 to sites with map quality and base quality of at least 30, and excluding 2 bases at the ends of each sequenced fragment.

Genome-wide scan for selection

For most ancient samples, we did not have sufficient coverage to make reliable diploid calls. We therefore used the counts of sequences covering each SNP to compute the likelihood of the allele frequency in each population. Suppose that at a particular site, for each population we have M samples with sequence level data, and N samples for which we had hard genotype calls (Loschbour, Stuttgart and the 1000 Genomes samples). For samples i = 1 … N, with genotype data, we observe X copies of the reference allele out of 2N total chromosomes. For each of samples i = (N + 1) … (N + M), with sequence level data, we observe $R_i$ sequences with the reference allele out of $T_i$ total sequences. Then, dropping the subscript i for brevity, the likelihood of the population reference allele frequency, p, given data $D = \{X, N, R, T\}$ is given by

$$L(p; D) = B(X, 2N, p) \prod_{i=N+1}^{N+M} \left\{ p^2 B(R, T, 1-\varepsilon) + 2p(1-p) B(R, T, 0.5) + (1-p)^2 B(R, T, \varepsilon) \right\}$$

where $B(k, n, p) = \binom{n}{k} p^k (1-p)^{n-k}$ is the binomial probability distribution and ε is a small probability of error, which we set to 0.001. We write ℓ(p; D) for the log-likelihood. To estimate allele frequencies, for example in Fig. 3 or for the polygenic selection test, we maximized this likelihood numerically for each population.
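The allele-frequency likelihood can be sketched in a few lines. The code below is an illustration, not the authors' implementation. It assumes the usual form in which each sequenced individual contributes a mixture over its three possible genotypes, with binomial read counts and error rate ε = 0.001. The function names are hypothetical, and the grid search stands in for whatever numerical optimizer was actually used.

```python
import math

EPS = 0.001  # sequencing error probability stated in the text


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def log_likelihood(p, x, n, reads):
    """Log-likelihood of reference allele frequency p (0 < p < 1).

    x     -- reference allele copies among the n hard-called individuals
    n     -- number of hard-called individuals (2n chromosomes)
    reads -- list of (R, T): reference and total sequence counts for each
             individual with sequence-level data only
    """
    # Genotype term, written in log space to avoid underflow.
    ll = (math.log(math.comb(2 * n, x))
          + x * math.log(p) + (2 * n - x) * math.log(1 - p))
    # Each sequenced individual: sum over hom-ref, het, hom-alt genotypes.
    for r, t in reads:
        site = (p ** 2 * binom_pmf(r, t, 1 - EPS)
                + 2 * p * (1 - p) * binom_pmf(r, t, 0.5)
                + (1 - p) ** 2 * binom_pmf(r, t, EPS))
        ll += math.log(site)
    return ll


def max_likelihood_freq(x, n, reads, grid=1000):
    """Grid-search maximum likelihood estimate on (0, 1)."""
    candidates = [i / grid for i in range(1, grid)]
    return max(candidates, key=lambda p: log_likelihood(p, x, n, reads))
```

For example, `max_likelihood_freq(3, 5, [])` uses genotype data only and recovers the sample frequency of 3 reference alleles out of 10 chromosomes.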

To scan for selection across the genome, we used the following test. Consider a single SNP. Assume that we can model the allele frequencies $p_{mod}$ in A modern populations as a linear combination of allele frequencies in B ancient populations $p_{anc}$. That is, $p_{mod} = C p_{anc}$, where C is an A by B matrix with rows summing to 1. We have data $D_j$ from population j which is some combination of sequence counts and genotypes as described above. Then, writing $\vec{p} = [p_{anc}, p_{mod}] = [p_1, \ldots, p_{A+B}]$, the log-likelihood of the allele frequencies equals the sum of the log-likelihoods for each population.

$$\ell(\vec{p}; D) = \sum_{j=1}^{A+B} \ell(p_j; D_j)$$

To detect deviations in allele frequency from expectation, we test the null hypothesis $H_0$: $p_{mod} = C p_{anc}$ against the alternative $H_1$: $p_{mod}$ unconstrained. We numerically maximize this likelihood in both the constrained and unconstrained model and use the fact that twice the difference in log-likelihood is approximately $\chi^2_A$ distributed to compute a test statistic and p-value.
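The likelihood ratio test can be sketched as follows, under simplifying assumptions that are not in the text: every population is summarized by hard allele counts (k reference alleles out of n chromosomes) rather than read-level likelihoods, the constrained fit uses a simple coordinate ascent, and the χ² tail is evaluated in closed form, which is valid for an even number of modern populations only. All names are hypothetical.

```python
import math


def binom_loglik(p, k, n):
    """Binomial log-likelihood of k reference alleles out of n (constant dropped)."""
    p = min(max(p, 1e-9), 1 - 1e-9)
    return k * math.log(p) + (n - k) * math.log(1 - p)


def constrained_loglik(p_anc, C, anc_counts, mod_counts):
    """Total log-likelihood under H0, where p_mod = C p_anc."""
    p_mod = [sum(c * q for c, q in zip(row, p_anc)) for row in C]
    pairs = list(zip(p_anc, anc_counts)) + list(zip(p_mod, mod_counts))
    return sum(binom_loglik(p, k, n) for p, (k, n) in pairs)


def fit_null(C, anc_counts, mod_counts, sweeps=50):
    """Maximize the H0 likelihood over p_anc by coordinate-wise ternary search."""
    p_anc = [0.5] * len(anc_counts)
    for _ in range(sweeps):
        for b in range(len(p_anc)):
            lo, hi = 0.0, 1.0
            for _ in range(40):  # the objective is concave in each coordinate
                m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
                p_anc[b] = m1
                f1 = constrained_loglik(p_anc, C, anc_counts, mod_counts)
                p_anc[b] = m2
                f2 = constrained_loglik(p_anc, C, anc_counts, mod_counts)
                if f1 < f2:
                    lo = m1
                else:
                    hi = m2
            p_anc[b] = (lo + hi) / 2
    return constrained_loglik(p_anc, C, anc_counts, mod_counts)


def fit_alt(anc_counts, mod_counts):
    """Unconstrained maximum: every population at its own sample frequency."""
    return sum(binom_loglik(k / n, k, n)
               for k, n in list(anc_counts) + list(mod_counts))


def chi2_sf_even(x, df):
    """Upper tail probability of a chi-square variable with even df."""
    if df % 2 != 0:
        raise ValueError("closed form requires even df")
    half, term, total = x / 2, 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def selection_test(C, anc_counts, mod_counts):
    """Return (statistic, p-value); degrees of freedom = number of modern populations."""
    stat = 2 * (fit_alt(anc_counts, mod_counts)
                - fit_null(C, anc_counts, mod_counts))
    stat = max(stat, 0.0)
    return stat, chi2_sf_even(stat, len(mod_counts))
```

The degrees of freedom equal A because the unconstrained model has A + B free frequencies and the constrained model has B.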

We defined the ancient source populations by the “Selection group 1” label in Extended Data Table 1 and Supplementary Table 1 and used the 1000 Genomes CEU, GBR, IBS and TSI as the present-day populations. We removed SNPs that were monomorphic in all four of these modern populations as well as in 1000 Genomes Yoruba (YRI). We do not use FIN as one of the modern populations, because they do not fit this three-population model well. We estimate the proportions of (HG, EF, SA) to be CEU = (0.196, 0.257, 0.547), GBR = (0.362, 0.229, 0.409), IBS = (0, 0.686, 0.314) and TSI = (0, 0.645, 0.355). In practice we found that there was substantial inflation in the test statistic, most likely due to unmodeled ancestry or additional drift. To address this, we applied a genomic control correction42, dividing all the test statistics by a constant, λ, chosen so that the median p-value matched the median of the null $\chi^2_4$ distribution. Excluding sites in the potentially functional set, we estimated λ = 1.38 and used this value as a correction throughout. One limitation of this test is that, although it identifies likely signals of selection, it cannot provide much information about the strength or date of selection. If the ancestral populations in the model are, in fact, close to the real ancestral populations, then any selection must have occurred after the first admixture event (in this case, after 6500 BCE), but if the ancestral populations are mis-specified, even this might not be true.
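A minimal sketch of the genomic control step, assuming four modern populations so that the null distribution is χ² with 4 degrees of freedom. The function names and the split into `stats` and `background_stats` (statistics at sites outside the potentially functional set) are illustrative choices, not the authors' code.

```python
import math
import statistics


def chi2_sf_4(x):
    """Upper tail probability of a chi-square variable with 4 degrees of freedom."""
    return math.exp(-x / 2) * (1 + x / 2)


def chi2_median_4():
    """Median of the chi-square distribution with 4 df, found by bisection."""
    lo, hi = 0.0, 20.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_sf_4(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def genomic_control(stats, background_stats):
    """Estimate lambda from background sites and rescale all statistics.

    Returns (lambda, corrected statistics, corrected p-values).
    """
    lam = statistics.median(background_stats) / chi2_median_4()
    corrected = [s / lam for s in stats]
    return lam, corrected, [chi2_sf_4(s) for s in corrected]
```

After the correction, the median background statistic sits at the median of the null distribution, so its p-value is 0.5.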

To estimate power, we randomly sampled allele counts from the full dataset, restricting to polymorphic sites with a mean frequency across all populations of <0.1. We then simulated what would happen if the allele had been under selection in all of the modern populations by simulating a Wright-Fisher trajectory with selection for 50, 100 or 200 generations, starting at the observed frequency. We took the final frequency from this simulation, sampled observations to replace the actual observations in that population, and counted the proportion of simulations that gave a genome-wide significant result after GC correction (Extended Data Fig. 6a). We resampled sequence counts from the observed distribution for each population to simulate the effect of increasing sample size, assuming that the coverage and distribution of the sequences remained the same (Extended Data Fig. 6b).
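A Wright-Fisher trajectory with selection can be simulated as in the sketch below. The text does not state the population size or the selection model, so the `pop_size` parameter, the genic (additive) selection scheme and the function name are assumptions made for illustration only.

```python
import random


def wright_fisher(p0, s, generations, pop_size, rng):
    """Final allele frequency after a Wright-Fisher trajectory with selection.

    p0          -- starting allele frequency
    s           -- selection coefficient (genic selection, an assumption)
    generations -- number of generations, e.g. 50, 100 or 200
    pop_size    -- diploid population size N (hypothetical parameter)
    rng         -- random.Random instance, for reproducibility
    """
    p = p0
    chromosomes = 2 * pop_size
    for _ in range(generations):
        # Deterministic frequency change due to selection.
        p_selected = p * (1 + s) / (1 + s * p)
        # Binomial sampling of the next generation (genetic drift).
        count = sum(rng.random() < p_selected for _ in range(chromosomes))
        p = count / chromosomes
        if p in (0.0, 1.0):  # allele lost or fixed
            break
    return p
```

In a power simulation of the kind described, the returned frequency would be used to redraw the observations for each modern population before rerunning the test.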

We investigated how the genomic control correction responded when we simulated small amounts of admixture from a highly diverged population (Yoruba; 1000 Genomes YRI) into a randomly chosen modern population. The genomic inflation factor increases from around 1.38 to around 1.51 with 10% admixture, but there is little reduction in power (Extended Data Fig. 6c). Finally, we investigated how robust the test was to misspecification of the mixture matrix C. We reran the power simulations using a matrix C′ = pC + (1 – p)R for p ∈ [0,1] where R was a random matrix chosen so that for each modern population, the mixture proportions of the three ancient populations were jointly normally distributed on [0,1]. Increasing the weight on the random matrix R (that is, decreasing p) increases the genomic inflation factor and reduces power, demonstrating the advantage of explicitly modeling the ancestries of the modern populations (Extended Data Fig. 6d).

Test for polygenic selection

We implemented the test for polygenic selection described by Ref. 26. This evaluates whether trait-associated alleles, weighted by their effect size, are over-dispersed compared to randomly sampled alleles, in the directions associated with the effects measured by genome-wide association studies (GWAS). For each trait, we obtained a list of significant SNP associations and effect estimates from GWAS data, and then applied the test both to all populations combined and to selected pairs of populations. We restricted the list of GWAS associations to 169 SNPs where we observed at least two chromosomes in all tested populations (selection population 2). We estimated frequencies in each population by computing the MLE, using the likelihood described above. For each test, we sampled SNPs frequency matched in 20 bins, computed the test statistic $Q_X$ and, for ease of comparison, converted these to Z scores, signed according to the direction of the genetic effects. Theoretically $Q_X$ has a $\chi^2$ distribution but in practice, it is over-dispersed. Therefore, we report bootstrap p-values computed by sampling 10,000 sets of frequency matched SNPs.
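The frequency-matched bootstrap can be sketched as below. The $Q_X$ statistic itself is not implemented here. The code only shows how matched SNP sets might be drawn from 20 equal-width frequency bins and how an empirical p-value follows from the resulting null statistics. Which frequency is used for matching is not stated in the text, and all function names are hypothetical.

```python
import random


def freq_bin(freq, n_bins=20):
    """Index of the equal-width frequency bin containing freq (0 <= freq <= 1)."""
    return min(int(freq * n_bins), n_bins - 1)


def sample_matched(trait_freqs, background_freqs, rng, n_bins=20):
    """Draw one background SNP index per trait SNP from the same frequency bin.

    Raises KeyError if a trait SNP falls in a bin with no background SNPs.
    """
    by_bin = {}
    for idx, freq in enumerate(background_freqs):
        by_bin.setdefault(freq_bin(freq, n_bins), []).append(idx)
    return [rng.choice(by_bin[freq_bin(freq, n_bins)]) for freq in trait_freqs]


def bootstrap_p_value(observed, null_stats):
    """Fraction of matched-set statistics at least as large as the observed one."""
    return sum(stat >= observed for stat in null_stats) / len(null_stats)
```

In use, `sample_matched` would be called 10,000 times, the test statistic recomputed on each matched set, and the results passed to `bootstrap_p_value` together with the observed statistic.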

To estimate population-level genetic height in Fig. 4A, we assumed a uniform prior on [0,1] for the distribution of all height-associated alleles, and then sampled from the posterior joint frequency distribution of the alleles, assuming they were independent, using a Metropolis-Hastings sampler with a N(0,0.001) proposal density. We then multiplied the sampled allele frequencies by the effect sizes to get a distribution of genetic height.
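The sampling scheme can be illustrated with a small Metropolis-Hastings sketch. Several simplifications are assumptions rather than details from the text: each SNP is summarized by hard allele counts (k out of n) instead of the read-based likelihood, the 0.001 in N(0, 0.001) is read as a variance, and the function names are hypothetical. With a uniform prior the posterior is proportional to the likelihood on (0, 1).

```python
import math
import random


def log_posterior(p, k, n):
    """Log posterior of frequency p under a uniform prior and binomial counts."""
    if not 0.0 < p < 1.0:
        return -math.inf
    return k * math.log(p) + (n - k) * math.log(1 - p)


def mh_sample(k, n, steps, rng, proposal_sd=math.sqrt(0.001), start=0.5):
    """Random-walk Metropolis-Hastings draws of one allele frequency."""
    p, lp = start, log_posterior(start, k, n)
    draws = []
    for _ in range(steps):
        q = p + rng.gauss(0.0, proposal_sd)
        lq = log_posterior(q, k, n)
        # Symmetric proposal: accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lq - lp)):
            p, lp = q, lq
        draws.append(p)
    return draws


def genetic_score_draws(counts, effects, steps, burn_in, rng):
    """Posterior draws of sum(effect * frequency), treating SNPs as independent."""
    chains = [mh_sample(k, n, steps, rng)[burn_in:] for k, n in counts]
    return [sum(beta * chain[t] for beta, chain in zip(effects, chains))
            for t in range(steps - burn_in)]
```

The spread of the returned draws gives the uncertainty in the population-level score that comes from allele frequency estimation alone.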

References

  1. Grossman, S. R. et al. Identifying recent adaptations in large-scale genomic data. Cell 152, 703–713 (2013).
  2. Wilde, S. et al. Direct evidence for positive selection of skin, hair, and eye pigmentation in Europeans during the last 5,000 y. Proc. Natl. Acad. Sci. U. S. A. 111, 4832–4837 (2014).
  3. Gamba, C. et al. Genome flux and stasis in a five millennium transect of European prehistory. Nat. Commun. 5, 5257 (2014).
  4. Lazaridis, I. et al. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513, 409–413 (2014).
  5. Allentoft, M. E. et al. Population genomics of Bronze Age Eurasia. Nature 522, 167–172 (2015).
  6. Keller, A. et al. New insights into the Tyrolean Iceman’s origin and phenotype as inferred by whole-genome sequencing. Nat. Commun. 3, 698 (2012).
  7. Haak, W. et al. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522, 207–211 (2015).
  8. Olalde, I. et al. Derived immune and ancestral pigmentation alleles in a 7,000-year-old Mesolithic European. Nature 507, 225–228 (2014).
  9. Pinhasi, R. et al. Optimal Ancient DNA Yields from the Inner Ear Part of the Human Petrous Bone. PLoS One 10, e0129102 (2015).
  10. Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
  11. Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012).
  12. Underhill, P. A. et al. The phylogenetic and geographic structure of Y-chromosome haplogroup R1a. Eur. J. Hum. Genet. 23, 124–131 (2015).
  13. The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature In press (2015).
  14. Enattah, N. S. et al. Identification of a variant associated with adult-type hypolactasia. Nat. Genet. 30, 233–237 (2002).
  15. Burger, J., Kirchner, M., Bramanti, B., Haak, W. & Thomas, M. G. Absence of the lactase-persistence-associated allele in early Neolithic Europeans. Proc. Natl. Acad. Sci. U. S. A. 104, 3736–3741 (2007).
  16. Teslovich, T. M. et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707–713 (2010).
  17. Wang, T. J. et al. Common genetic determinants of vitamin D insufficiency: a genome-wide association study. Lancet 376, 180–188 (2010).
  18. Price, A. L. et al. The impact of divergence time on the nature of population structure: an example from Iceland. PLoS Genet. 5, e1000505 (2009).
  19. Huff, C. D. et al. Crohn’s disease and genetic hitchhiking at IBD5. Mol. Biol. Evol. 29, 101–111 (2012).
  20. Hunt, K. A. et al. Newly identified genetic risk variants for celiac disease related to the immune response. Nat. Genet. 40, 395–402 (2008).
  21. Jostins, L. et al. Host-microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature 491, 119–124 (2012).
  22. Beleza, S. et al. Genetic architecture of skin and eye color in an African-European admixed population. PLoS Genet. 9, e1003372 (2013).
  23. Barreiro, L. B. et al. Evolutionary dynamics of human Toll-like receptors and their different contributions to host defense. PLoS Genet. 5, e1000562 (2009).
  24. Kamberov, Y. G. et al. Modeling recent human evolution in mice by expression of a selected EDAR variant. Cell 152, 691–702 (2013).
  25. Turchin, M. C. et al. Evidence of widespread selection on standing variation in Europe at height-associated SNPs. Nat. Genet. 44, 1015–1019 (2012).
  26. Berg, J. J. & Coop, G. A population genetic signal of polygenic adaptation. PLoS Genet. 10, e1004412 (2014).
  27. Lango Allen, H. et al. Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467, 832–838 (2010).
  28. Speliotes, E. K. et al. Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat. Genet. 42, 937–948 (2010).
  29. Heid, I. M. et al. Meta-analysis identifies 13 new loci associated with waist-hip ratio and reveals sexual dimorphism in the genetic basis of fat distribution. Nat. Genet. 42, 949–960 (2010).
  30. Morris, A. P. et al. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat. Genet. 44, 981–990 (2012).
  31. Briggs, A. W. et al. Removal of deaminated cytosines and detection of in vivo methylation in ancient DNA. Nucleic Acids Res. 38, e87 (2010).
  32. Fu, Q. et al. DNA analysis of an early modern human from Tianyuan Cave, China. Proc. Natl. Acad. Sci. U. S. A. 110, 2223–2227 (2013).
  33. Fu, Q. et al. An early modern human from Romania with a recent Neanderthal ancestor. Nature (2015).
  34. Korneliussen, T. S., Albrechtsen, A. & Nielsen, R. ANGSD: Analysis of Next Generation Sequencing Data. BMC Bioinformatics 15, 356 (2014).
  35. International HapMap Consortium. A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851–861 (2007).
  36. Lappalainen, T. et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013).
  37. Li, J. Z. et al. Worldwide human relationships inferred from genome-wide patterns of variation. Science 319, 1100–1104 (2008).
  38. Loh, P. R. et al. Inferring admixture histories of human populations using linkage disequilibrium. Genetics 193, 1233–1254 (2013).
  39. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4 (2015).
  40. Skoglund, P., Storå, J., Götherström, A. & Jakobsson, M. Accurate sex identification of ancient human remains using DNA shotgun sequencing. J. Archaeol. Sci. 40, 4477–4482 (2013).
  41. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
  42. Devlin, B. & Roeder, K. Genomic control for association studies. Biometrics 55, 997–1004 (1999).
  43. Norton, H. L. et al. Genetic evidence for the convergent evolution of light skin in Europeans and East Asians. Mol. Biol. Evol. 24, 710–722 (2007).
  44. Bokor, S. et al. Single nucleotide polymorphisms in the FADS gene cluster are associated with delta-5 and delta-6 desaturase activities estimated by serum fatty acid ratios. J. Lipid Res. 51, 2325–2333 (2010).
  45. Tanaka, T. et al. Genome-wide association study of plasma polyunsaturated fatty acids in the InCHIANTI Study. PLoS Genet. 5, e1000338 (2009).
  46. Uciechowski, P. et al. Susceptibility to tuberculosis is associated with TLR1 polymorphisms resulting in a lack of TLR1 cell surface expression. J. Leukoc. Biol. 90, 377–388 (2011).
  47. Wong, S. H. et al. Leprosy and the adaptation of human toll-like receptor 1. PLoS Pathog. 6, e1000979 (2010).
  48. Ahn, J. et al. Genome-wide association study of circulating vitamin D levels. Hum. Mol. Genet. 19, 2739–2745 (2010).
  49. Grundemann, D. et al. Discovery of the ergothioneine transporter. Proc. Natl. Acad. Sci. U. S. A. 102, 5256–5261 (2005).
  50. Chauhan, S. et al. ZKSCAN3 is a master transcriptional repressor of autophagy. Mol. Cell 50, 16–28 (2013).
  51. Soler Artigas, M. et al. Genome-wide association and large-scale follow up identifies 16 new loci influencing lung function. Nat. Genet. 43, 1082–1090 (2011).
  52. Sturm, R. A. et al. A single SNP in an evolutionary conserved region within intron 86 of the HERC2 gene determines human blue-brown eye color. Am. J. Hum. Genet. 82, 424–431 (2008).
  53. Eiberg, H. et al. Blue eye color in humans may be caused by a perfectly associated founder mutation in a regulatory element located within the HERC2 gene inhibiting OCA2 expression. Hum. Genet. 123, 177–187 (2008).
  54. Pruim, R. J. et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics 26, 2336–2337 (2010).

Posted October 10, 2015.

Subject Area

  • Genetics