Abstract
The relative contributions of genetics and environment to temporal and geographic variation in human height remain largely unknown. Ancient DNA has identified changes in genetic ancestry over time, but it is not clear whether those changes in ancestry are associated with changes in height. Here, we directly test whether changes over the past 38,000 years in European height predicted using DNA from 1071 ancient individuals are consistent with changes observed in 1159 skeletal remains from comparable populations. We show that the observed decrease in height between the Early Upper Paleolithic and the Mesolithic is qualitatively predicted by genetics. Similarly, both skeletal and genetic height remained constant between the Mesolithic and Neolithic and increased between the Neolithic and Bronze Age. Sitting height changes much less than standing height–consistent with genetic predictions–although genetics predicts a small Bronze Age increase that is not observed in skeletal remains. Geographic variation in stature is also qualitatively consistent with genetic predictions, particularly with respect to latitude. We find that the changes in genetic height between the Neolithic and Bronze Age may be driven by polygenic adaptation. Finally, we hypothesize that an observed decrease in genetic heel bone mineral density in the Neolithic reflects adaptation to the decreased mobility indicated by decreased femoral bending strength. This study provides a model for interpreting phenotypic changes predicted from ancient DNA and demonstrates how they can be combined with phenotypic measurements to understand the relative contribution of genetic and developmentally plastic responses to environmental change.
Introduction
Stature, or standing height, is one of the most heavily studied human phenotypes. It is easy to measure in living individuals and relatively straightforward to estimate from skeletal remains. As a consequence, geographic variation and temporal changes in stature are well documented (1-3), particularly in western Europe, where there is a comprehensive record of prehistoric changes (4). The earliest anatomically modern humans in Europe, present by 42-45,000 BP (5, 6), were relatively tall (mean adult male height in the Early Upper Paleolithic was ∼174 cm). Mean male stature then declined from the Paleolithic to the Mesolithic (∼164 cm) before increasing to ∼167 cm by the Bronze Age (4, 7). Subsequent changes, including the 20th century secular trend, increased height to ∼170-180 cm (1, 4). It is broadly agreed that these changes are likely to have been driven by a combination of environmental (e.g. climate or diet) and genetic factors (4, 7-9), although the effects of these two variables cannot be separated based on skeletal data alone. In this study, by combining the results of genome-wide association studies (GWAS) with ancient DNA, we directly estimate the genetic component of stature and test whether population-level skeletal changes between ∼35,000 and 1,000 BP are consistent with those predicted by genetics.
Height is highly heritable (10-14), and therefore amenable to genetic analysis by genome-wide association studies (GWAS). With sample sizes of hundreds of thousands of individuals, GWAS have identified thousands of genomic variants that are significantly associated with the phenotype (15-17). Though the individual effect of each of these variants is tiny (on the order of +/- 1-2mm per variant (18)), their combination can be highly predictive. Polygenic risk scores (PRS) constructed by summing together the effects of all height-associated variants carried by an individual can now explain upwards of 30% of the phenotypic variance in populations of European ancestry (16). In effect, the PRS can be thought of as an estimate of “genetic height” that predicts phenotypic height, at least in populations closely related to those in which the GWAS was performed. One major caveat is that the predictive power of PRS is much lower in other populations (19). The extent to which differences in PRS between populations are predictive of population-level differences in phenotype is currently unclear (20). Recent studies have demonstrated that such differences may partly be artifacts of correlation between environmental and genetic structure in the original GWAS (21, 22). These studies also suggested best practices for PRS comparisons, including the use of GWAS summary statistics from large homogenous studies (instead of meta-analyses), and replication of results using summary statistics derived from within-family analyses that are robust to population stratification.
Bearing these caveats in mind, PRS can be applied to ancient populations thanks to recent technological developments that have dramatically increased ancient DNA (aDNA) sample sizes. These have provided remarkable insights into the demographic and evolutionary history of both modern and archaic humans across the world (23-25), particularly in Europe, and allow us to track the evolution of variants underlying phenotypes ranging from pigmentation to diet (26-29). In principle, PRS applied to ancient populations could similarly allow us to make inference about the evolution of complex traits. A few studies have used PRS to make predictions about the relative statures of ancient populations (29-31) but looked at only a few hundred samples in total and did not compare their predictions with stature measured from skeletons. Here, we compare measured skeletal data to genetic predictions and directly investigate the genetic contribution to height independent of environmental effects acting during development.
Results
PRS and skeletal measurements
We collected published aDNA data from 1071 ancient individuals from Western Eurasia (west of 50° E), dated to between 38,000 and 1100 years before present (BP) (27, 29, 30, 32-57). Using GWAS summary statistics for height from the UK Biobank (generated and made available by the Neale lab: http://nealelab.is/), we computed height PRS for each individual, using a P-value cutoff of 10⁻⁶, clumping variants in 250kb windows, and replacing missing genotypes with the mean across individuals (Methods). We refer to this as PRS(GWAS). Because of concerns about GWAS effect sizes being inflated by residual population stratification, we also computed a PRS where we used GWAS P-values to select SNPs, but computed the PRS using effect sizes estimated from a within-family test for ∼17,000 sibling pairs from UK Biobank (Methods) which we refer to as PRS(GWAS/Sibs), and which should be unaffected by stratification. We also obtained stature estimates from 1159 individuals dating to between 33,700 and 1100 BP taken from a larger dataset of 2177 individuals with stature and body proportion estimates from substantially complete skeletons (4, 58). There is limited overlap in these datasets (12 individuals), but they cover the same time periods and broadly the same geographic locations (Supplementary Fig. 1), although the genetic data contain more individuals from further east (30-50° E) compared to the skeletal data. We divided these individuals into five groups based on date: Early Upper Paleolithic (>25,000 BP; EUP), Late Upper Paleolithic (25,000-11,000 BP; LUP), Mesolithic (11,000-5500 BP), Neolithic (8500-3900 BP) and post-Neolithic (5000-1100 BP, including the Copper and Bronze Ages, plus later periods). These groups broadly correspond to transitions in both archaeological culture and genetic ancestry (33, 38, 59), and we resolved individuals in the overlapping periods using either archaeological or genetic context (Methods).
Trends in PRS for height are largely consistent with trends in skeletal stature
We found a significant effect of group (time period) on mean PRS(GWAS) (ANOVA P=1.9×10⁻⁹), PRS(GWAS/Sibs) (P=0.045) and skeletal stature (P=2.8×10⁻¹¹). There was no evidence of difference between LUP, Mesolithic and Neolithic groups (Supplementary Fig. 2a-b), so we merged these three groups (we refer to the merged group as LUP-Neolithic). We find that PRS(GWAS) in the LUP-Neolithic period is 0.47 standard deviations (SD) lower than in the EUP (P=0.002), and 0.40 SD lower (P=8.7×10⁻¹¹) than in the post-Neolithic period (Fig. 1a). PRS(GWAS/Sibs) shows a very similar pattern (Fig. 1b), demonstrating that this is not a result of differential relatedness of the ancient individuals to the structured present-day GWAS populations. Skeletal stature shows a qualitatively similar pattern to the genetic predictions, with a 1.5 SD (9.6cm; P=2.9×10⁻⁷) difference between EUP and LUP-Neolithic and a 0.27 SD (1.8cm; P=3.6×10⁻⁵) difference between LUP-Neolithic and post-Neolithic. Broad patterns of change in stature over time are therefore consistent with genetic predictions.
Additionally, we fit a piecewise linear model allowing PRS to decrease from the EUP to the Neolithic and then increase and change slope in the post-Neolithic (Fig. 1d-f). In this model, PRS(GWAS) decreases by about 1.8×10⁻⁵ SD/year (P=0.014) from EUP to Neolithic, and increases by 2.0×10⁻⁴ SD/year (P=0.001) post-Neolithic (Fig. 1d). PRS(GWAS/Sibs) decreases by about 1.6×10⁻⁵ SD/year (P=0.037) from EUP to Neolithic, then increases by 1.6×10⁻⁴ SD/year throughout the period (P=0.011; Fig. 1e). Again, these changes are qualitatively consistent with changes in stature (Fig. 1f), with a 4.7×10⁻⁵ SD/year (3.3×10⁻⁴ cm/year; P=2.4×10⁻⁸) decrease from EUP to Mesolithic, and an increase of ∼0.5 SD into the Neolithic. However, in this model stature, unlike PRS, actually decreases during the post-Neolithic period (7.5×10⁻⁴ cm/year; P=2.0×10⁻⁴).
To further explore these trends, we fitted a broader range of piecewise linear models to both datasets (Methods; Supplementary Table 1; Supplementary Fig. 3-5). In the most general model we allowed both the mean and the slope of PRS or stature with respect to time to vary between groups. More constrained models fix some of these parameters to zero–eliminating change over time–or merge two adjacent groups. We compared the fit of these nested models using Akaike’s Information Criterion (AIC, Supplementary Table 1). The linear model in Fig. 1d-f is one of the best models in this analysis. In general, all the best-fitting models support the pattern–for both PRS and measured stature–of a decrease between the EUP and Mesolithic and an increase between the Neolithic and post-Neolithic (Supplementary Fig. 3-5). Some models suggest that the increase in stature–but not PRS–may have started during the Neolithic (Supplementary Fig. 5a-c). Finally, we confirmed that these results were robust to different constructions of the PRS–using 100kb and 500kb clustering windows rather than 250kb (Supplementary Fig. 6-7).
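As an illustration of how nested models of this kind can be ranked, the sketch below compares a constant-mean model with a single-slope model using AIC. It assumes Gaussian residuals, so that AIC can be written (up to an additive constant) as n·ln(RSS/n) + 2k; the function names and the toy data are hypothetical and are not taken from our analysis code.

```python
import math

def aic_gaussian(rss, n, k):
    """AIC up to an additive constant for a least-squares fit.

    Assumes independent Gaussian residuals; k counts the mean
    parameters only (the shared variance term cancels in comparisons).
    """
    return n * math.log(rss / n) + 2 * k

def rss_constant(y):
    """Residual sum of squares for a model with a single mean (k=1)."""
    m = sum(y) / len(y)
    return sum((v - m) ** 2 for v in y)

def rss_line(x, y):
    """Residual sum of squares for intercept plus slope (k=2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

# Toy data with an obvious trend (hypothetical values).
x = [0, 1, 2, 3, 4, 5]
y = [0.1, 0.9, 2.1, 2.9, 4.2, 4.8]
aic_flat = aic_gaussian(rss_constant(y), len(y), k=1)
aic_slope = aic_gaussian(rss_line(x, y), len(y), k=2)
```

The model with the lower AIC is preferred; the 2k term penalizes the extra slope parameter, so the more general model wins only if it reduces the residual sum of squares enough.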
Sitting height PRS is partially consistent with trends in body proportions
Standing height is made up of two components: leg length and sitting height (made up of the length of the trunk, neck and head), with a partially overlapping genetic basis (60). Throughout European prehistory, changes in leg length tended to be larger than changes in sitting height (4). We constructed PRS(GWAS) and PRS(GWAS/Sibs) for sitting height and analyzed them in the same way as standing height (Fig. 2). In contrast to standing height, we find no evidence of change between the EUP and Neolithic. Both PRS(GWAS) and PRS(GWAS/Sibs) do increase, either between the Neolithic and post-Neolithic, or during the post-Neolithic period (Fig. 2a,b,d & e). On the other hand, using only skeletons with complete torsos to estimate sitting height, we find no evidence of change in any period. Thus, the skeletal data are consistent with the genetic data for the EUP-Neolithic period, but inconsistent in the post-Neolithic period, where PRS predicts an increase that is not reflected in the skeletons. This could be because of more limited skeletal measurements (only 236 out of 1159 skeletons are sufficiently complete to estimate sitting height directly), because the change in PRS is artefactual, or because it is being buffered by non-genetic effects or by opposing genetic effects which we do not capture. Overall, we find mixed consistency between PRS and skeletal measurements (Fig. 3). The decrease in standing but not sitting height between the EUP and Neolithic is consistent in both, as is the increase in standing height between the Neolithic and post-Neolithic. However, PRS predicts a continued increase in stature through the post-Neolithic period that is not seen in skeletal remains.
Geographic variation in standing height
As well as varying through time, human stature is stratified by geography, with trends related to both longitude and latitude (61). North-South trends following Allen’s (62) and Bergmann’s (63) rules are most often interpreted as environmental adaptations to the polar-equatorial climate gradient. Today, Northern Europeans are generally taller than Southern Europeans (1), a pattern which emerged between the Mesolithic and post-Neolithic (4, 7). Longitudinal variation within Europe is present during the Mesolithic (64), though these trends are difficult to interpret due to sampling bias across the time period (4). We therefore tested whether geographic variation in PRS could explain these geographic trends, as it partially explains temporal trends.
We regressed the residuals from our fitted linear height model (the model shown in Fig. 1d-f) on longitude and latitude. Stature increases significantly with latitude (P=1.2×10⁻¹⁰) in the post-Neolithic period. PRS(GWAS) also increases with latitude in the post-Neolithic (P=0.006), although this is not replicated by PRS(GWAS/Sibs) (P=0.557). PRS does not increase significantly with latitude in the EUP-Neolithic period. There is some evidence of a modest trend in stature in the EUP-Neolithic period (Fig. 4c). However, there is only evidence for this in the Neolithic, not in the EUP-Mesolithic (Supplementary Fig. 8a). Further, because time and geography are correlated in our Neolithic sample, this can also be explained by a temporal increase during the Neolithic, in which case there is no geographic trend (Supplementary Fig. 8b).
In contrast to latitude, there is a significant increasing trend of stature with longitude before but not during the Neolithic (0.36 cm/degree; P=1.6×10⁻⁷; Fig. 4, Supplementary Fig. 8c). This may be partly driven by a small number of samples from a single site, but still persists if these samples are removed (0.20 standardized residuals per degree, P=0.004; Supplementary Fig. 8d). There is little or no trend (0.06 cm/degree; P=0.047) in the post-Neolithic period (Fig. 4f). We find no evidence for longitudinal clines in PRS. In summary, we find that stature increases with latitude in the post-Neolithic, possibly in the Neolithic, but not before. This cline may have a genetic basis. Stature also increases with longitude, particularly in the Mesolithic, but this cline is not predicted by genetics.
Correlated changes in bone density PRS and femoral bending strength
Beyond stature, we wanted to investigate the utility of using PRS to interpret other measurable phenotypes in ancient individuals. Decreased mobility through time, associated with large-scale lifestyle transitions between hunting-gathering, agriculture, and ultimately modern industrialism, is well documented through declines in lower limb bone diaphyseal strength and trabecular density (4, 65, 66). Today, heel bone mineral density (hBMD) is often used as an indicator of general activity levels in younger people (67) and of osteoporosis in older individuals (68, 69); UK Biobank has GWAS data for this trait, indirectly estimated by ultrasound. However, evaluating differences in BMD in archaeological and paleontological specimens can be problematic. In the short term soil leaches bone minerals, while later the bone begins to fossilize, leading to unpredictable patterns of density in ancient remains (70) and requiring special processing methods (65) that are difficult to apply to large samples. However, femoral diaphyseal bending strength can be calculated from bone cross-sectional geometric measurements that are not as affected by bone preservation (71). Here we focus on anteroposterior bending strength (section modulus) of the midshaft femur (FZx), which has been linked specifically to mobility (72). Since both trabecular density and diaphyseal strength should respond to mobility and activity levels, we reasoned that they would be likely to show correlated patterns of temporal change. Following established protocols (71), we standardized FZx first by sex, then by the product of estimated body mass and femoral length (4).
Qualitatively, PRS(GWAS) and FZx show similar patterns, decreasing through time (Fig. 5, Supplementary Fig. 1g-i). There is a significant drop in FZx (Fig. 5c) from the Mesolithic to Neolithic (P=1.2×10⁻⁸) and again from the Neolithic to post-Neolithic (P=1.5×10⁻¹³). PRS(GWAS) for hBMD decreases significantly from the Mesolithic to Neolithic (Fig. 5a; P=5.5×10⁻¹²), which is replicated in PRS(GWAS/Sibs) (P=7.2×10⁻¹⁰; Fig. 5b); neither PRS shows evidence of decrease between the Neolithic and post-Neolithic. We hypothesize that both FZx and hBMD responded to the reduction in mobility that accompanied the adoption of agriculture (72). In particular, the lower genetic hBMD and skeletal FZx of Neolithic compared to Mesolithic populations may represent adaptation to the same change in environment, although we do not know the extent to which the change in FZx was driven by genetic or plastic developmental response to environmental change. On the other hand, FZx continues to decrease between the Neolithic and post-Neolithic (Fig. 5c,f)–which is not reflected in the hBMD PRS (Fig. 5a-b,d-e). One possibility is that the two phenotypes responded differently to the post-Neolithic intensification of agriculture. Another is that the non-genetic component of hBMD, which we do not capture here, also continued to decrease.
Are changes in PRS driven by selection or genetic drift?
We tested whether there was evidence for selection on any of these traits, by computing the Qx statistic (73) for increasing numbers of SNPs from each PRS, with effect sizes taken from either PRS(GWAS) (Fig. 6a-c) or PRS(GWAS/Sibs) (Fig. 6d-f). We computed the statistic between each pair of adjacent time periods, and over all time periods. We estimated empirical P-values by sampling random frequency-matched SNPs from across the genome. Using GWAS effect sizes, we find selection between the Neolithic and post-Neolithic for stature (P<1×10⁻⁴; Fig. 6a), which replicates using effect sizes estimated within siblings (10⁻⁴<P<10⁻²; Fig. 6d). The fact that the signal is less strong for GWAS/Sibs than for GWAS could either indicate that some of the signal is driven by stratification (21, 22), or that the power to detect selection for smaller effect sizes is lower when using the noisier sibling effect sizes. We tested this by generating GWAS results on a subsample of individuals, chosen so that the standard error of the effect sizes was equal to those of the within-sibling effects. This produced similar results to the analysis using the within-sibling effects (Supplementary Fig. 9), suggesting that the main reason for the weaker signal is the reduction in sample size of the within-sibling analysis.
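The empirical P-value step can be sketched as follows. This is a minimal illustration, not the code used for Fig. 6: the frequency bin width, the "+1" correction in the P-value, and all function names are assumptions introduced here, and the test statistic itself (Qx) is treated as a precomputed number.

```python
import random

def sample_matched(target_freqs, genome_freqs, bin_width=0.05, rng=random):
    """Draw one genome-wide SNP index per target SNP from the same
    allele-frequency bin (bin_width is an assumed, illustrative value)."""
    bins = {}
    for idx, f in enumerate(genome_freqs):
        bins.setdefault(int(f / bin_width), []).append(idx)
    return [rng.choice(bins[int(f / bin_width)]) for f in target_freqs]

def empirical_p(observed, null_stats):
    """Share of null statistics at least as large as the observed one.

    The +1 in numerator and denominator (an assumption of this sketch)
    keeps the estimate from being exactly zero.
    """
    exceed = sum(1 for s in null_stats if s >= observed)
    return (exceed + 1) / (len(null_stats) + 1)

# Toy usage with hypothetical numbers.
matched = sample_matched([0.12], [0.11, 0.50, 0.13])
p_value = empirical_p(5.0, [1.0, 2.0, 3.0, 6.0])
```

In a full analysis, the statistic would be recomputed on each matched SNP set to build `null_stats`; with 10,000 such sets the smallest reportable P-value under this convention is just under 10⁻⁴.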
For sitting height, we find little evidence of selection in any time period (P>10⁻²). We conclude that there was most likely selection for increased standing but not sitting height in the Steppe ancestors of Bronze Age European populations, as previously proposed (29). One potential caveat is that, although we re-estimated effect sizes within siblings, we still used the GWAS results to identify SNPs to include. This may introduce some subtle confounding, which remains a question for future investigation. Finally, using GWAS effect sizes, we identify some evidence of selection on heel BMD when comparing Mesolithic and Neolithic populations (10⁻³<P<10⁻²; Fig. 6c). However, this signal is relatively weak when using within-sibling effect sizes, and disappears when we include more than about 2000 SNPs.
Discussion
We showed that the well-documented temporal and geographic trends in stature in Europe between the Early Upper Paleolithic and the post-Neolithic period are broadly consistent with those that would be predicted by polygenic risk scores (PRS) computed using present-day GWAS results combined with ancient DNA. However, because of the limited predictive power of current PRS, we cannot provide a quantitative estimate of how much of the variation in phenotype between populations might be explained by variation in PRS. Similarly, we cannot say whether the changes were continuous, reflecting evolution through time, or discrete, reflecting changes associated with known episodes of replacement or admixture of populations that have diverged genetically over time. Finally, we find cases where predicted genetic changes are discordant with observed phenotypic changes–emphasizing the role of developmental plasticity in response to environmental change and the difficulty in interpreting differences in PRS in the absence of phenotypic data.
Our results indicate two major episodes of genetic change. First, there was a reduction in stature PRS–but not sitting height PRS–between the Early Upper Paleolithic and Neolithic. These genetic changes are consistent with the decrease in stature–driven by leg length–observed in skeletons during this time period (4, 64, 74, 75). This evolutionary change could have been adaptive, driven by changes in resource availability (76) or by a colder climate (61). Early Upper Paleolithic populations in Europe would have migrated relatively recently from more southern latitudes and had body proportions that are typical of present-day tropical populations (75). It is therefore plausible that they adapted to the colder climate of northern latitudes throughout the Upper Paleolithic. Comparison between patterns of phenotypic and genetic variation suggests that, on a broad scale, variation in body proportions among present-day people reflects adaptation to environment largely along latitudinal gradients (77, 78). On the other hand, we do not find genetic evidence for selection on stature during this time period–although with a small sample size we likely have very low power to detect it. Further, the populations of Early Upper Paleolithic, Late Upper Paleolithic, Mesolithic and Neolithic Europe are substantially discontinuous and deeply diverged genetically (33, 59). For example, the ancestors of Mesolithic and Neolithic Europeans are estimated to have diverged ∼46,000 BP (40). Therefore, if these genetic changes do reflect adaptation to climate, this adaptation must have occurred at least partly independently in the ancestors of these populations.
The second episode of genetic change is either between the Neolithic and post-Neolithic, or during the post-Neolithic period. In genome-wide ancestry, this transition is characterized by the westward movement of substantial amounts of “Steppe ancestry” into Central and Western Europe (27, 30, 38, 50). Our results are thus consistent with previous results that Bronze Age populations of the Eurasian steppe had been selected for increased height and that migration and admixture of these populations with Neolithic European populations increased genetic height in Europe (29, 30). There is no obvious climatic driver for this adaptation but one possibility is that it represents adaptation to a change in social environment. Y chromosome phylogenies suggest an increase in male reproductive variance at this time (29, 48, 50, 79, 80). Culturally, the Bronze Age is characterized by increased social stratification (81) and the introduction of patriarchal Indo-European culture (82). Perhaps these social changes implied increased competition for resources and consequent selection for greater body size. The geographic gradient of increasing skeletal stature is unclear in the Paleolithic, largely West-East in the Mesolithic (7, 64) and largely South-North by the Bronze Age (4, 7, 9). Latitudinal, but not longitudinal, patterns are qualitatively consistent with geographic patterns in PRS, suggesting that, like temporal variation, both genetics and environment contribute to geographic variation.
There is a major confounding factor in analysis of temporal and geographic variation in PRS, particularly in the Bronze Age. Genetic population structure in present-day Europe is correlated with geography (83) and largely driven by variation in proportions of Steppe ancestry, with more Steppe ancestry in Northern Europe and less in Southern Europe (38). Suppose that environmental variation in stature is also correlated with geography, and that Northern Europeans are taller than Southern Europeans for entirely non-genetic reasons. Then, GWAS that do not completely correct for stratification will find that genetic variants that are more common in Steppe populations than Neolithic populations are associated with increased height. When these GWAS results are then used to compute PRS for ancient populations, they will predict that Steppe ancestry populations were genetically taller simply because they are more closely related to present-day Northern Europeans (21, 22). In this study, we attempted to avoid this confounding in two ways: first, by computing PRS using GWAS effect sizes from the UK Biobank–a fairly homogenous dataset that should be well-controlled for population stratification, and second, by replicating our results after re-estimating the effect sizes within siblings, which should be robust to population stratification. The tradeoff between these two methods is that the small sibling sample size means that effect size estimates are noisy, even though they should be unbiased, and our results using sibling-estimated effects may miss subtle trends. However, we cannot exclude the possibility that some confounding remains, for example because although we re-estimated effect sizes using the within-siblings design, we still ascertained loci using the GWAS results. Residual confounding would also tend to create spurious signals of polygenic adaptation (21, 22).
As well as genetic contributions to phenotype, our results shed light on possible environmental contributions. In some cases, we can make hypotheses about the relationship between environmental or lifestyle changes, and genetic change. For example, if we interpret change in femur bending strength as reflecting a decrease in mobility, the coincident Mesolithic/Neolithic change in heel bone mineral density PRS can be seen as a genetic response to this change. However, in the Neolithic/post-Neolithic periods, the two observations are decoupled. This emphasizes the role of developmental plasticity in response to changes in environment, and of joint interpretation of phenotypic and genetic variables. Even when looking at the same phenotype, we find cases where genetic predictions and phenotypic data are discordant–for example in post-Neolithic sitting height. We must therefore be cautious in the interpretation of predicted genetic patterns where phenotypes cannot be directly measured, even if it is possible to control stratification. Predicted genetic changes should be used as a baseline, against which non-genetic effects can be measured and tested.
Methods
Ancient DNA and polygenic risk score construction
We collected published ancient DNA data from 1122 ancient individuals, taken from 29 publications. The majority of these individuals had been genotyped using an in-solution capture reagent (“1240k”) that targets 1.24 million single nucleotide polymorphisms (SNPs) across the genome. Because of the low coverage of most of these samples, the genotype data are pseudo-haploid. That is, there is only a single allele present for each individual at each site, but alleles at adjacent sites may come from either of the two chromosomes of the individual. For individuals with shotgun sequence data, we selected a single read at each 1240k site. We obtained the date of each individual from the original publication. Most of the samples have been directly radiocarbon dated, or else are securely dated by context.
We obtained GWAS results from the Neale lab UK Biobank page (http://www.nealelab.is/uk-biobank/; Round 1, accessed February and April 2018). To compute PRS, we first took the intersection of the 1240k sites and the association summary statistics. We then selected a list of SNPs to use in the PRS by selecting the SNP with the lowest P-value, removing all SNPs within 250kb, and repeating until there were no SNPs remaining with P-value less than 10⁻⁶. We then computed PRS for each individual by taking the sum of genotype multiplied by effect size for all included SNPs. Where an individual was missing data at a particular SNP, we replaced the missing genotype with the average frequency of the SNP across the whole dataset. This has the effect of shrinking the PRS towards the mean and should be conservative for the identification of differences in PRS. We confirmed that there was no correlation between missingness and PRS, to make sure that missing data did not bias the results (correlation between missingness and PRS ρ=0.02; P=0.44, Supplementary Fig. 10). Finally, we normalized the PRS across individuals to have mean 0 and standard deviation 1.
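The SNP selection and scoring steps above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the pipeline we ran: the data structures (lists of dictionaries, genotypes coded as allele counts with `None` for missing) and the function names are hypothetical, and the population standard deviation is used for normalization.

```python
from statistics import mean, pstdev

def clump(snps, window=250_000, p_cutoff=1e-6):
    """Greedy distance-based clumping.

    snps: list of dicts with keys 'id', 'chrom', 'pos', 'p', 'beta'.
    Visiting SNPs in order of increasing P-value and keeping one only
    if it is farther than `window` bp from every SNP already kept is
    equivalent to repeatedly taking the lowest P-value and removing
    its neighbours.
    """
    candidates = sorted((s for s in snps if s["p"] < p_cutoff),
                        key=lambda s: s["p"])
    kept = []
    for s in candidates:
        if all(k["chrom"] != s["chrom"] or abs(k["pos"] - s["pos"]) > window
               for k in kept):
            kept.append(s)
    return kept

def polygenic_scores(genotypes, kept):
    """genotypes: dict individual -> dict snp_id -> allele count or None."""
    # Per-SNP mean over individuals with data, used to fill missing calls.
    snp_mean = {}
    for s in kept:
        seen = [g[s["id"]] for g in genotypes.values()
                if g.get(s["id"]) is not None]
        snp_mean[s["id"]] = mean(seen) if seen else 0.0
    raw = {}
    for ind, g in genotypes.items():
        raw[ind] = sum(
            s["beta"] * (g[s["id"]] if g.get(s["id"]) is not None
                         else snp_mean[s["id"]])
            for s in kept
        )
    # Normalize to mean 0 and standard deviation 1 across individuals.
    values = list(raw.values())
    mu, sd = mean(values), pstdev(values)
    return {ind: (v - mu) / sd for ind, v in raw.items()}

# Toy example (all values hypothetical).
snps = [
    {"id": "a", "chrom": 1, "pos": 100_000, "p": 1e-10, "beta": 0.5},
    {"id": "b", "chrom": 1, "pos": 200_000, "p": 1e-8, "beta": 0.3},
    {"id": "c", "chrom": 1, "pos": 900_000, "p": 1e-7, "beta": -0.2},
    {"id": "d", "chrom": 2, "pos": 100_000, "p": 1e-3, "beta": 1.0},
]
genotypes = {
    "i1": {"a": 1, "c": 0},
    "i2": {"a": 0, "c": 1},
    "i3": {"a": None, "c": 1},
}
kept = clump(snps)
scores = polygenic_scores(genotypes, kept)
```

In the toy data, SNP "b" is dropped because it lies within 250kb of the more significant SNP "a", and "d" fails the P-value cutoff. Individual "i3" is missing "a" and receives the dataset mean for that SNP, which places its score between those of "i1" and "i2".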
We estimated within-family effect sizes from 17,358 sibling pairs in the UK Biobank to obtain effect estimates that are unaffected by stratification. Pairs of individuals were identified as siblings if estimates of IBS0 were greater than 0.0018 and kinship coefficients were greater than 0.185. Of those pairs, we only retained those where both siblings were classified by UK Biobank as “white British”, and randomly picked two individuals from families with more than two siblings. We used Hail (84) to estimate within-sibling pair effect sizes for 1,284,881 SNPs by regressing pairwise phenotypic differences between siblings against the difference in genotype. We included pairwise differences of sex (coded as 0/1) and age as covariates, and inverse-rank-normalized the phenotype before taking the differences between siblings. To combine the GWAS and sibling results, we first restricted the GWAS results to sites where we had estimated a sibling effect size and replaced the GWAS effect sizes by the sibling effects. We then restricted to 1240k sites and constructed PRS in the same way as for the GWAS results.
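For a single SNP, the within-sibling estimate amounts to a regression of phenotype differences on genotype differences. The sketch below shows that core step only. It is not the Hail code we used: it omits the sex and age covariates and the inverse-rank normalization described above, and the function name and input layout are hypothetical.

```python
def sibling_effect(pairs):
    """Least-squares slope of phenotype difference on genotype difference.

    pairs: list of (pheno1, pheno2, geno1, geno2), one tuple per sibling
    pair. An intercept is fitted; covariates are omitted in this sketch.
    """
    dy = [p1 - p2 for p1, p2, _, _ in pairs]
    dg = [g1 - g2 for _, _, g1, g2 in pairs]
    n = len(pairs)
    my, mg = sum(dy) / n, sum(dg) / n
    sxx = sum((g - mg) ** 2 for g in dg)
    sxy = sum((g - mg) * (y - my) for g, y in zip(dg, dy))
    return sxy / sxx

# Toy pairs constructed so that each phenotype difference is exactly
# 0.3 times the genotype difference (hypothetical values).
pairs = [
    (0.3, 0.0, 1, 0),
    (0.0, 0.6, 0, 2),
    (0.5, 0.5, 1, 1),
    (0.6, 0.3, 2, 1),
]
beta_sib = sibling_effect(pairs)
```

Because both siblings share the same family environment and ancestry, differencing removes between-family confounders, which is why the resulting slope should be robust to population stratification.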
To test whether the differences between the GWAS and GWAS/Sibs PRS results can be explained by differences in power, we created subsampled GWAS estimates that matched the sibling estimates in expected standard error, by determining the equivalent sample size necessary and randomly sampling Nsub individuals. The equivalent sample size depends on the variance of δsib, the difference in normalized phenotype between siblings after accounting for the covariates age and sex.
Stature data
We obtained stature data from Ruff (2018) (4) (data file and notes available at http://www.hopkinsmedicine.org/fae/CBR.html), which also includes estimated body mass, femoral midshaft anteroposterior strength (FZx), and other osteometric dimensions. Statures and body masses were calculated from linear skeletal measurements using anatomical reconstruction or sample-specific regression formulae (4, 58). We calculated sitting height as basion-bregma (cranial) height (BBH) plus vertebral column length (VCL). We restricted analysis to 1159 individuals dated earlier than 1165 BP (651 males and 508 females), of which 1130 had estimates for stature, 1014 for FZx and 236 for sitting height. Sitting and standing height were standardized for sex by adding the mean difference between male and female estimates to all the female values. Sex differences in stature remain relatively constant over time (4), making it reasonable to adjust all female heights by the same mean value. For FZx we first standardized for sex as we did for stature then divided each by estimated body mass multiplied by biomechanical femur length (4).
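The sex standardization described above can be written in a few lines. The sketch below is illustrative only: the record layout, sex codes, and function name are hypothetical, and the same function would be applied separately to stature and to sitting height.

```python
from statistics import mean

def standardize_for_sex(records):
    """Add the mean male-female difference to every female value.

    records: list of (sex, height_cm) tuples with sex coded 'M' or 'F'.
    Male values are returned unchanged.
    """
    males = [h for s, h in records if s == "M"]
    females = [h for s, h in records if s == "F"]
    offset = mean(males) - mean(females)
    return [h + offset if s == "F" else h for s, h in records]

# Toy example (hypothetical values): male mean 172, female mean 161.
records = [("M", 170.0), ("M", 174.0), ("F", 160.0), ("F", 162.0)]
adjusted = standardize_for_sex(records)
```

After adjustment the female values share the male mean, so the pooled sample can be analyzed on a single scale while preserving the variation within each sex.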
Grouping
We grouped individuals into broad categories based on date and, in some cases, archaeological and genetic context. All individuals were assigned to one time period group, based on median age estimates of the sample obtained from the original publications. Date ranges for each time period are based on a combination of historical, climatic, and archaeological factors. The Early Upper Paleolithic comprises all samples older than 25,000 BP, which roughly coincides with the end of the last glacial maximum (LGM). The Late Upper Paleolithic begins when the European glaciers started to recede (25,000 BP) and extends until 11,000 BP, when a shift in lithic technology traditionally delineates the beginning of the Mesolithic period. Transitions between the Mesolithic, Neolithic, and Bronze Age are staggered throughout Europe, so creating universally applicable date ranges is not possible. We instead defined overlapping transition periods between the Mesolithic and Neolithic periods (8500-5500 BP) and between the Neolithic and post-Neolithic (5000-3900 BP). For the genetic data, samples in the overlapping periods were assigned based on genetic population affiliation, inferred using supervised ADMIXTURE (48, 85, 86), which, in most of Western Europe, corresponds closely to archaeological context (38, 48). In particular, the Mesolithic/Neolithic overlap was resolved based on whether each individual had more (Neolithic) or less (Mesolithic) than 50% ancestry related to northwest Anatolian Neolithic Farmers. The Neolithic/post-Neolithic overlap was resolved based on whether individuals had more than 25% ancestry related to Bronze Age Steppe populations (“Steppe ancestry”; see Ref. (86) for more details). For the skeletal data, group assignment in the overlapping periods was determined by the archaeology of each site. Broadly, sites belonging to the Neolithic have transitioned to agricultural subsistence. Similarly, post-Neolithic populations are broadly defined by evidence of metal working (Copper, Bronze and Iron Ages, and later periods). In particular, we included Late Eneolithic (Copper Age) sites associated with Corded Ware and Bell Beaker material culture in the post-Neolithic category, but for consistency with the genetic classifications, we included 8 Early Eneolithic (before 4500 BP) individuals in the Neolithic category, since this precedes the appearance of Steppe ancestry in Western Europe. We excluded samples more recent than 1165 BP.
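The assignment rules for the genetic data can be written as a small decision function. This is a sketch under stated assumptions: the function name, the argument names, the group labels, and the treatment of dates or ancestry proportions that fall exactly on a boundary are illustrative choices that the text does not specify.

```python
def assign_period(date_bp, anatolian_frac=None, steppe_frac=None):
    """Assign a genetic sample to a time period group.

    date_bp: median date estimate in years BP.
    anatolian_frac: proportion of Anatolian Neolithic farmer ancestry (0-1).
    steppe_frac: proportion of Bronze Age Steppe ancestry (0-1).
    Returns None for samples more recent than 1165 BP (excluded).
    """
    if date_bp < 1165:
        return None
    if date_bp > 25000:
        return "Early Upper Paleolithic"
    if date_bp > 11000:
        return "Late Upper Paleolithic"
    if date_bp > 8500:
        return "Mesolithic"
    if date_bp >= 5500:
        # Mesolithic/Neolithic overlap: resolved by farmer ancestry.
        if anatolian_frac is None:
            raise ValueError("ancestry estimate needed in overlap period")
        return "Neolithic" if anatolian_frac > 0.5 else "Mesolithic"
    if date_bp > 5000:
        return "Neolithic"
    if date_bp >= 3900:
        # Neolithic/post-Neolithic overlap: resolved by Steppe ancestry.
        if steppe_frac is None:
            raise ValueError("ancestry estimate needed in overlap period")
        return "Post-Neolithic" if steppe_frac > 0.25 else "Neolithic"
    return "Post-Neolithic"
```

Skeletal samples would need a different rule inside the two overlap windows, since those are assigned by site archaeology rather than by ancestry proportions.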
Linear models
We fitted a series of linear models to changes in both PRS and stature data with time. In the most general model, we allow both the intercept and slope to vary between groups. We then either force some of the slopes to be zero, or some of the adjacent groups to have identical parameters. We describe the models using underscores to indicate changes in parameters, lowercase to indicate slopes (change with respect to time) fixed to zero, and uppercase to indicate free slopes (i.e. linear trends with time). For example, “E_L_M_N_B” is the most general model, “elmnb” indicates that all groups have the same mean and there is no change with time, and “ELMN_B” indicates that the first four groups share the same parameters, and the post-Neolithic has different parameters. The models shown in Figures 1 and 2 are “e_lmn_b” (panels a-b), “e_lm_nb” (panel c), “ELMN_B” (panels d-e) and “ELM_NB” (panel f). To analyze geographic variation, we used the residuals of the “ELMN_B” model for the PRS and “ELM_NB” for skeletal stature, and fitted regressions against latitude and longitude.
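The model naming scheme can be made concrete with a short parser that splits a model name into parameter blocks and counts free parameters. This sketch only illustrates the notation and does not fit the models; the function names are hypothetical, and rejecting blocks that mix upper and lower case is an assumption.

```python
def parse_model(name):
    """Split a model name into (groups, free_slope) blocks.

    Underscores separate blocks with distinct parameters; an uppercase
    block has a free slope, a lowercase block has its slope fixed to zero.
    """
    blocks = []
    for part in name.split("_"):
        if not part.isalpha() or not (part.isupper() or part.islower()):
            raise ValueError(f"invalid block {part!r} in model {name!r}")
        blocks.append((part.upper(), part.isupper()))
    return blocks


def n_parameters(name):
    """Each block has one intercept, plus one slope if the slope is free."""
    return sum(2 if free_slope else 1 for _, free_slope in parse_model(name))
```

Under this reading, “E_L_M_N_B” has ten parameters (an intercept and a slope for each of five groups), “ELMN_B” has four, and “elmnb” has a single shared mean.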
Polygenic selection test
We computed bootstrap P-values for the Qx statistic (73) by sampling random sets of SNPs in matched 5% frequency bins, and re-computing the statistic. Unlike for the PRS calculations, we ignored missing data, since the Qx statistic uses only the population-level estimated allele frequencies and not individual-level data. We tested a series of nested sets of SNPs (x-axis in Fig. 6), adding SNPs in batches of 100, ordered by increasing P-value, down to a P-value of 0.1.
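The frequency-matched resampling scheme can be sketched as below. The Qx statistic itself is not implemented here; it is passed in as a generic `statistic` callable. The function names, the sampling of matched SNPs with replacement, and the add-one correction in the P-value are assumptions for illustration rather than details given in the text.

```python
import random
from collections import defaultdict


def freq_bin(freq, width=0.05):
    """Index of the frequency bin (default 5% wide) containing freq."""
    n_bins = int(round(1 / width))
    # Small epsilon guards against floating-point error at bin edges.
    return min(int(freq / width + 1e-9), n_bins - 1)


def bootstrap_p_value(trait_snps, all_snps, freqs, statistic,
                      n_boot=1000, seed=0):
    """One-sided bootstrap P-value for statistic(trait_snps).

    trait_snps: SNPs in the tested set (each must also be in all_snps).
    all_snps: pool of SNPs to resample from.
    freqs: dict mapping SNP -> allele frequency used for matching.
    statistic: callable taking a list of SNPs and returning a number.
    """
    rng = random.Random(seed)
    bins = defaultdict(list)
    for snp in all_snps:
        bins[freq_bin(freqs[snp])].append(snp)
    observed = statistic(trait_snps)
    exceed = 0
    for _ in range(n_boot):
        # Replace every trait SNP by a random SNP from the same bin.
        sample = [rng.choice(bins[freq_bin(freqs[s])]) for s in trait_snps]
        if statistic(sample) >= observed:
            exceed += 1
    # Assumption: add-one correction so the P-value is never exactly zero.
    return (exceed + 1) / (n_boot + 1)
```

Applying the function to each nested SNP set in turn (the top 100 SNPs, the top 200, and so on) would give one P-value per set.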
Acknowledgments
I.M. was supported by a Research Fellowship from the Alfred P. Sloan Foundation, and a New Investigator Research Grant from the Charles E. Kaufman Foundation. Skeletal data were collected in collaboration with Brigitte Holt, Markku Niskanen, Vladimir Sladék, and Margit Bernor, with the support of the National Science Foundation (BCS-0642297 and BSC-0642710), the Grant Agency of the Czech Republic, the Academy of Finland, and the Finnish Cultural Foundation. We thank Jeremy Berg and Eva Rosenstock for helpful comments on an earlier version of the manuscript. This project was initially conceived during discussions at the workshop “Human stature in the Near East and Europe in a long-term perspective” at the Freie Universität Berlin, 25-27 April 2018, organized as part of the Emmy-Noether-Projekt “LiVES” funded by the German Research Foundation, Grant Nr. RO4148/1 (PI Eva Rosenstock).