Abstract
Phenotypes typically display integration, i.e. correlations between traits. For quantitative traits—like many behaviors, physiological processes, and life-history traits—patterns of integration are often assumed to have been shaped by the combination of linear, non-linear, and correlated selection, with trait correlations representative of optimal combinations. Unfortunately, this assumption has rarely been critically tested, in part due to a lack of clear alternatives. Here we show that trait integration across 6 phyla and 60 species (including both Plantae and Animalia) is consistent with evolution across high dimensional “holey landscapes” rather than classical models of selection. This suggests that the leading conceptualizations and modeling of the evolution of trait integration fail to capture how phenotypes are shaped. Instead, traits are integrated in a manner contrary to predictions of dominant evolutionary theory.
A common attribute of most organisms is that they display trait integration. For example, life-history traits are often correlated according to a slow-fast continuum1,2. This trait integration is commonly understood in terms of trade-offs and fitness maximization3–8 and is frequently modeled as populations moving across adaptive landscapes toward peaks of higher fitness. However, this adaptive perspective has rarely been evaluated due to a lack of clear alternatives. Consequently, much of our understanding of when and why quantitative traits are correlated might be shaped by adaptive just-so-stories9.
Competing evolutionary processes
Our understanding of selection has been strongly shaped by Sewall Wright’s conceptualization of an adaptive landscape, with populations moving from areas of low fitness to areas of higher fitness. While the simple one and two trait landscapes Wright originally detailed have been criticized as unrealistic, including by Wright himself10, the general metaphor has nonetheless guided much of evolutionary thought12.
For quantitative traits, like many aspects of physiology, behavior, and morphology, Wright’s metaphor has been mathematically extended to complex topographies with ridges or tunnels of high fitness13,14,15. Applying these adaptive landscape topologies in mathematical models has led to insights into how variation in traits, and correlations among traits, are expected to change over time15. Simulations have similarly led to the prediction that landscapes with complex topographic features like fitness ridges result in populations with genetic correlations aligned with these ridges3–5.
Concurrent with the study of quantitative trait variation, the question of how the topography of fitness landscapes affects sequence evolution at the genomic level has garnered similar interest16. Whereas theoreticians interested in quantitative trait variation have focused on relatively simple landscapes (e.g. 3,4,5,17–19), theoretical research regarding sequence evolution has spanned simple single peak Gaussian “Fujiyama landscapes”, to “badlands landscapes” (Fig 1A & 1B; 20), to abstract high-dimensional “holey landscapes” (Fig 1C; 21). Among other topics, this research has explored how topographies of varying complexity (Fig 1) affect the ability of populations to reach optima16. An important conclusion from this research is that evolutionary dynamics on simple landscapes often fail to properly predict evolution on landscapes of higher dimensionality.
Of these landscapes, perhaps most conceptually unfamiliar and unintuitive to researchers focused on quantitative trait evolution are Gavrilets’ (1997) holey landscapes (Fig 1C). The general concept of holey landscapes is that, because phenotypes are made up of a large number of traits, phenotypes are necessarily high dimensional constructs and corresponding landscapes will consist of either trait combinations that are of average fitness or trait combinations that confer low fitness or are inviable21,22. This results in flat landscapes with holes at inviable or low fitness phenotypes (Fig 1C). The flat landscape can be understood as stemming from the full multivariate nature of the phenotype: while there may be clear fitness differences in two dimensions, strong gradients will create holes in the landscape and peaks will average out when additional traits are considered. Unfortunately, predictions about quantitative trait evolution on holey landscapes are not clear.
More broadly, it is not clear what the topography of landscapes typically is for natural populations. While portions of selection surfaces and fitness landscapes can be directly estimated23,24, these estimates may differ from the underlying full landscape due to several factors. These include the omission of fitness-affecting traits25, incomplete estimation of fitness26,27, and insufficient power to estimate non-linear selection coefficients28. An alternative to direct estimation of adaptive landscape topography is to infer landscape topography from observed trait (co)variances. For example, low additive genetic variation is suggestive of stabilizing or directional selection29 and additive genetic correlations are expected to emerge from correlational selection and fitness ridges in a landscape (e.g. 13,14). Thus, an ability to infer the topography of adaptive landscapes from observed trait variation would aid our understanding of how selection is realized in natural populations.
Here we used a simulation model to examine how evolution on different landscapes contributes to patterns of trait integration. We modeled populations that evolved solely via drift, that evolved via adaptation on simple Gaussian fitness landscapes stemming from Wright’s metaphor, or that evolved on holey landscapes. This allowed us to generate testable predictions for how the structure of additive genetic variances and covariances (G) are shaped by different landscape topographies. We next compared these modeled outcomes to 181 estimates of G, representing 60 species from 6 phyla, including both plants and animals, to determine if observed trait integration is consistent with any of the modeled processes.
Model Construction
We developed an individual variance components model (Methods, Fig S1;30) wherein individuals had phenotypes comprised of 10 traits (k), with each trait being highly heritable (h2 = 0.8), and initial genetic covariances between traits set at zero. Populations of individuals evolved on one of five landscapes: (i) a flat landscape where no selection occurred (i.e. drift alone), (ii) Gaussian landscapes where fitness for each pair of traits was characterized by a single peak but with correlational selection, and three (iii – v) implementations of holey landscapes differing by p21,22, the proportion of viable phenotypes in a holey landscape (p = 0.2, 0.5, and 0.8). Each of the modeling scenarios was simulated 250 times for populations of 7500 individuals and for 100 generations for each population. Full modeling details are provided in the Methods and all modeling code is available at https://github.com/DochtermannLab/Wright_vs_Holey.
Model analysis
Following these simulations, the eigenstructures of the resulting 1250 population genetic covariance matrices were compared. Because the simulated phenotypes consisted of 10 traits, it was the overall multivariate pattern of variation that was of interest rather than any specific single trait or pairwise combination. To summarize this pattern, we calculated the ratio of each matrix’s second eigenvalue (λ2) to its dominant eigenvalue (i.e. λ2/λ1). This metric provides a better estimate of the compression of variance into a leading dimension than do other common metrics like the ratio of the first eigenvalue to the sum of eigenvalues (i.e. λ1/Σλ). For example, λ1/Σλ could be low if the variation not captured by λ1 is equally distributed across all other dimensions, even if each of those dimensions contained relatively little variation. The same scenario would produce a low value for λ2/λ1, correctly indicating a steep drop in variation after the leading dimension.
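As a rough illustration of how the two summaries can diverge, the sketch below evaluates both on two invented eigenvalue spectra. The spectra and the function name `spectrum_summaries` are hypothetical and are not taken from the simulations.

```python
def spectrum_summaries(eigenvalues):
    """Return (lambda2 / lambda1, lambda1 / sum of eigenvalues)."""
    ordered = sorted(eigenvalues, reverse=True)
    lam1, lam2 = ordered[0], ordered[1]
    return lam2 / lam1, lam1 / sum(ordered)

# Hypothetical 10-trait spectrum: one large leading dimension, nine small equal ones.
compressed = [10.0] + [2.0] * 9
# Hypothetical spectrum with two comparably large leading dimensions.
two_axes = [10.0, 9.0] + [1.0] * 8

ratio_a, share_a = spectrum_summaries(compressed)
ratio_b, share_b = spectrum_summaries(two_axes)
```

In this toy case λ1/Σλ is nearly the same for both spectra (10/28 versus 10/27), whereas λ2/λ1 separates them clearly (0.2 versus 0.9).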
λ2/λ1 was then compared across the modeling scenarios using analysis of variance and Tukey post-hoc testing. Alternative metrics for characterizing covariance matrices were consistent with the results for λ2/λ1 (see Supplementary Results). We also present the results of analyses of a broad range of starting conditions and model conditions in the Supplementary Results. These supplemental analyses confirmed the robustness of the findings reported below.
Model outcomes
When evolving on holey landscapes, populations lost greater relative variation in the nondominant dimensions as compared to when evolving on simple Gaussian landscapes or when subject solely to drift (Fig 2; Fig S3 A-D). λ2/λ1 significantly differed depending on selection regime (F4,1245 = 368, p ≪ 0.01; Fig 2). Populations experiencing either just drift or evolving on Gaussian landscapes maintained a more even amount of variation across dimensions compared to those evolving on holey landscapes (i.e. higher λ2/λ1; all post-hoc comparisons p < 0.001; Fig 2, Table S3). All populations evolving on holey landscapes exhibited similar λ2/λ1 ratios regardless of p (all post-hoc comparisons of outcomes for holey landscapes: p > 0.05; Fig 2, Table S3). Populations evolving due to drift alone also exhibited a significantly, though modestly, greater ratio than populations evolving on Gaussian landscapes (difference = 0.06, p = 0.002; Fig 2, Table S3). A difference of this magnitude is unlikely to be biologically important or detectable in natural populations and instead is likely driven by the high power available with simulations. These differences were consistent across approaches to summarizing G and were robust to the conditions of the simulations (Supplementary Results).
These modeling results generate the general prediction that greater relative variation in multiple dimensions is maintained when populations evolve on Gaussian landscapes than when evolving on holey landscapes. Put another way, evolving on holey landscapes is predicted to result in a large decrease in variation from the dominant to subsequent dimensions and, consequently, a lower λ2/λ1 value (Fig S3).
Observed patterns of trait integration
We next sought to determine which of the modeled processes produced results consistent with observed patterns of trait integration. To do so we conducted a literature review wherein we used Web of Science to search the journals American Naturalist, Ecology and Evolution, Evolution, Evolutionary Applications, Evolutionary Ecology, Genetics, Heredity, Journal of Evolutionary Biology, Journal of Heredity, Nature Ecology and Evolution, and the Proceedings of the Royal Society (B). We searched these journals using the terms “G matrix” on 14 May 2019, yielding a total of 272 articles. Each article was reviewed and estimated G matrices extracted if the article met inclusion criteria. For inclusion, an estimated G matrix must have been estimated for more than 2 traits (i.e. > 2 × 2), must have been reported as variances and covariances (i.e. not genetic correlations), and must not have been estimated for humans. Based on these inclusion criteria, we ended up with a dataset of 181 estimated G matrices from 60 articles (Fig S2). For each published G matrix, we estimated λ2/λ1.
Observed outcomes
Across all taxa, average λ2/λ1 was 0.36 (sd: 0.23, Fig 2). This estimate is consistent with, and statistically indistinguishable from, those observed for simulated populations evolving on Holey landscapes (t = 0.32, 1.20, and −0.05 versus Holey landscapes with p = 0.2, 0.5, and 0.8 respectively; df = 17.275, p > 0.2 for all; Fig 2, Table S10) and substantially less than observed for simulated populations that evolved on Gaussian landscapes or via drift alone (t = −12.42 and −14.55 respectively; df = 17.275, p < 0.001 for both).
While some individual estimates at the species level exhibited high λ2/λ1 values (Fig 2), phylogeny explained little variation in these values (phylogenetic heritability = 0.05; Table S9). As was the case across all taxa, median λ2/λ1 values for each taxonomic Class (or comparable level clade) were consistently lower than expected if evolution occurred on Gaussian landscapes or via drift alone (Fig 2). Instead, these results are strongly consistent with evolution on Holey landscapes.
Conclusions
The observation that traits linked to fitness are frequently correlated has been a major driver of research across evolutionary ecology. Research in life-history, physiology, and behavior has frequently been structured around such observations, arguing that this integration stems from optimization in the face of trade-offs31–33. However, because selection is frequently acting on many traits, patterns of integration quickly diverge from simple expectations, even under conventional models of evolution. Our results suggest that something substantively different is occurring: the observed pattern of variation across taxa suggests that classic models of the evolution of quantitative traits (e.g. stabilizing and correlational selection) are not what have predominantly shaped trait integration. Instead, drift across holey landscapes21,22 is more consistent with observed quantitative genetic variation (Fig 2).
Much of the early theoretical development of holey landscapes focused on the ability of populations to traverse genomic sequence differences via drift, with some sequences being inviable (e.g. due to missense differences in coding regions). How this extends to quantitative traits had been less clear. Our simulation model provides one approach to applying the holey landscape framework to quantitative traits, treating each trait as a threshold character34. Other approaches to modeling quantitative traits on holey landscapes, such as the generalized Russian roulette model22, may produce different outcomes. It is also important to recognize that the broad support for evolution on holey landscapes does not preclude subsets of traits from having evolved on Gaussian landscapes. Indeed, stabilizing selection has been observed in natural populations28, though understanding its general strength, even on a case by case basis, is confounded by methodological problems35,36. Regardless, our finding that observed patterns of quantitative genetic variation across taxonomic groups are not consistent with traditional evolutionary models stands.
This disconnect between observed patterns of multivariate variation and expectations under conventional models of selection suggests that Wright’s metaphor of fitness landscapes and the subsequent implementation of this metaphor as Gaussian surfaces may have contributed to an improper, or at least incomplete, understanding of how selection has shaped phenotypes. A potential contributor to this problem has been the lack of clear alternative explanations besides a simple null hypothesis of drift with no selection. Moving forward, clear development of alternative models of the action of selection and evolution in multivariate space is needed.
Author Contributions
NAD conceived of the project and developed the first version of the model. BK collected published G matrices and calculated matrix summary estimates. RR contributed to model development and analyses. DAR contributed to model development and developed the parameter exploration scheme. All authors contributed to the writing of the manuscript.
Supplemental Methods
Simulation Models
Model Construction
We developed an individual variance components model (Fig S1; sensu 30) wherein individuals had phenotypes comprised of 10 traits (k), with each trait being highly heritable (h2 = 0.8) and initial genetic covariances between traits set to 0. A high heritability was initially used to reduce the number of generations needed to determine the response of populations to selection. Genetic covariances were set to an initial value of zero to simulate a population under linkage equilibrium. Viability selection was applied based on fitness, which was determined either by location on a ten-dimensional holey landscape or on simple Gaussian landscapes with a single optimum per trait pair.
Holey Landscapes
For simulations evaluating holey landscapes, we simulated populations in which traits were inherited as though continuous but expressed categorically as one of two phenotypic variants (e.g. phenotype 0 versus 1 for trait 1). Specifically, at the start of simulations, we drew genotypes for each individual from a normal distribution with a mean of zero and standard deviation of 1. To these normally distributed genotypes, we added “environmental” values (μ = 0, all covariances = 0) to generate a phenotype with a heritability of 0.8. These continuously distributed phenotypic values were then transformed, because one implementation of the holey landscape is based on the fitness of specific, discrete trait combinations. Specifically, the continuously distributed values were transformed to be a phenotype of 0 or 1, with a value < 0 being “0” and a value > 0 being “1” (Table S1).
The holey landscape for a specific simulation was then constructed by randomly assigning a fitness of 0 or 1 to each of the 1024 possible trait combinations (2^k, with k = 10) based on the parameter p. “p” was the probability that a trait combination had a fitness of 1 and corresponds to Gavrilets’ (2004) percolation parameter. We used three values of p in our simulations: low (p = 0.2), moderate (p = 0.5), and high (p = 0.8). p can vary between 0 and 1, with a value of 1 corresponding to a landscape where all trait combinations are viable and have a fitness of 1. As p approaches 0, few trait combinations are viable.
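The construction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code (which is in the linked repository): the function names and the seed are hypothetical, and the sketch assumes that the threshold is applied to the phenotypic value (genotype plus environmental deviation).

```python
import itertools
import random

def build_holey_landscape(k, p, rng):
    """Give each of the 2**k binary phenotypes fitness 1 with probability p, else 0."""
    return {combo: 1 if rng.random() < p else 0
            for combo in itertools.product((0, 1), repeat=k)}

def to_binary_phenotype(values):
    """Threshold continuous trait values at zero: above 0 -> 1, otherwise 0."""
    return tuple(1 if v > 0 else 0 for v in values)

rng = random.Random(1)  # arbitrary seed, for reproducibility only
k, p, h2 = 10, 0.5, 0.8
landscape = build_holey_landscape(k, p, rng)

# One individual: genotype ~ N(0, 1) per trait plus environmental noise scaled so
# that h2 = Vg / (Vg + Ve) = 0.8, i.e. Ve = Vg * (1 - h2) / h2 (assumed scaling).
env_sd = ((1 - h2) / h2) ** 0.5
genotype = [rng.gauss(0, 1) for _ in range(k)]
phenotype_values = [g + rng.gauss(0, env_sd) for g in genotype]
phenotype = to_binary_phenotype(phenotype_values)
survives = landscape[phenotype] == 1
```

Storing the landscape as a dictionary keyed by the binary phenotype keeps the viability lookup for each individual to a single hash access, which matters when 7500 individuals are evaluated every generation.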
After the first generation, genotypes were drawn from a multivariate normal distribution based on the means and genetic variance-covariance matrix of the population that survived selection. Environmental contributions again had an average of 0 and no environmental correlation with a variance set to keep heritability at 0.8 (or other values during parameter exploration, below). The resulting phenotypic values were then converted to 0’s and 1’s as above. This approach to generating subsequent generations follows the structure of individual variance components models described by Roff30. We used this individual variance components approach rather than an agent-based approach as the latter combined with the computational requirements of matching phenotypes to fitness under the holey landscape model was not amenable to simulation analysis.
Gaussian (Wrightian) adaptive landscapes
For simulations evaluating Gaussian landscapes, we generated genotypes and phenotypes as above but without the categorical conversion (Table S1). We then generated random landscapes such that the optimum (θ) for every trait was set to zero. The topography of the landscape for each pair of traits (e.g. ωi,j) was defined consistent with previous simulation studies examining the evolution of quantitative traits (reviewed by 3). This approach corresponds to single peak landscapes in any two dimensions. The forty-five ωi,j values that fully describe the landscape were generated using the LKJ onion method for constructing random correlation matrices with a flat distribution of correlations (η = 1; Lewandowski et al. 2009). Using the LKJ onion method ensures that the full description of the landscape (ω) is positive semi-definite with feasible partial correlations. We then calculated each individual’s fitness based on a Gaussian surface38:

$$w_h = \exp\left(-\tfrac{1}{2}\,(z_h - \theta)^{\mathsf{T}}\,\omega^{-1}\,(z_h - \theta)\right)$$

where wh is the fitness of individual h, zh is a vector of the observed phenotypic values for individual h, ω is the selection surface, and θ is the vector of trait optima (0). Truncation selection was applied based on fitness, with the 50% of individuals possessing the highest fitness surviving (main results). In an additional set of simulations, stronger truncation selection was applied and only 10% of the population survived.
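A minimal two-trait sketch of this selection step follows. It assumes the standard multivariate Gaussian fitness form, w = exp(−½ (z − θ)ᵀ ω⁻¹ (z − θ)); the function names, the example ω, the population size, and the seed are all hypothetical, and the paper's simulations used ten traits rather than two.

```python
import math
import random

def gaussian_fitness(z, theta, omega):
    """Fitness exp(-0.5 * d' * inv(omega) * d) for two traits, with d = z - theta."""
    (w11, w12), (w21, w22) = omega
    det = w11 * w22 - w12 * w21
    inv = ((w22 / det, -w12 / det), (-w21 / det, w11 / det))
    d0, d1 = z[0] - theta[0], z[1] - theta[1]
    quad = (d0 * (inv[0][0] * d0 + inv[0][1] * d1)
            + d1 * (inv[1][0] * d0 + inv[1][1] * d1))
    return math.exp(-0.5 * quad)

def truncation_select(population, fitnesses, proportion):
    """Keep the top `proportion` of individuals ranked by fitness."""
    n_keep = int(len(population) * proportion)
    ranked = sorted(zip(fitnesses, population), key=lambda pair: pair[0], reverse=True)
    return [individual for _, individual in ranked[:n_keep]]

rng = random.Random(7)  # arbitrary seed
theta = (0.0, 0.0)
omega = ((1.0, 0.5), (0.5, 1.0))  # hypothetical surface with a ridge along z1 = z2
population = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1000)]
fitnesses = [gaussian_fitness(z, theta, omega) for z in population]
survivors = truncation_select(population, fitnesses, 0.5)
```

With a positive off-diagonal in ω, a phenotype such as (1, 1) lies on the ridge and has higher fitness than (1, −1), which is the sense in which correlational selection favors particular trait combinations.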
Following selection in either framework, the next generation was constructed using an individual variance components approach30. Specifically, the next generation was generated as described above based on the trait means, variances and covariances of survivors. Selection therefore acted via changes in means and variances and drift during the selection simulations was due to sampling error from the selection shaped phenotypic distributions.
Drift alone
For populations evolving via drift alone, phenotypes were generated as for Gaussian adaptive landscapes. Composition of subsequent generations was likewise generated based on the means and variances of the prior generation, without selection. The drift model therefore was simply a model of sampling error.
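The idea of drift as pure sampling error can be illustrated with a deliberately simplified, single-trait sketch. It is an assumption-laden toy: the paper's model resamples ten traits from the full genetic covariance matrix, whereas the code below resamples one trait from a univariate normal, and the population size, generation count, seed, and function name are hypothetical.

```python
import random
import statistics

def drift_generation(values, n, rng):
    """Draw n new values from a normal with the mean and SD of the previous generation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [rng.gauss(mean, sd) for _ in range(n)]

rng = random.Random(42)  # arbitrary seed
n = 500                  # far smaller than the 7500 used in the paper, for speed
population = [rng.gauss(0, 1) for _ in range(n)]
variance_by_generation = [statistics.variance(population)]
for _ in range(20):
    population = drift_generation(population, n, rng)
    variance_by_generation.append(statistics.variance(population))
```

Because no selection step intervenes, any change in the recorded variance from one generation to the next arises only from finite sampling.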
Each of five modeling scenarios (simple landscapes, drift alone, three Holey landscapes with p = 0.2, 0.5, or 0.8) was simulated 250 times for populations of 7500 individuals and for 100 generations for each population. All modeling code is available at https://github.com/DochtermannLab/Wright_vs_Holey.
Statistical Comparison of Evolutionary Metrics
To clarify differences in evolutionary outcomes across modeling scenarios, we summarized evolutionary outcomes at the level of G matrices based on several metrics:
λ2/λ1; results for this metric are presented in the main text
λ1/∑λ; this is a commonly used summary value and represents the proportion of variation captured by the dominant eigenvalue. This can be interpreted as the proportion of variation in the main dimension of covariance
∑λ; matrix trace, the total variation present. For simulations this is informative as to whether a particular process results in the loss of more or less variation
ē: average evolvability across dimensions39. This represents evolutionary potential throughout multivariate space
ā: average autonomy39, which captures how much evolvability is reduced by trait covariance. Can be interpreted as how constrained evolutionary responses are based on correlations. At the extreme, an average autonomy of 0 would indicate absolute constraints on responses to selection and an average autonomy of 1 indicates evolutionary independence. Values between 0 and 1 represent quantitative constraints.
We compared these metrics across drift, Gaussian, and holey landscape simulations, following the main text, based on ANOVA followed by post-hoc comparisons based on calculation of Tukey’s Honest Significant Differences (HSD).
Post-hoc Parameter Exploration
The above modeling scenarios were used for our overall general analyses and for comparison to observed values. However, to explore whether our modeling outcomes were due to fundamentally different and generalizable outcomes or instead emerged from peculiarities of initial parameters, we expanded our analyses in two ways.
First, in addition to the moderate/weak strength of truncation selection modeled above (50% survival), we also modeled stronger selection where only 10% of individuals survived. For this stronger selection we again conducted 250 simulations of 7500 individuals for 100 generations. These simulations were included in the above analyses.
Second, to more broadly examine the sensitivity of our results to different starting values, we conducted simulation studies for our selection model, our model of drift, and our model of evolution on flat holey landscapes. For each modeling scenario (Gaussian surfaces, drift, Holey landscapes) we conducted 1000 simulations where both the magnitude of initial genetic variation in each trait varied and h2 varied (h2 was defined independently). For each scenario we then explored how other changes in starting parameters affected the eigenstructure of G (Table S2).
We then quantitatively assessed the relevance of each varied parameter on λ2/λ1—within modeling scenario—using linear models. All two-way interactions were included in analyses and variables (model parameters) were mean centered but unscaled. We then qualitatively compared λ2/λ1 across modeling scenarios based on heat plots.
Empirically Estimated G Matrices
Observed patterns of multivariate genetic variation
We conducted a literature review with Web of Science to search the journals American Naturalist, Ecology and Evolution, Evolution, Evolutionary Applications, Evolutionary Ecology, Genetics, Heredity, Journal of Evolutionary Biology, Journal of Heredity, Nature Ecology and Evolution, and the Proceedings of the Royal Society (B). These journals were searched using the terms “G matrix” on 14 May 2019, yielding a total of 272 articles. Each article was reviewed to determine if the article met inclusion criteria. Our inclusion criteria were:
A G matrix must have been estimated for more than 2 traits (i.e. > 2 × 2)
Must have been reported as variances and covariances (i.e. not genetic correlations)
Must not have been estimated for humans.
Based on these inclusion criteria, we ended up with 181 estimated G matrices (Fig S2). For each published G matrix, we calculated λ2/λ1 using a purpose-built R Shiny App (link).
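For readers without access to the app, the calculation itself is straightforward. The sketch below is not the authors' Shiny app; it is a minimal Python illustration in which a hand-written cyclic Jacobi routine stands in for a library eigen-solver, and both the function names and the example matrix are hypothetical.

```python
import math

def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) off-diagonal element.
                angle = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(angle), math.sin(angle)
                for r in range(n):  # columns p and q: A <- A J
                    arp, arq = a[r][p], a[r][q]
                    a[r][p] = c * arp - s * arq
                    a[r][q] = s * arp + c * arq
                for r in range(n):  # rows p and q: A <- J^T A
                    apr, aqr = a[p][r], a[q][r]
                    a[p][r] = c * apr - s * aqr
                    a[q][r] = s * apr + c * aqr
    return sorted((a[i][i] for i in range(n)), reverse=True)

def lambda_ratio(g_matrix):
    """Ratio of the second to the dominant eigenvalue of a G matrix."""
    lam = symmetric_eigenvalues(g_matrix)
    return lam[1] / lam[0]

# Hypothetical 3-trait G matrix (variances on the diagonal, covariances off it).
g = [[5.0, 2.0, 0.0],
     [2.0, 2.0, 0.0],
     [0.0, 0.0, 0.5]]
ratio = lambda_ratio(g)
```

For this example matrix the eigenvalues are 6, 1, and 0.5, so λ2/λ1 is 1/6; in practice a tested linear algebra library would normally be preferred over a hand-written solver.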
For each estimate we recorded the paper from which it was drawn (recorded as a unique study ID), taxonomic information (Kingdom through species epithet), trait category (life-history, physiology, morphology, behavior or mixed), the number of traits in the matrix, λ1, λ2, λ2/λ1, number of dimensions40, number of dimensions divided by the number of traits, and all bibliographic information.
Phylogenetic Signal in λ2/λ1
To test for phylogenetic signal we fit a simple taxonomic mixed-effects model. This modeling approach incorporates the hierarchical non-independence due to taxonomic relationships but does not require a full phylogeny41. Essentially, at each node of a phylogeny, relationships are modeled according to a star relationship. Each taxonomic grouping was included as a random effect, as was study ID, and the resulting model was fit with the lme4 package in R42. From this model we estimated phylogenetic signal as the proportion of variation attributable to taxonomy, the variation attributable to study ID, and the residual variance. Confidence intervals were then estimated based on profile likelihoods.
Comparison of Observed Results to Simulation Results
Finally, we compared the observed values to the average for each of the simulations using the intercept coefficient of the above linear model. For this, t was calculated as43:

$$t = \frac{\hat{\beta}_0 - \beta_{H_0}}{SE(\hat{\beta}_0)}$$

where $\hat{\beta}_0$ was the estimated intercept from the taxonomic model (above), $SE(\hat{\beta}_0)$ was its standard error, and βHo was a simulation average. p was calculated with degrees of freedom estimated using Satterthwaite’s method (df = 17.275).
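The comparison amounts to a one-sample t statistic against a fixed reference value. The short sketch below assumes the usual form, (estimate − reference) / standard error; the function name and the numbers are hypothetical and are not taken from the paper's tables.

```python
def t_against_simulation(intercept, intercept_se, simulation_mean):
    """t = (estimated intercept - simulation average) / SE of the intercept."""
    return (intercept - simulation_mean) / intercept_se

# Invented values for illustration only.
t_value = t_against_simulation(intercept=0.40, intercept_se=0.05, simulation_mean=0.70)
```

Converting t to a p-value additionally requires the t distribution with the Satterthwaite degrees of freedom, which is not computed here.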
Supplemental Results
Simulation Models
Statistical Comparison of Evolutionary Metrics
Populations that evolved on different landscapes (drift alone, Gaussian, or holey) significantly differed from each other in the structure of G after 100 generations (Tables S3 – S7). Holey landscapes were characterized by a compression of most variation into the dominant dimension in multivariate space (Tables S3 & S4; Figures 2 & S3). Populations evolving on Gaussian landscapes were characterized by a drastic reduction in the total variation present, which was also reflected in reduced evolvability (Tables S5 & S6; Figures S4 & S5). The combination of high standing genetic variation and this variation being distributed across dimensions led populations that evolved solely due to drift to exhibit significantly greater autonomy than observed in any of the other modeling scenarios (Table S7; Figure S6). The greater constraint in populations evolving on either Gaussian or holey landscapes is likely due to the loss of variation for populations evolving on Gaussian landscapes (Figures S4 & S5) and the compression of variation for populations evolving on holey landscapes (Figures 2 & S3).
Post-hoc Parameter Exploration
For populations evolving on Gaussian landscapes, compression of genetic variation into the leading dimension decreased with increasing heritability and an increasing strength of selection (Table S8, Figure S7). No two-way interaction was statistically significant. Put another way, λ2/λ1 increased with heritability and the strength of selection, and average λ2/λ1 was 0.68 for average parameter values (Table S8).
For populations evolving solely due to drift, λ2/λ1 increased with greater initial total genetic variation (Table S9). However, the strength of this effect was minimal. More dramatically, λ2/λ1 significantly and strongly decreased with increasing average initial absolute genetic correlation (Table S9). At the extreme, λ2/λ1 approached 0 as the average initial absolute correlation approached 1. No two-way interaction was statistically significant. Average λ2/λ1 was 0.69 for average parameter values (Table S9).
When evolving on holey landscapes, and consistent with prior simulation comparisons, λ2/λ1 was lower for average parameter values (0.42, Table S10). Compression into a single dimension also increased with increasing heritability and increasing average absolute initial correlations (Table S10).
Genetic variation was more strongly compressed into a primary dimension when populations evolved on holey landscapes versus when they evolved due to drift or due to selection on Gaussian surfaces (Tables S8 – S10; Figures S7 – S9). This was a surprisingly robust result regardless of the starting parameters of a simulation (Figures S7 – S9). This parameter robustness44 supports the generality of our modeling. Unfortunately, we were not able to investigate other forms of robustness44 due to computational limitations.
Acknowledgements
The authors thank A.J. Wilson and B. de Bivort for helpful conversations. This work was supported by US NSF IOS grant 1557951 to N.A.D.
Footnotes
* These authors are listed in alphabetical order