A genotype-phenotype-fitness map reveals local modularity and global pleiotropy of adaptation

Building a genotype-phenotype-fitness map of adaptation is a central goal in evolutionary biology. It is notoriously difficult even when the adaptive mutations are known because it is hard to enumerate which phenotypes make these mutations adaptive. We address this problem by first quantifying how the fitness of hundreds of adaptive yeast mutants responds to subtle environmental shifts and then modeling the number of phenotypes they must collectively influence by decomposing these patterns of fitness variation. We find that a small number of phenotypes predicts fitness of the adaptive mutations near their original glucose-limited evolution condition. Importantly, phenotypes that matter little to fitness at or near the evolution condition can matter strongly in distant environments. This suggests that adaptive mutations are locally modular—affecting a small number of phenotypes that matter to fitness in the environment where they evolved—yet globally pleiotropic—affecting additional phenotypes that may reduce or improve fitness in new environments.


High-replicate laboratory evolution experiments are opening an unprecedented window into the

One of the key insights revealed by these studies is that in many systems, evolution can initially proceed rapidly via many large-effect single mutations. While the identities of these adaptive mutations are often unique to a specific replicate of the evolutionary experiment, across many replicates they tend to occur in similar functional units (e.g. genes and pathways) (Crozat et al.,

The mapping of adaptive mutations to a smaller number of functional units and thus a low-dimensional space representing the small number of phenotypes that they collectively affect (Fig 1A) is consistent with theoretical models of adaptation. These theoretical models argue that adaptive mutations, especially those of substantial fitness benefit, cannot affect too many phenotypes at once as most such effects should be deleterious and thus inconsistent with the overall positive effect on fitness (Fisher, 1930; Orr, 2000). More recent studies likewise suggest that selection against mutations with high pleiotropy, i.e. mutations that affect many phenotypes, has resulted in a modular architecture of the genotype-phenotype map, in which genetic changes can influence some phenotypes without affecting others (Altenberg, 2005; Collet et al.,

A previous evolution experiment generated a collection of hundreds of adaptive yeast mutants, each of which typically harbors a single independent mutation that provides a benefit to growth in a glucose-limited environment (Levy et al., 2015). Many of these mutants, which began the evolution experiment as haploids, underwent whole-genome duplication to become diploid, which improved their relative fitness (Venkataram et al., 2016). Some of these diploids acquired additional mutations, including increased copy number of either chromosome 11 or 12 as well as point mutations, which generated additional fitness benefits. The adaptive mutants that remained haploid acquired both gain- and loss-of-function mutations in nutrient-response pathways (Ras/PKA and TOR/Sch9). Some other mutations were also observed, including a mutation in the HOG pathway gene SSK2 (Venkataram et al., 2016). Although these mutants have been well-characterized at the level of genotype and fitness, it is unclear what phenotypes they affect. The first question we address is whether these diverse mutations collectively affect a large number of phenotypes that matter to fitness, or whether these mutants are functionally similar in that they collectively alter a small set of fitness-relevant phenotypes.

Understanding the map from genotype to phenotype to fitness is extremely challenging because each genetic change can influence multiple traits, not all of which are independent or contribute to fitness in a meaningful way. We contend with this challenge by measuring how the relative fitness of each adaptive mutant changes across a large collection of similar and dissimilar environments, which we term the "fitness profile". When a group of mutants demonstrates similar responses to environmental change, we conclude that these mutants affect similar phenotypes. By clustering mutants with similar fitness profiles across a collection of environments, we can learn which mutants influence similar phenotypes, as well as estimate the total number of fitness-relevant phenotypes represented across all mutants and all investigated environments.

Because our mutant strains are barcoded, we can use previously established methods to measure their relative fitness in bulk and with high precision (Venkataram et al., 2016). (Table S2).

In order to determine the total number of phenotypes that are relevant to fitness in the EC, we focus on environments that are very similar to the EC but still induce small yet detectable perturbations in fitness. We do so because the phenotypes that are the most relevant to fitness

near the EC, and "strong" perturbations which we will use to study whether these mutants influence additional phenotypes that matter in other environments (Fig 1C).

To partition environments into subtle and strong perturbations of the EC, we rely on the nested structure of replicate experiments performed in the EC. We performed nine such experiments, each at a different time, and each included multiple replicates performed simultaneously. We observe much less variation across replicates performed simultaneously than across replicates performed at different times (p < 1e-5 from permutation test). Variation across experiments performed at different times is often referred to as "batch effects" and likely reflects environmental variability that we were unable to control (e.g. slight fluctuations in incubation temperature due to limits on the precision of the instrument). These environmental differences between batches are very subtle, as they represent the limit of our ability to minimize environmental variation. Thus, variation in fitness across the EC batches serves as a natural benchmark for the strength of environmental perturbations. If the deviations in fitness caused by an environmental perturbation are substantially stronger than those observed across the EC batches, we call that perturbation "strong".

More explicitly, to determine whether a given environmental perturbation is subtle or strong, we subtract the fitness of adaptive mutants in this environment from their average across the EC batches. We then compare this difference to the variation in fitness observed across the EC batches. Sixteen environmental perturbations provoked fitness differences that were similar to those observed across EC batches (Z-score < 2). These environments, together with the nine

The rank order of the fitnesses of many mutations is largely preserved across the 25 environments that represent subtle perturbations (Fig 2C,

We utilize these complex fitness profiles to estimate the number of phenotypes that contribute to fitness in the EC. Given that many of these mutants affect genes in the same nutrient response pathway, the number of unique phenotypes they affect may be small. Alternatively, given the observation that these mutants have different interactions with environments that represent strong perturbations (Fig 2C), this number may be large. We use singular value decomposition (SVD) to ask how much of the complexity in these fitness profiles can be captured by a low-dimensional phenotypic model (Fig 3A). SVD is a dimensionality reduction approach which here decomposes fitness profiles into two abstract multi-dimensional spaces described below.

The first space, P, represents the phenotypic effects of mutants, where each phenotype is represented as a dimension (there are k phenotypic dimensions depicted in Fig 3A). Each mutant is represented by coordinates specifying a location in the phenotype space P (e.g. mutant 1 having coordinates $(p_{11}, p_{12}, p_{13}, \ldots, p_{1k})$). The ancestral reference lineage, which, by definition, has relative fitness zero in every environment, is placed at the origin (e.g. (0, 0, 0, …, 0)) in this phenotypic space. In this sense, we can think of a mutation's effect on any phenotype as a measure of the distance from the location of the mutant in that phenotypic dimension to the origin.

The second space, E, represents the contribution of each of the phenotypes in P to fitness, and thus has the same number of dimensions as P. If a phenotype does not contribute substantially to fitness in any environment, it is not represented as a dimension in either space. Therefore, our model captures only fitness-relevant phenotypes. In space E, each environment is represented by coordinates specifying a location (e.g. environment 1 having coordinates

by the contribution of that phenotype to fitness in environment j. A linear combination of these weighted phenotypic effects determines the fitness of mutant i in environment j:

$f_{ij} = p_{i1} e_{1j} + p_{i2} e_{2j} + p_{i3} e_{3j} + \ldots + p_{ik} e_{kj}$
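As a minimal sketch of this linear model, the Python snippet below builds a toy fitness matrix from two hidden phenotypes and recovers a two-dimensional description with SVD. The matrix sizes, the random toy data, and the choice to fold the singular values into the mutant coordinates are illustrative assumptions, not details taken from the analysis described here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 6 mutants x 5 environments built from k = 2 hidden phenotypes.
n_mutants, n_envs, k = 6, 5, 2
P_true = rng.normal(size=(n_mutants, k))  # phenotypic effect of each mutant
E_true = rng.normal(size=(k, n_envs))     # weight of each phenotype in each environment
F = P_true @ E_true                       # f_ij = sum over phenotypes of p_ik * e_kj

# SVD factors F into U * diag(s) * Vt, with singular values in descending order.
U, s, Vt = np.linalg.svd(F, full_matrices=False)

# Assumed convention: fold the singular values into the mutant coordinates.
P_hat = U[:, :k] * s[:k]  # mutant coordinates in the inferred phenotype space
E_hat = Vt[:k, :]         # environment coordinates in the same space

F_hat = P_hat @ E_hat                 # fitness rebuilt from k components
variance_share = s**2 / np.sum(s**2)  # fraction of variation captured per component
```

The inferred axes are abstract: they reproduce the fitness matrix but are generally a rotation of the hidden phenotypes used to simulate it, so `P_hat` need not equal `P_true`.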

In this model, mutants with similar fitness profiles, for example mutants 1 and 2 in Fig 3A,

This genotype-phenotype-fitness model that we generate using SVD harkens to Fisher's

Here, we utilize SVD to count the number of phenotypes that contribute to fitness in the original glucose-limited environment in which these adaptive mutants evolved. We used SVD to build an

These 7 genotype-by-environment interactions indeed tend to cluster the adaptive mutants by type and by gene (Fig 3C). Specifically, the diploids, IRA1-nonsense, GPB2, and PDE2 mutants each form distinct clusters (p = 0.0001, p = 0.006, p = 0.0001, and p = 0.0001, respectively). To generate p-values, we calculated the median pairwise distance among mutants of the same type and compared it with that of randomly chosen groups of mutants, finding that mutants of the same type are indeed more closely clustered than random groups.
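A permutation test of this kind can be sketched as follows. This is an illustration under assumptions rather than the procedure used here: the function names, the Euclidean distance, the number of permutations, and the add-one p-value correction are all choices made for the example.

```python
import numpy as np
from itertools import combinations

def median_pairwise_distance(coords):
    """Median Euclidean distance over all pairs of rows in coords."""
    dists = [np.linalg.norm(a - b) for a, b in combinations(coords, 2)]
    return float(np.median(dists))

def cluster_p_value(coords, member_idx, n_perm=10000, seed=0):
    """Fraction of random same-size groups at least as tight as the observed group."""
    rng = np.random.default_rng(seed)
    observed = median_pairwise_distance(coords[member_idx])
    hits = 0
    for _ in range(n_perm):
        random_idx = rng.choice(len(coords), size=len(member_idx), replace=False)
        if median_pairwise_distance(coords[random_idx]) <= observed:
            hits += 1
    # Add-one correction (an assumption) keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data: 5 tightly grouped mutants among 45 scattered ones.
rng = np.random.default_rng(1)
tight = rng.normal(loc=5.0, scale=0.05, size=(5, 3))
spread = rng.normal(loc=0.0, scale=2.0, size=(45, 3))
coords = np.vstack([tight, spread])
p_tight = cluster_p_value(coords, np.arange(5), n_perm=2000)
```

With a finite number of permutations, the smallest reportable p-value is bounded by the number of random groups drawn.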

Interestingly, the three smallest components, which capture very little variation in fitness across the environments that reflect subtle perturbations of the EC, also cluster some mutants by gene (Fig S3). Specifically, PDE2, GPB2, and IRA1-nonsense mutants are each closer to mutants of their own type than to other adaptive haploids (p = 0.0001, p = 0.0001, and p = 0.03, respectively). Note that the space defined by the three smallest components does not cluster IRA1-nonsense mutants away from diploids (p = 0.718). This suggests that some mutants, e.g. IRA1-nonsense and diploids, have smaller effects on these three phenotypic components.

Overall, our abstract phenotypic model, which reflects the way that each mutant's fitness changes across environments, reveals that mutations to the same gene tend to interact similarly with the environment. This suggests that our approach, like others that compare genotype-by-environment interactions, is a useful and unbiased way to identify mutations that share functional effects.

Our approach also detects cases where mutations to the same gene or pathway do not cluster

Now that we have identified the phenotypic components that contribute to fitness in environments that represent subtle perturbations of the EC, we can test the ability of these phenotypic components to predict fitness in more distant environments. Specifically, we can measure how the contribution of each of these components to fitness changes in new environments. We can also determine whether the phenotypic components that contribute very little to explaining fitness variation near the EC might at times have large explanatory power in distant environments (as depicted in Fig 1B and 1C).

To test this, we performed bi-cross-validation, using the eight-component model constructed

The 8-dimensional phenotypic model, which was generated exclusively with the data from subtle environmental perturbations, has substantial predictive power in distant environments

increases the oxygenation of the media (the "Baffle, 1.8% Glucose" environment), we predict 95% of weighted variance with the full 8-component phenotypic model, in contrast to 51% with a 1-component model (Fig 4B). This ability to predict fitness is retained even when the first component (effectively the fitness in EC) is a poor predictor of mutant fitness. For example, in the environment where salt (0.5 M NaCl) was added to the media, the 1-component model predicts fitness worse than predictions based on the average fitness for this environment, resulting in negative variance explained (Fig 4A and 4B). This is because mutant fitness in this environment reflects extensive genotype-by-environment interactions, such that the fitness of mutants in this environment is uncorrelated with EC fitness. However, our predictions of mutant fitness in the 0.5 M NaCl environment improve when made using the 8-component phenotypic model, which predicts 72% of weighted variance. Astoundingly, the 8-

This ability to predict fitness is also observed for mutations in genes and pathways that are not represented in the 60 that comprise the training set (e.g. those with mutations in TOR/Sch9 and HOG pathway genes). For example, the 8-component model explains 93% of variation in the "Baffle, 1.8% Glucose" environment and 71% of variation in the 0.5 M NaCl environment for these mutations, compared to 76% and 31% variance explained for the 1-component model, respectively. This indicates that our model is able to capture shared phenotypic effects that extend beyond gene identity. Altogether, our ability to accurately predict the fitness of new mutants in new environments suggests that the phenotypes our model identifies reflect causal effects on fitness.
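The logic of predicting held-out mutants in held-out environments can be sketched with a generic bi-cross-validation estimator, in which the held-out block is predicted from a rank-k pseudoinverse of the training block. The snippet below is a sketch under assumptions: the function names and toy data are hypothetical, and variance explained is computed without the weighting described in the text. It is measured against the mean fitness in each environment, so it can be negative when a model predicts worse than that mean.

```python
import numpy as np

def bcv_predict(F, train_rows, test_rows, train_cols, test_cols, k):
    """Predict F[test_rows, test_cols] from the other three blocks with k components."""
    D = F[np.ix_(train_rows, train_cols)]  # training mutants x training environments
    B = F[np.ix_(test_rows, train_cols)]   # held-out mutants x training environments
    C = F[np.ix_(train_rows, test_cols)]   # training mutants x held-out environments
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    # Pseudoinverse of the rank-k truncation of D.
    D_k_pinv = Vt[:k].T @ np.diag(1.0 / s[:k]) @ U[:, :k].T
    return B @ D_k_pinv @ C

def variance_explained(actual, predicted):
    """Unweighted 1 - SSE/SST, with SST taken around each environment's mean."""
    resid = np.sum((actual - predicted) ** 2)
    total = np.sum((actual - actual.mean(axis=0)) ** 2)
    return 1.0 - resid / total

# Hypothetical toy data: 30 mutants x 10 environments, 3 hidden phenotypes plus noise.
rng = np.random.default_rng(0)
F = rng.normal(size=(30, 3)) @ rng.normal(size=(3, 10))
F = F + rng.normal(scale=0.01, size=F.shape)
train_rows, test_rows = np.arange(20), np.arange(20, 30)
train_cols, test_cols = np.arange(8), np.arange(8, 10)

actual = F[np.ix_(test_rows, test_cols)]
ve = {k: variance_explained(
          actual, bcv_predict(F, train_rows, test_rows, train_cols, test_cols, k))
      for k in (1, 3)}
```

In this toy case the 3-component model should explain nearly all variance in the held-out block, whereas the 1-component model cannot, which mirrors the comparison between the 1-component and 8-component models above.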

Most strikingly, phenotypic models that include the three smallest phenotypic components,

Not all mutants affect all phenotypes and not all environments make all phenotypes important

Next we explore the extent to which the contribution of a phenotypic component to fitness is isolated to a specific environment and/or a specific type of mutation (Fig 5). We find that many phenotypic components matter more to fitness in some environments than others. For instance, component 2 adds on average 36% of the weighted variance in fitness across strong perturbations, despite adding only 7% on average across the subtle environmental perturbations. This contribution is, however, variable, with the second component adding over 90% of variance explained for the two environments with Benomyl and Baffled flasks (the "Baffle, 0.4 μg/ml Benomyl" and "Baffle, 2 μg/ml Benomyl" environments) and only 0.3% for the environment in which the transfer time was lengthened from two to three days (Fig 5A).

This environment-dependence is also true for the smallest two components. Specifically,

We further asked whether these effects are not only environment-specific but also mutant-specific. To do so, we focused on environments for which the two smallest components contribute substantially to fitness (e.g. 0.5 M NaCl). We looked at the extent to which each of these components improves power to predict the fitness of each of the 232 held-out mutants.

We found these components improve the fitness predictions for some classes of mutants far more than for others. For example, fitness predictions for mutations in GPB2, diploids with chromosome 11 amplifications, and high-fitness diploids with no known mutations each improved by over 4 standard deviations of measurement error in the 0.5 M NaCl environment due to the inclusion of the 7th component (Fig 5B). This phenotypic component also has importance in the 1-Day transfer environment, albeit to a lesser degree, resulting in improvements of roughly 1 standard deviation for each of these mutation types. This suggests that these mutants have some phenotypic effect that contributes only slightly to fitness in many environments, including those that represent subtle perturbations of the EC, but that is particularly important in the 0.5 M NaCl and 1-Day transfer environments. Similarly, we find that the 8th component also improves predictive power for specific types of mutants in specific environments. In this case, diploids with chromosome 11 amplifications and PDE2 mutants have particularly strong improvements in the 6-Day transfer environment (11 and 5 standard deviations, respectively) and thus likely have a shared phenotypic effect that is captured by component 8 (Fig 5B).

In sum, not all mutants affect all eight phenotypic components to the same degree and not all phenotypic components contribute substantially to fitness in all environments. This idiosyncrasy suggests that directional selection has the potential to generate rather than reduce phenotypic diversity in cases where multiple adaptive mutants persist within a population or across populations. Although directional selection "chooses" mutations that affect similar phenotypes relevant to fitness in the EC, these mutations may have latent effects on a larger

Here we succeeded in building a low-dimensional statistical model that captures the relationship from genotype to phenotype to fitness for hundreds of adaptive mutants. Mapping the complete phenotypic and fitness impacts of genetic change is a key goal of biology. Such a map is important in order to make meaningful predictions from genetic data (e.g. personalized medicine) and to investigate the structure of biological systems (e.g. their degree of modularity

Specifically, we learned that adaptation is modular in the sense that hundreds of diverse adaptive mutants collectively influence a small number of phenotypes that matter to fitness in the evolution condition. We also learned that different mutants have distinct pleiotropic side effects that matter to fitness in other conditions.

Building genotype-phenotype-fitness maps of adaptation has long been an elusive goal due to both conceptual and technical difficulties. Indeed, the very first part of this task, namely the

If mutations influence more than one phenotype, then the mapping from phenotype to fitness also becomes challenging. To investigate this map, we would need to find an artificial way to perturb one phenotype without perturbing others such that we could isolate and measure effects on fitness. Mapping phenotype to fitness is further complicated by the environmental

mutation that affects a cell's ability to store carbohydrates for future use might matter far more in an environment where glucose is re-supplied every 6 days instead of every 48 hours.

In our study, we turned the challenge of environment-dependence into the solution to the seemingly intractable problem of interrogating the phenotype layer of the genotype-phenotype-fitness map. We rely on the observation that the relative fitness of different mutations changes across environments. We assume that differences in how mutant fitness varies across environments must stem from differences in the phenotypes each mutation affects. Rather than a priori defining the phenotypes that we think may matter, we use the similarities and

We successfully implemented this approach using a large collection of adaptive mutants evolved in a glucose-limited condition. The first key result is that the map from adaptive mutant to phenotype to fitness is modular, such that it is possible to create a genotype to phenotype to fitness model that is low dimensional. Indeed, our model detects a small number (8)

Note that although we detect only 8 fitness-relevant phenotypes, we expect the true number to be much larger as the detectable number is limited by the precision of measurement (see Methods and Fig S1). We expect this partly because we know that if we had worse precision in this experiment we would have detected fewer than 8 phenotypic components (Fig 3). Still, these additional undetected components cannot be very consequential in terms of their contribution to fitness in the evolution condition, given how well the first 8 components capture

This discovery emphasizes that, although the smaller phenotypic dimensions contribute very little to fitness in the evolution condition (Fig 1B), they can at times have a much larger contribution in other environments (Fig 1C). This makes intuitive sense. For instance, we know that some of the strongest adaptive mutations in our experiment, the nonsense mutations in IRA1, appear to stop cells from shifting their metabolism towards carbohydrate storage when glucose levels become low. This gives these cells a head start once glucose again becomes abundant and does not appear to come at a substantial cost, at least not until these cells are exposed to stressful environments (e.g. high salt or long stationary phase) (Li et

supports the idea that adaptation can happen through large effect mutations because many of the pleiotropic phenotypic effects will be inconsequential in the local environment (Fig 1B-C).

We can thus argue that our low-dimensional model representing the genotype-phenotype-fitness map near the evolution condition hides latent and consequential phenotypic complexity across the collection of adaptive mutants. This complexity is hidden from natural selection in the evolution condition but becomes important once the mutants leave the local environment and are assessed globally for fitness effects. Thus, with respect to their effects on fitness-relevant phenotypes, adaptive mutants may be locally modular, but globally pleiotropic.

The notion of latent phenotypic complexity is exciting as it generates a mechanism by which directional selection generates rather than removes phenotypic diversity. Though directional

One disadvantage of our approach is that the phenotypic components that we infer from our fitness measurements are abstract. They represent causal effects on fitness, rather than measurable features of cells. For this reason, perhaps we should not refer to them as phenotypes but rather "fitnotypes" (a mash of the terms "fitness" and "phenotype") that act much

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Dmitri Petrov (dpetrov@stanford.edu).

The yeast strains used in this study can be grown and maintained using standard methods (e.g. YPD media in test tubes, glycerol stocks for long term storage at -80°C), but should be

In a few experiments, we spiked in re-barcoded mutants and additional neutral lineages as internal controls. Since re-barcoded mutants are identical except for the barcode, they tell us about the precision with which we can measure a mutant's fitness. Specifically, we spiked in ten re-barcoded IRA1-nonsense mutants (each with a frameshift insertion AT to ATT mutation at bp 4090) and ten IRA1-missense mutants (each with a G to T mutation at bp 3776). Neutral

We conducted fitness measurements under a variety of conditions (Table S2) that represent perturbations of the condition in which these adaptive mutants evolved. Briefly, we separately grew up an overnight culture of the barcode pool and the ancestral reference strain in 100 mL M3 (minimal, glucose-limited) medium (Verduyn et al., 1992). We then mixed these saturated cultures at a 1:9 ratio such that 90% of cells represent the reference strain. This ratio allows for mutants to compete against the ancestor rather than competing against each other. We then inoculated 400 μL of this mixed culture (∼5 × 10^7 cells) into 100 mL of fresh media in 500 mL DeLong flasks. The type of media used, and sometimes the shape of the flask, varied depending on condition (Table S2). This culture was then grown at 30°C in an incubator shaking

After each transfer of 400 μL, the left-over 9600 μL was frozen so that we could later sequence

Growth conditions

In this study, we present fitness measurement data from a collection of 45 conditions that each represent perturbations of the growth condition in which these adaptive mutants evolved. We

The 45 perturbations of the EC are summarized in Table S2 and include changes to the growth media, the flask shape, and the transfer times. For example, in the "1 Day" condition, we change the transfer time from 48 to 24 hours. In the "1.8% glucose, baffled flask condition" we

For all but three of these 45 conditions, we include between two and four replicates that were performed simultaneously (Table S2) such that overall we performed a total of 109 fitness measurements on our collection of adaptive mutants. Our replicate structure is nested in that some of our 45 conditions represent replicate experiments that we performed at different times.

Variation across experiments performed at different times is often referred to as "batch effects" and likely reflects environmental variability that we were unable to control (e.g. slight fluctuations in incubation temperature due to limits on the precision of the instrument). In particular, we re-measured the fitness of the adaptive mutants in the EC on 9 different occasions, each time including 3 or more replicates. We refer to these 9 experiments as "EC batches" in the main text.

However, every set of experiments that was performed at the same time constitutes a separate "batch". There were slight differences across batches in the way we prepared barcodes for sequencing, which we detail in the relevant Methods sections. This variation across batches can be thought of as another parameter that varies across the 45 conditions (in addition to glucose or salt concentration). We report which experiments were performed in the same batch in Table S2.

After extracting DNA, we PCR-amplified the barcode locus for each sample. Batches 1-6 and 10 were conducted with the protocols described in Venkataram et al. (2016). We made some slight modifications to this protocol, including using a new set of primers to allow for nested-unique-dual index labeling, for batches 7, 8, and 9. Our modified protocol is as follows.

We used a two-step PCR protocol to amplify the barcodes from the DNA. The first PCR cycle

Briefly, we first calculated the log-frequency change of each barcoded adaptive mutant for each

between missense and nonsense/frameshift/indel mutations in IRA1, here we classified these mutants into "missense" and "nonsense" classes, where mutants with frameshift and indel mutations were classified as "nonsense". We also classified diploid mutants with additional mutations in nutrient-response genes or chromosomal amplifications as separate groups.

Additionally, we created a separate class for "high-fitness diploid" mutants that possess no additional detected mutations (other than being diploid) but have very high fitness in the EC. To be classified as a high-fitness diploid, a diploid mutant must have an average fitness across all 9 EC batches that is greater than 2 standard deviations above the average of all diploids. In the main text, we label these mutants as "diploid with additional mutation" since they are likely to harbor additional mutation(s) due to their increased fitness.
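The classification rule can be illustrated with a short Python sketch. It rests on assumptions that the text does not spell out: the standard deviation is taken across the batch-averaged fitness of all diploids, the sample (ddof=1) estimator is used, and the function name and toy values are hypothetical.

```python
import numpy as np

def high_fitness_diploids(ec_fitness, is_diploid, n_sd=2.0):
    """Flag diploids whose mean EC fitness exceeds the diploid mean by n_sd SDs.

    ec_fitness: (n_mutants, n_batches) array of fitness in each EC batch.
    is_diploid: boolean array of length n_mutants.
    """
    mean_fitness = ec_fitness.mean(axis=1)      # average across EC batches
    diploid_means = mean_fitness[is_diploid]
    # Assumption: the SD is taken across diploids, using the sample estimator.
    cutoff = diploid_means.mean() + n_sd * diploid_means.std(ddof=1)
    return is_diploid & (mean_fitness > cutoff)

# Hypothetical toy data: 21 diploids (one unusually fit) and 5 haploids, 9 batches.
base = np.array([0.30] * 10 + [0.32] * 10 + [0.60] + [1.00] * 5)
batch_offsets = np.linspace(-0.02, 0.02, 9)     # offsets average to zero
ec_fitness = base[:, None] + batch_offsets[None, :]
is_diploid = np.array([True] * 21 + [False] * 5)

flagged = high_fitness_diploids(ec_fitness, is_diploid)
```

Because the unusually fit diploid is included when the mean and standard deviation are computed, it raises its own cutoff. Whether outliers should be excluded from that calculation is a design choice this sketch does not settle.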

Calculation of Weighted Average Z Score

To partition environments into subtle and strong perturbations of the EC, we relied on the 9 experiments performed in the EC. Since each of these experiments was performed at a different time, variation in fitness across these experiments represents batch effects, and we therefore refer to these 9 experiments as "EC batches". Environmental differences between batches are very subtle, as they represent the limit of our ability to minimize environmental variation. Thus, variation in fitness across the EC batches serves as a natural benchmark for the strength of environmental perturbations. If the deviations in fitness caused by an environmental perturbation are substantially stronger than those observed across the EC batches, we call that perturbation "strong".

More explicitly, to determine whether a given environmental perturbation is subtle or strong, we

To ensure that each mutation type contributes equally to our classification of how different each environment is from the evolution condition, we weighted each mutant's contribution to this difference. We did so based on the number of mutants with the same mutation type, such that the mutation-type-weighted average Z-score for a given environment j is given by:

where $n_{\mathrm{type}(i)}$ represents the number of mutants that are the same mutation type as mutant i.
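One way such a weighted score could be computed is sketched below. The exact formula is not reproduced in this text, so the details are assumptions: each mutant's Z-score is taken as the absolute deviation of its fitness from its EC mean, scaled by its standard deviation across EC batches, and each mutant is weighted by the reciprocal of the number of mutants sharing its mutation type. The function name and toy numbers are hypothetical.

```python
import numpy as np
from collections import Counter

def weighted_z_score(env_fitness, ec_fitness, mutation_types):
    """Mutation-type-weighted average Z-score of one environment against the EC.

    env_fitness: (n_mutants,) fitness in the perturbed environment.
    ec_fitness: (n_mutants, n_batches) fitness in each EC batch.
    mutation_types: one type label per mutant.
    """
    ec_mean = ec_fitness.mean(axis=1)
    ec_sd = ec_fitness.std(axis=1, ddof=1)      # variation across EC batches
    z = np.abs(env_fitness - ec_mean) / ec_sd   # assumed per-mutant Z-score
    counts = Counter(mutation_types)
    # Assumed weighting: each mutation type contributes equally overall.
    weights = np.array([1.0 / counts[t] for t in mutation_types])
    return float(np.sum(weights * z) / np.sum(weights))

# Hypothetical toy data: two IRA1 mutants and one diploid, three EC batches.
ec_fitness = np.array([[0.9, 1.0, 1.1],
                       [0.9, 1.0, 1.1],
                       [0.4, 0.5, 0.6]])
env_fitness = np.array([1.1, 1.1, 0.9])
types = ["IRA1", "IRA1", "diploid"]

z_env = weighted_z_score(env_fitness, ec_fitness, types)
label = "subtle" if z_env < 2 else "strong"
```

In this toy case the two IRA1 mutants share one unit of weight, so the single diploid counts as much as both of them together and pulls the weighted score above the unweighted mean of 2.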

We then classified the environmental perturbations based on this Z-score. Sixteen environments provoked fitness differences resulting in a Z-score of less than two, and we classified these

multiple phenotypes, each scaled by their contribution to fitness in that environment, we can use

One issue in this type of analysis is that adding more components always improves the explanatory power of the model, even when those components capture variation that is primarily due to measurement noise. This type of overfitting problem is common in statistics, and several methods have been devised to select the appropriate number of components to include. We use two such methods here.

Estimating the detection threshold using measurement error

One method to select the appropriate number of components to include in the model and captures only noise. We found that the largest noise components are of a size that would capture 0.07% of variation in our true fitness matrix. Thus, we set this as our limit of detection.

In other words, in order for us to include 8 components in our low-dimensional model, all of them must explain more than 0.07% of the variation in fitness. This approach is analogous to identifying a threshold when measurement noise is known but not identical for all entries in the matrix (Josse and Sardy, 2014).
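A noise-based detection limit of this kind can be sketched in Python as follows. The simulation details are assumptions made for illustration: noise is drawn as independent Gaussian values with a per-entry standard deviation, the threshold is the largest share seen across simulations, and the function names and toy data are hypothetical.

```python
import numpy as np

def detection_threshold(F, sigma, n_sim=200, seed=0):
    """Largest share of variation in F that a pure-noise component would capture.

    sigma: per-entry measurement SD, same shape as F (entries may differ).
    """
    rng = np.random.default_rng(seed)
    total = np.sum(np.linalg.svd(F, compute_uv=False) ** 2)
    shares = []
    for _ in range(n_sim):
        noise = rng.normal(0.0, sigma)          # noise-only matrix
        top = np.linalg.svd(noise, compute_uv=False)[0]
        shares.append(top**2 / total)           # size of its largest component
    return float(np.max(shares))

def count_detectable(F, threshold):
    """Number of components of F whose share of variation exceeds the threshold."""
    s = np.linalg.svd(F, compute_uv=False)
    return int(np.sum(s**2 / np.sum(s**2) > threshold))

# Hypothetical toy data: 40 mutants x 12 environments, 3 true components plus noise.
rng = np.random.default_rng(1)
sigma = np.full((40, 12), 0.02)
F = rng.normal(size=(40, 3)) @ rng.normal(size=(3, 12)) + rng.normal(0.0, sigma)

threshold = detection_threshold(F, sigma)
n_detectable = count_detectable(F, threshold)
```

Taking the maximum across simulations is a conservative choice, and a mean or upper quantile would give a lower threshold. Larger measurement error raises the threshold, so fewer components clear it.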

Estimating detection threshold using bi-cross-validation

these predictions vary when each of the 25 subtle environmental perturbations is held out from the training set.
training mutants and the 25 subtle perturbations. We did this to avoid the model being