Abstract
The advent of next-generation sequencing technologies has made whole-genome and whole-population sampling possible, even for eukaryotes with large genomes. With this development, experimental evolution studies can be designed to observe molecular evolution in action via Evolve-and-Resequence (E&R) experiments. Among other applications, E&R studies can be used to locate the genes and variants responsible for genetic adaptation. Existing literature on time-series data analysis often assumes large population sizes, accurate allele frequency estimates, and wide time spans. These assumptions do not hold in many E&R studies.
In this article, we propose a method, Composition of Likelihoods for Evolve-And-Resequence experiments (Clear), to identify signatures of selection in small-population E&R experiments. Clear takes whole-genome sequence data from pools of individuals (pool-seq) as input, and properly addresses heterogeneous ascertainment bias resulting from uneven coverage. Clear also provides unbiased estimates of model parameters, including population size, selection strength, and dominance, while being computationally efficient. Extensive simulations show that Clear achieves higher power in detecting and localizing selection over a wide range of parameters, and is robust to variation in coverage. We applied the Clear statistic to multiple E&R experiments, including data from a study of D. melanogaster adaptation to alternating temperatures and a study of outcrossing yeast populations, and identified multiple regions under selection with genome-wide significance.