Abstract
Detecting signals of polygenic adaptation remains a significant challenge in evolutionary biology, as traditional methods often struggle to identify the subtle, multi-locus allele-frequency shifts involved. Here, we introduce and test several novel approaches that combine machine learning techniques with traditional statistical tests to detect patterns of polygenic adaptation. We implemented a Naive Bayesian Classifier (NBC) and One-Class Support Vector Machines (OCSVM) and compared their performance against Fisher’s Exact Test (FET). Furthermore, we combined machine learning and statistical models (OCSVM-FET and NBC-FET), resulting in five competing approaches. Using a simulated data set based on empirical evolve-and-resequence genomic data from Chironomus riparius, we evaluated the methods across evolutionary scenarios that varied in the number of generations and the number of loci under selection. Our results demonstrate that the combined OCSVM-FET approach consistently outperformed the competing methods, achieving the lowest false positive rate, the highest area under the curve, and high accuracy. Its performance peaked during the late dynamic phase of adaptation, highlighting the method’s sensitivity to ongoing selective processes and thus its suitability for experimental approaches. Furthermore, we emphasize the critical role of parameter tuning, which requires balancing biological assumptions with methodological rigor. Our approach thus offers a powerful tool for detecting polygenic adaptation in pool sequencing data, particularly from evolve-and-resequence experiments.
Author’s Summary
Organisms often adapt to environmental changes through polygenic adaptation, a process in which multiple genes collectively contribute to evolutionary change. However, detecting these small shifts spread across multiple genes has been a persistent challenge for researchers. We developed new computational methods that combine machine learning with traditional statistical approaches to better detect these subtle genetic changes. Using data from a laboratory evolution experiment with the freshwater midge Chironomus riparius, we tested five different approaches to identify genes under selection. Our results showed that combining the machine learning technique One-Class Support Vector Machines with a traditional statistical test (Fisher’s Exact Test) was particularly effective at identifying genes involved in adaptation. This combined approach excelled specifically at detecting ongoing adaptive changes while avoiding false positives. Our method provides a reliable tool for researchers studying evolutionary adaptation, particularly in laboratory evolution experiments where populations are tracked over multiple generations. This advance improves our understanding of how organisms adapt to new environments, which is increasingly important in the context of rapid environmental change.
Competing Interest Statement
The authors have declared no competing interest.