PT - JOURNAL ARTICLE
AU - Marie Morel
AU - Frédéric Lemoine
AU - Anna Zhukova
AU - Olivier Gascuel
TI - Accurate Detection of Convergent Mutations in Large Protein Alignments with ConDor
AID - 10.1101/2021.06.30.450558
DP - 2022 Jan 01
TA - bioRxiv
PG - 2021.06.30.450558
4099 - http://biorxiv.org/content/early/2022/11/29/2021.06.30.450558.short
4100 - http://biorxiv.org/content/early/2022/11/29/2021.06.30.450558.full
AB - Evolutionary convergences are observed at all levels, from phenotype to DNA and protein sequences, and changes at these different levels tend to be highly correlated. Notably, convergent and parallel mutations can lead to convergent changes in phenotype, such as changes in metabolism, drug resistance, and other adaptations to changing environments. We propose a two-step approach to detect mutations under convergent evolution in protein alignments. We first select mutations that emerge more often than expected under neutral evolution and then test whether their emergences correlate with the convergent phenotype under study. The first step can be used alone when no phenotype is available, as is often the case with microorganisms. In the first step, a phylogeny is inferred from the data and used to simulate the evolution of each alignment position. These simulations are used to estimate the expected number of mutations under neutral conditions, which is compared to what is observed in the data. Next, using a comparative phylogenetic approach, we measure whether the presence of mutations occurring more often than expected correlates with the convergent phenotype. Our method is implemented in a standalone workflow and a webserver, called ConDor. We apply ConDor to three datasets: sedges PEPC proteins, HIV reverse transcriptase and fish rhodopsin.
The results show that the two components of ConDor complement each other, with an overall accuracy that compares favorably to other available tools, especially on large datasets. Competing Interest Statement: The authors have declared no competing interest.
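
The first step described in the abstract compares the observed number of emergences of a mutation with the counts obtained from simulations under neutral evolution. The abstract does not say how that comparison is carried out, so the following is only a minimal sketch under stated assumptions: it uses a generic upper-tail Monte Carlo p-value, and the function name `empirical_p_value` and the toy counts are hypothetical, not taken from ConDor.

```python
from typing import Sequence


def empirical_p_value(observed: int, simulated: Sequence[int]) -> float:
    """Upper-tail Monte Carlo p-value for one mutation at one position.

    observed  -- number of independent emergences counted on the real tree
    simulated -- emergence counts from neutral simulations (assumed to be
                 produced elsewhere, e.g. by simulating along the phylogeny)

    The +1 in numerator and denominator is the usual correction that keeps
    the estimate strictly positive with a finite number of simulations.
    This is an illustrative choice, not necessarily what ConDor does.
    """
    if not simulated:
        raise ValueError("need at least one simulated count")
    at_least_as_extreme = sum(1 for s in simulated if s >= observed)
    return (at_least_as_extreme + 1) / (len(simulated) + 1)


# Toy, made-up numbers for illustration only.
simulated_counts = [0, 1, 0, 2, 1, 0, 0, 1, 3, 0]
observed_count = 4
p = empirical_p_value(observed_count, simulated_counts)
print(f"empirical p-value: {p:.3f}")
```

A small p-value here would flag the mutation as emerging more often than expected under the neutral simulations; in a real analysis many positions are tested, so some multiple-testing correction would also be needed.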