Abstract
Proteins are the building blocks for almost all functions in cells. Understanding the molecular evolution of proteins and the forces that shape it is an essential step in understanding the basis of function and evolution. Previous studies have shown that adaptation occurs frequently at the protein surface, such as in genes involved in host-pathogen interactions. However, it remains unclear whether, across the genome, adaptive sites are distributed randomly or concentrated in regions associated with particular structural or functional characteristics, since many proteins lack structural or functional annotations. Here, we address this question by combining large-scale bioinformatic prediction, structural analysis, phylogenetic inference, and population genomic analysis of Drosophila protein-coding genes. Although adaptation is more closely related to function-related than to structure-related properties, we observed that physical interactions may play a role in the co-adaptation of fast-adaptive proteins. Importantly, protein-protein and protein-DNA interaction sites are hotspots for protein adaptive evolution, regardless of the level of intrinsic structural disorder or relative solvent accessibility. We found that amino acids strongly differentiated across geographic regions in protein-coding genes are mostly adaptive, which may contribute to long-term adaptive evolution. This strongly indicates that a number of adaptive sites are repeatedly mutated and selected in evolution, in the past, at present, and possibly in the future. Our results suggest important roles for intermolecular interactions and co-adaptation in the adaptive evolution of proteins at both the species and population levels.
Competing Interest Statement
The authors have declared no competing interest.