Abstract
Hotspots of rapid genome evolution hold clues about human adaptation. Here, we present a comparative analysis of nine whole-genome sequenced primates to identify high-confidence targets of positive selection. We find strong statistical evidence for positive selection acting on 331 protein-coding genes (3%), pinpointing 934 adaptively evolving codons (0.014%). Our stringent procedure and quality control of alignments and evolutionary inferences reveal substantial artefacts (20% of initial predictions) that have inflated previous estimates of positive selection, the large majority relating to transcript definitions (61%) or gene models (38%). Our final set of 331 positively selected genes (PSGs) is strongly enriched for innate and adaptive immune functions and for secreted and cell membrane proteins (e.g. pattern recognition, complement, cytokine pathways, defensins, immune receptors, MHC, Siglecs). We also find evidence for positive selection in reproduction, chromosome segregation and meiosis (e.g. centromere-associated CENPO, CENPT), apolipoproteins, smell/taste receptors, and proteins interacting with mitochondrial-encoded molecules. Focusing on virus-host interactions, we retrieve most evolutionary conflicts known to influence antiviral activity (e.g. TRIM5, MAVS, SAMHD1, tetherin) and predict 70 novel cases through integration with virus-host interaction data (virus-human PPIs, immune cell expression, infection screens). Protein structure analysis identifies positive selection in the interaction interfaces between viruses and their human cellular receptors (CD4 – HIV; CD46 [MCP] – measles, adenoviruses; CD55 [DAF] – picornaviruses). Finally, the primate PSGs consistently show high sequence variation in human exomes, suggesting ongoing evolution. Our curated dataset of positively selected genes and positions, available at http://www.cmbi.umcn.nl/~rvdlee/positive_selection/, is a rich source for studying the genetics underlying human (antiviral) phenotypes.