Abstract
The diversity and geographical distribution of populations within major marine microbial lineages are largely governed by temperature and its co-variables. However, neither the mechanisms by which genomic heterogeneity emerges within a single population nor the ways in which it drives the partitioning of ecological niches are well understood. Here we took advantage of billions of metagenomic reads to study one of the most abundant and widespread microbial populations in the surface ocean. We characterized its substantial genomic heterogeneity using single-amino acid variants (SAAVs), and identified systematic purifying selection and adaptive mechanisms governing non-synonymous variation within this population. Our deep learning analysis of SAAVs across metagenomes revealed two main ecological niches that reflect large-scale oceanic current temperatures, as well as six proteotypes demarcating finer-resolved niches. We identified significantly more protein variants in cold currents and an increased number of protein sweeps in warm currents, exposing a global pattern of alternating genomic diversity for this SAR11 population as it drifts along with surface ocean currents. Overall, the geographic partitioning of SAAVs suggests that natural selection, rather than neutral evolution, is the main driver of the evolution of SAR11 in surface oceans.