Abstract
A long-standing question is to what degree genetic drift versus selection drives the divergence in rare accessory gene content between closely related bacteria. Rare genes, including singletons, make up a large proportion of pangenomes (the set of all genes in a set of genomes), but it remains unclear how many such genes are adaptive, deleterious, or neutral to their host genome. Estimates of species’ effective population sizes (Ne) are positively associated with pangenome size and fluidity, an association that has been interpreted, in separate studies, as evidence for both neutral and adaptive pangenome models. We hypothesized that these models could be distinguished if measures of pangenome diversity were normalized by pseudogene diversity, used as a proxy for neutral genic diversity. To this end, we defined the ratio of singleton intact genes to singleton pseudogenes (si/sp) within a pangenome. Across prokaryotic species, this ratio shows a signal consistent with the relative adaptive value of many rare accessory genes. We also identified differences in functional annotations between intact genes and pseudogenes. For instance, transposons are highly enriched among pseudogenes, while most other functional categories are more often intact. Our work demonstrates that including pseudogenes as a neutral reference leads to improved inferences of the evolutionary forces driving pangenome variation.
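As a rough illustration of the si/sp ratio defined above, the following minimal Python sketch counts singleton intact genes and singleton pseudogenes in a toy pangenome. It is not the authors' pipeline: the input layout (a mapping from cluster ID to the genomes carrying it plus a pseudogene flag), the function name `singleton_ratio`, and the treatment of a singleton as a cluster found in exactly one genome are all assumptions made for this example.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical input format (an assumption for this sketch):
# cluster ID -> (set of genomes carrying the cluster, True if it is a pseudogene).
Pangenome = Dict[str, Tuple[Set[str], bool]]


def singleton_ratio(pangenome: Pangenome) -> Optional[float]:
    """Return si/sp, or None when there are no singleton pseudogenes."""
    si = 0  # singleton intact genes
    sp = 0  # singleton pseudogenes
    for genomes, is_pseudogene in pangenome.values():
        # Assumed definition: a singleton occurs in exactly one genome.
        if len(genomes) != 1:
            continue
        if is_pseudogene:
            sp += 1
        else:
            si += 1
    if sp == 0:
        return None  # ratio is undefined without a neutral reference
    return si / sp


# Toy data, invented purely for illustration.
toy = {
    "geneA": ({"g1", "g2", "g3"}, False),  # core gene, ignored
    "geneB": ({"g1"}, False),
    "geneC": ({"g2"}, False),
    "geneD": ({"g3"}, False),
    "pseudoE": ({"g1"}, True),
    "pseudoF": ({"g2"}, True),
    "pseudoG": ({"g1", "g3"}, True),       # shared pseudogene, ignored
}

print(singleton_ratio(toy))  # 3 singleton intact / 2 singleton pseudogenes = 1.5
```

Returning `None` when no singleton pseudogenes are present avoids a division by zero; how such cases should be handled in a real analysis is a separate choice that this sketch does not address.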
Competing Interest Statement
The authors have declared no competing interest.