RT Journal Article
SR Electronic
T1 Modular gene interactions drive modular pan-genome evolution in bacteria
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2022.11.15.516554
DO 10.1101/2022.11.15.516554
A1 Castro, Juan C.
A1 Brown, Sam P.
YR 2022
UL http://biorxiv.org/content/early/2022/11/15/2022.11.15.516554.abstract
AB Depending on the scale of observation, bacterial genomes are both organized and fluid. While individual bacterial genomes show signatures of organization (e.g. operons), pan-genomes reveal genome fluidity, both in terms of gene content and order (synteny). Here we ask how mutational forces (including recombination and horizontal gene transfer) combine with selection and gene interactions to shape genome organization and variation both within and across strains. We first build an evolutionary simulation model to assess the impact of gene interactions on pan-genome structure. A neutral evolutionary model can produce transient co-segregation of initially linked genes but is vulnerable on longer time-scales to perturbing mutational events. In contrast, incorporation of modular gene fitness interactions can produce sustainable clusters of linked and co-segregating genes, with the network of co-segregation recapitulating the defined simulation ‘ground-truth’ network of gene interactions. To test our model predictions, we exploit the increasing number of closed genomes in model species to investigate gene interactions in the pan-genomes of Escherichia coli and Pseudomonas aeruginosa. Using these highly curated pan-genomes, we show that the co-segregation networks for P. aeruginosa and E. coli are modular, associate with physical linkage, and most closely map known metabolic networks (for P. aeruginosa) and regulatory networks (for E. coli).
The results imply that co-segregation networks can contribute to accessory genome annotation, and more generally that gene interactions are the primary force shaping genome structure and operon evolution.
Significance Statement: Bacterial pan-genomes represent extraordinary genomic diversity yet remain relatively unexplored due to a research focus on lab reference strains. We exploit the growing availability of closed genomes to build pan-genomes where we can track the physical linkage of all genes. Through a combination of evolutionary simulations and data analysis, we ask how mutation, selection and gene interactions combine to shape genome structural organization (linkage) and variation (co-segregation) across strains. We show that co-segregation networks are modular, associate with physical linkage, and map to metabolic (for P. aeruginosa) and regulatory networks (for E. coli). The results imply that modular gene interactions are sufficient to guide the evolution of persistent gene clusters and are the primary force shaping genome structural evolution.
Competing Interest Statement: The authors have declared no competing interest.