Abstract
Escherichia coli ST131 is a major cause of infection with extensive antimicrobial resistance (AMR) facilitated by widespread beta-lactam antibiotic use. This drug pressure has driven extended-spectrum beta-lactamase (ESBL) gene acquisition and evolution in pathogens, so a clearer resolution of ST131’s origin, adaptation and spread is essential. Its ESBL genes are typically embedded in mobile genetic elements (MGEs) that aid transfer to new plasmid or chromosomal locations, which are mobilised further by plasmid conjugation and recombination, resulting in a flexible ESBL, MGE and plasmid composition with a conserved core genome. We used population genomics to trace the evolution of AMR in ST131 more precisely by extracting all available high-quality Illumina HiSeq read libraries to investigate 4,071 globally sourced genomes, the largest ST131 collection examined so far. We applied rigorous quality control, de novo genome assembly and ESBL gene screening to resolve ST131’s population structure across three genetically distinct clades (A, B, C) and abundant subclades within the dominant clade C. We reconstructed their evolutionary relationships across the core and accessory genomes using published reference genomes, long-read assemblies and k-mer-based methods to contextualise pangenome diversity. The three main C subclades have co-circulated globally at relatively stable frequencies over time, suggesting that they attained an equilibrium after their origin and initial rapid spread. This contrasted with their ESBL genes, which had a less constrained pattern and stronger population structure. Within these subclades, diversity levels of the core and accessory genomes were not correlated, due to plasmid and MGE activity. Our findings emphasise the potential of evolutionary pangenomics to improve our understanding of AMR gene transfer, adaptation and transmission, and to discover accessory genome changes linked to emerging outbreaks and novel subtypes.
Significance
Multidrug-resistant Escherichia coli are a major global public health concern, among which ST131 is a pandemic subtype that is the most common cause of urinary tract infections. This study carefully assembled the genomes of 4,071 ST131 isolates collected between 1967 and 2018 to determine its subclades’ epidemiological, evolutionary and genetic features relevant to their antibiotic resistance genes. We found that the relative frequencies of ST131 subclades were stable over time, suggesting that they may have spread rapidly after their origin before stabilising. In contrast to core genome analysis documenting the global co-circulation of subclades C1 and C2, key antibiotic resistance genes in the accessory genome had stronger geographic, genetic and temporal patterns. Additionally, extensive plasmid variation among isolates with nearly identical chromosomes was discovered using multiple methods. This population genomic study highlights the dynamic nature of the accessory genomes in ST131, suggesting that surveillance should anticipate genetically novel outbreaks with broader antibiotic resistance levels.