Abstract
Mobile genetic elements with circular genomes play a key role in the evolution of microbial communities. These circular genomes correspond to cyclic paths in metagenome graphs, yet assemblies derived from natural microbial communities produce graphs riddled with spurious cycles, which complicates the accurate reconstruction of circular genomes. We present an algorithm that reconstructs true circular genomes by identifying so-called ‘dominant’ cycles. The algorithm leverages paired reads to bridge gaps between assembly contigs and scrutinizes cycles through a nucleotide-level analysis, making the approach robust to mis-assembly artifacts. We validated the approach using simulated and reference data. Applying it to 32 publicly available DNA shotgun sequence data sets from diverse natural environments led to the reconstruction of hundreds of circular mobile genomes. Clustering revealed 20 clusters of cryptic, prevalent, and abundant plasmids that have clonal population structures with surprisingly recent common ancestors. This work enables the robust study of the evolution and spread of mobile elements in natural settings.
Competing Interest Statement
The authors have declared no competing interest.