RT Journal Article
SR Electronic
T1 Ultra-efficient, unified discovery from microbial sequencing with SPLASH and precise statistical assembly
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2024.01.18.576133
DO 10.1101/2024.01.18.576133
A1 Henderson, George
A1 Gudys, Adam
A1 Baharav, Tavor
A1 Sundaramurthy, Punit
A1 Kokot, Marek
A1 Wang, Peter L.
A1 Deorowicz, Sebastian
A1 Carey, Allison F.
A1 Salzman, Julia
YR 2024
UL http://biorxiv.org/content/early/2024/01/22/2024.01.18.576133.abstract
AB Bacteria comprise > 12% of Earth’s biomass and profoundly impact human and planetary health [1]. Many key biological functions of microbes, and functions differentiating strains, are conferred or modified by genome plasticity including mobilization of genetic elements, phage integration, and CRISPR arrays. Characterizing each of these processes is time-consuming and requires custom bioinformatic workflows ill-suited to enable discovery of new sources of genetic diversity or to uncover which elements are active. Further, strain typing of bacterial species and approaches to discriminate sub-populations remain time-consuming and resource intensive. Here, we show that SPLASH, our published approach for reference-free discovery and analysis directly from raw reads, and an improved statistical assembly algorithm, compactors, unify diverse tasks in microbial sequence analysis: discovering new mobile elements and CRISPR arrays missing from any reference, and generating rapid, metadata-free strain typing of diverse bacteria. SPLASH and compactors together constitute a new general discovery tool for biological discovery in the microbial world. Competing Interest Statement: The authors have declared no competing interest.