PT - JOURNAL ARTICLE
AU - Zielezinski, Andrzej
AU - Gudyś, Adam
AU - Barylski, Jakub
AU - Siminski, Krzysztof
AU - Rozwalak, Piotr
AU - Dutilh, Bas E.
AU - Deorowicz, Sebastian
TI - Ultrafast and accurate sequence alignment and clustering of viral genomes
AID - 10.1101/2024.06.27.601020
DP - 2024 Jan 01
TA - bioRxiv
PG - 2024.06.27.601020
4099 - http://biorxiv.org/content/early/2024/07/02/2024.06.27.601020.short
4100 - http://biorxiv.org/content/early/2024/07/02/2024.06.27.601020.full
AB - Viromics produces millions of viral genomes and fragments annually, overwhelming traditional sequence comparison methods. We introduce Vclust, a novel approach that determines average nucleotide identity by Lempel-Ziv parsing and clusters viral genomes with thresholds endorsed by authoritative viral genomics and taxonomy consortia. Vclust demonstrates superior accuracy and efficiency compared to existing tools, clustering millions of virus genomes in a few hours on a mid-range workstation. Competing Interest Statement: The authors have declared no competing interest.