TY  - JOUR
T1  - Sourmash Branchwater Enables Lightweight Petabyte-Scale Sequence Search
JF  - bioRxiv
DO  - 10.1101/2022.11.02.514947
SP  - 2022.11.02.514947
AU  - Luiz Irber
AU  - N. Tessa Pierce-Ward
AU  - C. Titus Brown
Y1  - 2022/01/01
UR  - http://biorxiv.org/content/early/2022/11/03/2022.11.02.514947.abstract
N2  - We introduce branchwater, a flexible and fast petabase-scale search for the 767,000 public metagenomes presently in the NCBI Sequence Read Archive. Our search is based on the FracMinHash k-mer sketching technique and can search all public metagenomes with 1000 query genomes in approximately 36 hours using 50 GB of RAM and 32 threads. Branchwater is a Rust-based multithreading front-end built on top of the sourmash library. We provide biological use cases, examine performance, and discuss design and performance considerations. Competing Interest Statement: The authors have declared no competing interest.
ER  - 