RT Journal Article
SR Electronic
T1 Identifying Taxonomic Units in Metagenomic DNA Streams
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.08.21.261313
DO 10.1101/2020.08.21.261313
A1 Vicky Zheng
A1 Ahmet Erdem Sariyuce
A1 Jaroslaw Zola
YR 2020
UL http://biorxiv.org/content/early/2020/09/29/2020.08.21.261313.abstract
AB With the emergence of portable DNA sequencers, such as the Oxford Nanopore Technologies MinION, metagenomic DNA sequencing can be performed in real time and directly in the field. However, because metagenomic DNA analysis is computationally and memory intensive, and current methods are designed for batch processing, existing metagenomic tools are not well suited for mobile devices. In this paper, we propose a new memory-efficient method to identify Operational Taxonomic Units (OTUs) in metagenomic DNA streams. Our method is based on finding connected components in overlap graphs constructed over a real-time stream of long DNA reads, as produced by the MinION platform. We propose an efficient algorithm to maintain connected components when an overlap graph is streamed, and show how redundant information can be removed from the stream by transitive closures. Through experiments on simulated and real-world metagenomic data, we demonstrate that the resulting solution is able to recover OTUs with high precision while remaining suitable for mobile computing devices. Competing Interest Statement: The authors have declared no competing interest.