RT Journal Article SR Electronic T1 A deep learning framework for real-time detection of novel pathogens during sequencing JF bioRxiv FD Cold Spring Harbor Laboratory SP 2021.01.26.428301 DO 10.1101/2021.01.26.428301 A1 Jakub M. Bartoszewicz A1 Ulrich Genske A1 Bernhard Y. Renard YR 2021 UL http://biorxiv.org/content/early/2021/01/27/2021.01.26.428301.abstract AB Motivation: Novel pathogens evolve quickly and may emerge rapidly, causing dangerous outbreaks or even global pandemics. Next-generation sequencing is the state of the art in open-view pathogen detection, and one of the few methods available at the earliest stages of an epidemic, even when the biological threat is unknown. Analyzing the samples as the sequencer is running can greatly reduce the turnaround time, but existing tools rely on close matches to lists of known pathogens and perform poorly on novel species. Machine learning approaches can predict if single reads originate from more distant, unknown pathogens, but require relatively long input sequences and processed data from a finished sequencing run. Results: We present DeePaC-Live, a Python package for real-time pathogenic potential prediction directly from incomplete sequencing reads. We train deep neural networks to classify Illumina and Nanopore reads and integrate our models with HiLive2, a real-time Illumina mapper. DeePaC-Live outperforms alternatives based on machine learning and sequence alignment on simulated and real data, including SARS-CoV-2 sequencing runs. After just 50 Illumina cycles, we increase the true positive rate 80-fold compared to the live-mapping approach. The first 250bp of Nanopore reads, corresponding to 0.5s of sequencing time, are enough to yield predictions more accurate than mapping the finished long reads. Our approach could also be used for screening synthetic sequences against biosecurity threats. Availability: The code is available at: https://gitlab.com/dacs-hpi/deepac-live and https://gitlab.com/dacs-hpi/deepac.
The package can be installed with Bioconda, Docker or pip. Contact: Jakub.Bartoszewicz{at}hpi.de, Bernhard.Renard{at}hpi.de Supplementary information: Supplementary data are available online. Competing Interest Statement: The authors have declared no competing interest.