RT Journal Article
SR Electronic
T1 Classes for the masses: Systematic classification of unknowns using fragmentation spectra
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.04.17.046672
DO 10.1101/2020.04.17.046672
A1 Kai Dührkop
A1 Louis Felix Nothias
A1 Markus Fleischauer
A1 Marcus Ludwig
A1 Martin A. Hoffmann
A1 Juho Rousu
A1 Pieter C. Dorrestein
A1 Sebastian Böcker
YR 2020
UL http://biorxiv.org/content/early/2020/04/18/2020.04.17.046672.abstract
AB Metabolomics experiments can employ non-targeted tandem mass spectrometry to detect hundreds to thousands of molecules in a biological sample. Structural annotation of molecules is typically carried out by searching their fragmentation spectra in spectral libraries or, recently, in structure databases. Annotations are limited to structures present in the library or database employed, prohibiting a thorough utilization of the experimental data. We present a computational tool for systematic compound class annotation: CANOPUS uses a deep neural network to predict 1,270 compound classes from fragmentation spectra, and explicitly targets compounds where neither spectral nor structural reference data are available. CANOPUS even predicts classes for which no MS/MS training data are available. We demonstrate the broad utility of CANOPUS by investigating the effect of the microbial colonization in the digestive system in mice, and through analysis of the chemodiversity of different Euphorbia plants; both uniquely revealing biological insights at the compound class level. Competing Interest Statement: SB, KD, ML, MF, and MAH are co-founders of Bright Giant GmbH. PCD is scientific advisor for Sirenas LLC.