RT Journal Article
SR Electronic
T1 Algorithmic Learning for Auto-deconvolution of GC-MS Data to Enable Molecular Networking within GNPS
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.01.13.905091
DO 10.1101/2020.01.13.905091
A1 Alexander A. Aksenov
A1 Ivan Laponogov
A1 Zheng Zhang
A1 Sophie LF Doran
A1 Ilaria Belluomo
A1 Dennis Veselkov
A1 Wout Bittremieux
A1 Louis Felix Nothias
A1 Mélissa Nothias-Esposito
A1 Katherine N. Maloney
A1 Biswapriya B. Misra
A1 Alexey V. Melnik
A1 Kenneth L. Jones II
A1 Kathleen Dorrestein
A1 Morgan Panitchpakdi
A1 Madeleine Ernst
A1 Justin J.J. van der Hooft
A1 Mabel Gonzalez
A1 Chiara Carazzone
A1 Adolfo Amézquita
A1 Chris Callewaert
A1 James Morton
A1 Robert Quinn
A1 Amina Bouslimani
A1 Andrea Albarracín Orio
A1 Daniel Petras
A1 Andrea M. Smania
A1 Sneha P. Couvillion
A1 Meagan C. Burnet
A1 Carrie D. Nicora
A1 Erika Zink
A1 Thomas O. Metz
A1 Viatcheslav Artaev
A1 Elizabeth Humston-Fulmer
A1 Rachel Gregor
A1 Michael M. Meijler
A1 Itzhak Mizrahi
A1 Stav Eyal
A1 Brooke Anderson
A1 Rachel Dutton
A1 Raphaël Lugan
A1 Pauline Le Boulch
A1 Yann Guitton
A1 Stephanie Prevost
A1 Audrey Poirier
A1 Gaud Dervilly
A1 Bruno Le Bizec
A1 Aaron Fait
A1 Noga Sikron Persi
A1 Chao Song
A1 Kelem Gashu
A1 Roxana Coras
A1 Monica Guma
A1 Julia Manasson
A1 Jose U. Scher
A1 Dinesh Barupal
A1 Saleh Alseekh
A1 Alisdair Fernie
A1 Reza Mirnezami
A1 Vasilis Vasiliou
A1 Robin Schmid
A1 Roman S. Borisov
A1 Larisa N. Kulikova
A1 Rob Knight
A1 Mingxun Wang
A1 George B Hanna
A1 Pieter C. Dorrestein
A1 Kirill Veselkov
YR 2020
UL http://biorxiv.org/content/early/2020/01/14/2020.01.13.905091.abstract
AB Gas chromatography-mass spectrometry (GC-MS) represents an analytical technique with significant practical societal impact. Spectral deconvolution is an essential step for interpreting GC-MS data. No public GC-MS repositories that also enable repository-scale analysis exist, in part because deconvolution requires significant user input.
We therefore engineered a scalable machine learning workflow for the Global Natural Product Social Molecular Networking (GNPS) analysis platform to enable the mass spectrometry community to store, process, share, annotate, compare, and perform molecular networking of GC-MS data. The workflow performs auto-deconvolution of compound fragmentation patterns via unsupervised non-negative matrix factorization, using a Fast Fourier Transform-based strategy to overcome scalability limitations. We introduce a “balance score” that quantifies the reproducibility of fragmentation patterns across all samples. We demonstrate the utility of the platform with breathomics analysis applied to the early detection of oesophago-gastric cancer, and by creating the first molecular spatial map of the human volatilome.