TY - JOUR
T1 - Algorithmic Learning for Auto-deconvolution of GC-MS Data to Enable Molecular Networking within GNPS
JF - bioRxiv
DO - 10.1101/2020.01.13.905091
SP - 2020.01.13.905091
AU - Alexander A. Aksenov
AU - Ivan Laponogov
AU - Zheng Zhang
AU - Sophie LF Doran
AU - Ilaria Belluomo
AU - Dennis Veselkov
AU - Wout Bittremieux
AU - Louis Felix Nothias
AU - Mélissa Nothias-Esposito
AU - Katherine N. Maloney
AU - Biswapriya B. Misra
AU - Alexey V. Melnik
AU - Kenneth L. Jones II
AU - Kathleen Dorrestein
AU - Morgan Panitchpakdi
AU - Madeleine Ernst
AU - Justin J.J. van der Hooft
AU - Mabel Gonzalez
AU - Chiara Carazzone
AU - Adolfo Amézquita
AU - Chris Callewaert
AU - James Morton
AU - Robert Quinn
AU - Amina Bouslimani
AU - Andrea Albarracín Orio
AU - Daniel Petras
AU - Andrea M. Smania
AU - Sneha P. Couvillion
AU - Meagan C. Burnet
AU - Carrie D. Nicora
AU - Erika Zink
AU - Thomas O. Metz
AU - Viatcheslav Artaev
AU - Elizabeth Humston-Fulmer
AU - Rachel Gregor
AU - Michael M. Meijler
AU - Itzhak Mizrahi
AU - Stav Eyal
AU - Brooke Anderson
AU - Rachel Dutton
AU - Raphaël Lugan
AU - Pauline Le Boulch
AU - Yann Guitton
AU - Stephanie Prevost
AU - Audrey Poirier
AU - Gaud Dervilly
AU - Bruno Le Bizec
AU - Aaron Fait
AU - Noga Sikron Persi
AU - Chao Song
AU - Kelem Gashu
AU - Roxana Coras
AU - Monica Guma
AU - Julia Manasson
AU - Jose U. Scher
AU - Dinesh Barupal
AU - Saleh Alseekh
AU - Alisdair Fernie
AU - Reza Mirnezami
AU - Vasilis Vasiliou
AU - Robin Schmid
AU - Roman S. Borisov
AU - Larisa N. Kulikova
AU - Rob Knight
AU - Mingxun Wang
AU - George B Hanna
AU - Pieter C. Dorrestein
AU - Kirill Veselkov
Y1 - 2020/01/01
UR - http://biorxiv.org/content/early/2020/01/14/2020.01.13.905091.abstract
N2 - Gas chromatography-mass spectrometry (GC-MS) represents an analytical technique with significant practical societal impact. Spectral deconvolution is an essential step for interpreting GC-MS data. No public GC-MS repositories that also enable repository-scale analysis exist, in part because deconvolution requires significant user input. We therefore engineered a scalable machine learning workflow for the Global Natural Product Social Molecular Networking (GNPS) analysis platform to enable the mass spectrometry community to store, process, share, annotate, compare, and perform molecular networking of GC-MS data. The workflow performs auto-deconvolution of compound fragmentation patterns via unsupervised non-negative matrix factorization, using a Fast Fourier Transform-based strategy to overcome scalability limitations. We introduce a “balance score” that quantifies the reproducibility of fragmentation patterns across all samples. We demonstrate the utility of the platform with breathomics analysis applied to the early detection of oesophago-gastric cancer, and by creating the first molecular spatial map of the human volatilome.
ER -