PT - JOURNAL ARTICLE
AU - Aida Mrzic
AU - Pieter Meysman
AU - Wout Bittremieux
AU - Kris Laukens
TI - Automated recommendation of metabolite substructures from mass spectra using frequent pattern mining
AID - 10.1101/134189
DP - 2017 Jan 01
TA - bioRxiv
PG - 134189
4099 - http://biorxiv.org/content/early/2017/06/28/134189.short
4100 - http://biorxiv.org/content/early/2017/06/28/134189.full
AB - Despite the increasing importance of metabolomics approaches, the structural elucidation of metabolites from mass spectral data remains a challenge. Although several reliable tools exist to identify known metabolites, identifying compounds that have not been seen before is a task that still eludes modern bioinformatics tools. Here, we describe an automated method for substructure recommendation from mass spectra using pattern mining techniques. Based on previously seen recurring substructures, our approach succeeds in identifying parts of unknown metabolites. An important advantage of this approach is that it does not require any prior information about the metabolites to be identified, and it can therefore be used for the (partial) identification of unknown unknowns. Using association rule mining, we are able to recommend valid substructures even for those metabolites for which no match can be found in spectral libraries or structural databases. We further demonstrate how this approach is complementary to existing metabolite identification tools, achieving improved identification results. The method is called MESSAR (MEtabolite SubStructure Auto-Recommender) and is implemented as a free online web service available at http://www.biomina.be/apps/MESSAR/.
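
The abstract names association rule mining as the technique used to link spectral features to substructures. The sketch below illustrates that general idea only and is not MESSAR's implementation. The training data, the feature and substructure labels, the thresholds, and the function names `mine_rules` and `recommend` are all hypothetical, and support is counted as an absolute number of training spectra.

```python
from itertools import combinations

# Hypothetical toy training set: each entry pairs the spectral features of a
# known compound with its annotated substructures. All labels are invented
# placeholders, not values from the paper.
TRAINING = [
    ({"loss_18.011", "peak_91.054"}, {"hydroxyl", "benzyl"}),
    ({"loss_18.011", "peak_77.039"}, {"hydroxyl", "phenyl"}),
    ({"loss_43.990", "peak_91.054"}, {"carboxyl", "benzyl"}),
    ({"loss_18.011", "loss_43.990"}, {"hydroxyl", "carboxyl"}),
]


def mine_rules(training, min_support=2, min_confidence=0.8, max_size=2):
    """Mine rules of the form (feature set) -> substructure.

    support    = number of training spectra containing the feature set
                 AND annotated with the substructure
    confidence = support / number of spectra containing the feature set
    """
    rules = []
    all_features = sorted(set().union(*(feats for feats, _ in training)))
    for size in range(1, max_size + 1):
        for combo in combinations(all_features, size):
            antecedent = frozenset(combo)
            matching = [subs for feats, subs in training if antecedent <= feats]
            if len(matching) < min_support:
                continue  # feature set is too rare to support any rule
            for sub in sorted(set().union(*matching)):
                support = sum(sub in subs for subs in matching)
                confidence = support / len(matching)
                if support >= min_support and confidence >= min_confidence:
                    rules.append((antecedent, sub, support, confidence))
    return rules


def recommend(rules, spectrum_features):
    """Rank substructures whose rule antecedent occurs in the query spectrum."""
    best = {}
    for antecedent, sub, _support, confidence in rules:
        if antecedent <= spectrum_features:
            best[sub] = max(best.get(sub, 0.0), confidence)
    # Highest confidence first; ties are broken alphabetically.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


rules = mine_rules(TRAINING)
# The query contains a feature never seen in training ("peak_105.070"),
# yet the known features still yield partial recommendations.
query = {"loss_18.011", "peak_91.054", "peak_105.070"}
recommendations = recommend(rules, query)
```

Because the rules fire on individual features rather than on a whole-spectrum match, a query with no counterpart in the training set can still receive partial substructure recommendations, which is the property the abstract highlights for unknown unknowns.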