PT - JOURNAL ARTICLE
AU - Eugene A. Kapp
AU - Giuseppe Infusini
AU - Yunshan Zhong
AU - Laura F. Dagley
AU - Terence P. Speed
AU - Andrew I. Webb
TI - MSCypher: an integrated database searching and machine learning workflow for multiplexed proteomics
AID - 10.1101/397257
DP - 2018 Jan 01
TA - bioRxiv
PG - 397257
4099 - http://biorxiv.org/content/early/2018/08/22/397257.short
4100 - http://biorxiv.org/content/early/2018/08/22/397257.full
AB - Improvements in shotgun proteomics approaches are hampered by increases in multiplexed (chimeric) spectra, as improvements in peak capacity, sensitivity or dynamic range all increase the number of co-eluting peptides. This results in diminishing returns using traditional search algorithms, as co-fragmented spectra are known to decrease identification rates. Here we describe MSCypher, a freely available software suite that enables an extensible workflow including a hybrid supervised machine learned strategy that dynamically adjusts to individual datasets. This results in improved identification rates and quantification of low-abundant peptides and proteins. In addition, the integration of peptide de novo sequencing and database searching enables an unbiased view of variants and high-intensity unassigned peptide spectral matches. Highlights: Open-source end-to-end label-free proteomics workflow; Integrated database searching and machine learning; Customisable and extensible workflow including de novo sequencing; Optimised for multiplexed spectra, challenging proteomics datasets and peptidomics applications.