Summary
Improvements in shotgun proteomics are hampered by a rising number of multiplexed (chimeric) spectra, because gains in peak capacity, sensitivity or dynamic range all increase the number of co-eluting peptides. This leads to diminishing returns with traditional search algorithms, as co-fragmented spectra are known to decrease identification rates. Here we describe MSCypher, a freely available software suite that enables an extensible workflow, including a hybrid supervised machine-learning strategy that dynamically adjusts to individual datasets. This results in improved identification rates and quantification of low-abundance peptides and proteins. In addition, the integration of de novo peptide sequencing with database searching enables an unbiased view of variants and high-intensity unassigned peptide-spectrum matches.
Highlights
Open-source end-to-end label-free proteomics workflow
Integrated database searching and machine learning
Customisable and extensible workflow including de novo sequencing
Optimised for multiplexed spectra, challenging proteomics datasets and peptidomics applications
Footnotes
* Co-first author