RT Journal Article
SR Electronic
T1 Open-pFind enables precise, comprehensive and rapid peptide identification in shotgun proteomics
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 285395
DO 10.1101/285395
A1 Chi, Hao
A1 Liu, Chao
A1 Yang, Hao
A1 Zeng, Wen-Feng
A1 Wu, Long
A1 Zhou, Wen-Jing
A1 Niu, Xiu-Nan
A1 Ding, Yue-He
A1 Zhang, Yao
A1 Wang, Rui-Min
A1 Wang, Zhao-Wei
A1 Chen, Zhen-Lin
A1 Sun, Rui-Xiang
A1 Liu, Tao
A1 Tan, Guang-Ming
A1 Dong, Meng-Qiu
A1 Xu, Ping
A1 Zhang, Pei-Heng
A1 He, Si-Min
YR 2018
UL http://biorxiv.org/content/early/2018/03/20/285395.abstract
AB Shotgun proteomics has grown rapidly in recent decades, but a large fraction of tandem mass spectrometry (MS/MS) data in shotgun proteomics are not successfully identified. We have developed a novel database search algorithm, Open-pFind, to efficiently identify peptides even in an ultra-large search space that takes into account unexpected modifications, amino acid mutations, semi- or non-specific digestion and co-eluting peptides. Tested on two metabolically labeled MS/MS datasets, Open-pFind reported 50.5‒117.0% more peptide-spectrum matches (PSMs) than the seven other advanced algorithms. More importantly, the Open-pFind results were more credible, as judged by the verification experiments using stable isotopic labeling. Tested on four additional large-scale datasets, 70‒85% of the spectra were confidently identified, and high-quality spectra were nearly completely interpreted by Open-pFind. Further, Open-pFind was over 40 times faster than the other three open search algorithms and 2‒3 times faster than three restricted search algorithms. Re-analysis of an entire human proteome dataset consisting of ~25 million spectra using Open-pFind identified a total of 14,064 proteins encoded by 12,723 genes by requiring at least two uniquely identified peptides. In these search results, Open-pFind also excelled in an independent test for false positives based on the presence or absence of olfactory receptors. Thus, a practical use of the open search strategy has been realized by Open-pFind for the truly global-scale proteomics experiments of today and in the future.