RT Journal Article
SR Electronic
T1 Measuring proteomes with long strings: A new, unconstrained paradigm in mass spectrum interpretation
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 282624
DO 10.1101/282624
A1 Arun Devabhaktuni
A1 Niclas Olsson
A1 Carlos Gonzales
A1 Keith Rawson
A1 Kavya Swaminathan
A1 Joshua E. Elias
YR 2018
UL http://biorxiv.org/content/early/2018/03/15/282624.abstract
AB Thousands of protein post-translational modifications (PTMs) dynamically impact nearly all cellular functions. Mass spectrometry is well suited to PTM identification, but proteome-scale analyses are biased towards PTMs with existing enrichment methods. To measure the full landscape of PTM regulation, software must overcome two fundamental challenges: intractably large search spaces and difficulty distinguishing correct from incorrect identifications. Here, we describe TagGraph, software that overcomes both challenges with a string-based search method orders of magnitude faster than current approaches, and probabilistic validation model optimized for PTM assignments. When applied to a human proteome map, TagGraph tripled confident identifications while revealing thousands of modification types on nearly one million sites spanning the proteome. We expand known sites by orders of magnitude for highly abundant yet understudied PTMs such as proline hydroxylation, and derive tissue-specific insight into these PTMs’ roles. TagGraph expands our ability to survey the full landscape of PTM function and regulation.