PT - JOURNAL ARTICLE
AU - Marc Creixell
AU - Aaron S. Meyer
TI - Motif-based phosphoproteome clustering improves modeling and interpretation
AID - 10.1101/2021.06.09.447799
DP - 2021 Jan 01
TA - bioRxiv
PG - 2021.06.09.447799
4099 - http://biorxiv.org/content/early/2021/06/10/2021.06.09.447799.short
4100 - http://biorxiv.org/content/early/2021/06/10/2021.06.09.447799.full
AB - Cell signaling is orchestrated in part through a network of protein kinases and phosphatases. Dysregulation of kinase signaling is widespread in diseases such as cancer and is readily targetable through inhibitors of kinase enzymatic activity. Mass spectrometry-based analysis of kinase signaling can provide a global view of kinase signaling regulation but making sense of these data is complicated by its stochastic coverage of the proteome, measurement of substrates rather than kinase signaling itself, and the scale of the data collected. Here, we implement a dual data and motif clustering strategy (DDMC) that simultaneously clusters substrate peptides into similarly regulated groups based on their variation within an experiment and their sequence profile. We show that this can help to identify putative upstream kinases and supply more robust clustering. We apply this clustering to large-scale clinical proteomic profiling of lung cancer and identify conserved proteomic signatures of tumorigenicity, genetic mutations, and tumor immune infiltration. We propose that DDMC provides a general and flexible clustering strategy for the analysis of phosphoproteomic data.
One-sentence Summary: DDMC is a general and flexible strategy for phosphoproteomic analysis by clustering phosphopeptides using both their phosphorylation abundance and sequence motifs.
Competing Interest Statement: The authors have declared no competing interest.