PTMNavigator: Interactive Visualization of Differentially Regulated Post-Translational Modifications in Cellular Signaling Pathways

Post-translational modifications (PTMs) play pivotal roles in regulating cellular signaling, fine-tuning protein function, and orchestrating complex biological processes. Despite their importance, the lack of comprehensive tools for studying PTMs from a pathway-centric perspective has limited our ability to understand how PTMs modulate cellular pathways on a molecular level. Here, we present PTMNavigator, a tool integrated into the ProteomicsDB platform, which offers an interactive interface for researchers to overlay experimental PTM data with pathway diagrams. PTMNavigator provides ∼3000 canonical pathways from manually curated databases and further enables users to modify and create custom diagrams, tailored to their data. Additionally, PTMNavigator automatically runs multiple kinase and pathway enrichment algorithms whose results are directly integrated into the visualization. This offers a comprehensive view of the intricate relationship between PTMs and signaling pathways. To demonstrate the utility of PTMNavigator, we applied it to two phosphoproteomics perturbation datasets. First, PTMNavigator enhanced pathway enrichment analysis by showing how the regulated peptides and proteins are distributed in the pathways with high enrichment scores. Second, it visualized how drug treatments result in a discernable flow of PTM-driven signaling within pathways. Third, PTMNavigator aided in proposing extensions to an existing pathway by suggesting putative new links between both PTMs and pathway components. By enhancing our understanding of cellular signaling dynamics and facilitating the discovery of novel PTM-pathway interactions, PTMNavigator advances our knowledge of PTM biology and its implications in health and disease.


INTRODUCTION
Post-translational modifications (PTMs) are key regulators of cellular pathways in eukaryotes. The presence of a certain PTM on a specific residue can, for instance, determine whether or not the protein can interact with other proteins, activate or inactivate an enzyme, alter a protein's location in the cell, or mark it for degradation. Understanding the specific role of various types of PTMs within intracellular signaling cascades is essential to gain insights into the molecular mechanisms that govern biological processes. Perturbation studies that examine the effects of altered PTM patterns on pathways have been particularly useful in advancing our knowledge (1)(2)(3)(4)(5)(6)(7). While proteome-wide measurements of PTMs have become a routine procedure for proteomics laboratories, their correct interpretation in the context of cellular signaling pathways continues to pose a daily challenge.
Multiple databases are available to the scientific community that are dedicated to the study of pathways (8)(9)(10)(11), including the Kyoto Encyclopedia of Genes and Genomes (KEGG) and WikiPathways. KEGG (9) contains 563 manually curated pathway maps with annotations for almost 9,000 different organisms. WikiPathways (10) is an open-source initiative to collect, maintain, and disseminate data on biological pathways. Its March 2023 release consists of 3,130 pathways across 33 species. Both of these databases are designed to study relationships between genes, gene products, and compounds (such as metabolites or drugs). They offer no (KEGG) or only limited (WikiPathways) direct support for research on the role of PTMs within pathways.
For the study of PTM data, several resources exist, the most notable being PhosphoSitePlus (12), which includes data from a large number of research articles not only on phosphorylation, but also acetylation, ubiquitinylation, and other PTMs. Where evidence exists, the sites in PhosphoSitePlus are annotated with their functions and putative upstream or downstream interactors or modifiers. While PhosphoSitePlus has been a tremendously useful resource, it remains tedious to explore multiple or all sites of a signaling cascade together in order to uncover potential relationships between them and perform higher-level analysis of the data.
To place all regulated peptides into the context of signaling networks, a commonly employed method is pathway enrichment analysis (PEA) (13), for example using online tools such as g:Profiler (14). Multiple PEA algorithms exist, including Gene Set Enrichment Analysis (GSEA) (15), all of which consist of the same two steps: first, the list of perturbed genes is compared to the set of genes contained in each canonical pathway; second, pathways for which the overlap is larger than expected by random chance are identified by a statistical test. As a result, all PEAs output a list of pathway names together with their statistical significance (16). While very useful for the analysis of proteome expression changes, this approach has several shortcomings in the context of PTM analysis. First, in a PEA, a pathway is regarded simply as a set of genes, and it is not considered that some of them might be more important to the activity of the pathway than others. Second, all genes that are part of the pathway are treated the same, no matter whether they are negatively or positively correlated with the activity of the pathway. Third, investigators do not see whether the regulations are randomly distributed across the pathway (which might imply (17). However, PTM-SEA so far only covers a fraction of the pathways deposited in KEGG and WikiPathways, and has the same shortcoming as other PEA tools in that users only obtain the names of the putatively regulated pathways, rather than how their data maps to the nodes in the pathways.
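The two PEA steps described above (overlap, then a significance test) can be illustrated with a one-sided hypergeometric over-representation test. This is only a minimal sketch of the general idea: the function name and the choice of test are assumptions made for illustration, and individual tools such as g:Profiler or GSEA may use different statistics and multiple-testing corrections.

```python
from math import comb


def overrepresentation_p_value(regulated, pathway, background):
    """Return P(overlap >= observed) under a hypergeometric null model.

    regulated:  genes found to be perturbed in the experiment
    pathway:    genes annotated to one canonical pathway
    background: all genes that could have been observed
    (Hypothetical helper for illustration, not part of any PEA tool.)
    """
    background = set(background)
    regulated = set(regulated) & background
    pathway = set(pathway) & background

    n_total = len(background)
    n_pathway = len(pathway)
    n_regulated = len(regulated)
    observed = len(regulated & pathway)

    # Step 1 is the overlap above; step 2 sums the upper tail of the
    # hypergeometric distribution starting at the observed overlap.
    largest_possible = min(n_pathway, n_regulated)
    favourable = sum(
        comb(n_pathway, k) * comb(n_total - n_pathway, n_regulated - k)
        for k in range(observed, largest_possible + 1)
    )
    return favourable / comb(n_total, n_regulated)
```

Note that the test only sees set membership: it carries no information about where in the pathway the regulated genes sit or in which direction they act, which is the limitation discussed in this paragraph.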
A number of software tools to visualize PTM datasets and pathways have been developed (18)(19)(20)(21)(22)(23). For instance, Phosphomatics (23) is a web service that helps researchers to examine kinase-substrate relationships in their phosphoproteomic datasets, whereas KEGGViewer (19) can integrate gene expression profiles and KEGG pathways. A functionality which these tools have been lacking so far is the visualization of regulated PTMs directly within biological networks. The Cytoscape (18) app PhosphoPath (21) has addressed this issue for time-course PTM data, but the app is no longer supported by Cytoscape.
To overcome the aforementioned problems of pathway enrichment on the PTM level and to empower researchers to quickly examine pathway engagement via PTM regulation, we introduce PTMNavigator, a web application that projects PTM perturbation datasets onto canonical pathways in the form of interactive graphs. This enables users to trace signaling cascades across a cell and identify pivotal PTMs within a pathway. PTMNavigator is available as part of the 'Analytics Toolbox' of ProteomicsDB (24,25) and can visualize data hosted on ProteomicsDB as well as user-uploaded data. The core visualization component of our software, which we termed biowc-pathwaygraph, can be easily reused in web interfaces outside of ProteomicsDB.
We demonstrate the utility of our application using two phosphoproteomic drug profiling studies (2,26). First, we show how PTMNavigator improves the interpretation of PEA by enabling users to explore why the top-ranked pathways are reported as significant. Second, we visualize target and pathway engagement of kinase inhibitors at different steps in a phosphorylation-dependent signaling cascade. Last, we show how PTMNavigator leverages the information contained in dose-dependent drug perturbation studies to aid the functionalization of PTM sites.

A PTM-Centric Visualization Interface for Pathways
PTMNavigator is a web-based graphical interface to visualize PTMs from perturbation experiments within pathway diagrams. The interface is embedded into the ecosystem of ProteomicsDB, a multi-omics resource for life science data and data analysis. The software combines two types of inputs (Fig. 1A). The first, and the only one a user needs to provide, is a list of regulated modified peptides, e.g. the result of a differential or dose-response analysis. Each peptide must be annotated with whether or not it was up- or downregulated between experimental conditions. The second type of input comprises the canonical pathways of KEGG and WikiPathways, for which we created internal representations in ProteomicsDB. For Homo sapiens, we imported 1,121 pathways (779 from WikiPathways and 342 from KEGG) from the latest releases. In addition, we imported 1,838 pathways for 9 model organisms, including other mammals, bacteria, and plants (Fig. 1B).
PTMNavigator combines the experimental data and the pathway information into a projection of modified peptides onto their corresponding genes within a pathway diagram. The main interface of PTMNavigator (Fig. 1C) is accessible at www.proteomicsdb.org/analytics/ptmNavigator. Users can visualize datasets already contained in ProteomicsDB or upload their own (templates for input formats are available as Table S1). It is possible to access (previously) uploaded datasets by a Session ID and to visualize one or multiple datasets at a time. The available pathways are sorted by their relevance derived from the pathway enrichment tool g:Profiler. The pathway diagram is rendered together with additional nodes for each PTM peptide next to its corresponding protein. These additional PTM nodes are colored according to their regulation (up, down, not). The initial coordinates of the nodes are the same as in the reference databases so that it resembles the original pathway diagram. However, users can rearrange the layout by dragging nodes around, highlighting interesting subnetworks, and filtering for experiments or regulation categories. The final pathway diagrams can be downloaded as scalable vector graphics (SVG). In the following, we exemplify the capabilities and benefits of PTMNavigator using two recent PTM perturbation datasets.

Interpreting Pathway Enrichment Analysis Results with PTMNavigator
To demonstrate how PTMNavigator aids in interpreting the results of a pathway enrichment analysis (PEA), we re-analyzed data from a recent phosphoproteomics study, in which 30 different kinase inhibitors targeting members of the EGF signaling pathway were profiled for their effect on the phosphoproteome of a human retinal pigment epithelial cell line (RPE1) (26) (Fig. 2A). Re-analysis of all 30 datasets yielded 4,989 regulations (2,563 up and 2,426 down) on 1,757 peptides (out of 7,871 peptides measured in each sample) at 10% (Table S3). PEA for each compound with g:Profiler resulted in enrichment scores (negative log-transformed p-values; ES) for each combination of drug and pathway (Fig. 2B). Some results confirmed prior knowledge on the compounds, e.g. AG-1478 and Lapatinib, both targeting EGFR, were strongly enriched for 'WP437: EGF/EGFR signaling pathway' (ES=8.2 and ES=3.8). Yet, for many drug perturbations, the results were not as easy to interpret. For example, the JNK inhibitor SP-600125 showed a significant enrichment for 'WP2380 -Brain derived Neurotrophic Factor (BDNF) signaling pathway' (ES>1.3, which corresponds to a p-value<0.05) and the same was observed for 9 other drugs. Another pathway involving BDNF, 'WP3676: BDNF-TrkB signaling', also achieved a high enrichment score in the SP-600125 data (Fig. 2C). However, BDNF signaling is unlikely to be of relevance for RPE1 cells.
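The relationship between enrichment scores and p-values quoted above can be reproduced with a short calculation. The sketch below assumes a base-10 logarithm, which is consistent with the stated correspondence between ES>1.3 and p<0.05; the function names are hypothetical.

```python
import math


def enrichment_score(p_value):
    """Negative log10-transformed p-value (base 10 is an assumption)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must lie in the interval (0, 1]")
    return -math.log10(p_value)


def p_value_from_score(score):
    """Invert the transformation to recover the p-value."""
    return 10.0 ** (-score)


# The significance threshold p = 0.05 maps to an ES of roughly 1.3.
threshold_es = enrichment_score(0.05)
```

Under this assumption, the score of 8.2 reported for AG-1478 corresponds to a p-value in the range of 10^-9, several orders of magnitude below the significance threshold.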
To gain more insight into how the regulated peptides were distributed within the enriched pathways, we used PTMNavigator to investigate the high-scoring pathways of SP-600125 (Fig. 2C). The projection of the experimentally measured phosphopeptides onto 'WP2380 -Brain derived Neurotrophic Factor (BDNF) signaling pathway' (Fig. S6) showed that there was, in fact, no regulation in signaling that could be attributed to altered activity of BDNF. Instead, the pathway scored high because the diagram includes many common signaling cascades that can not only be triggered by BDNF but also by other growth factors such as EGF. Most proteins with regulations, such as SRC and SHC1, are known to be part of the canonical EGFR signaling pathway or pathways downstream of it (27,28). The analysis also highlighted a group of regulations downstream of AKT1, notably on TSC2 (down), mTOR, EIF4EBP1, and FOXO3 (all up). Many other high-scoring pathways for SP-600125 (ranks 3-7 and 9) shared this signaling cascade or a variation of it (Fig. 2D; Fig. S8-S12 and Fig. S14). Other differentially phosphorylated proteins that were found in multiple of these pathways included IRS1, GAB2, and AKT1S1, all of which are known actors in EGFR and AKT signaling (29)(30)(31)(32). It has been shown before that JNK, the target of SP-600125, phosphorylates IRS1 (insulin receptor substrate 1), which causes a negative feedback loop that inhibits AKT signaling (29,30). The inhibition of JNK by SP-600125 in RPE1 cells might thus have resulted in an activation of AKT1 signaling through this mechanism. PTMNavigator thus helped to work out that, while the set of enriched pathways seemed diverse at first, most regulations could be attributed to a single altered signaling pattern caused by JNK inhibition.

Visualizing Target and Pathway Engagement in PTMNavigator
We next employed PTMNavigator on the same dataset to compare the kinase inhibitors Lapatinib, Wortmannin, and Temsirolimus, which target the same signaling cascade, but at different steps (Fig. 3A). Projection of their perturbed PTM peptides on 'hsa04150 -mTOR signaling pathway' (Fig. 3C), which includes all members of said cascade except for EGFR, confirmed that all three drugs perturbed the pathway. Lapatinib had the largest number of both up- and downregulated peptides and also the regulations most upstream in the pathway (Fig. 3C, upper panel). Among the sites exclusively regulated by Lapatinib were GSK3B_S9, TSC2_S939, and BRAF_S729, whose importance has previously been pointed out in the literature (33)(34)(35). In contrast, the effects of Wortmannin in this pathway were limited to sites on PRAS40, LPIN, 4E- shared with Lapatinib (PRAS40_T246, 4EBP1_T77/S101, LPIN1_S252 for Wortmannin, RICTOR_Y1174&Y1177 for Temsirolimus), or one another (RPS6_S236&S240&S244), which confirms that all drugs perturb the same signaling network at different stages.
A more surprising example was presented when comparing the effects of Sorafenib and Cobimetinib. Both compounds target the RAF-MEK-ERK pathway, another phosphorylation axis downstream of EGFR (Fig. 3B). Since Sorafenib is perturbing the network at a more upstream position (RAF), one would expect its impact on the PTM level to be more pronounced. In reality, Cobimetinib had a wider impact than Sorafenib, which was visible in PTMNavigator's visualization of 'hsa04010 -MAPK Signaling Pathway' (Fig. 3D). As expected, Cobimetinib significantly inhibited the phosphorylation of the site MAPK1_Y187, which is essential to MAPK1 activity, with a log2 fold change (logFC) of -4.39 between conditions (p-value=0.01). There were also effects upstream in the pathway, e.g. on SOS1_S1134. This is likely the result of an inhibited feedback loop, since this site is a substrate of RPS6KA1, which is itself regulated by ERK (36). In contrast, Sorafenib only inhibited the phosphorylation of one site in this pathway (BRAF_S729), which was also strongly downregulated by Cobimetinib (logFC=-5.44, p=0.07 for Sorafenib and logFC=-5.41, p=0.09 for Cobimetinib) and known to be important for successful BRAF dimerization (37). PTMNavigator's visualization suggests that Sorafenib does engage its target, but the perturbation in signaling does not progress beyond RAF. Previous research on the compound pointed out that Sorafenib, while being an effective agent against renal cell carcinoma and other cancer types, might not exert its efficacy via inhibition of RAF (38,39). In both examples presented here, PTMNavigator provided a visual insight into the target engagement and signaling cascades that one would not get as easily from other depictions such as heatmaps.

Exploring Dose-Dependent Perturbation Data in PTMNavigator
A very important characteristic of a drug's mode of action is how potently it perturbs different parts of cellular signaling, which can be captured by applying a series of doses and measuring the changes on the level of PTMs for each dose. Such a concentration-dependent approach, visualization reveals that all peptides perturbed by Lapatinib in this pathway were regulated at a pEC50 between 6.4 and 7.4, which corresponds to a concentration range between 40 nM and 400 nM (Fig. 4E). The dose-response curves of each shown peptide can be displayed directly within PTMNavigator. Lapatinib is known to be a highly selective EGFR/ErbB2 inhibitor (38), and the fact that all potencies were within one order of magnitude suggests that this pathway is only regulated by a single initial perturbation. Here, this is likely the inhibition of ErbB2, as EGFR has been reported to be not active in MDA-MB-175-VII cells (41). In other biological systems, the effect might differ. Such cell-type specific observations can also be highlighted by PTMNavigator, which we exemplified by comparing the effect of Dasatinib on three different cell lines (Fig. S4).
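The conversion between potency and concentration quoted above can be checked with a few lines of code. The sketch assumes the usual definition of pEC50 as the negative base-10 logarithm of the EC50 expressed in mol/L; the function names are hypothetical.

```python
import math


def pec50_to_nanomolar(pec50):
    """Convert a pEC50 to an EC50 in nM, assuming pEC50 = -log10(EC50 in M)."""
    # 1 M = 1e9 nM, hence the offset of 9 in the exponent.
    return 10.0 ** (9.0 - pec50)


def nanomolar_to_pec50(ec50_nm):
    """Convert an EC50 given in nM back to a pEC50."""
    return 9.0 - math.log10(ec50_nm)


# Bounds of the potency range reported for the Lapatinib-regulated peptides.
upper_nm = pec50_to_nanomolar(6.4)  # about 400 nM
lower_nm = pec50_to_nanomolar(7.4)  # about 40 nM
```

Because the scale is logarithmic, a pEC50 window of one unit always spans exactly one order of magnitude in concentration, which is the basis for the "single initial perturbation" argument in this paragraph.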

PTMNavigator Aids in Refinement and Expansion of Pathway Diagrams
We applied PTMNavigator to two other decryptM datasets to put phosphorylation sites with and without functional annotation into context. Refametinib and Mirdametinib are two highly selective kinase inhibitors that both target MAP2K1 (MEK1) and MAP2K2 (MEK2). Both inhibitors were applied to A549 lung carcinoma epithelial cells in the decryptM study (in ten concentrations ranging from 1 nM to 10 µM). Both drugs elicited a selective response in the A549 phosphoproteome, with 148 regulated dose-response curves for Refametinib (38 up, 110 down) and 73 for Mirdametinib (15 up, 58 down). The diagram 'WP51 -Regulation of actin cytoskeleton' was significantly enriched for both datasets (p-values 3.53e-3 for Mirdametinib and 5.91e-5 for Refametinib treatment, see also Table S6) and depicts some of the signaling occurring downstream of the compounds' targets. Projection of the drug-regulated peptides onto this pathway using the 'potency' color scheme (Fig. 5A) showed that Mirdametinib regulates MEK signaling more potently than Refametinib. This is in agreement with the reported higher affinity observed in pulldown experiments (38).
By focusing on dose-dependent phosphorylation signals in common between both drugs, we obtain a more complete understanding of the MEK signaling in this cellular system. Both drugs potently downregulated phosphorylation of MAPK1 and MAPK3 at their most important activating sites (T185 and Y187 for MAPK1, T202 and Y204 for MAPK3), which are all direct substrates of MEK1 and MEK2 and are part of the MEK1 kinase activity signature in PTMSigDB (17). Less straightforward to interpret were several downregulations upstream of MEK at SOS1 sites as well as RAF1_S43 in both datasets. RAF1_S43 is a PKACA substrate, suggesting that the activity of this kinase (which is not visible in the pathway) was also perturbed by the treatment. This is an example where the pathway diagram does not depict all relevant interactions for the cellular system, a frequently observed issue of the canonical pathway definitions (see Discussion). Among the regulated SOS1 sites are SOS1_S1134 and SOS1_S1161, which are substrates of p90RSK. The activity of this protein is regulated by ERK, therefore this regulation likely is the result of an inhibited feedback loop. SOS1_S1064 was also observed less phosphorylated in both experiments. This site has no annotated kinase, but the other regulations suggest it could also be a p90RSK substrate. Downstream of MAPK, both drugs elicited downregulation of several peptides including MYPT1_S507, a site that has previously been associated with treatment by Selumetinib (a MEK1 and MEK2 inhibitor) and Vemurafenib (which has MEK5 as a potent off-target (38,42)). The fact that it is also affected by the two inhibitors studied here and that its dose-response profiles are similar to other peptides associated with MEK inhibition (Fig. 5B and Fig. S5) suggests that this site could also be part of the general PTM signature of MEK inhibition.
The visualization also revealed another regulation in the 'WP51 -Regulation of actin cytoskeleton' pathway that was present in both datasets, namely GRLF_S589, which has no functional annotation or putative upstream kinase in PhosphoSitePlus. It showed similar dose-response behavior as the signature MEK inhibition sites on MAPK1, MAPK3, and also MYPT1_S507 (Fig. 5C and Fig. S5). This suggests that the site is also part of the same phosphoproteomic signature. The pathway diagram shows no perturbation upstream of GRLF1, and, unlike for SOS1, there is no known feedback loop which could explain this observation. Therefore, it remains to be investigated which kinases are directly responsible for phosphorylating this site, how essential these phosphorylations are to the signaling pathway and whether this finding translates to other biological systems. Nevertheless, this shows that PTMNavigator can support researchers in the functionalization of uncharacterized PTM sites and thus in the extension of pathways by combining prior knowledge with experimental data (Fig. 5D).

DISCUSSION
PTMNavigator is a tool for visualizing quantitative PTM data in the context of cellular pathways, which supports users in analyzing complex PTM proteomics data. In its current implementation in the ProteomicsDB platform, it allows exploring pathways of 10 organisms using pathway annotations from KEGG and WikiPathways, and the integration with PhosphoSitePlus adds another layer of functional annotation. We demonstrated how PTMNavigator facilitates the interpretation of pathway enrichment analyses, makes perturbed signaling traceable, leverages the novel information contained in dose-dependent PTM data, and generates hypotheses for follow-up experiments to functionalize unannotated modification sites. To our knowledge, PTMNavigator is the only available software that can graphically illustrate how regulated PTMs are distributed within pathways.
PTMNavigator gives a bird's-eye view of PTMs within a pathway, while at the same time integrating multiple sources of information on particular sites. This accelerates PTM perturbation analysis considerably, especially in large-scale data sets. Furthermore, it fosters the identification of new kinase-substrate relationships and signaling pathway crosstalk, as we exemplified for the phosphoproteomic response to MEK inhibitors in A549 cells. In the future, we envision incorporating more PTM annotations, e.g. substrate motifs and predicted kinase-substrate relationships obtained with tools such as the Kinase Library (43), to further improve and streamline PTM data analysis.
Current PEAs are far from ideal for analyzing PTM data sets, and their outcomes must, therefore, be taken with considerable caution. Still, PEAs currently provide the best starting point, which is why PTMNavigator uses PEA to pre-sort pathways by putative relevance. The initial ranking should help users to prioritize pathway engagement, but users should keep in mind that some high-ranked pathways are not biologically relevant but rather PEA artifacts. This is often rooted in the fact that pathway annotations are highly redundant (i.e., the same well-studied proteins show up in many well-studied pathways). If better pathway ranking procedures are developed in the future, PTMNavigator can easily be updated to use those.
We emphasize that PTMNavigator is a visualization tool, and that the interpretation of the results remains the responsibility of the user. While we believe the current work constitutes a substantial advance, we point out several caveats as follows: i) Differential or dose-response analysis is not part of PTMNavigator and users need to perform this prior to data upload. PTMNavigator only shows what its users already deemed significantly regulated. This is also a benefit, as it makes PTMNavigator flexible to many experimental settings. ii) The currently available canonical pathway diagrams are heterogeneous in structure and, for PTM research, they may sometimes even be misleading. For example, a phosphorylating connection from kinase A to protein B just means that some site on B is a substrate of A, not necessarily (and in fact not likely) all of them. iii) Pathway definitions are typically generalized from a large amount of data. Consequently, they cannot be specific for a certain tissue or cell type, and some proteins depicted in a pathway might not be expressed or active in the cells in which an experiment was performed. iv) A single node may represent a single gene or gene product, but can also summarize a subfamily of proteins (such as MAPK) or even an entire class of proteins (such as all receptor tyrosine kinases). Conversely, a 'group' node can represent a protein complex, but can also represent a set of alternatives for a step in the pathway. Users of the pathway diagram need to be wary of these disparate definitions. v) The current knowledge of pathways is limited and many relationships are not yet discovered. PTMNavigator can only show what is already known and its use for the discovery of novel signaling cascades or nodes is necessarily limited. As a consequence, PTMNavigator will become more powerful as pathway databases become more comprehensive in the future.
As alluded to in our final example, a desirable feature is the visualization of user-defined pathways that are either variations of KEGG/WikiPathways diagrams or entirely new pathways. The biowc-pathwaygraph package already enables users to create their own signaling networks and project their data onto them, but currently requires hosting one's own web server. We plan to make this feature more convenient in the future by allowing users to create, edit and load pathway diagrams within PTMNavigator. We expect that features like this will further enhance the versatility of our software and its use by the scientific community to interpret PTM data in the context of their respective pathways.

Creating a Pathway Database in ProteomicsDB
We assembled a collection of pathways from 10  We wrote a Python package that converts both types of pathways into JavaScript Object Notation (JSON) format. The JSON format is better suited for use with a web application, since it can be parsed at runtime without further conversion (in contrast to XML). We did not import pathways that contained zero gene or gene product nodes, since they are not interesting for the study of cellular signaling (this includes, for example, metabolic pathways that are focused only on metabolites). We retrieved the list of pathways for each organism and downloaded the KGML files using the KEGG REST API (https://rest.kegg.jp). KGML files distinguish four types of nodes: gene, compound, ortholog, and map. To reduce the complexity of our internal representation, we merged the gene and ortholog nodes into a common gene_protein node type. map nodes were renamed as pathway nodes since they describe connections to other pathways, and compound nodes were not modified (Fig. S1). In KGML files, nodes are labeled using KEGG identifiers, which we mapped to UniProt accession numbers using the 'idmapping' service of the UniProt REST API (as described here: https://www.uniprot.org/help/id_mapping). In order to maximize the number of nodes that had a human-readable name, we subsequently mapped the UniProt accession numbers to gene names (also using the UniProt API).
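The node-type mapping and the import filter described above can be sketched as follows. The dictionary layout of the entries and the function names are assumptions chosen for illustration; they do not reproduce the actual conversion package, and only the four KGML node types named in the text are handled.

```python
# Mapping of KGML node types to the internal node types described in the text.
KEGG_TO_INTERNAL_NODE_TYPE = {
    "gene": "gene_protein",
    "ortholog": "gene_protein",
    "map": "pathway",
    "compound": "compound",
}


def convert_kegg_nodes(entries):
    """Return copies of the entries with their type replaced by the internal type.

    Each entry is assumed to be a dict with at least a 'type' key
    (hypothetical layout).
    """
    converted = []
    for entry in entries:
        node = dict(entry)  # copy, so the parsed input stays untouched
        node["type"] = KEGG_TO_INTERNAL_NODE_TYPE[entry["type"]]
        converted.append(node)
    return converted


def has_gene_product_nodes(nodes):
    """Pathways without any gene_protein node are skipped during import."""
    return any(node["type"] == "gene_protein" for node in nodes)
```

A purely metabolic diagram that contains only compound and map entries would fail the second check and therefore would not be imported.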

Preprocessing of KEGG Pathway Diagrams
KEGG pathways have two classes of edges: relations, which connect exactly two nodes, and reactions, which can also represent many-to-many relationships by specifying a list of substrates and a list of products. We intended to only represent one-to-one relationships in ProteomicsDB; therefore, we replaced each reaction by an equivalent set of relations, connecting each substrate-product pair by an individual edge. KEGG relations can have 19 different subtypes, which we retained in our JSON representation with two exceptions (Fig. S2):
a) The types 'indirect' and 'indirect effect' were merged into one type 'indirect'.
b) The types 'inhibition' and 'repression' were merged into one type 'inhibition'.
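The edge preprocessing described in this section can be sketched in a few lines. The edge dictionaries, the default edge type, and the function names are hypothetical; only the pairing rule and the two merged subtypes are taken from the text.

```python
from itertools import product

# The two exceptions listed above; all other subtypes are kept unchanged.
MERGED_SUBTYPES = {
    "indirect effect": "indirect",
    "repression": "inhibition",
}


def normalize_subtype(subtype):
    """Apply the two merges and leave every other relation subtype as it is."""
    return MERGED_SUBTYPES.get(subtype, subtype)


def reaction_to_relations(substrates, products, edge_type="reaction"):
    """Replace one many-to-many reaction by one edge per substrate-product pair."""
    return [
        {"source": substrate, "target": prod, "type": edge_type}
        for substrate, prod in product(substrates, products)
    ]
```

A reaction with two substrates and three products therefore becomes six one-to-one edges, so the number of edges grows with the product of both list lengths.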

Preprocessing of WikiPathways Pathway Diagrams
We downloaded GPML files from http://data.wikipathways.org/current/gpml. These pathways, which are maintained by a large, open community, are less standardized than in KEGG. The pathway collection we downloaded distinguishes 13 types of nodes, which we mapped to the same types as the nodes from the KEGG pathways. In addition to the gene_protein, compound, and pathway types, we introduced a fourth type misc to account for WikiPathways nodes that could not be assigned one of the other categories (Fig. S1). In GPML files, each node is annotated with a reference to an external database, such as Entrez or Ensembl. To unify these diverse annotations, we mapped them to UniProt accession numbers where possible, using the UniProt API (Fig. S3). Edges in GPML files have 24 different types. Ten of these were converted into one of the KEGG edge types; the other 14 were retained as additional types (Fig. S3). A specific feature of the GPML specification is that edges may also end on another edge instead of a node. This is represented by an additional tag in the XML schema, called anchor. To keep our representation simple, we replaced anchor tags by edges that had other edges as their start or end point.

Implementation of biowc-pathwaygraph
We developed a WebComponent (www.webcomponents.org) termed biowc-pathwaygraph for the dynamic rendering of pathway diagrams together with peptide-level experimental data. Our WebComponent expects two inputs: A pathway diagram in JSON format (we call this 'pathway skeleton'), and a list of PTMs, each annotated with a regulation type (possible values are 'up', 'down', and 'not'). Optionally, a list of proteins and their regulation types can be supplied in addition.
biowc-pathwaygraph maps each entry of the PTM list to nodes of the pathway skeleton by comparing the gene names and/or UniProt accession numbers that are associated with the nodes. For each matched PTM, an additional node is added to the graph definition, as well as an edge from the new node to its associated node in the skeleton. A PTM can be matched to multiple nodes of the skeleton, resulting in multiple PTM nodes for the same entry of the PTM list. This can happen, for example, when a protein appears twice in the same pathway because the creators wanted to describe two different routes of a signaling cascade.
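The matching step can be illustrated with the following sketch. It is written in Python for consistency with the other examples rather than in the language of the WebComponent, and the field names ('gene_names', 'uniprot_accs', 'site', and so on) are assumptions, not the actual JSON schema of biowc-pathwaygraph.

```python
def attach_ptm_nodes(skeleton_nodes, ptms):
    """Create one PTM node and one edge for every PTM/skeleton-node match."""
    ptm_nodes, ptm_edges = [], []
    for ptm_index, ptm in enumerate(ptms):
        for node in skeleton_nodes:
            # A match on either the gene name or the accession is sufficient.
            same_gene = ptm.get("gene_name") in node.get("gene_names", [])
            same_accession = ptm.get("uniprot") in node.get("uniprot_accs", [])
            if not (same_gene or same_accession):
                continue
            ptm_node_id = f"ptm-{ptm_index}-{node['id']}"
            ptm_nodes.append({
                "id": ptm_node_id,
                "type": "ptm",
                "label": ptm["site"],
                "regulation": ptm["regulation"],  # 'up', 'down', or 'not'
            })
            # The edge points from the new PTM node to its skeleton node.
            ptm_edges.append({"source": ptm_node_id, "target": node["id"]})
    return ptm_nodes, ptm_edges
```

If a protein is drawn twice in a diagram, the inner loop matches both skeleton nodes, which yields the duplicated PTM nodes mentioned above.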
After the matching process has completed, biowc-pathwaygraph renders the graph as an SVG object using the software library D3.js (44). The Cartesian coordinates of the pathway skeleton nodes are used for the initial layout. The edges are drawn in such a way that they connect their two adjacent nodes in a straight line. When a node changes its position (e.g. by a user dragging it around), the endpoints of the edge are automatically updated. The PTM nodes, which do not have Cartesian coordinates assigned, are positioned using simulated physical forces (using the d3-force package (https://github.com/d3/d3-force)), which cause them to remain close to their reference skeleton node without colliding with one another. PTM nodes can either be represented individually, or in the form of a single summary node for each regulation type that only displays the number of regulations (step 6 in Fig. 1C; users can toggle between the two representations). There are three possible color schemes for the PTM nodes: the default and simplest option is to color them by their regulation type ('up' nodes are colored red, 'down' nodes blue, and 'not' nodes grey). Alternatively, nodes can be colored by fold change or (in the case of decryptM data) by potency (using EC50 values estimated from the dose-response curve fit). In both cases, the full range of values is determined after mapping the PTM nodes to the skeleton. A continuous color gradient is then calculated between the minimum and the maximum value and applied to each node.
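The continuous color schemes can be illustrated with a linear interpolation between two endpoint colors. This is a minimal sketch in Python; the endpoint colors, the handling of a degenerate value range, and the function names are assumptions, since the text only specifies that the gradient spans the minimum and maximum of the mapped values.

```python
def gradient_color(value, vmin, vmax, low_rgb=(0, 0, 255), high_rgb=(255, 0, 0)):
    """Linearly interpolate a hex color for a value within [vmin, vmax]."""
    if vmax == vmin:
        fraction = 0.5  # assumption: all values equal, use the midpoint color
    else:
        fraction = (value - vmin) / (vmax - vmin)
    fraction = min(1.0, max(0.0, fraction))
    channels = tuple(
        round(low + fraction * (high - low))
        for low, high in zip(low_rgb, high_rgb)
    )
    return "#{:02x}{:02x}{:02x}".format(*channels)


def color_nodes(values):
    """Determine the value range once, then assign one color per node."""
    vmin, vmax = min(values), max(values)
    return [gradient_color(value, vmin, vmax) for value in values]
```

Because the range is computed only from the PTMs that were mapped to the displayed pathway, the same fold change or potency can receive different colors in different diagrams.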

Implementation of PTMNavigator
We implemented PTMNavigator as a Vue.js component (https://vuejs.org). It is essentially a wrapper around biowc-pathwaygraph that facilitates interaction of the WebComponent with both the user and the backend of ProteomicsDB. PTMNavigator handles user interactions such as the selection of datasets, organisms, and pathways, or the application of filters on the loaded data, by calling the necessary endpoints of the ProteomicsDB API, processing the returned data, and passing it on to biowc-pathwaygraph.
PTMNavigator is embedded into ProteomicsDB, which is also Vue.js-based since its latest release (24). The tool can be found in the 'Analytics' section of ProteomicsDB: https://www.proteomicsdb.org/analytics/ptmNavigator.

Processing of User Data Upload
We extended the upload page of ProteomicsDB (www.proteomicsdb.org/analytics/customDataUpload) to allow the upload of data for PTMNavigator in CSV format. It is possible to upload conventional quantitative proteomics ('Fold Change') data as well as decryptM data on both peptide and protein level. Example files for all four input types are provided for download on the website (see also Table S1). In the case of decryptM data, the 'curves.txt' file output by the decryptM pipeline or CurveCurator can be uploaded directly, together with the TOML file containing the experimental parameters. During the upload, a pathway enrichment analysis is performed automatically so that each pathway in our database is associated with an enrichment score (negative log-transformed p-value) for the uploaded dataset. The uploaded data is stored temporarily in ProteomicsDB and is deleted if not accessed for 14 days. To limit access to the data, the user receives a personalized 32-digit alphanumeric session identifier during the upload, which must be entered in PTMNavigator to retrieve the uploaded data.
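As a rough illustration of the access scheme, the following Python sketch generates a 32-character alphanumeric identifier and checks the 14-day retention window. How ProteomicsDB actually creates identifiers and tracks access is not described here; the alphabet (mixed-case letters plus digits), the function names, and the expiry check are assumptions.

```python
import secrets
import string
from datetime import datetime, timedelta

# Assumption: identifiers use upper- and lower-case letters plus digits.
ALPHABET = string.ascii_letters + string.digits
RETENTION = timedelta(days=14)


def new_session_id(length=32):
    """Return a random alphanumeric identifier of the given length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def is_expired(last_access, now):
    """True if the data was not accessed within the retention window."""
    return now - last_access > RETENTION
```

For example, `is_expired(datetime(2024, 1, 1), datetime(2024, 1, 20))` evaluates to `True`, because 19 days have passed since the last access.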

Pathway Enrichment Analysis
Throughout this study, pathway enrichment analysis (PEA) was performed as follows: All regulated peptides or proteins of a dataset were mapped to UniProt accession numbers. The set of all unique UniProt accession numbers was then sent as a POST request to the API of g:Profiler (https://biit.cs.ut.ee/gprofiler/api/gost/profile/). The 'organism' parameter was set to 'hsapiens' in all datasets discussed in the main text (in the case of the automatic PEA during data upload, it is set to the organism selected by the user). The 'sources' parameter was set to '["KEGG","WP"]' to only receive enrichment scores for KEGG and WikiPathways entries. The 'all_results' parameter was set to 'true' to also retrieve non-significant enrichment results. All other parameters were set to their default values.
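The request described above can be sketched in Python using only the standard library. The 'organism', 'sources', and 'all_results' values follow the text; the 'query' field name and the 'result', 'native', and 'p_value' response fields reflect our reading of the g:Profiler API and should be checked against its documentation. The use of a base-10 logarithm for the enrichment score is an assumption, since the text specifies only a negative log-transformed p-value.

```python
import json
import math
import urllib.request

GPROFILER_URL = "https://biit.cs.ut.ee/gprofiler/api/gost/profile/"


def build_payload(accessions, organism="hsapiens"):
    """Build the JSON body for the enrichment request."""
    return {
        "organism": organism,
        "query": sorted(set(accessions)),  # unique accession numbers
        "sources": ["KEGG", "WP"],         # KEGG and WikiPathways only
        "all_results": True,               # include non-significant results
    }


def run_enrichment(accessions, organism="hsapiens"):
    """Send the POST request and return the decoded JSON response."""
    body = json.dumps(build_payload(accessions, organism)).encode("utf-8")
    request = urllib.request.Request(
        GPROFILER_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def enrichment_scores(response):
    """Map pathway identifiers to -log10(p-value) (log base is an assumption)."""
    scores = {}
    for hit in response.get("result", []):
        p_value = hit["p_value"]
        if p_value > 0:  # skip zero p-values, whose logarithm is undefined
            scores[hit["native"]] = -math.log10(p_value)
    return scores
```

A call such as `enrichment_scores(run_enrichment(["P00533", "P28482"]))` would then return one score per KEGG or WikiPathways entry, provided the response has the assumed structure.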

Preprocessing of data from Bekker-Jensen et al.
For the detailed experimental procedures of the kinase inhibitor screen, please refer to the original publication. Briefly, RPE1 cells were treated with either 0.1 µM or 1 µM of inhibitor for 30 minutes, followed by stimulation with EGF for 10 minutes. All experiments were performed in triplicate, except for an EGF-stimulation-only control, which was performed in duplicate.
After extraction of proteins, digestion into peptides, and Ti-IMAC-based phosphopeptide enrichment, the samples were measured using LC-MS/MS in Data-Independent Acquisition (DIA) mode. The data was then processed using Spectronaut to identify and quantify phosphorylated peptides.
Our analysis of the data differed from that of Bekker-Jensen et al. in that they performed multiple sample significance testing (ANOVA), comparing all conditions against the control together, whereas we intended to compare each condition to the control separately using t-tests.

I. If the peptide appeared multiple times in the same experiment (which can happen for multiply phosphorylated peptides), retain only the most significant entry (lowest adjusted p-value).
II. If a peptide was neither significantly regulated in the low-dose experiment nor in the high-dose experiment of a compound, retain only the entry from the high-dose experiment.
III. If a peptide was significantly regulated in the high-dose experiment, retain only this entry.
IV. If a peptide was significantly regulated only in the low-dose experiment, retain only this entry.
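Rules I to IV can be expressed compactly in code. The following Python sketch is not the analysis notebook (Data file S1); the row layout, the key names, the significance threshold `alpha`, and the handling of peptides measured at only one dose are assumptions made for illustration.

```python
def select_entries(rows, alpha=0.05):
    """Apply rules I-IV to a list of row dictionaries.

    Each row is assumed to have the keys 'compound', 'peptide',
    'dose' ('low' or 'high'), and 'adj_p'. The threshold alpha is hypothetical.
    """
    # Rule I: per compound, peptide, and dose, keep the lowest adjusted p-value.
    best = {}
    for row in rows:
        key = (row["compound"], row["peptide"], row["dose"])
        if key not in best or row["adj_p"] < best[key]["adj_p"]:
            best[key] = row

    selected = []
    for compound, peptide in sorted({(c, p) for c, p, _ in best}):
        low = best.get((compound, peptide, "low"))
        high = best.get((compound, peptide, "high"))
        if low is None or high is None:
            # Assumption: a peptide seen at only one dose keeps that entry.
            selected.append(high if high is not None else low)
            continue
        low_significant = low["adj_p"] < alpha
        high_significant = high["adj_p"] < alpha
        if low_significant and not high_significant:
            selected.append(low)   # Rule IV
        else:
            selected.append(high)  # Rules II and III
    return selected
```

The result contains exactly one entry per compound and peptide, and the low-dose entry is chosen only when it is the sole significant one.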
The resulting tables were formatted to fulfil the input criteria of PTMNavigator.
Pathway enrichment was performed as described above.
The code of this analysis (starting from the Perseus output and ending at the creation of PTMNavigator input files) is available as a Jupyter Notebook (Data file S1).

Preprocessing of data from Zecha et al.
For

By double-clicking on it, a summary node can be expanded into individual nodes for each peptide (elliptical shapes). (7) If a user uploaded protein expression data, the proteins will also be colored according to their regulation group. (8) Many proteins may have multiple labels. By right-clicking on a node, an alternative name can be selected. (9) When hovering over a node, all information supplied during the upload is displayed as a tooltip. If a site is annotated in PhosphoSitePlus, a link to the annotation is retrieved.