Benchmarking of protein interaction databases for integration with manually reconstructed signaling network models

Protein interaction databases are critical resources for network bioinformatics and integrating molecular experimental data. Interaction databases may also enable construction of predictive computational models of biological networks, although their fidelity for this purpose is not clear. Here, we benchmark protein interaction databases X2K, Reactome, Pathway Commons, Omnipath, and Signor for their ability to recover manually curated edges from three logic-based network models of cardiac hypertrophy, mechano-signaling, and fibrosis. Pathway Commons performed best at recovering interactions from manually reconstructed hypertrophy (137 of 193 interactions, 71%), mechano-signaling (85 of 125 interactions, 68%), and fibroblast networks (98 of 142 interactions, 69%). While protein interaction databases successfully recovered central, well-conserved pathways, they performed worse at recovering tissue-specific and transcriptional regulation. This highlights a knowledge gap where manual curation is critical. Finally, we tested the ability of Signor and Pathway Commons to identify new edges that improve model predictions, revealing important roles of PKC autophosphorylation and CaMKII phosphorylation of CREB in cardiomyocyte hypertrophy. This study provides a platform for benchmarking protein interaction databases for their utility in network model construction, as well as providing new insights into cardiac hypertrophy signaling.


In this study, we used three manually curated network models of cardiac myocyte hypertrophy signaling [2], cardiac fibroblast differentiation signaling [3], and cardiomyocyte

Benchmarking against the cardiac hypertrophy signaling network
In order to benchmark protein interaction databases against manually curated network reconstructions, it is necessary to first annotate the gene or genes that correspond to each node in the network model. A single node may represent multiple protein isoforms or a protein complex composed of subunits. The manual curation process of constructing these networks [...] on the available information from literature. For example, node A is listed to represent genes A1 and A2 in the workflow schematic (Figure 1). In benchmarking any reactions containing node A against a database, both genes A1 and A2 would be considered possible representations of the node. Using this approach, we translated the model's reactions into gene-gene edges.

[...] available datasets for each of these sources were downloaded for use in analysis. For X2K, it was necessary to first concatenate the individual files provided for data sources online. In each of the databases used for benchmarking, the type of interaction is denoted by a separate field. In the example database (Figure 1), the top four interactions are directed, whereas the fifth interaction is undirected. As shown in the figure, the toy network would be compared to all five interactions to produce a benchmarking score. Benchmarking scores are computed for just undirected interactions, just directed interactions, and for all interactions in the specified database.
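The three scores described above can be illustrated with a minimal Python sketch. It assumes, hypothetically, that each database record is a `(source, target, directed)` tuple, that a directed record must match the orientation of the model edge, and that an undirected record matches either orientation. The gene names and records are toy values rather than the contents of Figure 1, and the score is taken here to be the fraction of model edges recovered.

```python
def recovered(edge, interactions):
    """Return True if a (source, target) gene pair matches any record."""
    src, tgt = edge
    for a, b, directed in interactions:
        if (a, b) == (src, tgt):
            return True
        # Assumption: an undirected record matches either orientation.
        if not directed and (b, a) == (src, tgt):
            return True
    return False


def benchmark_scores(model_edges, database):
    """Fraction of model edges recovered by directed, undirected, and all records."""
    subsets = {
        "directed": [r for r in database if r[2]],
        "undirected": [r for r in database if not r[2]],
        "all": list(database),
    }
    return {
        name: sum(recovered(e, rows) for e in model_edges) / len(model_edges)
        for name, rows in subsets.items()
    }


# Toy database (hypothetical): four directed records and one undirected record.
database = [
    ("A1", "B", True),
    ("B", "C", True),
    ("C", "D", True),
    ("D", "E", True),
    ("C", "A2", False),
]
# Toy gene-gene edges derived from a model (hypothetical).
model_edges = [("A1", "B"), ("A2", "C"), ("B", "D")]

scores = benchmark_scores(model_edges, database)
print(scores)
```

In this toy case one of three edges is recovered by the directed records alone, one by the undirected record alone, and two when all records are used, which shows why the three scores are worth reporting separately.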

We next applied this benchmarking framework to benchmark five protein interaction databases in their ability to recover edges from a logic-based differential equation

Benchmarking protein interactions to networks using Pathway Commons interaction annotations
Protein interaction databases provide not only the interactions themselves but also annotations on the type of interactions. We asked whether certain types of interaction annotations were more associated with successful benchmarking to edges in the three network [...] database. For example, looking at functional interactions, the three manually curated networks exhibited a higher percentage of annotations for "controls state change of" and "controls phosphorylation of" compared to the overall Pathway Commons database (Figure 3a). In contrast, the fraction of reactions classified as "catalysis precedes" and "controls expression of" is lower in the manually curated networks. This indicates that the process of manually curating a network model may cater to specific types of functional interactions, but it is unclear whether this reflects curation bias or utility. Physical interactions annotated as "interacts with" or "in complex with" were more balanced across manually curated networks and Pathway Commons. The higher proportion of "controls-state-change-of" and "controls-

Overall, we found that roughly 70% of edges in the manually curated networks could be recalled [...] Signor. This approach identified CaMKII phosphorylation of CREB and PKC autophosphorylation as important for the ability to accurately predict ANP phosphorylation.

[...] be recovered from existing protein interaction databases, this study also reveals several limitations of these databases specific to constructing predictive network models. The most obvious need is for more interactions to be incorporated into protein interaction databases. Pathway Commons is the largest database considered here and had the highest recovery rate. However, the quality of curation is also highly important. Signor has a greater degree of manual curation, and indeed while it has only 3.8% as many directed interactions as Pathway Commons, Signor recovered 51% and 45% as many interactions from the mechano-signaling and fibroblast

In conclusion, this benchmarking study revealed that protein interaction databases have significant utility both for recovering known edges and for predicting new edges in predictive network models. This demonstrates a promising approach for systematically expanding manually curated network models and reveals new insights into cardiac hypertrophy signaling.

Comparing network model topology to protein interaction databases
[...] For example, node A is denoted as activating node C in rule 3 of the toy network (Figure 1). Because node A represents genes A1 and A2 and the C node represents gene C, two pairwise gene product interactions (A1 with C and A2 with C) are used to benchmark this edge against databases. In scoring this interaction against a database, each of the gene product interaction combinations is compared to the list of interactions in a database. If either of the combinations is present, the edge is scored as being present in the database.
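The edge-scoring rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used in the study: `edge_in_database`, `node_genes`, and `db_pairs` are hypothetical names, and the database is reduced to a set of `(source gene, target gene)` pairs.

```python
from itertools import product


def edge_in_database(source, target, node_genes, db_pairs):
    """Score a model edge as present if any gene-pair combination is listed."""
    # Enumerate every pairing of a source-node gene with a target-node gene.
    combos = product(node_genes[source], node_genes[target])
    return any(pair in db_pairs for pair in combos)


# Hypothetical annotation: node A stands for genes A1 and A2, node C for gene C.
node_genes = {"A": ["A1", "A2"], "C": ["C"]}
# Hypothetical database content, reduced to gene pairs.
db_pairs = {("A2", "C"), ("B", "C")}

print(edge_in_database("A", "C", node_genes, db_pairs))  # True
```

Here the edge from node A to node C counts as present because the A2 and C pair is listed, even though the A1 and C pair is not.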

Across an entire network, the performance of a database was determined by calculating the

Determining annotations for network edges
The possible genes corresponding to each node in a given network were used to