PT - JOURNAL ARTICLE
AU - Florian Klimm
AU - Charlotte M. Deane
AU - Gesine Reinert
TI - Hypergraphs for predicting essential genes using multiprotein complex data
AID - 10.1101/2020.04.03.023937
DP - 2020 Jan 01
TA - bioRxiv
PG - 2020.04.03.023937
4099 - http://biorxiv.org/content/early/2020/04/05/2020.04.03.023937.short
4100 - http://biorxiv.org/content/early/2020/04/05/2020.04.03.023937.full
AB - Protein-protein interactions are crucial in many biological pathways and facilitate cellular function. Investigating these interactions as a graph of pairwise interactions can help to gain a systemic understanding of cellular processes. It is known, however, that proteins interact not exclusively in pairs but also in polyadic interactions: they can form multiprotein complexes, which are stable interactions between multiple proteins. In this manuscript, we use hypergraphs to investigate multiprotein complex data. We investigate two random null models to test which hypergraph properties occur as a consequence of constraints, such as the size and the number of multiprotein complexes. We find that assortativity, the number of connected components, and clustering differ between the data and these null models. Our main finding is that projecting a hypergraph of polyadic interactions onto a graph of pairwise interactions identifies different proteins as hubs than the hypergraph itself does. We find in our data set that the hypergraph degree is a more accurate predictor of gene essentiality than the degree in the pairwise graph. We find that analysing a hypergraph as a pairwise graph drastically changes the distribution of the local clustering coefficient. Furthermore, using a pairwise interaction representation of multiprotein complex data may lead to a spurious hierarchical structure, which is not observed in the hypergraph. Hence, we illustrate that hypergraphs can be more suitable than pairwise graphs for the analysis of multiprotein complex data.
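
The effect of the projection on hub identification can be illustrated with a minimal sketch. It is not the authors' code or data: the complex names, protein labels, and the helper functions `hypergraph_degree` and `projected_degree` are hypothetical, and the projection is assumed to be the usual one in which two proteins are linked whenever they share at least one complex. Here the hypergraph degree of a protein is taken to be the number of complexes it belongs to, and the pairwise degree is its number of distinct neighbours in the projected graph.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: each complex (hyperedge) is a set of made-up protein labels.
complexes = {
    "c1": {"A", "B", "C", "D", "E"},  # one large complex
    "c2": {"F", "G"},                 # F takes part in three small complexes
    "c3": {"F", "H"},
    "c4": {"F", "I"},
}

def hypergraph_degree(complexes):
    """Number of complexes (hyperedges) each protein belongs to."""
    degree = defaultdict(int)
    for members in complexes.values():
        for protein in members:
            degree[protein] += 1
    return dict(degree)

def projected_degree(complexes):
    """Degree in the pairwise projection: two proteins are adjacent
    if they co-occur in at least one complex (an assumed projection rule)."""
    neighbours = defaultdict(set)
    for members in complexes.values():
        for u, v in combinations(sorted(members), 2):
            neighbours[u].add(v)
            neighbours[v].add(u)
    return {protein: len(adj) for protein, adj in neighbours.items()}

hyper = hypergraph_degree(complexes)
pair = projected_degree(complexes)

for protein in sorted(hyper):
    print(protein, "hypergraph degree:", hyper[protein],
          "pairwise degree:", pair.get(protein, 0))
```

In this toy example, protein F has the highest hypergraph degree because it participates in three complexes, whereas the members of the single large complex have the highest pairwise degree even though each belongs to only one complex. The two representations therefore rank different proteins as hubs.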