RT Journal Article
SR Electronic
T1 The Soft Vertex Classification for Active Module Identification Problem
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 407460
DO 10.1101/407460
A1 Alexeev, Nikita
A1 Isomurodov, Javlon
A1 Korotkevich, Gennady
A1 Sergushichev, Alexey
YR 2018
UL http://biorxiv.org/content/early/2018/09/04/407460.abstract
AB Motivation: Integrative network methods are commonly used to interpret high-throughput experimental biological data: transcriptomics, proteomics, metabolomics and others. One common approach is to find a connected subnetwork of a global interaction network that best encompasses significant individual changes in the data and represents a so-called active module. Methods implementing this approach usually find a single subnetwork and thus solve a hard classification problem for vertices. This subnetwork inherently contains erroneous vertices, while no instrument is provided to estimate the confidence level of any particular vertex inclusion. To address this issue, in the current study we consider the active module problem as a soft classification problem. We propose a method that estimates the probability of each vertex belonging to the active module, based on Markov chain Monte Carlo subnetwork sampling. Results: The proposed method makes it possible to estimate the probability that an individual vertex belongs to the active module as well as the false discovery rate (FDR) for a given set of vertices. Given the estimated probabilities, it becomes possible to provide a connected subgraph in a consistent manner for any given FDR level: no vertex can disappear when the FDR level is relaxed. We show on a simulated dataset that the proposed method has good computational performance and high classification accuracy. As an example of the performance of our method on real data, we run it on a protein-protein interaction network together with a DLBCL gene expression dataset. The results are consistent with previous studies while, at the same time, the proposed approach is more flexible. Source code is available at https://github.com/ctlab/mcmcRanking under the MIT licence.