RT Journal Article
SR Electronic
T1 A strategy to incorporate prior knowledge into correlation network cutoff selection
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 792697
DO 10.1101/792697
A1 Elisa Benedetti
A1 Maja Pučić-Baković
A1 Toma Keser
A1 Nathalie Gerstner
A1 Mustafa Büyüközkan
A1 Tamara Štambuk
A1 Maurice H.J. Selman
A1 Igor Rudan
A1 Ozren Polašek
A1 Caroline Hayward
A1 Hassen Al-Amin
A1 Karsten Suhre
A1 Gabi Kastenmüller
A1 Gordan Lauc
A1 Jan Krumsiek
YR 2019
UL http://biorxiv.org/content/early/2019/10/03/792697.abstract
AB Correlation networks are commonly used to statistically extract biological interactions between omics markers. Network edge selection is typically based on the significance of the underlying correlation coefficients. A statistical cutoff, however, is not guaranteed to capture biological reality, and heavily depends on dataset properties such as sample size. We here propose an alternative, innovative approach to address the problem of network reconstruction. Specifically, we developed a cutoff selection algorithm that maximizes the agreement to a given ground truth. We first evaluate the approach on IgG glycomics data, for which the biochemical pathway is known and well-characterized. The optimal network outperforms networks obtained with statistical cutoffs and is robust with respect to sample size. Importantly, we can show that even in the case of incomplete or incorrect prior knowledge, the optimal network is close to the true optimum. We then demonstrate the generalizability of the approach on an untargeted metabolomics and a transcriptomics dataset from The Cancer Genome Atlas (TCGA). For the transcriptomics case, we demonstrate that the optimized network is superior to statistical networks in systematically retrieving interactions that were not included in the biological reference used for the optimization.
Overall, this paper shows that using prior information for correlation network inference is superior to using regular statistical cutoffs, even if the prior information is incomplete or partially inaccurate.