PT - JOURNAL ARTICLE
AU - Qingyang Zhang
AU - Xuan Shi
TI - A Mixture Copula Bayesian Network Model for Multimodal Genomic Data
AID - 10.1101/110288
DP - 2017 Jan 01
TA - bioRxiv
PG - 110288
4099 - http://biorxiv.org/content/early/2017/02/22/110288.short
4100 - http://biorxiv.org/content/early/2017/02/22/110288.full
AB - Gaussian Bayesian networks have become a widely used framework for estimating directed associations between jointly Gaussian variables, where the network structure encodes the decomposition of the multivariate normal density into local terms. However, the resulting estimates can be inaccurate when the normality assumption is moderately or severely violated, making the framework unsuitable for recent genomic data such as The Cancer Genome Atlas data. In the present paper, we propose a mixture copula Bayesian network model which provides great flexibility in modeling non-Gaussian and multimodal data for causal inference. The parameters in mixture copula functions can be efficiently estimated by a routine Expectation-Maximization algorithm. A heuristic search algorithm based on the Bayesian information criterion is developed to estimate the network structure, and prediction can be further improved by taking the best-scoring network out of multiple predictions obtained from random initial values. Our method outperforms Gaussian Bayesian networks and regular copula Bayesian networks in terms of modeling flexibility and prediction accuracy, as demonstrated using a cell signaling dataset. We apply the proposed methods to The Cancer Genome Atlas data to study the genetic and epigenetic pathways that underlie serous ovarian cancer. Abbreviations: TCGA, The Cancer Genome Atlas; BN, Bayesian network; GBN, Gaussian Bayesian network; CBN, Copula Bayesian network; MCBN, Mixture copula Bayesian network; EM, Expectation-Maximization; BIC, Bayesian information criterion.