RT Journal Article
SR Electronic
T1 IMIX: A multivariate mixture model approach to integrative analysis of multiple types of omics data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.06.23.167312
DO 10.1101/2020.06.23.167312
A1 Ziqiao Wang
A1 Peng Wei
YR 2020
UL http://biorxiv.org/content/early/2020/06/24/2020.06.23.167312.abstract
AB Motivation: Integrative genomic analysis is a powerful tool to study the biological mechanisms underlying a complex disease or trait across multiplatform high-dimensional data, such as DNA methylation, copy number variation (CNV), and gene expression. It is common to perform large-scale genome-wide association analysis of an outcome for each data type separately and combine the results ad hoc, leading to loss of statistical power and uncontrolled overall false discovery rate (FDR). Results: We propose a multivariate mixture model framework (IMIX) that integrates multiple types of genomic data and allows examining and relaxing the commonly adopted conditional independence assumption. We investigate across-data-type FDR control in IMIX, and show the gain in lower misclassification rates at controlled overall FDR compared with established individual data type analysis strategies, such as Benjamini-Hochberg FDR control, the q-value, and the local FDR control by extensive simulations. IMIX features statistically principled model selection, FDR control, and computational efficiency. Applications to the Cancer Genome Atlas (TCGA) data provide novel multi-omic insights into the luminal/basal subtyping of bladder cancer and the prognosis of pancreatic cancer. Availability and implementation: We have implemented our method in R package “IMIX” with instructions and examples available at https://github.com/ziqiaow/IMIX. Competing Interest Statement: The authors have declared no competing interest.