TY - JOUR T1 - Min-Redundancy and Max-Relevance Multi-view Feature Selection for Predicting Ovarian Cancer Survival using Multi-omics Data JF - bioRxiv DO - 10.1101/317982 SP - 317982 AU - Yasser EL-Manzalawy AU - Tsung-Yu Hsieh AU - Manu Shivakumar AU - Dokyoon Kim AU - Vasant Honavar Y1 - 2018/01/01 UR - http://biorxiv.org/content/early/2018/05/09/317982.abstract N2 - Background Large-scale collaborative precision medicine initiatives (e.g., The Cancer Genome Atlas (TCGA)) are yielding rich multi-omics data. Integrative analyses of the resulting multi-omics data, such as somatic mutation, copy number alteration (CNA), DNA methylation, miRNA, gene expression, and protein expression, offer the tantalizing possibilities of realizing the potential of precision medicine in cancer prevention, diagnosis, and treatment by substantially improving our understanding of underlying mechanisms as well as the discovery of novel biomarkers for different types of cancers. However, such analyses present a number of challenges, including the heterogeneity of data types, and the extreme high-dimensionality of omics data.Methods In this study, we propose a novel framework for integrating multi-omics data based on multi-view feature selection, an emerging research problem in machine learning research. We also present a novel multi-view feature selection algorithm, MRMR-mv, which adapts the well-known Min-Redundancy and Maximum-Relevance (MRMR) single-view feature selection algorithm for the multi-view settings.Results We report results of experiments on the task of building a predictive model of cancer survival from an ovarian cancer multi-omics dataset derived from the TCGA database. Our results suggest that multi-view models for predicting ovarian cancer survival outperform both view-specific models (i.e., models trained and tested using one multi-omics data source) and models based on two baseline data fusion methods.Conclusions Our results demonstrate the potential of multi-view feature selection in integrative analyses and predictive modeling from multi-omics data. ER -