PT - JOURNAL ARTICLE
AU - Nimrod Rappoport
AU - Ron Shamir
TI - Multi-omic and multi-view clustering algorithms: review and cancer benchmark
AID - 10.1101/371120
DP - 2018 Jan 01
TA - bioRxiv
PG - 371120
4099 - http://biorxiv.org/content/early/2018/07/19/371120.short
4100 - http://biorxiv.org/content/early/2018/07/19/371120.full
AB - High throughput experimental methods developed in recent years have been used to collect large biomedical omics datasets. Clustering of such datasets has proven invaluable for biological and medical research, and helped reveal structure in data from several domains. Such analysis is often based on investigation of a single omic. The decreasing cost and development of additional high throughput methods now enable measurement of multi-omic data. Clustering multi-omic data has the potential to reveal further systems-level insights, but raises computational and biological challenges. Here we review algorithms for multi-omics clustering, and discuss key issues in applying these algorithms. Our review covers methods developed specifically for multi-omic data as well as generic multi-view methods developed in the machine learning community for joint clustering of multiple data types. In addition, using cancer data from TCGA, we perform an extensive benchmark spanning ten different cancer types, providing the first systematic benchmark comparison of leading multi-omics and multi-view clustering algorithms. The results highlight several key questions regarding the use of single- vs. multi-omics, the choice of clustering strategy, the power of generic multi-view methods and the use of approximated p-values for gauging solution quality. Due to the rapidly increasing use of multi-omics data, these issues may be important for future progress in the field.