PT - JOURNAL ARTICLE
AU - Hua Chai
AU - Xiang Zhou
AU - Zifeng Cui
AU - Jiahua Rao
AU - Zheng Hu
AU - Yutong Lu
AU - Huiying Zhao
AU - Yuedong Yang
TI - Integrating multi-omics data with deep learning for predicting cancer prognosis
AID - 10.1101/807214
DP - 2019 Jan 01
TA - bioRxiv
PG - 807214
4099 - http://biorxiv.org/content/early/2019/10/17/807214.short
4100 - http://biorxiv.org/content/early/2019/10/17/807214.full
AB - Motivation: Accurate prediction of cancer prognosis is necessary for choosing precise treatment strategies for patients. One effective approach is to integrate multi-omics data, which reduces the impact of noise within any single omics data type. However, integrating multi-omics data introduces a large number of redundant variables while sample sizes remain relatively small. In this study, we employed autoencoder networks to extract important features, which were then input to a proportional hazards model to predict cancer prognosis. Results: The method was applied to 12 common cancers from The Cancer Genome Atlas. The results show that multi-omics data improve the C-index for prognosis prediction by 4.1% on average over mRNA data alone, and that our method outperforms previous approaches by at least 7.4%. A comparison of the contributions of the individual omics data types shows that mRNA contributes the most, followed by DNA methylation, miRNA, and copy number variation. In a case study of differential gene expression analysis, we identified 161 differentially expressed genes in cervical cancer, among which 77 genes (65.8%) have been proven to be associated with cancer. In addition, we performed a cross-cancer test in which a model trained on one cancer was used to predict the prognosis of another cancer, and found that 23 pairs of cancers have a C-index larger than 0.5, with the largest value being 0.68. Thus, this study provides a deep learning framework that effectively integrates multiple omics data to predict cancer prognosis.
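
The abstract reports every result as a C-index (concordance index), so a brief illustration of that metric may help when reading the numbers. The following is a minimal sketch of Harrell's C-index in plain Python, not the authors' evaluation code; the function name `concordance_index` and the toy patient data are assumptions made for illustration. A value of 0.5 corresponds to random ranking, which is presumably why the cross-cancer test uses 0.5 as its threshold.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times       -- follow-up time per patient
    events      -- 1 if the event (e.g. death) was observed, 0 if censored
    risk_scores -- predicted risk; higher means shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only comparable if the earlier time is an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier event correctly given higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four patients, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.8, 0.1]

# 4 of the 5 comparable pairs are ranked correctly.
print(concordance_index(toy_times, toy_events, toy_risks))
```

In practice a survival library would normally be used instead of a hand-written double loop, which scales quadratically with the number of patients.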