PT - JOURNAL ARTICLE
AU - Runpu Chen
AU - Le Yang
AU - Steve Goodison
AU - Yijun Sun
TI - Deep Learning Approach to Identifying Breast Cancer Subtypes Using High-Dimensional Genomic Data
AID - 10.1101/629865
DP - 2019 Jan 01
TA - bioRxiv
PG - 629865
4099 - http://biorxiv.org/content/early/2019/06/08/629865.short
4100 - http://biorxiv.org/content/early/2019/06/08/629865.full
AB - Motivation: Cancer subtype classification has the potential to significantly improve disease prognosis and develop individualized patient management. Existing methods are limited by their ability to handle extremely high-dimensional data and by the influence of misleading, irrelevant factors, resulting in ambiguous and overlapping subtypes. Results: To address the above issues, we proposed a novel approach to disentangling and eliminating irrelevant factors by leveraging the power of deep learning. Specifically, we designed a deep-learning framework, referred to as DeepType, that performs joint supervised classification, unsupervised clustering and dimensionality reduction to learn cancer-relevant data representation with cluster structure. We applied DeepType to the METABRIC breast cancer dataset and compared its performance to alternative state-of-the-art methods. DeepType significantly outperformed the existing methods, identifying more robust subtypes while using fewer genes. The new approach provides a framework for the derivation of more accurate and robust molecular cancer subtypes by using increasingly complex, multi-source data. Availability and implementation: An open-source software package for the proposed method is freely available at www.acsu.buffalo.edu/~yijunsun/lab/DeepType.html.