PT - JOURNAL ARTICLE
AU - Boyu Lyu
AU - Anamul Haque
TI - Deep Learning Based Tumor Type Classification Using Gene Expression Data
AID - 10.1101/364323
DP - 2018 Jan 01
TA - bioRxiv
PG - 364323
4099 - http://biorxiv.org/content/early/2018/07/11/364323.short
4100 - http://biorxiv.org/content/early/2018/07/11/364323.full
AB - Differential analysis accounts for the largest part of standard RNA-Seq analysis practice. However, the conventional method matches tumor samples against normal samples from the same tumor type. The output of such a method fails to differentiate tumor types because it lacks knowledge of other tumor types. The Pan-Cancer Atlas provides abundant information on 33 prevalent tumor types, which can serve as prior knowledge for generating tumor-specific biomarkers. In this paper, we embedded the high-dimensional RNA-Seq data into 2-D images and used a convolutional neural network to classify the 33 tumor types. The final accuracy was 95.59%, higher than that reported in another paper applying the GA/KNN method to the same dataset. Based on the idea of Guided Grad-CAM, we generated a significance heat-map over all genes for each class. Through functional analysis of the genes with high intensities in the heat-maps, we validated that these top genes are related to tumor-specific pathways and that some of them have already been used as biomarkers, which demonstrates the effectiveness of our method. To our knowledge, we are the first to apply a convolutional neural network to the Pan-Cancer Atlas for classification, and also the first to match the significance of classification with the importance of genes. Our experimental results show that the method performs well and could also be applied to other genomics data.
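
The embedding step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how genes are ordered or laid out in the image, so the row-major layout, the zero padding to a square, and the function name `expression_to_image` are assumptions made for illustration, not the authors' procedure.

```python
import math

import numpy as np


def expression_to_image(expr, side=None):
    """Reshape a 1-D expression vector into a square 2-D array.

    Hypothetical helper: row-major layout and zero padding are assumptions,
    not details taken from the abstract.
    """
    expr = np.asarray(expr, dtype=np.float32)
    if expr.ndim != 1 or expr.size == 0:
        raise ValueError("expected a non-empty 1-D expression vector")
    n = expr.size
    if side is None:
        # Smallest side length such that side * side >= n.
        side = math.isqrt(n - 1) + 1
    if side * side < n:
        raise ValueError("side is too small to hold every gene")
    # Pad the tail with zeros so the vector fills the square exactly.
    padded = np.zeros(side * side, dtype=np.float32)
    padded[:n] = expr
    return padded.reshape(side, side)


# Five expression values fill a 3 x 3 image, with four padded zeros at the end.
img = expression_to_image([1.0, 2.0, 3.0, 4.0, 5.0])
```

An array like `img` would typically be given a channel dimension before being passed to a convolutional network; that step depends on the framework and is not shown here.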