RT Journal Article
SR Electronic
T1 Inference of splicing motifs through visualization of recurrent networks
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 451906
DO 10.1101/451906
A1 Aparajita Dutta
A1 Aman Dalmia
A1 R Athul
A1 Kusum Kumari Singh
A1 Ashish Anand
YR 2018
UL http://biorxiv.org/content/early/2018/10/25/451906.abstract
AB Neural models have obtained state-of-the-art performance on several genome sequence-based prediction tasks. Such models often take only nucleotide sequences as input and learn relevant features on their own. However, extracting interpretable motifs from the model remains a challenge. This work explores four existing visualization techniques in terms of their ability to infer relevant sequence patterns learned by a recurrent neural network (RNN) on the task of splice junction identification. The visualization techniques have been adapted to take genome sequences as input. The visualizations inspect genomic regions at the level of a single nucleotide as well as a span of consecutive nucleotides. This inspection is performed from the perspective of the overall model as well as individual sequences. We infer canonical and non-canonical splicing motifs from a single neural model. We also propose a cumulative scoring function that ranks the combination of significant regions across the sequences involved in non-canonical splicing events. Results indicate that the visualization technique giving preference to k-mer patterns can extract known splicing motifs better than the techniques focusing on a single nucleotide.