Abstract
Neural models have achieved state-of-the-art performance on several genome sequence-based prediction tasks. Such models often take only nucleotide sequences as input and learn relevant features on their own. However, extracting interpretable motifs from the model remains a challenge. This work explores four existing visualization techniques in their ability to infer relevant sequence patterns learned by a recurrent neural network (RNN) on the task of splice junction identification. The visualization techniques have been adapted to take genome sequences as input. The visualizations inspect genomic regions at the level of a single nucleotide as well as a span of consecutive nucleotides. This inspection is performed from the perspective of the overall model as well as individual sequences. We infer canonical and non-canonical splicing motifs from a single neural model. We also propose a cumulative scoring function that ranks combinations of significant regions across the sequences involved in non-canonical splicing events. Results indicate that the visualization technique that gives preference to k-mer patterns extracts known splicing motifs better than the techniques that focus on a single nucleotide.