TY  - JOUR
T1  - Efficient and accurate detection of splice junctions from RNAseq with Portcullis
JF  - bioRxiv
DO  - 10.1101/217620
SP  - 217620
AU  - Daniel Mapleson
AU  - Luca Venturini
AU  - Gemy Kaithakottil
AU  - David Swarbreck
Y1  - 2017/01/01
UR  - http://biorxiv.org/content/early/2017/11/10/217620.abstract
N2  - Next-generation sequencing (NGS) technologies enable rapid and cheap genome-wide transcriptome analysis, providing vital information about gene structure, transcript expression and alternative splicing. Key to this is the accurate identification of exon-exon junctions from RNA sequenced (RNA-seq) reads. A number of RNA-seq aligners capable of splitting reads across these splice junctions (SJs) have been developed; however, it has been shown that while they correctly identify most genuine SJs available in a given sample, they also often produce large numbers of incorrect SJs. Herein we describe the extent of this problem using popular RNA-seq mapping tools, and present a new method, called Portcullis, to rapidly filter false SJs from spliced alignments produced by any RNA-seq mapper capable of creating SAM/BAM files. We show that Portcullis distinguishes between genuine and false positive junctions to a high degree of accuracy across different species, samples, expression levels, error profiles and read lengths. Portcullis makes efficient use of memory and threading and, to our knowledge, is currently the only SJ prediction tool that reliably scales for use with large RNA-seq datasets and large, highly fragmented genomes, whilst delivering highly accurate SJs. Availability: Portcullis is available under the GPLv3 license at http://maplesond.github.io/portcullis/ Contact: daniel.mapleson{at}earlham.ac.uk
ER  -