TY  - JOUR
T1  - ACValidator: a novel assembly-based approach for in silico validation of circular RNAs
JF  - bioRxiv
DO  - 10.1101/556597
SP  - 556597
AU  - Shobana Sekar
AU  - Philipp Geiger
AU  - Jonathan Adkins
AU  - Geidy Serrano
AU  - Thomas G. Beach
AU  - Winnie S. Liang
Y1  - 2019/01/01
UR  - http://biorxiv.org/content/early/2019/02/21/556597.abstract
N2  - Circular RNAs (circRNAs) are evolutionarily conserved RNA species that are formed when exons ‘back-splice’ to each other. Current computational algorithms to detect these back-splicing junctions produce divergent results, and hence there is a need for a method to distinguish true positive circRNAs. To this end, we developed ACValidator (Assembly based CircRNA Validator) for in silico validation of circRNAs. ACValidator extracts reads from a user-defined window on either side of the circRNA junction and assembles them to generate contigs. These contigs are aligned against the circRNA sequence to find contigs spanning the back-spliced junction. When evaluated on simulated datasets, ACValidator achieved over 80% sensitivity and specificity on datasets with an average of 10 circRNA-supporting reads and with read lengths of at least 100 bp. In experimental datasets, ACValidator produced higher validation percentages for samples treated with ribonuclease R than for non-treated samples. Our workflow is applicable to non-polyA-selected RNAseq datasets and can also be used as a candidate selection strategy for experimental validations. All workflow scripts are freely accessible on our GitHub page, https://github.com/tgen/ACValidator, along with detailed instructions to set up and run ACValidator. Author summary: Circular RNAs (circRNAs) are a recent addition to the class of non-coding RNAs and are produced when exons ‘back-splice’ and form closed circular loops. Although several computational algorithms have been developed to detect circRNAs from RNA sequencing (RNAseq) data, they produce divergent results. We therefore developed the software Assembly based Circular RNA Validator (ACValidator) as an orthogonal strategy to separately validate predicted circRNAs in silico. ACValidator takes as input a Sequence Alignment/Map (SAM) file and the circRNA coordinate(s) to be validated. Reads surrounding the circRNA junction are extracted from the SAM file and assembled to generate contigs. These contigs are then aligned against the circRNA sequence to identify contigs that span the back-spliced junction. We evaluated our workflow on simulated as well as experimental datasets to demonstrate the utility of our approach. ACValidator is implemented in Python and is computationally efficient, with a run time of less than 2 minutes for an 8 GB SAM file. This workflow is applicable to non-polyA-selected RNAseq datasets and can also be used as a candidate selection strategy for experimental validations.
ER  - 
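The read-extraction step described in the abstract can be illustrated with a minimal Python sketch. This is not the ACValidator code: the function names `junction_windows` and `extract_reads` are hypothetical, and the sketch assumes that the "window on either side of the junction" means a fixed number of bases around each of the two back-splice coordinates, and that a read is selected by its leftmost mapped position alone. The actual tool may define the window and select reads differently.

```python
from typing import Iterable, Iterator, Tuple

Window = Tuple[int, int]


def junction_windows(start: int, end: int, window: int) -> Tuple[Window, Window]:
    """Return 1-based inclusive windows around each back-splice coordinate.

    Assumption: the window extends `window` bases to both sides of the
    circRNA start and end coordinates.
    """
    return (
        (max(1, start - window), start + window),
        (max(1, end - window), end + window),
    )


def extract_reads(
    sam_lines: Iterable[str], chrom: str, start: int, end: int, window: int
) -> Iterator[str]:
    """Yield SAM records whose leftmost mapped position lies in either window."""
    windows = junction_windows(start, end, window)
    for line in sam_lines:
        if line.startswith("@"):  # header lines carry no alignments
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM record has 11 mandatory fields
            continue
        rname, pos = fields[2], int(fields[3])  # RNAME and 1-based POS
        if rname != chrom:
            continue
        if any(lo <= pos <= hi for lo, hi in windows):
            yield line
```

In the workflow described above, the selected reads would then be passed to an assembler, and the resulting contigs aligned against the circRNA sequence to check whether any contig spans the back-spliced junction. Those steps are not shown here.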