RT Journal Article SR Electronic T1 Rail-RNA: Scalable analysis of RNA-seq splicing and coverage JF bioRxiv FD Cold Spring Harbor Laboratory SP 019067 DO 10.1101/019067 A1 Abhinav Nellore A1 Leonardo Collado-Torres A1 Andrew E. Jaffe A1 James Morton A1 Jacob Pritt A1 José Alquicira-Hernández A1 Jeffrey T. Leek A1 Ben Langmead YR 2015 UL http://biorxiv.org/content/early/2015/05/07/019067.abstract AB RNA sequencing (RNA-seq) experiments now span hundreds to thousands of samples. Current spliced alignment software is designed to analyze each sample separately. Consequently, no information is gained from analyzing multiple samples together, and it is difficult to reproduce the exact analysis without access to original computing resources. We describe Rail-RNA, a cloud-enabled spliced aligner that analyzes many samples at once. Rail-RNA eliminates redundant work across samples, making it more efficient as samples are added. For many samples, Rail-RNA is more accurate than annotation-assisted aligners. We use Rail-RNA to align 666 RNA-seq samples from the GEUVADIS project on Amazon Web Services in 12 hours for US$0.69 per sample. Rail-RNA produces alignments and base-resolution bigWig coverage files, ready for use with downstream packages for reproducible statistical analysis. We identify expressed regions in the GEUVADIS samples and show that both annotated and unannotated (novel) expressed regions exhibit consistent patterns of variation across populations and with respect to known confounders. Rail-RNA is open-source software available at http://rail.bio.