TY  - JOUR
T1  - Alternative splicing detection workflow needs a careful combination of sample prep and bioinformatics analysis
JF  - bioRxiv
DO  - 10.1101/005546
SP  - 005546
AU  - M. Carrara
AU  - J. Lum
AU  - F. Cordero
AU  - M. Beccuti
AU  - M. Poidinger
AU  - S. Donatelli
AU  - R. A. Calogero
AU  - F. Zolezzi
Y1  - 2014/01/01
UR  - http://biorxiv.org/content/early/2014/05/26/005546.abstract
N2  - Background: RNAseq provides remarkable power in the area of biomarker discovery and disease stratification. The main technical steps affecting the results of RNAseq experiments are Library Sample Preparation (LSP) and Bioinformatics Analysis (BA). To the best of our knowledge, the combined effect of LSP and BA has never been comparatively evaluated, although such an evaluation could provide valuable knowledge for optimizing alternative splicing detection, which is a challenging task because moderate fold-change differences must be detected against a complex background of isoforms. Results: Different LSPs (TruSeq unstranded/stranded, ScriptSeq, NuGEN) allow the detection of a large common set of isoforms. However, each LSP also detects a smaller set of isoforms, characterized by both lower coverage and lower FPKM than the isoforms common to all LSPs. This characteristic is particularly critical for the low-input RNA NuGEN v2 LSP. The effect of a low-input LSP (NuGEN v2), relative to a high-input LSP (TruSeq), on the statistical detection of alternative splicing was studied using a benchmark dataset into which both synthetic reads and reads generated from the high-input (TruSeq) and low-input (NuGEN) LSPs were spiked. Statistical detection of alternative splicing (AltDE) was performed using prototypical BA methods for isoform reconstruction (Cuffdiff) and exon-level analysis (DEXSeq). Exon-level analysis performs slightly better than the isoform-reconstruction approach, although at most 50% of the spiked-in transcripts are detected. The performance of both isoform reconstruction and exon-level analysis improves as the number of input reads increases. Conclusion: Data derived from NuGEN v2 are not ideal input for AltDE, particularly when the exon-level approach is used. Notably, ribosomal depletion, compared with polyA+ selection, reduces the number of mappable coding reads, which is detrimental to AltDE. Furthermore, we observed that the performance of both isoform reconstruction and exon-level analysis depends strongly on the number of input reads.
ER  - 