TY  - JOUR
T1  - Reference-free reconstruction and quantification of transcriptomes from long-read sequencing
JF  - bioRxiv
DO  - 10.1101/2020.02.08.939942
SP  - 2020.02.08.939942
AU  - Ivan de la Rubia
AU  - Joel A. Indi
AU  - Silvia Carbonell
AU  - Julien Lagarde
AU  - M Mar Albà
AU  - Eduardo Eyras
Y1  - 2020/01/01
UR  - http://biorxiv.org/content/early/2020/02/09/2020.02.08.939942.abstract
N2  - Single-molecule long-read sequencing provides an unprecedented opportunity to measure the transcriptome from any sample 1–3. However, current methods for the analysis of transcriptomes from long reads rely on the comparison with a genome or transcriptome reference 2,4,5, or use multiple sequencing technologies 6,7. These approaches preclude the cost-effective study of species with no reference available, and the discovery of new genes and transcripts in individuals underrepresented in the reference. Methods for the assembly of DNA long-reads 8–10 cannot be directly transferred to transcriptomes since their consensus sequences lack the interpretability as genes with multiple transcript isoforms. To address these challenges, we have developed RATTLE, the first method for the reference-free reconstruction and quantification of transcripts from long reads. Using simulated data, transcript isoform spike-ins, and sequencing data from human and mouse tissues, we demonstrate that RATTLE accurately performs read clustering and error-correction. Furthermore, RATTLE predicts transcript sequences and their abundances with accuracy comparable to reference-based methods. RATTLE enables rapid and cost-effective long-read transcriptomics in any sample and any species, without the need of a genome or annotation reference and without using additional technologies.
ER  - 