Systematic assessment of long-read RNA-seq methods for transcript identification and quantification
Abstract
The Long-read RNA-Seq Genome Annotation Assessment Project (LRGASP) Consortium was formed to evaluate the effectiveness of long-read approaches for transcriptome analysis. The consortium generated over 427 million long-read sequences from cDNA and direct RNA datasets, encompassing human, mouse, and manatee species, using different protocols and sequencing platforms. Developers used these data to address challenges in transcript isoform detection and quantification, as well as de novo transcript isoform identification. The study revealed that libraries with longer, more accurate sequences produce more accurate transcripts than those with increased read depth, whereas greater read depth improved quantification accuracy. In well-annotated genomes, tools based on reference sequences demonstrated the best performance. When aiming to detect rare and novel transcripts or when using reference-free approaches, incorporating additional orthogonal data and replicate samples is advised. This collaborative study offers a benchmark for current practices and provides direction for future method development in transcriptome analysis.
Competing Interest Statement
Design of the project was discussed with Oxford Nanopore Technologies (ONT), Pacific Biosciences, and Lexogen. ONT provided partial support for flow cells and reagents. S.C-S and A.N.B. have received reimbursement for travel, accommodation, and conference fees to speak at events organized by ONT. A.N.B. is a consultant for Remix Therapeutics, Inc.