PT - JOURNAL ARTICLE
AU - Mingfu Shao
AU - Carl Kingsford
TI - Scallop enables accurate assembly of transcripts through phasing-preserving graph decomposition
AID - 10.1101/123612
DP - 2017 Jan 01
TA - bioRxiv
PG - 123612
4099 - http://biorxiv.org/content/early/2017/04/03/123612.short
4100 - http://biorxiv.org/content/early/2017/04/03/123612.full
AB - We introduce Scallop, an accurate, reference-based transcript assembler for RNA-seq data. Scallop significantly improves reconstruction of multi-exon and lowly expressed transcripts. On 10 human samples aligned with STAR, Scallop produces (on average) 35.7% and 37.5% more correct multi-exon transcripts than two leading transcript assemblers, StringTie [1] and TransComb [2], respectively. For transcripts expressed at low levels in the same samples, Scallop assembles 65.2% and 50.2% more correct multi-exon transcripts than StringTie and TransComb, respectively. Scallop obtains this improvement through a novel algorithm that we prove preserves all phasing paths from reads (including paired-end reads), while also producing a parsimonious set of transcripts and minimizing coverage deviation.