BinPacker: Packing-Based De Novo Transcriptome Assembly from RNA-seq Data

PLoS Comput Biol. 2016 Feb 19;12(2):e1004772. doi: 10.1371/journal.pcbi.1004772. eCollection 2016 Feb.

Abstract

High-throughput RNA-seq technology has provided an unprecedented opportunity to reveal the very complex structures of transcriptomes. However, it is an important and highly challenging task to assemble vast amounts of short RNA-seq reads into transcriptomes with alternative splicing isoforms. In this study, we present a novel de novo assembler, BinPacker, by modeling the transcriptome assembly problem as tracking a set of trajectories of items with their sizes representing coverage of their corresponding isoforms by solving a series of bin-packing problems. This approach, which subtly integrates coverage information into the procedure, has two exclusive features: 1) only splicing junctions are involved in the assembling procedure; 2) massive pell-mell reads are assembled seemingly by moving a comb along junction edges on a splicing graph. Being tested on both real and simulated RNA-seq datasets, it outperforms almost all the existing de novo assemblers on all the tested datasets, and even outperforms those ab initio assemblers on the real dog dataset. In addition, it runs substantially faster and requires less memory space than most of the assemblers. BinPacker is published under GNU GENERAL PUBLIC LICENSE and the source is available from: http://sourceforge.net/projects/transcriptomeassembly/files/BinPacker_1.0.tar.gz/download. Quick installation version is available from: http://sourceforge.net/projects/transcriptomeassembly/files/BinPacker_binary.tar.gz/download.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Animals
  • Computational Biology
  • Dogs
  • Gene Expression Profiling / methods*
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Mice
  • RNA, Messenger / chemistry
  • RNA, Messenger / genetics*
  • Sequence Analysis, RNA / methods*
  • Transcriptome / genetics*

Substances

  • RNA, Messenger