Accurate estimation of molecular counts in droplet-based single-cell RNA-seq experiments

Viktor Petukhov; Jimin Guo; Ninib Baryawno; Nicolas Severe; David Scadden; Maria G. Samsonova; Peter V. Kharchenko

doi:10.1101/171496

Abstract

Single-cell RNA-seq protocols provide a powerful means for examining the gamut of cell types and transcriptional states that make up complex biological tissues. Recently developed approaches based on droplet microfluidics, such as inDrop or Drop-seq, use massively multiplexed barcoding to enable simultaneous measurement of transcriptomes for thousands of individual cells. The increasing complexity of such data also creates challenges for subsequent computational processing and troubleshooting of these experiments, and few software options are currently available. Here we describe a flexible pipeline for processing droplet-based transcriptome data that implements barcode correction and classification of cell quality, and reports diagnostic information about the droplet libraries. We introduce advanced methods for correcting composition bias and sequencing errors affecting cellular and molecular barcodes to provide more accurate estimates of molecular counts in individual cells.