A general and flexible method for signal extraction from single-cell RNA-seq data

Nat Commun. 2018 Jan 18;9(1):284. doi: 10.1038/s41467-017-02554-5.

Abstract

Single-cell RNA-sequencing (scRNA-seq) is a powerful high-throughput technique that enables researchers to measure genome-wide transcription levels at the resolution of single cells. Because of the low amount of RNA present in a single cell, some genes may fail to be detected even though they are expressed; these genes are usually referred to as dropouts. Here, we present a general and flexible zero-inflated negative binomial model (ZINB-WaVE), which leads to low-dimensional representations of the data that account for zero inflation (dropouts), over-dispersion, and the count nature of the data. We demonstrate, with simulated and real data, that the model and its associated estimation procedure are able to give a more stable and accurate low-dimensional representation of the data than principal component analysis (PCA) and zero-inflated factor analysis (ZIFA), without the need for a preliminary normalization step.
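
To make the phrase "zero-inflated negative binomial" concrete, the following is a minimal sketch of a ZINB probability mass function for a single count. It is an illustration under stated assumptions, not the ZINB-WaVE estimation procedure: the abstract does not give a parameterization, so the mean `mu`, inverse-dispersion `theta`, and dropout probability `pi` used here are a common textbook choice, and the function names `nb_pmf` and `zinb_pmf` are hypothetical. The sketch does not include the low-dimensional factors that the model infers.

```python
import math


def nb_pmf(y, mu, theta):
    """Negative binomial pmf with mean mu > 0 and inverse-dispersion theta > 0.

    Assumed parameterization: variance = mu + mu**2 / theta, so smaller
    theta means stronger over-dispersion.
    """
    log_p = (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, theta, pi):
    """Zero-inflated negative binomial pmf.

    With probability pi the count is a structural zero (a "dropout");
    otherwise it is drawn from the negative binomial component.
    """
    point_mass_at_zero = 1.0 if y == 0 else 0.0
    return pi * point_mass_at_zero + (1.0 - pi) * nb_pmf(y, mu, theta)


# Illustrative values only (not taken from the paper).
p_zero_nb = nb_pmf(0, mu=5.0, theta=2.0)
p_zero_zinb = zinb_pmf(0, mu=5.0, theta=2.0, pi=0.3)
```

In this sketch the zero-inflation term raises the probability of observing a zero above what the negative binomial alone would give, while the count nature of the data and over-dispersion are handled by the negative binomial component.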

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Cell Line
  • Computational Biology / methods*
  • Computational Biology / statistics & numerical data
  • Gene Expression Profiling
  • Gene Expression Regulation
  • High-Throughput Nucleotide Sequencing / methods*
  • High-Throughput Nucleotide Sequencing / statistics & numerical data
  • Humans
  • Male
  • Mice
  • Neurons / cytology
  • Neurons / metabolism*
  • Principal Component Analysis
  • RNA / genetics*
  • RNA / metabolism
  • Single-Cell Analysis / methods*
  • Single-Cell Analysis / statistics & numerical data
  • Visual Cortex / cytology
  • Visual Cortex / metabolism

Substances

  • RNA