Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells

Nat Biotechnol. 2015 Feb;33(2):155-60. doi: 10.1038/nbt.3102. Epub 2015 Jan 19.

Abstract

Recent technical developments have enabled the transcriptomes of hundreds of cells to be assayed in an unbiased manner, opening up the possibility that new subpopulations of cells can be found. However, the effects of potential confounding factors, such as the cell cycle, on the heterogeneity of gene expression and therefore on the ability to robustly identify subpopulations remain unclear. We present and validate a computational approach that uses latent variable models to account for such hidden factors. We show that our single-cell latent variable model (scLVM) allows the identification of otherwise undetectable subpopulations of cells that correspond to different stages during the differentiation of naive T cells into T helper 2 cells. Our approach can be used not only to identify cellular subpopulations but also to tease apart different sources of gene expression heterogeneity in single-cell transcriptomes.
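
The idea of accounting for a hidden factor such as the cell cycle can be illustrated with a minimal sketch. This is not the scLVM implementation: the abstract does not give the model details, so the example swaps in a much simpler stand-in. It assumes the hidden factor can be approximated by the first principal component of a set of annotated cell-cycle genes and then removed from every gene by ordinary least squares. The data are simulated, and all names (infer_latent_factor, regress_out, the gene index ranges) are hypothetical.

```python
import numpy as np

def infer_latent_factor(Y_cc):
    """Estimate one hidden factor per cell from annotated cell-cycle genes.

    Assumption: the first principal component of the centered
    cells x genes matrix stands in for the latent variable.
    """
    Yc = Y_cc - Y_cc.mean(axis=0)
    U, S, _ = np.linalg.svd(Yc, full_matrices=False)
    x = U[:, 0] * S[0]
    return (x - x.mean()) / x.std()

def regress_out(Y, x):
    """Remove the factor x from every gene by least squares.

    Returns the residual ("corrected") expression and, per gene,
    the fraction of variance attributed to the factor.
    """
    Yc = Y - Y.mean(axis=0)
    beta = x @ Yc / (x @ x)            # per-gene regression slope
    resid = Yc - np.outer(x, beta)
    frac = 1.0 - resid.var(axis=0) / Yc.var(axis=0)
    return resid, frac

# Simulated data (purely illustrative, not from the paper).
rng = np.random.default_rng(0)
n_cells, n_genes, n_cc = 200, 50, 10
cycle = rng.normal(size=n_cells)            # hidden confounder
group = rng.integers(0, 2, size=n_cells)    # hidden subpopulation label

Y = rng.normal(scale=0.5, size=(n_cells, n_genes))
Y[:, :n_cc] += 2.0 * cycle[:, None]         # annotated cell-cycle genes
Y[:, n_cc:30] += 1.5 * cycle[:, None]       # other genes driven by the cycle
Y[:, 30:] += 1.0 * group[:, None]           # genes marking the subpopulation

x_hat = infer_latent_factor(Y[:, :n_cc])
Y_corrected, frac_cc = regress_out(Y, x_hat)

print("mean variance fraction, cycle-driven genes:", frac_cc[:30].mean())
print("mean variance fraction, subpopulation genes:", frac_cc[30:].mean())
```

In this toy setting the per-gene variance fraction separates cycle-driven genes from the rest, and clustering the corrected matrix rather than the raw one would be the step at which a subpopulation masked by the confounder could become visible.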

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Cell Differentiation / genetics*
  • Cell Lineage / genetics*
  • Computational Biology
  • Gene Expression Regulation, Developmental
  • Genetic Heterogeneity*
  • Mice
  • Models, Theoretical
  • Mouse Embryonic Stem Cells
  • RNA / genetics
  • Sequence Analysis, RNA
  • Single-Cell Analysis
  • Th2 Cells / cytology*
  • Transcriptome / genetics

Substances

  • RNA