Benchmarking single cell RNA-sequencing analysis pipelines using mixture control experiments

Nat Methods. 2019 Jun;16(6):479-487. doi: 10.1038/s41592-019-0425-8. Epub 2019 May 27.

Abstract

Single-cell RNA-sequencing (scRNA-seq) technology has undergone rapid development in recent years, leading to an explosion in the number of tailored data analysis methods. However, the current lack of gold-standard benchmark datasets makes it difficult for researchers to systematically compare the performance of the many methods available. Here, we generated a realistic benchmark experiment that included single cells and admixtures of cells or RNA to create 'pseudo cells' from up to five distinct cancer cell lines. In total, 14 datasets were generated using both droplet and plate-based scRNA-seq protocols. We compared 3,913 combinations of data analysis methods for tasks ranging from normalization and imputation to clustering, trajectory analysis and data integration. Evaluation revealed pipelines suited to different types of data for different tasks. Our data and analysis provide a comprehensive framework for benchmarking most common scRNA-seq analysis steps.
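
As a rough illustration of why the number of method combinations grows so quickly, the sketch below enumerates every pipeline that can be built by choosing one method per analysis step. The step names echo the tasks listed in the abstract, but the method names (`norm_a`, `impute_a`, and so on) and the function `enumerate_pipelines` are hypothetical placeholders. They are not the methods or the code used in the study, and the sketch does not reproduce the figure of 3,913.

```python
from itertools import product

# Hypothetical method names for each step (placeholders, not the
# methods that were actually benchmarked in the paper).
steps = {
    "normalization": ["norm_a", "norm_b", "norm_c"],
    "imputation": ["none", "impute_a"],
    "clustering": ["cluster_a", "cluster_b"],
}


def enumerate_pipelines(steps):
    """Return every pipeline as a dict mapping step name -> chosen method."""
    names = list(steps)
    method_lists = [steps[name] for name in names]
    # The Cartesian product picks one method per step in every possible way.
    return [dict(zip(names, combo)) for combo in product(*method_lists)]


pipelines = enumerate_pipelines(steps)
print(len(pipelines))  # 3 * 2 * 2 = 12 pipelines
```

Each returned dict could then be handed to a scoring routine, so that every combination is evaluated against the same dataset with known ground truth.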

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adenocarcinoma / genetics*
  • Benchmarking*
  • Computational Biology / methods*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Lung Neoplasms / genetics*
  • Sequence Analysis, RNA / methods*
  • Single-Cell Analysis / methods*
  • Software
  • Tumor Cells, Cultured