A Systematic Evaluation of Single Cell RNA-Seq Analysis Pipelines: Library preparation and normalisation methods have the biggest impact on the performance of scRNA-seq studies

Beate Vieth; Swati Parekh; Christoph Ziegenhain; Wolfgang Enard; Ines Hellmann

doi:10.1101/583013

Abstract

The recent rapid spread of single cell RNA sequencing (scRNA-seq) methods has created a large variety of experimental and computational pipelines for which best practices have not yet been established. Here, we use simulations based on five scRNA-seq library protocols in combination with nine realistic differential expression (DE) setups to systematically evaluate three mapping, four imputation, seven normalisation and four differential expression testing approaches, resulting in ∼3,000 pipelines and allowing us to also assess interactions among pipeline steps. We find that the choices of normalisation method and library preparation protocol have the biggest impact on scRNA-seq analyses. Specifically, library preparation determines the ability to detect symmetric expression differences, while normalisation dominates pipeline performance in asymmetric DE setups. Finally, we illustrate the importance of informed choices by showing that a good scRNA-seq pipeline can have the same impact on detecting a biological signal as quadrupling the sample size.
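To make the size of the evaluation grid concrete, the following is a minimal sketch that enumerates one option per pipeline step using the counts given in the abstract. The step labels are hypothetical placeholders, not the tools actually benchmarked. The abstract does not say how the ∼3,000 figure is composed; crossing the step combinations with the nine DE setups is an assumption here, made only because it yields a number of that order.

```python
from itertools import product

# Option counts per pipeline step, as stated in the abstract.
STEP_OPTIONS = {
    "mapping": 3,
    "imputation": 4,
    "normalisation": 7,
    "de_testing": 4,
}
N_DE_SETUPS = 9  # number of simulated DE setups, from the abstract


def enumerate_pipelines(step_options):
    """Return every combination of one option per step.

    Labels such as "mapping_1" are hypothetical placeholders, not real tool names.
    """
    labelled = [
        [f"{step}_{i}" for i in range(1, n + 1)]
        for step, n in step_options.items()
    ]
    return list(product(*labelled))


pipelines = enumerate_pipelines(STEP_OPTIONS)
n_combinations = len(pipelines)

# Assumption: each step combination is evaluated in each DE setup.
n_with_setups = n_combinations * N_DE_SETUPS

print(n_combinations, n_with_setups)
```

Under these assumptions the four analysis steps alone give 336 combinations, and crossing them with the nine DE setups gives 3,024. A full grid of this kind is what makes it possible to examine interactions among steps rather than only each step in isolation.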

Footnotes

+ hellmann{at}bio.lmu.de
In this revised manuscript, we added the analysis of a real dataset and show that pipeline choices indeed affect the identification and characterisation of cell types in scRNA-seq datasets. Furthermore, we investigate the detection biases that lead to the observed differences in the genes found by the different mappers and annotations. Finally, we added a downsampling function to our simulator powsimR, which now allows us to evaluate different sequencing depths and thus improves the comparability between the 10X Chromium and the other library preparation methods.

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.