Summary
Recently, Zimmerman et al.¹ highlighted the importance of accounting for the dependence between cells from the same individual when conducting differential expression analysis on single-cell RNA-sequencing data. Their work conclusively demonstrated the inadequacy of pseudoreplication approaches for such analysis, which was an important step forward. Zimmerman et al.¹ developed a hierarchical single-cell expression simulation approach (hierarchicell) to generate non-differentially expressed genes, and evaluated performance using the type 1 error rate: the proportion of non-differentially expressed genes indicated as differentially expressed by a model. However, evaluating such models on their type 1 or type 2 error rate in isolation is insufficient to determine their true performance; for example, a method with a low type 1 error rate may have a high type 2 error rate. Moreover, because no seed was set for the pseudo-random number generator used in hierarchicell, Zimmerman et al. evaluated the different methods on different simulated datasets. Here, we corrected these issues, reran the authors' analysis and found that pseudobulk methods outperformed mixed models.
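The two error rates discussed above, and the role of the seed, can be illustrated with a minimal Python sketch. It is not the hierarchicell implementation (which is an R package): the function names `error_rates` and `simulate_truth`, the significance level of 0.05 and the toy simulator are all assumptions made for illustration.

```python
import random


def error_rates(p_values, is_de, alpha=0.05):
    """Return (type 1 error rate, type 2 error rate) at significance level alpha.

    p_values: one p-value per gene from a differential expression model.
    is_de:    True where the gene was simulated as differentially expressed.
    """
    called = [p < alpha for p in p_values]
    # False positives: non-DE genes that the model called DE.
    false_pos = sum(c and not t for c, t in zip(called, is_de))
    # False negatives: DE genes that the model failed to call.
    false_neg = sum(t and not c for c, t in zip(called, is_de))
    n_null = sum(not t for t in is_de)
    n_de = sum(is_de)
    type1 = false_pos / n_null if n_null else float("nan")
    type2 = false_neg / n_de if n_de else float("nan")
    return type1, type2


def simulate_truth(n_genes, prop_de, seed):
    """Hypothetical stand-in for a simulator.

    Using a seeded generator means every call with the same seed returns the
    same set of genes, so all methods can be scored on identical data.
    """
    rng = random.Random(seed)
    return [rng.random() < prop_de for _ in range(n_genes)]
```

Reporting both values returned by `error_rates` guards against a method that looks well calibrated on non-differentially expressed genes while missing most true effects, and passing the same `seed` for every method keeps the comparison on a single simulated dataset.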
Contact Alan Murphy: a.murphy{at}imperial.ac.uk, Nathan Skene: n.skene{at}imperial.ac.uk
Code availability The modified version of hierarchicell, which returns all error metrics, uses the same simulated data across approaches and supports checkpointing (so that aborted or crashed runs can be resumed), is available at: https://github.com/neurogenomics/hierarchicell.
The benchmarking script, along with the results, is available at: https://github.com/Al-Murphy/reanalysis_scRNA_seq_benchmark.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Arising From: Zimmerman, K. D., Espeland, M. A. & Langefeld, C. D. Nature Communications (2021). https://doi.org/10.1038/s41467-021-21038-1
It is known that pseudobulk approaches have mis-calibrated confidence intervals, leading to lower than expected type 1 error rates. To account for this, we added ROC analyses that compare the approaches' sensitivity (1 minus the type 2 error rate) at a consistent type 1 error rate. Pseudobulk approaches still outperformed all other methods, showing that their lower type 1 error rate does not come at the cost of a higher type 2 error rate.
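The idea of comparing sensitivity at a consistent type 1 error rate can be sketched in Python as follows. This is an illustrative assumption about how one point on a ROC curve might be read off, not the code used in the benchmark; the function name `sensitivity_at_fpr` and its parameters are hypothetical.

```python
def sensitivity_at_fpr(p_values, is_de, target_fpr=0.05):
    """Sensitivity (1 minus the type 2 error rate) at a fixed empirical type 1 error rate.

    Instead of a nominal p < 0.05 rule, the cutoff is chosen from the null
    (non-DE) genes so that at most target_fpr of them are called, which puts
    conservative and anti-conservative methods on the same footing.
    """
    null_p = sorted(p for p, t in zip(p_values, is_de) if not t)
    de_p = [p for p, t in zip(p_values, is_de) if t]
    if not null_p or not de_p:
        raise ValueError("need both null and differentially expressed genes")
    # Number of null genes that may be called without exceeding target_fpr.
    k = int(target_fpr * len(null_p))
    # Genes are called when p < cutoff. Using the (k+1)-th smallest null
    # p-value as the cutoff means at most k null genes fall strictly below it.
    cutoff = null_p[k] if k < len(null_p) else float("inf")
    return sum(p < cutoff for p in de_p) / len(de_p)
```

Because the cutoff depends only on how genes are ranked, a method whose p-values are uniformly too large (and which therefore shows a low type 1 error rate at a nominal threshold) receives the same score as a recalibrated version of itself, so any remaining difference between methods reflects their ability to separate differentially expressed genes from the rest.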