Abstract
Measuring changes in cell type composition between conditions (diseased vs. not, knockout vs. wild type, treated vs. not, etc.) is fast becoming a standard step in single-cell RNA-Seq analysis. Despite that, there is no agreement on the best approach for this type of analysis. We therefore tested numerous methods for cell type composition analysis, evaluating each in terms of false positive rate and power. Though there is no single clear winner, we find that two methods (the propeller method with asin normalization and Dirichlet regression with the alternative parameterization) perform well in most situations. Most importantly, consistent with results in differential expression analysis, we find that accounting for sample-to-sample (mouse-to-mouse, person-to-person, etc.) variability is necessary to avoid high false positive rates. We also see evidence that aggregation-based (aka pseudobulk) methods slightly outperform the mixed model methods we tested.
Competing Interest Statement
The authors have declared no competing interest.