Abstract
Current single-cell experiments can produce datasets with millions of cells. Unsupervised clustering can identify cell populations in single-cell data, but at this scale its computation time often becomes impractically long. This problem has previously been mitigated by subsampling cells, which greatly reduces accuracy. We built on the graph-based algorithm PhenoGraph and developed FastPG, which has the same cell assignment accuracy but is on average 27x faster in our tests. FastPG also has higher cell assignment accuracy than two other fast clustering methods, FlowSOM and PARC.
Availability: FastPG is available at https://github.com/sararselitsky/FastPG
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
This manuscript was updated after a FlowSOM parameter was changed for the accuracy experiments. The parameter change yielded significantly different results for FlowSOM.
https://github.com/sararselitsky/FastPG_accuracy_performance_scripts