Abstract
Efficient unbiased data analysis is a major challenge for laboratories handling large cytometry datasets. We present EmbedSOM, a non-linear embedding algorithm based on FlowSOM that improves such analyses by providing high-performance visualization of complex single-cell distributions within cellular populations and their transition states. The algorithm is designed for linear scaling and for speed suitable for interactive analyses of millions of cells without downsampling. At the same time, the visualization quality is competitive with current state-of-the-art algorithms. We demonstrate the properties of EmbedSOM on workflows that improve two essential types of analyses: the native ability of EmbedSOM to align population positions in the embedding is used for comparative analysis of multi-sample data, and the connection to FlowSOM is exploited to simplify the supervised hierarchical dissection of cell populations. Additionally, we discuss the visualization of trajectories between cellular states, which is facilitated by the local linearity of the embedding.