RT Journal Article
SR Electronic
T1 Automated assignment of cell identity from single-cell multiplexed imaging and proteomic data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2021.02.17.431633
DO 10.1101/2021.02.17.431633
A1 Michael J. Geuenich
A1 Jinyu Hou
A1 Sunyun Lee
A1 Hartland W. Jackson
A1 Kieran R. Campbell
YR 2021
UL http://biorxiv.org/content/early/2021/02/18/2021.02.17.431633.abstract
AB The creation of scalable single-cell and highly-multiplexed imaging technologies that profile the protein expression and phosphorylation status of heterogeneous cellular populations has led to multiple insights into disease processes including cancer initiation and progression. A major analytical challenge in interpreting the resulting data is the assignment of cells to a priori known cell types in a robust and interpretable manner. Existing approaches typically solve this by clustering cells followed by manual annotation of individual clusters or by strategies that gate protein expression at predefined thresholds. However, these often require several subjective analysis choices such as selecting the number of clusters and do not automatically assign cell types in line with prior biological knowledge. They further lack the ability to explicitly assign cells to an unknown or uncharacterized type, which exist in most highly multiplexed imaging experiments due to the limited number of markers quantified. To address these issues we present Astir, a probabilistic model to assign cells to cell types by integrating prior knowledge of marker proteins. Astir uses deep recognition neural networks for fast Bayesian inference, allowing for cell type annotations at the million-cell scale and in the absence of previously annotated reference data across multiple experimental modalities and antibody panels.
We demonstrate that Astir outperforms existing approaches in terms of accuracy and robustness by applying it to over 2.1 million single cells from several suspension and imaging mass cytometry and microscopy datasets in multiple tissue contexts. We further showcase that Astir can be used for the fast analysis of the spatial architecture of the tumour microenvironment, automatically quantifying the immune influx and spatial heterogeneity of patient samples. Astir is freely available as an open source Python package at https://www.github.com/camlab-bioml/astir. Competing Interest Statement: The authors have declared no competing interest.