Abstract
Single-cell RNA sequencing reveals valuable insights into cellular heterogeneity within tumour microenvironments (TMEs), paving the way for a deep understanding of cellular mechanisms contributing to cancer. However, high heterogeneity among the same cancer types and low transcriptomic variation in immune cell subsets present challenges for accurate, high-resolution confirmation of cells’ identities. Here we present scATOMIC; a modular annotation tool for malignant and non-malignant cells. We trained scATOMIC on >250,000 cancer, immune, and stromal cells defining a pan-cancer reference across 19 common cancer types and employed a novel hierarchical approach, outperforming current classification methods. We extensively confirmed scATOMIC’s accuracy on 198 tumour biopsies and 54 blood samples encompassing >420,000 cancer and a variety of TME cells. Lastly, we demonstrate scATOMIC’s practical significance to accurately subset breast cancers into clinically relevant subtypes and predict tumours’ primary origin across metastatic cancers. Our approach represents a broadly applicable strategy to analyze multicellular cancer TMEs.
Competing Interest Statement
The authors have declared no competing interest.