A single-cell and spatially resolved atlas of human breast cancers

Abstract

Breast cancers are complex cellular ecosystems where heterotypic interactions play central roles in disease progression and response to therapy. However, our knowledge of their cellular composition and organization is limited. Here we present a single-cell and spatially resolved transcriptomics analysis of human breast cancers. We developed a single-cell method of intrinsic subtype classification (SCSubtype) to reveal recurrent neoplastic cell heterogeneity. Immunophenotyping using cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) provides high-resolution immune profiles, including new PD-L1/PD-L2+ macrophage populations associated with clinical outcome. Mesenchymal cells displayed diverse functions and cell-surface protein expression through differentiation within three major lineages. Stromal-immune niches were spatially organized in tumors, offering insights into antitumor immune regulation. Using single-cell signatures, we deconvoluted large breast cancer cohorts to stratify them into nine clusters, termed ‘ecotypes’, with unique cellular compositions and clinical outcomes. This study provides a comprehensive transcriptional atlas of the cellular architecture of breast cancer.

Fig. 1: Cellular composition of primary breast cancers and identification of malignant epithelial cells.
Fig. 2: Identifying drivers of neoplastic breast cancer cell heterogeneity.
Fig. 3: T cell and innate lymphoid cell landscape of breast cancers.
Fig. 4: Myeloid landscape of breast cancers.
Fig. 5: Transcriptional profiling and phenotyping of diverse mesenchymal differentiation states across breast cancers.
Fig. 6: Mapping breast cancer heterogeneity using spatially resolved transcriptomics.
Fig. 7: Spatially mapping new heterotypic cellular interactions.
Fig. 8: Deconvolution of breast cancer cohorts using single-cell signatures reveals robust ecotypes associated with patient survival and intrinsic subtypes.

Data availability

All processed scRNA-seq data are available for in-browser exploration and download through the Broad Institute Single Cell portal at https://singlecell.broadinstitute.org/single_cell/study/SCP1039. Processed scRNA-seq data from this study are also available through the Gene Expression Omnibus under accession number GSE176078. Raw scRNA-seq data from this study have been deposited with the European Genome-phenome Archive, which is hosted by the European Bioinformatics Institute and Centre for Genomic Regulation under accession no. EGAS00001005173. All spatially resolved transcriptomics data from this study are available from the Zenodo data repository (https://doi.org/10.5281/zenodo.4739739). Spatially resolved transcriptomics data from Andersson et al. (ref. 56) can be downloaded from the Zenodo data repository (https://doi.org/10.5281/zenodo.3957257).

Code availability

Code related to the analyses in this study can be found on GitHub at https://github.com/Swarbricklab-code/BrCa_cell_atlas (ref. 72).

References

  1. Kim, H. K. et al. Discordance of the PAM50 intrinsic subtypes compared with immunohistochemistry-based surrogate in breast cancer patients: potential implication of genomic alterations of discordance. Cancer Res. Treat. 51, 737–747 (2019).

  2. Picornell, A. C. et al. Breast cancer PAM50 signature: correlation and concordance between RNA-Seq and digital multiplexed gene expression technologies in a triple negative breast cancer series. BMC Genomics 20, 452 (2019).

  3. Parker, J. S. et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J. Clin. Oncol. 27, 1160–1167 (2009).

  4. Perou, C. M. et al. Molecular portraits of human breast tumours. Nature 406, 747–752 (2000).

  5. Sørlie, T. et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc. Natl Acad. Sci. USA 98, 10869–10874 (2001).

  6. Sørlie, T. et al. Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc. Natl Acad. Sci. USA 100, 8418–8423 (2003).

  7. Su, S. et al. CD10+ GPR77+ cancer-associated fibroblasts promote cancer formation and chemoresistance by sustaining cancer stemness. Cell 172, 841–856.e16 (2018).

  8. Cazet, A. S. et al. Targeting stromal remodeling and cancer stem cell plasticity overcomes chemoresistance in triple negative breast cancer. Nat. Commun. 9, 2897 (2018).

  9. Dushyanthen, S. et al. Relevance of tumor-infiltrating lymphocytes in breast cancer. BMC Med. 13, 202 (2015).

  10. Cassetta, L. et al. Human tumor-associated macrophage and monocyte transcriptional landscapes reveal cancer-specific reprogramming, biomarkers, and therapeutic targets. Cancer Cell 35, 588–602.e10 (2019).

  11. Katzenelenbogen, Y. et al. Coupled scRNA-seq and intracellular protein activity reveal an immunosuppressive role of TREM2 in cancer. Cell 182, 872–885.e19 (2020).

  12. Medler, T. R. et al. Complement C5a fosters squamous carcinogenesis and limits T cell response to chemotherapy. Cancer Cell 34, 561–578.e6 (2018).

  13. Nakamura, K. & Smyth, M. J. TREM2 marks tumor-associated macrophages. Signal Transduct. Target. Ther. 5, 233 (2020).

  14. Costa, A. et al. Fibroblast heterogeneity and immunosuppressive environment in human breast cancer. Cancer Cell 33, 463–479.e10 (2018).

  15. Wu, S. Z. et al. Stromal cell diversity associated with immune evasion in human triple-negative breast cancer. EMBO J. 39, e104063 (2020).

  16. Sahai, E. et al. A framework for advancing our understanding of cancer-associated fibroblasts. Nat. Rev. Cancer 20, 174–186 (2020).

  17. Öhlund, D. et al. Distinct populations of inflammatory fibroblasts and myofibroblasts in pancreatic cancer. J. Exp. Med. 214, 579–596 (2017).

  18. Biffi, G. et al. IL1-induced JAK/STAT signaling is antagonized by TGFβ to shape CAF heterogeneity in pancreatic ductal adenocarcinoma. Cancer Discov. 9, 282–301 (2019).

  19. Elyada, E. et al. Cross-species single-cell analysis of pancreatic ductal adenocarcinoma reveals antigen-presenting cancer-associated fibroblasts. Cancer Discov. 9, 1102–1123 (2019).

  20. Puram, S. V. et al. Single-cell transcriptomic analysis of primary and metastatic tumor ecosystems in head and neck cancer. Cell 171, 1611–1624.e24 (2017).

  21. Lambrechts, D. et al. Phenotype molding of stromal cells in the lung tumor microenvironment. Nat. Med. 24, 1277–1289 (2018).

  22. Azizi, E. et al. Single-cell map of diverse immune phenotypes in the breast tumor microenvironment. Cell 174, 1293–1308.e36 (2018).

  23. Savas, P. et al. Single-cell profiling of breast cancer T cells reveals a tissue-resident memory subset associated with improved prognosis. Nat. Med. 24, 986–993 (2018).

  24. Kim, C. et al. Chemoresistance evolution in triple-negative breast cancer delineated by single-cell sequencing. Cell 173, 879–893.e13 (2018).

  25. Ali, H. R. et al. Imaging mass cytometry and multiplatform genomics define the phenogenomic landscape of breast cancer. Nat. Cancer 1, 163–175 (2020).

  26. Wagner, J. et al. A single-cell atlas of the tumor and immune ecosystem of human breast cancer. Cell 177, 1330–1345.e18 (2019).

  27. Aran, D., Hu, Z. & Butte, A. J. xCell: digitally portraying the tissue cellular heterogeneity landscape. Genome Biol. 18, 220 (2017).

  28. Lim, E. et al. Aberrant luminal progenitors as the candidate target population for basal tumor development in BRCA1 mutation carriers. Nat. Med. 15, 907–913 (2009).

  29. Pliner, H. A., Shendure, J. & Trapnell, C. Supervised classification enables rapid annotation of cell atlases. Nat. Methods 16, 983–986 (2019).

  30. Neftel, C. et al. An integrative model of cellular states, plasticity, and genetics for glioblastoma. Cell 178, 835–849.e21 (2019).

  31. Tirosh, I. et al. Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science 352, 189–196 (2016).

  32. Koboldt, D. C. et al. Comprehensive molecular portraits of human breast tumours. Nature 490, 61–70 (2012).

  33. Prat, A. et al. Phenotypic and molecular characterization of the claudin-low intrinsic subtype of breast cancer. Breast Cancer Res. 12, R68 (2010).

  34. Nielsen, T. O. et al. A comparison of PAM50 intrinsic subtyping with immunohistochemistry and clinical prognostic factors in tamoxifen-treated estrogen receptor-positive breast cancer. Clin. Cancer Res. 16, 5222–5232 (2010).

  35. Stoeckius, M. et al. Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods 14, 865–868 (2017).

  36. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902.e21 (2019).

  37. Glajcar, A., Szpor, J., Hodorowicz-Zaniewska, D., Tyrak, K. E. & Okoń, K. The composition of T cell infiltrates varies in primary invasive breast cancer of different molecular subtypes as well as according to tumor size and nodal status. Virchows Arch. 475, 13–23 (2019).

  38. Li, H. et al. Dysfunctional CD8 T cells form a proliferative, dynamically regulated compartment within human melanoma. Cell 176, 775–789.e18 (2019).

  39. Yamada, S., Shinozaki, K. & Agematsu, K. Involvement of CD27/CD70 interactions in antigen-specific cytotoxic T-lymphocyte (CTL) activity by perforin-mediated cytotoxicity. Clin. Exp. Immunol. 130, 424–430 (2002).

  40. Curtis, C. et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 486, 346–352 (2012).

  41. Slyper, M. et al. A single-cell and single-nucleus RNA-Seq toolbox for fresh and frozen human tumors. Nat. Med. 26, 792–802 (2020).

  42. Ruffell, B. et al. Leukocyte composition of human breast cancer. Proc. Natl Acad. Sci. USA 109, 2796–2801 (2012).

  43. Zhang, Q. et al. Landscape and dynamics of single immune cells in hepatocellular carcinoma. Cell 179, 829–845.e20 (2019).

  44. Jaitin, D. A. et al. Lipid-associated macrophages control metabolic homeostasis in a Trem2-dependent manner. Cell 178, 686–698.e14 (2019).

  45. Chen, J. et al. CCL18 from tumor-associated macrophages promotes breast cancer metastasis via PITPNM3. Cancer Cell 19, 541–555 (2011).

  46. Qiu, X. et al. Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods 14, 979–982 (2017).

  47. Kumar, A. et al. Specification and diversification of pericytes and smooth muscle cells from mesenchymoangioblasts. Cell Rep. 19, 1902–1916 (2017).

  48. Thiriot, A. et al. Differential DARC/ACKR1 expression distinguishes venular from non-venular endothelial cells in murine tissues. BMC Biol. 15, 45 (2017).

  49. Mailhos, C. et al. Delta4, an endothelial specific notch ligand expressed at sites of physiological and tumor angiogenesis. Differentiation 69, 135–144 (2001).

  50. Ubezio, B. et al. Synchronization of endothelial Dll4-Notch dynamics switch blood vessels from branching to expansion. eLife 5, e12167 (2016).

  51. Kryczek, I. et al. CXCL12 and vascular endothelial growth factor synergistically induce neoangiogenesis in human ovarian cancers. Cancer Res. 65, 465–472 (2005).

  52. Blanco, R. & Gerhardt, H. VEGF and Notch in tip and stalk cell selection. Cold Spring Harb. Perspect. Med. 3, a006569 (2013).

  53. Jakobsson, L. et al. Endothelial cells dynamically compete for the tip cell position during angiogenic sprouting. Nat. Cell Biol. 12, 943–953 (2010).

  54. Andersson, A. et al. Single-cell and spatial transcriptomics enables probabilistic inference of cell type topography. Commun. Biol. 3, 565 (2020).

  55. Lakins, M. A., Ghorani, E., Munir, H., Martins, C. P. & Shields, J. D. Cancer-associated fibroblasts induce antigen-specific deletion of CD8+ T cells to protect tumour cells. Nat. Commun. 9, 948 (2018).

  56. Andersson, A. et al. Spatial deconvolution of HER2-positive breast tumors reveals novel intercellular relationships. Preprint at bioRxiv https://doi.org/10.1101/2020.07.14.200600 (2020).

  57. Newman, A. M. et al. Determining cell type abundance and expression from bulk tissues with digital cytometry. Nat. Biotechnol. 37, 773–782 (2019).

  58. Tsoucas, D. et al. Accurate estimation of cell-type composition from gene expression data. Nat. Commun. 10, 2975 (2019).

  59. Tsai, J. H., Donaher, J. L., Murphy, D. A., Chau, S. & Yang, J. Spatiotemporal regulation of epithelial-mesenchymal transition is essential for squamous cell carcinoma metastasis. Cancer Cell 22, 725–736 (2012).

  60. Molgora, M. et al. TREM2 modulation remodels the tumor myeloid landscape enhancing anti-PD-1 immunotherapy. Cell 182, 886–900.e17 (2020).

  61. Leruste, A. et al. Clonally expanded T cells reveal immunogenicity of rhabdoid tumors. Cancer Cell 36, 597–612.e8 (2019).

  62. Dutertre, C. A. et al. Single-cell analysis of human mononuclear phagocytes reveals subset-defining markers and identifies circulating inflammatory dendritic cells. Immunity 51, 573–589.e8 (2019).

  63. Wu, S. Z. et al. Cryopreservation of human cancers conserves tumour heterogeneity for single-cell multi-omics analysis. Genome Med. 13, 81 (2021).

  64. Lun, A. T. L. et al. EmptyDrops: distinguishing cells from empty droplets in droplet-based single-cell RNA sequencing data. Genome Biol. 20, 63 (2019).

  65. Zhao, X., Rødland, E. A., Tibshirani, R. & Plevritis, S. Molecular subtyping for clinically defined breast cancer subgroups. Breast Cancer Res. 17, 29 (2015).

  66. Patro, R., Duggal, G., Love, M. I., Irizarry, R. A. & Kingsford, C. Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods 14, 417–419 (2017).

  67. Finak, G. et al. MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 16, 278 (2015).

  68. Aibar, S. et al. SCENIC: single-cell regulatory network inference and clustering. Nat. Methods 14, 1083–1086 (2017).

  69. Yu, G., Wang, L.-G., Han, Y. & He, Q.-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287 (2012).

  70. Efremova, M., Vento-Tormo, M., Teichmann, S. A. & Vento-Tormo, R. CellPhoneDB: inferring cell–cell communication from combined expression of multi-subunit ligand–receptor complexes. Nat. Protoc. 15, 1484–1506 (2020).

  71. Ramilowski, J. A. et al. A draft network of ligand–receptor-mediated multicellular signalling in human. Nat. Commun. 6, 7866 (2015).

  72. Swarbrick, A., Wu, S., Al-Eryani, G., Roden, D. & Bartonicek, N. BrCa_cell_atlas. Version 1.0.0 (analysis code) https://doi.org/10.5281/zenodo.5031502 (2021).

Acknowledgements

This work is supported by a research grant from the National Breast Cancer Foundation (NBCF) of Australia (no. IIRS-19-106) and supported by the generosity of J. McMurtrie, AM and D. McMurtrie, the Petre Foundation, White Butterfly Foundation, Sydney Breast Cancer Foundation, Skipper Jacobs Charitable Trust, G. P. Harris Foundation and The National Health and Medical Research Council (NHMRC). A.S. is the recipient of a Senior Research Fellowship from the NHMRC (no. APP1161216). S.Z.W., G.A.-E. and J.T. are supported by the Australian Government Research Training Program Scholarship. S.O.T. is supported by the NBCF (PRAC 16-006; no. IIRS-19-084), Sydney Breast Cancer Foundation and the Family and Friends of M. O’Sullivan. S.J. is supported by a research fellowship from the NBCF. X.S.L. is supported by the Breast Cancer Research Foundation (no. BCRF-19-100) and National Institutes of Health (no. R01CA234018). C.M.P. and A.T. were supported by the National Cancer Institute Breast SPORE program (no. P50-CA58223), grant no. RO1-CA148761, and Breast Cancer Research Foundation. This work was supported by the Australian Centre for Translational Breast Cancer Research, Walter and Eliza Hall Institute, with funding support from the NHMRC Centre for Research Excellence grant no. APP1153049. E.L. is supported as a National Breast Cancer Foundation Endowed Chair and by the Love Your Sister foundation. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. We thank the following people for their assistance in the experimental part of this manuscript: J. Yang; G. Lehrbach from the Garvan Institute of Medical Research Tissue Culture Facility; A. Zaratzian from the Garvan Histopathology Facility for tissue processing and IHC staining and guidance on the Visium experiments; the Garvan–Weizmann Centre for Cellular Genomics, including E. Lam, H. Saeed and M. Armstrong for the expertise in flow sorting. We thank H. Holliday for the incredible illustration in Fig. 8g. We thank H. H. Milioli for providing guidance for analyzing the METABRIC cohort dataset. We thank I. Shapiro and C. Grant as consumer advocates. This manuscript was edited at Life Science Editors.

Author information

Contributions

A.S. conceived the project and directed the study with input from all authors. E.L., S.W., M.N.H., B.C., C.C., C.M., D.S., E.R., A.P., J.B., S.O.T., E.M. and L.G. contributed to the experimental design, procured the patient tumor tissue and assisted with interpreting the data. S.Z.W., G.A.-E. and K.H. performed the single-cell captures. K.H. analyzed all the clinical information. S.Z.W., K.H. and G.A.-E. optimized and performed the tumor dissociation experiments. G.A.-E. optimized and performed the antibody staining for the CITE-seq experiments. N.B. and G.A.-E. performed the CITE-seq data processing. C.-L.C. and S.Z.W. performed the scRNA-seq experiments on the Chromium Controller. C.-L.C. helped perform the next-generation sequencing of the scRNA-seq libraries. S.Z.W. performed the preprocessing, data integration and reclustering steps for the scRNA-seq data. J.T. performed the analysis and benchmarking of inferCNV. A.T. and C.M.P. led the development of SCSubtype. D.R. interpreted and led the analyses for the breast cancer GM analyses. K.H. and T.W. performed the H&E and IHC experiments. S.O.T. independently assessed and scored all histology in this study. G.A.-E. interpreted and performed the analyses of the immune cells with intellectual input from S.J. C.-A.D. and F.G. provided intellectual input related to myeloid cluster annotation. S.Z.W. interpreted and performed all the analyses of stromal cells. D.K. and C.-L.C. performed the Visium experiments with input from J.E.P. V.G. helped perform preprocessing of the Visium datasets. S.R.W., N.I.W., C.R.U., J.G.C. and Z.W.B. performed the Visium experiments and data processing from an independent laboratory. A.A. performed the Stereoscope deconvolution with input from J.L. S.Z.W. performed the downstream analysis of the Visium data with guidance from A.A., L.L., G.A.-E. and J.L. D.R. interpreted and performed the CIBERSORTx analysis. S.Z.W. and D.R. performed the survival analyses. C.W. and X.S.L. provided intellectual input and guidance on bulk deconvolution and survival analyses. S.Z.W., A.S., D.R., G.A.-E. and S.J. wrote the manuscript with input from all authors.

Corresponding author

Correspondence to Alexander Swarbrick.

Ethics declarations

Competing interests

C.M.P. is an equity stockholder and consultant for BioClassifier; he is also listed as an inventor on patent applications for the Breast PAM50 Subtyping assay. J.L. is an author on patents owned by Spatial Transcriptomics AB covering technology presented in this paper. The other authors declare no competing interests.

Additional information

Peer review information Nature Genetics thanks Itai Yanai and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Identification of malignant cells, single-cell RNA sequencing metrics and non-integrated data of stromal and immune cells.

a-b, Number of unique molecular identifiers (a) and genes (b) per tumor analyzed by scRNA-seq in this study. Tumors are stratified by the clinical subtypes TNBC (red), HER2 (pink) and ER (blue). Diamond points represent the mean. c-d, Number of unique molecular identifiers (UMIs; c) and genes (d) per major lineage cell type identified in this study. These major lineage tiers are grouped by T-cells, B-cells, Plasmablasts, Myeloid, Epithelial, Cycling, Mesenchymal (cancer-associated fibroblasts and perivascular-like cells) and Endothelial. Diamond points represent the mean. e-f, UMAP visualization of all 71,220 stromal and immune cells without batch correction and data integration. UMAP dimensional reduction was performed using 100 principal components in the Seurat v3 package. Cells are grouped by tumor (e) and major lineage tiers (f) as identified using the Garnett cell classification method. g, InferCNV heatmaps of all malignant cells grouped by clinical subtypes. Common subtype-specific CNVs and a chr6 artefact reported by Tirosh et al. are marked (Tirosh et al., 2016b).
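As an illustration of the per-cell metrics summarized in panels a-d, the following minimal sketch counts UMIs and detected genes per cell from a toy count table. The data layout (a dictionary of cell barcodes mapping to gene counts), the function name `per_cell_metrics` and the toy values are assumptions made for this example; they are not taken from the study's pipeline.

```python
from typing import Dict, Tuple


def per_cell_metrics(counts: Dict[str, Dict[str, int]]) -> Dict[str, Tuple[int, int]]:
    """Return (total UMIs, number of detected genes) for each cell barcode.

    `counts` is a hypothetical layout: {cell barcode: {gene symbol: UMI count}}.
    A gene counts as detected when at least one UMI was observed for it.
    """
    metrics = {}
    for cell, gene_counts in counts.items():
        n_umis = sum(gene_counts.values())
        n_genes = sum(1 for c in gene_counts.values() if c > 0)
        metrics[cell] = (n_umis, n_genes)
    return metrics


# Toy example with two cells and three genes (values are made up).
toy_counts = {
    "cell_1": {"EPCAM": 5, "KRT8": 3, "CD3E": 0},
    "cell_2": {"EPCAM": 0, "KRT8": 0, "CD3E": 7},
}
metrics = per_cell_metrics(toy_counts)

# Mean UMIs per cell, analogous to the diamond points in the legend.
mean_umis = sum(umis for umis, _ in metrics.values()) / len(metrics)
```

Grouping these per-cell values by tumor or by lineage before averaging would give the kind of summary shown in the panels.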

Extended Data Fig. 2 Supplementary data for SCSubtype classifier.

a-b, Hierarchical clustering of Allcells-Pseudobulk (indicated by yellow stars) and Ribozero mRNA-Seq (indicated by blue stars) profiles of the patient samples with TCGA patient mRNA-Seq data. a, View of the basal cluster showing pairing of Allcells-Pseudobulk and Ribozero mRNA-Seq profiles of 2 representative tumors (CID4495 and CID4515) in the present study. b, View of the luminal cluster showing pairing of Allcells-Pseudobulk and Ribozero mRNA-Seq profiles of 4 representative tumors (CID4067, CID4463, CID4290 and CID3948) in the present study. c, Heatmap of SCSubtype gene sets across the training and test samples in each individual group. Colored outlined boxes highlight the top expressed genes per group. d, Barplot representing proportions of SCSubtype calls in individual samples. Test dataset samples are highlighted within the golden colored outline. e, Scatterplot of individual cancer cells plotted according to the Proliferation score (x-axis) and Differentiation – DScore (y-axis). Individual cells are colored based on the SCSubtype calls. f, Scatterplot of individual TCGA breast tumors plotted according to the Proliferation score (x-axis) and Differentiation – DScore (y-axis). Individual patients are colored based on the PAM50 subtype calls.

Extended Data Fig. 3 Supplementary data for breast cancer gene modules.

a, Spherical k-means (skmeans) based consensus clustering of the Jaccard similarities between 574 signatures of neoplastic cell ITTH. This showed the probability (p1-p7) of each signature of ITTH being assigned to one of seven clusters/classes. Silhouette scores are shown for each signature. b, Heatmap of pair-wise Pearson correlations of the scaled AUCell signature scores, across all individual neoplastic cells, for each of the seven ITTH gene-modules (bolded) and a curated set of breast cancer related gene-signatures. Hierarchical clustering was performed using Pearson correlations and average linkage. c, Heatmap showing the scaled AUCell signature scores of each of the seven ITTH gene-modules (rows) across all individual neoplastic cells (columns). Hierarchical clustering was done using Pearson correlations and average linkage. (HER2_AMP = Clinical HER2 amplification status). d, Distributions of signature scores (z-score scaled) for each of the gene-module signatures (24,489 cells from 21 tumors). Cells are grouped according to the gene-module (GM1-7) cell-state. e, Barchart showing the proportion of cells assigned to each of the gene-module cell-states (GM1-7) with cells grouped according to the SCSubtypes. f, Distributions of SCSubtype scores for each of the gene-module signatures (24,489 cells from 21 tumors). Cells are grouped according to the gene-module (GM1-7) cell-state. Kruskal-Wallis tests were performed to calculate the significance between the four SCSubtype score groups in each of the gene-module groups, p-value shown. Wilcoxon tests were used to identify which SCSubtype had significantly increased SCSubtype scores in the cells assigned to each gene-module, the scores of each SCSubtype were compared to the rest of the SCSubtype scores (****: Holm adjusted p-value < 0.0001, ns: Holm adjusted p-value > 0.05). Box plots in d and f depict the first and third quartiles as the lower and upper bounds, respectively. The whiskers represent 1.5x the interquartile range and the centre depicts the median.
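Panel a clusters signatures by their pairwise Jaccard similarities. As a minimal sketch of that similarity measure only (not of the consensus clustering itself), the example below computes the Jaccard index, defined as the size of the intersection divided by the size of the union, for every pair of gene signatures. The signature names and gene lists are hypothetical toy values.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two gene collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty signatures
    return len(a & b) / len(union)


def pairwise_jaccard(signatures):
    """Return a symmetric {(name_x, name_y): similarity} lookup."""
    names = sorted(signatures)
    sim = {(n, n): 1.0 for n in names}
    for x, y in combinations(names, 2):
        s = jaccard(signatures[x], signatures[y])
        sim[(x, y)] = s
        sim[(y, x)] = s
    return sim


# Hypothetical toy signatures, for illustration only.
toy_signatures = {
    "sig_A": ["ESR1", "GATA3", "FOXA1"],
    "sig_B": ["ESR1", "GATA3", "KRT5"],
    "sig_C": ["MKI67", "TOP2A"],
}
sim = pairwise_jaccard(toy_signatures)
```

Here `sig_A` and `sig_B` share two of four distinct genes, so their similarity is 0.5, while `sig_C` shares no genes with either and scores 0. A matrix of such values is the kind of input a clustering step would then operate on.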

Extended Data Fig. 4 CITE-seq vignette.

a, UMAP visualization of a TNBC sample with 157 DNA barcoded antibodies (Supplementary Table 11). Cluster annotations were extracted from our final breast cancer atlas cell annotations. b, Heatmap visualization of the cluster averaged antibody derived tag (ADT) values for the 157 CITE-seq antibody panel. Only immune cells are shown. c-d, Expression featureplots of measured experimental ADT values (shown in top rows) against the CITE-seq imputation ADT levels (shown in bottom rows), as determined using the Seurat v3 method. Selected markers for immunophenotyping T-cells (c; CD4, CD8A, PD-1 and CD103) and myeloid cells (d; PD-L1, CD86, CD49f and CD14) are shown.

Extended Data Fig. 5 Supplementary data for T-cells and innate lymphoid cells.

a, Dotplot visualizing averaged expression of canonical markers across T-cell and innate lymphoid clusters. b, Cytotoxic and dysfunctional gene signature scores across T-cell and innate lymphoid clusters. A Kruskal-Wallis test was performed to compare significance between groups (pairwise two-sided t-test for each cluster compared to the mean, p-values denoted by asterisks: *p < 0.05, **p < 0.01, ***p < 0.001 and ****p < 0.0001). Red line indicates the median expression. c, Dysfunctional gene signature scores of CD8 : LAG3 and CD8+ T : IFNG clusters across clinical subtypes (n = 26; 11 TNBC, 10 ER+ and 5 HER2+). A pairwise two-sided t-test for each cluster was performed to determine significance. P-values denoted by asterisks: *p < 0.05, **p < 0.01, ***p < 0.001 and ****p < 0.0001. d, Differentially expressed immune modulator genes, stratified by T-cell and Myeloid clusters, compared across breast cancer subtypes. A pairwise MAST comparison was performed to obtain Bonferroni corrected p-values. All genes displayed are statistically significant (p-value < 0.05). e, Pairwise two-sided t-test comparison of LAG3, CD27, PD-1 (PDCD1) and CD70 log-normalised expression values in LAG3/c8 T-cells across breast cancer subtypes (n = 26; 11 TNBC, 10 ER+ and 5 HER2+). f, Enrichment of PDCD1, CD27, LAG3 and CD70 expression in the METABRIC cohort between clinical subtypes (n = 1,608; 209 Basal, 224 Her2, 700 LumA and 475 LumB). A pairwise Wilcoxon test was performed to identify statistical significance. P-values denoted by asterisks: *p < 0.05, **p < 0.01, ***p < 0.001 and ****p < 0.0001. Box plots in b and f depict the first and third quartiles as the lower and upper bounds, respectively. The whiskers represent 1.5x the interquartile range and the centre depicts the median.
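The legends in this section report Bonferroni corrected p-values (panel d above) and Holm adjusted p-values (Extended Data Fig. 3f). The sketch below shows the standard textbook form of both corrections on a made-up list of p-values; the function names and the input values are illustrative assumptions, and the study's own calculations may have been run with different software and settings.

```python
def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def holm(pvalues):
    """Holm step-down adjustment.

    The smallest p-value is multiplied by m, the next by m - 1, and so on;
    a running maximum keeps the adjusted values monotone in the sorted order.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Made-up p-values from three hypothetical comparisons.
raw = [0.01, 0.04, 0.03]
bonf = bonferroni(raw)   # approximately [0.03, 0.12, 0.09]
holm_adj = holm(raw)     # approximately [0.03, 0.06, 0.06]
```

Holm adjusted values are never larger than the Bonferroni ones, which is why the Holm procedure is the less conservative of the two while still controlling the family-wise error rate.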

Extended Data Fig. 6 Gene expression of immune cell surface receptors across malignant, immune and mesenchymal clusters and breast cancer clinical subtypes.

a, Averaged expression and clustering of 133 clinically targetable receptor or ligand immune modulator markers across all cell types grouped by clinical breast cancer subtypes (TNBC, HER2+ and ER+). Gene lists were manually curated through systematic literature search of known immune modulating proteins expressed on the surface of cells. Default parameters for hierarchical clustering were used via the ‘pheatmap’ package for the visualization of gene expression values.

Extended Data Fig. 7 Supplementary data for B-cells, Plasmablasts and Myeloid cells.

a, UMAP visualization of all reclustered B-cells (n = 3,202 cells) and Plasmablasts (n = 3,525 cells) as annotated using canonical gene expression markers. b, Featureplots of CD27, IGHD, IGKC and IGLC2 across naïve B cells, memory B cells, and Plasmablasts. c, Tumour-associated macrophage (TAM) signature score obtained from Cassetta et al. 2019 and the expression of log-normalised levels of CCL8 across all myeloid clusters (9,675 cells from 26 tumors). A pairwise two-sided t-test was performed to determine statistical significance for clusters of interest. P-values denoted by asterisks: *p < 0.05, **p < 0.01, ***p < 0.001 and ****p < 0.0001. Dashed red line marks median TAM module score or gene expression. A Kruskal-Wallis test was performed to compare significance between groups. d, LAM and DC : LAMP3 gene expression signatures acquired from Jaitin et al. 2019 and Zhang et al. 2019 respectively, visualized on the myeloid UMAP clusters. e, Heatmap visualizing GO enrichment pathways across myeloid clusters. f, Proportion of myeloid clusters across clinical subtypes. Statistical significance was determined using a two-sided t-test in a pairwise comparison of means between groups (n = 26; 11 TNBC, 10 ER+ and 5 HER2+). P-values denoted by asterisks: *p < 0.05, **p < 0.01, ***p < 0.001 and ****p < 0.0001. g, Violin plots of imputed CITE-seq PD-L1 and PD-L2 expression values found on myeloid cells. Box plots in c and f depict the first and third quartiles as the lower and upper bounds, respectively. The whiskers represent 1.5x the interquartile range and the centre depicts the median.
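The box plot convention stated in these legends (box bounded by the first and third quartiles, centre at the median, whiskers at 1.5x the interquartile range) can be sketched as follows. This is an illustrative Python sketch only. It assumes the common Tukey convention in which each whisker ends at the most extreme data point inside the 1.5x IQR fence, and it assumes linear-interpolation quartiles; the function name `box_plot_stats` is hypothetical and the plotting software used for the figures may compute quartiles slightly differently.

```python
from statistics import quantiles


def box_plot_stats(values):
    """Return the five numbers that define a Tukey-style box plot."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    # Quartiles by linear interpolation over the observed data (assumption).
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations inside the fences;
    # anything beyond them would be drawn as an outlier.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }


example = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

In the example, the value 100 lies beyond the upper fence, so the upper whisker ends at 9 and 100 would appear as an outlier.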

Extended Data Fig. 8 Supplementary data for mesenchymal cell states and subclusters.

a, t-SNE visualization of CAFs, PVL cells and endothelial cells using Seurat reclustered with default resolution parameters (0.8). b, Pseudotime plot for CAFs, PVL cells and endothelial cells, as determined using monocle. Coordinates are as in main Figs. 5c, 5e and 5g. c, t-SNE visualizations for CAFs, PVL cells and endothelial cells with monocle-derived cell states overlaid. d, Heatmaps for CAFs, PVL cells and endothelial cells show cell state averaged log-normalised expression values for all differentially expressed genes determined using the MAST method, with select stromal markers highlighted. e, Top 10 gene ontologies (GO) of each mesenchymal cell state, as determined using pathway enrichment with ClusterProfiler with all differentially expressed genes as input. f, Stromal cell state averaged signature scores for pancreatic ductal adenocarcinoma myofibroblast-like, inflammatory-like and antigen-presenting CAF sub-populations, as determined using AUCell. g, Enrichment of antigen-presenting CAF markers CLU, CD74 and CAV1 in various stromal cell states. h, Subclusters of CAFs, PVL cells and endothelial cells determined using Seurat show a strong integration with three normal breast tissue datasets, highlighting similarities in subclusters across disease status and clinical subtypes of breast cancer. i, Cell states of CAFs, PVL cells and endothelial cells determined using monocle show a strong integration with three normal breast tissue datasets and breast cancer clinical subtypes.

Extended Data Fig. 9 Supplementary data for spatial transcriptomics.

a, H&E images for the remaining five breast tumors analysed using Visium (TNBC: CID4465, 1142243F and 1160920F; ER+: CID4535 and CID4290). Scale bars represent 500 μm. b, Histograms of cancer deconvolution values, as estimated using Stereoscope. Red line indicates the 10% cutoff used to select spots for scoring breast cancer gene-modules. Spots are colored by the pathology annotation. c, Box plots of gene module scores for all cancer filtered spots, as determined using AUCell, grouped by sample (TNBC=red; ER=blue). Statistical significance was determined using a two-sided t-test, with p-values adjusted using the Benjamini–Hochberg procedure. Box plots depict the first and third quartiles as the lower and upper bounds, respectively. The whiskers represent 1.5x the interquartile range and the centre depicts the median. P‐values denoted by asterisks: *p < 0.05, **p < 0.01, ***p < 0.001 and ****p < 0.0001. d, Clustered gene module correlations across all cancer filtered spots. Color scales represent Pearson correlation values and are scaled per GM (‘n.s’ denotes not significant; two-sided correlation coefficient, Benjamini–Hochberg adjusted p-value < 0.05). e, Heatmap of the deconvolution values for inflammatory-like CAFs, myofibroblast-like CAFs, Macrophage CXCL10/c9, LAM1 and LAM2 clusters. Spots (columns) are grouped by sample and pathology. Deconvolution abundances (rows) are scaled by cell type. f, Predicted signaling in tissue spots enriched for iCAFs and CD4/CD8+ T-cells. Spots filtered for CAF-ligands and T-cell receptors detected by scRNA-Seq. The mean interaction scores of cell-signaling pairs are defined as the product of the ligand and receptor expression. g, Plots of PD-1 (PDCD1; y axis) expression with PD-L1 (CD274; x axis) or PD-L2 (PDCD1LG2; x axis) expression in spots enriched for CD4/CD8+ T-cells and LAM2 cells, as determined by Stereoscope. Abundance of CD4/CD8 T-cells (combined as T_cell here) and LAM2 are overlaid on the expression plots.
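Panel f defines the interaction score of a cell-signaling pair as the product of the ligand and receptor expression. A minimal Python sketch of one reading of that definition is given below: the product is taken per spot and then averaged over the selected spots. That order of operations is an assumption (the legend does not say whether products or means are taken first), and the function name and input layout are hypothetical.

```python
def mean_interaction_score(ligand_expr, receptor_expr):
    """Average, over spots, of ligand expression times receptor expression.

    ligand_expr and receptor_expr are equal-length sequences holding the
    expression of one ligand and one receptor in the same ordered set of
    spots. Averaging per-spot products is an assumed interpretation.
    """
    if len(ligand_expr) != len(receptor_expr):
        raise ValueError("inputs must cover the same spots")
    if not ligand_expr:
        raise ValueError("at least one spot is required")
    products = [l * r for l, r in zip(ligand_expr, receptor_expr)]
    return sum(products) / len(products)


# Three hypothetical spots: the third has no ligand expression,
# so it contributes zero to the mean score.
score = mean_interaction_score([1.0, 2.0, 0.0], [2.0, 1.0, 5.0])
```

Under this reading, a pair scores highly only when ligand and receptor are co-expressed in the same spots, which matches the spatial co-localization the panel is meant to show.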

Extended Data Fig. 10 Supplementary figure for CIBERSORTx cell-type deconvolution.

a, Bar and boxplot (inset) of the Pearson correlation for 45 cell-types between the actual cell-fractions captured by scRNA-Seq and the CIBERSORTx predicted fractions from pseudo-bulk expression profiles (*denotes significance p < 0.05, two-sided correlation coefficient). Inset box plot depicts the first and third quartiles as the lower and upper bounds, respectively. The whiskers represent 1.5x the interquartile range and the centre depicts the median. b, Barplot comparing the Pearson correlation for cell-types between the actual cell-fractions captured by scRNA-Seq and the CIBERSORTx (red) and DWLS (blue) predicted fractions from pseudo-bulk expression profiles (*denotes significance p < 0.05, two-sided correlation coefficient). c, Boxplot comparing the CIBERSORTx predicted SCSubtype and Cycling cell-fractions in each METABRIC tumor, stratified by PAM50 subtypes (n = 1,608; 209 Basal, 224 Her2, 700 LumA and 475 LumB). Box plots depicted as described in a. d, Heatmap of ecotypes formed from the common METABRIC tumors (columns) identified from combining ecotypes generated using CIBERSORTx with all 32 significantly correlated cell-types (rows), when using CIBERSORTx on pseudo-bulk samples. e-f, Relative proportion of the PAM50 subtypes (e) and major cell-types (f) in each ecotype, when combining CIBERSORTx consensus clustering results. g-h, Kaplan-Meier (KM) plot of all patients with common tumors in each of the ecotypes (g) and patients with tumors in ecotypes E4 and E7 (h), when combining CIBERSORTx consensus clustering results. P-values calculated using the log-rank test. i-j, Relative proportion of the PAM50 molecular subtypes (i) and major cell-types (j) of the common tumors from combining CIBERSORTx and DWLS generated ecotypes. k, KM plot of the patients with tumors in ecotypes E4 and E7, formed from combining CIBERSORTx and DWLS generated ecotypes. P-value calculated using the log-rank test. l, Relative proportion of the METABRIC integrative cluster annotations of the tumors in each ecotype, as determined using CIBERSORTx across all cell-types.
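Panels a and b benchmark deconvolution by correlating, for each cell-type, the fractions observed by scRNA-Seq with the fractions predicted from pseudo-bulk profiles. The Python sketch below shows the Pearson correlation for a single cell-type across samples. It is an illustration only; the function name and the example fractions are invented, and it does not reproduce the significance test mentioned in the legend.

```python
from math import sqrt


def pearson_r(actual, predicted):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(actual)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    # Sums of centred cross-products and squares.
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_a * var_p)


# Hypothetical fractions of one cell-type in four pseudo-bulk samples.
actual_fractions = [0.10, 0.20, 0.30, 0.40]
predicted_fractions = [0.12, 0.18, 0.33, 0.37]
r = pearson_r(actual_fractions, predicted_fractions)
```

A value of r close to 1 for a cell-type indicates that the predicted fractions track the observed ones across samples, which is the criterion the legend uses to select significantly correlated cell-types.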

Supplementary information

Supplementary Information

Supplementary Note, containing Methods and one figure.

Reporting Summary

Peer Review Information

Supplementary Tables

About this article

Cite this article

Wu, S.Z., Al-Eryani, G., Roden, D.L. et al. A single-cell and spatially resolved atlas of human breast cancers. Nat Genet 53, 1334–1347 (2021). https://doi.org/10.1038/s41588-021-00911-1

  • DOI: https://doi.org/10.1038/s41588-021-00911-1
