ERG activates a stem-like proliferation-differentiation program in prostate epithelial cells with mixed basal-luminal identity

Summary

To gain insight into how ERG translocations cause prostate cancer, we performed single-cell transcriptional profiling of an autochthonous mouse model at an early stage of disease initiation. Despite broad expression of ERG in all prostate epithelial cells, proliferation was enriched in a small, stem-like population with mixed luminal-basal identity (called intermediate cells). Through a series of lineage tracing and primary prostate tissue transplantation experiments, we find that tumor-initiating activity resides in a subpopulation of basal cells that co-express the luminal genes Tmprss2 and Nkx3.1 (called BasalLum) but not in the larger population of classical Krt8+ luminal cells. Upon ERG activation, BasalLum cells give rise to the highly proliferative intermediate state, which subsequently transitions to the larger population of Krt8+ luminal cells characteristic of ERG-positive human cancers. Furthermore, this proliferative population is characterized by an ERG-specific chromatin state enriched for NFkB, AP-1, STAT and NFAT binding, with implications for TF cooperativity. The fact that the proliferative potential of ERG is enriched in a small stem-like population implicates the chromatin context of these cells as a critical variable for unmasking its oncogenic activity.


Figure S1. Time point characterization and annotation of scRNA-seq clusters of genetically engineered mouse models (GEMMs). (A) H&E prostate histology. Dashed lines encircle invasive adenocarcinoma, highlighting an ERG-dependent focal invasion in 3-month-old EPC mice that later became pervasive at 6 months of age. Scale bars, 250 µm. (B) UMAP of all cells (N = 36,961 cells) from the prostates of 5 mice (2 EPC, 2 PC, 1 WT) annotated based on Leiden clustering. (C) UMAP of all cells colored by individual replicates per genotype. (D) UMAP of all cells showing cells assigned to each genotype and colored by log10(UMI). (E) UMAP of all cells annotated with previously defined normal mouse prostate signatures. Epithelial cell clusters defined based on Epcam expression are highlighted. Myeloid cells specific to the EPC samples were defined as Myeloid-EPC. (F) Barplot of all cells colored by individual replicates (left) or each genotype with replicates aggregated (right). Numbers of cells in each cluster are included (n=) and the cluster number corresponding to B is shown. (G) UMAP (left) and barplot (right) of all epithelial cells colored by cell types based on normal prostate signatures 35. Numbers assigned to each individual cluster are shown. Colored circles highlight L2 cells (purple) and basal-like cells (red) specific to PC and EPC prostates (termed Tumor-Lum and Tumor-Bas, respectively). Tumor-Bas cells were later defined as Tumor-IM cells, as shown in Fig. 1B. A cluster containing a mix of different cell types is defined as mixed. Bas, basal; Lum, luminal. (H) UMAP showing epithelial clusters assigned to each genotype. Clusters are annotated as in G. The IDs of the two Tumor-Bas/IM clusters are shown. (I) Barplot of epithelial clusters showing the cell type composition of each cluster. Numbers of cells in each cluster are included. The clusters are numbered according to the UMAP in G. (J) Violin plots showing androgen receptor (Ar) expression (left) and Ar signature scores (right) across epithelial cell types. (K) Sagittal view of whole prostates with Trop2 IHC on mice at 3 months of age. In a normal prostate (YC), Trop2 selectively stains the stem-like L2 luminal cells at the proximal regions and distal invagination tips over the secretory L1 luminal cells, as expected 35,36,137,138. By contrast, EPC tumor cells displayed pan-Trop2 staining regardless of their tissue localization, further supporting a pervasive L2 transition as revealed by scRNA-seq analysis in G-H. Scale bars, 1 mm. (L) Violin plots comparing proliferative marker expression across all epithelial clusters. The two Tumor-Bas (IM) clusters are numbered according to the UMAP in G. (M) Gene set enrichment analysis showing EPC-specific pathways (ERG UP) within the Tumor-Bas/IM population.

Figure S3. Tumor-IM cells show intermediate features. (A) Assignment of basal and luminal scores to epithelial cells from all samples aggregated and from each individual genotype (see methods). Unsupervised analysis revealed three distinct populations among all epithelial cells, which were named basal, luminal, and intermediate (IM) according to the marker scores. A pronounced double-positive IM population in PC and EPC mice is shown. (B) UMAP of all epithelial cells highlighting the IM identity of the Tumor-IM clusters. Clusters are colored by the basal, luminal, and intermediate cell types defined in A. (C) Assignment of basal and luminal scores in each individual cell type.
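The scoring idea in panel A can be illustrated with a minimal sketch. It is not the authors' pipeline (their signatures and unsupervised analysis are described in the methods). The marker lists, the fixed threshold, and the toy expression values below are all assumptions chosen for illustration, and a simple threshold rule stands in for the unsupervised step.

```python
from statistics import mean

# Illustrative marker sets (assumption; not the signatures used in the paper).
BASAL_MARKERS = ["Krt5", "Krt14", "Trp63"]
LUMINAL_MARKERS = ["Krt8", "Krt18", "Cd24a"]


def marker_score(cell, markers):
    """Mean normalized expression over a marker set; missing genes count as 0."""
    return mean([cell.get(gene, 0.0) for gene in markers])


def classify(cell, threshold=1.0):
    """Label a cell by which scores reach a hypothetical fixed threshold."""
    is_basal = marker_score(cell, BASAL_MARKERS) >= threshold
    is_luminal = marker_score(cell, LUMINAL_MARKERS) >= threshold
    if is_basal and is_luminal:
        return "intermediate"  # double-positive cells
    if is_basal:
        return "basal"
    if is_luminal:
        return "luminal"
    return "unassigned"


# Toy log-normalized expression values (invented for this example).
cells = {
    "cell_a": {"Krt5": 2.5, "Krt14": 2.0, "Trp63": 1.5, "Krt8": 0.1},
    "cell_b": {"Krt8": 3.0, "Krt18": 2.4, "Cd24a": 1.2},
    "cell_c": {"Krt5": 1.8, "Krt14": 1.2, "Trp63": 0.9,
               "Krt8": 2.0, "Krt18": 1.5, "Cd24a": 0.8},
}
labels = {name: classify(cell) for name, cell in cells.items()}
```

In practice a data-driven split (for example, clustering cells in the two-dimensional score space) would replace the fixed threshold, which is used here only to keep the example short.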

Figure S4. A subset of basal cells express androgen-regulated luminal genes. (A) UMAP of all epithelial cells from human prostates. Data were reproduced from a previous study 54 with the same cell type classification, except that the two subtypes of the previously defined prostate cancer cells are together named as PCa here (see methods). The basal and PCa

Figure S5. Evidence of ERG+ basal cells in human prostate cancer. (A) UMAP of epithelial cells from patients 54 showing RNA count to assess doublet potential. The low to intermediate RNA count in the basal cluster suggests that the ERG+ basal cells in Fig. 2E are unlikely to be an artifact of doublet formation. (B) UMAP of epithelial cells showing ERG expression in individual patients. ERG+ PCa samples are shown (see methods). Cells from each patient are colored from white to red based on gene expression, with cells from the remaining patients shown in grey in the background. Arrows highlight the presence of ERG+ cells in the basal cluster. Cell types in this figure are annotated based on Fig. S4A, with basal and PCa clusters highlighted in black and pink circles, respectively. (C) UMAP of epithelial cells from a different

Figure S6. Generation and characterization of EP orthografts using freshly isolated and recombined cells. (A) Quality control for freshly isolated basal and luminal-derived cells generated in Fig. 3A. (Top) Post-sort analysis to validate purity of the sorted basal and luminal populations. (Bottom) Assessment of Cre recombination efficiency using freshly derived organoids harvested at 5 days post Cre. GFP was used as a surrogate for ERG expression. (B) Images showing prostates harvested at 5 months post transplantation. Visible grafts, highlighted in yellow circles, suggest a higher graft burden from basal-derived EP orthografts. (C) Basal-derived EP orthografts trended towards a higher injected lobe weight than the luminal derivatives at the 5-month endpoint, further corroborating the higher graft burden shown in Fig. 3B. (D) Basal-derived EP orthografts displayed a reduction of the basal marker p63, further corroborating the notion of luminal fate transition shown in Fig. 3D. p63 expression was measured by flow cytometry in ERG+ graft cells harvested at the 5-month endpoint. Data represent mean ± s.d.; n = 6 (except in A, where n = 1); ns, not significant; *p < 0.05; unpaired two-tailed t-test.

Figure S8. In situ analysis of lineage marker expression and Cre recombination efficiency. (A) IF staining showing Ar expression in both K5- and K5+ cells from invasive adenocarcinomas of indicated mice; inset shows high-power view. Scale bars, 50 µm. (B) IHC documenting recombined cells after crossing with K8-CreERT2 vs Nkx3-1-CreERT2 mice. YFP was used as a surrogate for Cre recombination. Scale bars, 100 µm. (C) EP;K8-CreERT2 mice displayed Nkx3.1 loss in ERG+ cells from the precursor lesions by 1 month post tamoxifen. High-power view in insets highlights the absence of Nkx3.1 signal in ERG+ cells (arrow) and the neighboring Nkx3.1+ normal luminal cells. The Nkx3-1-CreERT2 allele causes haploinsufficiency of the tumor suppressor gene Nkx3-1 139, which could potentially explain the tumorigenic phenotypes in EP;Nkx3-1-CreERT2 mice in Fig. 4. However, the data here show an early loss of Nkx3.1 in EP;K8-CreERT2 mice as well (potentially reflecting an L2 transition and thus loss of the L1 marker Nkx3.1). Thus, Nkx3-1 expression is similarly disrupted in both settings and is therefore unlikely to cause the different phenotypes. Scale bars, 100 µm.

Figure S9. ERG+ basal and intermediate cells proliferate towards a luminal fate in vitro and in vivo. (A) In vivo EdU pulse-chase assay showing EdU quantification in ERG+ EPC cells. The percentage of EdU-labeled cells was comparable between the pulse and chase samples, suggesting that the EdU+ population is unlikely to be enriched for label-retaining cells within 1 week of chase. (B) The same samples from A showing EdU quantification in SP lum and IM populations of ERG+ EPC cells in pulse samples. IM cells showed a higher EdU signal, further corroborating the more proliferative feature of these cells as revealed in Fig. 1. (C) Flow cytometry analysis of ERG+ cells from A highlighting an SP lum shift in EdU+ cells after 1 week of chase. (D) Flow cytometry validating the expected ERG and Pten expression status in EP organoids and the isogenic controls. (E) Flow cytometry showing an ERG-dependent expansion of luminal cells (SP lum) in EP organoids. (F) Flow cytometry quantification in EP cells after an EdU pulse. The EdU+ population showed a depletion of luminal (SP lum) cells and an enrichment of basal (SP bas) cells relative to the total live population. Data represent mean ± s.d.; n > 3; ns, not significant; *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001; unpaired two-tailed t-test (A, B, E); multiple paired t-tests with FDR correction by Benjamini, Krieger and Yekutieli (F).
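For readers unfamiliar with the FDR correction named for panel F, the following is a minimal sketch. It assumes the two-stage linear step-up procedure of Benjamini, Krieger and Yekutieli, since the legend does not state which variant was used. The p-values are invented for illustration and are not taken from the figure. The function names `bh_reject` and `bky_two_stage` are hypothetical.

```python
def bh_reject(pvals, q):
    """Benjamini-Hochberg step-up: one reject flag per p-value at level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value passes its threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * q / m:
            k = rank
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject


def bky_two_stage(pvals, q=0.05):
    """Two-stage linear step-up procedure (assumed variant of the BKY method)."""
    m = len(pvals)
    q1 = q / (1 + q)              # stage 1 runs BH at a slightly reduced level
    stage1 = bh_reject(pvals, q1)
    r1 = sum(stage1)
    if r1 == 0 or r1 == m:        # nothing or everything rejected: stop here
        return stage1
    m0_hat = m - r1               # estimated number of true null hypotheses
    return bh_reject(pvals, q1 * m / m0_hat)


# Invented p-values, one per comparison (illustration only).
example_pvals = [0.001, 0.012, 0.035, 0.045, 0.6]
bky_flags = bky_two_stage(example_pvals, q=0.05)
bh_flags = bh_reject(example_pvals, q=0.05)
```

With these toy values the two-stage procedure flags four comparisons while plain Benjamini-Hochberg flags two, which shows how estimating the number of true nulls in the first stage can increase power.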

Figure S10. Generation and characterization of the dual lineage reporter knock-in system in EP organoids. (A) Junction PCR across the target loci and the engineered reporters in the bulk organoid population. PCR products of the expected size that are specific to the knock-in allele indicate successful reporter targeting. (B) Sanger sequencing of the junction PCR clones from A validates the expected junction sequences, with homogeneous perfect junctions for Krt8-TagRFP targeting and heterogeneous junctions with small in-frame insertions/deletions for Krt5-mNeonGreen targeting. (C) Live cell flow cytometry using the engineered reporter signals recapitulated a basal/luminal lineage pattern similar to that obtained by K5/K8 intracellular flow in Fig. S9E. (D) Intracellular flow cytometry comparing the expression pattern of endogenous K8 with the TagRFP reporter signal in the targeted organoid population. An overall concordance of ~92% was observed. A false-negative population of ~5.6% was also observed, which reflects either a lack of targeting or a disruption of protein function in a minority of cells. A similar assay was not possible with the Krt5-mNeonGreen targeting because the targeting event ablates the epitope. (E) Functional validation of the Krt5-mNeonGreen engineering by Krt5 depletion, which led to a reduction of the mNeonGreen reporter signal. (F) Effective ERG ablation by CRISPR in EP reporter organoids. ERG was detected by intracellular flow cytometry 2 days after introducing the indicated CRISPR-RNP.
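The concordance and false-negative figures in panel D are simple ratios over the four flow cytometry quadrants. The sketch below shows one way to compute them. The quadrant counts are hypothetical, chosen only so that the output mirrors the reported percentages, and it assumes both values are expressed as fractions of all analyzed cells, which the legend does not state explicitly.

```python
def reporter_concordance(both_pos, both_neg, k8_only, rfp_only):
    """Return (concordance, false_negative_rate) as fractions of all cells.

    both_pos: K8+ TagRFP+    both_neg: K8- TagRFP-
    k8_only:  K8+ TagRFP-  (reporter false negatives)
    rfp_only: K8- TagRFP+  (reporter false positives)
    """
    total = both_pos + both_neg + k8_only + rfp_only
    if total == 0:
        raise ValueError("no cells in any quadrant")
    concordance = (both_pos + both_neg) / total
    false_negative_rate = k8_only / total
    return concordance, false_negative_rate


# Hypothetical quadrant counts per 1,000 cells (not measured data).
concordance, fn_rate = reporter_concordance(
    both_pos=700, both_neg=220, k8_only=56, rfp_only=24
)
print(f"concordance = {concordance:.1%}, false negatives = {fn_rate:.1%}")
```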