LSD1 promotes secretory cell specification to drive BRAF mutant colorectal cancer

Despite the connection to distinct mucus-containing colorectal cancer (CRC) histological subtypes, the role of secretory cells, including goblet and enteroendocrine (EEC) cells, in CRC progression has been underexplored. Analysis of TCGA and single cell RNA sequencing data demonstrates that multiple secretory progenitor populations are enriched in BRAF-mutant CRC patient tumors and cell lines. Enrichment of EEC progenitors in BRAF-mutant CRC is maintained by DNA methylation and silencing of NEUROD1, a key gene required for differentiation of EECs. Mechanistically, secretory cells and the factors they secrete, such as Trefoil factor 3, are shown to promote colony formation and activation of cell survival pathways in the entire cell population. We further identify LSD1 as a critical regulator of secretory cell specification in vitro and in a colon orthotopic xenograft model, where LSD1 loss reduces tumor growth and metastasis. This work establishes EEC progenitors, in addition to goblet cells, as targetable populations in BRAF-mutant CRC and identifies LSD1 as a therapeutic target in secretory lineage-containing CRC.


Introduction
The human large intestine is populated by the stem and transit-amplifying cells that give rise to differentiated absorptive enterocytes and secretory cells such as goblet and enteroendocrine (EEC) cells. Renewal of these populations in the adult intestine relies on a delicate balance between proliferation and differentiation. This equilibrium is disrupted in colorectal cancer (CRC), canonically marked by highly proliferative and moderately differentiated cells (Fleming et al., 2012). CRC that exhibits mucin hypersecretion that encompasses greater than 50% of the total tumor volume is classified as mucinous adenocarcinomas (Bosman et al., 2010).
Approximately 15% of CRC falls into this pathologic classification (Hugen et al., 2014), indicating that a sizable fraction of CRC patients exhibit some degree of secretory cell differentiation and suggesting that secretory cells may be important in driving a proportion of CRC. However, because these pathological diagnoses are based on levels of secreted mucin, they may not reflect the true proportion or diversity of secretory cells in CRC. Tumors classified as nonmucinous may still contain secretory cells and/or all tumors classified as mucinous may not contain the same subset of secretory cells. Substantial heterogeneity in these groups could potentially explain previously noted conflicting results regarding the survival of patients with mucinous versus non-mucinous CRC (Schneider and Langner, 2014).
Another common subtype of cancer originating from secretory cells are neuroendocrine tumors (NETs), which arise from endocrine lineage cells. NETs are less common in the large intestine but are the most prevalent cancer subtype of the appendix (Moertel et al., 1968) (Modlin and Sandor, 1997). Aggressive subtypes of NETs include those with goblet cell differentiation (goblet cell carcinoids) (McGory et al., 2005), and poorly differentiated NETs (PDNETs) (Basturk et al., 2014). While distinct from mucinous tumors, goblet cell carcinoids are pathologically more similar to adenocarcinomas of the large intestine than to other NET subtypes (van Eeden et al., 2007), which may contribute to their under-characterization in the large intestine due to the difficulty in differentiating them from canonical adenocarcinomas. However, recent pathological studies have begun to identify tumors similar to goblet cell carcinoids in the large intestine (Inoue et al., 2020) (Abdalla et al., 2020), indicating that mixed secretory lineage CRCs form at some previously unappreciated frequency. It is currently unknown whether goblet and EEC cells present in colon adenocarcinomas functionally contribute to tumor progression. In spite of this knowledge gap, mounting evidence suggests that oncogenic transformation may be followed by altered differentiation of secretory lineages (Peignon et al., 2011). Driver mutations in CRC categorically affect essential cell processes, such as cell fate, cell survival, and/or genome maintenance (Vogelstein et al., 2013). Loss of function mutations in the Adenomatous Polyposis Coli gene (APC) induce non-canonical -catenin-mediated degradation of the general secretory specification factor Atonal Homolog 1 (ATOH1) leading to loss of goblet and presumably EEC lineage cells (Peignon et al., 2011). Alternatively, mucinous tumors have increased mutation rates in genes that function in the RAS/MAPK and PI3K/AKT signaling pathways (Schneider and Langner, 2014). Accordingly, inducing the BRAF(V600E) mutation in mouse colon organoids increased mucin biosynthesis (Reischmann et al., 2020), but the mechanism by which BRAF mutation alters the activity or differentiation of secretory cells is poorly understood.
The chromatin-modifying enzyme lysine-specific demethylase 1 (LSD1) is over-expressed in many CRCs and contributes to the formation of numerous cancer types such as small cell lung cancer and acute myeloid leukemia (Mohammad et al., 2015) (Takagi et al., 2017) (Maiques-Diaz et al., 2018). While the demethylase activity of LSD1 contributes to the progression of multiple cancer types, the scaffolding function of LSD1 in the CoREST complex has recently been highlighted in CRC (Miller et al., 2020). Nevertheless, both the normal functions of LSD1 in the intestine and its potential role in intestinal cancers are poorly understood. LSD1 has been directly implicated as a regulator of neuronal and myeloid differentiation, brown adiposities, and muscle cells (Laurent et al., 2015) (Maiques-Diaz et al., 2018) (Lin and Farmer, 2016) (Tosic et al., 2018). Furthermore, recent studies have begun to connect LSD1 with the regulation of intestinal secretory lineages. Co-transfection of LSD1 with the EEC lineage transcription factor NEUROD1 upregulated expression of the EEC marker SCT in HuTu80 and HEK 293 cells, indicating that LSD1 may function in the regulation of EEC lineage genes (Ray et al., 2014). Another study found that while LSD1 was required for Paneth cell formation during mouse development, loss of LSD1 did not significantly affect goblet or EEC formation in the small intestine (Zwiggelaar et al., 2020). An outstanding question in the field is whether the well-established tumor-promoting function of LSD1 in the intestine may be linked to an undescribed role in intestinal cell differentiation.
In this study, using patient data and in vitro culture systems, we discover that a subset of BRAF kinase mutated CRC are enriched for EECs that are locked in the progenitor state due to a DNA methylation-dependent differentiation blockade. We generate single-cell RNA-sequencing (scRNA-seq) datasets to study the regulation and function of EEC progenitors in CRC as well as in comparison to the normal human colon at high resolution, identifying conserved and CRC specific features. We further determine that LSD1 is essential for the formation of both EEC and goblet cell lineages in BRAF tumors and that loss of these populations abrogates tumor growth and metastases in vivo. Overall, this study establishes EEC progenitors, in addition to goblet cells, as targetable populations in BRAF-mutant CRC and identifies LSD1 as a novel therapeutic avenue to deplete EEC and goblet lineage cells, commonly enriched in multiple subtypes of BRAF mutant CRC.

Data availability
Single-cell RNA-seq data generated in this study as well as Wang et al., 2019

Single-cell sequencing and TCGA analyses
10,000 cells per sample were targeted for input to the 10X Genomics Chromium system using the Chromium Next GEM Single Cell 3′ Kit v3.1 at the IU School of Medicine (IUSM) Center for Medical Genomics core (HT29) or at IU Bloomington (H508). The libraries were sequenced at the IUSM Center for Medical Genomics using a NovaSeq 6000 with a NovaSeq S2 reagent kit v1.0 (100 cycles) with approximately 450 million read pairs per sample. A detailed description of single-cell sequencing analysis as well as TCGA analysis is included in the supplemental methods.

Cell culture and treatments
All cell lines were maintained in a humidified atmosphere with 5% CO2. HT29 cells were cultured in McCoys 5A media (Corning), NCI-H508 in RPMI 1640 media (Corning) and LS-174T cells in DMEM, supplemented with 10% FBS (Gibco). All cell lines were purchased from the ATCC; LS174T (2020), NCI-H508 (2019), HT29 (verified by STR profiling in 2019). Decitabine (DAC) (Sigma A3656) was reconstituted in sterile ddH2O at 2mg/ml, and administered at 100nM in fresh media every 24 hours for a total of 72 hours. ISX-9 (TOCRIS, 4439) was reconstituted in DMSO at 50mM, and cells were treated with 40M for 24 hours. For DAC/ISX-9 co-treatment, cells were treated with 100nM DAC or ddH2O for 48 hours, and on day three, treated with 100nm DAC or ddH20 and 40M ISX-9 or DMSO for 24 hours. For Notch inhibition experiments, cells were treated once with 40nM DBZ solubilized in DMSO (Tocris, 4489) for 48H. For TFF3 and EGF combination treatments in media exchange experiments, cells were grown for 48H in serum-free media, which was then collected and filtered with a .45 micron filter. 1g/ml anti-TFF3 (R&D Systems, MAB4407) or IgG was added to the media for 5 minutes. These compounds were solubilized in PBS prior to treatment. The media was then added back to the EV or LSD1 KD cells for 24 hours. Cells were then treated with 125 ng/ml recombinant EGF (R&D Systems: 236-EG) for 1 hour and collected for western blot analysis.
For CellTiter-Glo® assay (Promega #G7572) involving combination treatment, cells were starved in media lacking serum for 48H prior to treatment. Cells were treated with 1 g/ml anti-TFF3 or IgG control and 125 ng/ml recombinant EGF for 48H in serum-free media.

TFF3 ELISA and MTT assay
For TFF3 ELISA, cells were cultured in serum-free media for 24H. Fresh serum-free media was added, and after an additional 24H, was collected and centrifuged at 16,000xg for 5 minutes at 4 o C. ELISA's were performed using 50 l of media per sample and following the manufacturer's protocol (R&D systems, DTFF30). After removal of media, 1 ml of Optimem + 100 l MTT (5 mg/ml in PBS) was added and plates were incubated at 37 o C for 3.5H. The media was removed and DMSO was added. Absorbance was read at 570nM.

RNA isolation for qPCR
Total RNA was isolated using the RNeasy mini kit (Qiagen,74104). For qPCR, RNA was used to generate cDNA via reverse transcription (Thermo, K1642). cDNA was amplified using genespecific primers and FastStart Essential DNA Green Master (Roche, 06402712001). Cq values of non-housekeeping genes were normalized to RHOA or -ACTIN expression, selected using the Housekeeping Transcript Atlas (housekeeping.unicamp.br) as strong colon housekeeping genes. qPCR Primer sequences are included as supplemental materials. Results are shown as mean +/-SD.

Clonogenic growth and proliferation assays
For clonogenic growth assays, 500 single cells were plated on one well of a 6-well plate and allowed to grow for 14 days at 37 °C. Cells were then fixed with 10% formalin and stained with crystal violet. Stained cells were imaged using an SYNGENE G:BOX and quantification were carried out using the GeneSys and GeneTools programs. Proliferation assays were performed using the CellTiter-Glo® Luminescent Cell Viability Assay (Promega #G7572) per manufacturer's protocol. Briefly, 1×10 3 cells were plated in 96 well plates and allowed to incubate under standard growth conditions. Results are shown as mean +/-SD.

Whole-cell isolation and western blot
For protein isolation, cell pellets were lysed in a 4% SDS buffer using a Qiashredder (Qiagen).
Relative densitometry for western blots was determined using ImageJ software and normalized to the density of loading control β-ACTIN. Results are shown as mean +/-SD.

Quantitative Methylation Specific PCR (qMSP)
Total DNA was isolated using the DNeasy kit (Qiagen 69504). DNA was bisulfite-treated (EZ DNA methylation lightning kit, Zymo Research, D5030) and used for qMSP. qMSP assays were first validated using a standard curve of bisulfite-treated mixtures of unmethylated and methylated DNA (data not shown). To calculate relative DNA methylation, qMSP assays were performed for each gene using primer sets specific for unmethylated (U) and methylated (M) alleles. The delta Cq method was used to calculate M/U.

Orthotopic xenograft mouse model
HT29 cells were infected with the pLenti-U6-tdTomato-P2A-BlasR (LRT2B) virus (Addgene, 1108545) that enables dual production of firefly luciferase and dTomato. A strong dTomato signal was detectable within 14 days. Instead of sorting individual clones, the top 20% brightest dTomato cells were isolated by FACs to maintain the natural heterogeneity of these cells.
1,000,000 LSD1 KD or vector control cells were combined with Matrigel and injected into the mechanically prolapsed mouse large intestine. To visualize the firefly luciferase positive cells in vivo, mice were injected with luciferin and imaged using an IVIS imaging system every 7 days.
After approximately five weeks, mice were sacrificed and xenograft primary tumors along with adjacent normal mouse tissue, lungs and livers were harvested, imaged using an IVIS imaging system, fixed in 4% paraformaldehyde, and frozen in OTC for further analysis.

BRAF mutation is associated with poorly differentiated EECs
While it is generally understood how differentiation is coordinated in the intestine, whether differentiation is altered in CRC is currently understudied. In canonical intestinal epithelium differentiation, absorptive progenitors are specified by HES1 (Jensen et al., 2000), and cells without NOTCH1-mediated repression of ATOH1 form the various secretory cells (Yang et al., 2001) ( Figure 1A). ATOH1+ cells become goblet cells and NEUROG3+ cells, which become multipotent progenitors and eventually mature EECs (Schonhoff et al., 2004). The well-defined progenitor state is potentially unique to the EEC lineage, as goblet cells currently have no resolved progenitor stage. It is unknown, however, whether oncogenic mutations alter the specification of the EEC lineage in CRC.
To determine if the frequently mutated-in-CRC genes KRAS, APC, TP53, BRAF, PIK3CA, or SMAD4 are associated with the differentiation status of secretory cells in CRC, we used expression and mutation data from the TCGA COAD (supplemental methods). Canonical marker genes LGR5/AXIN2 (Stem), INSM1/NEUROG3 (Enteroendocrine progenitor), SST/CHGA (Enteroendocrine), and MUC2/FCGBP (Goblet) were all significantly correlated (pvalue < 0.05 and R > 0.45) (Supplemental Figure 1A). Thus, the coexpression of these marker gene pairs was used as an approximation of cell type content in the tumors. We annotated the expression quintiles of these marker pairs among all samples, and then plotted the samples by To study this apparent dichotomy, tumors were split into four groups bearing the BRAF or KRAS mutation, and expression values in the top two quintiles of mature EEC progenitor markers (BRAF EP /KRAS EP ) or of differentiated EEC markers (BRAF EC /KRAS EC ), ( Figure 1C). KRAS mutation is equally common among tumors with high expression of progenitor or differentiated markers, but BRAF mutation is primarily associated with high expression of progenitor markers.
Both BRAF and KRAS groups generally exhibited elevated levels of the secretory specification factor ATOH1 and goblet marker MUC2 compared to the all other tumors group (Supplemental  Table 4).
The transcriptomic profile of maturing EECs has recently been elucidated using scRNA-seq, with robust expression markers identified at each stage of development (Gehart et al., 2019).
Leveraging these markers, we inferred the potential stage of EEC progenitor maturation for each tumor sample. NEUROG3, one of the earliest markers of EEC differentiation, was significantly elevated in BRAF EP compared to normal tissue and all other groups (Figure 1D &   Supplemental Table 2). Generally, expression of early and intermediate markers was significantly higher in both BRAF EP as well as KRAS EP relative to all other tumor groups and significantly higher in BRAF EP compared to KRAS EP . Conversely, expression of late markers NEUROD1 and LMX1A were significantly lower in BRAF EP than all other tumor groups.

Consistent with lower expression of late EEC progenitor markers, markers of differentiated
EECs were all significantly lower in BRAF EP than in all other groups, with the exception of 4 out of 25 total comparisons where BRAF EP expression was lower but did not reach significance Table 2). These results suggest that differentiation of EECs may be halted at an early stage in some BRAF tumors.

Organotypic cell lines recapitulate major populations of the large intestine
The HT29, NCI-H508 (H508), and LS174T organotypic cell lines have been used to extensively study the contribution of goblet cells to tumor progression. HT29 and H508 cells contain activating BRAF kinase domain mutations V600E and G596R, respectively, and LS174T cells contain the KRAS GTP-binding mutation G12D (Cancer Cell Line Encyclopedia) (Barretina et al., 2012). However, the presence and differentiation status of the EEC lineage in these cell lines is unknown. To further interrogate the secretory cell populations in these cell lines, we performed single-cell RNA sequencing (scRNA-seq) and compared our data against a select published scRNA-seq of the human colon (Supplemental Figure 2). Through marker analysis, it was determined that most major colon epithelial cell types were present in all three samples ( Figure   2 A, B), including the secretory goblet and EEC populations ( Figure 2C). The early secretory population was defined based on its varied transcriptional continuum, with genes such as MEX3A being expressed throughout the cluster, and other genes such as DLL1 being uniquely expressed at the interface of the early EEC and goblet cell clusters (Supplemental Figure 3A However, well-defined markers for both populations were more clearly expressed in the normal colon, and marker expression was generally lower and more heterogeneous in the HT29 or H508 cell lines, indicating that these populations may not undergo canonical or complete differentiation in the cancer cell lines. We additionally identified a putative early goblet-like cell (EGLC) population that expressed some goblet cell markers such as REG4, MUC5AC, and SERPIN1A but was absent for MUC2 ( Figure 2A & Supplemental Figure 3F, REG4). Further, while intestinal stem-cell markers such as LGR5, SMOC2, and OLFM4 were highly expressed in the normal colon, the expression of these markers was lower in H508 cells and virtually absent in HT29 cells. The absence of this population in our largest dataset (HT29) obfuscated the identification of a canonical stem population in our integrated analyses, however, these cells, which primarily originate from the colon sample, were positioned directly above our early secretory cluster (Supplemental Figure 3G, SMOC2).
Compared to HT29 and H508 cells, the normal colon had significant enrichment of terminally differentiated secretory EEC and goblet cells (FDR < 0.05 and mean Log2 Fold Enrichment > 1) ( Figure 2C). Conversely, the cell lines were enriched for early EECs as well as EGLCs and miscellaneous (Misc) cell clusters (FDR < 0.05 and mean Log2 Fold Enrichment > 1). The Misc clusters contained discrete markers, but these markers were not clearly associated with any one cell type (Supplemental Figure 3H, I). Since EGLCs were primarily present only in HT29 cells, we did not study this population further and rather focused on generally conserved cell types.
The dichotomy between the cell lines and the normal intestine with respect to the differentiation status of EEC's is echoed by examining the expression of well-established marker genes. While

CpG methylation blocks NEUROD1 expression in BRAF EP cancers
We hypothesized that BRAF EP tumors exhibit a molecular alteration that functions to halt EEC differentiation. The CpG island methylator phenotype (CIMP) is characterized by aberrant gains of methylation at dense regions of CpG nucleotides known as CpG islands (CpGI), and generally coincides with repressed gene transcription (Toyota et al., 1999). KRAS and BRAF mutant tumors are differentially associated with CIMP, with BRAF-mutant tumors being uniquely enriched for the CIMP high phenotype versus KRAS mutants (Weisenberger et al., 2006). Based on the significant difference in NEUROD1 expression between BRAF EP and all other tumors, we hypothesized that NEUROD1 may be differentially methylated between these groups. Using TCGA COAD DNA methylation data, we determined both that methylation in all CRC groups was generally increased over the NEUROD1 CpGI compared to normal, and that there was a strong negative correlation between CpGI DNA methylation and expression of NEUROD1 ( Figure 3A). In contrast, methylation was slightly reduced over the transcription start site of the non-CpGI containing gene PAX4 in all sample types and the correlation between methylation and PAX4 gene expression was weak or not present (Supplemental Figure 5). Among the tumor groups, BRAF EP exhibited the highest levels of NEUROD1 methylation while BRAF EC , KRAS EP , and KRAS EC exhibited the lowest ( Figure 3A), consistent with the lowest expression of differentiated EEC markers in BRAF EP and highest expression in BRAF EC , KRAS EP , and KRAS EC groups ( Figure 1E). In line with the TCGA data, NEUROD1 CpGI methylation was significantly higher in BRAF mutant HT29 and H508 cell lines compared to KRAS mutant LS174T cells or normal human colon organoids ( Figure 3B). Treatment with a low dose of the DNA methyltransferase inhibitor decitabine (DAC) led to re-expression of NEUROD1 in HT29 and H508 cell lines but did not change NEUROD1 expression in LS174T cells, consistent with DNA methylation-dependent repression of NEUROD1 in BRAF-mutant tumors ( Figure 3C). To further show the specificity of DAC treatment-mediated reactivation of NEUROD1 in BRAF-mutant cells, we used ISX-9, which activates NEUROD1 expression through a DNA-methylation independent mechanism (Schneider et al., 2008). Treatment with ISX-9 induced significant activation of NEUROD1 in LS174T cells and NEUROD1 re-expression in HT29 and H508 cell lines ( Figure 3C). Co-treatment of DAC and ISX-9 further induced NEUROD1 expression in HT29 and H508 cells, but not in LS174T cells. Further analysis revealed that NEUROD1 promoter CpGI methylation was unchanged after ISX-9 treatment in HT29 cells, but significantly reduced after DAC treatment and reduced to a similar level with DAC/ISX-9 co-treatment ( Figure 3D). Together this data suggests that BRAF EP tumors exhibit CpGI methylationdependent loss of NEUROD1 expression, potentially contributing to the loss of EEC differentiation. Furthermore, HT29 and H508 cell lines may be appropriate models to study the regulation and function of EEC progenitors in colon cancer.

Trajectory analysis provides novel insight into the regulation of intestinal EEC and goblet cell commitment in CRC
The heterogeneous cell content of HT29 and H508 potentiates the use of computational trajectory analysis to better understand the transcriptomic profile of differentiating EECs and goblet cells in CRC. To this end, we performed RNA-velocity analysis, which uses the ratio of spliced to unspliced transcripts (analogous to mature versus nascent RNA) to infer cellular trajectories over developmental pseudotime. Furthermore, we extended the RNA-velocity analysis through integration with PAGA, which predicts the degree to which there is a connection between different clusters of cells based on the similarities and differences in their transcriptional profiles. For both the cell lines and colon samples an almost cyclical pattern emerges from within the TA cluster as seen in the velocity stream plots and the accompanying PAGA plots ( Figure 4A). Within this cell type, velocity arrows can be seen flowing from the inferred S phase cells, through the G2/M phase cells, and finally into the G1 phase cells as would be expected (Supplemental Figure 3D). In the cell line samples, it can also be seen that the velocity stream bifurcates and flows either towards the goblet and EEC cell lineages, or towards the central cluster that contains PLCs, ELCs, and various other clusters ( Figure 4A).
Finally, CellRank was employed to further define clusters that are likely at an end state in differentiation. For both cell lines, the early EECs and goblet cells had a strong localized prediction to be terminally differentiated (Supplemental Figure 6). PLCs, and to some extent other surrounding cells in clusters such as ELCs, had a diffuse and less well-localized prediction as terminally differentiated. Additionally, a subset of cells in TA clusters that varied between samples was predicted to represent an end state, likely reflecting cells that continue to cycle. It is noted that in the colon sample, the disconnected cell populations and low relative cell and sequencing read depth resulted in less informative RNA-velocity and CellRank results. Whereas for the cell lines the connectedness of clusters and relatively high depth of coverage leads to more continuous and confident velocity streams, and therefore more informative downstream analysis such as CellRank.
Using the RNA-velocity results and the knowledge of inferred cell end states from CellRank, we explored the goblet and EEC cell lineages in our cell lines by predicting the "descendants" of cells, or the cells they are most likely to become moving forward in pseudotime/differentiation.
Using this feature, we inferred the descendants of TA cells at the end of the G2/M phase as they potentially exit the cell cycle and begin specification toward CellRank predicted end-states.
Approximately 10% of these cells had descendants that were predicted to be in the goblet and EEC lineages ( Figure 4B), consistent with the relatively small proportion of these cells compared to the whole cellular population. Approximately 80% of cells were predicted to transition into other cell types such as ELC and PLCs or continued cycling, and the remaining 10% did not have well-defined descendants. Using the ratio of spliced to unspliced reads for a gene in a cell, RNA-velocity can also infer the gene velocity, or whether a gene is likely being activated or repressed in a subpopulation of cells. If the proportion of unspliced reads increases, activation (high velocity) is assumed since more nascent transcripts would be expected to be captured by a gene that is transcribing at a higher rate, and vice versa for repression.
Leveraging this information, we looked for genes with uniquely high or low velocities in our goblet and EEC clusters to better understand the regulatory landscape of their transcription.
High confidence targets identified through this analysis include genes such as MYT1, which has previously been shown to function as a general specification factor for the EEC lineage(Gehart et al., 2019) ( Figure 4C). Other high-velocity genes, RFX3 and ST18, have been implicated in the formation and function of pancreatic endocrine cells, and may therefore also function similarly in the large intestine(Ait-Lounis et al., 2010) (Henry et al., 2014). In addition to early EEC lineage specifiers, we identified genes associated with cell morphology like MAP2, the product of which is involved in the formation of dendrites (Harada et al., 2002). As evidence of this function in vitro, analysis of early EEC morphology by immunofluorescence shows that these cells exhibit large ramified processes reminiscent of the dendrite pseudopod structures of mature EECs, with a terminal synaptic-like bouton(Supplemental Figure 7A). With respect to goblet cells, CBFA2T3 has an established role in the specification of these cells (Amann et al., 2005) (Figure 4C). High-velocity genes toward the goblet population were generally involved in calcium signaling and metabolism, such as GMDS. In contrast to the early EECs, goblet cells primarily exhibited ovoid-like morphologies with some cells exhibiting the classic signet-ring morphology with nuclei pressed against the cell membrane (Supplemental Figure 7A).
Individual cell velocity arrows indicated that cells at the furthest points of the early EEC and goblet cell clusters had greatly reduced velocity, which means they are predicted to be terminally differentiated ( Figure 4D & Supplemental Figure 7B). For the goblet cell cluster, the pattern of predicted activation or repression of certain genes appears to correspond to this boundary. For example, Tre-foil factor 3 (TFF3) appears to switch from a predicted activated state to a repressed state near the terminally differentiated cells, while MUC2 goes from a repressed to activated state ( Figure 4E). As previously described, the secretory cell lineage has an interesting bifurcating shape, potentially due to a fate decision for common progenitors.
Zooming into that region and looking at the velocity arrow direction and speed for individual cells reveals a strong bias for many cells towards the EEC lineage in the cancer cell lines ( Figure   4D). In juxtaposition, the directionality and speed of cells differentiating to goblet cells is less coherent, with many cells seeming to change direction back towards the undifferentiated or EEC lineages. A representative descendent plot of an early secretory cell predicted to differentiate towards the goblet cell fate, and then redirect towards the early EEC fate is shown ( Figure 4F).
To directly test the apparent shift from goblet cell differentiation toward early EEC seen in these cell lines, genetic knockdowns of early EEC specification gene NEUROG3 and goblet cell specification gene SPDEF were generated. SPDEF KD reduced NEUROG3 and MUC2 expression, while NEUROG3 KD had no effect on SPDEF or MUC2 expression (Supplemental Figure 7C). These observations suggest a potential model in these CRC cell lines where some cells that are otherwise poised to form goblet cells are redirected prior to terminal differentiation to become early EECs ( Figure 4G).

Secretory cells and their progenitors secrete tumor-associated proteins in CRC
Secretory progenitors exist in highly transient states so are rarely captured in the normal intestine, and are therefore difficult to study. Consistent with our TCGA expression data, the proportion of mature ATOH1 expressing cells was higher in the colon than in HT29 or H508 cells, where the numbers of early EEC NEUROG3 expressing cells are higher in the CRC cell lines (Supplemental Figure 8). While present in these cell lines, NEUROG3 expressing cells nonetheless make up only ~2% of the total cell line population. The enrichment of these early EEC progenitors in the cell lines provides a unique opportunity to study in cancer the functions and interactions between these otherwise highly transient populations.
To interrogate the secretory protein landscape mediated by early EECs, EECs, and goblet cells, the top 50 markers for each of these cell types were cross-referenced to the Human Protein Atlas (proteinatlas.org) as well as related literature, to identify secreted proteins. All populations expressed various secreted factor genes, many of which have been shown to enhance cancerassociated phenotypes ( Figure 5A; Denoted in red). For example, early EECs express ENPP2, which has been shown to promote cell migration. TFF3 is canonically considered a goblet cellsecreted factor that displaces an antagonist of the EGFR, thereby enabling enhanced activation of the PI3K-AKT pathway (Belle et al., 2019). Uniquely, TFF3 is highly expressed in all three secretory cell types. Our trajectory analysis indicates that TFF3 gains velocity in stable EEC progenitors, where it is highly expressed (Figures 4E & 5A). Further analysis indicates that TFF3 is generally highly expressed in goblet and EEC lineage cells compared to all other groups ( Figure 5B). Genetic knockdown of ATOH1 or NEUROG3 to decrease secretory and EEC populations, respectively, lead to significant loss of TFF3 expression ( Figure 5C, D & Supplemental Figure 7C) and significantly less TFF3 secretion into the media ( Figure 5E). Consistent with the previously described role for TFF3 in promoting cell survival (Taupin et al., 2000), the addition of a neutralizing TFF3 antibody to serum-free media resulted in a significant reduction in cell survival during EGF signaling ( Figure 5F). These data provide evidence that EEC progenitors produce secreted peptides that may contribute to tumor progression, such as TFF3, which is critical for cell survival during growth factor signaling.

LSD1 knockdown results in loss of secretory cells
To better understand the regulation of EECs in CRC, we performed pathway enrichment analysis of differentially expressed genes between the early EECs and goblet cells, using the Reactome Pathway Database (Jassal et al., 2020). Compared to goblet cells, early EECs were significantly enriched for chromatin related terms such as "Chromatin modifying enzymes", and "Chromatin organization" (FDR < 0.05) (Supplemental Figure 9). The most differentially expressed chromatin-modifying enzyme was Lysine-specific demethylase 1 (LSD1) (FDR of 2.97x10 -14 and Log2 FC of 1.03). Generally, LSD1 expression was highest in early EEC and early secretory cells, lowest in differentiated EEC and goblet cells, and similar in all other clusters (Supplemental Figure 10A). We have previously demonstrated that loss of LSD1 significantly alters clonogenic growth in organotypic HT29 cells, but not in stem-like SW480 cells, which do not form secretory cells (Miller et al., 2020). Similarly, reduction of EEC or secretory cells by knockdown of NEUROG3 or ATOH1 significantly reduced clonogenic growth in HT29 cells (Supplemental Figure 10B). Combining this observation with our LSD1 expression analysis, we hypothesized that LSD1 expression levels increase in secretory progenitors to regulate EEC progenitor and goblet cell formation in our cell lines. Additionally, loss of these populations after LSD1 KD may be responsible for our previously reported clonogenic growth phenotypes (Miller et al., 2020).
As a preliminary analysis to determine if LSD1 may be necessary for the formation of secretory lineage cells, we reanalyzed our previously published LSD1 knockdown (KD) RNA-seq data in HT29 and SW480 cells (Miller et al., 2020). LSD1 KD was associated with a significant reduction of nearly all tested EEC progenitor and goblet cell markers in HT29 cells, but no markers were reduced in SW480 cells (FDR < 0.05) (Supplemental Figure 10C). To deconvolute whether we lose these cell populations or have a global decrease in transcript expression for these genes, we performed scRNA-seq on HT29 and H508 cells after LSD1 KD and integrated these samples into the previously analyzed empty vector (EV) controls using Seurat, as described in the Supplemental Methods. Consistent with reports of LSD1 overexpression in CRC (Hsu et al., 2015), LSD1 expression was elevated in our cell lines compared to the normal colon, which had expression levels similar to the HT29 and H508 LSD1 KD samples (Supplemental Figure 10D).
Loss of LSD1 dramatically reduced relative cell-type proportions of early EEC and goblet cells (mean Log2 Fold Difference < -1 and FDR < 0.05) but did not significantly decrease other populations in both cell lines ( Figure 6A). To better understand the effect of LSD1 KD on the differentiation trajectory of subpopulations within the cell lines, we computed the RNA-velocity and compared the results to the EV controls. Strong relative velocity length and confidence can be observed in the goblet and EEC lineages in the controls ( Figure 6B). Velocity confidence is a measure of agreement in the directionality of differentiation between nearby cells, and velocity length is a measure of the relative "speed" that one subpopulation is differentiating towards another in pseudotime. In LSD1 KD cells, however, both velocity length and confidence toward the secretory lineage is diminished relative to the other non-secretory cell lineage clusters such as TAs. As validation of these results, LSD1 KD leads to a significant reduction in the percentage of MUC2 positive goblet cells and SUSD2 positive cells, a marker expressed in EEC progenitor populations, as determined by immunofluorescence ( Figure 6C, D). Further, while LSD1 KD reduced expression of NEUROG3 and ATOH1 in HT29 and H508 cells, this reduction was not observed in KRAS mutant LS174T cells (Supplemental Figure 10E).
Disruption of the NOTCH pathway can significantly change the relative levels of secretory cells, and in HT29 cells NOTCH inhibition by DBZ significantly increased secretory cells at the expense of HES1+ absorptive cells as inferred by changes in gene expression ( Figure 1A & Supplemental Figure 11A). LSD1 KD did not alter HES1 expression in any cell line, and further analysis of our previously published RNA-seq data shows that LSD1 KD did not alter the notch signaling pathway, indicating that LSD1 function may be independent of NOTCH1 specification (Supplemental Figure 10E, 11B). Together these data show that LSD1 is required for the formation of secretory EEC progenitors and goblet cells in HT29 and H508 cell lines.

Loss of secretory cells reduces pS473-AKT and suppresses tumor growth and metastasis
We have previously demonstrated that LSD1 regulates phosphorylation of AKT at serine 473, a modification involved in the activation of AKT through an unknown mechanism (Miller et al., 2020). We hypothesized that the loss of secretory cells following LSD1 KD may thereby abrogate AKT phosphorylation potentially due to loss of secreted factors. Replacing the media of LSD1 KD cells with media conditioned by other LSD1 KD cells resulted in a significant decrease in pS473-AKT compared to EV cells ( Figure 7A). Replacing the media of LSD1 KD cells with media conditioned by EV cells resulted in the significant rescue of pS473-AKT levels and increased phosphorylation of the AKT-downstream target TSC complex subunit 2 (TSC2).
These results suggest that LSD1 KD cells are unable to produce a secreted factor critical for the activation of AKT, likely due to the loss of the cell type secreting this factor. LSD1 KD significantly reduced TFF3 mRNA expression, TFF3 protein secretion into the media, and TFF3 positive cells ( Figure 7B-D). Given the crucial function of TFF3 protein in the activation of EGFR, we tested the hypothesis that loss of TFF3 production in LSD1 KD cells reduces pS473-AKT. Treatment with a TFF3-neutralizing antibody reduced basal pS473-AKT and completely blocked pS473-AKT rescue in LSD1 KD cells supplemented with EV media ( Figure 7E). SNAIL protein levels followed a similar trend, in agreement with our previous work demonstrating that LSD1 regulates SNAIL protein stability through the regulation of AKT activation (Miller et al., 2020).
To elucidate the contribution of LSD1-mediated specification of secretory cells to CRC in vivo, HT29 cells dual-labeled with luciferase and mCherry were orthotopically implanted into the mouse colon. LSD1 KD tumors exhibited significantly reduced volume at the primary site and significantly reduced metastases to the lung ( Figure 7F, G). LSD1 KD tumors exhibited reduced staining for TFF3 and mucinous vacuoles ( Figure 7H). Furthermore, while the mature EEC marker Synaptophysin was detected in the mouse large intestine, no positive staining was detected in mCherry tumor cells, consistent with the lack of EEC differentiation observed in vitro (Supplemental Figure 12). Together these data demonstrate that loss of secretory cells following LSD1 KD leads to loss of pS473-AKT and abrogates tumor growth and metastases in vivo.

Discussion
BRAF mutant CRCs can be highly aggressive, with many studies showing worse survival compared to other CRC subtypes (Yaeger et al., 2014) (Margonis et al., 2018). In general, therapy response has been variable in clinical trials targeting patients with BRAF mutations, highlighting the need to develop additional therapeutic targets (De Roock et al., 2010) (Yaeger et al., 2015). In this study, we demonstrate that in addition to goblet cells, a significant proportion of BRAF mutant CRCs are also enriched for EEC progenitors, and support the notion that the presence of secretory cell populations is an important consideration in the treatment of BRAF mutant CRC. Furthermore, we identify LSD1 as a potential druggable target in secretory-cellenriched BRAF mutant CRC.
BRAF mutant CRC is strongly associated with CIMP (Weisenberger et al., 2006). Currently investigated treatments for BRAF CRC include DNA methyltransferase inhibitors (DNMTi) such as 5-azacitidine, which has been shown to sensitize BRAF mutant CRC cell lines to BRAF inhibitors (Mao et al., 2013). Interestingly we find that hypermethylation of the NEUROD1 promoter CpGI is associated with a differentiation block in EECs resulting in an accumulation of EEC progenitors in BRAF mutant CRC. Treatment with the DNMTi decitabine demethylates the promoter CpGI resulting in re-expression of low levels of NEUROD1 and the ability of neurogenesis-inducing small molecule ISX-9 to induce robust expression of NEUROD1 when used in combination with decitabine. These findings suggest that EEC progenitors in BRAF mutant CRC can be further differentiated into EECs through the use of DNMTis. Beyond our findings in CRC, NEUROD1 is frequently differentially methylated in neoplastic breast tissue (Fiegl et al., 2008). While the function of this methylation is unclear, these patients have a much worse overall survival, indicating a potential molecular connection between these two cancer types. Furthermore, NEUROD1 is repressed by the polycomb repressive complex 2 (PRC2) in medulloblastomas as a mechanism to stymie tumor cell differentiation (Cheng et al., 2020). The PRC2 has been shown to mark genes for de novo DNA methylation in multiple cancer types including CRC suggesting that the PRC2 may be linked to NEUROD1 CpGI methylation in BRAF-mutant CRC (Schlesinger et al., 2007).
The transient nature of EEC progenitors makes these populations more difficult to study than their fully differentiated counterparts, which still represent less than 1% of the adult large intestine. Our finding that BRAF mutant CRC cell lines maintain high levels of EEC progenitors suggests the cell lines may provide suitable models for the study of secretory cells in progenitor states. When comparing our scRNA-seq data from HT29 and H508 cell lines to a normal human colon sample, multiple of the clearly defined cell types seen in the colon sample appeared in a more undifferentiated or transcriptionally altered state in the CRC cell lines, consistent with other studies of CRC tumors at single-cell resolution (Uhlitz et al., 2020). However, distinct cell types such as goblet cells and putative early EECs appeared well-defined and with clearly conserved markers in both the normal colon and CRC cell lines. In line with these findings, researchers have derived subclones from the HT29 cell line capable of forming polarized goblet and enterocyte monolayers (Huet et al., 1987). RNA velocity predicted a pattern of cell differentiation for these populations where a subset of TA cells presumably finishing the cell cycle differentiate towards early progenitors and bifurcate into the stable goblet and early EEC cells. In our analysis of this data, we identify recently established transcriptional drivers of EEC and goblet cell differentiation as well as potential novel regulators. Due to the clear conservation of core transcriptional profiles in secretory populations between our BRAF mutant CRC cell lines and the normal colon, it is likely that most of our findings are applicable and of interest to the study of normal differentiation in the colon.
One of the clusters that were variant in the CRC cell lines compared to the normal intestine was canonical stem cells, which were completely absent in HT29 cells. We note in our data that in HT29 cells CDX2 expression was nearly absent, the loss of which frequently occurs in BRAF mutant CRC, and contributes to sessile serrated adenoma progression, a tumor subtype that progresses rapidly (Sakamoto et al., 2017). CDX2 is required to maintain the identity of intestinal stem cells (Simmini et al., 2014) and therefore the lack of CDX2 expression in HT29 may be linked to the lack of canonical stem cell populations, and explain the absence of this population in this cell line. Another interesting observation more specific to the CRC cell lines was the presence of the EGLC cluster. This cluster contained some canonical markers of goblet cells but was predicted based on the trajectory analyses to be a transition state in the formation of differentiated secretory cells. Given the deeper sequencing of the HT29 cell line, it is unclear whether or not this population was identified due to the resolution of this dataset or whether it may be due to the underlying biology of this specific cancer.
Further analysis of EEC progenitors in our cell lines showed that these cells surprisingly secrete factors such as TFF3, a factor previously thought to be secreted predominantly just by goblet cells. We demonstrate that TFF3 promotes cell survival during growth factor signaling suggesting both EECs and goblet cells secrete factors that promote CRC. Based on our identification of other tumor-promoting secretory genes expressed by goblet cells and EECs, other secreted factors in addition to TFF3 also likely contribute to CRC progression.
Furthermore, our data also suggest EEC progenitors are important in colony formation, a critical step in metastasis. In support of this notion, another study demonstrated that NEUROG3+ cells not only give rise to EECs but surprisingly also to non-EEC cells, which suggests these progenitors can dedifferentiate to produce other lineages (Schonhoff et al., 2004). Additional recent studies have shown that ATOH1+ common secretory progenitors can dedifferentiate into a multipotent stem-like state spontaneously and in response to intestinal injury (Tomic et al., 2018) (Castillo-Azofeifa et al., 2019. In agreement with these studies our trajectory analyses predict that while the vast majority of EEC progenitors are stable, a small fraction is predicted to dedifferentiate. It is tempting to speculate that the dedifferentiation of secretory progenitors may play a role in successful colony formation during metastases of BRAF tumors. We demonstrate for the first time that LSD1 is required for the maintenance of secretory EECs and goblet cells in BRAF CRC. By employing scRNA-sequencing we were able to determine that LS1 KD leads to the loss of velocity of lineage specifying genes in early secretory cells, culminating in the loss of early EEC and goblet terminal secretory states. Interestingly, we determined that LSD1 KD did not alter secretory cells in the KRAS mutant mucinous LS174T CRC cells. This study sheds light on previous work demonstrating that LSD1 regulates activation of AKT in HT29 cells (Miller et al., 2020) as we now show secreted TFF3 may play a critical role in this process. Reduction of goblet and EEC cells by LSD1 KD resulted in reduced TFF3 secretion, allowing LSD1 KD to ablate AKT activation in the whole cell population even though it is only directly affecting a small percentage of cells. We find that LSD1 appears to act downstream of NOTCH signaling during the differentiation process. We further show that LSD1 levels are highest in EEC progenitors and early secretory cells. In accordance, we speculate that LSD1 may function directly in early secretory progenitors, to regulate the expression of genes important for the formation of further differentiated EEC progenitors and goblet cells. In vivo, LSD1 KD xenografts maintained the loss of secretory cells seen in cell culture and reduced tumor growth and metastasis. Future studies that establish a mechanistic understanding of how LSD1 regulates the maintenance of secretory cells will be critical to translate these findings into rational therapeutic options for patients with BRAF mutant CRC.      directed PAGA plots (bottom) of the colon, H508 EV, and HT29 EV scRNA-seq data colored by clusters from spliced reads, showing inferred differentiation trajectories. B, Representative plots of inferred cell descendants over pseudotime from TA cells in G2/M phase, and the approximate percentages of their inferred end-state cell type in HT29. C, Velocity and expression UMAP plots, and a table of select markers that have differential velocity in the goblet or early EEC cells versus all other clusters in HT29 and H508. D, Velocity arrows for individual cells in the early secretory, goblet, and early EEC cell clusters. E, UMAP plot of RNA velocities of TFF3 and MUC2 per cell in HT29 and H508. F, Representative descendent tracing through pseudotime of a cell originating in the early secretory cluster, and that is inferred to slightly differentiate towards the goblet cell lineage before becoming an early EEC in HT29. G, Diagram depicting velocity analysis-based model of secretory cell differentiation from early secretory progenitors toward early EEC and goblet cells, as well as the redirection of some cells away from terminal goblet cell differentiation toward early EECs.

Figure 5: Secretory cells and their progenitors secrete tumor-promoting factors.
A, Violin plots of normalized Log2 + 1 expression for various marker genes in early EEC, EEC, and goblet subpopulations from the combined scRNA-seq data. Each gene is scaled separately.