Levelling out differences in aerobic glycolysis neutralizes the competitive advantage of oncogenic PIK3CA mutant progenitors in the esophagus

Normal human tissues progressively accumulate cells carrying mutations. Activating mutations in PIK3CA generate large clones in the aging human esophagus, but the underlying cellular mechanisms are unclear. Here, we tracked mutant PIK3CA esophageal progenitor cells in transgenic mice by lineage tracing. Expression of an activating heterozygous Pik3caH1047R mutation in single progenitor cells tilts cell fate towards proliferation, generating mutant clones that outcompete their wild type neighbors. The mutation leads to increased aerobic glycolysis through the activation of Hif1α transcriptional targets compared with wild type cells. We found that interventions that level out the difference in activation of the PI3K/HIF1α/aerobic glycolysis axis between wild type and mutant cells attenuate the competitive advantage of Pik3caH1047R mutant cells in vitro and in vivo. Our results suggest that clinically feasible interventions that even out signaling imbalances between wild type and mutant cells may limit the expansion of oncogenic mutants in normal epithelia.


Introduction
As humans age, normal tissues accumulate clones carrying somatic mutations (Blokzijl et al., 2016; Lee-Six et al., 2018; Martincorena et al., 2018; Martincorena et al., 2015; Moore et al., 2020; Yizhak et al., 2019; Yokoyama et al., 2019; Yoshida et al., 2020). An example is the esophagus, in which the normal epithelium develops into a patchwork of mutant clones by middle age. There is genetic evidence that the most prevalent mutations are in genes under strong positive selection, arguing that they confer a proliferative or survival advantage over wild type cells (Hall et al., 2019). However, little is known about the cellular mechanisms that underpin the clonal expansions triggered by most of the mutant genes. Understanding these processes is important as it both reveals genes that regulate progenitor cell behavior and, as most of the selected mutant genes are frequently mutated in cancer, gives insight into the earliest stages of neoplastic transformation. The identification of pharmacologically druggable mutations may also open opportunities for cancer prevention.
One gene that is recurrently mutated in normal human esophagus is PIK3CA, which encodes the p110α catalytic subunit of phosphoinositide 3-kinase (PI3K) (Yizhak et al., 2019; Yokoyama et al., 2019). PI3K is a signaling hub activated by insulin and growth factors to regulate a broad range of processes including cell proliferation, survival, growth and metabolism, mainly through the activation of the Akt/mTOR signaling axis (Fruman et al., 2017; Madsen et al., 2018). Activating PIK3CA mutations are recurrently found in solid tumors including esophageal squamous cell carcinoma (ESCC), and in overgrowth syndromes and vascular malformations (Castel et al., 2016; Castillo et al., 2016; Consortium, 2017; Madsen et al., 2018).
A review of published data identified 74 missense PIK3CA mutant clones in 18 cm² of histologically normal human esophageal epithelium. PIK3CA missense mutations generate particularly large clones in the human esophagus (Figure 1A). Indeed, PIK3CA is the mutant gene with the second highest average Variant Allele Fraction (VAF) (Figure 1B). The VAF corresponds to the proportion of sequence reads detected for each DNA variant and, in diploid cells, is proportional to the size of the mutant clones carrying the mutation in the sample analyzed. These mutant clones were significantly enriched for pathogenic or likely pathogenic PIK3CA activating mutations according to the ClinVar mutation database (https://clinvarminer.genetics.utah.edu/variants-by-gene/PIK3CA): 47% (35/74) of the observed PIK3CA mutations were activating, well above the 1% expected under neutrality (p = 2 × 10⁻²⁷, two-tailed binomial test, see methods). Notably, mutations classified as pathogenic or likely pathogenic had a significantly larger clone size (higher VAF) than the remaining PIK3CA missense mutants or the total 603 silent mutations in all tested genes, which have no effect on cell behavior (Figure 1C). The most prevalent pathogenic mutation was PIK3CA H1047R (35% (8/23) of all pathogenic mutation events) (Figure 1D), which had a significantly higher VAF than the other PIK3CA missense mutations (Figure 1E).
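The two quantities used in this paragraph, the VAF and a binomial enrichment probability, can be illustrated with a minimal Python sketch. This is not the authors' analysis pipeline: the function names are hypothetical, the factor of two linking VAF to clone size assumes a heterozygous mutation in diploid cells with no copy-number change, and only the upper tail of the binomial distribution is computed. Because the 1% expectation quoted in the text is rounded and the exact test is described in the methods, the value obtained here should not be expected to reproduce the reported p-value.

```python
from math import comb


def vaf(alt_reads: int, total_reads: int) -> float:
    """Variant allele fraction: share of sequence reads carrying the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def clone_cell_fraction(vaf_value: float) -> float:
    """Fraction of sampled cells carrying the mutation.

    Assumption (not stated in the source): heterozygous mutation, diploid
    cells, no copy-number change, so each mutant cell contributes one
    mutant allele out of two.
    """
    return min(1.0, 2.0 * vaf_value)


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Counts quoted in the text: 35 activating mutations among 74 missense clones,
# compared with a (rounded) 1% expectation under neutrality.
observed_fraction = 35 / 74
tail_probability = binomial_upper_tail(35, 74, 0.01)
print(f"observed fraction: {observed_fraction:.2f}")
print(f"upper-tail probability: {tail_probability:.3g}")
```

Whatever the exact expected fraction used by the authors, the point of the calculation is the same: an observed fraction near one half is far outside what a neutral expectation of about one in a hundred can produce by chance.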
Taken together, these findings indicate that activating PIK3CA mutants, and particularly PIK3CA H1047R, may drive large clonal expansions in normal human esophagus. In light of these observations, we set out to investigate the mechanisms by which Pik3ca H1047R mutant progenitors colonize the normal esophagus, using mouse esophageal epithelium as a model. The tissue consists of layers of keratinocytes, with progenitor cells residing in the deepest, basal cell layer. Differentiating cells exit the cell cycle and leave the basal layer, migrating towards the epithelial surface where they are shed (Doupe et al., 2012; Piedrafita et al., 2020) (Figure 2A). Each progenitor division generates either two progenitor daughters, two non-dividing differentiating cells or one cell of each type (Figure 2B) (Doupe et al., 2012; Piedrafita et al., 2020). The probabilities of these outcomes are balanced across the progenitor population so that one progenitor and one differentiating daughter are produced from an average division. This ensures that equal numbers of progenitor and differentiating cells are generated, maintaining tissue homeostasis. Mutations may tilt the balance of cell production towards proliferation, driving the clonal expansion of mutant progenitors (Alcolea et al., 2014; Fernandez-Antoran et al., 2019; Murai et al., 2018) (Figure 2B).
To explore the effect of activating Pik3ca mutations on progenitor cells, we developed a new transgenic mouse model that allowed us to track the fate of individual progenitors induced to express a single allele of the Pik3ca H1047R mutant in a background of wild type cells. We found that this mutation tilts progenitor cell fate so that an excess of mutant progenitors is produced per average division, a proliferative advantage that promotes mutant clone expansion. We show that Pik3ca H1047R activates glycolysis via the HIF1α transcription factor in esophageal cells and that levelling the competitive 'playing field' with agents that regulate glycolysis blocks the competitive advantage of Pik3ca H1047R mutant cells in vitro and in vivo.
Figure 2 legend. In homeostasis, the probabilities of each outcome are balanced in order to produce equal numbers of progenitor cells and differentiating cells across the epithelium. Mutations under positive selection may tilt the cell fate towards proliferation, producing excess progenitor cells. (C) Schematic illustration of the conditional targeted allele in the Pik3ca locus. Pik3ca exon 20 was flanked by loxP sites (triangles). The engineered duplicate region of Pik3ca exon 20 contains the H1047R mutation, and sequences coding for a self-cleaving T2A peptide and an enhanced Yellow Fluorescent Protein (EYFP) followed by a nuclear localization signal (NLS). Prior to Cre-mediated recombination, the wild type p110α protein is expressed; after Cre-mediated recombination, the allele co-expresses p110α H1047R mutant protein and EYFP-NLS. Cre recombination was mediated by crossing the conditional mutant strain with AhCre ERT mice, which express the Cre recombinase upon treatment with β-naphthoflavone and tamoxifen. (D) Typical confocal z stack image of a region of an esophageal epithelial whole-mount, viewed top down, from an AhCre ERT Pik3ca H1047R/wt mouse 3 months after induction with β-naphthoflavone and tamoxifen. An optical section through the basal cell layer is shown; Pik3ca H1047R/wt clones are indicated by white arrows. YFP immunofluorescence (green); nuclei are stained with DAPI (blue). Scale bar, 20 μm.

Generation of inducible Pik3ca H1047R-YFP knock-in mice
The majority of Pik3ca mutant clones found in the normal esophagus carry an activating H1047R mutation in one allele of the Pik3ca locus. To determine whether induction of a heterozygous Pik3ca H1047R allele altered progenitor cell behavior, we developed a new conditional mouse strain, Pik3ca fl-H1047R-T2A-YFP-NLS (henceforth referred to as Pik3ca H1047R-YFP ).
A conditional allele of Pik3ca H1047R, with a nuclear localized Yellow Fluorescent Protein (YFP) reporter linked to the C-terminus of the Pik3ca H1047R protein by a T2A self-cleaving peptide (Trichas et al., 2008), was targeted to the Pik3ca locus. Following recombination mediated by Cre recombinase, the wild type exon 20 of Pik3ca is excised and replaced by the mutant exon 20 encoding Pik3ca H1047R (Figure 2C). This design allowed us to track individual Pik3ca H1047R-YFP cells in a Pik3ca wt/wt background, as the recombined mutant cells and their progeny stain positive for YFP (Figure 2D). Following T2A cleavage, a 20-amino-acid peptide remained at the C terminus of Pik3ca H1047R protein. We found that a C-terminally extended p110α H1047R protein could still activate the PI3K/Akt pathway upon transient transfection in NIH3T3 fibroblasts (Figure S1A) and also induce modest Akt activation upon in vitro recombination of primary esophageal keratinocytes from Pik3ca H1047R-YFP mice (Figure S1B-F).
Figure S1 legend. ... H1047R-YFP/wt). Cells were treated in medium without added growth factors and 0.1% serum (Starved, STV) or medium with 20% serum and growth factors (FCS) or FCS plus the PI3K inhibitor LY294002 50 µM. Cells were lysed and protein lysates were analyzed by immune capillary electrophoresis. (C-D) Results from immune capillary electrophoresis of lysates from protocol shown in (B). Total AKT protein and phospho-AKT(Ser473) (C) and total PRAS40 protein and phospho-PRAS40 (D). Two-tailed ratio paired t-test. n=4 biological replicates. (E-F) Gene set enrichment analysis (GSEA) histograms of PI3K/Akt/mTOR and mTOR signaling Hallmark gene sets comparing RNA-seq data from induced Pik3ca H1047R-YFP/wt and uninduced cells from the same animals maintained in minimal FAD medium. The nominal p-value, the normalized enrichment score (NES) and the false discovery rate (FDR) q-value are indicated. n=4 independent replicates per condition from one animal each.
(Clayton et al., 2007; Doupe et al., 2012; Kemp et al., 2004). As controls, we induced AhCre ERT Rosa26 flYFP/wt (henceforth termed Cre-RYFP) animals that are Pik3ca wild type and express a neutral YFP reporter after recombination. At multiple time points up to 6 months after induction, animals were culled (Figure 3A), the esophagus collected and the entire epithelium isolated, stained and imaged in 3D using confocal microscopy. The number and location of cells within individual clones were then recorded, and control and mutant clone size distributions were compared (Table S1).

Pik3ca H1047R-YFP/wt mutant cells outcompete wild type cells in esophageal epithelium
Within 10 days of induction we observed that the total number of cells in Pik3ca H1047R-YFP/wt clones was larger than that of wild type controls, the difference increasing progressively over 6 months (Figures 3B and C). The mutant clones had a higher proportion of basal cells and fewer differentiated (suprabasal) cells compared with controls (Figure 3D). These observations indicate Pik3ca H1047R-YFP/wt cells have a competitive advantage over their wild type neighbors and suggest that mutant progenitors may generate a lower proportion of differentiating and more proliferating progeny than their wild type equivalents.
Figure 3 legend. AhCre ERT Rosa26 flEYFP/wt reporter mice (Cre-RYFP) and AhCre ERT Pik3ca flH1047R-T2A-YFP-NLS/wt mice (Cre-Pik3ca H1047R-YFP/wt) were induced with β-naphthoflavone and tamoxifen and tissue collected at the indicated time points. Wild type (WT) clones from Cre-RYFP mice and Pik3ca H1047R-YFP/wt clones from Cre-Pik3ca H1047R-YFP/wt animals were imaged and cell numbers in each clone quantified. n=317-1522 clones from 4-6 animals per condition (see Table S1 for values). (B) Average total cells per clone (basal layer plus first suprabasal layer cells, left panel) and basal cells per clone (right panel) over time, for all clones with at least one basal cell. Dots indicate the average clone size in a single mouse. Lines and shaded areas represent the best fitting model for the clone size distributions and its plausible intervals (see Supplementary Text). Mean and standard error of the mean per condition are indicated in black. Two-tailed unpaired t-test. (C) Top down views of confocal images of representative clones (green) for Cre-RYFP reporter mice (top panels) and Cre-Pik3ca H1047R-YFP/wt mice (lower panels) 28 days and 168 days after induction. Nuclei are stained with DAPI (blue). An optical section through the basal cell layer is shown. Scale bars, 20 μm.
(D) Heatmaps representing the frequency of clone sizes with the number of basal and first suprabasal cells indicated, for Cre-RYFP and Cre-Pik3ca H1047R-YFP/wt animals. Black dots and dashed lines show geometric median clone size. Lower panels show the differences between Cre-Pik3ca H1047R-YFP/wt and Cre-RYFP animals for each time point. 2D Kolmogorov-Smirnov test. (E) Basal cells (DAPI, blue) of a typical EE whole mount showing Pik3ca H1047R-YFP/wt cells (green), and EdU + basal cells (red). Scale bar, 20 μm. (F) Percentages of EdU + , uninduced (WT) or Pik3ca H1047R-YFP/wt , basal cells quantified in the same tissues one month after induction. 52,898 cells were quantified from 3 animals including 3,514 Pik3ca H1047R-YFP/wt cells. Each dot corresponds to an animal. Mean and standard deviation are shown in black. n.s., not significant (Ratio paired t-test). (G) Average proportion of suprabasal cells per clone over time, counting basal and first suprabasal cells from clones shown in (D). Each dot corresponds to one animal and the lines connect means. Two-tailed unpaired t-test. (H) Proportion of floating clones per mouse. Each dot corresponds to one animal and the lines connect averages. Two tailed unpaired t-test. n=4-6 animals per condition (see Table S1 for values). (I) Schematic illustration of WT and Pik3ca H1047R-YFP/wt cell behavior. The model predictions for the proportions of each cell division fate for both genotypes are shown. A bias in the fate of Pik3ca H1047R-YFP/wt progenitors together with an increased proportion of symmetric divisions results in an increased production of mutant progenitor cells per average cell division in the esophageal epithelium, even with the rate of mutant cell division being the same as that of WT cells.
To test whether these results were due to genetic background differences between mutant and control mouse strains, we crossed Cre-Pik3ca H1047R-YFP/wt mice with the Rosa26 Confetti/wt strain (Snippert et al., 2014) (Figure S2A). This triple mutant allows tracking of Pik3ca H1047R-YFP/wt expressing clones (labelled with YFP) and non-recombined Pik3ca wild type clones, both labelled with red fluorescent protein (RFP), in the same esophagus (Figures S2B and C). The results confirmed that the mutant clones expanded more rapidly than wild type clones in the same mouse (Figures S2D and E). Therefore, we conclude that Pik3ca H1047R-YFP/wt progenitors have a competitive advantage over their wild type neighbor cells in the mouse esophageal epithelium.
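The clone-level summaries used throughout this section (average basal and total cells per clone, proportion of suprabasal cells, proportion of floating clones) can be sketched in Python as follows. This is an illustrative sketch rather than the authors' code: the function name and example counts are hypothetical, each clone is represented as a pair of basal and first-suprabasal cell counts, and restricting the averages to clones with at least one basal cell is an assumption taken from the wording of the figure legends.

```python
from statistics import mean

# Each clone is (basal cells, first-suprabasal cells); values are hypothetical.
Clone = tuple[int, int]


def summarize_clones(clones: list[Clone]) -> dict[str, float]:
    """Summary statistics for one animal's clones.

    Averages are taken over clones with at least one basal cell (assumption);
    floating clones are those with suprabasal cells but no basal cell.
    """
    nonempty = [c for c in clones if c[0] + c[1] > 0]
    attached = [c for c in nonempty if c[0] >= 1]
    if not attached:
        raise ValueError("need at least one clone with a basal cell")
    floating = [c for c in nonempty if c[0] == 0]
    return {
        "mean_basal_per_clone": mean(b for b, _ in attached),
        "mean_total_per_clone": mean(b + s for b, s in attached),
        "mean_suprabasal_fraction": mean(s / (b + s) for b, s in attached),
        "floating_fraction": len(floating) / len(nonempty),
    }


example_clones = [(1, 0), (2, 1), (4, 2), (0, 1), (3, 1)]
summary = summarize_clones(example_clones)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Comparing these per-animal summaries between control and mutant animals at each time point mirrors the comparisons described in the text, although the statistical tests themselves are not reproduced here.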

Pik3ca H1047R mutation biases progenitor cell fate towards proliferation
There are several possible cellular mechanisms that may underpin the increased size of Pik3ca H1047R mutant over neutral, wild type clones. To investigate whether altered cell cycle kinetics in the mutant cells were responsible for this effect, we induced and aged Cre-Pik3ca H1047R-YFP/wt mice for 1 month. One hour before tissue collection, animals were injected with 5-ethynyl-2′-deoxyuridine (EdU), which labels keratinocytes in S phase of the cell cycle.
Figure S2 legend. ... Table showing the possible color combinations obtained from (A) after immunofluorescence against GFP to detect Pik3ca H1047R-YFP/wt. GFP antibody recognizes GFP, EYFP and CFP together with EYFP expressed from the Pik3ca locus. Only RFP+ clones, whether wild type or Pik3ca H1047R-YFP/wt, can be distinguished and quantified. (C) Top down views of confocal images of an esophageal epithelium whole-mount of a Cre-Pik3ca H1047R-YFP/wt-Confetti animal 84 days after induction, showing an RFP+ Pik3ca wt/wt and an RFP+ Pik3ca H1047R-YFP/wt clone. Nuclei stained with DAPI (blue). Scale bar, 20 μm. (D) Heatmaps representing the frequency of clone sizes, with the number of basal and first suprabasal cells indicated, of RFP+ clones observed in Cre-Pik3ca H1047R-YFP/wt-Confetti animals. RFP+ clones from each animal were classified into Pik3ca wt/wt and Pik3ca H1047R-YFP/wt by immunofluorescence. Black dots and dashed lines show geometric median clone size. Lower panels show the differences between Pik3ca wt/wt and Pik3ca H1047R-YFP/wt clones. 2D Kolmogorov-Smirnov test. n=255-469 clones in total from 2-3 animals per condition (see Table S1 for values). (E) Average basal cells per clone over time, considering all clones with at least one basal cell. Dots indicate the average clone size of a mouse. Mean and standard error of the mean per condition are indicated in black. Lines and shaded areas represent the best fitting model for the clone size distributions shown in Figure 3 and its plausible intervals (see Supplementary Text).
(F) Confocal images of an esophageal epithelium basal layer whole-mount stained for activated Caspase 3 (red), GFP (Pik3ca H1047R-YFP/wt cells, green) and DAPI (blue). Left panel shows a UV irradiated sample as a positive control, exposed to ultraviolet radiation and maintained as explant culture. Middle and right panels are images from the same tissue showing an apoptotic cell (red arrow) and a Pik3ca H1047R-YFP/wt clone (green arrow). Scale bars, 20 μm. (G-H) Results of parameter inference for wild type (G) and mutant (H) progenitor cells. Distributions of the number of basal cells/clone were fitted (see Supplementary Text) and heatmaps show most-likely parameter values according to likelihood inference for a neutral single-progenitor model with balanced fates (wild type, G) and when extending the analysis to a single-progenitor model with imbalanced fates, displaying selection (mutant, H). In the latter case, different values for the parameter of division fate bias D were considered (D=0 means neutral behavior). Red asterisk: maximum likelihood estimate (MLE). Colored regions fall within 95% CI (uncolored regions are out of bounds). (I) Distributions of wild type (light grey) and mutant (dark grey) clone sizes. Number of basal cells/clone, and number of total (basal + first suprabasal) cells/clone are displayed (top and bottom panels, respectively) (sizes grouped in powers of two). Error bars: experimental mean ± s.e.m. Overlaid are MLE model fits (shaded areas represent 95% plausible intervals given the total number of clones counted at each time point). (J) Theoretical prediction for the effect of a progenitor fate imbalance on the relative proportion of first suprabasal cells. The first suprabasal-to-total cell ratio decreases with D*r following a rational decay function (see Supplementary Text) (initial departure point corresponds to wild type value). (K) Proportion of floating clones (i.e. suprabasal clones having no basal attachment) over time. 
Yellow and green dots correspond to values in individual wild type and mutant mice, respectively (error bars: mean ± s.e.m.). Overlaid are MLE model fits once changes in suprabasal-to-total cell ratio over time were considered (shaded areas defined as in I).
Another potential mechanism of cell competition is the induction of apoptosis in neighboring cells (de la Cova et al., 2014). However, there was a negligible level of apoptosis in wild type cells, whether adjacent to or distant from mutant clones in induced Cre-Pik3ca H1047R-YFP/wt animals (Figure S2F).
Finally, we investigated whether Pik3ca H1047R-YFP/wt clone behavior could be explained by altered progenitor cell fate. Even when cell division rates are similar, mutant populations could still expand by producing more progenitor than differentiating daughter cells per average cell division, as previously observed with some other genetic mutants (Alcolea et al., 2014;Murai et al., 2018;Piedrafita et al., 2020). Mathematical modeling revealed that wild type clones follow a neutral model of cell competition, as described previously, with equal proportions of proliferating and differentiating cells produced from the average cell division (Figures 3I and  S2G) (Doupe et al., 2012;Piedrafita et al., 2020). However, Pik3ca H1047R-YFP/wt clone dynamics is explained by a non-neutral model where mutant progenitors have altered division outcome probabilities (Supplementary text). There is a fate bias towards proliferation, so that, over the mutant progenitor population, the average cell division generates an excess of proliferating over differentiating progeny, explaining the clonal growth advantage over wild type cells ( Figure 3I and S2H). This simple model fits both the observed basal cell and total (basal plus suprabasal) clone size distributions and averages (Figures 3B and S2I). The model also predicts that a progenitor fate imbalance should result in a decreased proportion of suprabasal cells per clone ( Figure S2J) and a reduction in the number of fully differentiated clones lacking any basal cells (floating clones) in the mutant ( Figure S2K, Supplementary Text). These predictions also fit with experimental data since both the proportion of suprabasal cells per clone and floating clones were significantly reduced in the mutant (Figures 3G and H).
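The idea of a fate bias can be made concrete with a small Python sketch of a single-progenitor model. The parameterization below is an assumption for illustration, not the fitted model from the Supplementary Text: each division yields two progenitors with probability r(1 + D), one progenitor and one differentiating cell with probability 1 - 2r, and two differentiating cells with probability r(1 - D), so that D = 0 is the neutral case. Divisions are simulated in synchronous rounds, which ignores division timing and stratification, and all parameter values are hypothetical.

```python
import random


def division_probabilities(r: float, D: float) -> tuple[float, float, float]:
    """Return (P->PP, P->PD, P->DD) probabilities for symmetric-division
    probability r and fate bias D (assumed parameterization)."""
    if not 0 < r <= 0.5 or not -1 <= D <= 1:
        raise ValueError("require 0 < r <= 0.5 and -1 <= D <= 1")
    return r * (1 + D), 1 - 2 * r, r * (1 - D)


def expected_progenitors_per_division(r: float, D: float) -> float:
    """Mean number of progenitor daughters per division: 1 + 2*r*D."""
    p_pp, p_pd, _ = division_probabilities(r, D)
    return 2 * p_pp + p_pd


def simulate_clone(r: float, D: float, n_rounds: int, rng: random.Random) -> int:
    """Number of progenitors in a clone founded by one progenitor after
    n_rounds synchronous rounds of division (a simplification)."""
    p_pp, p_pd, _ = division_probabilities(r, D)
    progenitors = 1
    for _ in range(n_rounds):
        next_generation = 0
        for _ in range(progenitors):
            u = rng.random()
            if u < p_pp:
                next_generation += 2  # two progenitor daughters
            elif u < p_pp + p_pd:
                next_generation += 1  # one progenitor, one differentiating cell
            # otherwise both daughters differentiate and leave the basal layer
        progenitors = next_generation
    return progenitors


rng = random.Random(0)
for label, bias in (("neutral", 0.0), ("biased", 0.3)):
    sizes = [simulate_clone(0.1, bias, 30, rng) for _ in range(2000)]
    surviving = [s for s in sizes if s > 0]
    print(label, expected_progenitors_per_division(0.1, bias),
          sum(surviving) / max(1, len(surviving)))
```

With D = 0 each division yields exactly one progenitor on average, so the progenitor population is stable; any positive D raises that average above one without changing the division rate, which is the behavior the text attributes to the mutant clones.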
Therefore, Pik3ca H1047R-YFP/wt keratinocytes show a bias in basal cell fate towards the generation of more progenitor than differentiating daughters, resulting in mutant cells having a competitive advantage over wild type cells in the esophageal epithelium ( Figure 3I,

Pik3ca H1047R-YFP/wt mutant cell fitness depends on the level of PI3K pathway activation in neighboring wild type cells
To further investigate the basis of the mutant cell advantage over wild type cells we used a 3D stratified primary culture system suitable for long-term cell competition studies. We generated primary esophageal keratinocyte cultures from Rosa RYFP/RYFP (Pik3ca wt/wt, henceforth referred to as WT-RYFP) and Pik3ca H1047R-YFP/wt mice, and induced recombination by infecting these cultures with adenovirus encoding Cre recombinase (Figures S3A and B).
Figure S3 legend. Primary esophageal keratinocytes were isolated from uninduced Rosa26 RYFP/RYFP animals (A) or uninduced Pik3ca H1047R-YFP/wt animals (B). Cells were incubated either with Cre-expressing adenovirus (Ad-Cre) or null adenovirus (Ad-Null). Right panels show a representative image of an immunofluorescence against YFP (green) and DAPI (blue) of Ad-Cre or Ad-Null treated cultures. Scale bars, 20 μm. (C) Cell competition experimental protocol. Pik3ca H1047R-YFP/wt and Pik3ca wt/wt primary keratinocytes obtained in (B) were mixed with WT-RYFP cells obtained in (A). Upon reaching confluence, cultures were changed to minimal FAD medium or the specified treatment. Cells were collected at the start and end of the treatment and the proportion of WT-RYFP cells was quantified by flow cytometry. The proportion of WT-RYFP after treatment was normalized to the initial WT-RYFP cell proportion. (D) GSEA histograms of PI3K/Akt/mTOR and mTOR signalling Hallmark gene sets comparing RNA-seq data from control (CTL) and 5 μg/ml insulin (+INS) treated Pik3ca H1047R-YFP/wt cells from the same animals. The nominal p-value, the normalized enrichment score (NES) and the false discovery rate (FDR) q-value are indicated. n=4 independent replicates per condition from one animal each. (E) Glucose levels in urine of the mice used in Figure 4 I-K at the day of induction. Two-tailed Mann-Whitney test. n=4 animals per condition.

Due to the much higher level of YFP reporter expression in WT-RYFP compared with
For the cell competition studies, we mixed WT-RYFP keratinocytes with either induced or uninduced Pik3ca H1047R-YFP/wt cells and followed how the proportion of WT-RYFP cells changed over time (Figures 4A and S3C). We first established that when uninduced Pik3ca H1047R-YFP/wt cells were mixed with WT-RYFP cells, the proportion of cells of both strains remained constant over time, meaning their competition is neutral ( Figure 4B upper panels and C). In addition, the ratio of suprabasal:basal cells was similar in both subpopulations after 14 days of cell competition ( Figure 4D). However, when induced Pik3ca H1047R-YFP/wt and WT-RYFP cells were co-cultured, the Pik3ca H1047R-YFP/wt cells almost completely took over the culture within 28 days ( Figure 4B lower panels and C). Moreover, the suprabasal:basal ratio of induced Pik3ca H1047R-YFP/wt cells was lower than for the WT-RYFP cells in the same culture ( Figure 4D).
We conclude that Pik3ca H1047R-YFP/wt mutant cells retain their competitive advantage over wild type cells in vitro.
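The readout of the competition assay, the proportion of WT-RYFP cells normalized to its starting value, reduces to simple arithmetic that can be sketched in Python. The function names and cell counts below are hypothetical and are not taken from the flow cytometry data.

```python
def proportion(labelled_cells: int, total_cells: int) -> float:
    """Fraction of cells in a sample that carry the WT-RYFP label."""
    if total_cells <= 0 or not 0 <= labelled_cells <= total_cells:
        raise ValueError("need 0 <= labelled_cells <= total_cells and total_cells > 0")
    return labelled_cells / total_cells


def normalized_wt_proportion(start: float, end: float) -> float:
    """WT-RYFP proportion at the end of treatment relative to the start.

    A value near 1 indicates neutral competition; a value below 1 means
    the WT-RYFP cells lost ground to the co-cultured population.
    """
    if start <= 0:
        raise ValueError("starting proportion must be positive")
    return end / start


# Hypothetical counts: WT-RYFP cells fall from 50% to 10% of the culture.
start = proportion(5_000, 10_000)
end = proportion(1_000, 10_000)
print(f"normalized WT-RYFP proportion: {normalized_wt_proportion(start, end):.2f}")
```

Normalizing to the starting proportion makes cultures seeded at slightly different ratios comparable, so that only the change over the course of the experiment is compared between conditions.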
If the advantage of Pik3ca H1047R-YFP/wt mutation depends on over-activation of the PI3K/mTOR axis, we reasoned that activating this pathway in wild type cells would decrease the fitness advantage of the mutant cells, by levelling out the signaling differences between the two genotypes. High concentrations of insulin activate PI3K/mTOR via the insulin and IGF1 receptors (Boucher et al., 2010). Thus, in a mixed culture both wild type and mutant cells might experience strong PI3K/mTOR activation. We treated mixed cultures with a dose of insulin that induced transcriptional changes consistent with PI3K/mTOR pathway activation in both wild type and mutant cells (Figures 4E and S3D). The advantage of the Pik3ca H1047R-YFP/wt over Pik3ca wt/wt cells was substantially reduced by insulin treatment (Figures 4F and G). In addition, insulin treatment lowered the suprabasal:basal ratio in wild type cells to a level close to that seen when they competed with uninduced Pik3ca wt/wt cells (Figure 4D). We conclude that differential activation of the PI3K pathway underpins the competitive advantage of mutant over wild type cells (Figure 4H).
Figure 4 legend. ... Figure S3). Once a confluent culture was achieved, cells were kept for 28 days in culture with minimal FAD medium, or the specified treatment, for the duration of the experiment. Samples were collected at the start of the treatment and at 14 and 28 days. (B) Confocal z stack image representative of the specified mixed culture and time of treatment. An optical section through the basal cell layer is shown. YFP immunofluorescence (yellow), nuclei are stained with DAPI (blue). Scale bar, 20 μm. (C) Quantification by flow cytometry of the proportion of WT-RYFP cells versus the start of the experiment at the specified time points. Each dot represents a primary culture from a different animal. n=10-11 primary cultures from individual animals per condition. Two-tailed ratio paired t-test.
(D) Proportion of suprabasal versus basal cells of each subpopulation in mixed cultures 14 days after the start of the experiment. +INS indicates cultures treated with 5 μg/ml insulin. n=10 primary cultures from individual animals per condition. n.s., not significant. Two-tailed unpaired t-test. (E) GSEA histograms of PI3K/Akt/mTOR and mTOR signalling Hallmark gene sets comparing RNA-seq data from control (CTL) and 5 μg/ml insulin (+INS) treated wild type cells from the same animals. The nominal p-value, the normalized enrichment score (NES) and the false discovery rate (FDR) q-value are indicated. n=4 independent replicates per condition from one animal each. (F) Confocal z stack image representative of the specified mixed culture and condition after 28 days of continuous treatment. An optical section through the basal cell layer is shown. YFP immunofluorescence (yellow), nuclei are stained with DAPI (blue). Scale bar, 20 μm. (G) Quantification by flow cytometry of the proportion of WT-RYFP cells mixed with Pik3ca H1047R-YFP/wt cells, at the specified time points versus the start of the experiment. Cells were maintained either in minimal FAD medium or with 5 μg/ml insulin (+INS) for the duration of the experiment. Each dot represents a primary culture from an animal. n=10-11 primary cultures from individual animals per condition. Two-tailed ratio paired t-test. (H) Summary of the results representing Pik3ca H1047R-YFP/wt (Mut) versus wild type (WT) competition in control (Low insulin) and insulin (High Insulin) conditions, in relation to the PI3K pathway activation in each subpopulation. (I) Experimental protocol: Cre-Pik3ca H1047R-YFP/wt mice were bred into Ins2 Akita/wt (Akita Het) mice obtaining Cre-Pik3ca H1047R-YFP/wt-Akita Het and Cre-Pik3ca H1047R-YFP/wt-Akita WT littermates. After diabetes development in Akita Het mice, they were induced with β-naphthoflavone and tamoxifen and collected after 28 days.
(J) Heatmaps representing the frequency of Pik3ca H1047R-YFP/wt clone sizes, with the number of basal and first suprabasal cells indicated, observed in Cre-Pik3ca H1047R-YFP/wt -Akita Het and Cre-Pik3ca H1047R-YFP/wt -Akita WT littermates (left panels). Heatmap showing the differences in Pik3ca H1047R-YFP/wt clone sizes between Cre-Pik3ca H1047R-YFP/wt -Akita Het and Cre-Pik3ca H1047R-YFP/wt -Akita WT . n=257 and 402 clones respectively, from 4 animals per condition (see Table S1 for numbers).
These results suggest that insulin levels may alter the competitiveness of Pik3ca H1047R-YFP/wt clones in vivo. To explore this, we turned to the Akita mouse model of type 1 diabetes, which harbors a mutation in the insulin-2 gene that results in reduced circulating insulin levels as mice age (Oyadomari et al., 2002; Yoshioka et al., 1997). We bred Cre-Pik3ca H1047R-YFP/wt mice onto an Akita Het (diabetic) or Akita wt (non-diabetic) background. Clonal recombination was induced after the onset of diabetes in Akita Het mice (Figure S3E) and clones were analyzed one month after induction (Figure 4I). Pik3ca H1047R-YFP/wt clones showed a reduced proportion of differentiated cells per clone in the diabetic background compared to non-diabetic littermates (Figures 4J and K), indicating that the fitness of Pik3ca H1047R-YFP/wt mutant cells relative to wild type cells is higher when insulin levels in blood are low. Therefore, both in vitro and in vivo, insulin levels modulate Pik3ca H1047R-YFP/wt cell competition with wild type cells.

Pik3ca H1047R mutation activates HIF1α and aerobic glycolysis
The results above argue that mutant cells may have a differential activation of the pathways downstream of PI3K to gain their competitive advantage over wild type cells. To investigate this, we compared the gene expression of induced and uninduced Pik3ca H1047R-YFP/wt cultures generated from the same mice. RNA sequencing revealed 301 upregulated and 195 downregulated transcripts (adjusted p-value<0.05) in the mutant cells (Figures 5A and B).
Of the upregulated genes (transcripts with an adjusted p-value<0.01), 47% were known or predicted direct targets of the HIF1α transcription factor (Figure 5C), a downstream effector of the PI3K/mTOR pathway (Denko, 2008; Rohwer et al., 2019; Xie et al., 2019). Consistent with activation of the PI3K/mTOR/HIF1α axis by the Pik3ca H1047R mutation, GSEA and KEGG pathway analysis showed an enrichment of the Hypoxia gene set and the HIF1α signaling pathway (Figures 5D and E). HIF1α switches cell metabolism from mitochondrial oxidative phosphorylation towards aerobic glycolysis, the metabolic conversion of glucose to lactate in the presence of oxygen to produce energy (Denko, 2008). Consistent with this function of HIF1α, gene expression analysis showed a significant upregulation of genes encoding all glycolysis pathway enzymes in Pik3ca H1047R cells (Figures 5E, F and G). In the mutant cells, the expression of HIF1α target genes that promote a metabolic switch to aerobic glycolysis (Higd1a, Bhlhe40, Bnip3, Pdk1, Ndufa4l2 and Pfkfb3) (Ameri et al., 2015; Chang et al., 2019; Kim et al., 2006; Rikka et al., 2011; Tello et al., 2011; Yi et al., 2019) and of genes that promote the export of lactate and protons to reduce the intracellular acidification derived from a glycolytic metabolism was also increased (Dovmark et al., 2017; Mboge et al., 2018) (Figures 5F and G). To confirm the glycolytic switch in mutant cells we used high resolution respirometry, which allows the measurement of the oxygen consumption rate (OCR, proportional to mitochondrial oxidation) and the extracellular acidification rate (ECAR, proportional to the glycolytic conversion of glucose to lactate) (Figure 5F). The ratio of OCR to ECAR indicates whether cells are more oxidative or more glycolytic. Pik3ca H1047R-YFP/wt keratinocytes have a significantly reduced OCR/ECAR ratio, confirming a shift to aerobic glycolysis (Figure 5H).
We conclude that HIF1α is activated in Pik3ca H1047R-YFP/wt cells and drives a switch to aerobic glycolysis ( Figure 5I).
These results suggest HIF1α may be a key effector of the Pik3ca H1047R-YFP/wt cell phenotype.
To test this, we treated mixed cultures of induced Pik3ca H1047R-YFP/wt and WT-RYFP cells with the HIF1α inhibitor PX478 (Welsh et al., 2004) (Figure 5J). The advantage of mutant over wild type cells was significantly reduced in the presence of this inhibitor (Figure 5K), arguing that activation of HIF1α contributes to the competitive advantage of Pik3ca H1047R-YFP/wt mutant cells.
We showed above that treatment with a high dose of insulin reduced the mutant cell advantage by causing PI3K pathway over-activation in wild type and mutant cells (Figure 4E-G and S3D).
We speculated that this treatment may act via HIF1α and glycolysis activation. Transcriptional analysis showed that 82% of the genes upregulated in the Pik3ca H1047R-YFP/wt mutant cells were also induced in wild type cells upon insulin treatment (Figure S4A

is produced by the protons exported to the extracellular space during aerobic glycolysis. (G) Heatmaps comparing uninduced Pik3ca wt/wt (WT) and Pik3ca H1047R-YFP/wt primary keratinocytes in control (CTL) or treated with 5 μg/ml insulin (+INS). Heatmaps show glycolysis pathway genes, genes regulating aerobic glycolysis, and genes regulating the intracellular pH and lactate transport. Statistical tests are performed between CTL samples. None of those genes was differentially expressed between Pik3ca wt/wt and Pik3ca H1047R-YFP/wt in the +INS condition. ***p<0.001 and n.s., not significant. Wald test corrected for multiple testing using the Benjamini and Hochberg method. (H) Basal OCR to ECAR ratios of Pik3ca wt/wt (WT) and Pik3ca H1047R-YFP/wt primary keratinocytes in control (CTL) or treated with 5 μg/ml insulin (+INS). Basal OCR and ECAR were assessed using the Seahorse Extracellular Flux Analyser. Each dot represents primary cells obtained from one animal (n=4 animals) using 4-5 technical replicates per animal. OCR/ECAR ratios are presented as average and standard deviation. n.s., not significant. Two-tailed ratio paired t-test. (I) Model showing how Pik3ca H1047R-YFP/wt cells activate HIF1α which in turn activates the aerobic glycolysis through its target genes. (J) In vitro cell competition assay. In vitro induced Rosa26 flYFP/flYFP primary esophageal keratinocytes (WT-RYFP) were mixed with in vitro induced Pik3ca H1047R-YFP/wt cells or uninduced controls from the same animals. Once a confluent culture is achieved, cells were treated for 28 days in culture with minimal FAD medium +/- PX-478 (10 µM). Samples were collected at the start of the treatment and at 28 days and the proportion of WT-RYFP at the end of the experiment versus at the beginning was calculated. (K) Quantification by flow cytometry of the proportion of WT-RYFP cells from (J). Each dot represents a primary culture from an animal and mean and standard deviation are shown. n=5-11. Two-tailed unpaired t-test.

Metformin and DCA neutralize the competitive advantage of Pik3ca H1047R-YFP/wt clones
To test the metabolic dependency of the competitive advantage of Pik3ca H1047R-YFP/wt cells, we investigated two complementary approaches to reduce the imbalance in glycolysis between mutant and wild type cells. We tested metformin (MET), a widely used antidiabetic agent that enhances aerobic glycolysis (Martin-Montalvo et al., 2013), and dichloroacetate (DCA), a nonselective agent that inhibits pyruvate dehydrogenase kinase-1 (PDK1), thereby forcing glycolysis-derived pyruvate to be oxidized in the mitochondria instead of being transformed into lactate and thus favoring glucose oxidation at the expense of aerobic glycolysis (Michelakis et al., 2008) (Figures 6A and S5A). Mixed cultures of Pik3ca H1047R-YFP/wt and WT-RYFP cells were treated with either MET or DCA (Figure 6B). Both agents reduced the expansion of Pik3ca H1047R-YFP/wt cells in vitro, suggesting that the competitive advantage of Pik3ca H1047R-YFP/wt mutant cells is attenuated by reducing metabolic differences between mutant and wild type cells (Figures 6C, D and E).

Finally, we determined whether MET and DCA can both inhibit clonal expansion in vivo, using Cre-Pik3ca H1047R-YFP/wt and Cre-RYFP mice. Animals were induced and treated with MET or DCA for one month, after which clone sizes were analyzed (Figure 6F). Both MET and DCA as separate treatments reduced mutant clone size in vivo and increased the proportion of differentiated cells towards wild type levels (Figures 6G-J and S5B). These results support the hypothesis that reducing glycolytic differences between Pik3ca H1047R-YFP/wt mutant and neighboring wild type cells lowers the bias towards proliferation and hence the competitive fitness of mutant progenitors (Figure 6K).

mM. An optical section through the basal cell layer is shown. YFP immunofluorescence (yellow), nuclei are stained with DAPI (blue). Scale bar, 20 μm. (D-E) In vitro cell competition assays. Proportion of WT-RYFP cells, mixed either with induced Pik3ca H1047R-YFP/wt cells or uninduced controls, versus the start of the experiment, at the specified time points. Cells were treated in minimal FAD medium or minimal FAD medium with MET 2.5 mM (B) or DCA 25 mM (C) for the duration of the experiment. Each dot represents a primary culture from an animal. n.s., not significant. n=3-16 primary cultures coming from different animals. Two-tailed ratio paired t-test. (F) Experimental protocol. Cre-RYFP reporter mice and Cre-Pik3ca H1047R-YFP/wt mice were induced with β-naphthoflavone and tamoxifen. They were treated with MET or DCA for the duration of the experiment and collected 28 days after induction. (G) Heatmaps showing the frequency of clone sizes with the number of basal and first suprabasal cells indicated observed in animals from (F). Black dots and dashed lines indicate geometric median clone size. n=311-917 clones from 5-10 animals per condition (see Table S1 for numbers). (H) Heatmaps showing the differences between each treatment and control in Cre-RYFP (upper panels) or Cre-Pik3ca H1047R-YFP/wt (lower panels) animals. n=311-917 clones from 5-10 animals per condition (see Table S1 for numbers).
Table S1 for numbers).

Discussion
The results presented here show that a subtle activation of the PI3K pathway caused by a heterozygous activating missense mutation in Pik3ca is sufficient to drive clonal expansion in normal esophageal epithelium.
The cellular mechanism underpinning the competitive advantage of mutant Pik3ca progenitors is a small increase in the probability of generating mutant progenitors over differentiated daughters per division, with no detectable acceleration in the cell cycle. A similar change in mutant progenitor dynamics, an increase in the proportion of proliferating versus differentiating cells per average cell division, occurs with a Notch inhibiting mutant and mutant Trp53 in the mouse esophagus and skin respectively (Alcolea et al., 2014; Murai et al., 2018). It is striking that three disparate mutations under positive selection in human esophagus all result in a similar alteration in mutant cell dynamics. These observations indicate that altering progenitor cell fate is the common mechanism hijacked by mutations in different pathways to expand in squamous epithelia. Once clones collide with others of similar fitness, progenitor cell fate reverts towards balanced production of progenitor and differentiating cells.
A small fate imbalance towards proliferating cells also occurs in high grade dysplasias and carcinomas in the mouse esophagus (Frede et al., 2016). However, while esophageal tumors have the potential to grow, competition with other mutant clones in non-transformed tissue is constrained by the limited space available for mutant clone expansion (Frede et al., 2016).
We confirmed that in primary esophageal keratinocytes, Pik3ca H1047R expression activates glycolysis, as previously described in cell lines (Hu et al., 2016; Ilic et al., 2017; Jiang et al., 2018). However, the molecular basis of the effect of Pik3ca H1047R-YFP/wt on esophageal progenitor cell fate remains to be elucidated. Several studies of keratinocytes in the epidermis of the skin have observed a link between the level of glycolysis and the regulation of proliferation versus differentiation. For example, differentiation is increased by downregulation of glycolysis through activation of the aryl hydrocarbon receptor or inhibition of enolase (Sutter et al., 2019).
Similarly, deletion of the glucose transporter, Glut1, results in decreased proliferation and increased reactive oxygen species (ROS) (Zhang et al., 2018). Elevation of ROS drives keratinocyte differentiation in both skin and esophagus (Fernandez-Antoran et al., 2019; Hamanaka et al., 2013). Finally, HIF1α also promotes proliferation in human skin keratinocytes (Kim et al., 2018). Thus, our results suggest that the regulatory role for HIF1α and glycolysis in keratinocyte differentiation may be a common feature of stratified epithelia.
The expansion of Pik3ca H1047R-YFP/wt clones depends not only on the mutant cell phenotype but also the activity of the PI3K signaling axis and metabolic state of the adjacent wild type cells.
Reducing the differences in the PI3K/HIF1α/aerobic glycolysis axis between mutant and wild type cells attenuates the competitive advantage of the mutant (Figure 7). These observations parallel those of mutations that activate PI3K signaling in Drosophila, which confer a fitness advantage that may be modulated by alterations in insulin exposure or metformin (Nowak et al., 2013; Sanaki et al., 2020). Nevertheless, the relationship between aerobic glycolysis and competitive fitness seems context-dependent, resulting in elimination of glycolytic mutant cells in the mouse intestine but conferring a fitness advantage in the Drosophila wing disc (Banreti and Meier, 2020; Kon et al., 2017). Likewise, strong activation of PI3K signaling by biallelic expression of Pik3ca H1047R/H1047R in keratinocytes in the epidermis leads to their elimination by differentiation (Ying et al., 2018). Our findings hint that metabolic disease states such as insulin deficiency in type 1 diabetes, or treatments such as long-term metformin administration, may alter competitive selection of signaling mutants in normal tissues by modulating the advantage of clones with mutations that activate the PI3K pathway. Such remodeling of the normal tissue landscape may impact on the risk of neoplasia and represent a potential point of intervention in cancer prevention (Bradley et al., 2018; Carstensen et al., 2016; Yu et al., 2019). Beyond cancer, it is possible that part of the aging phenotype is due to the colonization of normal tissues by mutant clones (Blokzijl et al., 2016; Martincorena et al., 2018). If this is so, metformin, currently in clinical trials as an 'anti-aging' drug, may have unexpected benefits in suppressing the expansion of a subset of mutant cell clones in normal adult epithelia (Barzilai et al., 2016).

Mouse experiments
All experiments were approved by the local ethical review committees at the Wellcome Sanger Institute, and conducted according to Home Office project licenses PPL70/7543, P14FED054 and PF4639B40. Animals were maintained on a C57/Bl6 genetic background, housed in individually ventilated cages and fed on standard chow. Experiments were carried out with male and female animals and no gender-specific differences were observed. For lineage tracing experiments, the relevant floxed reporter lines were crossed onto the Ahcre ERT strain, in which transcription from a transgenic CYP1A1 (arylhydrocarbon receptor, Ah) promoter is normally tightly repressed until it is activated by β-naphthoflavone (Kemp et al., 2004). In this model tamoxifen promotes the nuclear translocation of Cre ERT protein to mediate recombination. For lineage tracing of control clones, Rosa26 flYFP/wt mice, which express yellow fluorescent protein (YFP) from the constitutively active Rosa 26 locus, were used (Srinivas et al., 2001). To assess the mutant and wild type clone growth in the same mouse,

Lineage tracing
Low frequency expression of EYFP in the mouse esophagus was achieved by inducing transgenic animals aged 10-16 weeks with a single intraperitoneal (i.p.) injection of 80 mg/kg β-naphthoflavone together with either 1 mg tamoxifen (Ahcre ERT Rosa26 flConfetti-Pik3ca fl-H1047R-T2A-EYFP-NLS/wt) or 0.25 mg tamoxifen (rest of mouse strains). Following induction, between three and eight mice per time point were culled and the esophagus collected. Time points analyzed include 10 days, 1, 3 and 6 months after the induction. As the expression from the endogenous Pik3ca locus is very low, immunofluorescence was necessary in order to detect the RYFP expression (Figure 2D). The total number of clones quantified for each figure can be found in Table S1. Normalized clone-size distributions were built for each experimental condition and time point from the observed relative frequencies f_{m,n} of clones of a certain size, containing m basal and n suprabasal cells, resulting in two-dimensional histograms, displayed as heatmaps using the CloneSizeFreq_2Dheat package (https://github.com/gp10/CloneSizeFreq_2Dheat). A 2D histogram of the residuals or differences observed between conditions in the relative frequencies of each particular clone size (i.e., each cell on the grid) was generated when appropriate.
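The construction of the normalized clone-size distributions and their residuals can be illustrated with a minimal Python sketch. This is not the CloneSizeFreq_2Dheat package; the function names and the toy clone lists are hypothetical, and each clone is assumed to be recorded as a (basal, suprabasal) pair of cell counts.

```python
from collections import Counter

def clone_size_frequencies(clones):
    """Relative frequency of each (m basal, n suprabasal) clone size."""
    counts = Counter(clones)
    total = sum(counts.values())
    return {size: c / total for size, c in counts.items()}

def frequency_residuals(freq_a, freq_b):
    """Difference in relative frequency (a minus b) for every observed clone size."""
    sizes = set(freq_a) | set(freq_b)
    return {s: freq_a.get(s, 0.0) - freq_b.get(s, 0.0) for s in sizes}

# Toy data (hypothetical): each tuple is (basal cells, suprabasal cells) in one clone.
control = [(1, 0), (1, 1), (2, 1), (2, 1)]
treated = [(1, 0), (3, 2), (3, 2), (2, 1)]

f_control = clone_size_frequencies(control)
f_treated = clone_size_frequencies(treated)
diff = frequency_residuals(f_treated, f_control)
```

Because both distributions are normalized, the residuals sum to zero, so positive cells in a residual heatmap are always balanced by negative ones.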

Quantitative analysis and mathematical modelling
For details of quantitative analysis of wild type and Pik3ca mutant progenitor cell lineage tracing data and the dynamics of mutant cells in the suprabasal cell layers, see Supplementary Text. Code used for this analysis has been made publicly available and can be found at https://github.com/gp10/DriverClonALTfate.

EdU in vivo proliferating cell analysis
One month after lineage tracing induction, 10 μg of EdU in PBS was administered by intraperitoneal injection 1 h before culling. Tissues were collected and stained with EdU-Click-iT kit and immunofluorescence as explained below. EdU-positive basal cells were quantified from a minimum of 10 z-stack images.

Primary keratinocyte 3D culture
After removing the muscle layer with fine forceps, esophageal explants were placed onto a transparent ThinCert™ insert (Greiner Bio-One) with the epithelium facing upward and the submucosa stretched over the membrane, and cultured in complete FAD medium (50:50)  Briefly, cells were incubated with adenovirus-containing medium supplemented with Polybrene (Sigma Aldrich, # H9268) (4 μg/ml) for 24 h at 37°C, 5% CO2. Cells were washed and fresh medium was added. Infection rates were > 90%.

Immunofluorescence and microscopy
For wholemount staining, the mouse esophagus was opened longitudinally, the muscle layer was removed and the epithelium was incubated for 1 h and 30 min in 20 mM EDTA-PBS at 37°C. The epithelium was peeled from underlying tissue and fixed in 4% paraformaldehyde in PBS for 30 min. Wholemounts were blocked for 1 h in blocking buffer (0.5% bovine serum albumin, 0.25% fish skin gelatin, 1% Triton X-100 and 10% donkey serum) in PHEM buffer (60 mM PIPES, 25 mM HEPES, 10 mM EGTA, and 4 mM MgSO4·7H2O). Anti-GFP antibody (1:4000, Life technologies A10262) was incubated over 3-5 days using blocking buffer, followed by several washes over 24 h with 0.2% Tween-20 in PHEM buffer. Where indicated, an additional overnight incubation with anti-caspase-3 (1:500, Abcam ab2302) was performed.
Finally, samples were incubated for 24 h with 1 μg/ml DAPI and secondary antibody anti-chicken (1:2000, Jackson ImmunoResearch 703-545-155) in blocking buffer. Clones were imaged on an SP8 Leica confocal microscope. Whole tissue or large tissue area images were obtained in most cases with the 20x objective with 1x digital zoom, optimal pinhole and line average, speed 600 Hz and a pixel size of 0.5678 µm/pixel. In YFP tissues at long time points, clones were manually detected in the microscope and individually imaged using 40x objective with 0.75x digital zoom, optimal pinhole and line average, speed 600 Hz and a pixel size of 0.3788 µm/pixel. The numbers of basal and suprabasal cells in each clone were counted manually. Representative images were produced by selecting a 120x120 µm area in the images obtained as previously stated. Images were processed using ImageJ software adjusting brightness and contrast and applying a Gaussian blur of 1. EdU incorporation was detected with Click-iT chemistry kit according to the manufacturer's instructions (Invitrogen) using 555 Alexa Fluor azides. Confocal images for EdU-GFP staining were acquired on a Leica TCS SP8 confocal microscope (objective 20x; optimal pinhole and line average; speed 600 Hz; resolution 1024 × 1024, zoom ×2). Images were processed using ImageJ software adjusting brightness and contrast and applying a Gaussian blur of 1. For in vitro culture staining, inserts were fixed in 4% paraformaldehyde in PBS for 30 min, then blocked for 30 min in blocking buffer and incubated overnight with anti-GFP antibody (1:1000, Life technologies A10262) in blocking buffer followed by 4x 15 min washes with 0.2% Tween-20 in PHEM buffer.
Finally, samples were incubated for 2 h with 1 μg/ml DAPI and secondary antibody anti-chicken (1:500, Jackson ImmunoResearch 703-545-155) in blocking buffer. Afterwards, inserts were washed 4x 15 min with 0.2% Tween-20 in PHEM buffer and mounted in Vectashield (Vector Laboratories). Cultures were imaged on an SP8 Leica confocal microscope, obtained with the 40x objective with 0.75x digital zoom, optimal pinhole and line average, speed 600 Hz and a pixel size of 0.1893 µm/pixel.

RNA isolation and RNA sequencing
Total RNA was extracted from 3D cultures of mouse primary keratinocytes after 1 week in FAD medium supplemented with fetal calf serum, apo-transferrin and Penicillin/Streptomycin, with or without insulin. RNA was extracted using RNeasy Micro Kit (QIAGEN, UK), following the manufacturer's recommendations. Briefly, cells were washed with cold Hank's Balanced Salt Solution-HBSS (GIBCO, UK) and then lysis buffer was added directly to the insert. The integrity of total RNA was determined by Qubit RNA Assay Kit (Invitrogen, UK). For RNA-seq, libraries were prepared in an automated fashion using an Agilent Bravo robot with a KAPA Standard mRNA-Seq Kit (KAPA BIOSYSTEMS). In house adaptors were ligated to 100-300 bp fragments of dsDNA. All the samples were then subjected to 10 PCR cycles using sanger_168 tag set of primers and paired-end sequencing was performed on Illumina HiSeq 2500 with 75 bp read length. Reads were mapped using STAR 2.5.3a, the alignment files were sorted and duplicate-marked using Biobambam2 2.0.54, and the read summarization was performed by the htseq-count script from version 0.6.1p1 of the HTSeq framework (Anders et al., 2015; Dobin et al., 2013). For GSEA analysis raw counts were normalized by the median of ratios method (Love et al., 2014). Gene set enrichment was analyzed with GSEA software (Subramanian et al., 2005). The false discovery rate (FDR) for GSEA is the estimated probability that a gene set with a given NES (normalized enrichment score) represents a false-positive finding, and an FDR<0.25 is considered to be statistically significant for GSEA. Differential gene expression was analyzed using the DEBrowser tool (https://debrowser.umassmed.edu/) with which we performed a DESeq2 analysis (Love et al., 2014) filtering the low counts to remove genes with less than 2 cpm in at least 2 samples.
'Parametric' fitting of dispersions to the mean intensity was used with the likelihood ratio test on the difference in deviance between a full and reduced model formula (defined by nbinomLRT). An adjusted p-value cut-off of 0.05 was used to select significantly differentially expressed genes. Heatmaps were generated from the TPM values and built using ClustVis (https://biit.cs.ut.ee/clustvis/) and Morpheus tools (https://software.broadinstitute.org/morpheus/); significance was calculated from the adjusted p-value obtained in the DE analysis. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was performed by uploading the significantly upregulated gene list (p<0.05) into the Enrichr tool (https://amp.pharm.mssm.edu/Enrichr/) (Kuleshov et al., 2016).
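For readers unfamiliar with the median of ratios normalization (Love et al., 2014) mentioned above, the idea can be sketched in a few lines of Python. This is an illustrative re-implementation rather than the DESeq2 code used in the analysis; the function names and the toy count matrix are hypothetical.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factor per sample.

    counts: list of genes, each a list of raw read counts (one per sample).
    """
    n_samples = len(counts[0])
    # Only genes detected in every sample have a defined (log) geometric mean.
    usable = [g for g in counts if all(c > 0 for c in g)]
    log_geo_means = [sum(math.log(c) for c in g) / n_samples for g in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of each gene's count in sample j to its geometric mean.
        log_ratios = [math.log(g[j]) - lg for g, lg in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

def normalize(counts):
    """Divide every count by the size factor of its sample."""
    sf = size_factors(counts)
    return [[c / s for c, s in zip(g, sf)] for g in counts]

# Toy matrix (hypothetical): sample 2 was sequenced twice as deeply as sample 1.
raw = [[10, 20], [100, 200], [5, 10], [0, 3]]
sf = size_factors(raw)
norm = normalize(raw)
```

In this toy case the second size factor is twice the first, so genes that differ only by sequencing depth end up with equal normalized counts.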

Validation of Pik3ca H1047R mutant construct in NIH3T3
DNA that reflects the recombined Pik3ca H1047R-YFP allele (PIK3CA H1047R fused to self-cleaving peptide P2A and GFP) was chemically synthesized (GenScript USA Inc.). cDNA encoding murine p110α wild type , p110α H1047R and p110α H1047R -P2A-GFP were amplified by PCR using IMAGE clone (Image ID 40141870/IRCL34 C10 (M13R), Source Bioscience) for wild type p110α or above cDNA for mutants and subcloned into the pCS2+ expression vector. NIH3T3 cells were transiently transfected with these constructs using Lipofectamine 2000 (Thermo Fisher 11668030) according to manufacturer's instruction. Cells were serum-starved by culture in DMEM containing 0.3% serum for 23 h. Serum-starved NIH3T3 cells were lysed in buffer containing 20 mM Hepes NaOH pH 7.9, 10% Glycerol, 0.4 M NaCl, 0.5% NP-40, 0.2 mM EDTA, 0.01% halt protease and phosphatase inhibitor (ThermoFisher Scientific, cat #78415). Protein concentrations were measured using standard Bradford protein assays (BioRAD QuickSTART™ Bradford Dye Reagents, cat.no.500-0202). Lysates were mixed with equal amount of 2x loading buffer (100 mM Tris-HCl pH 6.8, 4% SDS, 20% Glycerol, Bromophenol blue and 0.2% β-mercaptoethanol) and boiled at 96°C for 5 min. Samples were loaded onto a 7.5 or 10% of SDS-polyacrylamide gel. Proteins were separated by electrophoresis and transferred onto Immobilon-P membrane (pore size 0.45 µm, Millipore IPVH00010). Membranes were incubated in blocking buffer (5% dried skimmed milk, PBS, 0.1% Tween-20) at room temperature for 1 h and then with primary antibodies diluted in blocking buffer for 1 h at room temperature or overnight at 4°C on a rocking platform. After washing in PBST (0.1% Tween-20, PBS) three times, HRP conjugated secondary antibodies diluted in 0.5% skimmed milk in PBST were applied to the membrane for 10 min at room temperature on a rocking platform followed by four washes in PBST 20 min each. 
Washing and secondary antibodies steps were performed using SNAP id protein detection system (Sigma-Aldrich). Proteins were detected using Immobilon Western Chemiluminescent HRP substrate (Millipore WBLUC0500) or ECL blotting reagents (GE Healthcare GERPN2109).

Immune capillary electrophoresis
For protein phosphorylation analysis, 3D cultures were starved in FAD medium 0.1% FCS without cholera toxin, epidermal growth factor, insulin and hydrocortisone for 16 h at 37°C, 5% CO2. Cultures were then treated for 15 min in the same starving medium or in FAD medium with cholera toxin, epidermal growth factor, insulin, hydrocortisone and 20% FCS, with or without LY294002 (50 µM). Cultures were lysed in ice-cold RIPA buffer (Thermo Scientific, UK) containing protease and phosphatase inhibitors. Plates were frozen at -80°C and thawed on ice, scraped and passed twice through a Qiashredder (centrifuged for 2 min at maximum speed), then incubated for 1 h on ice, vortexing every 15 min. Lysates were then centrifuged at 14000 g for 20 min at 4°C.
The supernatant was collected for analysis. Total protein quantification was performed using Pierce BCA Protein Assay Kit (Thermo Scientific, UK). Immune capillary electrophoresis was performed using Wes Simple™ (ProteinSimple, USA) following manufacturer's instructions.

Respirometry experiments
Fully recombined Pik3ca H1047R/wt or uninduced (adenovirus-null infected) parallel cultures from the same mice were treated for at least 1 week in FAD medium supplemented with fetal calf serum, apo-transferrin and Penicillin/Streptomycin, with or without insulin. At the moment of the experiment, 5 mm circular sections of each culture were obtained using a biopsy punch

Lactate measurement
Cultures were incubated for 3 days in the specified treatments. Medium was collected and lactate was analyzed using a commercial kit (DF16, Siemens Healthcare). All sample measurements were performed by the MRC MDU Mouse Biochemistry Laboratory.

Quantification and Statistical Analysis
Unless otherwise specified, all data are expressed as mean values ± standard deviation.

Author Contributions
A

Lead Contact and Materials Availability
Requests for reagent and resource sharing should be addressed to the Lead Contact, Philip H. Jones (pj3@sanger.ac.uk) who will fulfil requests.
2D simulation of the growth of wild type and Pik3ca H1047R/wt clones over 6 months starting from the same proportion of induced cells and following the parameters described in Figure 3I.

SUPPLEMENTARY THEORY
This report provides a detailed description of the quantitative methods and modelling used to study mutant keratinocyte behavior in the murine esophagus. Section 1 covers the analysis of wild type progenitor cell behavior. In Section 2 we extend the methodology to test Pik3ca mutant progenitor cell dynamics. In Section 3 we analyze cell dynamics in the suprabasal compartment for model validation.

Wild type progenitor cell dynamics
The murine esophageal epithelium consists of layers of keratinocytes maintained by a single type of proliferative cell (Doupe et al., 2012; Piedrafita et al., 2020). Progenitor (P-) cells reside in the deepest, basal layer where they divide regularly at a rate $\lambda$. The outcome of a given progenitor division is stochastic: with a certain probability it results in two proliferating daughter cells retaining the proliferative capacity (P+P) or two post-mitotic, differentiating cells (D+D), the remaining divisions yielding an asymmetric outcome (P+D). Upon differentiation, D-cells stratify (at rate $\Gamma$) into the upper, suprabasal layers (transiting to S-cells), being ultimately shed into the lumen (at rate $\mu$). This scenario is summarized by the single-progenitor (SP) model (Figure 2B):

$$P \xrightarrow{\lambda} \begin{cases} P + P & \text{with probability } r \\ P + D & \text{with probability } 1 - 2r \\ D + D & \text{with probability } r \end{cases} \qquad D \xrightarrow{\Gamma} S \qquad S \xrightarrow{\mu} \emptyset$$
While the outcome of individual divisions is unpredictable, overall, the likelihood of the symmetric PP and DD division outcomes is balanced in adult wild type mice (setting the same probability, r). This ensures that on average half the progenitor cells go on to divide and half differentiate, so that the tissue remains homeostatic.
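To make the balanced SP model concrete, the following is a minimal Python sketch of a stochastic simulation of the basal compartment of a single clone. It assumes exponentially distributed waiting times (a plain Gillespie scheme), which is a simplification of the gamma-distributed cell-cycle times used in the actual analysis. The function name is hypothetical, the division rate of 2.9 per week is the prior quoted in this text, and the values of r and of the stratification rate are arbitrary illustrative choices.

```python
import random

def simulate_clone(lam, r, gamma, t_end, rng):
    """Simulate one clone starting from a single progenitor (P) cell.

    lam: division rate of P cells; r: probability of each symmetric outcome;
    gamma: stratification rate of basal differentiating (D) cells.
    Returns the numbers of P and D cells in the basal layer at time t_end.
    """
    n_p, n_d, t = 1, 0, 0.0
    while n_p + n_d > 0:
        total_rate = lam * n_p + gamma * n_d
        t += rng.expovariate(total_rate)      # waiting time to the next event
        if t > t_end:
            break
        if rng.random() < lam * n_p / total_rate:
            u = rng.random()
            if u < r:                         # P -> P + P
                n_p += 1
            elif u < 2 * r:                   # P -> D + D
                n_p -= 1
                n_d += 2
            else:                             # P -> P + D
                n_d += 1
        else:                                 # D -> S (cell leaves the basal layer)
            n_d -= 1
    return n_p, n_d

# Illustrative run: time in weeks; r = 0.1 and gamma = 3.0 are assumed values.
rng = random.Random(0)
clones = [simulate_clone(2.9, 0.1, 3.0, 4.0, rng) for _ in range(2000)]
mean_progenitors = sum(p for p, _ in clones) / len(clones)
```

With balanced fates the mean number of progenitors per induced cell stays close to one, even though individual clones either expand or are lost, which is the signature of neutral competition.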
Under homeostasis, one can assume the proportion of proliferative basal cells, $\rho$, remains constant, and overall, the net rate at which cells are generated in the basal compartment is compensated by cell stratification and cell loss by shedding (Figure 2A). Then, the following relationships between the parameters can be established: $\Gamma = \rho\lambda/(1 - \rho)$, and $\mu = \rho\lambda(1 - h)/h$, where h is the proportion of suprabasal cells relative to total cells. Alternatively, one can set $\mu = \rho\lambda/m$, if we define m as the global ratio of suprabasal-to-basal cell populations.
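These homeostatic constraints follow from balancing cell fluxes per basal cell: production of differentiating cells (division rate times the progenitor fraction) must equal stratification, which in turn must equal shedding. The short Python sketch below assumes the relationships read Γ = ρλ/(1 − ρ) and µ = ρλ/m with m = h/(1 − h); the function name and the numerical values of ρ and h are hypothetical.

```python
def homeostatic_rates(lam, rho, h):
    """Stratification and shedding rates implied by homeostasis.

    lam: progenitor division rate; rho: fraction of basal cells that are progenitors;
    h: fraction of all cells that are suprabasal.
    """
    m = h / (1.0 - h)                  # suprabasal-to-basal cell ratio
    gamma = rho * lam / (1.0 - rho)    # stratification rate of basal D cells
    mu = rho * lam / m                 # shedding rate of suprabasal cells
    return gamma, mu, m

# Illustrative values: lam in week^-1 from the text; rho and h are assumptions.
lam, rho, h = 2.9, 0.65, 0.4
gamma, mu, m = homeostatic_rates(lam, rho, h)

production = lam * rho                 # new D cells per basal cell per week
stratification = gamma * (1.0 - rho)   # D cells leaving the basal layer
shedding = mu * m                      # S cells lost per basal cell
```

All three fluxes coincide, which is what keeps the relative sizes of the P, D and S populations constant.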
Our lineage tracing data from Cre-RYFP (wild type) mice are consistent with key dynamical features of the SP model with balanced fates, in agreement with our more extensive work carried out previously using the same mouse strain and others (Clayton et al., 2007; Doupe et al., 2012; Piedrafita et al., 2020). First, persisting clones (i.e. those retaining at least one basal cell) show ever increasing sizes throughout the experiment. In particular, the average number of basal cells per persisting clone grows linearly over time (Figure 3B). Second, clones become increasingly heterogeneous in size, both in terms of the number of basal cells and total (basal + 1st suprabasal) cells per clone, the distributions adopting a scaling behavior at late time points (Figure 3D). These are hallmarks of neutral clone competition, where some clones grow by chance at the expense of others that shrink, lose basal attachment and ultimately become extinct by shedding (Figure 3H).
In order to validate the dynamics of the wild type progenitor cells in our particular experimental setup, we proceeded to determine the values for the unknown SP-model parameters by fitting the experimental basal clone size distributions at the different time points (suprabasal cell numbers are not required for this since proliferative cells are confined to the basal compartment). For the division rate, we took as a prior the average value for $\lambda$ and the distribution of cell-cycle time periods $t_{cc}$ inferred from H2B-GFP dilution chase experiments reported previously, i.e. $\langle\lambda\rangle$ = 2.9 week$^{-1}$ and $t_{cc} \sim t_R + \mathrm{Gam}(k, s)$, where $t_R$ = 0.5 day (refractory period) and Gam refers to a Gamma distribution with shape k = 8 and scale $s = (1 - t_R\langle\lambda\rangle)/(k\langle\lambda\rangle)$. Notice that the choice of this realistic description of the division rate conditions the modelling implementation, as it cannot rely on Poisson processes that assume independent, underlying exponential events (see below). A maximum likelihood estimation (MLE) approach was then followed to infer the values of the two other remaining parameters, r and $\rho$ (which sets the value of $\Gamma$ in homeostasis), as explained below.
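The cell-cycle time prior can be sketched as follows: a fixed refractory period plus a gamma-distributed delay whose scale is chosen so that the mean cycle time equals the inverse of the average division rate. This Python snippet is an illustration under that reading of the text (refractory period of 0.5 days, shape k = 8); the function name is hypothetical, and `random.gammavariate(alpha, beta)` takes shape and scale in that order.

```python
import random

def sample_cycle_time(mean_rate, t_refractory, shape, rng):
    """Draw one cell-cycle time: refractory period plus a gamma-distributed delay.

    The scale is set so that the expected cycle time equals 1 / mean_rate.
    """
    scale = (1.0 - t_refractory * mean_rate) / (shape * mean_rate)
    return t_refractory + rng.gammavariate(shape, scale)

# Division rate from the text, converted from per week to per day.
mean_rate = 2.9 / 7.0
rng = random.Random(1)
times = [sample_cycle_time(mean_rate, 0.5, 8, rng) for _ in range(20000)]
mean_time = sum(times) / len(times)    # should be close to 7 / 2.9 days
```

Unlike an exponential waiting time, this distribution never produces divisions shorter than the refractory period, which is why a purely Markovian simulation scheme is not sufficient.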
We performed a grid search spanning the range of all possible values for r and $\rho$, and for every parameter set θ we estimated the log-likelihood value $\mathcal{L}(\theta; f)$ as in previous work (Figure S2G) (Piedrafita et al., 2020):

$$\mathcal{L}(\theta; f) = \sum_{t} \sum_{n} f_n(t) \log p_n(t, \theta) \tag{2}$$

$f_n(t)$ is the observed frequency of clones with a certain basal size n at time t. In turn, $p_n(t, \theta)$ refers to the probability of observing clones of that size at time t given the SP model with parameter values θ, and it was obtained by numerical solution (≥ 100,000 simulations) of the Master equation. In particular, in previous work this was computed by implementing a Markov-chain Monte Carlo method (Gillespie's algorithm) (Gillespie, 1977; Gillespie, 1976).
Here, we used an exact non-Markovian Monte Carlo analogue developed previously, which makes it possible to account for the gamma-distributed cell cycle times. For convenience, both experimental and simulated clone sizes were binned in ranges increasing in powers of two given the large asymmetry of the distributions, so that in practice, n in the equation above stands for clones with a number of basal cells in the range $(2^{n-1} + 1, 2^n)$.
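The binning in powers of two and the log-likelihood of Eq. 2 can be sketched in Python as follows. This is an illustrative reading of the procedure, not the published DriverClonALTfate code: the function names are hypothetical, observed frequencies are taken as relative frequencies, and a small floor on simulated probabilities (to avoid taking the logarithm of zero) is an assumption of this sketch.

```python
import math
from collections import Counter

def size_bin(n_basal):
    """Bin index n such that the clone size lies between 2^(n-1) + 1 and 2^n."""
    return (n_basal - 1).bit_length()

def binned_frequencies(sizes):
    """Relative frequency of clones in each power-of-two size bin."""
    counts = Counter(size_bin(s) for s in sizes)
    total = sum(counts.values())
    return {b: c / total for b, c in counts.items()}

def log_likelihood(observed, simulated, floor=1e-9):
    """Sum over time points and bins of f_n(t) * log p_n(t, theta).

    observed, simulated: dicts mapping time point -> list of basal clone sizes.
    """
    total = 0.0
    for t, obs_sizes in observed.items():
        f = binned_frequencies(obs_sizes)
        p = binned_frequencies(simulated[t])
        for b, f_b in f.items():
            total += f_b * math.log(max(p.get(b, 0.0), floor))
    return total

# Toy data (hypothetical): basal clone sizes at a single time point.
observed = {10: [2, 3, 4, 5]}
ll_matching = log_likelihood(observed, {10: [2, 3, 4, 5]})
ll_mismatched = log_likelihood(observed, {10: [2, 2, 2, 3]})
```

A simulated distribution that reproduces the observed one maximizes this quantity, so scanning it over a parameter grid identifies the most likely parameter set.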
To discard possible biases due to the initial induction of post-mitotic cells (D- or S-cells), only clones with at least two basal cells were considered for the analysis. Also, a small fraction of clones at the latest time point (2 out of 317 clones in the wild type; 3 out of 563 clones in the mutant) were classified as outliers (i.e. having a number of basal cells >> 2,3 s.d. above the average of that time) and reassigned to the top size range n of non-outlier clones (Piedrafita et al., 2020). Finally, since there were large differences in our experimental sample size across time points (especially in the wild type: 1522, 710, 457 and 317 clones quantified at time 10 d, 28 d, 84 d and 168 d, respectively) and these can imprint uneven contributions to the log-likelihood calculations (Eq. 2), here we followed a bootstrapping strategy: we computed the log-likelihood repeatedly for different sample subsets containing a fixed number of clones X across time points, drawn by random sampling with replacement from the original sampled clones. In this way we ensured even weights from the different time points and more robust parameter estimates by averaging the log-likelihood across subsamples.
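The bootstrapping step can be sketched in Python as below: each time point contributes the same number of resampled clones, and a statistic (in the analysis, the log-likelihood) is averaged over subsamples. The function names, the toy data and the pooled-mean statistic used for the demonstration are hypothetical.

```python
import random

def bootstrap_average(clones_by_time, statistic, n_per_time, n_boot, rng):
    """Average a statistic over bootstrap subsamples with equal size per time point.

    clones_by_time: dict mapping time point -> list of clone sizes.
    statistic: function taking such a dict and returning a number.
    """
    values = []
    for _ in range(n_boot):
        # Draw the same number of clones, with replacement, at every time point.
        subsample = {t: rng.choices(sizes, k=n_per_time)
                     for t, sizes in clones_by_time.items()}
        values.append(statistic(subsample))
    return sum(values) / n_boot

def pooled_mean(sample):
    """Mean clone size pooled across all time points (demonstration statistic)."""
    pooled = [s for sizes in sample.values() for s in sizes]
    return sum(pooled) / len(pooled)

# Toy data (hypothetical): the early time point has ten times more clones.
data = {10: [2] * 100, 168: [6] * 10}
unweighted = pooled_mean(data)
balanced = bootstrap_average(data, pooled_mean, 50, 20, random.Random(2))
```

In the raw data the heavily sampled early time point dominates the pooled statistic, whereas after resampling both time points carry equal weight.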
Following this methodology, we obtained the following parameter estimates ($\hat{\theta}_{\mathrm{MLE}}$) for the wild type progenitors (Figure 3I; Figure S2G) Figure 3B) and the distributions of basal clone sizes at the different time points (Figure S2I), corroborating the robust dynamics of wild type progenitors in the esophageal epithelium.

Pik3ca H1047R-YFP/wt progenitor cell dynamics
The dynamics of Pik3ca H1047R-YFP/wt mutant progenitor cells clearly differed from wild type cells as clones showed an accelerated growth over time in both the basal and suprabasal compartments (Figure 3B-D). The percentage of EdU+ basal cells among the induced Pik3ca H1047R-YFP/wt population was similar to wild type (Figure 3F) and comparable to measurements done before in mouse esophagus (Fernandez-Antoran et al., 2019). This argues against changes in the rate of mutant cell division $\lambda$. Yet, in principle the dynamics might still be explained by a SP model with balanced fates if Pik3ca H1047R-YFP/wt progenitors experienced changes in some of the other parameters (e.g. r or $\Gamma$). Alternatively, it could be that Pik3ca H1047R-YFP/wt clone behavior responds to an imbalance in mutant progenitor division outcomes that favors proliferating daughter cells over differentiating progeny (i.e. PP symmetric division outcome being more likely than DD) (Figure 2B). This latter scenario has been shown to explain the phenotype of some other inducible mutants in squamous epithelium such as DN-Maml1, which inhibits the Notch pathway (Alcolea et al., 2014), and Trp53 mutants (Murai et al., 2018). The distinction between these two possibilities is important since the former involves a neutral scenario where the Pik3ca H1047R-YFP/wt population would exhibit no competitive advantage over wild type but an exacerbated stochastic behavior (i.e. accelerated clone growth but also decline). By contrast, fate imbalance would introduce a selective advantage, cause a net exponential-like growth of mutant clones and lead to mutant cell colonization of the epithelium.
The supralinear growth observed in the average mutant clone size points towards a progenitor fate imbalance in the Pik3ca H1047R-YFP/wt population (Figure 3B). However, the initial clonal induction efficiency was variable between mice, and this precluded reliable confirmation of an overall mutant cell colonization. To circumvent this issue, we explored the distributions of the mutant clone sizes (Figure 3D) and extended the maximum likelihood estimation (MLE) approach explained earlier to infer the most likely scenario of mutant progenitor behavior. The following model that allows for a fixed progenitor fate imbalance (Δ) was considered (Murai et al., 2018): According to the parameter relationships in Eq. 6, Γ = 4.86 (2.72; 285.48) week⁻¹. It follows that, independently of the values of r and Γ, a neutral SP model with balanced fates (Δ = 0) was significantly less likely to explain mutant progenitor dynamics than a model with fate imbalance (Δ > 0), implying selection (using the most favorable parameters for each model, p = 0.0002, *** by likelihood ratio test). In fact, the best neutral model found could neither produce a good fit to the basal clone size distributions nor explain the curvature in the time course of the average number of basal cells per clone in the mutant population. An SP model with fate imbalance (MLE values from Eq. 7) gave an excellent fit to these experimental data (Figure 3B; Figure S2I). We conclude that Pik3ca H1047R-YFP/wt progenitor dynamics are characterized by a small but real statistical bias in fate towards an excess of dividing over differentiating daughters per average cell division. This is accompanied by an overall increased proportion of divisions being symmetric, as reflected by the larger value of r (Figure 3I).
Altogether, this would result in individual mutant clones developing into a wider range of possible sizes (more extreme random trajectories) while there is, overall, a relentless colonization of the esophageal epithelium by the Pik3ca H1047R-YFP/wt cell population (Suppl. Video).
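The comparison between the neutral model (Δ = 0) and the fate-imbalance model can be illustrated with a generic likelihood ratio test for nested models. The sketch below assumes that the two models differ by a single free parameter (Δ) and that the test statistic is referred to a chi-square distribution with one degree of freedom. The function name is hypothetical, and the exact procedure used in this study may differ.

```python
import math


def likelihood_ratio_test(loglik_neutral, loglik_imbalance):
    """Likelihood ratio test for two nested models differing by one parameter.

    loglik_neutral   : maximized log-likelihood of the restricted model (delta = 0)
    loglik_imbalance : maximized log-likelihood of the model with free delta
    Returns (statistic, p_value), assuming a chi-square reference
    distribution with one degree of freedom.
    """
    statistic = max(0.0, 2.0 * (loglik_imbalance - loglik_neutral))
    # Survival function of chi-square with 1 d.o.f.: P(Z^2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical log-likelihood values, for illustration only.
    stat, p = likelihood_ratio_test(-100.0, -93.0)
    print(f"statistic = {stat:.2f}, p = {p:.2g}")
```

A small p-value indicates that the gain in likelihood from freeing Δ is larger than expected from the extra parameter alone, which is the sense in which the neutral model is described above as significantly less likely.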

Dynamics of suprabasal cells and total clone behavior
Indeed, the average fraction of 1st suprabasal cells in Pik3ca H1047R-YFP/wt clones was significantly smaller than in wild type clones across time points (Figure 3G). Intriguingly, the experimental data showed a gradual decline in the proportion of suprabasal cells in both genotypes at the late time points, something a simple model with constant parameters would not capture. This apparent departure from homeostasis might be technical but could be related to epithelial changes as mice age (Liu et al., 2019). To achieve a more accurate description of the experimental conditions while retaining the MLE model parameters that suited basal-layer behavior, we thus considered a time-dependent shedding rate µ(t), which is the only extra adjustable parameter needed to describe suprabasal cell dynamics. In this way, in the simulations of suprabasal and total clone sizes, µ could be adapted to reproduce the variable value of η over time (Eq. 6).
Independently of whether we implemented time-adjusted values for µ or used a fixed default value given by the experimental average suprabasal-to-total cell ratio η, our (zero-parameter) model fits to the distributions of total (basal + 1st suprabasal) cells per clone were adequate across time points (Figure 3B; Figure 3H).
Altogether, mathematical modelling indicates that Pik3ca H1047R-YFP/wt behavior is explained by mutant progenitors showing an increased ability to yield proliferating daughters upon division, which confers on this genotype a competitive advantage over wild type cells.
Code used for model simulations in this study can be found at: https://github.com/gp10/DriverClonALTfate