Transcriptomically-inferred PI3K activity and stemness show a counterintuitive correlation with PIK3CA genotype in breast cancer

Ralitsa R. Madsen1,*, Oscar M. Rueda3,4,5, Xavier Robin6, Carlos Caldas3,4,5, Robert K. Semple2,a, Bart Vanhaesebroeck1,a,*

1University College London Cancer Institute, Paul O'Gorman Building, University College London, London, UK.
2Centre for Cardiovascular Science, Queen’s Medical Research Institute, University of Edinburgh, Edinburgh, UK.
3Cancer Research UK Cambridge Institute and Department of Oncology, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
4Cambridge Breast Unit, Addenbrooke’s Hospital, Cambridge University Hospital NHS Foundation Trust, Cambridge, UK
5NIHR Cambridge Biomedical Research Centre and Cambridge Experimental Cancer Medicine Centre, Cambridge University Hospital NHS Foundation Trust, Cambridge, UK
6SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50–70, CH-4056 Basel, Switzerland

aThese authors contributed equally to this work.

*Corresponding authors: Ralitsa R. Madsen (R.R.M.), Bart Vanhaesebroeck (B.V.)

INTRODUCTION

scores were negatively associated with patient survival in the METABRIC cohort, with a clear dosage contrast. Although ER-negative cases with available survival data were limited in number overall, we in fact noticed a loss of prognostic power when evaluating the two scores in this breast cancer subset (Fig. S1B, S1C).

Due to limited data, extensive survival analyses were not possible in TCGA breast cancers; however, the negative association between PI3K activity "strength" and pan-breast cancer survival was reproduced (Fig. S1D).

As previously reported [33][34][35], activating PIK3CA mutations had no prognostic power in pan-breast or ER-positive METABRIC tumours, despite their enrichment in the ER-positive cohort (Fig. 3E, 3F). Interestingly, however, the presence of PIK3CA mutations in ER-negative tumours appeared to be associated with worse

As PI3K pathway activation and tumour dedifferentiation can be triggered by a range of oncogenic hits, the relatively high PI3K and stemness scores in PIK3CA-WT breast cancers were not entirely surprising (Fig. 4A, 4B). It was, however, counterintuitive that the presence of a single oncogenic PIK3CA missense variant was associated with a substantial reduction in the stemness score and only a modest reduction in the PI3K score (Fig. 4A, 4B). Relative to tumours with a single PIK3CA mutant copy, those with multiple oncogenic PIK3CA copies exhibited higher PI3K and stemness scores (Fig. 4A, 4B). This relationship was lost upon simple binary classification based on PIK3CA genotypes (i.e. wild-type vs mutant) (Fig. 4A, 4B) (Fig. 4D). In contrast, their heterozygous processes

Given the high depth and large sample size of the available breast cancer transcriptomic data, we next undertook a global analysis encompassing all 50 "hallmark" MSigDB gene sets and the PluriNet signature to identify relevant biological processes associated with breast cancer stemness and a high PI3K activity score.

Such processes can be used to guide future experimental studies aimed at dissecting the molecular underpinnings of the observed relationships. To identify such associations, we applied GSVA to METABRIC and TCGA data to generate a score for each gene signature, followed by correlation analysis with hierarchical clustering. This global approach also allowed us to confirm that we are able to identify biologically relevant gene
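The correlation-then-clustering step described above can be illustrated with a minimal sketch. It assumes that per-sample signature scores have already been computed (the GSVA step itself is not reproduced here), and it uses Pearson correlation with average-linkage clustering on a 1 - r distance. Both are illustrative assumptions, as the text does not specify the correlation metric or linkage method. The signature names and score values are hypothetical toy data, and Python is used purely for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def correlation_matrix(scores):
    """scores: {signature: [score per sample]} -> {(sig_a, sig_b): r}."""
    names = list(scores)
    return {(a, b): pearson(scores[a], scores[b]) for a in names for b in names}


def average_linkage(names, corr):
    """Agglomerative clustering with distance 1 - r (assumed metric).

    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    clusters = [(n,) for n in names]
    merges = []

    def dist(c1, c2):
        # Average pairwise distance between members of the two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(1 - corr[p] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        candidates = [
            (i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))
        ]
        i, j = min(candidates, key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        merges.append((clusters[i], clusters[j], dist(clusters[i], clusters[j])))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical per-sample scores for three signatures across four samples.
toy_scores = {
    "PI3K": [0.1, 0.4, 0.5, 0.9],
    "Stemness": [0.2, 0.3, 0.6, 0.8],
    "Other": [0.9, 0.5, 0.6, 0.1],
}
corr = correlation_matrix(toy_scores)
merges = average_linkage(list(toy_scores), corr)
```

In this toy example the two positively correlated signatures are merged first, which is the kind of co-clustering that a correlation heatmap with a dendrogram makes visible.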

Data from either cohort revealed a characteristic clustering pattern for PI3K and stemness scores,

This study provides a comprehensive analysis of the relationship between PI3K signalling and stemness (or tumour dedifferentiation) using two large breast cancer transcriptomic datasets encompassing almost 3,000

pathway activation [3,30,32,44]. Importantly, such PIK3CA mutant-independent pathway activation is captured by the transcriptional footprint-based PI3K activity scores used in our study and will thus contribute to the values observed in non-PIK3CA mutant tumours.

genotype-based PIK3CA classification. Paradoxically, however, PIK3CA mutations have prognostic power in subgroups defined by differences in PIK3CA mutant status and PI3K signalling/stemness scores differ in their response to PI3Kα-targeted therapy.

It is also notable that our correlation analyses of breast cancer transcriptomes identified a PI3K/stemness

A limitation of the current and previous bulk-tissue transcriptomic analyses is that they cannot determine (1) whether the observed correlations reflect mechanistic links or spurious associations caused by a confounder variable that influences two or more processes independently and (2)

The PROGENy package was used to obtain a PI3K score according to a linear model based on pathway-responsive genes as described in Ref. [23].
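As a rough illustration of what a linear footprint score of this kind involves, the sketch below computes a weighted sum of expression values over pathway-responsive genes and then standardises the result across samples. It is not the PROGENy implementation. The gene names, weights and expression values are hypothetical, and the standardisation step (sample standard deviation) is an assumption made for this example.

```python
from statistics import mean, stdev


def linear_pathway_scores(expression, weights):
    """Weighted-sum pathway score per sample, z-scored across samples.

    expression: {sample: {gene: expression value}}
    weights:    {gene: weight}  (hypothetical footprint weights)
    Genes missing from a sample's profile are skipped.
    """
    raw = {}
    for sample, profile in expression.items():
        raw[sample] = sum(w * profile[g] for g, w in weights.items() if g in profile)

    # Standardise across samples so scores are comparable within a cohort
    # (assumed scaling; requires at least two samples).
    mu = mean(raw.values())
    sd = stdev(raw.values())
    return {s: (v - mu) / sd if sd > 0 else 0.0 for s, v in raw.items()}


# Hypothetical toy data: three samples, two measured genes, three weighted genes.
toy_expression = {
    "S1": {"GENE_A": 2.0, "GENE_B": 1.0},
    "S2": {"GENE_A": 0.0, "GENE_B": 1.0},
    "S3": {"GENE_A": 1.0, "GENE_B": 1.0},
}
toy_weights = {"GENE_A": 1.0, "GENE_B": -0.5, "GENE_C": 2.0}

pi3k_scores = linear_pathway_scores(toy_expression, toy_weights)
```

Because the score is a weighted sum over responsive genes rather than a readout of mutation status, any event that shifts those genes contributes to the score, whether or not a PIK3CA mutation is present.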

The TCGAanalyze_Stemness() function in TCGAbiolinks was used to calculate a stemness score