Abstract
The prevalence of malignant cells in clinical specimens, or tumour purity, is affected by both intrinsic biological factors and extrinsic sampling bias. Molecular characterization of large clinical cohorts is typically performed on bulk samples, so data analysis and interpretation can be biased by variability in tumour purity. Transcription-based strategies to estimate tumour purity have been proposed, but no breast cancer-specific method is available yet.
We interrogated over 4400 expression profiles from 9 breast cancer datasets to develop and validate a 9-gene Breast Cancer Purity Score (BCPS). BCPS outperformed existing methods for estimating tumour content. Adjusting transcriptomic profiles using the BCPS reduces sampling bias and aids data interpretation. BCPS-estimated tumour purity improved prognostication in luminal breast cancer, correlated with pathologic complete response in on-treatment biopsies from triple-negative breast cancer patients undergoing neoadjuvant treatment, and effectively stratified the risk of relapse in HER2+ residual disease post-neoadjuvant treatment.
Competing Interest Statement
The authors have declared no competing interest.
List of abbreviations
- ANOVA: ANalysis Of VAriance
- AUC: Area under the ROC curve
- BCPS: Breast Cancer Purity Score
- CBX: Core-Biopsy
- DEFS: Distant Event-Free Survival
- EFS: Event-Free Survival
- FDR: False Discovery Rate
- FFPE: Formalin-Fixed, Paraffin-Embedded
- FNA: Fine-Needle Aspiration
- FPKM: Fragments Per Kilobase Million
- IDC: Invasive Ductal Carcinoma
- ILC: Invasive Lobular Cancer
- iTIL: intraepithelial Tumour-Infiltrating Lymphocyte
- OS: Overall Survival
- pCR: pathological Complete Response
- PDX: Patient-derived xenograft
- ROC: Receiver Operating Characteristic
- sTIL: stromal Tumour-Infiltrating Lymphocyte
- TME: Tumour Microenvironment
- VCA: Variance Component Analysis