Abstract
Background: Approaches that capitalize on the benefits of multi-omic data integration in invasive breast carcinoma to define prognostic biomarkers for precision medicine have been slow to emerge. In this work, we examined the efficacy of our methylation-to-expression feature model (M2EFM) approach to combining molecular and clinical predictors as part of a single analysis to create prognostic risk scores for overall survival, distant metastasis, and chemosensitivity.
Methods: Gene expression and DNA methylation values, together with clinical variables, were integrated via M2EFM to build prognostic models of overall survival using 1028 breast tumor samples; these models were then applied to external validation cohorts of 61 and 327 samples. Data-integrated prognostic models of distant recurrence-free survival and pathologic complete response were built using 306 samples and validated on 182 external samples. Additionally, we compared the discrimination and calibration of M2EFM models to those of other approaches.
Results: Despite differences in populations and assays, M2EFM models validated with good accuracy (C-index or AUC ≥ 0.7) for all outcomes in all validation data. M2EFM models had the most consistent performance overall and superior calibration, suggesting a greater likelihood of clinical utility. Finally, we demonstrated that M2EFM identifies functionally relevant genes, which could be useful in translating an M2EFM biomarker to the clinic.
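For readers unfamiliar with the discrimination metric reported above, the following is a minimal sketch of how a concordance index (C-index) can be computed for right-censored survival data. It is an illustration only, not the code used in this study: the function name `concordance_index`, its arguments, and the choice to skip pairs with tied follow-up times are assumptions made for this example.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index (illustrative sketch, not the study's code).

    times       -- observed follow-up time for each patient
    events      -- 1 if the event was observed, 0 if the patient was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the ordering is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk had the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, so the 0.7 threshold cited above sits between the two. For a binary outcome such as pathologic complete response, the analogous pairwise quantity is the AUC.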
Conclusion: M2EFM uses multiple levels of genomic data to infer disrupted regulatory patterns, thus providing a gene signature that connects loss of regulatory control with cancer prognosis.
Funding: The analyses described in this report were supported by NIH grants R01ES022222, P30CA138292, P30ES019776, and R01DE022772.
Conflicts of Interest: The authors declare no potential conflicts of interest.