ABSTRACT
Background The regional distribution of somatic mutations in cancer genomes associates with DNA replication timing (RT) and chromatin accessibility (CA). However, these insights derive from normal tissues and cell lines, while associations with the epigenomes of primary cancers remain uncharacterized.
Results Here we model megabase-scale mutation burden in whole cancer genomes using ∼900 CA and RT profiles of primary cancers, normal tissues, and cell lines. CA profiles of primary cancers, rather than normal tissues, predict regional mutagenesis in most cancer types. Regional mutation burden associates with the CA profiles of matching cancer types, indicating tissue-specific determinants of mutagenesis. However, mutagenesis in squamous cell and lymphoid cancers instead associates with RT profiles. Mutational signatures also show tissue-specific associations with cancer epigenomes, especially for carcinogen-induced and unannotated signatures. Lastly, while each cancer type includes certain frequently mutated genomic regions that exceed epigenome-informed predictions of mutation burden, these regions converge across cancer types on biological processes involved in development and cancer. Thus, modelling excess mutations using epigenomes highlights known cancer driver genes as well as frequently mutated non-coding regions.
Conclusions The dominant association of regional mutation burden with cancer epigenomes suggests that many passenger mutations are determined by the epigenetic landscapes of transformed cells and may occur later in tumor evolution. CA-informed models help find cancer genes and pathways under positive selection and highlight regions where additional mutation burden is contributed by local mutational processes. This study underlines the complex interplay of mutational processes, genome function and evolution in cancer and tissues of origin.
Competing Interest Statement
The authors have declared no competing interest.