PT - JOURNAL ARTICLE
AU - Simone Mocellin
AU - Sara Valpione
AU - Carlo Riccardo Rossi
AU - Karen Pooley
TI - Breast cancer susceptibility: an integrative analysis of genomic data
AID - 10.1101/279984
DP - 2018 Jan 01
TA - bioRxiv
PG - 279984
4099 - http://biorxiv.org/content/early/2018/03/12/279984.short
4100 - http://biorxiv.org/content/early/2018/03/12/279984.full
AB - Background: Genome-wide association studies (GWAS) are greatly accelerating the pace of discovery of germline variants underlying the genetic architecture of sporadic breast cancer predisposition. We have built the first knowledge-base dedicated to this field and used it to generate hypotheses on the molecular pathways involved in disease susceptibility. Methods: We gathered data on the common single nucleotide polymorphisms (SNPs) discovered by breast cancer risk GWAS. Information on SNP functional effect (including data on linkage disequilibrium, expression quantitative trait locus, and SNP relationship with regulatory motifs or promoter/enhancer histone marks) was used to select putative breast cancer predisposition genes (BCPGs). Ultimately, BCPGs were subjected to pathway (gene set enrichment) analysis and network (protein-protein interaction) analysis. Results: Data from 38 studies (28 original case-control GWAS enrolling 383,260 patients with breast cancer; and 10 GWAS meta-analyses) were retrieved. Overall, 281 SNPs were associated with the risk of breast cancer with a P-value <10E-06 and a minor allele frequency >1%. Based on functional information, we identified 296 putative BCPGs. Primary analysis showed that germline perturbation of classical cancer-related pathways (e.g., apoptosis, cell cycle, signal transduction including estrogen receptor signaling) plays a significant role in breast carcinogenesis. Other less established pathways (such as ribosome and peroxisome machineries) were also highlighted.
In the main subgroup analysis, we considered the BCPGs encoding transcription factors (n=36), which in turn target 252 genes. Interestingly, pathway and network analysis of these genes yielded results resembling those of primary analyses, suggesting that most of the effect of genetic variation on disease risk hinges upon transcriptional regulons. Conclusions: This knowledge-base, which is freely available and will be annually updated, can inform future studies dedicated to breast cancer molecular epidemiology as well as genetic susceptibility and development.
Abbreviations: GWAS, genome-wide association study; SNP, single nucleotide polymorphism; BCPG, breast cancer predisposition gene; LD, linkage disequilibrium.