ABSTRACT
Large-scale cancer sequencing studies have uncovered dozens of mutations critical to cancer initiation and progression. However, a significant proportion of genes linked to tumor propagation remain hidden, often because noise in sequencing data obscures low-frequency alterations. Further, genes in networks under purifying selection (NPS), that is, networks mutated in cancers less frequently than would be expected by chance, may play crucial roles in sustaining cancers but have largely been overlooked. We describe here a statistical framework that identifies genes whose first-order protein interaction network is significantly depleted for mutations, in order to elucidate key genetic contributors to cancers. Our approach does not rely on, and is thus not biased by, the mutation rate of the gene of interest, and it has identified 685 putative genes linked to cancer development. Comparative analysis shows that NPS genes are significantly enriched in previously validated cancer vulnerability gene sets and also identifies novel cancer-specific candidate gene targets. As more tumor genomes are sequenced, integrating systems-level mutation data through this network approach should become increasingly useful in pinpointing gene targets for cancer diagnosis and treatment.
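To make the idea of a mutation-depleted first-order network concrete, the following is a minimal sketch and not the framework described here. It assumes a simple binomial null model in which each neighbor is mutated independently at a fixed background rate; the function names, the `interactions` and `mutated_genes` inputs, and the `background_rate` parameter are all hypothetical. Note that the mutation status of the gene of interest itself never enters the calculation.

```python
from math import comb


def binom_lower_tail(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def neighborhood_depletion_pvalue(gene, interactions, mutated_genes, background_rate):
    """One-sided p-value that a gene's first-order neighbors carry
    fewer mutated genes than expected under a binomial null.

    ASSUMPTION: the binomial null and the single background rate are
    illustrative stand-ins; the abstract does not specify the test used.
    """
    # First-order neighbors only; the gene of interest is excluded, so
    # its own mutation rate cannot influence the result.
    neighbors = set(interactions.get(gene, ())) - {gene}
    if not neighbors:
        return None  # no interaction partners, nothing to test
    observed = len(neighbors & set(mutated_genes))
    return binom_lower_tail(observed, len(neighbors), background_rate)


# Toy example with made-up gene labels: ten neighbors, none of them mutated.
toy_interactions = {"G1": {"A", "B", "C", "D", "E", "F", "G", "H", "I", "J"}}
toy_mutated = {"X", "Y"}
p_value = neighborhood_depletion_pvalue("G1", toy_interactions, toy_mutated, 0.3)
```

A small p-value here would flag the neighborhood of `G1` as depleted for mutations under this toy null. A realistic analysis would also need to account for gene length, cohort size, and multiple testing, none of which this sketch attempts.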