ABSTRACT
Identifying the complete repertoire of genes that drive cancer in individual patients is crucial for precision oncology. Established methods for driver detection focus mostly on genes that are recurrently altered across cohorts of cancer patients. However, mapping these genes back to patients leaves a sizeable fraction of patients with few or no driver events, hindering our understanding of cancer mechanisms and limiting the choice of therapeutic interventions. Here we present sysSVM2, a tool based on machine learning that integrates somatic alteration data with systems-level gene properties to predict drivers in individual patients. We develop sysSVM2 for pan-cancer applicability, demonstrating robust performance on real and simulated cancer data. We benchmark its performance against other driver detection methods and show that sysSVM2 has a lower false positive rate and superior patient driver coverage. Applying sysSVM2 to 7,646 samples from 34 cancer types, we find that predicted drivers are often rare or patient-specific. However, they converge to disrupt well-known cancer-related processes including DNA repair, chromatin organisation and the cell cycle. sysSVM2 is a resource to enhance personalised predictions of cancer driver events with possible use in research and clinical settings. Code to implement sysSVM2, together with models trained on simulated cancer-agnostic data and on 34 cancer types, is available at https://github.com/ciccalab/sysSVM2.
Competing Interest Statement
The authors have declared no competing interest.