RT Journal Article
SR Electronic
T1 Identification of Relevant Genetic Alterations in Cancer using Topological Data Analysis
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.01.30.922310
DO 10.1101/2020.01.30.922310
A1 Raúl Rabadán
A1 Yamina Mohamedi
A1 Udi Rubin
A1 Tim Chu
A1 Oliver Elliott
A1 Luis Arnés
A1 Santiago Cal
A1 Álvaro J. Obaya
A1 Arnold J. Levine
A1 Pablo G. Cámara
YR 2020
UL http://biorxiv.org/content/early/2020/01/31/2020.01.30.922310.abstract
AB Large-scale cancer genomic studies enable the systematic identification of mutations that lead to the genesis and progression of tumors, uncovering the underlying molecular mechanisms and potential therapies. While some such mutations are recurrently found in many tumors, many others exist solely within a few samples, precluding detection by conventional recurrence-based statistical approaches. Integrated analysis of somatic mutations and RNA expression data across 12 tumor types reveals that mutations of cancer genes are usually accompanied by substantial changes in expression. We use topological data analysis to leverage this observation and uncover 38 elusive candidate cancer-associated genes, including inactivating mutations of the metalloproteinase ADAMTS12 in lung adenocarcinoma. We show that ADAMTS12−/− mice have a five-fold increase in the susceptibility to develop lung tumors, confirming the role of ADAMTS12 as a tumor suppressor gene. Our results demonstrate that data integration through topological techniques can increase our ability to identify previously unreported cancer-related alterations.