RT Journal Article
SR Electronic
T1 Pan-Cancer modelling of genomic alterations through gene expression
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 492561
DO 10.1101/492561
A1 Federico M. Giorgi
A1 Forest Ray
YR 2018
UL http://biorxiv.org/content/early/2018/12/10/492561.abstract
AB Cancer is a disease often characterized by the presence of multiple genomic alterations, which trigger altered transcriptional patterns and gene expression, which in turn sustain the processes of tumorigenesis, tumor progression and tumor maintenance. The links between genomic alterations and gene expression profiles can be utilized as the basis to build specific molecular tumorigenic relationships. In this study we perform pan-cancer predictions of the presence of single somatic mutations and copy number variations using machine learning approaches on gene expression profiles. We show that gene expression can be used to predict genomic alterations in every tumor type, where some alterations are more predictable than others. We propose gene aggregation as a tool to improve the accuracy of alteration prediction models from gene expression profiles. Ultimately, we show how this principle can be beneficial in intrinsically noisy datasets, such as those based on single cell sequencing. Author Summary: In this article we show that transcript abundance can be used to predict the presence or absence of the majority of genomic alterations present in human cancer. We also show how these predictions can be improved by aggregating genes into small networks to counteract the effects of transcript measurement noise.