TY  - JOUR
T1  - SF3B1ness score: screening SF3B1 mutation status from over 60,000 transcriptomes based on a machine learning approach
JF  - bioRxiv
DO  - 10.1101/572834
SP  - 572834
AU  - Yuichi Shiraishi
AU  - Kenichi Chiba
AU  - Ai Okada
Y1  - 2019/01/01
UR  - http://biorxiv.org/content/early/2019/03/09/572834.abstract
N2  - In precision oncology, genomic evidence is used to determine the optimal treatment for each patient. However, identifying somatic mutations from genome sequencing data is often technically difficult, and the functional significance of somatic mutations is inconclusive in many cases. In this paper, to seek an alternative approach, we tackle the problem of predicting functional mutations from transcriptome sequencing data. Focusing on SF3B1, a key splicing factor gene, we develop the SF3B1ness score for classifying functional mutation status using a combination of a Naive Bayes classifier and zero-inflated beta-binomial modeling (an R package is available at https://github.com/friend1WS/SF3B1ness). Using 8,992 TCGA exome and RNA sequencing data for evaluation, we show that the classifier based on the SF3B1ness score is able to (1) attain very high precision (>93%) and sensitivity (>95%), (2) rescue several somatic mutations not identified by exome sequence analysis, especially due to low variant allele frequencies, and (3) successfully measure the functional importance of somatic mutations whose significance has been unknown. Furthermore, to demonstrate that the SF3B1ness score is highly robust and can be extended to cohorts outside the training data, we performed a functional SF3B1 mutation screening on 51,577 additional transcriptome sequencing data. We detected 135 samples with putative SF3B1 functional mutations, including those that are rarely registered in the somatic mutation database (e.g., G664C, L747W, and R775G). Moreover, we identified two cases with SF3B1 mutations from normal tissues, implying that the SF3B1ness score can be used for detecting clonal hematopoiesis.
ER  - 