RT Journal Article
SR Electronic
T1 Splice Expression Variation Analysis (SEVA) for Inter-tumor Heterogeneity of Gene Isoform Usage in Cancer
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 091637
DO 10.1101/091637
A1 Bahman Afsari
A1 Theresa Guo
A1 Michael Considine
A1 Liliana Florea
A1 Luciane T. Kagohara
A1 Genevieve L. Stein-O’Brien
A1 Dylan Kelley
A1 Emily Flam
A1 Kristina D. Zambo
A1 Patrick K. Ha
A1 Donald Geman
A1 Michael F. Ochs
A1 Joseph A. Califano
A1 Daria A. Gaykalova
A1 Alexander V. Favorov
A1 Elana J. Fertig
YR 2018
UL http://biorxiv.org/content/early/2018/01/11/091637.abstract
AB Motivation: Current bioinformatics methods to detect changes in gene isoform usage in distinct phenotypes compare the relative expected isoform usage in phenotypes. These statistics model differences in isoform usage in normal tissues, which have stable regulation of gene splicing. Pathological conditions, such as cancer, can have broken regulation of splicing that increases the heterogeneity of the expression of splice variants. Inferring events with such differential heterogeneity in gene isoform usage requires new statistical approaches. Results: We introduce Splice Expression Variability Analysis (SEVA) to model increased heterogeneity of splice variant usage between conditions (e.g., tumor and normal samples). SEVA uses a rank-based multivariate statistic that compares the variability of junction expression profiles within one condition to the variability within another. Simulated data show that SEVA is unique in modeling heterogeneity of gene isoform usage, and we benchmark SEVA’s performance against EBSeq, DiffSplice, and rMATS, which model differential isoform usage instead of heterogeneity. We confirm the accuracy of SEVA in identifying known splice variants in head and neck cancer and perform cross-study validation of novel splice variants. A novel comparison of splice variant heterogeneity between subtypes of head and neck cancer demonstrated unanticipated similarity between the heterogeneity of gene isoform usage in HPV-positive and HPV-negative subtypes and anticipated increased heterogeneity among HPV-negative samples with mutations in genes that regulate the splice variant machinery. Conclusion: These results show that SEVA accurately models differential heterogeneity of gene isoform usage from RNA-seq data. Availability: SEVA is implemented in the R/Bioconductor package GSReg. Contact: bahman{at}jhu.edu, favorov{at}sensi.org, ejfertig{at}jhmi.edu