bioRxiv
New Results

Targeted in silico characterization of fusion transcripts in tumor and normal tissues via FusionInspector

Brian J. Haas, Alexander Dobin, Mahmoud Ghandi, Anne Van Arsdale, Timothy Tickle, James T. Robinson, Riaz Gillani, Simon Kasif, Aviv Regev
doi: https://doi.org/10.1101/2021.08.02.454639
Brian J. Haas
1Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
2Graduate Program in Bioinformatics, Boston University, Boston, MA, 02215, USA
  • For correspondence: bhaas@broadinstitute.org
Alexander Dobin
4Cold Spring Harbor Laboratory, New York, 11724
Mahmoud Ghandi
5Monte Rosa Therapeutics, Boston, MA
Anne Van Arsdale
6Department of Obstetrics and Gynecology and Women’s Health, Albert Einstein Montefiore Medical Center, Bronx, NY
7Department of Genetics, Albert Einstein College of Medicine, Bronx, NY
Timothy Tickle
1Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
James T. Robinson
8School of Medicine, University of California San Diego, La Jolla, California
Riaz Gillani
9Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
10Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
11Department of Pediatrics, Harvard Medical School, Boston, MA 02215, USA
12Boston Children’s Hospital, Boston, MA 02115, USA
Simon Kasif
2Graduate Program in Bioinformatics, Boston University, Boston, MA, 02215, USA
13Department of Biomedical Engineering, Boston University, Boston, MA, 02215, USA
Aviv Regev
1Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
14Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
15Howard Hughes Medical Institute, Chevy Chase, MD, USA

Abstract

Background Gene fusions play a key role as driving oncogenes in tumors, and their reliable discovery and detection are important for cancer research, diagnostics, prognostics and guiding personalized therapy. While discovering gene fusions from genome sequencing can be laborious and costly, the resulting “fusion transcripts” can be recovered from RNA-seq data of tumor and normal samples. However, putative fusion transcripts can arise from multiple sources in addition to the chromosomal rearrangements yielding fusion genes, including cis- or trans-splicing events, experimental artifacts during RNA-seq or computational errors of transcriptome reconstruction methods. Understanding how to discern, interpret, categorize, and verify predicted fusion transcripts is essential for consideration in clinical settings and prioritization for further research. Here, we present FusionInspector for in silico characterization and interpretation of candidate fusion transcripts from RNA-seq, enabling exploration of sequence and expression characteristics of fusions and their partner genes.

Results We applied FusionInspector to thousands of tumor and normal transcriptomes, and identified statistical and experimental features enriched among biologically impactful fusions. Through clustering and machine learning, we identified large collections of fusions potentially relevant to tumor and normal biological processes. We show that biologically relevant fusions are enriched for relatively high expression of the fusion transcript, imbalanced fusion allelic ratios, and canonical splicing patterns, and are deficient in sequence microhomologies detected between partner genes.

Conclusion We demonstrate that FusionInspector accurately validates fusion transcripts in silico and helps identify numerous understudied fusions in tumor and normal tissue samples. FusionInspector is freely available as open-source software for screening, characterization, and visualization of candidate fusions via RNA-seq. We believe that this work will continue driving the discipline of transparent explanation and interpretation of machine learning predictions and tracing the predictions to their experimental sources.

Competing Interest Statement

A.R. is a co-founder and equity holder of Celsius Therapeutics, an equity holder in Immunitas, and was a scientific advisory board member of ThermoFisher Scientific, Syros Pharmaceuticals, Neogene Therapeutics and Asimov until 31 July 2020. From 1 August 2020, A.R. has been an employee of Genentech. M.G. is a current employee and stockholder of Monte Rosa Therapeutics.

Footnotes

  • https://github.com/broadinstitute/FusionInspectorPaper

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted August 04, 2021.
Subject Area

  • Bioinformatics