bioRxiv
New Results

Extracting a Biologically Relevant Latent Space from Cancer Transcriptomes with Variational Autoencoders

Gregory P. Way, Casey S. Greene
doi: https://doi.org/10.1101/174474
Gregory P. Way
Genomics and Computational Biology Graduate Program, University of Pennsylvania, Philadelphia, PA 19104, USA
  • For correspondence: gregway@mail.med.upenn.edu
Casey S. Greene
Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, Philadelphia, PA 19104, USA
  • For correspondence: csgreene@mail.med.upenn.edu

Abstract

The Cancer Genome Atlas (TCGA) has profiled over 10,000 tumors across 33 cancer types for many genomic features, including gene expression levels. Gene expression measurements capture substantial information about the state of each tumor. Certain classes of deep neural network models are capable of learning a meaningful latent space. Such a latent space could be used to explore and generate hypothetical gene expression profiles under various types of molecular and genetic perturbation. For example, one might use such a model to predict a tumor’s response to specific therapies or to characterize complex gene expression activations that are present in different proportions across tumors. Variational autoencoders (VAEs) are a deep neural network approach capable of generating meaningful latent spaces for image and text data. In this work, we sought to determine the extent to which a VAE can be trained to model cancer gene expression, and whether such a VAE would capture biologically relevant features. In this report, we introduce a VAE trained on TCGA pan-cancer RNA-seq data, identify specific patterns in the VAE-encoded features, and discuss potential merits of the approach. We name our method “Tybalt” after an instigative, cat-like character who sets a cascading chain of events in motion in Shakespeare’s “Romeo and Juliet”. From a systems biology perspective, Tybalt could one day aid in cancer stratification or predict specific activated expression patterns that would result from genetic changes or treatment effects.
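
For readers unfamiliar with VAEs, the following is a minimal sketch of the generic VAE training objective, not the Tybalt implementation itself. It shows the two ingredients that shape the latent space: sampling a latent vector with the reparameterization trick, and a loss that adds a KL penalty to a reconstruction term. The binary cross-entropy reconstruction loss, the assumption that expression values are scaled to [0, 1], and every name here (`reparameterize`, `kl_to_standard_normal`, `vae_loss`, `kl_weight`) are illustrative assumptions that the abstract does not specify.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps elementwise, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(sigma^2)) || N(0, I)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def binary_cross_entropy(x, x_hat, eps=1e-7):
    """Reconstruction loss summed over genes (assumes inputs in [0, 1])."""
    total = 0.0
    for xi, pi in zip(x, x_hat):
        pi = min(max(pi, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= xi * math.log(pi) + (1.0 - xi) * math.log(1.0 - pi)
    return total


def vae_loss(x, x_hat, mu, log_var, kl_weight=1.0):
    """Reconstruction term plus weighted KL term (hypothetical weighting)."""
    return binary_cross_entropy(x, x_hat) + kl_weight * kl_to_standard_normal(mu, log_var)


# Toy usage with made-up numbers: 3 "genes", 2 latent features.
rng = random.Random(0)
x = [0.1, 0.8, 0.5]                    # hypothetical scaled expression values
mu, log_var = [0.0, 0.0], [0.0, 0.0]   # stand-in for encoder output
z = reparameterize(mu, log_var, rng)   # latent sample a decoder would consume
x_hat = [0.5, 0.5, 0.5]                # stand-in for decoder output
loss = vae_loss(x, x_hat, mu, log_var)
```

In a real model the encoder and decoder would be neural networks trained by gradient descent on this loss; the stand-in values above only show how the pieces fit together.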

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted October 02, 2017.
Subject Area

  • Bioinformatics