bioRxiv
New Results

Generalising Better: Applying Deep Learning To Integrate Deleteriousness Prediction Scores For Whole-Exome SNV Studies

Ilia Korvigo, Andrey Afanasyev, Nikolay Romashchenko, Mihail Skoblov
doi: https://doi.org/10.1101/126532
Ilia Korvigo
Moscow Institute for Physics and Technology;
  • For correspondence: ilia.korvigo@gmail.com
Andrey Afanasyev
Moscow Institute for Physics and Technology;
Nikolay Romashchenko
All-Russia Research Institute for Agricultural Microbiology
Mihail Skoblov
Moscow Institute for Physics and Technology;

Abstract

Many automatic classifiers have been introduced to aid inference of the phenotypical effects of uncategorised nsSNVs (nonsynonymous Single Nucleotide Variations) in theoretical and medical applications. Lately, several meta-estimators have been proposed that combine different predictors, such as PolyPhen and SIFT, to integrate more information in a single score. Although many advances have been made in feature design and in the machine learning algorithms used, the shortage of high-quality reference data, along with the bias towards intensively studied in vitro models, calls for improved generalisation ability in order to further increase classification accuracy and handle records with insufficient data. Since a meta-estimator basically combines different scoring systems with highly complicated nonlinear relationships, we investigated how deep learning (supervised and unsupervised), which is particularly efficient at discovering hierarchies of features, can improve classification performance. While it is commonly believed that one should only use deep learning for high-dimensional input spaces and other models (logistic regression, support vector machines, Bayesian classifiers, etc.) for simpler inputs, we still believe that the ability of neural networks to discover intricate structure in highly heterogeneous datasets can aid a meta-estimator. We compare the performance of our models with various popular predictors, many of which are recommended by the American College of Medical Genetics and Genomics (ACMG), as well as with available deep learning-based predictors. Thanks to hardware acceleration, we were able to use a computationally expensive genetic algorithm to stochastically optimise hyper-parameters over many generations. Overfitting was mitigated by noise injection and dropout, which limit co-adaptation of hidden units.
Although we stress that this work was not conceived as a tool comparison, but rather as an exploration of the possibilities of applying deep learning to ensemble scores, our results show that even relatively simple modern neural networks can significantly improve both prediction accuracy and coverage. We provide open access to our best model at http://score.generesearch.ru
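
The genetic hyper-parameter search mentioned in the abstract is not specified on this page, so the following is only a minimal sketch of the general technique, not the authors' procedure. The search space (`dropout_rate`, `noise_std`, `log10_learning_rate`), the operators (truncation selection, uniform crossover, Gaussian mutation, elitism), and the `toy_fitness` function are all assumptions chosen for illustration; in practice the fitness would be a validation score obtained by training a network with the candidate hyper-parameters.

```python
import random

# Hypothetical search space: (low, high) bounds per hyper-parameter.
SEARCH_SPACE = {
    "dropout_rate": (0.0, 0.8),
    "noise_std": (0.0, 0.5),
    "log10_learning_rate": (-5.0, -1.0),
}


def toy_fitness(params):
    """Stand-in for a validation score; a real run would train a network here."""
    target = {"dropout_rate": 0.3, "noise_std": 0.1, "log10_learning_rate": -3.0}
    return -sum((params[k] - target[k]) ** 2 for k in target)


def random_individual(rng):
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in SEARCH_SPACE.items()}


def crossover(a, b, rng):
    # Uniform crossover: each gene is copied from one of the two parents.
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in SEARCH_SPACE}


def mutate(ind, rng, rate=0.3, scale=0.1):
    # Gaussian mutation, clipped back into the allowed range.
    child = dict(ind)
    for k, (lo, hi) in SEARCH_SPACE.items():
        if rng.random() < rate:
            step = rng.gauss(0.0, scale * (hi - lo))
            child[k] = min(hi, max(lo, child[k] + step))
    return child


def evolve(fitness, generations=30, pop_size=20, n_elite=2, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: pop_size // 2]  # truncation selection
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - n_elite)
        ]
        population = ranked[:n_elite] + children  # elitism keeps the best so far
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best_params, best_history = evolve(toy_fitness)
```

Because the top individuals are carried over unchanged, the best fitness in this sketch never decreases from one generation to the next; the cost is dominated by the fitness evaluations, which is why such searches tend to benefit from hardware acceleration when each evaluation involves training a network.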

Copyright 
The copyright holder for this preprint is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
  • Posted April 11, 2017.

Subject Area

  • Bioinformatics