bioRxiv
New Results

A Simple Deep Learning Approach for Detecting Duplications and Deletions in Next-Generation Sequencing Data

Tom Hill, Robert L. Unckless
doi: https://doi.org/10.1101/657361
Tom Hill
4055 Haworth Hall, The Department of Molecular Biosciences, University of Kansas, 1200 Sunnyside Avenue, Lawrence, KS 66045
  • For correspondence: tom.hill@ku.edu
Robert L. Unckless
4055 Haworth Hall, The Department of Molecular Biosciences, University of Kansas, 1200 Sunnyside Avenue, Lawrence, KS 66045

Abstract

Copy number variants (CNVs) are associated with phenotypic variation in several species. However, accurately detecting changes in the copy number of sequences remains a difficult problem, especially in lower-quality or lower-coverage next-generation sequencing data. Here, inspired by recent applications of machine learning in genomics, we describe a method to detect duplications and deletions in short-read sequencing data. In low-coverage data, machine learning appears to be more powerful in detecting CNVs than the gold-standard methods or coverage estimation alone, and of equal power in high-coverage data. We also demonstrate how replicating training sets allows more precise detection of CNVs, even identifying novel CNVs in two genomes previously surveyed thoroughly for CNVs using long-read data.
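For orientation, the "coverage estimation alone" baseline mentioned above can be pictured with a minimal sketch. This is not the authors' method or code: the function name `call_cnv_windows`, the fixed ratio thresholds, and the example depths are all illustrative assumptions. It simply labels each window by its read depth relative to the median depth.

```python
from statistics import median


def call_cnv_windows(depths, dup_ratio=1.5, del_ratio=0.5):
    """Label windows as 'dup', 'del', or 'normal' from depth alone.

    Hypothetical baseline: thresholds are illustrative assumptions,
    not values taken from the preprint.
    """
    baseline = median(depths)
    if baseline <= 0:
        raise ValueError("median depth must be positive")
    calls = []
    for depth in depths:
        ratio = depth / baseline  # depth relative to the typical window
        if ratio >= dup_ratio:
            calls.append("dup")
        elif ratio <= del_ratio:
            calls.append("del")
        else:
            calls.append("normal")
    return calls


# Made-up per-window mean depths for a sample at roughly 30x coverage.
example_depths = [30, 31, 29, 62, 60, 30, 2, 1, 30, 28]
print(call_cnv_windows(example_depths))
```

A threshold rule like this degrades when coverage is low and noisy, which is the regime where the abstract reports that a trained classifier appears to do better.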

Available at: https://github.com/tomh1lll/dudeml

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.
Posted June 03, 2019.

Subject Area

  • Bioinformatics