New Results

Complete assembly of parental haplotypes with trio binning

Sergey Koren, Arang Rhie, Brian P. Walenz, Alexander T. Dilthey, Derek M. Bickhart, Sarah B. Kingan, Stefan Hiendleder, John L. Williams, Timothy P. L. Smith, Adam M. Phillippy
doi: https://doi.org/10.1101/271486
Sergey Koren
1Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, Maryland, USA
Arang Rhie
1Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, Maryland, USA
Brian P. Walenz
1Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, Maryland, USA
Alexander T. Dilthey
1Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, Maryland, USA
2Institute of Medical Microbiology, Heinrich-Heine-University Düsseldorf, Düsseldorf, North Rhine-Westphalia, Germany
Derek M. Bickhart
3Cell Wall Biology and Utilization Laboratory, ARS USDA, Madison, Wisconsin, USA
Sarah B. Kingan
4Pacific Biosciences, Menlo Park, California, USA
Stefan Hiendleder
5Davies Research Centre, School of Animal and Veterinary Sciences, The University of Adelaide, Roseworthy SA, Australia
6Robinson Research Institute, The University of Adelaide, Adelaide SA, Australia
John L. Williams
5Davies Research Centre, School of Animal and Veterinary Sciences, The University of Adelaide, Roseworthy SA, Australia
Timothy P. L. Smith
7US Meat Animal Research Center, ARS USDA, Clay Center, Nebraska, USA
  • For correspondence: tim.smith@ars.usda.gov
Adam M. Phillippy
1Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, Maryland, USA
  • For correspondence: adam.phillippy@nih.gov

Abstract

Reference genome projects have historically selected inbred individuals to minimize heterozygosity and simplify assembly. We challenge this dogma and present a new approach designed specifically for heterozygous genomes. “Trio binning” uses short reads from two parental genomes to partition long reads from an offspring into haplotype-specific sets prior to assembly. Each haplotype is then assembled independently, resulting in a complete diploid reconstruction. On a benchmark human trio, this method achieved high accuracy and recovered complex structural variants missed by alternative approaches. To demonstrate its effectiveness on a heterozygous genome, we sequenced an F1 cross between cattle subspecies Bos taurus taurus and Bos taurus indicus, and completely assembled both parental haplotypes with NG50 haplotig sizes >20 Mbp and 99.998% accuracy, surpassing the quality of current cattle reference genomes. We propose trio binning as a new best practice for diploid genome assembly that will enable new studies of haplotype variation and inheritance.
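The partitioning step described above can be illustrated with a toy sketch. The abstract does not say how reads are assigned, so this example assumes a simple k-mer scheme: k-mers seen in only one parent's short reads serve as markers, and each offspring long read goes to the parent whose markers it contains more of. The function names, the value of `k`, and the tie rule are all hypothetical. Reverse complements and sequencing errors are ignored for brevity.

```python
def kmers(seq, k):
    """Yield every k-length substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def kmer_set(reads, k):
    """Collect the set of all k-mers observed in a list of reads."""
    found = set()
    for read in reads:
        found.update(kmers(read, k))
    return found


def bin_reads(offspring_reads, maternal_reads, paternal_reads, k=5):
    """Assign each offspring read to a parental bin (illustrative only).

    Assumption: a k-mer present in one parent's reads and absent from
    the other's marks that parent's haplotype.
    """
    maternal = kmer_set(maternal_reads, k)
    paternal = kmer_set(paternal_reads, k)
    maternal_only = maternal - paternal
    paternal_only = paternal - maternal

    bins = {"maternal": [], "paternal": [], "unassigned": []}
    for read in offspring_reads:
        m_hits = sum(1 for km in kmers(read, k) if km in maternal_only)
        p_hits = sum(1 for km in kmers(read, k) if km in paternal_only)
        if m_hits > p_hits:
            bins["maternal"].append(read)
        elif p_hits > m_hits:
            bins["paternal"].append(read)
        else:
            # No markers, or an exact tie: leave the read unassigned.
            bins["unassigned"].append(read)
    return bins


if __name__ == "__main__":
    # Toy parental reads that differ at a single base.
    mother = ["ACGTACGTAA"]
    father = ["ACGTTCGTAA"]
    child = ["ACGTACGT", "CGTTCGTA", "GGGGGGGG"]
    for name, reads in bin_reads(child, mother, father, k=5).items():
        print(name, reads)
```

In this sketch each bin would then be handed to an assembler separately, matching the abstract's statement that each haplotype is assembled independently. How to treat unassigned reads is a design choice the abstract does not address.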

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.
Posted February 26, 2018.
bioRxiv 271486; doi: https://doi.org/10.1101/271486

Subject Area

  • Genomics