New Results

GLnexus: joint variant calling for large cohort sequencing

Michael F. Lin, Ohad Rodeh, John Penn, Xiaodong Bai, Jeffrey G. Reid, Olga Krasheninina, William J. Salerno
doi: https://doi.org/10.1101/343970
Michael F. Lin
DNAnexus
  • For correspondence: mlin@dnanexus.com
Ohad Rodeh
DNAnexus
John Penn
Regeneron Genetics Center
Xiaodong Bai
Regeneron Genetics Center
Jeffrey G. Reid
Regeneron Genetics Center
Olga Krasheninina
Human Genome Sequencing Center, Baylor College of Medicine
William J. Salerno
Human Genome Sequencing Center, Baylor College of Medicine

ABSTRACT

As ever-larger cohorts of human genomes are collected in pursuit of genotype/phenotype associations, sequencing informatics must scale up to yield complete and accurate genotypes from vast raw datasets. Joint variant calling, a data processing step entailing simultaneous analysis of all participants sequenced, exhibits this scaling challenge acutely. We present GLnexus (GL, Genotype Likelihood), a system for joint variant calling designed to scale up to the largest foreseeable human cohorts. GLnexus combines scalable joint calling algorithms with a persistent database that grows efficiently as additional participants are sequenced. We validate GLnexus using 50,000 exomes to show that it produces results comparable to or better than those of existing methods, at a fraction of the computational cost and with better scaling. We provide a standalone open-source version of GLnexus and a DNAnexus cloud-native deployment supporting very large projects, which has been employed for cohorts of >240,000 exomes and >22,000 whole genomes.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted June 11, 2018.

Subject Area

  • Bioinformatics