RT Journal Article
SR Electronic
T1 GLnexus: joint variant calling for large cohort sequencing
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 343970
DO 10.1101/343970
A1 Michael F. Lin
A1 Ohad Rodeh
A1 John Penn
A1 Xiaodong Bai
A1 Jeffrey G. Reid
A1 Olga Krasheninina
A1 William J. Salerno
YR 2018
UL http://biorxiv.org/content/early/2018/06/11/343970.abstract
AB As ever-larger cohorts of human genomes are collected in pursuit of genotype/phenotype associations, sequencing informatics must scale up to yield complete and accurate genotypes from vast raw datasets. Joint variant calling, a data processing step entailing simultaneous analysis of all participants sequenced, exhibits this scaling challenge acutely. We present GLnexus (GL, Genotype Likelihood), a system for joint variant calling designed to scale up to the largest foreseeable human cohorts. GLnexus combines scalable joint calling algorithms with a persistent database that grows efficiently as additional participants are sequenced. We validate GLnexus using 50,000 exomes to show it produces comparable or better results than existing methods, at a fraction of the computational cost with better scaling. We provide a standalone open-source version of GLnexus and a DNAnexus cloud-native deployment supporting very large projects, which has been employed for cohorts of >240,000 exomes and >22,000 whole-genomes.