PT - JOURNAL ARTICLE
AU - Michael F. Lin
AU - Ohad Rodeh
AU - John Penn
AU - Xiaodong Bai
AU - Jeffrey G. Reid
AU - Olga Krasheninina
AU - William J. Salerno
TI - GLnexus: joint variant calling for large cohort sequencing
AID - 10.1101/343970
DP - 2018 Jan 01
TA - bioRxiv
PG - 343970
4099 - http://biorxiv.org/content/early/2018/06/11/343970.short
4100 - http://biorxiv.org/content/early/2018/06/11/343970.full
AB - As ever-larger cohorts of human genomes are collected in pursuit of genotype/phenotype associations, sequencing informatics must scale up to yield complete and accurate genotypes from vast raw datasets. Joint variant calling, a data processing step entailing simultaneous analysis of all participants sequenced, exhibits this scaling challenge acutely. We present GLnexus (GL, Genotype Likelihood), a system for joint variant calling designed to scale up to the largest foreseeable human cohorts. GLnexus combines scalable joint calling algorithms with a persistent database that grows efficiently as additional participants are sequenced. We validate GLnexus using 50,000 exomes to show it produces comparable or better results than existing methods, at a fraction of the computational cost with better scaling. We provide a standalone open-source version of GLnexus and a DNAnexus cloud-native deployment supporting very large projects, which has been employed for cohorts of >240,000 exomes and >22,000 whole-genomes.