RT Journal Article
SR Electronic
T1 Accurate, scalable cohort variant calls using DeepVariant and GLnexus
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.02.10.942086
DO 10.1101/2020.02.10.942086
A1 Yun, Taedong
A1 Li, Helen
A1 Chang, Pi-Chuan
A1 Lin, Michael F.
A1 Carroll, Andrew
A1 McLean, Cory Y.
YR 2020
UL http://biorxiv.org/content/early/2020/02/11/2020.02.10.942086.abstract
AB Population-scale sequenced cohorts are foundational resources for many genetic analyses, but creating them from single-sample variant calls remains challenging. Here we introduce an open-source cohort-calling method that uses the highly accurate germline caller DeepVariant and scalable merging tool GLnexus. Using callset quality metrics based on variant recall and precision in benchmark samples and Mendelian consistency in father-mother-child trios, we optimized the method across a range of cohort sizes, sequencing methods, and sequencing depths. The resulting callsets show consistent quality improvements over those generated using existing best practices. We further evaluated the DeepVariant+GLnexus pipeline in the deeply sequenced 1000 Genomes Project phase 3 samples (1KGP) and show superior callset quality metrics and imputation reference panel performance compared to an independently-generated GATK Best Practices pipeline. We publicly release the 1KGP individual-level variant calls and cohort callset to foster additional development and evaluation of cohort merging methods as well as broad studies of genetic variation.