RT Journal Article
SR Electronic
T1 High Resolution Ancestry Deconvolution for Next Generation Genomic Data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2021.09.19.460980
DO 10.1101/2021.09.19.460980
A1 Helgi Hilmarsson
A1 Arvind S. Kumar
A1 Richa Rastogi
A1 Carlos D. Bustamante
A1 Daniel Mas Montserrat
A1 Alexander G. Ioannidis
YR 2021
UL http://biorxiv.org/content/early/2021/09/21/2021.09.19.460980.abstract
AB As genome-wide association studies and genetic risk prediction models are extended to globally diverse and admixed cohorts, ancestry deconvolution has become an increasingly important tool. Also known as local ancestry inference (LAI), this technique identifies the ancestry of each region of an individual’s genome, thus permitting downstream analyses to account for genetic effects that vary between ancestries. Since existing LAI methods were developed before the rise of massive, whole genome biobanks, they are computationally burdened by these large next generation datasets. Current LAI algorithms also fail to harness the potential of whole genome sequences, falling well short of the accuracy that such high variant densities can enable. Here we introduce Gnomix, a set of algorithms that address each of these points, achieving higher accuracy and swifter computational performance than any existing LAI method, while also enabling portable models that are particularly useful when training data are not shareable due to privacy or other restrictions. We demonstrate Gnomix (and its swift phase correction counterpart Gnofix) on worldwide whole-genome data from both humans and canids and utilize its high resolution accuracy to identify the location of ancient New World haplotypes in the Xoloitzcuintle, dating back over 100 generations. Code is available at https://github.com/AI-sandbox/gnomix. Competing Interest Statement: CDB is the founder and CEO of Galatea Bio Inc and on the boards of Genomics PLC and Etalon.