TY  - JOUR
T1  - A Maize Practical Haplotype Graph Leverages Diverse NAM Assemblies
JF  - bioRxiv
DO  - 10.1101/2020.08.31.268425
SP  - 2020.08.31.268425
AU  - Jose A. Valdes Franco
AU  - Joseph L. Gage
AU  - Peter J. Bradbury
AU  - Lynn C. Johnson
AU  - Zachary R. Miller
AU  - Edward S. Buckler
AU  - M. Cinta Romay
Y1  - 2020/01/01
UR  - http://biorxiv.org/content/early/2020/09/28/2020.08.31.268425.abstract
N2  - As a result of millions of years of transposon activity, multiple rounds of ancient polyploidization, and large populations that preserve diversity, maize has an extremely structurally diverse genome, evidenced by high-quality genome assemblies that capture substantial levels of both tropical and temperate diversity. We generated a pangenome representation (the Practical Haplotype Graph, PHG) of these assemblies in a database, representing the pangenome haplotype diversity and providing an initial estimate of structural diversity. We leveraged the pangenome to accurately impute haplotypes and genotypes of taxa using various kinds of sequence data, ranging from WGS to extremely low-coverage GBS. We imputed the genotypes of the recombinant inbred lines of the NAM population with over 99% mean accuracy, while unrelated germplasm attained a mean imputation accuracy of 92% or 95% when using GBS or WGS data, respectively. Most of the imputation errors occur in haplotypes within European or tropical germplasm, which have yet to be represented in the maize PHG database. Also, the PHG stores the imputation data in a 30,000-fold more space-efficient manner than a standard genotype file, which is a key improvement when dealing with large-scale data. Competing Interest Statement: The authors have declared no competing interest.
ER  - 