TY - JOUR
T1 - Phasing and imputation of single nucleotide polymorphism data of missing parents of bi-parental plant populations
JF - bioRxiv
DO - 10.1101/2020.02.07.938795
SP - 2020.02.07.938795
AU - Serap Gonen
AU - Valentin Wimmer
AU - R. Chris Gaynor
AU - Ed Byrne
AU - Gregor Gorjanc
AU - John M. Hickey
Y1 - 2020/01/01
UR - http://biorxiv.org/content/early/2020/02/07/2020.02.07.938795.abstract
N2 - This paper presents an extension to a heuristic method for phasing and imputation of genotypes of descendants in bi-parental populations so that it can phase and impute genotypes of parents of bi-parental populations that are fully ungenotyped or partially genotyped. The imputed genotypes of the parent are then used to impute low-density genotyped descendants of the bi-parental population to high-density. The extension works in three steps. First, it identifies whether a parent has no or low-density genotypes available and it identifies all of its relatives that have high-density genotypes. Second, using the high-density information of relatives, it determines whether the parent is homozygous or heterozygous for a given locus. Third, it phases heterozygous positions of the parent by matching haplotypes to its relatives. We implemented the new algorithm in an extension of the AlphaPlantImpute software and tested its accuracy of imputing missing parent genotypes in simulated bi-parental populations from different scenarios. We also tested the accuracy of imputation of the missing parent’s descendants using the true genotype of the parent and compared this to using the imputed genotypes of the parent. Our results show that across all scenarios, the accuracy of imputation of a parent, measured as the correlation between true and imputed genotypes, was > 0.98 and did not drop below ∼ 0.96. The imputation accuracy of a parent was always higher when it was inbred than when it was outbred and when it had low-density genotypes.
Including ancestors of the parent genotyped at high density, increasing the number of crosses, and increasing the number of high-density descendants all increased the accuracy of imputation. The high imputation accuracy achieved for the parent across all scenarios translated to little or no impact on the accuracy of imputation of its descendants at low-density. Key Message: New fast and accurate method for phasing and imputation of SNP chip genotypes within diploid bi-parental plant populations.
ER -