Abstract
Genomic prediction, the use of genetic information to predict traits, has accelerated breeding and provided mechanistic insights into the genetic bases of complex traits. While substantial effort has been devoted to optimizing genomic prediction, several factors remain to be explored further, including the impact of genome assemblies, genotyping approaches, variant types, allelic complexities, polyploidy levels, and population structures. Here, we assess the impact of these factors on the prediction of 20 complex traits in switchgrass (Panicum virgatum L.), a perennial biofuel feedstock. We found that a short read-based genome assembly performs comparably to, or even better than, a long read-based assembly in trait prediction; that exome capture-based models have higher prediction accuracy than genotyping-by-sequencing-based models for 13 traits; and that bi-allelic insertions/deletions are as useful as bi-allelic single nucleotide polymorphisms in trait prediction, whereas multi-allelic variants outperform bi-allelic ones for 15 traits. Models built for tetraploids have higher prediction accuracy than those for octoploids for most traits. Traits of individuals with higher within-population genetic distances tend to have higher prediction accuracy. Finally, integrating different types of variants can improve prediction accuracy. Across the factors explored, anthesis date prediction models built using multi-allelic insertions/deletions derived from exome capture yielded the largest number of orthologs of benchmark flowering time genes among all models. Our study provides insights into the factors influencing genomic prediction outcomes, informing best practices for future studies and for improving agronomic traits in switchgrass and other species through selective breeding.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
The main text of this manuscript has been revised to convey our findings more clearly.