Abstract
The amount of biological information available on genetic markers is constantly increasing and provides valuable insight into the underlying biology of traits of interest. This information can also be used to inform the models applied for genomic selection and thereby improve predictions. The objective of this study was to propose a general model for genomic selection, using a link function approach within the hierarchical generalized linear model (hglm) framework, that can include external information on the markers. These models can be fitted using the well-established hglm package in R. Furthermore, we present an R package (CodataGS) to fit these models, which is significantly faster than the hglm package when the number of markers greatly exceeds the number of individuals. Simulated data were used to validate the proposed model. Knowledge of the location of the QTLs on the genome, with varying degrees of uncertainty, was used as external information on the markers. The proposed model improved accuracy by 3.8% up to 23.2% compared to the SNP-BLUP method, which assumes equal variances for all markers. The performance of the proposed models depended on the genetic architecture of the trait: traits that deviate from the infinitesimal model benefited more from the external information. The gain in accuracy also depended on the degree of uncertainty of the external information provided to the model. The usefulness of these types of models is expected to increase over time as more accurate information on the markers becomes available.
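
For orientation, the kind of model described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact specification used in the study: the notation ($\mathbf{y}$, $\mathbf{X}$, $\mathbf{Z}$, $\mathbf{w}_j$, $\boldsymbol{\gamma}$) and the choice of a log link for the marker variances are assumptions introduced here.

```latex
% Sketch only: notation and the log link are assumptions, not taken from the text.
\begin{aligned}
\mathbf{y} &= \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{u} + \mathbf{e},
  \qquad \mathbf{e} \sim N(\mathbf{0}, \sigma_e^2 \mathbf{I}), \\
u_j &\sim N(0, \sigma_{u_j}^2), \qquad j = 1, \dots, m, \\
\log \sigma_{u_j}^2 &= \mathbf{w}_j^{\top} \boldsymbol{\gamma}.
\end{aligned}
```

Here $\mathbf{y}$ holds the phenotypes, $\mathbf{Z}$ the marker genotypes of the $m$ markers, $u_j$ the effect of marker $j$, and $\mathbf{w}_j$ the external information on marker $j$ (for example, an indicator of proximity to a known QTL). If $\mathbf{w}_j$ contains only an intercept, all marker variances are equal and the sketch reduces to SNP-BLUP.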








