Subset selection of markers for genome-enabled prediction of genetic values using radial basis function neural networks

Isabela de Castro Sant’Anna; Gabi Nunes Silva; Moysés Nascimento; Cosme Damião Cruz

doi:10.1101/490474

Abstract

This paper evaluates the efficiency of subset selection of markers for genome-enabled prediction of genetic values using radial basis function neural networks (RBFNN). For this purpose, an F1 population of 500 individuals, obtained by hybridization of divergent parents and genotyped with 1,000 SNP markers, was simulated. Phenotypic traits were simulated under three gene action models (additive, additive-dominant, and epistatic), with two dominance situations (partial and complete) and two heritabilities (h² = 30% and 60%). Each trait was controlled by 50 loci with two alleles per locus, for a total of 12 scenarios. To evaluate the predictive ability of RR-BLUP and the neural networks, a cross-validation procedure with five replicates was used, with the models trained on 80% of the individuals in the population. Two methods were used: dimensionality reduction and stepwise regression. Predictive reliability was measured as the square of the correlation (R²) between the genomic estimated breeding value (GEBV) and the phenotypic value. For h² = 30% in the additive scenario, R² was 59% for RBFNN and 57% for RR-BLUP; in the epistatic scenario, R² was 50% and 41%, respectively. The difference in performance between the techniques is even greater for the root mean squared error (RMSE). In the additive scenario, the RMSE estimates were 91 for RR-BLUP and 5 for the neural networks; in the most critical scenario, they were 427 for RR-BLUP and 20 for the neural networks. The results show that neural networks combined with variable selection techniques can capture epistatic interactions, improving the accuracy of prediction of the genetic value and, above all, greatly reducing the mean squared error, which indicates greater genomic value.
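The two evaluation measures named in the abstract, the squared correlation between predicted and observed values and the root mean squared error, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes a toy additive trait with markers coded 0/1/2, reads the "five replicates" as five folds (so each model is trained on 80% of the individuals), and uses a plain ridge regression as a stand-in for RR-BLUP with a heuristic shrinkage parameter. All function names and simulation settings other than the population size, marker count, number of loci, and heritability are hypothetical.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Fit ridge regression on centered markers; the intercept is not penalized."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Solve (Xc'Xc + lam * I) beta = Xc'(y - y_mean)
    A = Xc.T @ Xc + lam * np.eye(X.shape[1])
    beta = np.linalg.solve(A, Xc.T @ (y - y_mean))
    return x_mean, y_mean, beta


def ridge_predict(model, X):
    """Predict values for new individuals from a fitted ridge model."""
    x_mean, y_mean, beta = model
    return y_mean + (X - x_mean) @ beta


def squared_correlation(y_true, y_pred):
    """Square of the Pearson correlation between observed and predicted values."""
    r = np.corrcoef(y_true, y_pred)[0, 1]
    return float(r ** 2)


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def five_fold_cv(X, y, lam, seed=0):
    """Return one (R^2, RMSE) pair per fold; each fit uses 80% of the individuals."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), 5)
    results = []
    for k in range(5):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(5) if j != k])
        model = ridge_fit(X[train], y[train], lam)
        pred = ridge_predict(model, X[test])
        results.append((squared_correlation(y[test], pred), rmse(y[test], pred)))
    return results


# Toy data (assumption): 500 individuals, 1,000 markers, 50 causal loci, h2 = 0.3.
rng = np.random.default_rng(42)
n, p, n_qtl, h2 = 500, 1000, 50, 0.3
X = rng.integers(0, 3, size=(n, p)).astype(float)   # genotypes coded 0/1/2
qtl = rng.choice(p, size=n_qtl, replace=False)       # positions of causal loci
g = X[:, qtl] @ rng.normal(size=n_qtl)               # additive genetic values
var_e = g.var() * (1 - h2) / h2                      # residual variance implied by h2
y = g + rng.normal(scale=np.sqrt(var_e), size=n)     # simulated phenotypes

# Heuristic shrinkage (assumption): residual-to-marker variance ratio implied by h2.
lam = (1 - h2) / h2 * X.var(axis=0).sum()

results = five_fold_cv(X, y, lam)
for fold, (r2, err) in enumerate(results, start=1):
    print(f"fold {fold}: R^2 = {r2:.3f}, RMSE = {err:.3f}")
```

Replacing `ridge_fit` and `ridge_predict` with another predictor, for example a network trained on a selected subset of markers, leaves the evaluation loop unchanged, so both measures stay comparable across methods.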

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.