Abstract
Genomic selection models use Single Nucleotide Polymorphism (SNP) markers to predict phenotypes. However, these predictive models face challenges due to the high dimensionality of genome-wide SNP marker data. Thanks to recent breakthroughs in DNA sequencing and decreased sequencing costs, the study of novel genomic variants such as Structural Variations (SVs) and Transposable Elements (TEs) has become increasingly prevalent. In this paper, we develop a deep convolutional neural network model, NovGMDeep, to predict phenotypes using SV and TE markers for genomic selection. The proposed model is trained and tested on samples of A. thaliana and O. sativa using k-fold cross-validation. Prediction accuracy is evaluated using Pearson’s Correlation Coefficient (PCC), Mean Absolute Error (MAE), and the Standard Deviation (SD) of MAE. Predictions correlated more strongly with observed phenotypes when the model was trained with SVs and TEs than when it was trained with SNPs. NovGMDeep also achieved higher prediction accuracy than conventional statistical models. This work sheds light on the previously unrecognized role of SVs and TEs in genotype-to-phenotype associations and on their broad significance and value for crop development.
Competing Interest Statement
The authors have declared no competing interest.