Integrating environmental covariates and crop modeling into the genomic selection framework to predict genotype by environment interactions

  • Original Paper
Theoretical and Applied Genetics

Abstract

Key message

Development of models to predict genotype by environment interactions, in unobserved environments, using environmental covariates, a crop model and genomic selection. Application to a large winter wheat dataset.

Abstract

Genotype by environment interaction (G*E) is one of the key issues when analyzing phenotypes. The use of environment data to model G*E has long been a subject of interest but is limited by the same problems as those addressed by genomic selection methods: a large number of correlated predictors each explaining a small amount of the total variance. In addition, non-linear responses of genotypes to stresses are expected to further complicate the analysis. Using a crop model to derive stress covariates from daily weather data for predicted crop development stages, we propose an extension of the factorial regression model to genomic selection. This model is further extended to the marker level, enabling the modeling of quantitative trait loci (QTL) by environment interaction (Q*E), on a genome-wide scale. A newly developed ensemble method, soft rule fit, was used to improve this model and capture non-linear responses of QTL to stresses. The method is tested using a large winter wheat dataset, representative of the type of data available in a large-scale commercial breeding program. Accuracy in predicting genotype performance in unobserved environments for which weather data were available increased by 11.1 % on average and the variability in prediction accuracy decreased by 10.8 %. By leveraging agronomic knowledge and the large historical datasets generated by breeding programs, this new model provides insight into the genetic architecture of genotype by environment interactions and could predict genotype performance based on past and future weather scenarios.
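The marker-level extension of factorial regression described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are simulated, the covariates are random numbers rather than crop-model stress indices, the function names (`interaction_features`, `ridge_fit`, `leave_one_env_out`) are hypothetical, and plain ridge regression stands in for the methods used in the paper (it is not soft rule fit). Genotype and environment main effects are also left out to keep the example short.

```python
import numpy as np


def interaction_features(markers, covariates):
    """Build every marker-by-covariate product for each genotype-environment pair.

    markers:    (n_genotypes, n_markers) marker scores
    covariates: (n_environments, n_covariates) environmental covariates
    Returns an array of shape
    (n_genotypes * n_environments, n_markers * n_covariates),
    with rows ordered genotype-major (all environments of genotype 0 first).
    """
    return np.kron(markers, covariates)


def ridge_fit(X, y, lam):
    """Ridge regression coefficients computed in the dual form.

    The dual form solves an n-by-n system, which is convenient when the
    number of predictors is large relative to the number of records.
    """
    n = X.shape[0]
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    return X.T @ alpha


def leave_one_env_out(X, y, env_index, lam):
    """Correlation between predicted and observed values in each held-out environment."""
    accuracies = []
    for env in np.unique(env_index):
        test = env_index == env
        beta = ridge_fit(X[~test], y[~test], lam)
        accuracies.append(np.corrcoef(X[test] @ beta, y[test])[0, 1])
    return np.array(accuracies)


# Simulated example (all sizes and effect scales are arbitrary assumptions).
rng = np.random.default_rng(0)
n_gen, n_env, n_mark, n_cov = 60, 8, 40, 3

M = rng.choice([-1.0, 1.0], size=(n_gen, n_mark))   # marker scores
W = rng.normal(size=(n_env, n_cov))                 # stand-in stress covariates
X = interaction_features(M, W)

true_beta = rng.normal(scale=0.3, size=X.shape[1])  # simulated Q*E effects
y = X @ true_beta + rng.normal(size=X.shape[0])     # simulated phenotypes

# Environment label of each row, matching the genotype-major row order of X.
env_index = np.tile(np.arange(n_env), n_gen)

accuracies = leave_one_env_out(X, y, env_index, lam=1.0)
print("mean accuracy in held-out environments:", accuracies.mean())
```

Holding out one whole environment at a time mimics prediction in an unobserved environment for which covariates are available: the held-out phenotypes never enter the fit, but the held-out covariates are used to build the predictors.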

Abbreviations

BLUP: Best linear unbiased predictor
GBLUP: Genomic estimated best linear unbiased predictor
GEBV: Genomic estimated breeding value
G*E: Genotype by environment interactions
GS: Genomic selection
MET: Multi-environment trials
QTL: Quantitative trait locus
Q*E: QTL by environment interaction
SGL: Sparse group lasso
SNP: Single nucleotide polymorphism
TPE: Target population of environments

Acknowledgments

We thank Pierre Martre for providing the crop model. The reviewers provided excellent comments that significantly improved the paper. JRC-MARS––Meteorological Data Base––EC––JRC provided access to the interpolated meteorological data. This research was supported in part by USDA-NIFA-AFRI grants, award numbers 2009-65300-05661, 2011-68002-30029, and 2005-05130 and by Hatch project 149-449. Limagrain Europe provided financial support for N. Heslot.

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical standards

The experiments comply with the current laws of the countries in which they were performed.

Author information

Corresponding author

Correspondence to Jean-Luc Jannink.

Additional information

Communicated by A. E. Melchinger.

Electronic supplementary material

Below is the link to the electronic supplementary material.

ESM1 (PDF 128 kb)

About this article

Cite this article

Heslot, N., Akdemir, D., Sorrells, M.E. et al. Integrating environmental covariates and crop modeling into the genomic selection framework to predict genotype by environment interactions. Theor Appl Genet 127, 463–480 (2014). https://doi.org/10.1007/s00122-013-2231-5

  • DOI: https://doi.org/10.1007/s00122-013-2231-5
