New genotypic adaptability and stability analyses using Legendre polynomials and genotype-ideotype distances

Developing cultivars with superior performance in different cultivation environments is one of the main challenges of breeding programs. Current adaptability and stability analyses have limitations, especially when applied to trials with genetic or statistical imbalance, heterogeneity of residual variances, and genetic covariance. Adaptability and stability analyses based on mixed models are an effective alternative in such cases. We propose a new methodology for genotypic adaptability and stability analysis, based on Legendre polynomials and genotype-ideotype distances, aiming at greater precision when recommending cultivars. We applied the methodology to a set of common bean cultivars evaluated in a multi-environment trial comprising 13 trials, which were classified as unfavorable or favorable environments according to the average performance of the cultivars in each trial. The results showed that the methodology predicts, with high accuracy, the genotypic values of cultivars in environments where they were not evaluated, thereby circumventing the imbalance of the experiments. From these values, it was possible to measure genotypic adaptability according to ideotypes. The stability of the cultivars was quantified as the invariance of their behavior across the trials. The use of ideotypes based on real data allowed a better comparison of the performance of cultivars across environments.


Introduction
The main objective of a crop breeding program is to develop cultivars that can replace those that are currently available [1]. In the final stages of a breeding program, the most promising lines are evaluated in trials conducted in different environments, such as different years, places, and seasons. In Brazil, these tests are called Valor de Cultivo e Uso (VCU), and their results are the basis for cultivar recommendation [2].

Adaptability and stability studies are used to quantify the performance of the genotypes in order to make recommendations [3]. Adaptability is defined as the ability of a genotype to respond advantageously to its environment, while stability is related to the predictability of its behavior [4,5]. It is thus possible to identify genotypes that have wide adaptability, or specific adaptability to favorable or unfavorable environments. Finlay and Wilkinson [4] defined favorable and unfavorable environments as those in which the average performance of the genotypes is above or below the average of all the trials, respectively.
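
The classification of environments described above can be illustrated with a minimal sketch. It assumes that the environmental index of a trial is its mean minus the mean of all trial means; with unbalanced data other definitions of the overall mean are possible. The function name and the trial means are hypothetical and are used only for illustration.

```python
from statistics import mean


def classify_environments(trial_means):
    """Classify each trial as favorable or unfavorable.

    trial_means maps a trial label to the mean performance of the
    genotypes in that trial. The index is assumed here to be the trial
    mean minus the mean of all trial means.
    """
    overall = mean(trial_means.values())
    result = {}
    for trial, trial_mean in trial_means.items():
        index = trial_mean - overall
        label = "favorable" if index > 0 else "unfavorable"
        result[trial] = (index, label)
    return result


# Hypothetical trial means (kg/ha), not taken from the study
example = {"T1": 1800.0, "T2": 2200.0, "T3": 3200.0}
classes = classify_environments(example)

# Ordering the trials by index gives an environmental gradient
ordered = sorted(classes, key=lambda t: classes[t][0])
```

In this sketch a trial whose mean equals the overall mean is labeled unfavorable; that tie rule is an arbitrary choice.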

Genotypes that have specific adaptability to favorable environments have genes that enable them to respond to improved environmental conditions, and should be recommended to farmers who wish to use the most current technologies. Genotypes with specific adaptability to unfavorable environments, however, may have specific genes that enable them to grow in these environments. These are rustic genotypes and should be recommended to farmers who use lower-level technologies. In general, rustic genotypes have more genes conferring tolerance to biotic and abiotic stresses, which means they may be favored in more adverse environmental conditions.

In recent decades, several methods to analyze adaptability and stability have been proposed, based on different statistical principles. To identify genotypes that have general or

Step one: Environmental gradient

The first step is the classification of the trials as an environmental gradient. To define this gradient, trials in which the genotypes are evaluated must be ordered a priori, according to certain classification criteria such as Akaike Information Criterion (AIC) [18], Bayesian

Step two: Fitting reaction norm models

Once the environmental gradient is established, different reaction norm models must be fitted to identify which one best quantifies the behavior of the genotypes in the different trials.

The number of models to be tested depends on the number of trials used (which determines the maximum order of the polynomial), the number of effects included in the model via the Legendre polynomials, and the residual covariance structures.

For trials conducted in randomized block designs, for example, the adopted model was as follows:

$$y_{ijk} = A_j + R/A_{jk} + \sum_{m=0}^{M} \alpha_{im} \Phi_{ijm} + e_{ijk}$$

where: $y_{ijk}$ is the observation of the i-th genotype (i = 1, 2, …, ng, where ng is the total number of genotypes), in the j-th trial (j = 1, 2, …, na, where na is the total number of trials), in the k-th block (k = 1, 2, 3); $A_j$ is the effect of the trial; $R/A_{jk}$ is the fixed effect of the blocks within each trial; $\alpha_{im}$ is the reaction norm coefficient for the Legendre polynomial of order m for the genotypic effects; $\Phi_{ijm}$ is the m-th Legendre polynomial for the j-th trial, standardized from -1 to +1, for the i-th genotype; M is the order of fit of the Legendre polynomial for genotypic effects; and $e_{ijk}$ is the random residual effect associated with $y_{ijk}$.
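
As an illustration of the Legendre polynomial terms defined above, the following sketch standardizes an environmental gradient to the interval from -1 to +1, builds the polynomial basis, and evaluates the reaction norm of one genotype. It assumes unnormalized Legendre polynomials as provided by NumPy (random regression software often uses a normalized version) and a basis that depends only on the trial. The environmental indices and the coefficients are hypothetical values, not estimates from the study.

```python
import numpy as np
from numpy.polynomial import legendre


def standardize(gradient):
    """Map environmental-gradient values linearly onto [-1, +1]."""
    g = np.asarray(gradient, dtype=float)
    return -1.0 + 2.0 * (g - g.min()) / (g.max() - g.min())


def legendre_basis(gradient, order):
    """Return an (n_trials x (order + 1)) matrix whose column m holds
    the Legendre polynomial of order m evaluated at each trial."""
    return legendre.legvander(standardize(gradient), order)


# Hypothetical environmental indices for five trials
env_index = [-600.0, -200.0, 0.0, 300.0, 800.0]
phi = legendre_basis(env_index, order=2)

# Hypothetical reaction norm coefficients for one genotype (orders 0, 1, 2)
alpha = np.array([150.0, 80.0, -20.0])

# Genotypic effect of this genotype along the gradient:
# sum over m of alpha_m * phi_m for each trial
reaction_norm = phi @ alpha
```

Because the basis can be evaluated at any point of the gradient, the same coefficients give a predicted genotypic effect for trials in which the genotype was not evaluated.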

In matrix form, the model above is described as $y = Xb + Zg + e$, where: $y$ is the vector of phenotypic data; $b$ is the vector of the fixed effects of the block × trial combinations added to the general average; $g$ is the vector of genetic effects (assumed to be random); and $e$ is the residual vector (random). $X$ and $Z$ represent the incidence matrices for these effects,

Step three: Choosing the best fit model

To select the best fit model, the AIC, BIC, and PAL criteria were used. These criteria are described as follows:

Step six: Accuracy at the original scale

The prediction accuracy, also on the original scale, is estimated according to the following equation:

where: is the correlation between the predicted and real genotype values for genotype i in

Step seven: Genotypic adaptability and stability

The trials were designed in randomized blocks with three replications. The plots consisted of four rows of two meters (m), spaced 0.5 m apart. The treatments used were in accordance with the recommendations for common bean crops [24]. The evaluated trait was grain yield, which was measured on the two central rows of each plot.

The data were corrected to 13% moisture and converted to kg ha⁻¹. To create and organize the environmental gradient, the 13 trials were classified as favorable or unfavorable, according to the environmental index (Eq. 1). We fitted 14 reaction norm models to identify the model that best quantifies the behavior of the cultivars for grain yield in the multi-environment trial (MET), with trials ordered according to the environmental index. Among these models, seven were tested considering homogeneous residual variance and the other seven

To view the results, the ten cultivars with the highest probability were selected and their reaction norm curves were plotted for three of the ideotypes; we chose not to include ideotype IV, since it makes no sense to recommend cultivars of minimal adaptability.

The BLUP of each cultivar was added to the environment average and the general average, and two checks, Pérola (Carioca bean) and Ouro Negro (Black bean), were included for comparison purposes. These two cultivars were selected as checks because they are used as references for the

We found that the different criteria (AIC, BIC, and PAL) pointed to different models as having the best fit. The AIC criterion identified model Leg.6.D, which has a diagonal structure for the residuals and a Legendre polynomial of degree six, as having the best fit (

(Table 1).

The trials are ordered according to the environmental index (Table 1).

over the years, in addition to the loss of information due to problems that occurred during the trials, resulting in genetic and statistical imbalances. In this context, Resende [16,50] states that the mixed model approach is a better alternative for the analysis of such trials.

As noted, only 12 cultivars of superior performance were found in Fig 2-

situations, whether this is due to a lack or excess of any factor. However, if the environment improves, these cultivars will not respond to the increase in environmental quality. This illustrates the definition of adaptability presented by Cruz et al. [3], as the differential response of cultivars to a stimulus from the environment.

However, for the cultivars that are identified for favorable environments, we see the opposite behavior. It is expected that cultivars adapted to these locations would normally

who are responsible for the selection of these strains and always utilize optimal cultivation conditions (fertilization, irrigation, and pest and disease control), may explain this.

The maintenance of productivity in different environments is explained by the response to the environmental stimulus, which is caused by the differential expression of the genes present in each individual. In this way, the adaptability and stability indicated in the reaction norm curves of the cultivars provide information regarding their capacity to express phenotypes that may better adjust to the environmental conditions [53]. In this sense, one way to improve the adaptability of cultivars to the different environments in which they will be cultivated is to pyramid the genes of maximum expression in both the unfavorable and favorable environments.

The superior cultivars in each studied scenario were developed in different breeding programs from four institutions (EMBRAPA, UFV, IAC, and IAPAR). This is indicative of the effort and success of these breeding programs, as well as the genetic diversity between them, since the