## Abstract

Heterosis (hybrid vigor) is a universal phenomenon of crucial agro-economic and evolutionary importance. We show that the most common heterosis indices do not properly measure the deviation from additivity because they include both a component accounting for the “real” heterosis and a term that has no link with heterosis since it depends only on the parental values. Therefore these indices are ineffective whenever the aim of the studies is to compare heterosis levels between traits, environments, genetic backgrounds or developmental stages, as these factors may affect not only heterosis but also the parental values. This observation argues for the careful choice of heterosis indices according to the purpose of the work.

## Introduction

Non-linear processes are extremely common in biology. In particular, the genotype-phenotype or phenotype-phenotype relationships frequently display concave behaviours, resulting in dominance of “high” over “low” alleles (Wright, 1934) and in positive heterosis for a large diversity of polygenic traits (Fiévet *et al.*, 2018; Vasseur *et al.*, 2019). Properly quantifying the degree of non-additivity is an essential prerequisite for any interpretation and comparison of genetic studies and for predictions in plant and animal breeding. However, most of the classically used heterosis indices can hardly meet this requirement.

Recall first the way the degree of dominance is measured. There are two classical dominance indices: (i) Wright (1934) defined:

$$D_W = \frac{z_2 - z_{12}}{z_2 - z_1}$$

where *z*_{1}, *z*_{2} and *z*_{12} are respectively the phenotypic values of genotypes *A*_{1}*A*_{1}, *A*_{2}*A*_{2} and *A*_{1}*A*_{2}, with *z*_{2} > *z*_{1}. *D*_{W} varies from 0, when *A*_{2} is fully dominant over *A*_{1}, to 1, when *A*_{2} is fully recessive with respect to *A*_{1}. *D*_{W} = 0.5 corresponds to semi-dominance or additivity (Table 1). Note that *D*_{W} is strictly equivalent to the coefficient of dominance *h* used in evolutionary genetics (Crow & Kimura, 1970). (ii) Falconer (1960) proposed the following index:

$$D_F = \frac{z_{12} - \bar{z}}{(z_2 - z_1)/2}$$

where $\bar{z} = (z_1 + z_2)/2$ is the mean parental value. *D*_{F} varies in the opposite direction to *D*_{W}: its value is 1 if *z*_{12} = *z*_{2} (complete dominance of *A*_{2} over *A*_{1}), −1 if *z*_{12} = *z*_{1} (*A*_{2} is fully recessive with respect to *A*_{1}) and 0 in case of additivity. In case of overdominance, *D*_{W} < 0 and *D*_{F} > 1, and in case of underdominance, *D*_{W} > 1 and *D*_{F} < −1 (Table 1).

The indices *D*_{W} and *D*_{F} are linearly related:

$$D_F = 1 - 2 D_W$$

so it makes no difference which of these indices is used to quantify dominance: both give the position of the heterozygote relative to the parental homozygotes.
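The linear relation can be checked in a few lines. The derivation below is a sketch that assumes the usual forms of the two indices, $D_W = (z_2 - z_{12})/(z_2 - z_1)$ and $D_F = (z_{12} - \bar{z})/[(z_2 - z_1)/2]$ with $\bar{z} = (z_1 + z_2)/2$:

```latex
\begin{aligned}
D_F &= \frac{2 z_{12} - z_1 - z_2}{z_2 - z_1} \\
    &= \frac{(z_2 - z_1) - 2\,(z_2 - z_{12})}{z_2 - z_1} \\
    &= 1 - 2\,D_W
\end{aligned}
```

The characteristic values agree with this relation: *D*_{W} = 0, 0.5 and 1 map to *D*_{F} = 1, 0 and −1, respectively.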

For polygenic traits, either index could be used to quantify non-additivity, *i.e.* real heterosis, without any ambiguity. In practice, *D*_{W} does not seem to have been used in this context, and *D*_{F} very little. In the literature one finds five heterosis indices, which are summarized in Table 1 with their characteristic values. Their expressions in terms of genetic effects, namely additive, dominance, dominance-by-dominance epistasis and additive-by-additive epistasis effects, are shown in Supporting Table S1.

The two most popular indices are the best-parent (BP) and mid-parent (MP) heterosis indices (*e.g.* Gowen, 1952; Frankel, 1983):

$$H_{BP} = \frac{z_{12} - z_2}{z_2}, \qquad H_{MP} = \frac{z_{12} - \bar{z}}{\bar{z}}$$

where *z*_{2}, *z*_{12} and $\bar{z} = (z_1 + z_2)/2$ are respectively the phenotypic values of parent 2 (with *z*_{2} > *z*_{1}), of the hybrid parent 1 × parent 2 and of the parental mean.

In some instances, the authors do not normalize the difference between the hybrid and the best- or mid-parent value:

$$H_{bp} = z_{12} - z_2, \qquad H_{mp} = z_{12} - \bar{z}$$

Finally, the so-called “potence ratio” (Mather, 1949) has the same expression as Falconer’s index of dominance:

$$H_{PR} = \frac{z_{12} - \bar{z}}{(z_2 - z_1)/2}$$

Its value is 0 in case of additivity, 1 if *z*_{12} = *z*_{2} (hybrid value = best-parent value), −1 if *z*_{12} = *z*_{1} (hybrid value = worst-parent value) and > 1 (resp. < −1) in case of best-parent (resp. worst-parent) heterosis. *H*_{PR} includes the values of the three genotypes, whereas the other indices lack one of the parental values (*H*_{BP} and *H*_{bp}) or both (*H*_{MP} and *H*_{mp}). From a genetic point of view, *H*_{PR} is explicitly expressed in terms of the five genetic effects contributing to heterosis (Supporting information Table S1). Thus the potence ratio, although it is by far the least used index, is the only one that gives the exact position of the hybrid value relative to the parental values. Wright’s index of dominance has the same property, but its opposite direction of variation, which makes comparisons less easy, probably explains why it is not used in this context.
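As an illustration, the five indices can be computed side by side from the three phenotypic values. The following is a minimal Python sketch, assuming the standard definitions of the indices; the function name `heterosis_indices` and the dictionary keys are hypothetical, and the trait is assumed to take positive values so that the normalizing denominators are non-zero.

```python
def heterosis_indices(z1, z2, z12):
    """Return the five heterosis indices for parents z1, z2 and hybrid z12.

    Assumes positive trait values and distinct parents (hypothetical helper).
    """
    # Enforce the convention z2 > z1 (parent 2 is the best parent).
    if z1 > z2:
        z1, z2 = z2, z1
    if z1 == z2:
        raise ValueError("the potence ratio is undefined when both parents are equal")

    mid = (z1 + z2) / 2          # mid-parent value
    half_range = (z2 - z1) / 2   # half the parental difference

    return {
        "H_BP": (z12 - z2) / z2,          # best-parent heterosis, normalized
        "H_MP": (z12 - mid) / mid,        # mid-parent heterosis, normalized
        "H_bp": z12 - z2,                 # best-parent heterosis, trait units
        "H_mp": z12 - mid,                # mid-parent heterosis, trait units
        "H_PR": (z12 - mid) / half_range  # potence ratio
    }


# Hybrid equal to the best parent: potence ratio of 1, no best-parent heterosis.
print(heterosis_indices(4.0, 8.0, 8.0))
```

Only `H_PR` uses both parental values separately, which is why it locates the hybrid within the parental range.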

Let us examine the possible interpretation fallacies resulting from the use of the common heterosis indices.

## Relationships between the potence ratio and other heterosis indices

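It is easy to show that the relationships between *H*_{PR}, hereafter noted *h*_{P} for simplicity, and the other indices are (with *z*_{2} > *z*_{1}):

$$H_{MP} = h_P \, z_m \tag{1}$$

$$H_{BP} = \frac{h_P - 1}{2} \, z_b \tag{2}$$

$$H_{mp} = h_P \, \frac{z_2 - z_1}{2} \tag{3}$$

$$H_{bp} = (h_P - 1) \, \frac{z_2 - z_1}{2} \tag{4}$$

where $z_m = (z_2 - z_1)/(z_2 + z_1)$ is the coefficient of variation (*σ /µ*) of the trait in the parents and $z_b = (z_2 - z_1)/z_2$ is the difference between parents normalized by the best-parent value (or *z*_{b} = 2 *σ /z*_{2}).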
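These relationships follow from writing the hybrid value in terms of the potence ratio. The short derivation below is a sketch that assumes the usual definitions $h_P = (z_{12} - \bar{z})/[(z_2 - z_1)/2]$, $H_{MP} = (z_{12} - \bar{z})/\bar{z}$ and $H_{BP} = (z_{12} - z_2)/z_2$, with $\bar{z} = (z_1 + z_2)/2$:

```latex
\begin{aligned}
z_{12} &= \bar{z} + h_P \, \frac{z_2 - z_1}{2} \\[4pt]
H_{MP} &= \frac{z_{12} - \bar{z}}{\bar{z}}
        = h_P \, \frac{z_2 - z_1}{z_2 + z_1}
        = h_P \, z_m \\[4pt]
H_{BP} &= \frac{z_{12} - z_2}{z_2}
        = \frac{(h_P - 1)(z_2 - z_1)}{2 z_2}
        = \frac{h_P - 1}{2} \, z_b
\end{aligned}
```

Dropping the denominators $\bar{z}$ and $z_2$ gives the non-normalized forms, $H_{mp} = h_P (z_2 - z_1)/2$ and $H_{bp} = (h_P - 1)(z_2 - z_1)/2$.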

For a given *h*_{P} value, the indices *H*_{MP} and *H*_{BP} are linearly related to *z*_{m} and *z*_{b}, respectively, *i.e.* they depend on the scale of parental values. The relation between *H*_{MP} and *z*_{m} is negative when *h*_{P} < 0 and positive when *h*_{P} > 0, while the relation between *H*_{BP} and *z*_{b} is negative when *h*_{P} < 1 and positive when *h*_{P} > 1. Recalling that *z*_{m} and *z*_{b} are positive, we see from equation 1 that for *h*_{P} ≠ 0, we have
and we see from equation 2 that for *h*_{P} ≠ 1, we have

If *h*_{P} = 0 (resp. *h*_{P} = 1), *H*_{MP} (resp. *H*_{BP}) = 0.

Numerical applications performed with nine *h*_{P} values, from *h*_{P} = −2 to *h*_{P} = 2, show that a given *H*_{MP} or *H*_{BP} value can be observed with contrasted *h*_{P} values (Supporting information Fig. S1). For instance, *H*_{MP} ≈ 0.4 can both correspond to mid-parent heterosis (*h*_{P} = 0.5, *z*_{m} ≈ 0.8) and to best-parent heterosis (*h*_{P} = 2, *z*_{m} ≈ 0.21).
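The same point can be reproduced with a toy calculation. In the Python sketch below, the phenotypic values are hypothetical and chosen only so that the two crosses share the same hybrid and mid-parent values; the function names are likewise illustrative, and the formulas assume the standard definitions of *h*_{P} and *H*_{MP}.

```python
def potence_ratio(z1, z2, z12):
    """h_P: position of the hybrid within the parental range (z2 > z1)."""
    mid = (z1 + z2) / 2
    return (z12 - mid) / ((z2 - z1) / 2)


def mid_parent_heterosis(z1, z2, z12):
    """H_MP: deviation of the hybrid from the mid-parent, relative to the mid-parent."""
    mid = (z1 + z2) / 2
    return (z12 - mid) / mid


# Hypothetical values (parent 1, parent 2, hybrid); not taken from any dataset.
cases = {
    "distant parents": (1.0, 9.0, 7.0),  # z_m = 0.8, hybrid between mid- and best parent
    "close parents": (4.0, 6.0, 7.0),    # z_m = 0.2, hybrid above the best parent
}

summary = {
    name: (potence_ratio(*values), mid_parent_heterosis(*values))
    for name, values in cases.items()
}

for name, (h_p, h_mp) in summary.items():
    print(f"{name}: h_P = {h_p:.2f}, H_MP = {h_mp:.2f}")
```

Both crosses give the same *H*_{MP}, although the hybrid lies inside the parental range in the first case and beyond the best parent in the second.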

This can also be illustrated from experimental data in maize. We measured six traits (flowering time, plant height, ear height, grain yield, thousand-kernel weight and kernel moisture) in four crosses (B73×F252, F2×EP1, F252×EP1, F2×F252) and three environments in France (Saint-Martin-de-Hinx in 2014, Jargeau in 2015 and Rhodon in 2015). We computed *h*_{P}, *H*_{MP} and *H*_{BP} for the 72 trait-cross-environment combinations. Fig. **1a,b** shows that the relationship between *h*_{P} and either index is very loose, if any. A given *h*_{P} value can correspond to a large range of *H*_{MP} or *H*_{BP} values, and vice versa. We performed the same analyses on the data published by Shang *et al.* (2016), who measured five traits in cotton in two crosses and three environments. The same loose relationship between *h*_{P} and either heterosis index was observed (Fig. **1c,d**). This means that the coefficients of variation of the traits or the normalized difference between parents, which have no link with heterosis since they do not include the hybrid values, markedly affect *H*_{MP} and *H*_{BP}.

Regarding the indices *H*_{mp} and *H*_{bp}, which are not dimensionless, they give no other information than the sign of heterosis. For a given *h*_{P} value, *H*_{mp} can vary from −∞ to 0 when *h*_{P} < 0 and from 0 to +∞ when *h*_{P} > 0, and *H*_{bp} can vary from −∞ to 0 when *h*_{P} < 1 and from 0 to +∞ when *h*_{P} > 1 (equations 3 and 4).
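One way to see why a non-normalized index carries little beyond the sign is to change the measurement unit. The Python sketch below uses hypothetical plant heights (not taken from the maize dataset) and illustrative function names, and assumes the standard definitions of *H*_{mp} and *h*_{P}.

```python
def potence_ratio(z1, z2, z12):
    """h_P: position of the hybrid within the parental range (z2 > z1)."""
    mid = (z1 + z2) / 2
    return (z12 - mid) / ((z2 - z1) / 2)


def h_mp(z1, z2, z12):
    """Non-normalized mid-parent heterosis, expressed in the unit of the trait."""
    return z12 - (z1 + z2) / 2


# Hypothetical plant heights: parent 1, parent 2, hybrid.
heights_cm = (200.0, 240.0, 250.0)
heights_m = tuple(value / 100 for value in heights_cm)

for unit, values in (("cm", heights_cm), ("m", heights_m)):
    print(f"{unit}: H_mp = {h_mp(*values):.2f}, h_P = {potence_ratio(*values):.2f}")
```

The magnitude of *H*_{mp} changes a hundredfold with the unit while *h*_{P} does not, so only the sign of *H*_{mp} is comparable across scales.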

## The pitfalls of the commonly used indices

The non-univocal relationship between *h*_{P} and the commonly used heterosis indices has two consequences. (i) For a given trait, comparing these indices in different crosses and/or environments and/or developmental stages is quite tricky: as soon as there is an effect of these factors on the scale of the trait and/or the difference between parental values (*i.e.* on *z*_{m} or *z*_{b}), it becomes impossible to compare the actual levels of heterosis between the conditions. (ii) When studying different traits, the problem is even more pronounced because each trait has its own scale of variation, making *H*_{MP} or *H*_{BP} (and even more *H*_{mp} or *H*_{bp}) useless for comparing their real levels of heterosis.

These pitfalls can easily be illustrated with our maize dataset. Fig. **2a** shows that ranking the traits by their degree of heterosis can give markedly different results depending on whether one uses the *h*_{P} index or one of the two indices *H*_{MP} and *H*_{BP}. For instance, in the cross F252×EP1 flowering time displays moderate heterosis according to *H*_{MP} and *H*_{BP} but it is actually the trait with the highest *h*_{P} value. Conversely, plant height is the second most heterotic trait according to *H*_{MP} or *H*_{BP}, which is not the case if we consider *h*_{P}. Similarly, comparing heterosis of a given trait in different hybrids results in index-specific rankings: heterosis of ear height measured with *h*_{P} is maximum in hybrid B73×F252, while with *H*_{MP} and *H*_{BP} the highest values are in hybrid F252×EP1 (Fig. **2b**). Finally, the effect of the environment on heterosis shows the same discrepancies between *h*_{P} on the one hand and *H*_{MP} or *H*_{BP} on the other hand (Fig. **2c**).

It is also informative to compare the variation of heterosis indices for a trait measured during development or growth. We fitted the percentage of flowering over time in the hybrids W117×F192 and W117×F252 and their parents, using Hill functions:
where *n* is the Hill coefficient. We then computed the variation of heterosis for percentage of flowering estimated from the fitted curves (Fig. **3**). Again, *h*_{P} tells a different story from *H*_{MP} or *H*_{BP}. Both *H*_{MP} and *H*_{BP} decrease as flowering proceeds because the coefficient of variation also decreases, which makes it impossible to follow the variation of real heterosis.

The same type of result was observed in a simulation describing the increase of a population size that follows a logistic function, as observed for instance in yeast cultures. We used:

$$y = \frac{K}{1 + a \, e^{-r \theta}}$$

where *y* is the size of the population, *K* the carrying capacity, *a* a constant, *r* the growth rate and *θ* the time. We assumed that the parents differed only in growth rate *r* and that this parameter is additive. The result shows that *H*_{MP} and *H*_{BP} for population size vary over time in a way that is clearly not congruent with the variation of *h*_{P} (Supporting information Fig. S2).
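A simulation of this kind can be sketched in a few lines of Python. The version below is only an illustration under stated assumptions: it takes the logistic function in the common form *y* = *K* / (1 + *a* e^{−*rθ*}), sets the hybrid growth rate to the mean of the parental rates (additivity for *r*), and uses hypothetical parameter values and time points that are not those of the original simulation. The function names are illustrative.

```python
import math


def logistic(theta, r, K=1.0, a=100.0):
    """Population size at time theta (assumed form: K / (1 + a * exp(-r * theta)))."""
    return K / (1.0 + a * math.exp(-r * theta))


def indices_over_time(r1, r2, times, K=1.0, a=100.0):
    """Return (time, h_P, H_MP, H_BP) tuples for a hybrid whose r is additive."""
    r12 = (r1 + r2) / 2  # additivity assumed for the growth rate
    rows = []
    for theta in times:
        y1, y2, y12 = (logistic(theta, r, K, a) for r in (r1, r2, r12))
        low, high = min(y1, y2), max(y1, y2)
        mid = (low + high) / 2
        h_p = (y12 - mid) / ((high - low) / 2)  # potence ratio
        h_mp = (y12 - mid) / mid                # mid-parent heterosis
        h_bp = (y12 - high) / high              # best-parent heterosis
        rows.append((theta, h_p, h_mp, h_bp))
    return rows


# Hypothetical growth rates and time points; theta = 0 is excluded because
# both parents then have the same size and h_P is undefined.
rows = indices_over_time(0.4, 0.6, [2, 6, 10, 14, 18])
for theta, h_p, h_mp, h_bp in rows:
    print(f"theta = {theta:2d}  h_P = {h_p:+.3f}  H_MP = {h_mp:+.3f}  H_BP = {h_bp:+.3f}")
```

Printing the three indices at successive time points makes it possible to compare their trajectories directly, even though the underlying parameter is strictly additive throughout.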

## Discussion

If *H*_{MP} and *H*_{BP} (and their non-normalized forms *H*_{mp} and *H*_{bp}) do not give reliable information on non-additivity, why are they so commonly used? There are probably both historical and technical reasons: (i) The first scientists who quantified heterosis were plant breeders (Shull, 1908; East, 1936). From an economic perspective, the goal was and still is to develop hybrids “better” than the best- or mid-parent for desired agronomic traits, and not to know where the hybrid value lies relative to the parental values. So the heterosis indices were defined accordingly and the habit has remained; (ii) The indices giving the right non-additivity values, *h*_{P} (= *H*_{PR}) for heterosis and *D*_{W} or *D*_{F} for dominance, can take high or very high values when the parents are close, due to the small difference *z*_{2} − *z*_{1} in the denominator of the fractions. This can produce extreme values that are not easy to represent or to handle in statistical analyses. Nevertheless such values are biological realities that convey precisely the inheritance of the traits under study, which *H*_{MP}, *H*_{BP}, *H*_{mp} and *H*_{bp} do not. In addition, from a practical point of view, a single index is sufficient to know the position of the hybrid relative to the mid- or the best-parent, whereas in a number of studies the authors compute and comment on both *H*_{MP} and *H*_{BP} (or *H*_{mp} and *H*_{bp}). More importantly, when it comes to comparing the amplitude of heterosis between traits, developmental stages, crosses or environmental conditions, there is no other choice but to use heterosis indices that are not affected by the scale of the parental values but account for the position of the hybrid in the parental range.

## Author contributions

Conceptualization: DdV. Maize experiments: JBF. Data analyses and numerical applications: DdV and JBF. Writing article: DdV and JBF.

## Acknowledgements

We thank our colleagues Mélisande Blein-Nicolas, Michel Zivy, Judith Legrand and Christine Dillmann for their useful reading of the manuscript. We are grateful to the key persons of the INRA Station of Saint-Martin-de-Hinx, of Euralis and of MASseeds for the 2014 and 2015 field experiments. These experiments were supported by the French Agence Nationale de la Recherche (*Amaizing* project ANR-10-BTBR-01).