Phenotypic distribution and selection response for linear-product gene-environment interaction in quantitative traits

Gene-environment interaction is often described by linear phenotypic plasticity but has recently also been expressed as function of the product of genotype and environmental variables. While this model can be fitted in a multiple regression scenario, little has been written on the distribution of the product of breeding values and environment, GE, its expected moments, and the theoretical impact on phenotypic selection. Here we will explore these topics introducing the distribution for GE, its mean and variance, and its expected impact of lowering realized heritability due to is increasing the phenotypic variance.


Introduction
Gene-environment interaction has played an important, though not often wellknown, role in the history of genetics (Hogben, 1933;Falconer, 1952;Robertson, 1959;Via and Lande, 1985;Mulder and Bijma, 2005;Mulder et. al., 2006). Gene-environment interaction is most often defined as a situation under which the contribution to phenotype by genotype varies across different environments. This posits a function I (G, E), where G is the genotype, most commonly represented as the breeding value, and E is the relevant environmental variable(s). This restates the traditional equation for contributions to phenotypic value as Gene-environment interaction is distinct from the normal additive effects of environment across all genotypes which varies phenotype with respect to a fixed mean breeding value, environment, as well as gene-environment covariance (GXE covariance). Unlike gene-environment interaction which is inherent in the organism under a given scope of genotypes and environments, GXE covariance can be eliminated by randomization of genotypes with respect to relevant environmental variables. A common model for gene-environment interaction is linear phenotypic plasticity (reaction norm) where the phenotypic average of a population varies linearly with the values of certain environmental variables the population is exposed to (Scheiner, 1993;Via et. al., 1995;De Jong and Bijma, 2002;Kolmodin and Bijma, 2004;Mulder and Bijma, 2005).
There have been many studies of gene-environment interaction and its affect on selection with the linear phenotypic plasticity model (Van Tienderen and De Jong, 1994;Via et. al., 1995;De Jong and Bijma, 2002;Kolmodin and Bijma, 2004;Mulder and Bijma, 2005;Mulder et. al., 2006), however, more recently new models of gene-environment interaction have found use, especially in medical genetics.
Guo (Guo, 2000) introduced a model where the gene-environment interaction is defined as the product of G and E along with an effect size constant. This work and later studies (Ma et. al., 2011) use multiple linear regression models of the form The gene-environment interaction is given by the GE term. Guo analyzed the properties of this gene-environment interaction for a single locus being affected by a single binary risk factor (environmental variable). For many quantitative traits with continuous values under the influence of many loci (aka the Fisher infinitesimal model (Barton et. al., 2017)), alternate techniques are necessary to describe the distribution of gene-environment interaction and its effect on the moments of the phenotype in the population as well as the response to selection.

Statistical distribution of gene-environment interaction in the linear-product model
The model we will define has phenotypic population mean µ and distributions of breeding values and environmental values defined as the probability of deviation from the mean value of the population. In other words, both G and E are normally distributed with means of zero and variances σ 2 G and σ 2 E . Also, the gene-environment interaction is represented by a term c 1 GE where c 1 is an effect size constant. Thus the phenotype in the population is defined as In order to determine the distribution and moments of the gene-environment interaction contribution, we rely on recent work defining the statistical distribution of the product of two normal variables. In recent years, an exact form for the distribution of the product of two normal variables has been derived by (Nadarajah and Pogány, 2016;Cui et. al., 2016) and also been related to the variance-gamma distribution (Gaunt, 2014(Gaunt, , 2019(Gaunt, , 2021. In particular, for two correlated normal distributions with mean 0, variances σ 2 G and σ 2 E , and geneenvironment correlation r GXE the probability density of their product I = GE is ) (4) In the case where there is no GXE covariance and r GXE = 0, this reduces to The term K 0 in the two equations represents the modified Bessel function of the second kind of order zero. While not a widely familiar function, the moments for this distribution can be stated below by reference to the related variance-gamma distribution (Gaunt, 2014(Gaunt, , 2019(Gaunt, , 2021. Variance-gamma (V G) distributions are described in the appendix, but the distribution . The first and second moments of this distribution are given by Again, when GXE covariance is absent These moments lead to several conclusions about the effect of linear-product gene-environment interaction on the phenotypes of quantitative traits. In the absence of GXE covariance, there is no effect on the phenotypic mean. When covariance is present, however, the mean shifts proportionally to the value of the GXE covariance. In short the mean will shift by a value c 1 σ GXE .
The variance of GE will always widen proportionally to the product of the breeding value and environmental variances c 2 1 σ 2 G σ 2 E , with the increase rising with gene-environment covariance to a maximum value proportional to twice the product of the breeding value variance, the environmental variance, and the effect size squared.
The overall phenotypic variance is directly affected only by the variance of GE since there is no covariance between either G nor E with GE. Covariances for product variables have been derived by (Bohrnstedt and Goldberger, 1969). From their equation 13, assuming x = G, y = E, u = 1, ∆u = 0, and v = G, we define σ GXGE as A similar derivation can be done for the covariance of E and GE. Given our parameterizations of µ G = µ E = 0, σ GXGE = 0, thus the covariance of G or E with GE is zero.
In summary, product model GXE interaction will always widen the variance proportionally to the product of the variances of G and E but a population mean shift can only occur if there is simultaneous gene-environment covariance. This can be confirmed through numerical simulation where the first and second moments of populations with linear-product gene-environment interaction match theoretical expectations.

Phenotypic normality with the influence of GE
A key question is whether the product gene-environment interaction will induce a measurable distortion in the distribution of the phenotypic trait, which in most cases is distributed normally. The prior derivation of the first two moments is not conclusive to show this since higher order moments may deviate from normal expectations. If G or E and GE were statistically independent, the third moment for example would be given by m 3 (P ) = m 3 (G) + m 3 (E) + c 3 1 m 3 (GE).
The third and higher moments of GE are thus the deciding factors. First, if r = 0, GE is symmetric and the third moment, and thus skew, are zero though excess kurtosis differs from normal distribution expectations. However, graphs of GE show that positive or negative gene-environment covariance induce right and left skews to the distributions respectively as shown in figures 2 and 3. Therefore, given a large enough value of c 1 , departures from normality can be expected under substantial gene-environment covariance.
However, even when r = 0, we will show the phenotypic distribution becomes skewed for large values of c 1 . While the covariance between G or E and GE indicates no linear dependence, the variables are not completely statistically independent from GE. There is nonlinear dependence which makes the simple addition of moments incorrect in calculating the higher order moments and produces skewed distributions.
In figure 4, for several values of c 1 and r = 0 we show the phenotypic distribution and describe the calculated skew. Though the covariance is zero in all circumstances, the mutual information, which measures linear and nonlinear dependence is 0.4 between G and GE. As can be seen, when c 1 becomes greater than approximately 0.25, substantial right skews develop in the phenotypic distribution due to the greater influence of the nonlinear dependence between the G or E and GE. Therefore, even if gene-environment covariance is considered absent, large effect sizes for GE would seem to dictate non-normality in the phenotypic distribution of the trait.

Linear-product gene-environment interaction and selection response
The previous discussion of phenotypic normality raises the question of how response to phenotypic selection plays out for linear-product model gene-environment interaction. Fortunately, as we will demonstrate, expectations for realized heritability are influenced little by the phenotypic skew and are largely in expectation with linear models. We calculate the realized narrow heritability h 2 * as defined in (Van Tienderen and De Jong, 1994) as Though realized heritability is accurately defined using equation 9 only for those models with strictly linear dependence between phenotype and genotype, it can be used to investigate how the nonlinearity of this model affects a simulated selection response.
Given σ 2 P +G = σ 2 P + σ 2 G + 2Cov(P, G) and σ 2 P +G = σ 2 2G+E+GE we can calculate Therefore, the gene-environment interaction does not contribute to the covariance between phenotype and breeding value since the covariance between genotype and the genotype-environment product has already been shown to be zero. However, it does contribute to the realized heritability due to the phenotypic variance in the denominator and thus we can define Thus, the only effect of product gene-environment interaction in the response of selection is to lower the realized narrow heritability due to the increase in phenotypic variance. Therefore, a slightly weaker selection response is expected. In Figure 5, a population of 100,000 individuals with randomly selected values of G and E and GE are subject to directional phenotypic selection at one standard deviation above the mean. The mean breeding values post-selection are divided by the difference between the phenotypic mean of the selected population and the mean of the original population(equation 6) to estimate realized heritability and compared against the calculation of equation 11. This is done for various combinations of r GXE and c 1 in Figure 5.
The response to selection, measured through the realized heritability, does not deviate far from expected theoretical values. The largest deviation actually comes when there is no GXE covariance. Therefore, selection response models only accounting for the increase in phenotypic variance lowering the realized heritability will in most cases yield acceptably accurate results.

Discussion
Gene-environment interaction is an important aspect of explaining how the values and distributions of phenotypes emerge across a variety of populations and environments. While linear phenotypic plasticity is the most widely used and understood model for this interaction, interactions arising from the product of genetic and environmental factors have received increasing focus and interest in studies of gene-environment interaction. This paper has described the basic expected statistical aspects of such interactions and how they would change response under phenotypic selection. In summary, the variance of the population is increased proportional to the product of breeding value and environmental variances while the population mean is not changed except in the presence of GXE covariance. For normally distributed populations, the effect size measured by c 1 is necessarily of low to moderate value if no unambiguous skew is evident in the distribution.
The effect of linear-product gene-environment interaction is clear in that by shifting the mean and widening the variance, a greater range of phenotypes  should be present in the population than if interaction were absent. For threshold type traits which are often used to model diseases, etc. the proportion of the population above the threshold can raise significantly, especially if GXE covariance is also present. The response to phenotypic selection is widely in line with what is expected due to the effects of a larger variance lowering realized heritability. The components that drive selection response, additive genetic variance and GXE covariance, see their impact unaffected by linear-product gene-environment interaction.

Data Availability Statement
No data was used for this paper.

A Appendix: The Variance Gamma Distribution
The variance gama distribution is a statistical distribution used in a variety of fields including economics and finance due to its importance in modeling stochastic phenomena, such as asset price returns, that exhibit skewed tails but finite moments in their statistical distributions. It has many forms but the one most useful in our analysis is due to Gaunt (2014Gaunt ( , 2019Gaunt ( , 2021. The variance gamma distribution is described by four parameters, r, θ, σ, and µ and has a formal expression of f (x; r, θ, σ, µ) = 1 σ √ πΓ( r 2 ) e θ σ 2 (x−µ) The distribution is often abbreviated V G (r, θ, σ, µ) and the parameters have values of r > 0, θ ∈ R, σ > 0, and µ ∈ R. Under certain parameterizations, the variance gamma distribution reduces to more common distributions. The distribution V G(r, 0, σ/ √ r, µ) converges to the normal distribution N (µ, σ 2 ) as r → ∞ and V G(2, 0, σ, µ) is equivalent to the Laplace(µ, σ) distribution. The first two moments for the variance gamma distribution are given by Gaunt (2014) The distribution of the product of two normally distributed random variables both of mean zero, but with variances σ 2 1 and σ 2 2 and which have a bivariate normal distribution of correlation, ρ, can be represented by the variance gamma distribution V G(1, ρσ 1 σ 2 , σ 1 σ 2 √ 1 − ρ 2 , 0).