A Kernel Method for Dissecting Genetic Signals in Tests of High-Dimensional Phenotypes

Genomewide association studies increasingly employ multivariate tests of multiple correlated phenotypes to exploit likely pleiotropy to improve power. Typical multivariate methods produce a global p-value of association between a variant (or set of variants) and multiple phenotypes. When the global test is significant, subsequent interest then focuses on dissecting the signal and, in particular, delineating the set of phenotypes where the genetic variant(s) have a direct effect from the remaining phenotypes where the genetic variant(s) possess either indirect or no effect. While existing techniques like mediation models can be utilized for this purpose, they generally cannot handle high-dimensional phenotypic and genotypic data. To assist in filling this important gap, we propose a modification of a kernel distance-covariance framework for gene mapping of multiple variants with multiple phenotypes to test instead whether the association between the variants and a group of phenotypes is driven through a direct association with just a subset of the phenotypes. We use simulated data to show that our new method controls for type I error and is powerful to detect a variety of models demonstrating different patterns of direct and indirect effects. We further illustrate our method using GWAS data from the Grady Trauma Project and show that an existing signal between genetic variants in the ZHX2 gene and 21 items within the Beck Depression Inventory appears to be due to a direct effect of these variants on only 3 of these items. Our approach scales to genomewide analysis, and is applicable to high-dimensional correlated phenotypes.


Abstract.
Genomewide association studies increasingly employ multivariate tests of multiple correlated phenotypes to exploit likely pleiotropy to improve power. Typical multivariate methods produce a global p-value of association between a variant (or set of variants) and multiple phenotypes. When the global test is significant, subsequent interest then focuses on dissecting the signal and, in particular, delineating the set of phenotypes where the genetic variant(s) have a direct effect from the remaining phenotypes where the genetic variant(s) possess either indirect or no effect. While existing techniques like mediation models can be utilized for this purpose, they generally cannot handle high-dimensional phenotypic and genotypic data. To assist in filling this important gap, we propose a modification of a kernel distance-covariance framework for gene mapping of multiple variants with multiple phenotypes to test instead whether the association between the variants and a group of phenotypes is driven through a direct association with just a subset of the phenotypes. We use simulated data to show that our new method controls for type I error and is powerful to detect a variety of models demonstrating different patterns of direct and indirect effects. We further illustrate our method using GWAS data from the Grady Trauma Project and show that an existing signal between genetic variants in the ZHX2 gene and 21 items within the Beck Depression Inventory appears to be due to a direct effect of these variants on only 3 of these items. Our approach scales to genomewide analysis, and is applicable to high-dimensional correlated phenotypes.
Key words: pleiotropy, association, high dimension, complex human traits

Introduction
Genetic analysis of multiple correlated phenotypes simultaneously is a popular strategy for gene mapping owing to the observation that many trait-influencing variants can influence multiple distinct phenotypes through pleiotropy (see Solovieff et al. (2013), Galesloot et al. (2014), Chung et al. (2014)). The difficulty in most multivariate tests lies in the fact that they are omnibus tests, and thus, these tests do not provide insight on which of the phenotypes in the set are driving the multivariate association through a direct association with the genotypes (direct effect), which phenotypes are only associated with the genotypes due to mediation (indirect effect), and which phenotypes are not associated with the genotypes. Such information would be valuable to obtain to gain important understanding of underlying biological processes.
A general strategy to distinguish direct effects from indirect effects in studies of association between exposures and outcomes of interest involves mediation analyses. In genetic studies, researchers apply mediation analyses to assess whether a cross-phenotype effect is due to biological pleiotropy or mediation pleiotropy. For example, suppose we find a significant cross-phenotype association between a SNP and two phenotypes (P1 and P2). Biological pleiotropy states that the SNP has direct effects on both P1 and P2. Mediation, on the other hand, considers the possibility that perhaps P2 lies in the pathway between the SNP and P1 such that at least part of the association between SNP and P1 is indirect due to each having a direct relationship with P2. Such mediation analyses are increasingly common in genetic studies of complex traits VanderWeele et al., 2012;Gabrielsen et al., 2013).
Existing methods for mediation analysis are not tailored to handle situations where the outcome variable, mediator variable, and genetic variable are of possibly high dimension. Thus, current methods are not applicable in a situation that involves testing whether a group of rare variants is only directly associated with a (possibly high dimensional) subset of the original collection of phenotypes. To fill this important gap, we propose a test for dissecting those phenotypes driving a multivariate association signal through direct associations with a genotype or set of genotypes in a gene. Our idea builds on GAMuT (Broadaway et al., 2016), a powerful method for cross-phenotype testing of both common and rare variants in a gene using a kernel distance-covariance (KDC) framework (Gretton et al., 2008;Székely et al., 2007;Székely and Rizzo, 2009;Hua and Ghosh, 2015). Our new method ("GAMuT-Dissect") identifies whether an observed association between a variant (or a group of variants within a gene) and possibly high-dimensional phenotype data is driven by direct association with a specific subset of these phenotypes (also possibly of high dimension). The method can handle both categorical and continuous data and further can adjust for covariates. Our method also yields analytic p-values without the need of permutation or resampling, thereby facilitating large-scale analysis.
The organization of the manuscript is as follows: in section 2.1, we describe the methods for the pleiotropic analysis of high-dimensional phenotypes (from Broadaway et al. (2016)). In section 2.2, we describe the extension of GAMuT to enable dissection of direct effects among high-dimensional phenotypes. In section 3, we present simulation results demonstrating the feasibility of the approach. We further illustrate the method by applying the technique to genetic and high-dimensional phenotypic data from the Grady Trauma Project dataset to dissect a previously detected association between the gene ZHX2 and the 21-item Beck Depression Inventory. Using our method, we show that the gene may have direct association with only 3 of the 21 items that compose this phenotype. Finally, in section 4, we elaborate on some limitations of our methodology and future areas of improvement.

GAMuT for pleiotropic analysis of high-dimensional phenotypes
GAMuT (Broadaway et al., 2016) studies the association between high-dimensional phenotypes and highdimensional genotypes, through a nonparametric test of independence between two sets of multivariate variables. We assume a sample of N subjects that are genotyped for V genetic variants in a target gene or region. We further assume these subjects are measured for L phenotypes. For subject j (j = 1, . . . , N ), let Y j· = (Y j,1 , Y j,2 , . . . , Y j,L ) be the L-dimensional vector of phenotypes, so that Y ∈ R N ×L is the full matrix of phenotypes. Similarly, let G j· = (G j,1 , G j,2 , . . . , G j,V ) be the genotypes of subject j at V variants in the gene or region of interest. Note that G j,v represents the number of copies of the minor allele that the subject possesses at the v th variant. Thus, the matrix of genotypes for the sample is denoted G ∈ R N ×V .
GAMuT tests for independence between Y (the N × L matrix of multivariate phenotypes) and G (the N × V matrix of multivariate genotypes) by constructing an N × N phenotypic-similarity matrix Q, and an N × N genotypic-similarity matrix X. See Broadaway et al. (2016) for more details on how to model pairwise similarity or dissimilarity.
After constructing the similarity matrices Q and X, we center them as Q c = HQH and X c = HXH, where H = (I − 11 T /N ) is a centering matrix (HH = H), I ∈ R N ×N is an identity matrix, and 1 ∈ R N ×1 is a vector of ones. With the centered similarity matrices (Q c , X c ), we construct the GAMuT test statistic as Under the null hypothesis that the two matrices are independent, T GAM uT follows the asymptotic distribution of a weighted sum of independent and identically distributed χ 2 (1) variables (Broadaway et al., 2016). We then use Davies' method (Davies, 1980) to analytically calculate the p-value of T GAM uT .
Our framework for determining those phenotypes driving a multivariate association signal through direct associations with a set of genotypes involves two separate hypothesis tests: 1) standard GAMuT test to identify if there is association between G and Y (as described in this section), and 2) GAMuT-Dissect to identify if there is association between P 1 (a subset of phenotypes) and G, after removing the effect of the potential mediator P 2 (which will be described in the next section).

GAMuT-Dissect for determining subset of high-dimensional phenotypes with direct genetic effects
Let G ∈ R N ×V be a genotype matrix of N unrelated individuals and V genetic variants. Let Y = [Y ·1 , · · · , Y ·L ] ∈ R N ×L be a phenotype matrix containing L phenotype vector columns Y ·i ∈ R N . Let Y = [P 1 , P 2 ] be a partition of the phenotype matrix with P 2 ∈ R N × 2 as the potential mediator between G and P 1 ∈ R N × 1 , with 1 + 2 = L (see Figure 1). Without loss of generality, we can assume that Y ·i ∈ P 1 for i = 1, · · · , 1 and Y ·i ∈ P 2 for i = 1 + 1, · · · , L. Let Z ∈ R N ×η represent the remaining shared (e.g. polygenic) effect that affects P 1 and P 2 simultaneously (see Figure 1). This effect drives the non-causal association between P 1 and P 2 . Without loss of generality, we assume here that the shared effect is one-dimensional, and thus, η = 1.
GAMuT-Dissect tests the null hypothesis that a direct association between G and P 2 fully explains the previously observed association between G and Y (no dashed line in Figure 1), after removing the effect of Z if we believe such an effect to exist. If Z is believed to exist and is measurable, we can regress out this effect from each phenotype under consideration (see Algorithm 1). If Z is instead assumed to be random, one could possibly eigendecompose the resulting covariance matrix and multiply the phenotypes by the eigenvector matrix (Zhou and Stephens, 2014). We detail this idea more in Discussion.
Under the null hypothesis (H 0 :P 1 ⊥ G|P 2 ),P 1 should not be associated with G after conditioning onP 2 , suggesting thatP 2 explains the previously observed association between G and Y . The alternative hypothesis is that there is association between G andP 1 after conditioning onP 2 , indicating thatP 2 either 1) explains some but not all of the previously observed association between G and Y or 2)P 2 does not contribute to the association between G and Y .
Note that in the algorithm (Algorithm 1), we first create the kernel matrix for the phenotypes and genotypes based on Broadaway et al. (2016). A kernel matrix is a matrix obtained by applying a pairwise kernel function on the phenotype or genotype matrix. Among the kernel functions, we can highlight the linear kernel or projection kernel, but see Broadaway et al. (2016) and references therein for more details.
Finally, we note that this method can also be applied to chromosome X variant sets by following the genotype coding in Ma et al. (2015).

Algorithm 1: GAMuT-Dissect
Input: Genotype matrix G, Phenotype matrix Y = [P 1 , P 2 ], where P 2 is the potential mediator, Z polygenic effects on Y -For every column Y ·i in Y , letỸ i be the residuals of the linear regression Y ·i ∼ Z. If there are covariates in the dataset, Y ·i represents the residuals after regressing out the covariates. -LetỸ = [Ỹ 1 , · · · ,Ỹ ] = [P 1 ,P 2 ] be the new matrix of phenotypes, after the effect of Z has been removed -Let K 1 ∈ R N ×N represent the (linear or projection) kernel matrix ofP 1 , and K G ∈ R N ×N the (linear or projection) kernel matrix of the genotype matrix G. For each column i in K 1 and K G , we will fit a linear regression model onP 2 and extract the residuals. That is, for column i, let K 1[·,i] = [1,P 2 ]β 1 denote the linear regression model, where 1 ∈ R N , β 1 ∈ R 2+1 , the fitted values are given bŷ K 1[·,i] = [1,P 2 ]β 1 , and residuals are given by , the fitted values are given byK G[·,i] = [1,P 2 ]β G , and residuals are given by ·,i] and stored in the i th column of the matrix W ∈ R N ×N . -Perform a standard GAMuT test on the matrices W and U Output: P-value of the hypothesis test of independence between W and U .

Adjusting for covariates
It is important for pleiotropic tests to adjust for important covariates, such as principal components of ancestry, to avoid potential confounding. The adjustment for covariates in GAMuT-Dissect is handled in a similar manner to that in the standard GAMuT (Broadaway et al., 2016). Before applying GAMuT, we can control for confounders by regressing each phenotype on covariates of interest and then using the residuals to form the phenotypic similarity matrix. We note that even though such residualization is not usual for the case of binary phenotypes, some studies have shown the validity of this procedure in genetic association studies (Price et al., 2006;Kang et al., 2010, Wei andLu (2017)). Finally, GAMuT-Dissect can also handle potential cryptic relatedness in a similar fashion to the standard GAMuT by eigendecomposing the sample relatedness matrix and multiplying the phenotypes and genotypes by the eigenvector matrix (Zhou and Stephens, 2014).

Simulations
We conducted simulations to show that GAMuT-Dissect properly preserves type I error and to further assess power under different models of direct, indirect, and no effects of genotype. Under the simulation setup in Figure 2 (left), we used R to simulate a region of 500 independent genetic variants (assuming 1% to be causal) with varying MAF between 0.01 and 0.05 (see the SM for MAF equal to 0.25). We simulated the polygenic effects Z as independent normally distributed with mean 0 and variance 1. We simulated 4 continuous phenotypes: (Y 1 , Y 2 , Y 3 , Y 4 ), with Y 1 and Y 2 associated with G through two different effect sizes: α 1 for Y 1 and α 2 for Y 2 . All 4 phenotypes are correlated through the latent scalar variable Z which represents the remaining polygenic effects not captured by G, with a specific effect size β (assumed β i = β for all i = 1, 2, 3, 4). This remaining polygenic effect is assumed known here. We varied β as 0, 0.25, 0.5, 1.0. For the case of rare variants, the effect sizes (α i ) were multiplied by | log 10 (M AF )| so that the effect size of a given causal variant is inversely proportional to its MAF. For an individual k, the vector where r was randomly sampled from a uniform distribution on (0.3, 0.5) (medium-sized correlation) and with h i = 2α i M AF (1 − M AF ) where the sum is over the causal genetic variants (i = 1, 2, 3, 4). For i = 3, 4, alpha i is set as 0 in the definition of h j .
The effect sizes were varied within the range α i ∈ (0.0, 0.11) for i = 1, 2 so that the R 2 was kept between (0, 0.027). The 4 phenotypes were split in two groups: intermediate group and outcome group under three scenarios (Figure 2 right). Scenario 1 models G as having direct effects on intermediate variables only. Scenario 2 models G as having direct effects on both intermediate and outcome variables. Finally, Scenario 3 assumes G has direct effects on outcome variables only.
We used a sample size of 5000 unrelated individuals, and replicated each simulation setting 100 times for power simulations and 1000 for null simulations. The empirical power was estimated by computing the proportion of p-values less than the significance level (α = 0.05) out of 100 replicates per scenario.

Analysis of the Grady Trauma Project
The Grady Trauma Project (GTP) studies genetic risk factors associated with psychiatric disorders such as PTSD and depression (Bradley et al., 2008;Ressler et al., 2011) (protocols approved by the IRBs of Emory University School of Medicine and Grady Memorial Hospital). The participants in GTP are predominantly African-American of low socioeconomic status, served by the Grady Hospital in Atlanta, Georgia. GTP staff collected an Oragene salivary sample for DNA extraction. Participants were genotyped on the Illumina HumanOmni1-Quad array for GWAS analyses. After standard GWAS quality control filters, 4,607 African-American subjects remained in the dataset with good quality genotype data.
GTP staff also conducts an extensive verbal interview, with demographic information, history of stressful life events, and several psychological surveys, including the Beck Depression Inventory (BDI). The BDI contains 21 groups of statements that represent diverse symptoms and attitudes associated with depression. Each group includes 4 statements with intensity in a scale of 0 to 3. The 21 groups are in Table 1. The BDI is generally self-administered or self-reported, and is scored by summing the ratings given to each of the 21 items. Summing the responses yields a score ranging from 0-63, with scores higher than 28 being indicative of moderate to severe depression. Subjects who did not report at least one past trauma, subjects with missing BDI scores, or subjects with incomplete covariate data (age, gender, and the top ten principal components to account for ancestry) were also removed, and thus, the final sample contained 3,520 subjects. Holleman et al. (2019) applied GAMuT to the BDI data using a linear kernel to measure pairwise phenotypic similarity in multivariate symptom scores on 765,580 common genetic variants (MAF > 5%) that fell within 19,609 autosomal genes. Their analyses identified one gene exceeding study-wise significance: ZHX2, on chromosome 8 was strongly associated with the symptoms in the BDI (P = 2.73 × 10 −6 ). Previous studies have identified a possible link between ZHX2 and autism spectrum disorder (Walker and Scherer, 2013). Here, we used GAMuT-Dissect to explore whether the significant association between ZHX2 and the 21 symptoms within the BDI is driven by direct associations of the variants in this gene with only a select subset of the symptoms.
For the real-life data application, the set of mediators is not known in advance. We propose a sequential approach in which first each individual phenotype is tested as a potential mediator (with the remaining L − 1 in P 1 ). Then, the set of potential mediators is created by including those phenotypes identified as mediators by the GAMuT-Dissect test. In a second stage, pairs of mediators are tested together in P 2 (with the remaining L − 2 phenotypes in P 1 ). This same procedure is followed until the largest possible set of mediators is identified by GAMuT-Dissect. We then adjust for multiple testing (Schaid et al., 2016;Stevens et al., 2017). See the Discussion for more details on the limitations of the sequential approach. Figure 3 shows the quantile-quantile (QQ) plots of 1000 null simulations of Scenario 1 (Figure 2) for our GAMuT-Dissect test when we adjust by the polygenic effect (Z) on all the phenotypes (as described in algorithm 1). We do not observe any inflation of p-values, and thus, the type I error for the null hypothesis of Scenario 1 is well-preserved. Thus, our method is properly calibrated for the situation where the genetic variants have direct effects only on the intermediate variables.

Power simulations (Scenario 2)
Here, we show that GAMuT-Dissect has power to detect Scenario 2 where genetic variants have direct effects on both intermediate and outcome variables (see Figure 2 right) after controlling for polygenic effect Z.
We present power plots as a function of R 2 corresponding to α 1 , because this effect is the one driving the simulation model, with different lines for R 2 corresponding to α 2 . Figure 4 shows that GAMuT-Dissect has increasing power as the effect α 1 increases, regardless of the value of α 2 . We also show that the parameters chosen for the effect sizes (α 1 , α 2 ) produce datasets with enough power to detect omnibus association with G (standard GAMuT in Figure 5).

Power simulations (Scenario 3)
Here, we show that GAMuT-Dissect has power to detect Scenario 3 (see Figure 2 right) where the genetic variants have direct effects only on outcome variables after adjustment for polygenic effects Z. We present power plots as a function of R 2 corresponding to α 1 , because this is the effect between G and the outcome. Figure 6 shows that GAMuT-Dissect has increasing power as the effect α 1 increases. We also show that the parameters chosen for the effect size (α 1 ) produce datasets with enough power to detect omnibus association with G (standard GAMuT in Figure 7).

Analysis of the Grady Trauma Project
With the GTP dataset, we applied GAMuT-Dissect by sequentially setting each of the 21 items composing the Beck Depression Inventory as a potential mediator between the variants in gene ZHX2 and the remaining 20 items in order to test whether particular BDI items were driving the association with ZHX2. Just as in Holleman et al. (2019), we controlled for gender, age, and ancestry, and used the estimated log odds ratios from external GWAS from the Psychiatric Genomics Consortium (Consortium et al., 2012;Group et al., 2011;Consortium et al., 2014) as the variants' weights in GAMuT. We found that ZHX2 variants appeared to have direct effects on 3 items if we use the standard significance threshold of 0.05 (red line in Figure 8): self-dislike, worthlessness, irritability. This result means that there is evidence (at the 5% significance level) that the ZHX2 gene perhaps is directly associated with these symptoms (self-dislike, worthlessness, irritability), while having only weak or no direct associations with the other 18. By applying a Bonferroni multiple testing correction, we identify 8 mediator candidates (blue line in Figure 8): pessimism, self-dislike, self-criticalness, agitation, indecisiveness, worthlessness, irritability, concentration.
We next assessed whether the group of 8 items identified in our sequential analysis fully explains the association between ZHX2 and the complete set of 21 items. When we perform this new GAMuT-Dissect test of whether the set of all 8 phenotypes are directly associated with gene ZHX2, and the gene is not associated with the other 13 phenotypes, we again fail to reject the null hypothesis (Scenario 1) at the standard threshold level of 0.05 with p-value 0.1815221. Thus, there is evidence that these 8 phenotypes fully account for the association we originally observed in our omnibus GAMuT test between the ZHX2 gene and the 21 items that compose the BDI. This provides extra information that the standard GAMuT (or another standard pleitropy test) is unable to elucidate. While standard pleitropy tests are able to identify potential associations of one gene (ZHX2) with a collection of phenotypes (in this case, the 21 BDI symptoms), these tests cannot tease apart which phenotypes are directly associated with the gene and are driving the original multivariate signal. With our proposed GAMuT-Dissect, we are able to distinguish that there is evidence for a direct effect between gene ZHX2 and 8 symptoms (pessimism, self-dislike, self-criticalness, agitation, indecisiveness, worthlessness, irritability, concentration), while the remaining 13 symptoms do not appear directly influenced by the gene. We conducted a standard GAMuT between ZHX2 and these remaining 13 symptoms, which yielded a suggestive p-value of 5.5e-4. This result indicates that ZHX2 has potential indirect effects on at least a subset of the 13 symptoms mediated by the 8 directly associated symptoms identified by GAMuT-Dissect. We further tested if each of the 8 phenotypes was a mediator of the other 7. We identified 3 symptoms that fully mediate the relationship with the ZHX2 gene of the remaining 7 symptoms for a significance level of 0.625% (Bonferroni correction): self-dislike, worthlessness and irritability. Next, we tested whether those 3 symptoms fully mediated the relationship with ZHX2 of the remaining 5 and we were unable to reject full mediation with a p-value of 0.8917.
A followup study could focus on these 3 symptoms, and a more detailed phenotyping endevour, which could potentially help identify a more fine-grained collection of symptoms concentrated on self-dislike, worthlessness, irritability that are directly associated with the ZHX2 gene.

Discussion
A variety of genetic studies employ multivariate tests of multiple correlated phenotypes to exploit likely pleiotropy to improve power. In this work, we focus on the topic of follow-up genetic study of a significant omnibus test of cross-phenotype association. We propose a novel method built on the GAMuT framework for cross-phenotype testing of a gene set using kernel distance-covariance techniques. Our new kernel method, called GAMuT-Dissect, teases apart the significant signal from an omnibus test in order to determine the subset of phenotypes where the genetic variant (or set of variants) has a direct effect. Such information can yield improved biological insight of the traits under study.
Unlike existing mediation techniques that can tease apart direct and indirect genetic effects, GAMuT-Dissect can handle both high-dimensional phenotypic and genotypic data. Our method can also handle both categorical and continuous data and further adjusts for covariates. Furthermore, since GAMuT derives analytic P-values from Davies' exact method, our proposed methodology is computationally efficient and applicable on a genome-wide scale. We simulated phenotypic data under a variety of direct and indirect genetic effects and showed our approach had good behavior across a wide range of models. We further illustrated our method using GWAS data from the Grady Trauma Project and showed that an existing signal between genetic variants in the ZHX2 gene and 21 items within the Beck Depression Inventory (Holleman et al., 2019) appears to be due to a direct effect of these variants on only 3 of these items.
We conclude with some comments on the characteristics of our method.
Adjustment due to shared measured or unmeasured effects. First, our method requires adjustment for any shared effects (due to a common cause) among phenotypes that are independent of the genetic variants of interest. Failure to adjust for these shared effects (or similar covariates) can lead to misleading inference. If these shared effects are measured, we can regress such effects out prior to analysis. If the shared effects are random (i.e. modeling the polygenic correlations among phenotypes using genomewide correlation estimates from Bulik-Sullivan et al. 2015), we can decorrelate the phenotypes prior to analysis using the technique of (Zhou and Stephens, 2014) that eigendecomposes the resulting correlation/covariance matrix and multiplies the phenotypes by the eigenvector matrix.
Interpreting the omnibus association. As noted when describing our method, GAMuT-Dissect tests the null that the omnibus association is completely explained by the subset of phenotypes modeled as the intermediate variables. Thus, when applying our method in general, special care should be used in interpreting the findings when rejecting or failing to reject the null, especially in relation to standard mediation techniques.

Potential collider bias.
Regarding the fact that GAMuT-Dissect teases apart the direct effects of G by conditioning on intermediate variables P 2 (see Figure 1), we urge caution in interpretation of GAMuT-Dissect if there is reason to believe the outcome variables P 1 causes P 2 since the conditioning performed as part of our framework would then lead to collider bias. Thus, the underlying causal structure should be thought about before conditioning on a certain intermediate, in order to avoid a biased result.
Greedy sequential approach to identify the mediator set. In this work, we assumed that the subsets P 1 and P 2 (mediator) are known in advance. This is likely not the case in some real-life applications. When the set of mediators is not known, one alternative is to devise a sequential approach in which first each individual phenotype is tested as a potential mediator (with the remaining L − 1 in P 1 ). Then, the set of potential mediators is created by including those phenotypes identified as mediators by the GAMuT-Dissect test. In a second stage, pairs of mediators are tested together in P 2 (with the remaining L − 2 phenotypes in P 1 ). This same procedure is followed until the largest possible set of mediators is identified by GAMuT-Dissect. More is need to be studied of this greedy approach and will be left as future work. In particular, it is necessary to explore the control of type I error under this sequential approach that includes multiple (correlated) tests. om/epstein-software.

Data Availability:
The Grady Trauma Project dataset is available from KJR on reasonable request. P 1 P 2P 1 P 2 Figure 1: Left: Z is a latent variable that represents the polygenic effect affecting both P 1 and P 2 . If the dashed line is absent, the genotype G is only directly associated with P 2 , so any association between G and P 1 is due to mediation or correlation. Right:P 1 andP 2 represent the residuals after regressing out the latent variable Z.    Empirical power for GAMuT-Dissect as a function of the R 2 corresponding to the effect α 1 under Scenario 2. Different colors for different R 2 for the effect α 2 (no major differences by color). Simulated datasets (100) assumed a window of 500 rare genetic variants with a given MAF (0.01 left and 0.05 right). Different panels correspond to different polygenic effects (Z). Bigger polygenic effects of Z correspond to higher correlation between the phenotypes.    P-values of GAMuT-Dissect using each phenotype of GTP as a potential mediator one at a time between the ZHX2 gene and the other phenotypes in the BDI questionnaire (after adjusting for covariates). The null hypothesis of Scenario 1 is not rejected in three cases (significance level 5% (red line) and Bonferroni-corrected 0.23% (blue line)).

Supplementary material
Type I Error Simulations: Single genetic variant Figure 9 shows the quantile-quantile (QQ) plots of 100 null simulations of no association (α 1 = α 2 = 0 in simulation scenario figure in main text) for the standard GAMuT test (Broadaway et al., 2016) and for the mediation GAMuT test. Both methods properly control the type I error under the scenario of no association.

Power Simulations of partial mediation: Single genetic variant
Here, we show that the mediation GAMuT has power to detect partial mediation. We present power plots as a function of R 2 corresponding to α 1 , because this effect is the one driving the partial mediation, with different lines for R 2 corresponding to α 2 . Figure 11 shows that the mediation GAMuT has increasing power as the effect α 1 increases, regardless of the value of α 2 . We also show that the parameters chosen for the effect sizes (α 1 , α 2 ) produce datasets with enough power to detect association with G (standard GAMuT in figure 12).

Power Simulations of no mediation: Single genetic variant
Here, we show that the mediation GAMuT does not have power to distinguish no mediation from the full mediation (null hypothesis) for the case of correlated phenotypes when we estimate the polygenic effect Z   Figure 11: Partial mediation with single genetic variant. Empirical power for mediation GAMuT as a function of R 2 corresponding to the effect α 1 under partial mediation. Different colors for different R 2 for the effect α 2 (no major differences by color). Simulated datasets (100) assumed a single genetic variant with a given MAF. Different panels correspond to different polygenic effects (Z). Bigger polygenic effects of Z correspond to higher correlation among the phenotypes. Empirical power for standard GAMuT as a function of R 2 corresponding to the effect α 1 under partial mediation. Different colors for different R 2 for the effect α 2 (which contributes to the association). Simulated datasets (100) assumed a single genetic variant with a given MAF. Different panels correspond to different polygenic effects (Z). Bigger polygenic effects of Z correspond to higher correlation between the phenotypes.
with P C1. We present power plots as a function of R 2 corresponding to α 1 (which we assume to be the same effect for both Y 1 and Y 2 in this scenario for simplicity).
In this setup, the mediator set [Y 3 , Y 4 ] (simulation scenario figure in main text) is not associated with G. When the phenotypes are uncorrelated (effect of Z is zero in figure 13), the empirical power has a peak at a certain effect size α 1 . We suspect that as the effect size (α 1 ) increases, the PC represents this signal instead of the effect from Z. Thus, when we residualize by the PCs, we lose the association signal. When the phenotypes are correlated (effect of Z greater than zero), suddenly the empirical power is close to the significance level, hence, the test does not have power to detect between the alternative hypothesis of no mediation and the null hypothesis of full mediation. This situation is not created by lack of signal of an association between G and the set of all phenotypes ([Y 1 , Y 2 , Y 3 , Y 4 ]) as shown in figure 14.  Empirical power for standard GAMuT as a function of the R 2 corresponding to the effect α 1 in the case of no mediation. Simulated datasets (100) assumed a single genetic variant with a given MAF. Different panels correspond to different polygenic effects (Z). Bigger polygenic effects of Z correspond to higher correlation between the phenotypes.