Abstract
Interaction analysis is used to investigate the effect that two risk factors have on each other and on disease risk. To study interactions, both additive and multiplicative models have been used, although their interpretations are not universally understood. In this study, we simulated several scenarios of risk-factor relationships and investigated the resulting interactions using additive or multiplicative models. Independent risk factors approach an additive effect at low disease prevalence and otherwise show a sub-additive relationship. However, risk factors that contribute to the same chain of events (i.e. have synergy) lead to multiplicative relative risk. Thresholds on the number of required risk factors lead to intermediaries between additive and multiplicative risk. We propose a novel metric of interaction consistent with additive, multiplicative and multifactorial threshold models. Finally, we demonstrate the utility of the simulation strategy and the discovered relationships by analyzing and interpreting gene-gene odds ratios obtained in a rheumatoid arthritis cohort.
Introduction
Screening the genetics of large cohorts of individuals can identify genetic loci that impact phenotypic traits on a gene-by-gene basis1, e.g. linking single-nucleotide polymorphisms to traits in genome-wide association studies. For some diseases, these studies have resulted in lists of over a hundred associated risk genes2–4. However, these genes have not been sufficient for understanding why a particular individual gets a certain disease. Although combinations of genetic and/or environmental risk factors have, for several diseases, been observed to confer a larger than expected risk when both factors are present, it is still a challenge to resolve if, and how, these multiple factors interact in shaping traits, and to biologically interpret the identified interactions5.
The association between individual genetic loci and an outcome (e.g. disease) is typically quantified as odds ratios or relative risks. Often the case-control design is used to query low prevalence diseases, in which odds ratios approximate the relative risk in the population even if the samples are unevenly drawn. Interaction tests among risk factors are often examined pairwise, yielding three odds (or risk) ratios notated as6: OR11 for carrying both risk factors, and OR10 and OR01 for the exclusive combinations. The lack of both risk factors, OR00, is used as reference (OR00=1). Confusingly, two different null models are commonly used, the additive (OR11 = OR10 + OR01 − 1) and the multiplicative (OR11 = OR10 ⋅ OR01). The additive null model builds on work by KJ Rothman7, who showed that if two factors are part of the disease’s cause and are part of the same sufficient cause (e.g. pathway), then their joint risk will be larger than the sum of their individual risks (often termed “departure from additivity”). This additive model has been criticized for always giving positive results8,9. On the other hand, the multiplicative model has been criticized as a statistical convenience without theoretical basis, boosted by the implicit multiplicativity in logistic regression8,10,11.
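As a minimal sketch of the two null models named above, the expected odds ratio for the doubly exposed can be computed from the two single-exposure odds ratios. The function names and the example values (OR10 = 2, OR01 = 3) are hypothetical and chosen only for illustration.

```python
def expected_additive(or10: float, or01: float) -> float:
    """Expected OR11 under the additive null model: OR10 + OR01 - 1."""
    return or10 + or01 - 1.0


def expected_multiplicative(or10: float, or01: float) -> float:
    """Expected OR11 under the multiplicative null model: OR10 * OR01."""
    return or10 * or01


# Hypothetical single-exposure odds ratios (not taken from the study).
or10, or01 = 2.0, 3.0
additive_null = expected_additive(or10, or01)              # 4.0
multiplicative_null = expected_multiplicative(or10, or01)  # 6.0
```

An observed OR11 can then be compared against either expectation, depending on which null hypothesis is being tested.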
In this study we strive to lessen the confusion by using simulations of various models of interaction. We thereafter show how additive and multiplicative risk scales compare to our simulation models, aiming to help a broader group of geneticists and epidemiologists to understand and interpret different models of interaction.
Results
Models used for simulation
To better understand risk interactions for qualitative traits, we performed simulations. As their basis, we designed five different models (Model I-V) of interactions with increasing complexity. We will later show how these models are related. Across all simulations the risk factors are dichotomous, and neither necessary nor sufficient for disease to occur.
In the single-group model (Model I), individuals are first split into cases and controls and then subject to independent “spiking in” (i.e. artificial creation) of two risk factors with higher frequency among cases than controls (Figure 1a). The etiological meaning of this model is unclear, but the model has been used in the past, in part due to its simplicity9.
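The single-group setup can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used for the figures: the function names, the 50% fraction of cases and the spiked-in frequencies are all hypothetical, and only the Python standard library is used.

```python
import random


def simulate_model_i(n, prevalence, x_freq, y_freq, seed=0):
    """Count individuals per (case, X, Y) cell.

    x_freq and y_freq are (frequency among cases, frequency among controls);
    both factors are spiked in independently of each other.
    """
    rng = random.Random(seed)
    counts = {}
    for _ in range(n):
        case = rng.random() < prevalence
        x = rng.random() < x_freq[0 if case else 1]
        y = rng.random() < y_freq[0 if case else 1]
        counts[(case, x, y)] = counts.get((case, x, y), 0) + 1
    return counts


def odds_ratio(counts, x, y):
    """Odds ratio of exposure cell (x, y) against the unexposed reference."""
    exposed = counts[(True, x, y)] / counts[(False, x, y)]
    reference = counts[(True, False, False)] / counts[(False, False, False)]
    return exposed / reference


# Hypothetical frequencies: X 30% vs 10%, Y 20% vs 10% (cases vs controls).
counts = simulate_model_i(200_000, 0.5, x_freq=(0.3, 0.1), y_freq=(0.2, 0.1))
or10 = odds_ratio(counts, True, False)
or01 = odds_ratio(counts, False, True)
or11 = odds_ratio(counts, True, True)
```

With these assumed frequencies, `or11` should land close to `or10 * or01` and well above `or10 + or01 - 1`, up to sampling noise.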
The next two models (separate-groups models, Model II and Model III) are versions of spiking-in two risk factors into two groups of simulated individuals. These two groups are random and independent from the case and control groups. Each split only has an increase in frequency of one of the risk factors, so the risk factors cannot interact and are thus forced to be independent of each other. In Model II (Figure 1b), the frequency of the non-risk factor corresponds to the overall frequency in each random split (i.e. the same frequency for cases and controls). In Model III (Figure 1c) the frequency of the non-risk factor in each split is instead set to the frequency among the controls in the other split. Model II was designed to mimic mathematical addition, whereas Model III is a low-(trait)prevalence simplification of Models II and V (the latter of which is described below) designed to work for case-control setups.
In Models IV and V, we simulated more explicit relationships between the two risk factors using the AND/OR relationships from Boolean logic. First, we assign randomly drawn cases and controls to two different types of groups, arbitrarily called component 1 and component 2 (comp1 and comp2, respectively). Then, we spike in one risk factor per component by increasing its frequency among the cases. Finally, we implement the respective Boolean logic to assign case/control status. In Model IV we applied AND to assign cases, meaning that an individual was a case only if it was a case in both comp1 and comp2 (Figure 1d), while in Model V we assigned case status if it was a case in either comp1 or comp2 (Figure 1e). Because there will be simulated individuals that are comp1 cases and not exposed to the risk factor, and individuals that are comp1 controls yet exposed to the risk factor, this is a simulation of two risk factors that are neither necessary nor sufficient to develop disease, but that will have different risk levels. Thus, Model IV requires the two risk factors to be jointly present for disease to develop, where X and Y are part of different mechanisms (i.e. synergism between causes). In contrast, Model V corresponds to multiple mechanisms yielding disease, with the risk factors X and Y taking part in independent mechanisms which separately cause the same phenotype (i.e. heterogeneity of causes).
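The AND/OR logic can be illustrated without random sampling by enumerating every combination of component status and exposure and summing exact probabilities. This swaps the simulation for an exact calculation, and it assumes that comp1 and comp2 are statistically independent, that X is spiked into comp1 only and Y into comp2 only. All names and parameter values below are hypothetical.

```python
from itertools import product


def cell_risks(combine, q1, q2, x_case, x_ctrl, y_case, y_ctrl):
    """Return P(case | X=x, Y=y) for each exposure cell by exact enumeration.

    q1, q2: fraction of cases within comp1 and comp2 (assumed independent).
    x_case/x_ctrl: frequency of X among comp1 cases/controls (same for Y, comp2).
    combine: Boolean rule turning the two component statuses into case status.
    """
    joint = {}  # (x, y) -> [P(case and cell), P(cell)]
    for c1, c2, x, y in product((0, 1), repeat=4):
        p = (q1 if c1 else 1 - q1) * (q2 if c2 else 1 - q2)
        px = x_case if c1 else x_ctrl
        py = y_case if c2 else y_ctrl
        p *= (px if x else 1 - px) * (py if y else 1 - py)
        cell = joint.setdefault((x, y), [0.0, 0.0])
        cell[1] += p
        if combine(c1, c2):
            cell[0] += p
    return {cell: v[0] / v[1] for cell, v in joint.items()}


def relative_risks(risks):
    """Return (RR10, RR01, RR11) against the unexposed cell."""
    ref = risks[(0, 0)]
    return risks[(1, 0)] / ref, risks[(0, 1)] / ref, risks[(1, 1)] / ref


# Model IV (AND) at a moderate fraction of cases per component.
iv = relative_risks(cell_risks(lambda a, b: a and b, 0.05, 0.05, 0.3, 0.1, 0.2, 0.1))
# Model V (OR) at a low fraction of cases per component.
v = relative_risks(cell_risks(lambda a, b: a or b, 0.001, 0.001, 0.3, 0.1, 0.2, 0.1))
```

Under these assumptions the AND rule gives `iv[2]` equal to `iv[0] * iv[1]` up to floating-point error, while the OR rule gives `v[2]` slightly below `v[0] + v[1] - 1`.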
Relative risks
As we wanted to learn how additive and multiplicative risk scales compare to these simulation models, we calculated the relative risk of having both risk factors, for a range of simulated frequencies of the risk factors. We then compared these observed relative risks to the expected value based on additive or multiplicative combination of the relative risk for the individual risk factors, and varied the fraction of cases (corresponding to outcome/disease prevalence) (Figure 2a).
For all fractions of cases, Model II and Model IV stood out. Model II, which had separated risk factors, followed additivity and thus showed no interaction term on the additive scale. Model IV, with full synergism between the two risk factors, instead followed multiplicativity, and showed interaction on the additive scale.
As the fraction of cases decreased, each of the remaining models (Model I, III and V) became indistinguishable from either Model II or Model IV with respect to relative risk, odds ratio (Figure 2) and correlations (Supplementary Figure 1), implying that the matching models are approximations of one another under low disease prevalence. Model I, the simple model of spiking in two factors with higher frequency in a single group of cases, produced multiplicative effects, indicating that it becomes a version of the logical AND model (Model IV). The logical OR model (Model V), on the other hand, showed additive behavior, as did the second separation model (Model III). Reassuringly, we could derive the same conclusions from algebra, both about the multiplicative relative risks for Model IV and about the relative risk approaching additivity as prevalence decreases for Model V (Figure 2b-c). The other three models (Model I-III) did not have obvious formulas we could work with. Model V had a lower-than-additive relative risk for the doubly exposed at higher frequency of cases (Figure 2a). This is consistent with the prediction from the algebra (Figure 2c), where high penetrance coefficients (i and k) had the same effect. We could also extend Model IV and V to three-factor models, as tested on a mixed logical AND and OR model, which produced a relative risk in line with our results on two-factor models (Supplementary Figure 2).
Theoretically, the relative risk for the doubly exposed in Model IV should be RR11 = RR10 + RR01 − 1 − a⋅b⋅m/(1−m), where a = RR10 − 1, b = RR01 − 1, m = (−1 + a⋅x − b⋅y − p ± (p^2 − 2⋅a⋅x⋅p − 2⋅b⋅y⋅p − 4⋅a⋅x⋅b⋅y⋅p − 2⋅p + a^2⋅x^2 + 2⋅a⋅x + b^2⋅y^2 + 2⋅b⋅y + 2⋅a⋅x⋅b⋅y + 1)^0.5) / (2⋅(−1 − a⋅x − b⋅y − a⋅x⋅b⋅y)), p = frequency of outcome, x = frequency of X, and y = frequency of Y. However, we could not get this formula, with or without the approximation m/(1−m) ≈ p, to agree with the observed values for RR11 in the simulation (data not shown).
Odds ratios
We calculated odds ratios (Figure 2d) and performed the same comparisons as we had for relative risks, in order to identify good simulation setups for case-control studies. Two models, not the same ones as for relative risk, stably followed additive or multiplicative risk at all simulated frequencies of cases. The odds ratios for the double risk from Model I followed multiplicativity, and Model III produced additive odds ratios. At a low fraction of cases the remaining models (Model II, IV, V) converged the same way as they did for relative risk. The same was true for correlations between the risk factors (Supplementary Figure 1), where only the models that produced additive risk had negative correlation among cases.
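For reference, the three odds ratios can be computed from case and control counts per exposure cell as sketched below. The function name and the counts are hypothetical; the counts were chosen so that the result is exactly multiplicative.

```python
def odds_ratios(cases, controls):
    """Odds ratio of each exposure cell (x, y) against the unexposed cell (0, 0).

    cases and controls map (x, y) -> number of individuals in that cell.
    """
    ref = cases[(0, 0)] / controls[(0, 0)]
    return {cell: (cases[cell] / controls[cell]) / ref for cell in cases}


# Hypothetical counts, keyed by (X present, Y present).
cases = {(1, 1): 60, (1, 0): 240, (0, 1): 140, (0, 0): 560}
controls = {(1, 1): 10, (1, 0): 90, (0, 1): 90, (0, 0): 810}

ors = odds_ratios(cases, controls)
or11, or10, or01 = ors[(1, 1)], ors[(1, 0)], ors[(0, 1)]
```

Here `or11` equals `or10 * or01`, so this toy table would show no deviation from the multiplicative null but a clear deviation from the additive one.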
Multifactorial thresholds
While we have already shown mechanistic relationships that give rise to additive or multiplicative relative risks and odds ratios, it would also be useful to know what kind of risk factor relationships give rise to intermediary interaction terms between additive and multiplicative. As we were interested in what multifactorial thresholds would do to mathematical risk relationships, we set up a simulation (Figure 3a) with five equally common components that, when reaching a certain threshold, cause disease. The extreme thresholds 1 and 5 correspond directly to Models V and IV respectively, and therefore cause additive or multiplicative risk respectively (Figure 3b). More important are the intermediary thresholds, which give relative risk estimates for the doubly exposed between additivity and multiplicativity. Specifically, we define the ratio fthr = (t−1)/(F−1), where F is the number of components (factors) that can cross the threshold t. We found that, for the intermediary thresholds, the relative risk for the doubly exposed was (1 − √f) ⋅ expected(additive) + √f ⋅ expected(multiplicative) (Figure 3b). Here √f is the square root of fthr. This formula for the double risk can be inverted, and the observed OR11 plugged in; this produced a metric √fest (signed square root of the estimated multifactorial threshold fraction) with the convenient properties of having both a natural lower value (0 for additive) and a natural higher value (1 for multiplicative) as well as an interpretable scale in between through its connection to the threshold fraction fthr (Figure 3c). While the metric fest = sign(√fest) ⋅ (√fest)^2 had a more natural scale, unlike √fest it was not symmetric around the median (data not shown). The large spread for √fest at threshold 1 (Figure 3c) could have resulted from having cases that did not involve components 1 or 2 in this simulation model (Figure 3a), thus lowering odds ratios, rather than intrinsically from additive risk.
√fest is related to another measure of interaction size, relative excess risk due to interaction (RERI), by √fest = RERI/((OR10 − 1) ⋅ (OR01 − 1)), and could perhaps be used instead of measures like attributable proportion, synergy index and RERI, given that it can pinpoint multiplicative risk when testing on the additive scale (and vice versa).
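The interpolation between the two expectations can be written as a short function. This is a sketch of the stated formula only, with hypothetical names and example values; it does not reproduce the threshold simulation itself.

```python
import math


def expected_double_risk(rr10, rr01, threshold, n_components):
    """Interpolate between additive and multiplicative expectations.

    f_thr = (t - 1) / (F - 1); the weight on the multiplicative expectation
    is the square root of f_thr.
    """
    f_thr = (threshold - 1) / (n_components - 1)
    weight = math.sqrt(f_thr)
    additive = rr10 + rr01 - 1
    multiplicative = rr10 * rr01
    return (1 - weight) * additive + weight * multiplicative


# Hypothetical single-exposure relative risks with F = 5 components.
at_threshold_1 = expected_double_risk(2.0, 3.0, threshold=1, n_components=5)  # additive
at_threshold_2 = expected_double_risk(2.0, 3.0, threshold=2, n_components=5)  # halfway
at_threshold_5 = expected_double_risk(2.0, 3.0, threshold=5, n_components=5)  # multiplicative
```

With F = 5, a threshold of 2 gives f_thr = 0.25 and therefore a weight of 0.5, so the expected value sits halfway between the additive and multiplicative expectations.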
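The metric can be sketched directly from its relation to RERI, using the standard definition RERI = OR11 − OR10 − OR01 + 1. Function names and input values are hypothetical, and the sketch assumes that both single-exposure odds ratios differ from 1, since the denominator is otherwise zero.

```python
import math


def sqrt_f_est(or11, or10, or01):
    """Signed square root of the estimated threshold fraction.

    Equals RERI / ((OR10 - 1) * (OR01 - 1)); 0 means additive and 1 means
    multiplicative. Undefined when OR10 or OR01 equals 1.
    """
    reri = or11 - or10 - or01 + 1
    return reri / ((or10 - 1) * (or01 - 1))


def f_est(or11, or10, or01):
    """Estimated threshold fraction: sign(s) * s**2 for s = sqrt_f_est."""
    s = sqrt_f_est(or11, or10, or01)
    return math.copysign(s * s, s)


# Hypothetical odds ratios with OR10 = 2 and OR01 = 3.
additive_case = sqrt_f_est(4.0, 2.0, 3.0)        # 0.0
multiplicative_case = sqrt_f_est(6.0, 2.0, 3.0)  # 1.0
intermediate_case = sqrt_f_est(5.0, 2.0, 3.0)    # 0.5
```

Values below 0 indicate a less-than-additive OR11, and values above 1 a more-than-multiplicative one.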
Example from rheumatoid arthritis
Both the synergism (Model IV) and heterogeneity (Model V) models represent interesting relationships between two risk factors, and the appropriate interaction model to use depends on the hypothesis one is interested in. Given the lack of insight into many RA risk loci, we needed a hypothesis-free approach, and the closest to that is evaluating both null hypotheses, as that would cover both models as well as threshold-based scenarios due to their intermediate nature (i.e. they would fail both types of tests, in opposite directions). We therefore evaluated both additive and multiplicative interaction on a case-control genome-wide association dataset for anti-citrullinated protein antibody positive (ACPA-positive) rheumatoid arthritis (RA), from the Swedish epidemiological investigation of RA (EIRA) cohort. For the two top genetic risk factors for RA in populations of European descent, HLA-DRB1 shared epitope and PTPN22 rs2476601 T, we tested the risk factor against all non-HLA risk SNPs. HLA-DRB1 shared epitope is a group of alleles with similar effect, and rs2476601 is a non-synonymous coding variant of the PTPN22 gene. Two tests were used, one which used additivity as null hypothesis and one that used multiplicativity as null hypothesis. We found that there was no detectable deviation from multiplicativity, but there was a deviation from additivity (Figure 4a-b). In the case of HLA-DRB1 shared epitope, we have published on the deviation from additivity before12. The new simulation presented here has increased our ability to interpret this result as a widespread interaction between HLA-DRB1 shared epitope and all non-HLA genetic risk factors, in the common meaning of interaction where synergism is a type of interaction. From it, we can derive that the HLA-DRB1 shared epitope cannot be substituted for (i.e. phenocopied by) a non-HLA genetic risk factor for its part in the chain of ACPA-positive RA etiology (Figure 4c).
The same is the case for the PTPN22 risk allele, given the similarities in P-value distributions we observed (Figure 4a-b). For both sets of tests, there was too little data at a majority of the tested loci to distinguish additive from multiplicative odds ratios. We followed up the results of multiplicativity by looking only at known risk SNPs (Figure 4c), but found results similar to a randomization based on Model I (and therefore bound to produce multiplicative odds ratios), with similar variability (P=0.6-0.8, Levene’s test), implying a dearth of non-multiplicative odds ratios (Figure 4d). This randomization is the same as Test III of Ignac et al13. We also devised a randomization scheme creating additive odds ratios based on Model III and tested it on the full SNP set (Supplementary Figure 3), where it, as expected, deviated very noticeably from the real data.
Discussion
We herein present a simulation approach intended to help interpretation of additive and multiplicative interaction of relative risks and odds ratios. We show that additivity of risk factors occurs when the risk is comprised of two independent risk factors (Model V, Figure 1e), or a process that approximates that setup at a given fraction of cases. Multiplicativity of risk factors, and deviation from additivity, as well as negative correlation between risk factors, follows if two different mechanisms are required for disease (Model IV, Figure 1d). Depending on the hypothesis, one should therefore choose the appropriate statistical test. For example, if the null hypothesis is that two factors are co-operating in causing a disease, this could be tested using deviation from a multiplicative effect. Often, however, we are interested in testing whether disease is caused by the interaction of two factors, and then it is appropriate to test for deviation from additivity.
Our inspiration for this work came from a simulation study9, which in turn discussed our previous research on RA12, in which we detected deviation from additivity between risk factors. In the simulation study9, case and control status was randomly assigned, one risk factor was spiked in to resemble the strongest genetic risk factor for ACPA-positive RA, and interaction with other risk factors (selected by p-value for risk) was computed. The simulation led to an overrepresentation of additive interactions (i.e. deviation from additive odds ratios). However, selecting by risk p-value from a large random set is equivalent to spiking-in, except for real-world allele frequencies and population substructures (as opposed to arbitrary frequencies and statistical independence). Thus, this simulation9 was set up equivalent to Model I (Figure 1a). The author9 noted that this model produced a multiplicative null model that does not match additivity and concluded that the additive interactions observed were erroneous, as no interaction should be present. However, as mentioned, the setup in the simulation is similar to our Model I, and if the results are put in a causal context, the assumption about no interaction and the conclusion that follows from it are incorrect. We instead propose an alternative interpretation, based on the convergences we found at low prevalence: Model I has an intrinsic interaction in the meaning of synergy between the risk factors14, as the model is equivalent to taking Model IV at low prevalence and subsampling it with a bias for cases (unchanged odds ratio means Model I is unaffected by biased sampling, and the equal match to the multiplicative model at low prevalence, and in terms of correlations, means they become the same at a prevalence like that of RA: 0.7% for all RA in Sweden15, of which 60% are ACPA-positive16).
The simulation study9 is therefore in line with the confusion over additive and multiplicative interaction that can sometimes be found in the literature17, highlighting the need to understand the relationships to the risk factors that they imply. It should be noted that the recent simulation study9 does demonstrate the emergence of multiplicative odds ratios when there are false risk factors, if one is using a common study design that creates the same scenario as Model I. After all, this paper assumes X and Y are true risk factors, for models IV onwards, instead of simulating false risk factors9, causing a divergence in which type of results can be interpreted in the light of each paper.
There was a negative correlation among cases between the two risk factors for additive models. Theoretically, two individually sufficient factors (meaning that there is heterogeneity of causation) should have a strongly negative correlation, so a similar but attenuated pattern of correlation for non-sufficient risk factors is not surprising.
In this paper we also present the results when testing the multiplicative interaction between the strongest genetic risk for ACPA-positive RA and other risk-SNPs in the same material as our previous paper12, and show that it always follows the multiplicative null. In light of the new understanding that our simulations give, the presence of deviation from additivity, along with no deviation from multiplicativity, supports the existence of widespread synergism between the genetic risk factors in causing ACPA-positive RA.
The fact that most loci showed no statistically significant deviation from either additive or multiplicative interaction will be the unfortunate reality for many applications of interaction testing. While statistical power for single risk factor testing scales with the inverse square of the number of samples, already requiring large sample sizes in genome-wide association studies, the statistical power for interaction testing scales to the inverse power of four18, thus requiring far larger sample sizes than standard association testing.
A multiplicative assumption would have merit in our testing against HLA-DRB1 shared epitope and PTPN22, if ACPA-positive RA were a homogeneous set of causes, rather than the kind of heterogeneity of causation that we have shown gives rise to additivity between risk factors. Despite being defined based on a mediating risk factor, such homogeneity of ACPA-positive RA is not thought to be the case19.
There is a concept of testing for additive interaction to find multiplicative risk relationships. For example, KS Kendler studied gene-environment interactions and hypothesized that those at genetic risk of major depression should have a stronger effect of environmental factors and therefore the risk ought to be multiplicative. He then intentionally tested on a linear scale as that was the opposite hypothesis20. For the same reason we added testing for multiplicative interactions to find additive risk relationships, which would imply heterogeneity of etiology, and in between the two types of tests one might find multifactorial threshold relationships.
For both relative risk and odds ratios, the expected value for the double risk was always higher for multiplicativity. This is unsurprising, as the formula for the expected multiplicative odds ratio or relative risk can be rewritten as the additive expectation plus a product term: OR11 = OR10 + OR01 − 1 + (OR10 − 1)(OR01 − 1). For Model V (heterogeneity), both the algebra and the simulations could produce a less-than-additive effect wherever the prevalence (in the simulation) or penetrance (in the algebra) was not approaching zero. This indicates that small negative interaction terms on the additive scale can be caused by the trivial reason of non-infinitesimal prevalence and penetrance.
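The rewriting of the multiplicative expectation can be checked in one line of algebra. The shorthand a = OR10 − 1 and b = OR01 − 1 is introduced here for convenience and mirrors the a and b used for relative risks in the Results.

```latex
\begin{align*}
OR_{10} \cdot OR_{01}
  &= (1 + a)(1 + b), \qquad a = OR_{10} - 1,\quad b = OR_{01} - 1 \\
  &= 1 + a + b + ab \\
  &= \underbrace{OR_{10} + OR_{01} - 1}_{\text{additive expectation}}
     + (OR_{10} - 1)(OR_{01} - 1)
\end{align*}
```

The last term is positive whenever both factors raise risk (OR10 > 1 and OR01 > 1), which is why the multiplicative expectation exceeds the additive one in that setting.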
The algebra we employed was written as a deterministic model, for the convenience of simple equations. However, in the equation if ((X AND I) OR J) then comp1 case, I and J are there to make X neither necessary nor sufficient. This is the case for the typical risk factor, but the equation does not require that I and J are single risk factors; they could as well be compound risk factors. Nor does it require that they are risk factors at all. They can for example be representations of chance, or stochastic factors, meaning our algebra covers probabilistic thinking. Considering the rules of Boolean logic, where (NOT a) AND (NOT b) = NOT (a OR b), and (NOT a) OR (NOT b) = NOT (a AND b), we expect processes which produce synergism among risk factors to show causal heterogeneity among protective factors and vice versa.
Model IV can be viewed as the chain of events scenario, whereas Model V corresponds to phenocopying. In terms of Rothman’s sufficient-cause model, the risk factors X and Y in Model IV correspond to risk factors in the same cause, referred to as causal co-action, joint action or synergism, whereas X and Y in Model V correspond to risk factors in different causes21. The multifactorial threshold model has been thought of in terms of genetic liability22. To describe Models I-III in very simplistic terms: Model I is placing the risk factors together, creating a type of interaction, while Models II and III are putting two groups together to get additive risk.
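The Boolean identities invoked above can be verified exhaustively, and the same check shows how an AND rule over risk factors turns into an OR rule over protective factors. The function names are hypothetical, and a "protective factor" is taken here simply as the absence of the corresponding risk factor.

```python
from itertools import product


def de_morgan_holds() -> bool:
    """Check both De Morgan identities over every truth assignment."""
    for a, b in product((False, True), repeat=2):
        if ((not a) and (not b)) != (not (a or b)):
            return False
        if ((not a) or (not b)) != (not (a and b)):
            return False
    return True


def disease_by_synergism(x: bool, y: bool) -> bool:
    """Disease requires both risk factors (logical AND)."""
    return x and y


def disease_by_protection(px: bool, py: bool) -> bool:
    """Same rule in terms of protective factors: either one alone protects."""
    return not (px or py)


identities_ok = de_morgan_holds()
equivalent = all(
    disease_by_synergism(x, y) == disease_by_protection(not x, not y)
    for x, y in product((False, True), repeat=2)
)
```

Both `identities_ok` and `equivalent` evaluate to `True`: requiring both risk factors for disease is the same as saying that either protective factor alone prevents it.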
In this simulation study we demonstrate the causal interpretations of additive and multiplicative interaction in both the relative risk and odds ratio setting. Some of this has been understood intuitively in the past, especially the connection between multiplicative effect and logical AND23, but here we try to further show this to the reader through simulation and simple algebra. We hope that this will guide the interpretation of future interaction studies.
Methods
Simulations
For each of the different models (Models I to V: Figure 2 and Supplementary Figures 1 and 3; multifactorial threshold model: Figure 3; three risk factors model: Supplementary Figure 2), 1,000 simulations were performed. Each simulation consisted of 1 million data points (or simulated individuals) where the presence or absence of a risk allele was assigned, as well as the status of case or control according to each model. For the sake of simplicity, binary factors were used, corresponding to a dominant or recessive scenario in genetics. We used components with the same ratio of cases to controls, except for the three factors model, where components 2 and 3 had the same ratio and component 1 had the same ratio as the OR combination of those two.
The allele frequency for the two factors tested in each model, named X and Y, was established before each simulation and set from a random value in a given range. For instance, the lower frequency for the risk factor Y was a random value between 5% and 15%. The higher frequency of the factor Y was then the lower frequency multiplied by a random number between 1.1 and 4. Similarly, the lower frequency for the factor X was randomly designated between 5% and 25%. In turn, the higher frequency of the factor X was its lower frequency multiplied by a random number between 1.1 and 2.
For Model II, we spiked in both risk factors with their risk frequencies into both groups, but then shuffled X in one group (between both cases and controls in the group) and shuffled Y in the other group. As a result, the frequency corresponding to the numbers in italics in Figure 1b (X: freq. 0.35 and Y: freq. 0.25) will be higher_freq ⋅ prevalence + lower_freq ⋅ (1 − prevalence). For example, for a 50% prevalence and using the numbers for X in Figure 1b, 0.35 = 0.6 ⋅ 0.5 + 0.1 ⋅ (1 − 0.5).
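The frequency draw described above might look like the following. This is a sketch assuming uniform draws within the stated ranges (the text only says "a random value"); the function name and seed are hypothetical.

```python
import random


def draw_frequencies(rng, lower_range, fold_range):
    """Draw a lower frequency and derive the higher one by a random fold change.

    Uniform sampling within each range is an assumption of this sketch.
    """
    lower = rng.uniform(*lower_range)
    higher = lower * rng.uniform(*fold_range)
    return lower, higher


rng = random.Random(1)
# Factor Y: lower frequency 5-15%, fold increase 1.1-4.
y_lower, y_higher = draw_frequencies(rng, (0.05, 0.15), (1.1, 4.0))
# Factor X: lower frequency 5-25%, fold increase 1.1-2.
x_lower, x_higher = draw_frequencies(rng, (0.05, 0.25), (1.1, 2.0))
```

With these ranges the higher frequency can reach at most 60% for Y and 50% for X, so it always remains a valid frequency.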
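The effect of shuffling a spiked-in factor within one group can be illustrated with a toy group of 1,000 individuals. The group size and seed are hypothetical; the frequencies are the X values quoted from Figure 1b (0.6 among cases, 0.1 among controls, 50% prevalence).

```python
import random


def mixed_frequency(higher_freq, lower_freq, prevalence):
    """Overall frequency after shuffling a spiked-in factor across a group."""
    return higher_freq * prevalence + lower_freq * (1 - prevalence)


# Toy group: 500 cases followed by 500 controls.
n_cases, n_controls = 500, 500
x = [1] * 300 + [0] * 200 + [1] * 50 + [0] * 450  # 60% of cases, 10% of controls

rng = random.Random(0)
rng.shuffle(x)  # breaks the link between X and case status within the group

freq_cases = sum(x[:n_cases]) / n_cases
freq_controls = sum(x[n_cases:]) / n_controls
overall = sum(x) / len(x)
```

Shuffling leaves the overall frequency at 0.35, matching `mixed_frequency(0.6, 0.1, 0.5)`, while the frequencies among cases and controls in that group become approximately equal.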
Correlation between risk factors
The Pearson correlation coefficient was used to quantify the relationship between the two risk factors in the five different models (Supplementary Figure 1).
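For two dichotomous factors, the Pearson correlation reduces to the phi coefficient of the 2×2 table, which can be computed from counts as sketched below. The function name and the example counts among cases are hypothetical.

```python
import math


def phi_coefficient(n11, n10, n01, n00):
    """Pearson correlation between two binary factors from 2x2 counts.

    n11: both factors present, n10: only X, n01: only Y, n00: neither.
    """
    x_present, x_absent = n11 + n10, n01 + n00
    y_present, y_absent = n11 + n01, n10 + n00
    denominator = math.sqrt(x_present * x_absent * y_present * y_absent)
    return (n11 * n00 - n10 * n01) / denominator


# Hypothetical counts among cases where the two factors rarely co-occur.
phi_cases = phi_coefficient(n11=10, n10=40, n01=40, n00=10)
```

For these counts the coefficient is negative (−0.6), the direction reported above among cases for the additive models.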
Computational packages
We used the web calculator from Symbolab24 to solve algebra, specifically the RR11=RR10+RR01−1−a⋅b⋅m/(1−m) formula. Otherwise, calculations were done using Python, including the packages numpy, scipy, matplotlib and pandas.
Interactions in rheumatoid arthritis GWAS
The genotyped and imputed GWAS data from the EIRA study (see ref. 12 for the sources included) were used in this part of the study. Only data from ACPA-positive RA patients were included. The standard data filtering was performed as previously described12. Briefly, SNPs with a missing rate of 5% or higher, or with a p-value of less than 0.001 for Hardy-Weinberg equilibrium in controls, were excluded. The SNPs located in the extended major histocompatibility complex (MHC) region (chr6:27339429-34586722, GRCh37/hg19) were removed, due to the high linkage disequilibrium and possible independent signals of association with ACPA-positive RA in the locus.
The departure from additivity or multiplicativity for two risk factors was estimated in the imputed GWAS data (3,138,911 SNPs for the test with HLA-DRB1 shared epitope and 3,308,784 SNPs for the test with PTPN22 rs2476601 T), using GEISA25, where a dominant model was assumed. The first ten principal components and gender were used as covariables in this analysis, in order to control for population stratification and for differences in allele frequencies due to sex, respectively. A cut-off of a minimum of five individuals for each odds ratio (OR) combination was applied. The HLA-DRB1 shared epitope alleles included *01 (except *0103), *0404, *0405, *0408 and *1001. The p-values for interaction from these analyses are plotted in Figure 4a-b. We included only SNPs at risk allele frequencies between 10% and 50% in this testing to minimize the risk of including protective factors; however, when we tested all the SNPs at a minor allele frequency above 1%, the result led to the same conclusion.
To address both additive and multiplicative risk scales, and to evaluate the behavior of the ORs for double risk exposure (OR11; Figure 4c-d and Supplementary Figure 4), we used genotyped EIRA GWAS data (281,195 SNPs). For this analysis, the data were transposed using Plink 1.0726, then risk SNPs were selected based on an OR higher than 1.1 together with the criterion of having been reported as associated with RA in published case-control RA GWAS27–29.
Code availability
Code is available at https://github.com/danielramskold/additive_risk_heterogeneity_multiplicative_risk_synergism where we provide the code used to generate the figures. It also has code for the data-not-shown for the heterogeneity model at high prevalence as well as a program designed to be user-friendly (Model_I-V_sim.py) for calculating relative risk and odds ratio for each model with given allele frequencies and which includes the codominant scenario.
Author contributions
D.R. and B.B. conceived the study. L.M.D.G. and R.S. provided feedback on study design. D.R. performed the simulations. L.M.D.G. and D.R. performed the other analyses. H.W. helped with analysis settings. D.R. drafted the manuscript, and all authors critically revised the manuscript.
Acknowledgements
Anton Larsson helped during literature search. Lars Klareskog, Lars Alfredsson and Leonid Padyukov engaged in scientific discussions. This research was funded in part by Ulla och Gustaf af Uggla Foundation (2018-02670), Reumatikerförbundet (R-861801) and Konung Gustaf V:s 80-årsfond (FAI-2018-0518).