A Two­state Epistasis Model Better Explains Complex Traits and Increases Power to Detect Additive Risk Variants

1. CC-BY 4.0 International license peer-reviewed) is the author/funder. It is made available under a The copyright holder for this preprint (which was not. 2. CC-BY 4.0 International license peer-reviewed) is the author/funder. It is made available under a The copyright holder for this preprint (which was not. ABSTRACT Despite more than a decade of effort, the genetic underpinnings of many complex traits and diseases remain largely elusive. It is increasingly recognized that a purely additive model, upon which most genome­wide association studies (GWAS) rely, is insufficient. Although thousands of significant trait­associated loci have been identified, they often explain little of the inferred genetic variance. Several factors have been invoked to explain the 'missing heritability', including epistasis. Accounting for all possible epistatic interactions is computationally complex and requires very large samples. Here, we propose a simple two­state epistasis model, in which individuals show either high or low variant penetrance with respect to a certain trait. The use of this model increases the power to detect additive trait­associated loci. We show that this model is consistent with current GWAS results and better fits heritability observations based on twin studies. We suggest that accounting for variant penetrance will significantly increase our power to identify underlying additive loci. 3. CC-BY 4.0 International license peer-reviewed) is the author/funder. It is made available under a The copyright holder for this preprint (which was not .


INTRODUCTION
We have been remarkably successful in identifying the genetic causes of Mendelian diseases with pedigree and sequence analysis (Botstein andRisch 2003; Gilissen et al. 2011) .In contrast, the genetic underpinnings of multigenic heritable traits and diseases remain largely unknown.Many complex diseases such as autism, schizophrenia and cleft lip/palate have discrete, binary states (affected vs unaffected), yet these discrete states are determined by whether an individual passes some threshold in an underlying quantitative liability.This liability is both genetic and environmental in origin (Stefansson et al. 2014; Lynch andWalsh 1998) .
GWAS is the most commonly used approach to discover the multiple loci underlying complex traits, relying on the assumption that risk alleles are more common among affected individuals than unaffected individuals.As currently implemented, GWA also assumes that loci contribute independently and additively to the total genetic contribution of a complex trait.How successful this experimental design has been is a matter of debate.Since 2007, GWAS have identified thousands of novel loci with significant trait or disease association and implicated previously unconnected pathways in disease processes (Visscher et al. 2012) .Examples of clear biological and clinical relevance include the autophagy pathway in Crohn's disease (Rioux et al. 2007; Hampe et al. 2007; Prescott et al. 2007; Yamazaki et al. 2007) and the JAKSTAT signaling pathway in rheumatoid arthritis (Stahl et al. 2010; Diogo, Okada, andPlenge 2014) , among others.However, many traitassociated loci explain little of the inferred genetic variance.They are therefore of limited use for predicting the disease risk of specific individuals and for understanding the mechanistic underpinnings of complex traits.It is widely acknowledged that there is room for improvement (Wray, Goddard, andVisscher 2007; Wray et al. 2013) .
Many hypotheses have emerged to explain the 'missing heritability' (Eichler et al. 2010; Manolio et al. 2009; Gibson 2011) .Some complex diseases such as autism spectrum disorder represent likely insufficiently resolved pools of phenotypically similar, but inherently rarer, disease traits with different genetic underpinnings.If so, the observed odds ratios for significantly traitassociated SNP are low because the common, traitassociated SNP is linked to a rare causative mutation that only appears in a small subset of haplotypes (Mitchell 2012) .Finegrained phenotyping rather than relying on discrete, binary diagnoses should help to explore this hypothesis (Walter 2013) .Some have suggested that the heritability of complex traits is overestimated (Wilson 2008; Zuk et al. 2012) .Studies accounting for all SNPs genomewide simultaneously, as opposed to individually associating SNPs with traits, indicate that this explanation is unlikely for many traits (Yang et al. 2010; Zaitlen et al. 2013) .Others have invoked currently inaccessible genetic or structural variants or rare risk alleles of moderate effect as major factors in complex traits (McClellan and King 2010; Eichler et al. 2010; Manolio et al. 2009) .However, at least for autism, recent studies suggest that common variants account for over 50% of the genetic risk (Klei et al. 2012; Gaugler et al. 2014) .
Finally, although additive genetics is certainly a major fodder for evolution and selective breeding since even epistatic interactions present mainly as additive (James F Crow 2010; MäkiTanila and Hill 2014) , epistasis can have large influence on complex traits (Lachowiec et al. 2014) .A highly socially relevant example for the importance of epistasis comes from plant breeding.As breeders increased seed yield, presumably via additive genetic factors, seeds became far more numerous, larger, and heavier.The increasing pressure on plant stalks required new mutations that enabled plants to remain erect under the increased seed weight this epistatic interaction enabled the Green Revolution (Hedden 2003) that vastly increased food security in many poor parts of the world (Ramanathan 2009) .
In humans, twin concordance rates indicate that at least some of the genetic variation influencing complex diseases is nonadditive.Experimental evidence demonstrates that epistasis, i.e. the phenotypically relevant, and often nonreciprocal interaction of nonallelic genes, is pervasive in complex traits in various model organisms (Lehner et al. 2006; Queitsch, Carlson, and Girirajan 2012; Mackay 2014) .
There is no reason to assume that the genetic architecture of complex traits differs between humans and other highly complex eukaryotes.Therefore, the inclusion of epistatic effects in statistical models has been increasingly suggested and even attempted in some studies (Lachowiec et al. 2014; Hemani et al. 2014a; Hemani et al. 2011) .Although models that allow for all gene × gene (and gene × gene × gene, etc) interactions will be more realistic, such a higherorder models require much larger datasets and faster algorithms (Lippert et al. 2013; Hemani, Knott, and Haley 2013; Hemani et al. 2014a; Wood et al. 2014; Li, Horstman, and Chen 2011; Hemani et al. 2014b) .
Studies in model organisms suggest that a simpler, twostate epistasis may apply to complex traits and diseases (Gibson 2009; Mackay 2014) .Certain genes can act as strong genetic modifiers for many others loci resulting in highly significant 'gene x genome' interactions as opposed to the familiar, often smalleffect, gene x gene interactions.Known examples include the chaperone Hsp90 in plants, flies, worms, fish, and yeast (Rutherford and Lindquist 1998; Jarosz and Lindquist 2010; Queitsch, Sangster, and Lindquist 2002; Casanueva, Burga, and Lehner 2012; Rohner et al. 2013) and several chromatin remodeling proteins in worms, among others (Lehner et al. 2006; Levy andSiegal 2008) .In addition to increasing the penetrance of many genetic variants in genetically diverse populations, perturbation of strong genetic modifiers can also vastly increase phenotypic variation in isogenic populations (Richardson et al. 2013) .Notably, the activity of strong genetic modifiers can be modulated by environmental stress (Rutherford 2003) .Based on studies in plants, worms, and yeast, the number of strong genetic modifiers is small, possibly ~10% of all genes (Lehner et al. 2006; Levy and Siegal 2008; Fu et al. 2009) .
There is some empirical support for strong genetic modifiers playing a role in complex diseases.For example, exome sequencing of autism parentchild trios finds diseaseassociated de novo mutations in CHD8, a chromatin remodeling factor involved in betacatenin and p53 signaling (O'Roak, Vives, Girirajan, et al. 2012) .
Other diseaseassociated de novo mutations also implicate chromatin remodeling and betacatenin signaling, which are both general developmental pathways, rather than specific neurodevelopmental pathways (O'Roak, Vives, Fu, et al. 2012; Krumm et al. 2014) .Chromatin remodeling is also implicated in schizophrenia, a distinct neurodevelopmental disease (Koga et al. 2009; McCarthy et al. 2014) .It seems unlikely that perturbation of a general pathway like chromatin remodeling leads to two distinct diseases; rather, these perturbations may increase penetrance of diseasespecific genetic variants.Similarly, network analysis of asthma GWAS hits and interactome data (genomics, gene expression, drug response) found three novel pathways that contribute to an 'asthma disease module': the AP1 family of transcription factors, Trka Receptor Signaling, and the GAB1 signalosome (Sharma et al. 2015) .All three are general signaling pathways of growth and development, involving canonical MAP kinase components; their specific relevance to asthma is unclear.In short, complex diseases may arise through perturbed strong genetic modifiers which epistatically enhance the penetrance of many other variants genomewide.Perturbations may occur either through mutation (as above) or environmental stress (Thomas 2010; Tsuang et al. 2004; Gibson 2009) .Assuming a threshold model for complex diseases, this strong interaction would cause a large shift of variant effect across the disease threshold.
Although the existence of differences in variant penetrance among individuals is not a new concept, its effect on GWAS in humans has not been investigated.Here, we present a simple twostate epistasis model, in which disease status of an individual depends on a combination of additive alleles (as before) as well as their penetrance in a given individual.Such a model might be called an A×P, or Additive × Penetrance model.A population consisting of all individuals with increased penetrance of many different genetic variants will have higher phenotypic variation, i.e. it will be less phenotypically robust, than an otherwise equivalent population consisting of all robust individuals with low penetrance (Queitsch, Carlson, and Girirajan 2012) .Theoretical population genetics provides a strong argument for the potential benefits of maintaining a population with a balance of robust and nonrobust individuals (Le Cunff and Pakdaman 2012; Hermisson and Wagner 2004; HuertaSanchez and Durrett 2007; Wagner 2008; Ferrada and Wagner 2008; Draghi et al. 2010; Wagner 2012) .As we show, this model increases the power to detect additive traitassociated loci and better fits heritability observations based on twin studies.In the absence of robustness measures in humans, one may argue that such model is of limited use to improve GWAS in humans.However, we argue that our model's success should inform our approach to finding diseasecausing loci and we discuss strategies how to apply it to existing and future GWAS data.

Heritability estimates imply substantial nonadditivity of genetic factors contributing to human disorders:
Geneticists refer to two types of heritability: additive ( narrowsense or h 2 ) and total ( broadsense or H 2 ).Since broadsense heritability ( H 2 ) is the fraction of phenotypic variance explained by all genetic factors, and narrowsense heritability ( h 2 ) is the fraction of phenotypic variance explained by additive genetic factors, it is tempting to deduce nonadditive genetic effects (epistatic effects) by simply subtracting these two values.However, despite numerous references to heritability in the literature, there remains some confusion on the methods of calculating H 2 in humans ( see Appendix I ) (Zuk et al. 2012; Hemani, Knott, andHaley 2013) .
The additive heritability of quantitative traits is simply a function of the slope of the line of regression between pairs of related individuals ( Fig. 1A ) (Lynch and Walsh 1998; Visscher, Hill, andWray 2008) .However, complications arise for traits that are observably discrete, but rely on thresholding of an underlying, unobserved quantitative liability scale.Such traits are called threshold characters (Wright 1934; Rendel 1962; Lynch and Walsh 1998) .Using the same method to measure additive heritability as before -doubling the slope of the line of regression between parentoffspring pairs -results in a substantially smaller estimate of the additive heritability ( Fig. 1A, inset ).
The degree to which this observable discrete additive heritability ( h 2 o ) is decreased relative to the heritability of the underlying additive liability ( h 2 ) depends on where the diagnostic threshold is drawn -in other words, the fraction of individuals that are affected ( φ ).Henceforth, we use h 2 bin to refer to the observable heritability of threshold traits (binary, affected/unaffected).
In 1950, Dempster and Lerner (Dempster and Lerner 1950) derived a formula relating h 2 to h 2 bin using a simplifying assumption that the quantitative trait (liability) is normally distributed in the population (see Appendix II ).Here, we refer to this DempsterLerner derivation as T ( h 2 bin ) as the maximum heritability of the binary character attainable if the underlying liability consisted purely of additive genetic factors ( h 2 =1) ( Fig. 1B ).
Expected values for h 2 bin can also be determined via simulation, in which the additive liability is binomially distributed ( N is twice the number of additive loci; P is the frequency of the risk allele at each locus).We refer to the maximum binary heritability attained via simulation of a purely additive model with h 2 =1 as S(h 2 bin ) .
In either case, binary heritability h 2 bin reaches a maximum (with respect to h 2 ) when half of the population is affected and drops off rapidly as the fraction of affected individuals approaches values typical for common discrete complex traits, such as autism, schizophrenia, and multiple sclerosis ( Fig. 1B ).However, empirically determined h 2 bin values for these traits -O ( h 2 bin ) -are much higher ( Table 1 ), calling a purely additive model into question.
To explore the implied nonadditive factors, we developed a simulator that designates a subset of the population as nonrobust by incorporating a robustness perturbation factor that increases the effect size of all additive risk alleles, i.e.
increases variant penetrance.Using this simulator, a purely robust population maintains the theoretical relationship between the fraction of affected individuals ( φ ) and h 2 bin [here called S ( h 2 bin )] ( Fig. 1B , black curve).The inclusion of nonrobust individuals with increased variant penetrance results in an increase of h 2 bin [here called S r ( h 2 bin )] even when the binary trait is at very low frequency ( Fig. 1B , gold curve).Below, we explore the effect of varying this robustness factor, as well as the frequency of nonrobust individuals in the population and the contribution of nongenetic factors ( H 2 <1 ).
Hill and colleagues (Hill, Goddard, and Visscher 2008) proposed a simpler function to determine the extent of nonadditive genetic effects: the correlation of monozygotic twins minus twice the correlation of dizygotic twins ( r MZ −2 r DZ ).They use this statistic to evaluate both quantitative traits, such as height, and threshold traits, such as endometriosis." r MZ >r DZ implies that part of the resemblance is due to genetic factors and r MZ > 2 r DZ implies the importance of nonadditive genetic effects".This metric makes intuitive sense, because monozygotic twins share 100% their alleles, and dizygotic twins share 50% of their alleles.While most of the complex traits that Hill and colleagues examine in their paper had r MZ −2 r DZ values close to or less than zero, for many important complex diseases, r MZ −2 r DZ > 0 ( Table 1 ).
Therefore, rather than relying on traditional heritability metrics, we quantify the nonadditive genetic component of complex traits and diseases by comparing  1, Fig. 1C ) .
Robustness as a twostate model of epistasis: Rather than expressing liability as a deviation from the mean, here, we define a quantitative phenotype (y) as a strictly positive value that is a linear combination of genetic factors and noise (environmental factors): where the α i 's are the weights of each of the contributing genetic components, n is the number of contributing genetic components, ε is the residual effect, which is assumed to be normally distributed with mean zero and standard deviation σ ε , and g i is an indicator variable, taking values 0 or 1 depending on whether the risk allele is present at the i th locus.( Fig. 2A ) The biallelic states can be thought of as intact (wild type) or broken (risk) versions of the gene.With exome sequencing, such data are now readily available (O'Roak et al. 2011) .For this and all further models (including populations consisting of a mixture of subpopulations), we determine binary trait status of each individual using a threshold drawn on the empirical population of y , such that the fraction of affected individuals in the population is consistent with that of a given trait of interest ( φ ).
Note that if we assume that all risk alleles contribute equally to disease risk, (all α i 's are equal), the formula simplifies further to: where n is the number of riskassociated alleles.
A nonrobust subpopulation could be defined in several ways.We explored four common manifestations of largeeffect factors, represented by coefficients c L , c ε , c α , and c n that modify the base additive formula ( Fig. 2 ), and evaluated their potential for explaining observed information.Note that at this point, we make no assumptions about whether robustness status is genetic or environmental in origin.
The first way in which the simple additive model could be modified is by inclusion of a large factor that simply adds to the phenotypic liability independently of other additive factors ( Fig. 2B ), where c L >>α .In this case the factor ( c L ) is not epistatic to the other additive factors ( g i ), it is yet another, larger, additive factor.
These types of factors are relatively simple to detect with linkage analysis (if genetic) or epidemiological studies (if environmental).Examples include a mutation in the fibroblast growth factor receptor 3, which causes a decrease in height (achondroplasia) (Shiang et al. 1994; Rousseau et al. 1994) , and William's syndrome, which causes, among other things, a decrease in IQ (Mervis and Becerra 2007) .In both these cases, the standard deviation of the trait (height and IQ, respectively) stays the same as that for typically developing populations.A population with a novel factor of large effect, as described above, would have a different phenotypic coefficient of variation, a common metric for robustness (Lempe et al. 2005) , than a population without that factor.However the difference is driven exclusively by a change in mean phenotype, and the novel factor would not be considered as having a robustness effect.We therefore exclude this model from further consideration.
A second modification to the simple additive model, and one that would result in an increase in phenotypic variance, is one in which the degree of stochastic noise (ε) in determining the phenotype is increased, ( Fig. 2C , where for robust individuals, c ε =1 and for nonrobust individuals, c ε >1 ).Such a change would affect the variance of the phenotypic value in the population without changing the mean value, and therefore such a largeeffect factor would be considered a robustness factor.
However, because c ε does not depend on the additive factors ( g i ), there is no epistatic effect.Note that this is the same as decreasing the importance of genetics in determining the phenotype ( i.e. decreasing overall heritability).An example of a single gene perturbation affecting trait robustness is the elf4 mutation that greatly increases variance of the plant circadian period but maintains mean value (Doyle et al. 2002) .
Another example would be the increased variation in normalized eye and orbit size caused by perturbation of Hsp90 in surfacedwelling Astyanax mexicanus fish, which did not change trait means (Rohner et al. 2013) .As this type of epistasis decreases the importance of genetics, is unlikely to underlie complex diseases with high observed heritability and monozygotic twin concordance, and we therefore exclude it from further consideration.
The final two types of singlefactor epistatic effects are (i) those that increase the effect size ( α ) of alreadypresent additive alleles ( Fig. 2D ; c 1998) .Second, loss of robustness can increase effect size ( i.e. penetrance) of individual mutations (Casanueva, Burga, andLehner 2012; Burga, Casanueva, andLehner 2011) .Third, methods based on comparing parent phenotypic variation to sibsib phenotypic variation have been developed to estimate the number of additive loci affecting a trait (Slatkin 2013; Penrose 1969) .Under a cryptic genetic variation scenario, an allrobust population would appear to have more additive loci affecting a trait than an allnonrobust population, whereas under an increased effect size scenario, they would be the same.
Therefore, we implement these last two models, D&E, in our simulator ( Fig. 3 ).
Our aim was to measure the effect of changing certain parameter values, such as n , φ , c α and c n on expected observable values, such as λ MZ , λ DZ , and the odds ratio ( OR ) of a locus found in a typical GWAS, while holding noise (ε) constant.Using a set of input parameters, we first simulate a set of families each containing two parents and a primary child (trio).We then simulate monozygotic and dizygotic twins of that primary child, and assign quantitative phenotypes ( y ) to each child.To do so, we sample the number of risk alleles per child, given the number of relevant loci and frequency of risk alleles at each locus.Third, children are designated to be robust or nonrobust, based on the frequency of perturbed 'robustness' alleles, p r .If a child is nonrobust, its additive risk is multiplied by c α or c n and the noise factor ε is added ( y ).Fourth, using the liability ( y ) of primary children, we designate the top φ fraction ( e.g.1%) as affected and thereby define a risk threshold for affected and unaffected individuals.
This threshold is used to determine affected/unaffected status of twins.Fifth, we calculate the monozygotic and dizygotic concordance rates ( λ MZ and λ DZ , respectively), and perform in silico GWAS on affected vs unaffected primary children (not including simulated twins) to determine the odds ratio of individual additive risk loci ( OR add ).
Our first result confirms intuition: in a mixed population, consisting of both robust ( c α =1; c n =1 ) and nonrobust ( c α >1; c n >1 ) individuals under either model D or E, individuals crossing the clinical threshold will be disproportionately from the nonrobust subgroup.As c α or c n increase, total liability ( y ) increases.This is true for both the nonrobust subpopulation and the entire population, which is why we use a percentile ( φ ) rather than an absolute threshold to determine affected/unaffected status.
Secondly, we find that a population in which just 1% of individuals are nonrobust can easily produce the range of empirically determined heritabilities of binary threshold traits [ Fig. 1B inset , S r (h 2 bin ) ] and reported twin concordances and broadsense heritabilities of several complex diseases ( Fig. 1C, Table 1 ).This is due to the fact that as c α or c n increase and ε remains constant, the fraction of liability that is determined by genetics ( H 2 ) increases for the nonrobust subpopulation and, by extension, for the entire population.
Controlling for robustness status adds power to GWAS: Regardless whether robustness is modeled as hiding cryptic variation or reducing penetrance of variants, if the goal is to find additive riskloci ( i.e .which may be good therapeutic targets), it is best to use only robust individuals for both affected and unaffected groups ( Fig. 4B ).Of course, controlling for a hidden robustness state, which modifies the effect of many alleles, returns the experiment to the situation for which GWAS was designed a purely additive model.It is less intuitive why using only robust (rather than all nonrobust) individuals improves our ability to detect additive risk alleles in GWAS, which is a result we explore below.
In the case of cryptic variation, robust individuals have fewer available additive loci than nonrobust individuals, so there are fewer ways to cross the threshold number of riskalleles; the risk alleles are concentrated at the noncryptic loci.
Alternatively, affected robust individuals may carry additive risk alleles of larger effect size, i.e. a different type of risk alleles than nonrobust affected individuals.Here we assume that nonrobust and robust individuals carry the same additive alleles.
In the case of increased penetrance, the explanation is similar to that given for why common risk alleles are more easily found than rare alleles that increase risk by the same amount (Gibson 2011) : effect size is a function not only of the case:control ratio of allele frequency, but also of the magnitude of the frequency of the alleles in affected and unaffected individuals ( Fig. 4A ).If all affected and unaffected individuals are robust (as compared to nonrobust), more risk alleles are required to cross the threshold (for affected individuals) and more risk alleles are allowed for individuals not crossing the threshold (for unaffected individuals).Therefore, while the ratio of allele frequencies between affected and unaffected individuals is the same for both all robust and all nonrobust groups, the magnitude of the allele frequencies is different, making those loci easier to find in robust populations.
Robustness status may be geneticallydetermined, but difficult to pinpoint.We state earlier that robustness status need not be genetically determined; however, we wish to explore reasonable scenarios in which it is.
Were robustness status encoded by a single genetic factor, this factor would be readily discernible, either by GWAS or linkage analysis because, under our model, affected individuals in mixed populations are disproportionately nonrobust.However, robustness status is highly unlikely to be encoded by a single gene.Model organism studies suggest that a significant fraction of total number of genes -possibly up to 10% -can affect robustness (Lehner et al. 2006; Levy and Siegal 2008; Fu et al. 2009) .
To model the scenario of multiple 'robustness' genes, we use a simple yet plausible houseofcards model of robustness, in which all n r 'robustness' loci must be functional for an individual to be robust.We observe that the odds ratio for any one 'robustness' locus ( OR r ) decreases to within the range often seen in GWAS as n r >50 ( Fig. 4C ).In other words, since there many possible 'robustness' loci, specific ones will be hard to find in GWAS.
With this result in mind, it may appear that models

DISCUSSION
We conclude that (i) a model that includes a houseofcards robustness state explains the observed data for several complex diseases better than one without, (ii) when looking for loci with additive risk alleles it is best to use all robust individuals for both cases and controls, and (iii) 'robustness' loci are unlikely to be identified in GWAS.
In any given family, a faulty 'robustness' allele passed on from parent to child will act as a largeeffect risk allele with Mendelian inheritance.This explains both high concordance among relatives and missing heritability in GWAS.
An obvious challenge is how to identify robust and nonrobust humans; this challenge is met in several ways in model organisms (Queitsch, Carlson, andGirirajan 2012; Lempe et al. 2005) .In humans, one may identify 'robustness' genes by comparing individuals with high comorbidity of complex diseases to those with none; however, to our knowledge, this has yet to be done.Another currently available approach would be to pool individuals that are affected by distinct complex diseases and compare these to their pooled unaffected controls.As we expect nonrobust individuals to be overrepresented among affected individuals for any complex disease, this GWAS approach may identify 'robustness' loci because the frequency of perturbed 'robustness' loci is increased.Indeed, this approach has shown promise for neurological disorders ("Identification of Risk Loci with Shared Effects on Five Major Psychiatric Disorders: A GenomeWide Analysis" 2013) .Similarly, the distinct diseases autism spectrum disorder and schizophrenia converge on chromatin remodeling as a general pathway that is disrupted in affected individuals (O'Roak, Vives, Girirajan, et al. 2012; O'Roak, Vives, Fu, et al. 2012; Krumm et al. 2014; Koga et al. 2009; McCarthy et al. 2014) .We suggest that perturbation of chromatin remodeling leads to increased penetrance of different additive risk alleles ( i.e. at different loci), hence resulting in distinct diseases.In fact, perturbation of chromatin remodeling genes increases the penetrance of many genetic variants in worms (Lehner et al. 2006) .
The cited autism studies also point to a promising path towards identifying 'robustness' loci: wholegenome sequencing of parentchild trios (unaffected parents, affected child) (O'Roak et al. 2011; O'Roak, Vives, Girirajan, et al. 2012) .This approach will capture the hypothesized largeeffect risk alleles, which are either inherited in families or arise de novo in affected children.Notably, exome sequencing of autism parentchild trios finds significantly associated de novo mutations in CHD8, a chromatin remodeling factor with roles in betacatenin and p53 signaling (O'Roak, Vives, Fu, et al. 2012) .Other de novo mutations in affected children also implicate chromatin remodeling and betacatenin signaling, both general developmental pathways, rather than specific neurodevelopmental pathways (O'Roak, Vives, Fu, et al. 2012; Krumm et al. 2014) .
A third approach possible with current data would be to assume that the GWAS loci associated with the highest disease risk represent 'robustness' genesparticularly if there is no obvious association between gene function and a specific disease.One would then proceed by controlling for these additive risk allele.As a community, we have been reluctant to stray from additive models because of the rapidly rising complexity involved in accounting for interactions between genes.Also, multiplicative models assuming many interactions will yield lower concordance rate between firstdegree relatives than is typically observed.This approach contrasts with ours as we assume 'gene x genome' epistasis ( i.e. strong genetic modifiers epistatically interact with many other loci).Note that statistical additivity is often an emergent property of underlying epistatic interactions (Song, Wang, and Slatkin 2010; Mackay 2014; MäkiTanila and Hill 2014) .Nevertheless, from the viewpoint of developing predictive diagnostics and effective therapeutics, it is important to identify the mechanistic underpinnings of complex diseases.If 'gene x genome' epistasis contributes to complex disease as we suggest, we need to think of ways to identify 'robustness' genes or markers for robustness in humans.
One plausible proxy for robustness could be the level of genomewide heterozygosity.For many traits in many organisms F 1 hybrids show less phenotypic variance and indeed hybrid vigor compared to their parental inbred lines P A and P B (Robertson and Reeve 1952; Festing 1976; Becker and Léon 1988; Phelan and Austad 1994; Klempt et al. 2006) .Plant breeders have been using this simple principle for almost a century (J F Crow 1998) .Hybrids show decreased phenotypic variance and increased vigor despite the fact that all three populations (F 1 , P A and P B ) are isogenic and therefore all phenotypic variation within each population should be environmental in origin.Because phenotypic variation due to environment is apparently reduced by heterozygosity, the fraction of phenotypic variation due to genetics should be greater in increasingly heterozygous populations.Hence, a good proxy for robustness status may be a genomewide heterozygosity score (Campbell et al. 2007; Govindaraju et al. 2009; Lie, Simmons, and Rhodes 2009; Gage et al. 2006) , which is often readily calculated from the same type of data collected for GWAS.
The relationship between heterozygosity and disease is reflected in the sex bias of neurodevelopmental disorders (Jacquemont et al. 2014) , with many more males than females affected (~30% to 50% excess, autistic male/female ratio of 4:1).Autistic females tend to carry a higher load of copy number variants (CNV) and single nucleotide variants (SNV) than affected males, arguing that a greater genetic liability is required for females to cross the disease threshold.A likely explanation for this "female protection" is the second X chromosome, which 'buffers' otherwise deleterious mutations (Jacquemont et al. 2014) .Moreover, autistic females are often much more severely affected; many fewer girls (one in seven) than boys have highly functioning ASD (autism spectrum disorder).Taken together, increased ASD severity, decreased occurrence, and higher mutational burden in females strongly implicate nonadditive genetic effects in ASD.
Ideally, we would have readily accessible and affordable DNA, RNA, protein or cellbased markers to determine robustness levels across many individuals and apply robustness as a covariate in GWAS.Without extensive studies in model organisms, which offer orthogonal measures to determine robustness (Lempe et al. 2005) , the identification of such markers seems not easily feasible.Some have suggested that levels of somatic genetic variation may be a readout of robustness and mutation penetrance with higher levels of somatic variation indicating lower robustness (Queitsch, Carlson, and Girirajan 2012; Press, Carlson, and Queitsch 2014; Heng 2010) .Singlecell sequencing and phenotyping may offer insights into an individual's robustness with greater celltocell variation indicating lower robustness and higher mutation penetrance (Steininger et al. 2014; Altschuler andWu 2010) .Simpler alternatives may rely on identifying individuals that are outliers in expression or methylation.Similar to using genotype data to elucidate population structure (Pritchard, Stephens, and Donnelly 2000) , expression (and methylation) data could be used to find subpopulations with different patterns of covariance between gene expression (methylation) levels (Satoh et al. 2006) .Admittedly, none of these suggestions are exactly affordable; yet, compared to the vast resources committed to GWAS, exploring these potential markers in model organism studies seems worthwhile.Notably, as robustness status can be environmentallydetermined, we believe that such markers would detect environmentallydependent robustness defects.Although a single environmental riskfactor should be readily identified through epidemiological studies, there could be a time lag between the environmental insult and the disease or a houseofcards mechanism for multiple environmental insults that make them difficult to pinpoint.Even without knowing the etiology of robustness status, we need to separate individuals into robust and nonrobust categories.
Are there indeed robust and nonrobust individuals as we argue?Population genetics theory suggests that it is evolutionarily advantageous to maintain a balance of Medical professionals intuitively agree.Drawing on their experience, they evaluate the full "Gestalt" of a patient and predict a patient's risk for a negative outcome with often great precision.Applying 'robustness' markers may turn a physician's intuition into diagnostics.Because of the potential for increasing predictive power and identifying effective drug targets, the possibility that robustness is a major player in the etiology of complex disease should be carefully considered.

APPENDIX
(I) Broadsense, or Total Heritability: The definition of total, or broadsense heritability is straightforward: where is the phenotypic variance due to genetic factors and is the phenotypic σ 2 g σ 2 ϵ variance due to environmental factors.In model organisms, such as yeast, these values can be measured directly using replicates (Bloom et al. 2013) .In humans, however, we must rely on an often slim set of naturallyoccurring replicatesmonozygotic twins.
Multiple different methods of calculating total heritability are still used, most of which are simple functions of monozygotic and dizygotic similarity, but the most common was proposed by Holzinger in 1929(Holzinger 1929) : where r MZ and r DZ stand for the correlation between the phenotypes of pairs of monozygotic and dizygotic twins, respectively.
Adding to the complexity of these calculations is the fact that traits can be either continuous (quantitative) or discrete (affected or unaffected).In the case of Holzinger's formula for total heritability ( H 2 ), the same formula can be used.The following is a useful simplification of the original formula for the case of discrete traits, where unaffected and affected individuals assume values of 0 and 1 respectively: where λ MZ and λ DZ are concordance rates between monozygotic and dizygotic twins, respectively.
Despite known problems surrounding the calculation of broadsense heritability, specifically oversimplifying assumptions of (i) nonlinkage between risk loci, and (ii) similarity of environment for monozygotic and dizygotic twins (Jacquard 1983; Schönemann 1997; Lynch and Walsh 1998) , the above formula for H 2 remains a common method for calculating the "heritability" reported for many complex diseases.To sidestep confusion surrounding heritability calculations, we use simple monozygotic and dizygotic twin concordance as an output measurement.
Other methods of calculating include Falconer's formula, 2(r MZ −r DZ ) and Nichols's formula 2(r MZ − r DZ )/r MZ , however these methods clearly fail in an essential criteria of the H 2 statistic: that H 2 must have a value between zero and one, as it is, after all, a fraction -specifically, the fraction of phenotypic variance due to genetic effects.For example, for Autism Spectrum Disorders, the concordance of monozygotic twins is 0 .95 and the concordance of dizygotic twins is 0 .04, and an overall population prevalence of 0 .5 implies r MZ =0 .97 and r DZ =0 .20, which would make H 2 1 .55 and 1 .59 for Falconer's and Nichols's formulas, respectively.It is safe to say that calculating the broadsense heritability in humans is an intractable problem.
(II) Narrowsense, or Additive Heritability of Threshold Traits: The relationship between the observable heritability of a discrete threshold trait ( h 2 bin ) and the additive heritability of the underlying, often hidden quantitative trait ( h 2 ) was derived as follows by Dempster and Lerner in 1950(Dempster and Lerner 1950; Lynch and Walsh 1998) , under the assumption that the underlying additive liability is normally distributed: where p(x p ) is the height of the standardized normal distribution at the diagnostic threshold (above which, an individual would be "affected") and φ p is the fraction of affected individuals in the population.h 2 bin reaches a maximum value of 2 /π ≈ 0 .637 when all of the underlying liability is genetic and additive ( h 2 =1 ) and half of the population is affected ( φ=0.5 ).When φ=0.01 , a typical value for more common discrete complex traits such as autism and schizophrenia, observable heritability ( h 2 bin ) based purely on underlying additive genetic liability is expected to be 0 .072.
The maximum value for h 2 bin decreases as the fraction of the population affected φ decreases.
In our simulations, genetic liability is described by a binomial distribution, with the number of trials being twice the number of risk loci, or 2n (for a diploid individual) and the probability of success being the frequency of a risk allele at any given locus, or p .Using this distribution, we get values of h 2 bin that are slightly higher than those predicted by Dempster and Lerner based on a normal distribution for all values of φ other than a small window around φ=0.5 (roughly 0.4 < φ <0.6) ( Fig. 1B , Table 1 ).and the maximum possible observable binary heritability ( h 2 bin ) when the underlying liability is entirely determined by additive genetic factors ( h 2 = 1) as predicted by theory (Dempster and Lerner 1950) , T(h 2 bin ) , as determined by simulation, S(h 2 bin ) , or as determined by simulation of a population that includes two types of individuals, robust and nonrobust, at a ratio of 99:1, S r ( h 2 bin ), where H 2 = 1.In the inset, we display the relevant φ range where 1% or less of the population is affected, which is common for many complex diseases.For the diseases shown, empirically determined heritability O(h 2 bin ) is up to twice as high as predicted under the models that do not include an epistatic robustness factor.( C ) In a population containing just 1% nonrobust individuals, increasing the robustness perturbation factor ( c α ), i.e. the foldchange of the additive genetic liability, increases both heritability and twin concordance to levels observed in several complex diseases.λ MZ = monozygotic twin concordance; λ DZ = dizygotic twin concordance; φ = frequency in population; T ( h 2 bin ) = maximum h 2 bin attainable when h 2 = 1 and assuming that the liability of the trait follows a normal distribution (Dempster and Lerner 1950) ; S ( h 2 bin ) = h 2 bin attained when h 2 = 1 and simulating the liability of the trait influenced by 100 additive risk loci as a binomial distribution, where N = 2×100 and p = 0 .5; O ( h 2 bin ) = observed heritability of a binary threshold trait; r MZ − r DZ is the statistic used by Hill and colleagues (Hill, Goddard, and Visscher 2008) [ Note: r ( correlation ) is related but not identical to λ ( concordance ).]; H 2 is Holzinger's broadsense heritability (Holzinger 1929) .
α ), and (ii) those that reveal cryptic riskalleles, increasing the total number of loci ( n ) influencing the trait ( Fig.2E , c n).In each scenario, populations consisting exclusively of either robustness type (robust individuals: c α =1 and c n =1 ; nonrobust individuals: c α >1 or c n >1 ) would differ in both the mean and the variance of the phenotype, and thus would satisfy the classical definition of robustness(Queitsch, Carlson, and Girirajan 2012) .Because the robustness factor interacts with the additive factors ( g i ), c α and c n would also be considered epistatic effects.We call these coefficients c α and c n robustness perturbation factors .Although the simplified versions of the equations for the two scenarios ( Fig.2D, E), wherein all weights are equal, are equivalent: y=(c α α)n+ε and y=α(c α n)+ε , we can distinguish them experimentally.First, trait mapping in nonrobust individuals will reveal additional loci, compared to a comparable study in robust individuals, if loss of robustness exposes cryptic loci(Sangster, Salathia, Lee, et al. 2008; Sangster,   Salathia, Undurraga, et al. 2008; Jarosz and Lindquist 2010; Rutherford and Lindquist D and E are similar to the model that includes an additional additive factor of large effect (model B) in that affected individuals are disproportionately either nonrobust (models D&E) or carriers of a largeeffect alleles (model B) -particularly since multiple factors of large additive effect that result in indistinguishable phenotypes could conceivably work in a houseofcards model.However, we argue that models D and E explain the marginal penetrance of some seemingly large effect factors and concomitant lack of Mendelian inheritance in pedigrees (Rosenfeld et al. 2013) as well as the presence of the multitude of other lesser riskassociated loci much more readily than model B.
robust and nonrobust individuals within a population, mainly due to the fact that nonrobust individuals supply a broader phenotypic range on which selection can act under extreme circumstances (Le Cunff and Pakdaman 2012;Hermisson and Wagner   2004; HuertaSanchez and Durrett 2007; Wagner 2008; Ferrada and Wagner 2008;   Draghi et al. 2010; Wagner 2012)  .

Figure 1 :
Figure 1: Comparison of narrowsense heritability of quantitative liability, observable heritability of discrete traits, and empirically determined heritability of certain complex threshold traits.( A ) Each symbol in the large scatter plot represents the genetic liabilities for one parentchild pair.Each symbol in the inset scatter plot represents the disease state for one parentchild pair (disease state must be 0=unaffected or 1=affected; points were jittered for visual clarity).Green pluses indicate cases in which the child is affected ( i.e. has crossed the diagnostic threshold) and the parent is not affected; blue crosses indicate cases in which the parent is affected and the child is not; red filled circles indicate cases in which both are affected; black open circles indicate cases in which neither is affected.Input parameter values for the simulation were as follows: number of additive riskloci ( n ) = 100; frequency of additive riskallele ( p ) = 0 .1; frequency of nonrobust state ( p r ) = 0. ( B ) The relationship between frequency of affected individuals in the population ( φ )

Figure 4 :
Figure 4: Effect size of an additive risk locus is determined by both the total