Sex differences in disease genetics

There is long-standing evidence for gene-by-sex interactions in disease risk, which can now be tested in genome-wide association studies with participant numbers in the hundreds of thousands. Contemporary methods start with a separate test for each sex, but simulations suggest that a more powerful approach is to use sex as an interaction term in a single test. The traits with the most compelling current evidence for sex-dependent genetic effects are adiposity (predictive of cardiac disease), type II diabetes, asthma and inflammatory bowel disease. Sexually dimorphic gene expression varies dynamically by age, tissue type, and chromosome, so sex-dependent genetic effects are expected for a wide range of diseases.

Key concepts

Compelling findings of sex-dependent genetic effects on disease have been made in adiposity-related anthropometric traits, type II diabetes, and inflammatory bowel disease. Other disorders remain to be more fully investigated, regardless of what sex differences they exhibit in prevalence and presentation.

Current evidence indicates that a sex difference in gene expression is not required for a SNP to have a sex-dependent effect. However, sex differences in gene expression vary dynamically by organ and age, so generalisations may be inaccurate without comprehensive data.

Sex-dependent risk alleles are predicted to be of greater effect size than conventional ones, because natural selection acts only against the sex which has the disease. There is evidence for this from a high-powered GWAS of adiposity-related traits.

Many of the large GWAS meta-analyses look for sex-dependent genetic effects by testing male and female groups separately. However, this may be under-powered compared to a whole-sample, gene-by-sex interaction test.

Glossary

Genome-wide association study (GWAS). Method for identifying molecular genetic variation that controls heritable traits in a population sample. Involves assessing the correlation between allele frequencies and phenotype value at millions of markers of common genetic variation across the genome.

Sexual dimorphism. A difference between males and females in a population for the value of a particular trait. May include anything from anatomical measurements to the expression level of a gene.

Sex-dependent genetic effect. A disease risk allele is termed sex-specific when it increases risk in one sex only and has no effect on the disease in the other sex. The term sex-biased is used for an allele that causes a significant increase in risk of disease in both sexes, but for which the magnitude of the risk increase is significantly different between males and females. There are also reports where an allele that increases risk of a disease in one sex reduces risk of the same disease in the other sex, but none have been replicated, and there is no known biochemical reason why this should occur. Such a case effectively constitutes a sexually antagonistic effect, but should be distinguished from intra-locus sexual conflict, which explicitly requires that an allele have opposing effects on the evolutionary fitness of males and females (Bonduriansky and Chenoweth 2009). All of the above relationships constitute a form of sex-dependent genetic effect.

Knowledge of the genetic causes of disease is rapidly expanding, following recent advances in genotyping and computational methods, and massive participant numbers. There are currently two important questions. Firstly, why do results from genome-wide association studies not reach the predictions of classical quantitative genetics (Manolio et al. 2009)? Secondly, how can we use knowledge of risk genes to help in the treatment of individuals (Manolio 2013)?

The modification of genetic effects by environmental variables (such as smoking, nutrient intake, stress) might help answer these questions, but such variables are difficult to quantify accurately. Sex (gender) is comparable to an environmental variable because it influences the immediate physiological environment within which a gene functions, via the sex-determination pathway and steroid hormones. In contrast to classical environmental variables, sex is simple to measure, equally distributed in all populations, and determined at conception. Sex-dependent genetic effects have long been known, with good evidence coming from studies of Mendelian sex-linked disorders, narrow-sense heritability, and linkage mapping of quantitative traits (Ober, Loisel, and Gilad 2008). Finally, the mitochondria, which are maternally inherited and show strong sex differences in function, harbour many disease-causing mutations which affect predominantly males (Beekman, Dowling, and Aanen 2014).

There have been broad assumptions that sex-dependent genetic effects are the result of differences in sex hormone levels, but experiments using hormone treatment and gonadectomy demonstrate that sex differences in core domains of immune response, behaviour, and toxin resistance are controlled by sex chromosomes, and not by sex hormones (Penaloza et al. 2009; Ngun et al. 2011). Additionally, studies of human cell lines, which are devoid of sex hormones, indicate that the expression level of up to 15% of genes is determined by the combination of both genotype and sex (Dimas et al. 2012). With genome-wide association testing and massive participant numbers, the true extent of robust and actionable sex-dependent disease associations is starting to be revealed.

Since the last review of sex differences in disease genetics, which also provided an insight into their possible evolutionary origins (Gilks, Abbott, and Morrow 2014), many new discoveries have been made. I first review methods for detecting sex-dependent effects in genome-wide association studies (GWAS), and then summarise the most convincing associations and the current knowledge within disease categories. Using evidence from studies in transcriptomics, cell biology and biochemistry, I discuss the diverse mechanisms by which an allele can have a sex-dependent effect on disease risk.

... show diverse and interesting patterns of sex differences. The plots are arranged by increasing DALY from left to right, and vertically by sex difference. Note that the y-axis scale differs between plots, so that for Stroke (top-left), the DALY is 1 million greater in middle-aged men than in women.

Methods for identifying sex-dependent effects in genome-wide association studies

Genome-wide association studies for common diseases typically model the relationship between genotype and disease status using logistic regression to detect additive genetic effects, whereby the effect of two alleles at a locus doubles the risk from having just one. In large GWAS with multiple sample collections, meta-analysis is appropriate. Detection of sex-dependent effects in GWAS meta-analyses has been pioneered by Magi and colleagues, with both a sex-differentiated test of association and a test of allelic heterogeneity between the sexes, which withstood comprehensive simulations for loss of power (Magi, Lindgren, and Andrew P. Morris 2010). The software these authors developed, 'GWAMA', has been applied to many large GWAS, although without tests for sex-dependent effects.
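The two tests attributed to Magi and colleagues can be illustrated with a minimal sketch. It assumes the commonly described formulation in which the sex-differentiated test sums the male and female association chi-square statistics (2 degrees of freedom) and the heterogeneity test subtracts the sex-combined, inverse-variance-weighted statistic from that sum (1 degree of freedom). The function and variable names are hypothetical, and this is not the GWAMA implementation.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square statistic with 2 df."""
    return math.exp(-x / 2.0)


def sex_differentiated_tests(beta_m, se_m, beta_f, se_f):
    """Hypothetical helper: sex-differentiated and heterogeneity tests
    from sex-specific effect estimates (log odds ratios) and standard errors.
    """
    w_m = 1.0 / se_m ** 2          # inverse-variance weight, males
    w_f = 1.0 / se_f ** 2          # inverse-variance weight, females
    x2_m = beta_m ** 2 * w_m       # male-only association statistic
    x2_f = beta_f ** 2 * w_f       # female-only association statistic

    # Sex-combined fixed-effects estimate and its association statistic.
    beta_c = (w_m * beta_m + w_f * beta_f) / (w_m + w_f)
    x2_c = beta_c ** 2 * (w_m + w_f)

    # Assumed formulation: association allowing different effects per sex.
    x2_diff = x2_m + x2_f
    # Heterogeneity between sexes; guard against tiny negative rounding error.
    x2_het = max(0.0, x2_diff - x2_c)

    return {
        "x2_diff": x2_diff,
        "p_diff": chi2_sf_2df(x2_diff),
        "x2_het": x2_het,
        "p_het": chi2_sf_1df(x2_het),
    }


if __name__ == "__main__":
    # Illustrative numbers only: an effect present in males, absent in females.
    print(sex_differentiated_tests(0.10, 0.02, 0.00, 0.02))
```

Under this formulation the heterogeneity statistic reduces to the squared difference in effect estimates divided by the sum of their squared standard errors, so it is zero when both sexes show the same effect.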

In many recent GWAS, with sample sizes in the hundreds of thousands originating from multiple clinical research groups, t and p-values for a sex difference have been calculated from separate-sex meta-analysis statistics (i.e. β, SE and r), where r was the Spearman rank coefficient across all SNPs (Randall et al. 2013; Winkler et al. 2015; Shungin et al. 2015). A permutation-based variation of the separate-sex method has been described (Liu et al. 2012), but the separate-sex approach may be underpowered compared to main-effects analysis because the test statistics for each sex are from a smaller sample (Behrens et al. 2011).
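The exact formula did not survive in the text above. A minimal sketch of a statistic of this kind follows, assuming the form usually reported for this approach: the difference between the male and female effect estimates divided by the standard error of that difference, with a covariance term scaled by r. The function names are hypothetical, and the p-value uses a normal approximation, which is an assumption that is reasonable only for large samples.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman rank correlation, e.g. of male and female betas across SNPs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


def sex_difference_test(beta_m, se_m, beta_f, se_f, r):
    """Hypothetical helper: t statistic and two-sided p-value for a
    difference in effect between the sexes at one SNP (assumed formula)."""
    var_diff = se_m ** 2 + se_f ** 2 - 2.0 * r * se_m * se_f
    if var_diff <= 0.0:
        raise ValueError("variance of the difference must be positive")
    t = (beta_m - beta_f) / math.sqrt(var_diff)
    # Normal approximation to the t distribution (assumption: large samples).
    p = math.erfc(abs(t) / math.sqrt(2.0))
    return t, p


if __name__ == "__main__":
    # Illustrative numbers only.
    r = spearman_r([0.01, 0.03, -0.02, 0.05], [0.02, 0.01, -0.01, 0.04])
    print(sex_difference_test(0.05, 0.01, 0.01, 0.01, r))
```

A positive r shrinks the variance of the difference, so ignoring a genuine correlation between the male and female estimates would make the test conservative.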

The alternative method for detecting sex-dependent genetic effects in case-control data is to use sex as an interaction term in the logistic regression ('GxS'), which maintains the full sample size and can be implemented with standard software such as R/GenABEL (Aulchenko et al. 2007) and PLINK (C. C. Chang et al. 2015). Despite the usability and power of these programs, the principles of statistical genetics should be thoroughly understood prior to use. There has been extensive work on statistical methods for gene-by-environment interactions in GWAS, which can be applied to GxS (Gauderman et al. 2013). The GxS model does not appear to have been applied to any of the very large GWAS collections, possibly because of complications in the meta-analysis. Another issue with GxS is the potential to identify statistically significant but biologically unintuitive results. For example, a negligible difference in allele frequency between the sexes within the disease group will generate a significant result if there is a similar difference in the opposite direction within the control group. Nevertheless, in a multi-collection GWAS, performing the GxS test for each subgroup and then performing the meta-analysis on these statistics may increase power over separate-sex meta-analysis.

... groups, i.e. male and female for both cases and controls, enforces the need for large and carefully selected control samples. Given the large number of different variables and co-factors in human populations (compared to controlled laboratory and agricultural studies), mixed models are increasingly used in GWAS, and have good potential for investigating sex-dependent effects (Hoffman, Mezey, and Schadt 2014). Finally, it is now possible to conduct association tests across the X chromosome that incorporate male hemizygosity and X-inactivation using the XWAS software package (Gao et al. 2015).
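The gene-by-sex interaction test discussed in this section can be sketched in a deliberately simplified form. The sketch assumes allele counts rather than genotypes (so alleles are treated as independent observations) and no covariates. In that setting the interaction coefficient of a logistic model with allele, sex, and allele-by-sex terms equals the log of the ratio of the male and female odds ratios, and its Wald variance is the sum of the reciprocals of the eight cell counts. All names and counts are hypothetical. The example counts reproduce the caveat above: a small allele-frequency difference between the sexes in cases, mirrored in the opposite direction in controls, gives a nominally significant interaction.

```python
import math


def log_odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Log odds ratio and its Woolf variance from a 2x2 table of allele counts."""
    log_or = math.log((case_alt * ctrl_ref) / (case_ref * ctrl_alt))
    var = 1.0 / case_alt + 1.0 / case_ref + 1.0 / ctrl_alt + 1.0 / ctrl_ref
    return log_or, var


def gxs_interaction_test(male_counts, female_counts):
    """Hypothetical helper: Wald test of the allele-by-sex interaction.

    Each argument is (case_alt, case_ref, ctrl_alt, ctrl_ref) allele counts.
    Returns the interaction log odds ratio, its z statistic and two-sided p.
    """
    log_or_m, var_m = log_odds_ratio(*male_counts)
    log_or_f, var_f = log_odds_ratio(*female_counts)
    interaction = log_or_m - log_or_f            # log(OR_male / OR_female)
    z = interaction / math.sqrt(var_m + var_f)   # sexes are independent samples
    p = math.erfc(abs(z) / math.sqrt(2.0))       # two-sided normal p-value
    return interaction, z, p


# Invented counts: risk-allele frequency 0.204 vs 0.200 between the sexes in
# cases, and the mirror-image difference in controls.
males = (20400, 79600, 20000, 80000)
females = (20000, 80000, 20400, 79600)

if __name__ == "__main__":
    print(gxs_interaction_test(males, females))
```

A full analysis would fit the additive genotype model with covariates in dedicated software; the sketch only shows why all four sex-by-status groups contribute to the interaction statistic.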
The XWAS method has found new X-linked genes for auto-immune disorders, some of which exert sex-specific risk (D. ...

... effects. In Alzheimer's disease, the ε4 risk allele of the APOE gene is associated with an earlier onset in men but a greater overall risk in women, notably in heterozygous women carrying one APOE-ε4 allele (Riedel, Thompson, and Brinton 2016). The biochemical origins of this are expected to lie in the lowering of bioenergetic rate during menopause, which creates a uniquely female risk profile (Riedel, Thompson, and Brinton 2016). Within the other disorders, there are subtle sex differences reported in prevalence, drug metabolism and clinical presentation, suggesting some influence of sex differences in pathogenesis and potentially genetics. (Also see Figure 1, Unipolar depression, with increased disability-adjusted life years in women globally.)

Autism typically has a 3:1 male bias in prevalence, and is dominated by high-penetrance de novo mutations. These do not occur in male-biased genes, although gene expression in specific brain regions exhibits increased activity of normally male-biased genes (Werling, Parikshak, and Geschwind 2016). This provides strong evidence that hyper-masculinisation of certain regions of the brain, in both sexes, forms part of the aetiology of autism. In female autism cases, maternally inherited genomic structural variation is a more common cause, leading to speculation that female embryos are more tolerant of this kind of genetic variation, but with the disadvantage of increased risk of disorders such as autism (Desachy et al. 2015). These examples for autism concern the impact of rare variants on disease, and are not under the same evolutionary forces as the high-frequency, low-risk variants which control many quantitative traits and common diseases. Nevertheless, they constitute a sex difference in genetic architecture, which is likely to originate from the differing pressures of natural selection on male and female genomes.

Gilks, Sex differences in disease genetics

Auto-immunity

These disorders classically show increased prevalence in females, e.g. rheumatoid arthritis and osteoarthritis (see Figure 1 and Rubtsova, Marrack, and Rubtsov 2015), but few high-powered GWAS of common immune disorders report investigating or discovering sex-dependent genetic effects. Application of the 'XWAS method' has identified X-linked sex-specific genetic effects on inflammatory bowel disease at the genes C1GALT1C1, CENPI and MCF2 (D. Chang et al. 2014).

For asthma, a male-only risk effect has been observed at the ZPBP2 gene (Naumova et al. 2013).

... (Figure 2). One study reported that 11 out of 44 sex-dependent loci had significant opposing directions of effect between the two sexes (Winkler et al. 2015). Although these effects were not directly reproduced in the other studies, significant effects at some of the same genes were found, but in one sex only. Several genes now implicated in the regulation of WHRadjBMI in women encode proteins that interact with one another. For example, the PPARγ and RXRα proteins (Winkler et al. 2015) bind to form Adipocyte-specific transcription factor 6, a known master regulator of adipocyte gene expression (Tontonoz et al. 1994). In another example, the transcription factors HoxC13 and MEIS1 bind in vivo, with complex tissue-dependent activity, and phenotypes relating to leukaemia, prostate cancer, hair, and limb development (Adamaki et al. 2015; Z. Lin et al. 2012).

Of the four GWAS, that by Shungin and colleagues (Shungin et al. 2015) is the most comprehensive and rigorous, although many results from the other studies are likely to be biologically meaningful. This study identified 49 loci affecting WHRadjBMI, of which 19 were female-specific (38%) and 1 was male-specific. The distribution of effect sizes at significant genes suggests that a subset of female-specific loci have greater effect sizes than loci that affect both sexes equally (Figure 2). Overall, these results tentatively suggest that female-specific genetic effects are likely to make up a large proportion of additive heritability for WHRadjBMI, which in turn is linked to adiposity, heart disease and type II diabetes. However, more information is needed on what the biological basis of WHRadjBMI is, and whether it is correlated with mortality equally in both males and females.
Sex-dependent risk loci are expected for common diseases because of sex-dependent expression quantitative trait loci (eQTL), whereby the expression level of a gene is determined by the combination of sex and genotype. In cell culture, sex-dependent eQTLs constitute 15% of genes, and it is interesting to note that many genes with no difference in expression level between the sexes can still have sequence variation which acts in a sex-dependent way on expression (Dimas et al. 2012; Werling, Parikshak, and Geschwind 2016). One biochemical possibility for this is that different transcription factors transcribe the gene at the same rate in each sex but have different binding sites on the DNA, so that only one of them will be affected by a short polymorphism. Sexually dimorphic gene expression varies dynamically by tissue type, age and chromosome (Kang et al. 2011; Lowe et al. 2015).

... categorised, and how univariate analysis might be misleading.

The evolution of sex-dependent genetic effects on disease risk has been postulated to arise from intra-locus sexual conflict, but direct evidence is limited (Gilks, Abbott, and Morrow 2014). In the example of ZPBP2 and asthma, the increased gene activity and risk in males is likely to be maintained in the population because of the fundamental need for sperm production. For genomic structural variation in autism, if female embryos are indeed more resistant, this increases the female birth ratio but at the price of reduced cognition (Desachy et al. 2015). Variants in RXRα increase waist-hip ratio in females, and high expression levels are correlated with increased rates of premature birth, which is inevitably associated with increased infant mortality. This might indicate balancing selection between nutrient metabolism and storage in the mother, and investment in the health of the offspring. Overall, the evidence hints that processes such as sex-specific pleiotropy and inter-locus sexual conflict might provide evolutionary explanations for the observed sex-dependent disease risk genes.

... only selected against in one sex (Morrow and Connallon 2013). One cause for concern is the use of analysis methods which do not directly test for genotype-by-sex interaction, and thus remain under-powered.

There has been great public investment in GWAS meta-analyses, and publications and results are becoming progressively more accessible, so releasing the computational code and logs detailing precisely how the analyses were performed might be of benefit to the rest of the research community. This review highlights the need, not only for more research into sex-dependent genetic effects, but also for continued collection of large sample sizes for GWAS, and for a comprehensive study of gene expression across all ages, sexes and tissue types, both of which are huge projects requiring massive collaboration, astute management, and fearless science. Smaller-scale studies are warranted too, including simulations of experimental design and analysis methods for detecting sex-dependent genetic effects, and investigation of the forces of evolution that provide deep-rooted explanations for the everyday problems of heritable diseases.

URLs for data, software and code