
  • Technical Report

Identifying loci affecting trait variability and detecting interactions in genome-wide association studies

Abstract

Identification of genetic variants with effects on trait variability can provide insights into the biological mechanisms that control variation and can identify potential interactions. We propose a two-degree-of-freedom test for jointly testing mean and variance effects to identify such variants. We implement the test in a linear mixed model, for which we provide an efficient algorithm and software. To focus on biologically interesting settings, we develop a test for dispersion effects, that is, variance effects not driven solely by mean effects when the trait distribution is non-normal. We apply our approach to body mass index in the subsample of the UK Biobank population with British ancestry (n ~ 408,000) and show that our approach can increase the power to detect associated loci. We identify and replicate novel associations with significant variance effects that cannot be explained by the non-normality of body mass index, and we provide suggestive evidence for a connection between leptin levels and body mass index variability.
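The idea of a joint two-degree-of-freedom test can be illustrated with a simplified two-stage sketch. This is not the authors' linear mixed model implementation: it assumes unrelated individuals, a normally distributed trait, and a log-linear variance model, and it approximates the joint statistic by adding a one-degree-of-freedom statistic for the mean effect to one for the variance effect (from regressing log squared residuals on genotype). All function names and simulation parameters below are hypothetical.

```python
import math
import random


def simulate(n, freq, beta, beta_v, seed=0):
    """Simulate allele counts and a trait with a mean and a log-linear variance effect."""
    rng = random.Random(seed)
    genotypes, trait = [], []
    for _ in range(n):
        g = int(rng.random() < freq) + int(rng.random() < freq)  # allele count 0, 1 or 2
        sd = math.exp(0.5 * beta_v * g)  # residual variance = exp(beta_v * g)
        genotypes.append(g)
        trait.append(beta * g + rng.gauss(0.0, sd))
    return genotypes, trait


def slope_test(x, y):
    """Simple linear regression of y on x: slope, 1-df chi-square, residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    resid = [yi - my - slope * (xi - mx) for xi, yi in zip(x, y)]
    sigma2 = sum(r * r for r in resid) / (n - 2)
    return slope, slope * slope * sxx / sigma2, resid


def joint_mean_variance_test(genotypes, trait):
    """Two-stage approximation to a 2-df joint test of mean and variance effects."""
    beta_hat, chi2_mean, resid = slope_test(genotypes, trait)
    # Stage 2: the slope of log squared residuals on genotype estimates beta_v.
    log_sq = [math.log(max(r * r, 1e-300)) for r in resid]
    beta_v_hat, chi2_var, _ = slope_test(genotypes, log_sq)
    chi2_joint = chi2_mean + chi2_var
    return {
        "beta": beta_hat,
        "beta_v": beta_v_hat,
        "p_mean": math.erfc(math.sqrt(chi2_mean / 2.0)),  # chi-square, 1 df
        "p_joint": math.exp(-chi2_joint / 2.0),  # chi-square, 2 df
    }


# Hypothetical locus with no mean effect but a variance effect of 0.2 per allele.
g, y = simulate(n=20000, freq=0.3, beta=0.0, beta_v=0.2, seed=1)
result = joint_mean_variance_test(g, y)
print(result)
```

In this toy setting the additive test alone has little to detect, while the joint statistic picks up the variance effect, which is the situation in which a joint test is expected to gain power.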


Fig. 1: Association signal of AV test for simulated phenotypes with different parameters.
Fig. 2: Manhattan Sunset plot visualizing the genome-wide additive and log-linear variance test statistics for BMI.
Fig. 3: Relationship between additive and log-linear variance effects.
Fig. 4: Quantile-quantile plots for test statistics.
Fig. 5: Manhattan Sunset plot for dispersion effects.


Data availability

The primary data analyzed in this study come from the UK Biobank. Applications for access can be made on the UK Biobank website.

References

  1. Price, A. L., Spencer, C. C. A. & Donnelly, P. Progress and promise in understanding the genetic basis of common diseases. Proc. Biol. Sci. 282, 20151684 (2015).

  2. Hill, W. G., Goddard, M. E. & Visscher, P. M. Data and theory point to mainly additive genetic variance for complex traits. PLoS Genet. 4, e1000008 (2008).

  3. Sudlow, C. et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS. Med. 12, e1001779 (2015).

  4. Marchini, J., Donnelly, P. & Cardon, L. R. Genome-wide strategies for detecting multiple loci that influence complex diseases. Nat. Genet. 37, 413–417 (2005).

  5. Paré, G., Cook, N. R., Ridker, P. M. & Chasman, D. I. On the use of variance per genotype as a tool to identify quantitative trait interaction effects: a report from the Women's Genome Health Study. PLoS Genet. 6, e1000981 (2010).

  6. Struchalin, M. V., Dehghan, A., Witteman, J. C., van Duijn, C. & Aulchenko, Y. S. Variance heterogeneity analysis for detection of potentially interacting genetic loci: method and its limitations. BMC Genet. 11, 92 (2010).

  7. Hill, W. G. & Mulder, H. A. Genetic analysis of environmental variation. Genet. Res. (Camb). 92, 381–395 (2010).

  8. Forsberg, S. K. G. et al. The multi-allelic genetic architecture of a variance-heterogeneity locus for molybdenum concentration in leaves acts as a source of unexplained additive genetic variance. PLoS Genet. 11, e1005648 (2015).

  9. Ivarsdottir, E. V. et al. Effect of sequence variants on variance in glucose levels predicts type 2 diabetes risk and accounts for heritability. Nat. Genet. 1398–1402 (2017).

  10. Kitano, H. Biological robustness. Nat. Rev. Genet. 5, 826–837 (2004).

  11. Rönnegård, L. & Valdar, W. Recent developments in statistical methods for detecting genetic loci affecting phenotypic variability. BMC Genet. 13, 63 (2012).

  12. Yang, J. et al. FTO genotype is associated with phenotypic variability of body mass index. Nature 490, 267–272 (2012).

  13. Cao, Y., Wei, P., Bailey, M., Kauwe, J. S. K. & Maxwell, T. J. A versatile omnibus test for detecting mean and variance heterogeneity. Genet. Epidemiol. 38, 51–59 (2014).

  14. Cao, Y., Maxwell, T. J. & Wei, P. A family-based joint test for mean and variance heterogeneity for quantitative traits. Ann. Hum. Genet. 79, 46–56 (2015).

  15. Rönnegård, L., Felleki, M., Fikse, F., Mulder, H. A. & Strandberg, E. Genetic heterogeneity of residual variance: estimation of variance components using double hierarchical generalized linear models. Genet. Sel. Evol. 42, 8 (2010).

  16. Box, G. E. P. Non-normality and tests on variances. Biometrika 40, 318–335 (1953).

  17. Locke, A. E. et al. Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197–206 (2015).

  18. Zhang, Z. et al. Mixed linear model approach adapted for genome-wide association studies. Nat. Genet. 42, 355–360 (2010).

  19. Yang, J., Zaitlen, N. A., Goddard, M. E., Visscher, P. M. & Price, A. L. Advantages and pitfalls in the application of mixed-model association methods. Nat. Genet. 46, 100–106 (2014).

  20. Bycroft, C. et al. Genome-wide genetic data on ~500,000 UK Biobank participants. Preprint at https://www.biorxiv.org/content/early/2017/07/20/166298 (2017).

  21. Turcot, V. et al. Protein-altering variants associated with body mass index implicate pathways that control energy intake and expenditure in obesity. Nat. Genet. 50, 26–41 (2018).

  22. Horikoshi, M. et al. New loci associated with birth weight identify genetic links between intrauterine growth and adult height and metabolism. Nat. Genet. 45, 76–82 (2013).

  23. Freathy, R. M. et al. Variants in ADCY5 and near CCNL1 are associated with fetal growth and birth weight. Nat. Genet. 42, 430–435 (2010).

  24. Hivert, M. F. et al. Genetic determinants of adiponectin regulation revealed by pregnancy. Obesity (Silver Spring) 25, 935–944 (2017).

  25. Perry, J. R. B. et al. Parent-of-origin-specific allelic associations among 106 genomic loci for age at menarche. Nature 514, 92–97 (2014).

  26. Zeggini, E. et al. Replication of genome-wide association signals in UK samples reveals risk loci for type 2 diabetes. Science 316, 1336–1341 (2007).

  27. Larsen, T. M., Toubro, S. & Astrup, A. PPARgamma agonists in the treatment of type II diabetes: is increased fatness commensurate with long-term efficacy? Int. J. Obes. Relat. Metab. Disord. 27, 147–161 (2003).

  28. Young, A. I., Wauthier, F. & Donnelly, P. Multiple novel gene-by-environment interactions modify the effect of FTO variants on body mass index. Nat. Commun. 7, 12724 (2016).

  29. Kilpeläinen, T. O. et al. Genome-wide meta-analysis uncovers novel loci influencing circulating leptin levels. Nat. Commun. 7, 10494 (2016).

  30. Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).

  31. Marazzi, A. Algorithms, Routines, and S-Functions for Robust Statistics. (Chapman and Hall/CRC, New York, 1993).

  32. Lippert, C. et al. FaST linear mixed models for genome-wide association studies. Nat. Methods 8, 833–835 (2011).

  33. Wolfinger, R., Tobias, R. & Sall, J. Computing Gaussian likelihoods and their derivatives for general linear mixed models. SIAM J. Sci. Comput. 15, 1294–1310 (1994).

  34. Lippert, C. et al. The benefits of selecting phenotype-specific variants for applications of mixed models in genomics. Sci. Rep. 3, 1815 (2013).

  35. Devlin, B. & Roeder, K. Genomic control for association studies. Biometrics 55, 997–1004 (1999).


Acknowledgements

This work was supported by Wellcome Trust grant 095552/Z/11/Z to P.D. and grants 090532/Z/09/Z and 20314/Z/16/Z as core support for the Wellcome Trust Centre for Human Genetics. A.Y. was supported by a Wellcome Trust Doctoral Studentship (099670/Z/12/Z) and by the Li Ka Shing Foundation.

Author information

Contributions

A.Y. developed the method, led its application to the UK Biobank data, and wrote the paper. F.L.W. was involved in the development and application of the method. P.D. supervised the research and wrote the paper. All work undertaken by F.L.W. was done while F.L.W. was at University of Oxford.

Corresponding authors

Correspondence to Alexander I. Young or Peter Donnelly.

Ethics declarations

Competing interests

P.D. is a founder and director of Genomics plc, and a partner of Peptide Groove LLP. The remaining authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 Association signal of the additive variance (AV) test for simulated phenotypes with different parameters.

The expected −log10 (P value) of the AV test for different additive and log-linear variance effects of the test locus is indicated by shading. Phenotypes were simulated for 100,000 unrelated individuals (Methods). The test locus had frequency 0.05. To make this plot comparable to Fig. 1, we used the same set of additive effects. As in Fig. 1, the strength of the additive effect is parameterized by the amount of variance explained, h^2, if the allele frequency is 0.5. Here the allele frequency is 0.05, so the actual variance explained is 0.19 times the variance explained when the allele frequency is 0.5. The log-linear variance effect is indicated on the y axis and corresponds approximately to the proportional change in phenotypic variance per allele. We have highlighted two regions of parameter space: the area inside the green lines is where the association signal is stronger under the AV test than under the additive test, and the area inside the yellow lines is where the AV test is genome-wide significant (P < 5 × 10^−8) but the additive test is not.
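The factor of 0.19 follows from the variance of an allele count under Hardy-Weinberg equilibrium, 2f(1 − f), which scales the variance explained by a fixed per-allele effect. The short sketch below reproduces that arithmetic and, assuming the log-linear model Var = σ^2 exp(β_v g), shows why a small β_v is close to the proportional change in variance per allele. The function names are illustrative only.

```python
import math


def allele_count_variance(freq):
    """Variance of the allele count (0, 1, 2) under Hardy-Weinberg equilibrium."""
    return 2.0 * freq * (1.0 - freq)


def variance_explained_ratio(freq, ref_freq=0.5):
    """Variance explained at `freq` relative to `ref_freq` for the same per-allele effect."""
    return allele_count_variance(freq) / allele_count_variance(ref_freq)


def proportional_variance_change(beta_v):
    """Exact proportional change in variance per allele when log variance is linear in genotype."""
    return math.exp(beta_v) - 1.0


print(round(variance_explained_ratio(0.05), 2))  # 0.19
print(proportional_variance_change(0.05))        # slightly above 0.05
```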

Supplementary Figure 2 Comparison of association signal for the additive variance (AV) and additive tests for different sample sizes.

The association signal when testing for both additive and log-linear variance effects (AV test) compared to testing for only an additive effect (additive test) in simulations. The y axis gives the expected log ratio (base 10) of the P value from the additive test to the AV test for different variance effects of the test SNP (x axis), with values above zero indicating a stronger signal from the AV test. The simulations were performed for sample sizes of 10,000 (red), 50,000 (green), and 100,000 (blue), indicated with the different colored curves. The log ratio is plotted as a crossed box if the expected P value from the additive variance test would pass the standard genome-wide significance threshold of 5 × 10^−8, and it is plotted with a triangle if neither of the expected P values from the two tests would pass the significance threshold.

Supplementary Figure 3 Relationship between additive and variance effects from GIANT meta-analyses.

Estimated additive (x axis) and variance (y axis) effects on BMI are plotted for all genome-wide loci, shaded in proportion to the negative log10 (P value) for an additive effect, up to a maximum of negative log10 (5 × 10^−8), the conventional boundary for genome-wide significance. The additive effects are taken from Locke et al. (Nature 518, 197–206, 2015), and the variance effects are taken from Yang et al. (Nature 490, 267–272, 2012). Because of the mean–variance relationship of untransformed BMI, any locus with an additive effect is expected to have a variance effect, even after inverse normal transformation. The red line has slope 0.1071, determined by robust regression of genome-wide variance effects on additive effects, with weights proportional to the inverse square of the standard error of the estimated variance effects.
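A weighted robust regression of this kind can be sketched as follows. The caption does not specify the robust estimator, so this example assumes a Huber-type iteratively reweighted least squares fit with no intercept, which may differ from the procedure actually used. The prior weights are 1/se^2 as described; the function name, tuning constant, and toy data are hypothetical.

```python
import statistics


def weighted_robust_slope(x, y, se, k=1.345, n_iter=100, tol=1e-10):
    """Slope of y on x (no intercept) by Huber-type IRLS with prior weights 1 / se^2."""
    prior = [1.0 / (s * s) for s in se]
    weights = prior[:]
    slope = 0.0
    for _ in range(n_iter):
        num = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
        den = sum(w * xi * xi for w, xi in zip(weights, x))
        new_slope = num / den
        # Residuals standardized by each estimate's own standard error.
        z = [(yi - new_slope * xi) / s for xi, yi, s in zip(x, y, se)]
        scale = statistics.median(abs(zi) for zi in z) / 0.6745  # robust scale (MAD)
        if scale == 0.0:
            return new_slope
        cut = k * scale
        # Huber weights: full weight inside the cut, down-weighted beyond it.
        weights = [p * (1.0 if abs(zi) <= cut else cut / abs(zi))
                   for p, zi in zip(prior, z)]
        if abs(new_slope - slope) < tol:
            return new_slope
        slope = new_slope
    return slope


# Toy data: true slope 0.1, small alternating noise, one gross outlier.
additive = [-1.0 + 0.2 * i for i in range(11)]
variance = [0.1 * a + 0.01 * (-1) ** i for i, a in enumerate(additive)]
variance[-1] += 1.0
se = [0.05] * 11

robust = weighted_robust_slope(additive, variance, se)
plain = sum(a * v for a, v in zip(additive, variance)) / sum(a * a for a in additive)
print(robust, plain)
```

On the toy data the plain weighted least squares slope is pulled strongly toward the outlier, while the robust fit stays close to the underlying slope. This is the motivation for using a robust estimator when a few loci have unusually large variance effects.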

Supplementary Figure 4 Relationship between estimated leptin effect and estimated dispersion effect on BMI.

Estimated leptin effect (s.d. change in leptin per allele) (x axis) and dispersion (y axis) effects on BMI are plotted for the top 100 approximately independent SNPs ranked by evidence for a leptin effect (Methods). The leptin effects are taken from Kilpeläinen et al. (Nat. Commun. 7, 2016), and the dispersion effects are taken from our analysis of the UK Biobank. The red line gives the estimated expected dispersion effect for a given leptin effect (Methods).

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–4, Supplementary Tables 1 and 5, and Supplementary Note

Reporting Summary

Supplementary Table 2

Genome-wide summary statistics for BMI

Supplementary Table 3

Summary statistics for genome-wide significant SNPs

Supplementary Table 4

Summary statistics from the gene-by-environment interaction analysis

About this article

Cite this article

Young, A.I., Wauthier, F.L. & Donnelly, P. Identifying loci affecting trait variability and detecting interactions in genome-wide association studies. Nat Genet 50, 1608–1614 (2018). https://doi.org/10.1038/s41588-018-0225-6


  • DOI: https://doi.org/10.1038/s41588-018-0225-6
