MAGMA: generalized gene-set analysis of GWAS data

PLoS Comput Biol. 2015 Apr 17;11(4):e1004219. doi: 10.1371/journal.pcbi.1004219. eCollection 2015 Apr.

Abstract

By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn's Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn's Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn's Disease data was found to be considerably faster as well.
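
The two-layer structure described in the abstract can be illustrated with a minimal sketch. It is not the MAGMA implementation. The abstract states only that the gene analysis is based on a multiple regression model and that the gene-set analysis uses a regression structure on top of it; everything else here is an assumption made for illustration. That includes the function names `gene_f_statistic` and `gene_set_t_statistic`, the use of an F statistic for the joint SNP effect, a continuous phenotype, and a 0/1 set-membership indicator as the only predictor.

```python
import numpy as np


def gene_f_statistic(genotypes, phenotype):
    """Joint F statistic for all SNPs in one gene (hypothetical helper).

    genotypes: (n_individuals, n_snps) array of allele counts.
    phenotype: (n_individuals,) continuous trait values.
    Returns (F, model_df, residual_df).
    """
    genotypes = np.asarray(genotypes, dtype=float)
    phenotype = np.asarray(phenotype, dtype=float)
    n = genotypes.shape[0]

    # Full model: intercept plus every SNP in the gene, fitted jointly.
    X = np.column_stack([np.ones(n), genotypes])
    beta, *_ = np.linalg.lstsq(X, phenotype, rcond=None)
    resid = phenotype - X @ beta
    rss_full = float(resid @ resid)

    # Null model: intercept only.
    rss_null = float(np.sum((phenotype - phenotype.mean()) ** 2))

    # Using the matrix rank keeps the degrees of freedom correct when
    # SNPs in strong linkage disequilibrium are collinear.
    model_df = int(np.linalg.matrix_rank(X)) - 1
    resid_df = n - model_df - 1
    f_stat = ((rss_null - rss_full) / model_df) / (rss_full / resid_df)
    return f_stat, model_df, resid_df


def gene_set_t_statistic(gene_scores, in_set):
    """Regress per-gene scores on a 0/1 set indicator (hypothetical helper).

    Returns (slope, t): the slope is the mean score difference between
    genes inside and outside the set, and t is its test statistic.
    """
    z = np.asarray(gene_scores, dtype=float)
    s = np.asarray(in_set, dtype=float)
    X = np.column_stack([np.ones_like(s), s])
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    sigma2 = float(resid @ resid) / (len(z) - 2)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return float(beta[1]), float(beta[1] / np.sqrt(cov[1, 1]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Simulated gene with 5 SNPs, the first of which affects the trait.
    G = rng.integers(0, 3, size=(200, 5)).astype(float)
    y = G[:, 0] + rng.normal(size=200)
    print(gene_f_statistic(G, y))
    # Toy gene-level scores with the first two genes in the set.
    print(gene_set_t_statistic([2.0, 3.0, 0.0, 1.0], [1, 1, 0, 0]))
```

Because the second layer is itself a regression, further columns (a continuous gene property, or additional gene sets) could be appended to its design matrix, which is the kind of generalization the abstract refers to.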

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Computer Simulation
  • Crohn Disease / genetics
  • Databases, Genetic*
  • Genome-Wide Association Study / methods*
  • Humans
  • Models, Genetic
  • Software*

Grants and funding

This study was conducted as part of the Complexity project of the Netherlands Scientific Organisation (www.nwo.nl), grant NWO 645-000-003 (DP, TH). Statistical analyses were carried out on the Genetic Cluster Computer (http://www.geneticcluster.org) funded by the Netherlands Scientific Organisation, grant NWO 480-05-003 (DP). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.