bioRxiv
New Results

An Atlas of Genetic Correlations across Human Diseases and Traits

Brendan Bulik-Sullivan, Hilary K Finucane, Verneri Anttila, Alexander Gusev, Felix R. Day, ReproGen Consortium, Psychiatric Genomics Consortium, Genetic Consortium for Anorexia Nervosa of the Wellcome Trust Case Control Consortium 3, Laramie Duncan, John R. B. Perry, Nick Patterson, Elise B. Robinson, Mark J. Daly, Alkes L. Price, Benjamin M. Neale
doi: https://doi.org/10.1101/014498
Brendan Bulik-Sullivan
1Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
2Stanley Center for Psychiatric Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
3Analytic and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA.
Hilary K Finucane
4Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA.
Verneri Anttila
1Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
2Stanley Center for Psychiatric Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
3Analytic and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA.
Alexander Gusev
5Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
6Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
Felix R. Day
7MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Institute of Metabolic Science, Cambridge Biomedical Campus, Cambridge, CB2 0QQ, UK
Laramie Duncan
1Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
2Stanley Center for Psychiatric Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
3Analytic and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA.
John R. B. Perry
7MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Institute of Metabolic Science, Cambridge Biomedical Campus, Cambridge, CB2 0QQ, UK
Nick Patterson
1Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Elise B. Robinson
1Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
2Stanley Center for Psychiatric Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
3Analytic and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA.
Mark J. Daly
1Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
2Stanley Center for Psychiatric Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
3Analytic and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA.
Alkes L. Price
1Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
5Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
6Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
Benjamin M. Neale
1Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
2Stanley Center for Psychiatric Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
3Analytic and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA.
  • For correspondence: bulik@broadinstitute.org bneale@broadinstitute.org

Abstract

Identifying genetic correlations between complex traits and diseases can provide useful etiological insights and help prioritize likely causal relationships. The major challenges preventing estimation of genetic correlation from genome-wide association study (GWAS) data with current methods are the lack of availability of individual genotype data and widespread sample overlap among meta-analyses. We circumvent these difficulties by introducing a technique for estimating genetic correlation that requires only GWAS summary statistics and is not biased by sample overlap. We use our method to estimate 300 genetic correlations among 25 traits, totaling more than 1.5 million unique phenotype measurements. Our results include genetic correlations between anorexia nervosa and schizophrenia, anorexia and obesity and associations between educational attainment and several diseases. These results highlight the power of genome-wide analyses, since there currently are no genome-wide significant SNPs for anorexia nervosa and only three for educational attainment.

Introduction

Understanding the complex relationships between human behaviours, traits and diseases is a fundamental goal of epidemiology. In the absence of randomized controlled trials and longitudinal studies, many disease risk factors are identified on the basis of population cross-sectional correlations of variables at a single time point. Such approaches can be biased by confounding and reverse causation, leading to spurious associations [1, 2]. Genetics can help elucidate cause or effect, since inherited genetic effects cannot be subject to reverse causation and are biased by a smaller list of confounders.

The first methods for testing for genetic overlap were family studies [3–7]. The disadvantage of these methods is the requirement to measure all traits on the same individuals, which scales poorly to studies of a large number of traits, especially traits that are difficult or costly to measure (e.g., low-prevalence diseases). Genome-wide association studies (GWAS) produce effect-size estimates for specific genetic variants, so it is possible to test for shared genetics by looking for correlations in effect-sizes across traits, which does not require measuring multiple traits per individual.

A widely-used technique for testing for relationships between phenotypes using GWAS data is Mendelian randomization (MR) [1, 2], which is the specialization to genetics of instrumental variables [8]. MR is effective for traits where significant associations account for a substantial fraction of heritability [9, 10]. For many complex traits, heritability is distributed over thousands of variants with small effects, and the proportion of heritability accounted for by significantly associated variants at current sample sizes is small [11]. For such traits, MR suffers from low power and weak instrument bias [8, 12].

A complementary approach is to estimate genetic correlation, a quantity that includes the effects of all SNPs, including those that do not reach genome-wide significance (Methods). Genetic correlation is also meaningful for pairs of diseases, in which case it can be interpreted as the genetic analogue of comorbidity. The two main existing techniques for estimating genetic correlation from GWAS data are restricted maximum likelihood (REML) [13–18] and polygenic scores [19, 20]. These methods have only been applied to a few traits, because they require individual genotype data, which are difficult to obtain due to informed consent limitations.

In response to these limitations, we have developed a technique for estimating genetic correlation using only GWAS summary statistics that is not biased by sample overlap. Our method, cross-trait LD Score regression, is an extension of single-trait LD Score regression [21] and is computationally very fast. We apply this method to data from 25 GWAS and report genetic correlations for 300 pairs of phenotypes, demonstrating shared genetic bases for many complex diseases and traits.

Results

Overview of Methods

The method presented here for estimating genetic correlation from summary statistics relies on the fact that the GWAS effect-size estimate for a given SNP incorporates the effects of all SNPs in linkage disequilibrium (LD) with that SNP [21, 22]. For a polygenic trait, SNPs with high LD will have higher $\chi^2$ statistics on average than SNPs with low LD [21]. A similar relationship holds if we replace $\chi^2$ statistics for a single study with the product of z-scores from two studies of traits with non-zero genetic correlation.

More precisely, under a polygenic model [13, 15], the expected value of $z_{1j} z_{2j}$ is

$$\mathrm{E}[z_{1j} z_{2j}] = \frac{\sqrt{N_1 N_2}\,\rho_g}{M}\,\ell_j + \frac{\rho N_s}{\sqrt{N_1 N_2}}, \tag{1}$$

where $N_i$ is the sample size for study $i$, $\rho_g$ is genetic covariance (defined in Methods), $\ell_j$ is LD Score [21], $M$ is the number of SNPs, $N_s$ is the number of individuals included in both studies, and $\rho$ is the phenotypic correlation among the $N_s$ overlapping samples. We derive this equation in the Supplementary Note. If study 1 and study 2 are the same study, then Equation 1 reduces to the single-trait result from [21], because genetic covariance between a trait and itself is heritability, and $\chi^2 = z^2$.

As a consequence of Equation 1, we can estimate genetic covariance using the slope from the regression of $z_{1j} z_{2j}$ on LD Score, which is computationally very fast (Methods). If there is sample overlap, it will only affect the intercept from this regression (the term $\rho N_s / \sqrt{N_1 N_2}$) and not the slope, so the estimates of genetic correlation will not be biased by sample overlap. Similarly, shared population stratification will alter the intercept but have minimal impact on the slope, for the same reasons that population stratification has minimal impact on the slope from single-trait LD Score regression [21]. If we are willing to assume no shared population stratification and we know the amount of sample overlap and phenotypic correlation in advance (i.e., the true value of $\rho N_s / \sqrt{N_1 N_2}$), we can constrain the intercept to this value, which reduces the standard error. We refer to this approach as constrained intercept LD Score regression.

Normalizing genetic covariance by the SNP-heritabilities yields genetic correlation: $r_g := \rho_g / \sqrt{h_1^2 h_2^2}$, where $h_i^2$ denotes the SNP-heritability [13] from study $i$. Genetic correlation ranges between −1 and 1. Similar results hold if one or both studies is a case/control study, in which case genetic covariance is on the observed scale. There is no distinction between observed and liability scale genetic correlation for case/control traits, so we can talk about genetic correlation between a case/control trait and a quantitative trait and genetic correlation between pairs of case/control traits without difficulties (Supplementary Note).
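The regression described above can be illustrated with a minimal sketch. It assumes toy LD Scores drawn from a gamma distribution, hypothetical sample sizes and heritabilities, and z-scores drawn independently across SNPs with the variances and covariance implied by the polygenic model; none of these values come from the paper, and the fit is unweighted for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical study parameters (illustrative values, not taken from the paper).
M = 200_000                      # number of SNPs
N1, N2 = 50_000, 40_000          # sample sizes of the two studies
Ns, rho = 20_000, 0.5            # overlapping samples and their phenotypic correlation
h2_1, h2_2, rho_g = 0.5, 0.4, 0.2
true_rg = rho_g / np.sqrt(h2_1 * h2_2)
true_intercept = rho * Ns / np.sqrt(N1 * N2)

# Toy LD Scores; real LD Scores are computed from a reference panel.
ld = rng.gamma(shape=2.0, scale=50.0, size=M)

# Variances and covariance of (z1j, z2j) implied by the polygenic model.
var1 = N1 * h2_1 * ld / M + 1.0
var2 = N2 * h2_2 * ld / M + 1.0
cov12 = np.sqrt(N1 * N2) * rho_g * ld / M + true_intercept

# Draw z-scores with that covariance, independently across SNPs (a simplification).
corr = cov12 / np.sqrt(var1 * var2)
e1, e2 = rng.standard_normal((2, M))
z1 = np.sqrt(var1) * e1
z2 = np.sqrt(var2) * (corr * e1 + np.sqrt(1.0 - corr**2) * e2)


def slope_and_intercept(x, y):
    """Unweighted least-squares fit of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept


# Cross-trait regression: the slope estimates sqrt(N1 * N2) * rho_g / M,
# and the intercept absorbs the sample-overlap term.
slope12, intercept12 = slope_and_intercept(ld, z1 * z2)
rho_g_hat = slope12 * M / np.sqrt(N1 * N2)

# Single-trait regressions: the slope estimates N_i * h2_i / M.
h2_1_hat = slope_and_intercept(ld, z1**2)[0] * M / N1
h2_2_hat = slope_and_intercept(ld, z2**2)[0] * M / N2

rg_hat = rho_g_hat / np.sqrt(h2_1_hat * h2_2_hat)
print(f"rho_g: true {rho_g:.3f}, estimate {rho_g_hat:.3f}")
print(f"r_g:   true {true_rg:.3f}, estimate {rg_hat:.3f}")
print(f"intercept: true {true_intercept:.3f}, estimate {intercept12:.3f}")
```

In this sketch the overlap term shifts only the fitted intercept, so the slope-based estimate of $\rho_g$ stays close to the simulated value even though half of the second study overlaps the first.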

Simulations

We performed a series of simulations to evaluate the robustness of the model to potential confounders such as sample overlap and model misspecification, and to verify the accuracy of the standard error estimates (Methods).

Table 1 shows cross-trait LD Score regression estimates and standard errors from 1,000 simulations of quantitative traits. For each simulation replicate, we generated two phenotypes for each of 2,062 individuals in our sample by drawing effect sizes for approximately 600,000 SNPs on chromosome 2 from a bivariate normal distribution. We then computed summary statistics for both phenotypes and estimated heritability and genetic correlation with cross-trait LD Score regression. The summary statistics were generated from completely overlapping samples. These simulations confirm that cross-trait LD Score regression yields accurate estimates of the true genetic correlation and that the standard errors match the standard deviation across simulations. Thus, cross-trait LD Score regression is not biased by sample overlap, in contrast to estimation of genetic correlation via polygenic risk scores, which is biased in the presence of sample overlap [20]. We also evaluated simulations with one quantitative trait and one case/control study and show that cross-trait LD Score regression can be applied to binary traits and is not biased by oversampling of cases (Table S1).

Table 1:

Simulations with complete sample overlap. Truth shows the true parameter values. Estimate shows the average cross-trait LD Score regression estimate across 1000 simulations. SD shows the standard deviation of the estimates across 1000 simulations, and SE shows the mean cross-trait LD Score regression SE across 1000 simulations. Further details of the simulation setup are given in the Methods.

Estimates of heritability and genetic covariance can be biased if the underlying model of genetic architecture is misspecified, e.g., if variance explained is correlated with LD Score or MAF [21, 23]. Because genetic correlation is estimated as a ratio, it is more robust: biases that affect the numerator and the denominator in the same direction tend to cancel. We obtain approximately correct estimates of genetic correlation even in simulations with models of genetic architecture where our estimates of heritability and genetic covariance are biased (Table S2).
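The cancellation argument can be made concrete with a short worked equation. As a simplifying assumption (not a claim about any particular misspecified model), suppose the estimates of both heritabilities and of the genetic covariance are inflated or deflated by the same factor $c > 0$:

```latex
\hat{r}_g
  = \frac{c\,\rho_g}{\sqrt{\left(c\,h_1^2\right)\left(c\,h_2^2\right)}}
  = \frac{c\,\rho_g}{c\,\sqrt{h_1^2 h_2^2}}
  = \frac{\rho_g}{\sqrt{h_1^2 h_2^2}}
  = r_g .
```

When the three biases differ, the cancellation is only partial, which is why the text describes the ratio as more robust rather than unbiased.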

Replication of Psychiatric Cross-Disorder Results

As technical validation, we replicated the estimates of genetic correlations among psychiatric disorders obtained with individual genotypes and REML in [16], by applying cross-trait LD Score regression to summary statistics from the same data [24]. These summary statistics were generated from non-overlapping samples, so we applied cross-trait LD Score regression using both unconstrained and constrained intercepts (Methods). Results from these analyses are shown in Figure 1. As expected, the results from cross-trait LD Score regression were similar to the results from REML. Cross-trait LD Score regression with constrained intercept gave standard errors that were only slightly larger than those from REML, while the standard errors from cross-trait LD Score regression with unconstrained intercept were substantially larger, especially for traits with small sample sizes (e.g., ADHD, ASD).

Figure 1:

Replication of Psychiatric Cross-Disorder Results. This plot compares cross-trait LD Score regression estimates of genetic correlation using the summary statistics from [24] to estimates obtained from REML with the same data [16]. The horizontal axis indicates pairs of phenotypes, and the vertical axis indicates genetic correlation. Error bars are standard errors. Green is REML; orange is LD Score regression with unconstrained intercept; and white is LD Score regression with constrained intercept. The estimates of genetic correlation among psychiatric phenotypes in Figure 2 use larger sample sizes; this analysis is intended as a technical validation. Abbreviations: ADHD = attention deficit hyperactivity disorder; ASD = autism spectrum disorder; BPD = bipolar disorder; MDD = major depressive disorder; SCZ = schizophrenia.

Application to Summary Statistics From 25 Phenotypes

We used cross-trait LD Score regression to estimate genetic correlations among 25 phenotypes (URLs, Methods). Genetic correlation estimates for all 300 pairwise combinations of the 25 traits are shown in Figure 2. For clarity of presentation, the 25 phenotypes were restricted to contain only one phenotype from each cluster of closely related phenotypes (Methods). Genetic correlations among the educational, anthropometric, smoking, and insulin-related phenotypes that were excluded from Figure 2 are shown in Table S4 and Figures S1, S2 and S3, respectively. References and sample sizes are shown in Table S3.

Figure 2:

Genetic Correlations among 25 GWAS. Blue represents positive genetic correlations; red represents negative. Larger squares correspond to more significant p-values. Genetic correlations that are different from zero at 1% FDR are shown as full-sized squares. Genetic correlations that are significantly different from zero after Bonferroni correction for the 300 tests in this figure have an asterisk. We show results that do not pass multiple testing correction as smaller squares in order to avoid whiting out positive controls where the estimate points in the expected direction, but does not achieve statistical significance due to small sample size. This multiple testing correction is conservative, since the tests are not independent.
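The two multiple-testing thresholds mentioned in the caption can be sketched as follows. The caption does not name the FDR procedure or the family-wise level used for Bonferroni correction, so the Benjamini-Hochberg step-up rule and the level of 0.05 below are assumptions, and the p-values are made up for illustration.

```python
import math

import numpy as np


def benjamini_hochberg(pvals, fdr):
    """Boolean mask of p-values rejected by the Benjamini-Hochberg step-up rule."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = fdr * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.nonzero(passed)[0].max()      # largest rank meeting its threshold
        reject[order[: k + 1]] = True
    return reject


n_traits = 25
n_tests = math.comb(n_traits, 2)             # number of unordered trait pairs

# Family-wise level of 0.05 is an assumption; the text does not state it.
bonferroni_cutoff = 0.05 / n_tests

# Made-up p-values, padded with clearly null results to reach n_tests entries.
pvals = np.array([1e-20, 1e-5, 8e-5, 1.5e-4, 0.04, 0.5] + [0.9] * (n_tests - 6))

fdr_mask = benjamini_hochberg(pvals, fdr=0.01)
bonferroni_mask = pvals < bonferroni_cutoff

print(f"{n_tests} tests, Bonferroni cutoff {bonferroni_cutoff:.2e}")
print(f"pass 1% FDR: {fdr_mask.sum()}, pass Bonferroni: {bonferroni_mask.sum()}")
```

Both procedures treat the 300 tests as a single family; as the caption notes, the tests are not independent, so the correction is conservative.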

For the majority of pairs of traits in Figure 2, no GWAS-based genetic correlation estimate has been reported; however, many associations have been described informally based on the observation of overlap among genome-wide significant loci. Examples of genetic correlations that are consistent with overlap among top loci include the correlations between plasma lipids and cardiovascular disease [10]; age at onset of menarche and obesity [25]; type 2 diabetes, obesity, fasting glucose, plasma lipids and cardiovascular disease [26]; birth weight, adult height and type 2 diabetes [27, 28]; birth length, adult height and infant head circumference [29, 30]; and childhood obesity and adult obesity [29]. For many of these pairs of traits, we can reject the null hypothesis of zero genetic correlation with overwhelming statistical significance (e.g., $p < 10^{-20}$ for age at onset of menarche and obesity).

The first section of Table 2 lists genetic correlation results that are consistent with epidemiological associations, but, as far as we are aware, have not previously been reported using genetic data. The estimates of the genetic correlation between age at onset of menarche and adult height [31], triglycerides [32] and type 2 diabetes [32, 33] are consistent with the epidemiological associations.

Table 2:

Genetic correlation estimates, standard errors and p-values for selected pairs of traits. Results are grouped into genetic correlations that are new genetic results, but are consistent with established epidemiological associations (“Epidemiological”), genetic correlations that are new both to genetics and epidemiology (“New/Nonzero”) and interesting null results (“New/Low”). The p-values are uncorrected p-values. Results that pass multiple testing correction for the 300 tests in Figure 2 at 1% FDR have a single asterisk; results that pass Bonferroni correction have two asterisks. We present some genetic correlations that agree with epidemiological associations but that do not pass multiple testing correction in these data.

The estimate of a negative genetic correlation between anorexia nervosa and obesity suggests that the same genetic factors influence normal variation in BMI as well as dysregulated BMI in psychiatric illness. This result is consistent with the observation that BMI GWAS findings implicate neuronal, rather than metabolic, cell-types and epigenetic marks [34, 35]. The negative genetic correlation between adult height and coronary artery disease agrees with a replicated epidemiological association [36–38]. We observe several significant associations with the educational attainment phenotypes from Rietveld et al. [39]: we estimate a statistically significant negative genetic correlation between college and Alzheimer’s disease, which agrees with epidemiological results [40, 41]. The positive genetic correlation between college and bipolar disorder is consistent with previous epidemiological reports [42, 43]. The estimate of a negative genetic correlation between smoking and college is consistent with the observed differences in smoking rates as a function of educational attainment [44].

The second section of Table 2 lists three results that are, to the best of our knowledge, new both to genetics and epidemiology. One, we find a positive genetic correlation between anorexia nervosa and schizophrenia. Comorbidity between eating and psychotic disorders has not been thoroughly investigated in the psychiatric literature [45, 46], and this result raises the possibility of similarity between these classes of disease. Two, we estimate a negative genetic correlation between ulcerative colitis (UC) and childhood obesity. The relationship between premorbid BMI and ulcerative colitis is not well understood; exploring this relationship may be a fruitful direction for further investigation. Three, we estimate a positive genetic correlation between autism spectrum disorder (ASD) and educational attainment, which itself has very high genetic correlation with IQ [39, 47, 48]. The ASD summary statistics were generated using a case-pseudocontrol study design, so this result cannot be explained by the tendency for the parents of children who receive a diagnosis of ASD to be better educated than the general population [49]. The distribution of IQ among individuals with ASD has lower mean than the general population, but with heavy tails [50] (i.e., an excess of individuals with low and high IQ). There is evidence that the genetic architectures of high IQ and low IQ ASD are dissimilar [51].

The third section of Table 2 lists interesting examples where the genetic correlation is close to zero with small standard error. The low genetic correlation between schizophrenia and rheumatoid arthritis is interesting because schizophrenia has been observed to be protective for rheumatoid arthritis [52], though the epidemiological effect is weak, so it is possible that there is a real genetic correlation, but it is too small for us to detect. The low genetic correlation between schizophrenia and smoking is notable because of the high prevalence of smoking among individuals with schizophrenia [53]. The low genetic correlation between schizophrenia and plasma lipid levels contrasts with a previous report of pleiotropy between schizophrenia and triglycerides [54]. Pleiotropy (unsigned) is different from genetic correlation (signed; see Methods); however, the pleiotropy reported by Andreassen et al. [54] could be explained by the sensitivity of the method used to the properties of a small number of regions with strong LD, rather than trait biology (Figure S5). We estimate near-zero genetic correlation between Alzheimer’s disease and schizophrenia. The genetic correlations between Alzheimer’s disease and the other psychiatric traits (anorexia nervosa, bipolar disorder, major depression, ASD) are also close to zero, but with larger standard errors, due to smaller sample sizes. This suggests that the genetic basis of Alzheimer’s disease is distinct from psychiatric conditions. Last, we estimate near-zero genetic correlation between rheumatoid arthritis (RA) and both Crohn’s disease (CD) and UC. Although these diseases share many associated loci [55, 56], there appears to be no directional trend: some RA risk alleles are also risk alleles for UC and CD, but many RA risk alleles are protective for UC and CD [55], yielding near-zero genetic correlation. This example highlights the distinction between pleiotropy and genetic correlation (Methods).

Finally, the estimates of genetic correlations among metabolic traits are consistent with the estimates obtained using REML in Vattikuti et al. [17] (Supplementary Table S4), and are directionally consistent with the recent Mendelian randomization results from Wuertz et al. [57]. The estimate of 0.57 (0.074) for the genetic correlation between CD and UC is consistent with the estimate of 0.62 (0.042) from Chen et al. [18].

Discussion

We have described a new method for estimating genetic correlation from GWAS summary statistics, which we applied to a dataset of GWAS summary statistics consisting of 25 traits and more than 1.5 million unique phenotype measurements. We reported several new findings that would have been difficult or impossible to obtain with existing methods, including a positive genetic correlation between anorexia nervosa and schizophrenia. Our method replicated many previously-reported GWAS-based genetic correlations, and confirmed observations of overlap among genome-wide significant SNPs, MR results and epidemiological associations.

This method is an advance for several reasons: it does not require individual genotypes, genome-wide significant SNPs or LD-pruning (which loses information if causal SNPs are in LD). Our method is not biased by sample overlap and is computationally fast. Furthermore, our approach does not require measuring multiple traits on the same individuals, so it scales easily to studies of thousands of pairs of traits. These advantages allow us to estimate genetic correlation for many more pairs of phenotypes than was possible with existing methods.

The challenges in interpreting genetic correlation are similar to the challenges in MR. We highlight two difficulties. First, genetic correlation is immune to environmental confounding, but is subject to genetic confounding, analogous to confounding by pleiotropy in MR. For example, the genetic correlation between HDL and CAD in Figure 2 could result from a causal effect HDL → CAD, but could also be mediated by triglycerides (TG) [10, 58], represented graphically [59] as HDL ← G → TG → CAD, where G is the set of genetic variants with effects on both HDL and TG. Extending genetic correlation to multiple genetically correlated phenotypes is an important direction for future work [60]. Second, although genetic correlation estimates are not biased by oversampling of cases, they are affected by other forms of selection bias, such as misclassification [16].

We note several limitations of cross-trait LD Score regression as an estimator of genetic correlation. First, cross-trait LD Score regression requires larger sample sizes than methods that use individual genotypes in order to achieve equivalent standard error. Second, cross-trait LD Score regression is not currently applicable to samples from recently-admixed populations. Third, we have not investigated the potential impact of assortative mating on estimates of genetic correlation, which remains as a future direction. Fourth, methods built from polygenic models, such as cross-trait LD Score regression and REML, are most effective when applied to traits with polygenic genetic architectures. For traits where significant SNPs account for a sizable proportion of heritability, analyzing only these SNPs can be more powerful. Developing methods that make optimal use of both large-effect SNPs and diffuse polygenic signal is a direction for future research.

Despite these limitations, we believe that the cross-trait LD Score regression estimator of genetic correlation will be a useful addition to the epidemiological toolbox, since it allows for rapid screening for correlations among a diverse set of traits, without the need for measuring multiple traits on the same individuals or genome-wide significant SNPs.

Methods

Definition of Genetic Covariance and Correlation

All definitions refer to narrow-sense heritabilities and genetic covariances. Let $S$ denote a set of $M$ SNPs, let $X$ denote a vector of additively (0-1-2) coded genotypes for the SNPs in $S$, and let $y_1$ and $y_2$ denote phenotypes. Define $\beta := \operatorname{argmax}_{\alpha \in \mathbb{R}^M} \operatorname{Cor}[y_1, X\alpha]^2$, where the maximization is performed in the population (i.e., in the infinite data limit). Let $\gamma$ denote the corresponding vector for $y_2$. This is a projection, so $\beta$ is unique modulo SNPs in perfect LD. Define $h_S^2$, the heritability explained by SNPs in $S$, as $h_S^2(y_1) := \sum_{j=1}^{M} \beta_j^2$ and $\rho_S(y_1, y_2)$, the genetic covariance among SNPs in $S$, as $\rho_S(y_1, y_2) := \sum_{j \in S} \beta_j \gamma_j$. The genetic correlation among SNPs in $S$ is $r_S(y_1, y_2) := \rho_S(y_1, y_2) / \sqrt{h_S^2(y_1)\, h_S^2(y_2)}$, which lies in $[-1, 1]$. Following [13], we use subscript $g$ (as in $h_g^2$, $\rho_g$, $r_g$) when the set of SNPs is genotyped and imputed SNPs in GWAS.
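As a small numerical illustration of these definitions, the sketch below computes the heritabilities, genetic covariance and genetic correlation from two per-SNP effect vectors. It assumes effects on standardized genotypes and phenotypes, so that sums of squared effects can be read as variance explained; the function name and the effect values are hypothetical.

```python
import numpy as np


def genetic_summary(beta, gamma):
    """Return (h2 of trait 1, h2 of trait 2, genetic covariance, genetic correlation).

    beta and gamma are per-SNP effects on two traits over the same SNP set S,
    assumed to be on the standardized scale (an assumption of this sketch).
    """
    beta = np.asarray(beta, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    h2_1 = np.sum(beta**2)
    h2_2 = np.sum(gamma**2)
    rho = np.sum(beta * gamma)
    r = rho / np.sqrt(h2_1 * h2_2)
    return h2_1, h2_2, rho, r


# Hypothetical effects for four SNPs.
beta = [0.3, 0.0, -0.2, 0.1]
gamma = [0.15, 0.1, -0.1, 0.0]

h2_1, h2_2, rho, r = genetic_summary(beta, gamma)
print(f"h2_1={h2_1:.4f}  h2_2={h2_2:.4f}  rho={rho:.4f}  r={r:.4f}")
```

The bound $|r_S| \le 1$ follows from the Cauchy-Schwarz inequality applied to the two effect vectors.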

SNP genetic correlation ($r_g$) is different from family study genetic correlation. In a family study, the relationship matrix captures information about all genetic variation, not just common SNPs. As a result, family studies estimate the total genetic correlation ($S$ equals all variants). Unlike the relationship between SNP-heritability [13] and total heritability, for which $h_g^2 \le h^2$, no similar relationship holds between SNP genetic correlation and total genetic correlation. If $\beta$ and $\gamma$ are more strongly correlated among common variants than rare variants, then the total genetic correlation will be less than the SNP genetic correlation.

Genetic correlation is (asymptotically) proportional to Mendelian randomization estimates. If we use a genetic instrument $g_i := \sum_{j \in S} X_{ij} \beta_j$ to estimate the effect $b_{12}$ of $y_1$ on $y_2$, the 2SLS estimate is $\hat{b}_{2SLS} := g^{\mathsf{T}} y_2 / g^{\mathsf{T}} y_1$ [8]. The expectations of the numerator and denominator are $\mathrm{E}[g^{\mathsf{T}} y_2] = \rho_S(y_1, y_2)$ and $\mathrm{E}[g^{\mathsf{T}} y_1] = h_S^2(y_1)$. Thus, $\operatorname{plim}_{N \to \infty} \hat{b}_{2SLS} = r_S(y_1, y_2) \sqrt{h_S^2(y_2) / h_S^2(y_1)}$. If we use the same set $S$ of SNPs to estimate $b_{12}$ and $b_{21}$ (e.g., if $S$ is the set of all common SNPs, as in the genetic correlation analyses in this paper), then this procedure is symmetric in $y_1$ and $y_2$.

Genetic correlation is different from pleiotropy. Two traits have a pleiotropic relationship if many variants affect both. Genetic correlation is a stronger condition than pleiotropy: to exhibit genetic correlation, the directions of effect must also be consistently aligned.
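The distinction can be seen in a toy example. In the sketch below every variant affects both traits (complete pleiotropy), with effect magnitudes drawn from an exponential distribution; the distribution, scale and number of variants are arbitrary choices for illustration. Only the signs differ between the two scenarios.

```python
import numpy as np


def genetic_correlation(beta, gamma):
    """Correlation of effects: sum(beta * gamma) / sqrt(sum(beta^2) * sum(gamma^2))."""
    return np.sum(beta * gamma) / np.sqrt(np.sum(beta**2) * np.sum(gamma**2))


rng = np.random.default_rng(1)
n_variants = 10_000

# Every variant has a nonzero effect on both traits.
magnitude = rng.exponential(scale=0.01, size=n_variants)
sign_1 = rng.choice([-1.0, 1.0], size=n_variants)
sign_2 = rng.choice([-1.0, 1.0], size=n_variants)   # drawn independently of sign_1

beta = magnitude * sign_1
gamma_aligned = 0.5 * magnitude * sign_1      # same direction of effect as beta
gamma_unaligned = 0.5 * magnitude * sign_2    # direction unrelated to beta

r_aligned = genetic_correlation(beta, gamma_aligned)
r_unaligned = genetic_correlation(beta, gamma_unaligned)
print(f"aligned directions:   r = {r_aligned:.3f}")
print(f"unaligned directions: r = {r_unaligned:.3f}")
```

Both scenarios share exactly the same set of associated variants, yet only the one with consistently aligned directions of effect produces a genetic correlation far from zero.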

Cross-Trait LD Score Regression

We estimate genetic covariance by regressing $z_{1j} z_{2j}$ against $\ell_j \sqrt{N_{1j} N_{2j}}$ (where $N_{ij}$ is the sample size for SNP $j$ in study $i$), then multiplying the resulting slope by $M$, the number of SNPs in the reference panel with MAF between 5% and 50% (technically, this is an estimate of $\rho_{5\text{-}50\%}$; see the Supplementary Note).

If we know the amount of sample overlap ahead of time, we can reduce the standard error by constraining the intercept with the --constrain-intercept flag in ldsc. This works even if there is nonzero sample overlap, in which case the intercept should be constrained to $\rho N_s / \sqrt{N_1 N_2}$.
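The gain from constraining the intercept can be illustrated with a generic least-squares sketch. It does not call ldsc; it assumes a toy linear model with homoskedastic noise, hypothetical values for the slope and the known intercept, and gamma-distributed LD Scores, and compares the spread of the slope estimate across replicates with and without the constraint.

```python
import numpy as np


def fit_slope(x, y, intercept=None):
    """Least-squares slope of y on x; a supplied intercept is held fixed."""
    if intercept is None:
        slope, _ = np.polyfit(x, y, 1)       # intercept estimated from the data
        return slope
    # With a fixed intercept c, the least-squares slope is sum(x * (y - c)) / sum(x^2).
    return np.sum(x * (y - intercept)) / np.sum(x * x)


rng = np.random.default_rng(2)
n_snps, n_reps = 2_000, 300
true_slope, known_intercept = 0.01, 0.2      # hypothetical values

free, constrained = [], []
for _ in range(n_reps):
    ld = rng.gamma(shape=2.0, scale=50.0, size=n_snps)   # toy LD Scores
    y = known_intercept + true_slope * ld + rng.normal(0.0, 3.0, size=n_snps)
    free.append(fit_slope(ld, y))
    constrained.append(fit_slope(ld, y, intercept=known_intercept))

sd_free, sd_constrained = np.std(free), np.std(constrained)
print(f"SD of slope, free intercept:        {sd_free:.2e}")
print(f"SD of slope, constrained intercept: {sd_constrained:.2e}")
```

The constrained fit is less variable because it does not spend information estimating the intercept; the price is bias in the slope if the assumed intercept is wrong, for example when there is unmodelled shared population stratification.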

Regression Weights

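For heritability estimation, we use the regression weights from [21]. If effect sizes for both phenotypes are drawn from a bivariate normal distribution, then the optimal regression weights for genetic covariance estimation are the reciprocals of the conditional variance

$$\operatorname{Var}[z_{1j} z_{2j} \mid \ell_j] = \left(\frac{N_1 h_1^2 \ell_j}{M} + 1\right)\left(\frac{N_2 h_2^2 \ell_j}{M} + 1\right) + \left(\frac{\sqrt{N_1 N_2}\,\rho_g}{M}\,\ell_j + \frac{\rho N_s}{\sqrt{N_1 N_2}}\right)^2 \tag{2}$$

(Supplementary Note). This quantity depends on several parameters ($h_1^2$, $h_2^2$, $\rho_g$, $\rho$, $N_s$) which are not known a priori, so it is necessary to estimate them from the data. We compute the weights in two steps: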

  1. The first regression is weighted using heritabilities from the single-trait LD Score regressions, $\rho N_s = 0$, and $\rho_g$ estimated as $\hat{\rho}_g := M \sum_j z_{1j} z_{2j} / \sum_j \ell_j \sqrt{N_{1j} N_{2j}}$.

  2. The second regression is weighted using the estimates of ρNs and ρg from step 1. The genetic covariance estimate that we report is the estimate from the second regression.

Linear regression with weights estimated from the data is called feasible generalized least squares (FGLS). FGLS has the same limiting distribution as WLS with optimal weights, so WLS p-values are valid for FGLS [8]. We multiply the heteroskedasticity weights by 1/ℓj (where ℓj is LD Score with sum over regression SNPs) in order to downweight SNPs that are overcounted. This is a heuristic: the optimal approach is to rotate the data so that it is de-correlated, but this rotation matrix is difficult to compute.
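The two-step weighting scheme can be sketched as follows. This is an illustration under stated assumptions rather than the ldsc implementation: the data are simulated with independent SNPs and toy LD Scores, the true heritabilities stand in for single-trait LD Score regression estimates, the initial $\rho_g$ is the moment estimate implied by Equation 1 with the intercept set to zero, and the same LD Score is used for the $1/\ell_j$ factor. All function names are hypothetical.

```python
import numpy as np


def gencov_weights(ld, N1, N2, M, h2_1, h2_2, rho_g, intercept):
    """Reciprocal of Var[z1*z2 | ld] under the bivariate normal model, times 1/ld."""
    l = np.maximum(ld, 1.0)                          # guard against tiny LD Scores
    a = N1 * h2_1 * l / M + 1.0
    b = N2 * h2_2 * l / M + 1.0
    c = np.sqrt(N1 * N2) * rho_g * l / M + intercept
    return 1.0 / ((a * b + c**2) * l)


def wls(x, y, w):
    """Weighted least squares of y on [1, x]; returns (intercept, slope)."""
    sw = np.sqrt(w)
    X = np.column_stack([np.ones_like(x), x]) * sw[:, None]
    coef, *_ = np.linalg.lstsq(X, y * sw, rcond=None)
    return coef[0], coef[1]


def two_step_gencov(z1, z2, ld, N1, N2, M, h2_1, h2_2):
    y = z1 * z2
    x = ld * np.sqrt(N1 * N2)                        # slope of y on x estimates rho_g / M
    # Step 1: overlap term set to zero, moment estimate of rho_g.
    rho_g0 = M * np.sum(y) / np.sum(x)
    b0, b1 = wls(x, y, gencov_weights(ld, N1, N2, M, h2_1, h2_2, rho_g0, 0.0))
    # Step 2: re-weight with the step-1 estimates, then report this fit.
    b0, b1 = wls(x, y, gencov_weights(ld, N1, N2, M, h2_1, h2_2, M * b1, b0))
    return M * b1, b0


# Simulated toy data (hypothetical parameter values).
rng = np.random.default_rng(3)
M, N1, N2, Ns, rho = 50_000, 12_500, 10_000, 5_000, 0.5
h2_1, h2_2, rho_g = 0.5, 0.4, 0.2
true_intercept = rho * Ns / np.sqrt(N1 * N2)
ld = rng.gamma(shape=2.0, scale=50.0, size=M)
var1 = N1 * h2_1 * ld / M + 1.0
var2 = N2 * h2_2 * ld / M + 1.0
corr = (np.sqrt(N1 * N2) * rho_g * ld / M + true_intercept) / np.sqrt(var1 * var2)
e1, e2 = rng.standard_normal((2, M))
z1 = np.sqrt(var1) * e1
z2 = np.sqrt(var2) * (corr * e1 + np.sqrt(1.0 - corr**2) * e2)

rho_g_hat, intercept_hat = two_step_gencov(z1, z2, ld, N1, N2, M, h2_1, h2_2)
print(f"rho_g: true {rho_g:.3f}, estimate {rho_g_hat:.3f}")
print(f"intercept: true {true_intercept:.3f}, estimate {intercept_hat:.3f}")
```

Because the weights only need to be roughly right for the estimator to remain consistent, a crude first-step guess is sufficient; the second pass mainly improves efficiency.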

Assessment of Statistical Significance via Block Jackknife

Summary statistics for SNPs in LD are correlated, so the OLS standard error will be biased downwards. We estimate a heteroskedasticity-and-correlation-robust standard error with a block jackknife over blocks of adjacent SNPs. This is the same procedure used in [21], and gives accurate standard errors in simulations (Table 1). We obtain a standard error for the genetic correlation by using a ratio block jackknife over SNPs. The default setting in ldsc is 200 blocks per genome, which can be adjusted with the --num-blocks flag.
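A delete-one-block jackknife over contiguous blocks can be sketched as below. This is a generic illustration rather than the ldsc routine: the function name and the near-equal block partition are assumptions, `values` is assumed to be ordered by genomic position, and `estimator` is any function of the retained data (for example a regression slope, or a ratio of two slopes for the genetic correlation).

```python
import math
import statistics


def block_jackknife(values, estimator, num_blocks=200):
    """Return (jackknife estimate, standard error) using contiguous blocks."""
    n = len(values)
    if num_blocks < 2 or n < num_blocks:
        raise ValueError("need at least 2 blocks and one value per block")
    # Block boundaries for near-equal contiguous blocks of adjacent SNPs.
    bounds = [i * n // num_blocks for i in range(num_blocks + 1)]
    full = estimator(values)
    pseudovalues = []
    for b in range(num_blocks):
        # Re-estimate with block b deleted.
        kept = values[:bounds[b]] + values[bounds[b + 1]:]
        pseudovalues.append(num_blocks * full - (num_blocks - 1) * estimator(kept))
    estimate = statistics.mean(pseudovalues)
    std_error = math.sqrt(statistics.variance(pseudovalues) / num_blocks)
    return estimate, std_error
```

Because whole blocks of adjacent SNPs are deleted together, correlation between nearby SNPs is absorbed into the block-to-block variability instead of being ignored.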

Computational Complexity

Let N denote sample size and M the number of SNPs. The computational complexity of each step involved in LD Score regression is as follows:

  1. Computing summary statistics takes $O(MN)$ time.

  2. Computing LD Scores takes $O(MN)$ time, though the N for computing LD Scores need not be large. We use the N = 378 Europeans from 1000 Genomes.

  3. LD Score regression takes $O(M)$ time and space.

For a user who has already computed summary statistics and downloads LD Scores from our website (URLs), the computational cost of LD Score regression is $O(M)$ time and space. For comparison, REML takes $O(MN^2)$ time for computing the GRM and $O(N^3)$ time for maximizing the likelihood.

Practically, estimating LD Scores takes roughly an hour parallelized over chromosomes, and LD Score regression takes about 15 seconds per pair of phenotypes on a 2014 MacBook Air with a 1.7 GHz Intel Core i7 processor.

Simulations

We simulated quantitative traits under an infinitesimal model in 2062 controls from a Swedish study. To simulate the standard scenario where many causal SNPs are not genotyped, we simulated phenotypes by drawing causal SNPs from 622,146 best-guess imputed 1000 Genomes SNPs on chromosome 2, then retained only the 90,980 HM3 SNPs with MAF above 5% for LD Score regression.

We note that the simulations in [21] show that single-trait LD Score regression is only minimally biased by uncorrected population stratification and moderate ancestry mismatch between the reference panel used for estimating LD Scores and the population sampled in GWAS. In particular, LD Scores estimated from the 1000 Genomes reference panel are suitable for use with European-ancestry meta-analyses. Put another way, LD Score is only minimally correlated with FST, and the differences in LD Score among European populations are not so large as to bias LD Score regression. Since we use the same LD Scores for cross-trait LD Score regression as for single-trait LD Score regression, these results extend to cross-trait LD Score regression.

Summary Statistic Datasets

We selected traits for inclusion in the main text via the following procedure:

  1. Begin with all publicly available non-sex-stratified European-only summary statistics.

  2. Remove studies that do not provide signed summary statistics.

  3. Remove studies not imputed to at least HapMap 2.

  4. Remove studies that include heritable covariates [61].

  5. Remove all traits with heritability z-score below 4. Genetic correlation estimates for traits with heritability z-score below 4 are generally too noisy to interpret.

  6. Prune clusters of correlated phenotypes (e.g., obesity classes 1-3) by picking the trait from each cluster with the highest heritability z-score.

We then applied the following filters (implemented in the script sumstats_to_chisq.py included with ldsc):

  1. For studies that provide a measure of imputation quality, filter to INFO above 0.9.

  2. For studies that provide sample MAF, filter to sample MAF above 1%.

  3. In order to restrict to well-imputed SNPs in studies that do not provide a measure of imputation quality, filter to HapMap3 [62] SNPs with 1000 Genomes EUR MAF above 5%, which tend to be well-imputed in most studies. This step should be skipped if INFO scores are available for all studies.

  4. If sample size varies from SNP to SNP, remove SNPs with effective sample size less than 0.67 times the 90th percentile of sample size.

  5. Remove indels and structural variants.

  6. Remove strand-ambiguous SNPs.

  7. Remove SNPs whose alleles do not match the alleles in 1000 Genomes.

  8. Because the presence of outliers can increase the regression standard error, we also removed SNPs with extremely large effect sizes (χ2 > 80, as in [21]).
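Several of the filters above can be sketched as a single pass over the summary statistics. This is not the sumstats_to_chisq.py script: the row keys (`snp`, `a1`, `a2`, `info`, `maf`, `n`, `chisq`), the function name, and the nearest-rank definition of the 90th percentile are assumptions made for illustration. The HapMap3 and 1000 Genomes allele-matching steps are omitted because they need reference data.

```python
BASES = {"A", "C", "G", "T"}
# Allele pairs that read the same on both strands.
STRAND_AMBIGUOUS = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}


def filter_sumstats(rows, info_min=0.9, maf_min=0.01, n_frac=0.67, chisq_max=80.0):
    """Return the rows (dicts) that pass the QC filters sketched here."""
    if not rows:
        return []
    sizes = sorted(r["n"] for r in rows)
    # Nearest-rank 90th percentile of sample size: index ceil(0.9 * len) - 1.
    p90 = sizes[(9 * len(sizes) + 9) // 10 - 1]
    kept = []
    for r in rows:
        a1, a2 = r["a1"].upper(), r["a2"].upper()
        if r.get("info") is not None and r["info"] <= info_min:
            continue  # poorly imputed
        if r.get("maf") is not None and r["maf"] <= maf_min:
            continue  # rare in the study sample
        if r["n"] < n_frac * p90:
            continue  # low effective sample size
        if a1 not in BASES or a2 not in BASES:
            continue  # indel or structural variant
        if (a1, a2) in STRAND_AMBIGUOUS:
            continue  # strand-ambiguous SNP
        if r["chisq"] > chisq_max:
            continue  # outlier with extremely large effect
        kept.append(r)
    return kept
```

Rows that lack `info` or `maf` skip the corresponding filter, mirroring the "for studies that provide" wording of the first two steps.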

Genomic control (GC) correction at any stage biases the heritability and genetic covariance estimates downwards (see the Supplementary Note of [21]). The biases in the numerator and denominator of genetic correlation cancel exactly, so genetic correlation is not biased by GC correction. A majority of the studies analyzed in this paper used GC correction, so we do not report genetic covariance and heritability.
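The cancellation can be checked with a short calculation. As a sketch, assume study i divided its χ² statistics by a single GC factor λ_i (so its z-scores are divided by the square root of λ_i), that intercepts are unconstrained, and that the regression weights are held fixed. The symbols λ_i and the tildes (GC-corrected quantities) are notation introduced here, not taken from the paper.

```latex
\begin{align*}
\tilde{z}_{ij} &= \frac{z_{ij}}{\sqrt{\lambda_i}}, \qquad
\tilde{h}_i^2 = \frac{\hat{h}_i^2}{\lambda_i}, \qquad
\tilde{\rho}_g = \frac{\hat{\rho}_g}{\sqrt{\lambda_1 \lambda_2}}, \\
\tilde{r}_g &= \frac{\tilde{\rho}_g}{\sqrt{\tilde{h}_1^2\,\tilde{h}_2^2}}
  = \frac{\hat{\rho}_g / \sqrt{\lambda_1 \lambda_2}}
         {\sqrt{\hat{h}_1^2\,\hat{h}_2^2 / (\lambda_1 \lambda_2)}}
  = \frac{\hat{\rho}_g}{\sqrt{\hat{h}_1^2\,\hat{h}_2^2}}
  = \hat{r}_g .
\end{align*}
```

The first line holds because rescaling the dependent variable of a linear regression rescales its slope by the same factor: the heritability regressions use the squared z-scores and the covariance regression uses the product of z-scores.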

Data on Alzheimer’s disease were obtained from the following source: International Genomics of Alzheimer’s Project (IGAP) is a large two-stage study based upon genome-wide association studies (GWAS) on individuals of European ancestry. In stage 1, IGAP used genotyped and imputed data on 7,055,881 single nucleotide polymorphisms (SNPs) to meta-analyze four previously-published GWAS datasets consisting of 17,008 Alzheimer’s disease cases and 37,154 controls (The European Alzheimer’s Disease Initiative, EADI; the Alzheimer Disease Genetics Consortium, ADGC; The Cohorts for Heart and Aging Research in Genomic Epidemiology consortium, CHARGE; The Genetic and Environmental Risk in AD consortium, GERAD). In stage 2, 11,632 SNPs were genotyped and tested for association in an independent set of 8,572 Alzheimer’s disease cases and 11,312 controls. Finally, a meta-analysis was performed combining results from stages 1 and 2.

We only used stage 1 data for LD Score regression.

URLs

  1. ldsc software: github.com/bulik/ldsc

  2. This paper: github.com/bulik/gencor_tex

  3. PGC (psychiatric) summary statistics: www.med.unc.edu/pgc/downloads

  4. GIANT (anthropometric) summary statistics: www.broadinstitute.org/collaboration/giant/index.php/GIANT_consortium_data_files

  5. EGG (Early Growth Genetics) summary statistics: www.egg-consortium.org/

  6. MAGIC (insulin, glucose) summary statistics: www.magicinvestigators.org/downloads/

  7. CARDIoGRAM (coronary artery disease) summary statistics: www.cardiogramplusc4d.org

  8. DIAGRAM (T2D) summary statistics: www.diagram-consortium.org

  9. Rheumatoid arthritis summary statistics: www.broadinstitute.org/ftp/pub/rheumatoid_arthritis/Stahl_etal_2010NG/

  10. IGAP (Alzheimer’s) summary statistics: www.pasteur-lille.fr/en/recherche/u744/igap/igap_download.php

  11. IIBDGC (inflammatory bowel disease) summary statistics: www.ibdgenetics.org/downloads.html

    We used a newer version of these data with 1000 Genomes imputation.

  12. Plasma lipid summary statistics: www.broadinstitute.org/mpg/pubs/lipids2010/

  13. SSGAC (educational attainment) summary statistics: www.ssgac.org/

  14. Beans: www.barismo.com www.bluebottlecoffee.com

Author Contributions

MJD provided reagents. BMN and ALP provided reagents. CL, ER, VA, JP and FD aided in the interpretation of results. JP and FD provided data on age at onset of menarche. The caffeine molecule is responsible for all that is good about this manuscript. BBS and HKF are responsible for the rest. All authors revised and approved the final manuscript.

Competing Financial Interests

We have no financial conflicts of interest to declare.

Collaborators

Collaborators from the Psychiatric Genomics Consortium were, in alphabetical order: Devin Absher, Rolf Adolfsson, Ingrid Agartz, Esben Agerbo, Huda Akil, Margot Albus, Madeline Alexander, Farooq Amin, Ole A Andreassen, Adebayo Anjorin, Richard Anney, Dan Arking, Philip Asherson, Maria H Azevedo, Silviu A Bacanu, Lena Backlund, Judith A Badner, Tobias Banaschewski, Jack D Barchas, Michael R Barnes, Thomas B Barrett, Nicholas Bass, Michael Bauer, Monica Bayes, Martin Begemann, Frank Bellivier, Judit Bene, Sarah E Bergen, Thomas Bettecken, Elizabeth Bevilacqua, Joseph Biederman, Tim B Bigdeli, Elisabeth B Binder, Donald W Black, Douglas HR Blackwood, Cinnamon S Bloss, Michael Boehnke, Dorret I Boomsma, Anders D Borglum, Elvira Bramon, Gerome Breen, Rene Breuer, Richard Bruggeman, Nancy G Buccola, Randy L Buckner, Jan K Buitelaar, Brendan Bulik-Sullivan, William E Bunner, Margit Burmeister, Joseph D Buxbaum, William F Byerley, Sian Caesar, Wiepke Cahn, Guiqing Cai, Murray J Cairns, Dominique Campion, Rita M Cantor, Vaughan J Carr, Noa Carrera, Miquel Casas, Stanley V Catts, Aravinda Chakravarti, Kimberley D Chambert, Raymond CK Chan, Eric YH Chen, Ronald YL Chen, Wei Cheng, Eric FC Cheung, Siow Ann Chong, Khalid Choudhury, Sven Cichon, David St Clair, C Robert Cloninger, David Cohen, Nadine Cohen, David A Collier, Edwin Cook, Hilary Coon, Bru Cormand, Paul Cormican, Aiden Corvin, William H Coryell, Nicholas Craddock, David W Craig, Ian W Craig, Benedicto Crespo-Facorro, James J Crowley, David Curtis, Darina Czamara, Mark J Daly, Ariel Darvasi, Susmita Datta, Michael Davidson, Kenneth L Davis, Richard Day, Franziska Degenhardt, Lynn E DeLisi, Ditte Demontis, Bernie Devlin, Dimitris Dikeos, Timothy Dinan, Srdjan Djurovic, Enrico Domenici, Gary Donohoe, Alysa E Doyle, Elodie Drapeau, Jubao Duan, Frank Dudbridge, Naser Durmishi, Howard J Edenberg, Hannelore Ehrenreich, Peter Eichhammer, Amanda Elkin, Johan Eriksson, Valentina Escott-Price, Tonu Esko, Laurent Essioux, Bruno 
Etain, Ayman H Fanous, Stephen V Faraone, Kai-How Farh, Anne E Farmer, Martilias S Farrell, Jurgen Del Favero, Manuel A Ferreira, I Nicol Ferrier, Matthew Flickinger, Tatiana Foroud, Josef Frank, Barbara Franke, Lude Franke, Christine Fraser, Robert Freedman, Nelson B Freimer, Marion Friedl, Joseph I Friedman, Louise Frisen, Menachem Fromer, Pablo V Gejman, Giulio Genovese, Lyudmila Georgieva, Elliot S Gershon, Eco J De Geus, Ina Giegling, Michael Gill, Paola Giusti-Rodriguez, Stephanie Godard, Jacqueline I Goldstein, Vera Golimbet, Srihari Gopal, Scott D Gordon, Katherine Gordon-Smith, Jacob Gratten, Elaine K Green, Tiffany A Greenwood, Gerard Van Grootheest, Magdalena Gross, Detelina Grozeva, Weihua Guan, Hugh Gurling, Omar Gustafsson, Lieuwe de Haan, Hakon Hakonarson, Steven P Hamilton, Christian Hammer, Marian L Hamshere, Mark Hansen, Thomas F Hansen, Vahram Haroutunian, Annette M Hartmann, Martin Hautzinger, Andrew C Heath, Anjali K Henders, Frans A Henskens, Stefan Herms, Ian B Hickie, Maria Hipolito, Joel N Hirschhorn, Susanne Hoefels, Per Hoffmann, Andrea Hofman, Mads V Hollegaard, Peter A Holmans, Florian Holsboer, Witte J Hoogendijk, Jouke Jan Hottenga, David M Hougaard, Hailiang Huang, Christina M Hultman, Masashi Ikeda, Andres Ingason, Marcus Ising, Nakao Iwata, Assen V Jablensky, Stephane Jamain, Inge Joa, Edward G Jones, Ian Jones, Lisa Jones, Erik G Jonsson, Milan Macek Jr, Richard A Belliveau Jr, Antonio Julia, Tzeng JungYing, Anna K Kahler, Rene S Kahn, Luba Kalaydjieva, Radhika Kandaswamy, Sena Karachanak-Yankova, Juha Karjalainen, David Kavanagh, Matthew C Keller, Brian J Kelly, John R Kelsoe, Kenneth S Kendler, James L Kennedy, Elaine Kenny, Lindsey Kent, Jimmy Lee Chee Keong, Andrey Khrunin, Yunjung Kim, George K Kirov, Janis Klovins, Jo Knight, James A Knowles, Martin A Kohli, Daniel L Koller, Bettina Konte, Ania Korszun, Robert Krasucki, Vaidutis Kucinskas, Zita Ausrele Kucinskiene, Jonna Kuntsi, Hana Kuzelova-Ptackova, Phoenix Kwan, Mikael 
Landen, Niklas Langstrom, Mark Lathrop, Claudine Laurent, Jacob Lawrence, William B Lawson, Marion Leboyer, Phil Hyoun Lee, S Hong Lee, Sophie E Legge, Todd Lencz, Bernard Lerer, Klaus-Peter Lesch, Douglas F Levinson, Cathryn M Lewis, Jun Li, Miaoxin Li, Qingqin S Li, Tao Li, Kung-Yee Liang, Paul Lichtenstein, Jeffrey A Lieberman, Svetlana Limborska, Danyu Lin, Chunyu Liu, Jianjun Liu, Falk W Lohoff, Jouko Lonnqvist, Sandra K Loo, Carmel M Loughland, Jan Lubinski, Susanne Lucae, Donald MacIntyre, Pamela AF Madden, Patrik KE Magnusson, Brion S Maher, Pamela B Mahon, Wolfgang Maier, Anil K Malhotra, Jacques Mallet, Sara Marsal, Nicholas G Martin, Manuel Mattheisen, Keith Matthews, Morten Mattingsdal, Robert W McCarley, Steven A McCarroll, Colm McDonald, Kevin A McGhee, James J McGough, Patrick J McGrath, Peter McGuffin, Melvin G McInnis, Andrew M McIntosh, Rebecca McKinney, Alan W McLean, Francis J McMahon, Andrew McQuillin, Helena Medeiros, Sarah E Medland, Sandra Meier, Carin J Meijer, Bela Melegh, Ingrid Melle, Fan Meng, Raquelle I Mesholam-Gately, Andres Metspalu, Patricia T Michie, Christel M Middeldorp, Lefkos Middleton, Lili Milani, Vihra Milanova, Philip B Mitchell, Younes Mokrab, Grant W Montgomery, Jennifer L Moran, Gunnar Morken, Derek W Morris, Ole Mors, Preben B Mortensen, Valentina Moskvina, Bryan J Mowry, Pierandrea Muglia, Thomas W Muehleisen, Walter J Muir, Bertram Mueller-Myhsok, Kieran C Murphy, Robin M Murray, Richard M Myers, Inez Myin-Germeys, Benjamin M Neale, Michael C Neale, Mari Nelis, Stan F Nelson, Igor Nenadic, Deborah A Nertney, Gerald Nestadt, Kristin K Nicodemus, Caroline M Nievergelt, Liene Nikitina-Zake, Ivan Nikolov, Vishwajit Nimgaonkar, Laura Nisenbaum, Willem A Nolen, Annelie Nordin, Markus M Noethen, John I Nurnberger, Evaristus A Nwulia, Dale R Nyholt, Eadbhard O’Callaghan, Michael C O’Donovan, Colm O’Dushlaine, F Anthony O’Neill, Robert D Oades, Sang-Yun Oh, Ann Olincy, Line Olsen, Edwin JCG van den Oord, Roel A Ophoff, Jim 
Van Os, Urban Osby, Hogni Oskarsson, Michael J Owen, Aarno Palotie, Christos Pantelis, George N Papadimitriou, Sergi Papiol, Elena Parkhomenko, Carlos N Pato, Michele T Pato, Tiina Paunio, Milica Pejovic-Milovancevic, Brenda P Penninx, Michele L Pergadia, Diana O Perkins, Roy H Perlis, Tune H Pers, Tracey L Petryshen, Hannes Petursson, Benjamin S Pickard, Olli Pietilainen, Jonathan Pimm, Joseph Piven, Andrew J Pocklington, Porgeir Porgeirsson, Danielle Posthuma, James B Potash, John Powell, Alkes Price, Peter Propping, Ann E Pulver, Shaun M Purcell, Vinay Puri, Digby Quested, Emma M Quinn, Josep Antoni Ramos-Quiroga, Henrik B Rasmussen, Soumya Raychaudhuri, Karola Rehnstrom, Abraham Reichenberg, Andreas Reif, Mark A Reimers, Marta Ribases, John Rice, Alexander L Richards, Marcella Rietschel, Brien P Riley, Stephan Ripke, Joshua L Roffman, Lizzy Rossin, Aribert Rothenberger, Guy Rouleau, Panos Roussos, Douglas M Ruderfer, Dan Rujescu, Veikko Salomaa, Alan R Sanders, Susan Santangelo, Russell Schachar, Ulrich Schall, Martin Schalling, Alan F Schatzberg, William A Scheftner, Gerard Schellenberg, Peter R Schofield, Nicholas J Schork, Christian R Schubert, Thomas G Schulze, Johannes Schumacher, Sibylle G Schwab, Markus M Schwarz, Edward M Scolnick, Laura J Scott, Rodney J Scott, Larry J Seidman, Pak C Sham, Jianxin Shi, Paul D Shilling, Stanley I Shyn, Engilbert Sigurdsson, Teimuraz Silagadze, Jeremy M Silverman, Kang Sim, Pamela Sklar, Susan L Slager, Petr Slominsky, Susan L Smalley, Johannes H Smit, Erin N Smith, Jordan W Smoller, Hon-Cheong So, Erik Soderman, Edmund Sonuga-Barke, Chris C A Spencer, Eli A Stahl, Matthew State, Hreinn Stefansson, Kari Stefansson, Michael Steffens, Stacy Steinberg, Hans-Christoph Stein-hausen, Elisabeth Stogmann, Richard E Straub, John Strauss, Eric Strengman, Jana Strohmaier, T Scott Stroup, Mythily Subramaniam, Patrick F Sullivan, James Sutcliffe, Jaana Suvisaari, Dragan M Svrakic, Jin P Szatkiewicz, Peter Szatmari, Szabocls 
Szelinger, Anita Thapar, Srinivasa Thirumalai, Robert C Thompson, Draga Toncheva, Paul A Tooney, Sarah Tosato, Federica Tozzi, Jens Treutlein, Manfred Uhr, Juha Veijola, Veronica Vieland, John B Vincent, Peter M Visscher, John Waddington, Dermot Walsh, James TR Walters, Dai Wang, Qiang Wang, Stanley J Watson, Bradley T Webb, Daniel R Weinberger, Mark Weiser, Myrna M Weissman, Jens R Wendland, Thomas Werge, Thomas F Wienker, Dieter B Wildenauer, Gonneke Willemsen, Nigel M Williams, Stephanie Williams, Richard Williamson, Stephanie H Witt, Aaron R Wolen, Emily HM Wong, Brandon K Wormley, Naomi R Wray, Adam Wright, Jing Qin Wu, Hualin Simon Xi, Wei Xu, Allan H Young, Clement C Zai, Stan Zammit, Peter P Zandi, Peng Zhang, Xuebin Zheng, Fritz Zimprich, Frans G Zitman, and Sebastian Zoellner.

Genetic Consortium for Anorexia Nervosa (GCAN): Vesna Boraska Perica, Christopher S Franklin, James A B Floyd, Laura M Thornton, Laura M Huckins, Lorraine Southam, N William Rayner, Ioanna Tachmazidou, Kelly L Klump, Janet Treasure, Cathryn M Lewis, Ulrike Schmidt, Federica Tozzi, Kirsty Kiezebrink, Johannes Hebebrand, Philip Gorwood, Roger A H Adan, Martien J H Kas, Angela Favaro, Paolo Santonastaso, Fernando Fernández-Aranda, Monica Gratacos, Filip Rybakowski, Monika Dmitrzak-Weglarz, Jaakko Kaprio, Anna Keski-Rahkonen, Anu Raevuori-Helkamaa, Eric F Van Furth, Margarita C T Slof-Op’t Landt, James I Hudson, Ted Reichborn-Kjennerud, Gun Peggy S Knudsen, Palmiero Monteleone, Allan S Kaplan, Andreas Karwautz, Hakon Hakonarson, Wade H Berrettini, Yiran Guo, Dong Li, Nicholas J Schork, Gen Komaki, Tetsuya Ando, Hidetoshi Inoko, Tõnu Esko, Krista Fischer, Katrin Männik, Andres Metspalu, Jessica H Baker, Roger D Cone, Jennifer Dackor, Janiece E DeSocio, Christopher E Hilliard, Julie K O’Toole, Jacques Pantel, Jin P Szatkiewicz, Chrysecolla Taico, Stephanie Zerwas, Sara E Trace, Oliver S P Davis, Sietske Helder, Katharina Bühren, Roland Burghardt, Martina de Zwaan, Karin Egberts, Stefan Ehrlich, Beate Herpertz-Dahlmann, Wolfgang Herzog, Hartmut Imgart, André Scherag, Susann Scherag, Stephan Zipfel, Claudette Boni, Nicolas Ramoz, Audrey Versini, Marek K Brandys, Unna N Danner, Carolien de Kove, Judith Hendriks, Bobby P C Koeleman, Roel A Ophoff, Eric Strengman, Annemarie A van Elburg, Alice Bruson, Maurizio Clementi, Daniela Degortes, Monica Forzan, Elena Tenconi, Elisa Docampo, Geòrgia Escaramí, Susana Jiménez-Murcia, Jolanta Lissowska, Andrzej Rajewski, Neonila Szeszenia-Dabrowska, Agnieszka Slopien, Joanna Hauser, Leila Karhunen, Ingrid Meulenbelt, P Eline Slagboom, Alfonso Tortorella, Mario Maj, George Dedoussis, Dimitris Dikeos, Fragiskos Gonidakis, Konstantinos Tziouvas, Artemis Tsitsika, Hana Papezova, Lenka Slachtova, Debora Martaskova, James L Kennedy, Robert D 
Levitan, Zeynep Yilmaz, Julia Huemer, Doris Koubek, Elisabeth Merl, Gudrun Wagner, Paul Lichtenstein, Gerome Breen, Sarah Cohen-Woods, Anne Farmer, Peter McGuffin, Sven Cichon, Ina Giegling, Stefan Herms, Dan Rujescu, Stefan Schreiber, H-Erich Wichmann, Christian Dina, Rob Sladek, Giovanni Gambaro, Nicole Soranzo, Antonio Julia, Sara Marsal, Raquel Rabionet, Valerie Gaborieau, Danielle M Dick, Aarno Palotie, Samuli Ripatti, Elisabeth Widén, Ole A Andreassen, Thomas Espeseth, Astri Lundervold, Ivar Reinvang, Vidar M Steen, Stephanie Le Hellard, Morten Mattingsdal, Ioanna Ntalla, Vladimir Bencko, Lenka Foretova, Vladimir Janout, Marie Navratilova, Steven Gallinger, Dalila Pinto, Stephen W Scherer, Harald Aschauer, Laura Carlberg, Alexandra Schosser, Lars Alfredsson, Bo Ding, Lars Klareskog, Leonid Padyukov, Chris Finan, Gursharan Kalsi, Marion Roberts, Darren W Logan, Leena Peltonen, Graham R S Ritchie, Jeff C Barrett, Xavier Estivill, Anke Hinney, Patrick F Sullivan, David A Collier, Eleftheria Zeggini, and Cynthia M Bulik.

Wellcome Trust Case Control Consortium 3 (WTCCC3): Carl A Anderson, Jeffrey C Barrett, James A B Floyd, Christopher S Franklin, Ralph McGinnis, Nicole Soranzo, Eleftheria Zeggini, Jennifer Sambrook, Jonathan Stephens, Willem H Ouwehand, Wendy L McArdle, Susan M Ring, David P Strachan, Graeme Alexander, Cynthia M Bulik, David A Collier, Peter J Conlon, Anna Dominiczak, Audrey Duncanson, Adrian Hill, Cordelia Langford, Graham Lord, Alexander P Maxwell, Linda Morgan, Leena Peltonen, Richard N Sandford, Neil Sheerin, Frederik O Vannberg, Hannah Blackburn, Wei-Min Chen, Sarah Edkins, Mathew Gillman, Emma Gray, Sarah E Hunt, Suna Nengut-Gumuscu, Simon Potter, Stephen S Rich, Douglas Simpkin, and Pamela Whittaker.

The members of the ReproGen consortium are John RB Perry, Felix Day, Cathy E Elks, Patrick Sulem, Deborah J Thompson, Teresa Ferreira, Chunyan He, Daniel I Chasman, Tnu Esko, Gudmar Thorleifsson, Eva Albrecht, Wei Q Ang, Tanguy Corre, Diana L Cousminer, Bjarke Feenstra, Nora Franceschini, Andrea Ganna, Andrew D Johnson, Sanela Kjellqvist, Kathryn L Lunetta, George McMahon, Ilja M Nolte, Lavinia Paternoster, Eleonora Porcu, Albert V Smith, Lisette Stolk, Alexander Teumer, Natalia Ternikova, Emmi Tikkanen, Sheila Ulivi, Erin K Wagner, Najaf Amin, Laura J Bierut, Enda M Byrne, JoukeJan Hottenga, Daniel L Koller, Massimo Mangino, Tune H Pers, Laura M YergesArmstrong, Jing Hua Zhao, Irene L Andrulis, Hoda AntonCulver, Femke Atsma, Stefania Bandinelli, Matthias W Beckmann, Javier Benitez, Carl Blomqvist, Stig E Bojesen, Manjeet K Bolla, Bernardo Bonanni, Hiltrud Brauch, Hermann Brenner, Julie E Buring, Jenny ChangClaude, Stephen Chanock, Jinhui Chen, Georgia ChenevixTrench, J. Margriet Colle, Fergus J Couch, David Couper, Andrea D Coveillo, Angela Cox, Kamila Czene, Adamo Pio D’adamo, George Davey Smith, Immaculata De Vivo, Ellen W Demerath, Joe Dennis, Peter Devilee, Aida K Dieffenbach, Alison M Dunning, Gudny Eiriksdottir, Johan G Eriksson, Peter A Fasching, Luigi Ferrucci, Dieter FleschJanys, Henrik Flyger, Tatiana Foroud, Lude Franke, Melissa E Garcia, Montserrat GarcaClosas, Frank Geller, Eco EJ de Geus, Graham G Giles, Daniel F Gudbjartsson, Vilmundur Gudnason, Pascal Gunel, Suiqun Guo, Per Hall, Ute Hamann, Robin Haring, Catharina A Hartman, Andrew C Heath, Albert Hofman, Maartje J Hooning, John L Hopper, Frank B Hu, David J Hunter, David Karasik, Douglas P Kiel, Julia A Knight, VeliMatti Kosma, Zoltan Kutalik, Sandra Lai, Diether Lambrechts, Annika Lindblom, Reedik Mgi, Patrik K Magnusson, Arto Mannermaa, Nicholas G Martin, Gisli Masson, Patrick F McArdle, Wendy L McArdle, Mads Melbye Kyriaki Michailidou, Evelin Mihailov, Lili Milani, Roger L Milne, Heli 
Nevanlinna, Patrick Neven, Ellen A Nohr, Albertine J Oldehinkel, Ben A Oostra, Aarno Palotie,, Munro Peacock, Nancy L Pedersen, Paolo Peterlongo, Julian Peto, Paul DP Pharoah, Dirkje S Postma, Anneli Pouta, Katri Pylks, Paolo Radice, Susan Ring, Fernando Rivadeneira, Antonietta Robino, Lynda M Rose, Anja Rudolph, Veikko Salomaa, Serena Sanna, David Schlessinger, Marjanka K Schmidt, Mellissa C Southey, Ulla Sovio Meir J Stampfer, Doris Stckl Anna M Storniolo, Nicholas J Timpson Jonathan Tyrer, Jenny A Visser, Peter Vollenweider, Henry Vlzke, Gerard Waeber, Melanie Waldenberger, Henri Wallaschofski, Qin Wang, Gonneke Willemsen, Robert Winqvist, Bruce HR Wolffenbuttel, Margaret J Wright, Australian Ovarian Cancer Study The GENICA Network, kConFab, The LifeLines Cohort Study, The InterAct Consortium, Early Growth Genetics (EGG) Consortium, Dorret I Boomsma, Michael J Econs, KayTee Khaw, Ruth JF Loos, Mark I McCarthy, Grant W Montgomery, John P Rice, Elizabeth A Streeten, Unnur Thorsteinsdottir, Cornelia M van Duijn, Behrooz Z Alizadeh, Sven Bergmann, Eric Boerwinkle, Heather A Boyd, Laura Crisponi, Paolo Gasparini, Christian Gieger, Tamara B Harris, Erik Ingelsson, MarjoRiitta Jrvelin, Peter Kraft, Debbie Lawlor, Andres Metspalu, Craig E Pennell, Paul M Ridker, Harold Snieder, Thorkild IA Srensen, Tim D Spector, David P Strachan, Andr G Uitterlinden, Nicholas J Wareham, Elisabeth Widen, Marek Zygmunt, Anna Murray, Douglas F Easton, Kari Stefansson, Joanne M Murabito, Ken K Ong.

Acknowledgements

We would like to thank P. Sullivan, C. Bulik, S. Caldwell, and O. Andreassen for helpful comments. This work was supported by NIH grants R01 MH101244 (ALP), R03 CA173785 (HKF) and by the Fannie and John Hertz Foundation (HKF). The coffee that Brendan drank while writing this paper was roasted by Barismo in Arlington, MA and Blue Bottle Coffee in Oakland, CA.

Data on anorexia nervosa were obtained by funding from the WTCCC3 WT088827/Z/09 titled “A genome-wide association study of anorexia nervosa”.

Data on glycaemic traits have been contributed by MAGIC investigators and have been downloaded from www.magicinvestigators.org.

Data on coronary artery disease / myocardial infarction have been contributed by CARDIoGRAMplusC4D investigators and have been downloaded from www.CARDIOGRAMPLUSC4D.ORG.

We thank the International Genomics of Alzheimer’s Project (IGAP) for providing summary results data for these analyses. The investigators within IGAP contributed to the design and implementation of IGAP and/or provided data but did not participate in analysis or writing of this report. IGAP was made possible by the generous participation of the control subjects, the patients, and their families. The i-Select chips were funded by the French National Foundation on Alzheimer’s disease and related disorders. EADI was supported by the LABEX (laboratory of excellence program investment for the future) DISTALZ grant, Inserm, Institut Pasteur de Lille, Université de Lille 2 and the Lille University Hospital. GERAD was supported by the Medical Research Council (Grant 503480), Alzheimer’s Research UK (Grant 503176), the Wellcome Trust (Grant 082604/2/07/Z) and German Federal Ministry of Education and Research (BMBF): Competence Network Dementia (CND) grant 01GI0102, 01GI0711, 01GI0420. CHARGE was partly supported by the NIH/NIA grant R01 AG033193 and the NIA AG081220 and AGES contract N01-AG-12100, the NHLBI grant R01 HL105756, the Icelandic Heart Association, and the Erasmus Medical Center and Erasmus University. ADGC was supported by the NIH/NIA grants: U01 AG032984, U24 AG021886, U01 AG016976, and the Alzheimer’s Association grant ADGC-10-196728.

Footnotes

  • ↵* Co-first authors

  • ↵† Co-last authors

  • ↵8 A list of members and affiliations appears in the Supplementary Note.

  • ↵1 We ignore the distinction between normalizing and centering in the population and in the sample, since this introduces only $O(1/N)$ error.

  • ↵2 The assumption that all β is drawn with equal variance for all SNPs hides an implicit assumption that rare SNPs have larger per-allele effect sizes than common SNPs. As discussed in the simulations section of the main text and in our earlier work [21], LD Score regression is robust to moderate violations of this assumption, though it may break down in extreme cases, e.g., if all causal variants are rare. In situations where a different model for Var[β] is more appropriate, all proofs in this note go through with LD Score replaced by weighted LD Scores, Embedded Image.

  • ↵3 For instance, it is sufficient but not necessary to assume that β, γ, δ and ϵ are multivariate normal. More generally, the z-scores will be approximately normal if β and γ are reasonably polygenic. If the distribution of effect sizes is heavy-tailed, e.g., if there are few causal SNPs, then the CVF may be larger.

  • ↵4 Conditional on the marginal effect of j, the expected value of Embedded Image is not equal to pj unless P = K or the marginal effect of j is zero.

  • ↵5 For ℓj = 100 (roughly the median 1kG LD Score), M = 10⁷ and ρg,obs = 1, we get ρg,obsℓj/M = 10⁻⁵. A worst-case value for Ns/N1N2 might be Ns = N1 = N2 = 10³, in which case Ns/N1N2 = 10⁻³. Thus, ρg,obsℓj/M and Ns/N1N2 will generally be at least 3 orders of magnitude smaller than 1.

References

  1. [1].↵
    George Davey Smith and Shah Ebrahim. Mendelian randomization: can genetic epidemiology contribute to understanding environmental determinants of disease? International journal of epidemiology, 32(1):1–22, 2003.
  2. [2].↵
    George Davey Smith and Gibran Hemani. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Human molecular genetics, 23(R1):R89–R98, 2014.
  3. [3].↵
    SG Vandenberg. Multivariate analysis of twin differences. Methods and goals in human behavior genetics, pages 29–43, 1965.
  4. [4].
    Oscar Kempthorne and Richard H Osborne. The interpretation of twin data. American journal of human genetics, 13(3):320, 1961.
  5. [5].
    John C Loehlin and Steven Gerritjan Vandenberg. Genetic and environmental components in the covariation of cognitive abilities: An additive model. Louisville Twin Study, University of Louisville, 1966.
  6. [6].
    Michael Neale and Lon Cardon. Methodology for genetic studies of twins and families. Number 67. Springer, 1992.
  7. [7].↵
    Paul Lichtenstein, Benjamin H Yip, Camilla Björk, Yudi Pawitan, Tyrone D Cannon, Patrick F Sullivan, and Christina M Hultman. Common genetic determinants of schizophrenia and bipolar disorder in swedish families: a population-based study. The Lancet, 373(9659):234–239, 2009.
  8. [8].↵
    Joshua D Angrist and Jörn-Steffen Pischke. Mostly harmless econometrics: An empiricist’s companion. Princeton university press, 2008.
  9. [9].↵
    Benjamin F Voight, Gina M Peloso, Marju Orho-Melander, Ruth Frikke-Schmidt, Maja Barbalic, Majken K Jensen, George Hindy, Hilma Hólm, Eric L Ding, Toby Johnson, et al. Plasma hdl cholesterol and risk of myocardial infarction: a mendelian randomisation study. The Lancet, 380(9841):572–580, 2012.
  10. [10].↵
    Ron Do, Cristen J Willer, Ellen M Schmidt, Sebanti Sengupta, Chi Gao, Gina M Peloso, Stefan Gustafsson, Stavroula Kanoni, Andrea Ganna, Jin Chen, et al. Common variants associated with plasma triglycerides and risk for coronary artery disease. Nature genetics, 45(11):1345–1352, 2013.
  11. [11].↵
    Peter M Visscher, Matthew A Brown, Mark I McCarthy, and Jian Yang. Five years of gwas discovery. The American Journal of Human Genetics, 90(1):7–24, 2012.
  12. [12].↵
    Stephen Burgess, Simon G Thompson, et al. Avoiding bias from weak instruments in mendelian randomization studies. International journal of epidemiology, 40(3):755–764, 2011.
  13. [13].↵
    Jian Yang, Beben Benyamin, Brian P McEvoy, Scott Gordon, Anjali K Henders, Dale R Nyholt, Pamela A Madden, Andrew C Heath, Nicholas G Martin, Grant W Montgomery, et al. Common snps explain a large proportion of the heritability for human height. Nature Genetics, 42(7):565–569, 2010.
  14. [14].
    Jian Yang, S Hong Lee, Michael E Goddard, and Peter M Visscher. Gcta: a tool for genomewide complex trait analysis. The American Journal of Human Genetics, 88(1):76–82, 2011.
  15. [15].↵
    Sang Hong Lee, Jian Yang, Michael E Goddard, Peter M Visscher, and Naomi R Wray. Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood. Bioinformatics, 28(19):2540–2542, 2012.
  16. [16].↵
    Cross-Disorder Group of the Psychiatric Genomics Consortium et al. Genetic relationship between five psychiatric disorders estimated from genome-wide snps. Nature Genetics, 2013.
  17. [17].↵
    Shashaank Vattikuti, Juen Guo, and Carson C Chow. Heritability and genetic correlations explained by common snps for metabolic syndrome traits. PLoS genetics, 8(3):e1002637, 2012.
  18. [18].↵
    Guo-Bo Chen, Sang Hong Lee, Marie-Jo A Brion, Grant W Montgomery, Naomi R Wray, Graham L Radford-Smith, Peter M Visscher, et al. Estimation and partitioning of (co) heritability of inflammatory bowel disease from gwas and immunochip data. Human molecular genetics, page ddu174, 2014.
  19. [19].↵
    Shaun M Purcell, Naomi R Wray, Jennifer L Stone, Peter M Visscher, Michael C O’Donovan, Patrick F Sullivan, Pamela Sklar, et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature, 460(7256):748–752, 2009.
  20. [20].↵
    Frank Dudbridge. Power and predictive accuracy of polygenic risk scores. PLoS genetics, 9(3):e1003348, 2013.
  21. [21].↵
    Brendan Bulik-Sullivan, Po-Ru Loh, Hilary Finucane, Stephan Ripke, Jian Yang, Nick Patterson, Mark J Daly, Alkes L Price, and Benjamin M Neale. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nature Genetics, 2015.
  22. [22].↵
    Jian Yang, Michael N Weedon, Shaun Purcell, Guillaume Lettre, Karol Estrada, Cristen J Willer, Albert V Smith, Erik Ingelsson, Jeffrey R O’Connell, Massimo Mangino, et al. Genomic inflation factors under polygenic inheritance. European Journal of Human Genetics, 19(7):807–812, 2011.
  23. [23].↵
    Doug Speed, Gibran Hemani, Michael R Johnson, and David J Balding. Improved heritability estimation from genome-wide SNPs. The American Journal of Human Genetics, 91(6):1011–1021, 2012.
  24. [24].↵
    Cross-Disorder Group of the Psychiatric Genomics Consortium et al. Identification of risk loci with shared effects on five major psychiatric disorders: a genome-wide analysis. Lancet, 381(9875):1371, 2013.
  25. [25].↵
    John RB Perry, Felix Day, Cathy E Elks, Patrick Sulem, Deborah J Thompson, Teresa Ferreira, Chunyan He, Daniel I Chasman, Tõnu Esko, Gudmar Thorleifsson, et al. Parent-of-origin-specific allelic associations among 106 genomic loci for age at menarche. Nature, 514(7520):92–97, 2014.
  26. [26].↵
    Andrew P Morris, Benjamin F Voight, Tanya M Teslovich, Teresa Ferreira, Ayellet V Segre, Valgerdur Steinthorsdottir, Rona J Strawbridge, Hassan Khan, Harald Grallert, Anubha Mahajan, et al. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nature genetics, 44(9):981, 2012.
  27. [27].↵
    Momoko Horikoshi, Hanieh Yaghootkar, Dennis O Mook-Kanamori, Ulla Sovio, H Rob Taal, Branwen J Hennig, Jonathan P Bradfield, Beate St Pourcain, David M Evans, Pimphen Charoen, et al. New loci associated with birth weight identify genetic links between intrauterine growth and adult height and metabolism. Nature genetics, 45(1):76–82, 2013.
  28. [28].↵
    Rachel M Freathy, Amanda J Bennett, Susan M Ring, Beverley Shields, Christopher J Groves, Nicholas J Timpson, Michael N Weedon, Eleftheria Zeggini, Cecilia M Lindgren, Hana Lango, et al. Type 2 diabetes risk alleles are associated with reduced size at birth. Diabetes, 58(6):1428–1433, 2009.
  29. [29].↵
    Early Growth Genetics (EGG) Consortium et al. A genome-wide association meta-analysis identifies new childhood obesity loci. Nature genetics, 44(5):526–531, 2012.
  30. [30].↵
    H Rob Taal, Beate St Pourcain, Elisabeth Thiering, Shikta Das, Dennis O Mook-Kanamori, Nicole M Warrington, Marika Kaakinen, Eskil Kreiner-Møller, Jonathan P Bradfield, Rachel M Freathy, et al. Common variants at 12q15 and 12q24 are associated with infant head circumference. Nature genetics, 44(5):532–538, 2012.
  31. [31].↵
    NC Onland-Moret, PHM Peeters, CH Van Gils, F Clavel-Chapelon, T Key, A Tjønneland, A Trichopoulou, R Kaaks, Jonas Manjer, S Panico, et al. Age at menarche in relation to adult height: the EPIC study. American journal of epidemiology, 162(7):623–632, 2005.
  32. [32].↵
    Felix Day et al. Puberty timing associated with diabetes, cardiovascular disease and also diverse health outcomes in men and women: the UK Biobank study. Submitted, 2014.
  33. [33].↵
    Cathy E Elks, Ken K Ong, Robert A Scott, Yvonne T van der Schouw, Judith S Brand, Petra A Wark, Pilar Amiano, Beverley Balkau, Aurelio Barricarte, Heiner Boeing, et al. Age at menarche and type 2 diabetes risk: the EPIC-InterAct study. Diabetes care, 36(11):3526–3534, 2013.
  34. [34].↵
    Hilary K. Finucane, Brendan Bulik-Sullivan, Alexander Gusev, Gosia Trynka, Yakir Reshef, Po-Ru Loh, Verneri Anttila, Han Xu, Chongzhi Zang, Kyle Farh, Stephan Ripke, Felix R. Day, The ReproGen Consortium, Schizophrenia Working Group of the Psychiatric Genomics Consortium, Shaun Purcell, Eli Stahl, Sara Lindstrom, John R. B. Perry, Yukinori Okada, Brad Bernstein, Soumya Raychaudhuri, Mark Daly, Nick Patterson, Benjamin M. Neale, and Alkes L. Price. Polygenic effects of cell-type-specific functional elements in 17 traits and 1.3 million phenotyped samples. In preparation, 2014.
  35. [35].↵
    I Sadaf Farooqi. Defining the neural basis of appetite and obesity: from genes to behaviour. Clinical Medicine, 14(3):286–289, 2014.
  36. [36].↵
    Na Wang, Xianglan Zhang, Yong-Bing Xiang, Gong Yang, Hong-Lan Li, Jing Gao, Hui Cai, Yu-Tang Gao, Wei Zheng, and Xiao-Ou Shu. Associations of adult height and its components with mortality: a report from cohort studies of 135 000 Chinese women and men. International journal of epidemiology, 40(6):1715–1726, 2011.
  37. [37].
    Patricia R Hebert, Janet W Rich-Edwards, JE Manson, Paul M Ridker, Nancy R Cook, Gerald T O’Connor, Julie E Buring, and Charles H Hennekens. Height and incidence of cardiovascular disease in male physicians. Circulation, 88(4):1437–1443, 1993.
  38. [38].↵
    Janet W Rich-Edwards, JoAnn E Manson, Meir J Stampfer, Graham A Colditz, Walter C Willett, Bernard Rosner, Frank E Speizer, and Charles H Hennekens. Height and the risk of cardiovascular disease in women. American journal of epidemiology, 142(9):909–917, 1995.
  39. [39].↵
    Cornelius A Rietveld, Sarah E Medland, Jaime Derringer, Jian Yang, Tõnu Esko, Nicolas W Martin, Harm-Jan Westra, Konstantin Shakhbazov, Abdel Abdellaoui, Arpana Agrawal, et al. GWAS of 126,559 individuals identifies genetic variants associated with educational attainment. Science, 340(6139):1467–1471, 2013.
  40. [40].↵
    Deborah E Barnes and Kristine Yaffe. The projected effect of risk factor reduction on Alzheimer’s disease prevalence. The Lancet Neurology, 10(9):819–828, 2011.
  41. [41].↵
    Sam Norton, Fiona E Matthews, Deborah E Barnes, Kristine Yaffe, and Carol Brayne. Potential for primary prevention of Alzheimer’s disease: an analysis of population-based data. The Lancet Neurology, 13(8):788–794, 2014.
  42. [42].↵
    James H MacCabe, Mats P Lambe, Sven Cnattingius, Pak C Sham, Anthony S David, Abraham Reichenberg, Robin M Murray, and Christina M Hultman. Excellent school performance at age 16 and risk of adult bipolar disorder: national cohort study. The British Journal of Psychiatry, 196(2):109–115, 2010.
  43. [43].↵
    Jari Tiihonen, Jari Haukka, Markus Henriksson, Mary Cannon, Tuula Kieseppä, Ilmo Laaksonen, Juhani Sinivuo, and Jouko Lönnqvist. Premorbid intellectual functioning in bipolar disorder and schizophrenia: results from a cohort study of male conscripts. American Journal of Psychiatry, 162(10):1904–1910, 2005.
  44. [44].↵
    John P Pierce, Michael C Fiore, Thomas E Novotny, Evridiki J Hatziandreu, and Ronald M Davis. Trends in cigarette smoking in the United States: educational differences are increasing. JAMA, 261(1):56–60, 1989.
  45. [45].↵
    Ruth H Striegel-Moore, Vicki Garvin, Faith-Anne Dohm, and Robert A Rosenheck. Psychiatric comorbidity of eating disorders in men: a national study of hospitalized veterans. International Journal of Eating Disorders, 25(4):399–404, 1999.
  46. [46].↵
    Barton J Blinder, Edward J Cumella, and Visant A Sanathara. Psychiatric comorbidities of female inpatients with eating disorders. Psychosomatic Medicine, 68(3):454–462, 2006.
  47. [47].↵
    Ian J Deary, Steve Strand, Pauline Smith, and Cres Fernandes. Intelligence and educational achievement. Intelligence, 35(1):13–21, 2007.
  48. [48].↵
    Catherine M Calvin, Cres Fernandes, Pauline Smith, Peter M Visscher, and Ian J Deary. Sex, intelligence and educational achievement in a national cohort of over 175,000 11-year-old schoolchildren in England. Intelligence, 38(4):424–432, 2010.
  49. [49].↵
    Maureen S Durkin, Matthew J Maenner, F John Meaney, Susan E Levy, Carolyn DiGuiseppi, Joyce S Nicholas, Russell S Kirby, Jennifer A Pinto-Martin, and Laura A Schieve. Socioeconomic inequality in the prevalence of autism spectrum disorder: evidence from a US cross-sectional study. PLoS One, 5(7):e11551, 2010.
  50. [50].↵
    Elise B Robinson, Kaitlin E Samocha, Jack A Kosmicki, Lauren McGrath, Benjamin M Neale, Roy H Perlis, and Mark J Daly. Autism spectrum disorder severity reflects the average contribution of de novo and familial influences. Proceedings of the National Academy of Sciences, 111(42):15161–15165, 2014.
  51. [51].↵
    Kaitlin E Samocha, Elise B Robinson, Stephan J Sanders, Christine Stevens, Aniko Sabo, Lauren M McGrath, Jack A Kosmicki, Karola Rehnström, Swapan Mallick, Andrew Kirby, et al. A framework for the interpretation of de novo mutation in human disease. Nature genetics, 46(9):944–950, 2014.
  52. [52].↵
    Alan J Silman and Jacqueline E Pearson. Epidemiology and genetics of rheumatoid arthritis. Arthritis Res, 4(Suppl 3):S265–S272, 2002.
  53. [53].↵
    Jose de Leon and Francisco J Diaz. A meta-analysis of worldwide studies demonstrates an association between schizophrenia and tobacco smoking behaviors. Schizophrenia research, 76(2):135–157, 2005.
  54. [54].↵
    Ole A Andreassen, Srdjan Djurovic, Wesley K Thompson, Andrew J Schork, Kenneth S Kendler, Michael C O’Donovan, Dan Rujescu, Thomas Werge, Martijn van de Bunt, Andrew P Morris, et al. Improved detection of common variants associated with schizophrenia by leveraging pleiotropy with cardiovascular-disease risk factors. The American Journal of Human Genetics, 92(2):197–209, 2013.
  55. [55].↵
    Chris Cotsapas, Benjamin F Voight, Elizabeth Rossin, Kasper Lage, Benjamin M Neale, Chris Wallace, Gonçalo R Abecasis, Jeffrey C Barrett, Timothy Behrens, Judy Cho, et al. Pervasive sharing of genetic effects in autoimmune disease. PLoS genetics, 7(8):e1002254, 2011.
  56. [56].↵
    Kyle Kai-How Farh, Alexander Marson, Jiang Zhu, Markus Kleinewietfeld, William J Housley, Samantha Beik, Noam Shoresh, Holly Whitton, Russell JH Ryan, Alexander A Shishkin, et al. Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature, 2014.
  57. [57].↵
    Peter Wurtz et al. Metabolic signatures of adiposity in young adults: Mendelian randomization analysis and effects of weight change. PLoS Medicine, 2014.
  58. [58].↵
    Stephen Burgess, Daniel F Freitag, Hassan Khan, Donal N Gorman, and Simon G Thompson. Using multivariable Mendelian randomization to disentangle the causal effects of lipid fractions. PloS one, 9(10):e108891, 2014.
  59. [59].↵
    Sander Greenland, Judea Pearl, and James M Robins. Causal diagrams for epidemiologic research. Epidemiology, pages 37–48, 1999.
  60. [60].↵
    Andy Dahl, Victoria Hore, Valentina Iotchkova, and Jonathan Marchini. Network inference in matrix-variate Gaussian models with non-independent noise. arXiv preprint arXiv:1312.1622, 2013.
  61. [61].↵
    Hugues Aschard, Bjarni J Vilhjálmsson, Amit D Joshi, Alkes L Price, and Peter Kraft. Adjusting for heritable covariates can bias effect estimates in genome-wide association studies. The American Journal of Human Genetics, 2015.
  62. [62].↵
    International HapMap 3 Consortium et al. Integrating common and rare genetic variation in diverse human populations. Nature, 467(7311):52–58, 2010.
  63. [63].↵
    Karl Pearson and Alice Lee. On the inheritance of characters not capable of exact quantitative measurement. Philosophical Transactions of the Royal Society of London, A (195), pages 79–150, 1901.
  64. [64].↵
    Schizophrenia Working Group of the Psychiatric Genomics Consortium et al. Biological insights from 108 schizophrenia-associated genetic loci. Nature, 511(7510):421–427, 2014.
  65. [65].
    Pamela Sklar, Stephan Ripke, Laura J Scott, Ole A Andreassen, Sven Cichon, Nick Craddock, Howard J Edenberg, John I Nurnberger, Marcella Rietschel, Douglas Blackwood, et al. Large-scale genome-wide association analysis of bipolar disorder identifies a new susceptibility locus near ODZ4. Nature genetics, 43(10):977, 2011.
  66. [66].
    Stephan Ripke, Naomi R Wray, Cathryn M Lewis, Steven P Hamilton, Myrna M Weissman, Gerome Breen, Enda M Byrne, Douglas HR Blackwood, Dorret I Boomsma, Sven Cichon, et al. A mega-analysis of genome-wide association studies for major depressive disorder. Molecular psychiatry, 18(4):497–511, 2012.
  67. [67].
    Vesna Boraska, Christopher S Franklin, James AB Floyd, Laura M Thornton, Laura M Huckins, Lorraine Southam, N William Rayner, Ioanna Tachmazidou, Kelly L Klump, Janet Treasure, et al. A genome-wide association study of anorexia nervosa. Molecular psychiatry, 2014.
  68. [68].
    Tobacco and Genetics Consortium et al. Genome-wide meta-analyses identify multiple loci associated with smoking behavior. Nature genetics, 42(5):441–447, 2010.
  69. [69].
    Jean-Charles Lambert, Carla A Ibrahim-Verbaas, Denise Harold, Adam C Naj, Rebecca Sims, Céline Bellenguez, Gyungah Jun, Anita L DeStefano, Joshua C Bis, Gary W Beecham, et al. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer’s disease. Nature genetics, 2013.
  70. [70].
    Hana Lango Allen, Karol Estrada, Guillaume Lettre, Sonja I Berndt, Michael N Weedon, Fernando Rivadeneira, Cristen J Willer, Anne U Jackson, Sailaja Vedantam, Soumya Raychaudhuri, et al. Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature, 467(7317):832–838, 2010.
  71. [71].
    Sonja I Berndt, Stefan Gustafsson, Reedik Mägi, Andrea Ganna, Eleanor Wheeler, Mary F Feitosa, Anne E Justice, Keri L Monda, Damien C Croteau-Chonka, Felix R Day, et al. Genome-wide meta-analysis identifies 11 new loci for anthropometric traits and provides insights into genetic architecture. Nature genetics, 45(5):501–512, 2013.
  72. [72].
    Heribert Schunkert, Inke R König, Sekar Kathiresan, Muredach P Reilly, Themistocles L Assimes, Hilma Holm, Michael Preuss, Alexandre FR Stewart, Maja Barbalic, Christian Gieger, et al. Large-scale association analysis identifies 13 new susceptibility loci for coronary artery disease. Nature genetics, 43(4):333–338, 2011.
  73. [73].↵
    Tanya M Teslovich, Kiran Musunuru, Albert V Smith, Andrew C Edmondson, Ioannis M Stylianou, Masahiro Koseki, James P Pirruccello, Samuli Ripatti, Daniel I Chasman, Cristen J Willer, et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature, 466(7307):707–713, 2010.
  74. [74].
    Alisa K Manning, Marie-France Hivert, Robert A Scott, Jonna L Grimsby, Nabila Bouatia-Naji, Han Chen, Denis Rybin, Ching-Ti Liu, Lawrence F Bielak, Inga Prokopenko, et al. A genome-wide approach accounting for body mass index identifies genetic variants influencing fasting glycemic traits and insulin resistance. Nature genetics, 44(6):659–669, 2012.
  75. [75].
    Ralf JP van der Valk, Eskil Kreiner-Møller, Marjolein N Kooijman, Mònica Guxens, Evangelia Stergiakouli, Annika Sääf, Jonathan P Bradfield, Frank Geller, M Geoffrey Hayes, Diana L Cousminer, et al. A novel common variant in DCST2 is associated with length in early life and height in adulthood. Human molecular genetics, page ddu510, 2014.
  76. [76].
    Luke Jostins, Stephan Ripke, Rinse K Weersma, Richard H Duerr, Dermot P McGovern, Ken Y Hui, James C Lee, L Philip Schumm, Yashoda Sharma, Carl A Anderson, et al. Host-microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature, 491(7422):119–124, 2012.
  77. [77].
    Eli A Stahl, Soumya Raychaudhuri, Elaine F Remmers, Gang Xie, Stephen Eyre, Brian P Thomson, Yonghong Li, Fina AS Kurreeman, Alexandra Zhernakova, Anne Hinks, et al. Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci. Nature genetics, 42(6):508–514, 2010.
  78. [78].↵
    Schizophrenia Psychiatric Genome-Wide Association Study (GWAS) Consortium et al. Genome-wide association study identifies five new schizophrenia loci. Nature genetics, 43(10):969–976, 2011.
Posted April 06, 2015.
Subject Area

  • Genomics