RT Journal Article
SR Electronic
T1 Subtle stratification confounds estimates of heritability from rare variants
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 048181
DO 10.1101/048181
A1 Gaurav Bhatia
A1 Alexander Gusev
A1 Po-Ru Loh
A1 Hilary Finucane
A1 Bjarni J. Vilhjálmsson
A1 Stephan Ripke
A1 Schizophrenia Working Group of the Psychiatric Genomics Consortium
A1 Shaun Purcell
A1 Eli Stahl
A1 Mark Daly
A1 Teresa R de Candia
A1 Sang Hong Lee
A1 Benjamin M Neale
A1 Matthew C. Keller
A1 Noah A. Zaitlen
A1 Bogdan Pasaniuc
A1 Nick Patterson
A1 Jian Yang
A1 Alkes L. Price
YR 2016
UL http://biorxiv.org/content/early/2016/04/12/048181.abstract
AB Genome-wide significant associations generally explain only a small proportion of the narrow-sense heritability of complex disease (h2). While considerably more heritability is explained by all genotyped SNPs (hg2), for most traits, much heritability remains missing (hg2 < h2). Rare variants, poorly tagged by genotyped SNPs, are a major potential source of the gap between hg2 and h2. Recent efforts to assess the contribution of both sequenced and imputed rare variants to phenotypes suggest that substantial heritability may lie in these variants. Here we analyze sequenced SNPs, imputed SNPs and haploSNPs (haplotype variants constructed from within a sample, without using a reference panel) and show that studies of heritability from these variants may be strongly confounded by subtle population stratification. For example, when meta-analyzing heritability estimates from 22 randomly ascertained case-control traits from the GERA cohort, we observe a statistically significant increase in heritability explained by imputed SNPs even after correcting for principal components (PCs) from genotyped (or imputed) SNPs. However, this increase is eliminated when correcting for stratification using PCs from a larger number of haploSNPs.
We note that subtle stratification may also impact estimates of heritability from array SNPs, although we find that this is generally a less severe problem. Overall, our results suggest that estimating the heritability explained by rare variants for case-control traits requires exquisite control for population stratification, but current methods may not provide this level of control.