PT - JOURNAL ARTICLE
AU - Connor, Gregory
AU - O’Neill, Michael
TI - Finite-sample genome-wide regression p-values (GWRPV) with a non-normally distributed phenotype
AID - 10.1101/204727
DP - 2017 Jan 01
TA - bioRxiv
PG - 204727
4099 - http://biorxiv.org/content/early/2017/11/09/204727.short
4100 - http://biorxiv.org/content/early/2017/11/09/204727.full
AB - This paper derives the exact finite-sample p-value for univariate regression of a quantitative phenotype on individual genome markers, relying on a mixture distribution for the dependent variable. The p-value estimator conventionally used in existing genome-wide association study (GWAS) regressions assumes a normally distributed dependent variable or relies on an approximation based on the central limit theorem. That approximation is unreliable for GWAS regression p-values, and measured phenotypes often have markedly non-normal distributions. A normal mixture distribution better fits observed phenotypic variables, and we provide exact small-sample p-values for univariate GWAS regressions under this flexible distributional assumption. We illustrate the adjustment using a years-of-education phenotypic variable.