PT - JOURNAL ARTICLE
AU - Tucker-Drob, Elliot M.
TI - Measurement Error Correction of Genome-Wide Polygenic Scores in Prediction Samples
AID - 10.1101/165472
DP - 2017 Jan 01
TA - bioRxiv
PG - 165472
4099 - http://biorxiv.org/content/early/2017/07/19/165472.short
4100 - http://biorxiv.org/content/early/2017/07/19/165472.full
AB - DiPrete, Burik, & Koellinger (2017; http://dx.doi.org/10.1101/134197) propose using an instrumental variable (IV) framework to correct genome-wide polygenic scores (GPSs) for error, thereby producing disattenuated estimates of SNP heritability in prediction samples. They demonstrate their approach by producing two independent GPSs for Educational Attainment (“multiple indicators”) in a prediction sample (Health and Retirement Study; HRS) from independent sets of SNP regression weights, each computed from a different half of the discovery sample (EA2; Okbay et al. 2016), i.e. “by randomly splitting the GWAS sample that was used for [the GPS] construction.” Here, I elucidate how a structural equation modeling (SEM) framework that specifies true score variance in GPSs as a latent variable can be used to derive a correction equivalent to the IV approach proposed by DiPrete et al. (2017). This approach, which is rooted in a psychometric modeling tradition, has a number of advantages: (1) it formalizes the assumed data-generating model, (2) it estimates all parameters of interest in a single step, (3) it can be flexibly incorporated into a larger multivariate analysis (such as the “Genetic Instrumental Variable” approach proposed by DiPrete et al., 2017), (4) it can easily be adapted to relax assumptions (e.g. that the GPS indicators equally represent the true genetic factor score), and (5) it can easily be extended to include more than two GPS indicators.
After describing how the multiple indicator approach to GPS correction can be specified as a structural equation model, I demonstrate how an SEM approach can correct GPSs for error using SNP heritability estimates obtained from GREML or LD score regression, producing a correction equivalent to an approach recently proposed by Daniel Benjamin and colleagues. Finally, I briefly discuss what I view as some conceptual limitations of the error correction approaches described, regardless of the estimation method implemented.
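
The multiple-indicator correction summarized in the abstract can be illustrated with a small simulation. The sketch below is not the author's code: it assumes a hypothetical data-generating model in which two GPSs measure the same standardized latent genetic score with unit loadings and independent errors, and it substitutes a simple method-of-moments estimator for a fully fitted SEM. The names `simulate`, `estimate`, `h2`, and `err_var` are illustrative choices, not terms from the paper.

```python
import numpy as np


def simulate(n=200_000, h2=0.25, err_var=1.0, seed=0):
    """Simulate an outcome and two error-laden GPS indicators (assumed model)."""
    rng = np.random.default_rng(seed)
    g = rng.normal(size=n)  # latent true genetic score, variance 1
    # Outcome with total variance 1; the true slope on g is sqrt(h2).
    y = np.sqrt(h2) * g + rng.normal(scale=np.sqrt(1.0 - h2), size=n)
    # Two indicators of g with independent measurement error.
    gps1 = g + rng.normal(scale=np.sqrt(err_var), size=n)
    gps2 = g + rng.normal(scale=np.sqrt(err_var), size=n)
    return y, gps1, gps2


def estimate(y, gps1, gps2):
    """Return naive, IV, and latent-variable (moment-based) slope estimates."""
    c = np.cov(np.vstack([y, gps1, gps2]))  # rows are variables: y, gps1, gps2
    # Naive OLS slope of y on one observed GPS: attenuated by measurement error.
    naive = c[0, 1] / c[1, 1]
    # IV (Wald) estimator: gps2 instruments gps1.
    iv = c[0, 2] / c[1, 2]
    # Latent-variable moments with both loadings fixed to 1:
    # the covariance between the indicators estimates var(g),
    # and the average indicator-outcome covariance estimates cov(y, g).
    var_g = c[1, 2]
    cov_yg = 0.5 * (c[0, 1] + c[0, 2])
    latent = cov_yg / var_g
    return {"naive": naive, "iv": iv, "latent": latent}


y, gps1, gps2 = simulate()
est = estimate(y, gps1, gps2)
print(est)
```

Under these assumptions the naive slope converges to sqrt(h2) / (1 + err_var), whereas the IV and latent-variable estimates both converge to the true slope sqrt(h2), which is the sense in which the two corrections coincide. A fitted SEM would additionally allow the equal-loading assumption to be relaxed or more than two indicators to be used, as the abstract notes.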