Abstract
Polygenic risk scores (PRS) trained from genome-wide association study (GWAS) results are set to play a pivotal role in biomedical research addressing multifactorial human diseases. The prospect of using these risk scores in clinical care and public health is generating both enthusiasm and controversy, with varying opinions about strengths and limitations across experts1. The performances of existing polygenic scores are still limited, and although it is expected to improve with increasing sample size of GWAS and the development of new powerful methods, it remains unclear how much prediction can be ultimately achieved. Here, we conducted a retrospective analysis to assess the progress in PRS prediction accuracy since the publication of the first large-scale GWASs using six common human diseases with sufficient GWAS data. We show that while PRS accuracy has grown rapidly for years, the improvement pace from recent GWAS has decreased substantially, suggesting that further increasing GWAS sample size may translate into very modest risk discrimination improvement. We next investigated the factors influencing the maximum achievable prediction using recently released whole genome-sequencing data from 125K UK Biobank participants, and state-of-the-art modeling of polygenic outcomes. Our analyses point toward increasing the variant coverage of PRS, using either more imputed variants or sequencing data, as a key component for future improvement in prediction accuracy.
Competing Interest Statement
The authors have declared no competing interest.