PT - JOURNAL ARTICLE
AU - Julian R. Homburger
AU - Cynthia L. Neben
AU - Gilad Mishne
AU - Alicia Y. Zhou
AU - Sekar Kathiresan
AU - Amit V. Khera
TI - Low coverage whole genome sequencing enables accurate assessment of common variants and calculation of genome-wide polygenic scores
AID - 10.1101/716977
DP - 2019 Jan 01
TA - bioRxiv
PG - 716977
4099 - http://biorxiv.org/content/early/2019/07/31/716977.short
4100 - http://biorxiv.org/content/early/2019/07/31/716977.full
AB - Background: The inherited susceptibility of common, complex diseases may be caused by rare, ‘monogenic’ pathogenic variants or by the cumulative effect of numerous common, ‘polygenic’ variants. As such, comprehensive genome interpretation could involve two distinct genetic testing technologies -- high coverage next generation sequencing for known genes to detect pathogenic variants and a genome-wide genotyping array followed by imputation to calculate genome-wide polygenic scores (GPSs). Here we assessed the feasibility and accuracy of using low coverage whole genome sequencing (lcWGS) as an alternative to genotyping arrays to calculate GPSs. Methods: First, we performed downsampling and imputation of WGS data from ten individuals to assess concordance with known genotypes. Second, we assessed the correlation between GPSs for three common diseases -- coronary artery disease (CAD), breast cancer (BC), and atrial fibrillation (AF) -- calculated using lcWGS and genotyping array in 184 samples. Third, we assessed concordance of lcWGS-based genotype calls and GPS calculation in 120 individuals with known genotypes, selected to reflect diverse ancestral backgrounds. Fourth, we assessed the relationship between GPSs calculated using lcWGS and disease phenotypes in 11,502 European individuals seeking genetic testing. Results: We found imputation accuracy r2 values of greater than 0.90 for all ten samples -- including those of African and Ashkenazi Jewish ancestry -- with lcWGS data at 0.5X.
GPSs calculated using both lcWGS and genotyping array followed by imputation in 184 individuals were highly correlated for each of the three common diseases (r2 = 0.93 to 0.97), with similar score distributions. Using lcWGS data from 120 individuals of diverse ancestral backgrounds, including South Asian, East Asian, and Hispanic individuals, we found similar results with respect to imputation accuracy and GPS correlations. Finally, we calculated GPSs for CAD, BC, and AF using lcWGS in 11,502 European individuals, confirming odds ratios per standard deviation increment in GPSs ranging from 1.28 to 1.59, consistent with previous studies. Conclusions: Here we show that lcWGS is an alternative approach to genotyping arrays for common genetic variant assessment and GPS calculation. lcWGS provides comparable imputation accuracy while also overcoming the ascertainment bias inherent to variant selection in genotyping array design.
Abbreviations: GPS, genome-wide polygenic score; lcWGS, low coverage whole genome sequencing; CAD, coronary artery disease; BC, breast cancer; AF, atrial fibrillation; 1KGP, 1000 Genomes Project; GIAB, Genome in a Bottle; Indel, insertion-deletion; BWA, Burrows-Wheeler Aligner; IBS, Identity-by-State; PC, principal components; PCA, PC analysis; MAF, minor allele frequency; AUC, area under the curve
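
As an illustration of the GPS comparison summarized in the abstract, the following is a minimal sketch, not the authors' pipeline. It assumes the common definition of a polygenic score as a weighted sum of per-variant allele dosages, and it compares scores from two genotyping sources using the squared Pearson correlation. The function names (`polygenic_score`, `pearson_r2`), the weights, and all dosage values are hypothetical and chosen only for illustration.

```python
from math import fsum


def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (each in [0, 2]) over variants.

    Assumption: the score is the sum over variants of weight * dosage;
    the abstract does not spell out the exact formula used.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return fsum(d * w for d, w in zip(dosages, weights))


def pearson_r2(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    n = len(xs)
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical per-variant effect weights (e.g. log odds ratios).
weights = [0.12, -0.05, 0.30, 0.08]

# Hypothetical dosages for four individuals from two sources.
array_dosages = [
    [0.0, 1.0, 2.0, 1.0],
    [1.0, 1.0, 0.0, 2.0],
    [2.0, 0.0, 1.0, 0.0],
    [0.0, 2.0, 1.0, 1.0],
]
lcwgs_dosages = [
    [0.1, 0.9, 1.9, 1.0],
    [1.0, 1.1, 0.1, 1.8],
    [1.9, 0.0, 1.0, 0.2],
    [0.0, 1.8, 1.1, 0.9],
]

array_scores = [polygenic_score(d, weights) for d in array_dosages]
lcwgs_scores = [polygenic_score(d, weights) for d in lcwgs_dosages]

print("r2 between array-based and lcWGS-based scores:",
      pearson_r2(array_scores, lcwgs_scores))
```

In this sketch the imputed dosages are fractional, which is why the score is computed on dosages rather than on hard genotype calls.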