Abstract
We report anomalies in minor allele frequencies (MAFs) and likely mismappings found in our analyses of the UK Biobank dataset (UKB) and several other databases. We compared the MAFs in the UKB to those measured in two other UK studies, ALSPAC and TwinsUK, and found a large set of SNPs for which the UKB MAFs are inconsistent. Additionally, even after accounting for population structure and other possible causes of spurious correlation, we found many SNPs that appear to be in interchromosomal linkage. On closer analysis, all of these interchromosomal linkages are associated with identical sequences on different chromosomes, implying that these SNPs are simply mismapped. Some (but certainly not all) of the MAF disagreements appear to result from these mismappings. Our results, including lists of SNPs with inconsistent MAFs and/or apparent interchromosomal linkage, are freely available to download at: http://kunertgraf.com/data/biobank.html
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
1 For example: the SNP rs71238527 has MAF = 0.216 in the UK Biobank. dbSNP reports values from two different studies, ExAC and GO-ESP, with MAF values of 0.145 and 0.086. We take the value closest to the UKB MAF, 0.145, as the “best match” from any dbSNP study.
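The "best match" selection in this footnote can be illustrated with a minimal Python sketch. The function name `best_match_maf` and the variable names are hypothetical and chosen only for illustration; the sketch assumes that "closest" means the smallest absolute difference from the UKB MAF, which is consistent with the worked example but is not stated as the authors' implementation.

```python
def best_match_maf(reference_maf, candidate_mafs):
    """Return the candidate MAF with the smallest absolute difference
    from reference_maf (hypothetical helper, for illustration only)."""
    if not candidate_mafs:
        raise ValueError("at least one candidate MAF is required")
    return min(candidate_mafs, key=lambda maf: abs(maf - reference_maf))


# Values quoted in the footnote for rs71238527.
ukb_maf = 0.216
dbsnp_mafs = [0.145, 0.086]  # the two study values reported by dbSNP

best = best_match_maf(ukb_maf, dbsnp_mafs)
difference = abs(best - ukb_maf)
print(f"best match: {best}, absolute difference: {difference:.3f}")
```

With the footnote's numbers, the absolute differences are 0.071 and 0.130, so 0.145 is selected.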