RT Journal Article SR Electronic T1 ‘The Thousand Polish Genomes Project’ - a national database of Polish variant allele frequencies JF bioRxiv FD Cold Spring Harbor Laboratory SP 2021.07.07.451425 DO 10.1101/2021.07.07.451425 A1 Elżbieta Kaja A1 Adrian Lejman A1 Dawid Sielski A1 Mateusz Sypniewski A1 Tomasz Gambin A1 Tomasz Suchocki A1 Mateusz Dawidziuk A1 Paweł Golik A1 Marzena Wojtaszewska A1 Maria Stępień A1 Joanna Szyda A1 Karolina Lisiak-Teodorczyk A1 Filip Wolbach A1 Daria Kołodziejska A1 Katarzyna Ferdyn A1 Alicja Woźna A1 Marcin Żytkiewicz A1 Anna Bodora-Troińska A1 Waldemar Elikowski A1 Zbigniew Król A1 Artur Zaczyński A1 Agnieszka Pawlak A1 Robert Gil A1 Waldemar Wierzba A1 Paula Dobosz A1 Katarzyna Zawadzka A1 Paweł Zawadzki A1 Paweł Sztromwasser YR 2021 UL http://biorxiv.org/content/early/2021/07/09/2021.07.07.451425.abstract AB Although Slavic populations account for over 3.5% of world inhabitants, no centralized, open source reference database of genetic variation of any Slavic population exists to date. Such data are crucial for both biomedical research and genetic counseling and are essential for archeological and historical studies. The Polish population, homogeneous and sedentary in nature but influenced by many past migrations, is unique and could serve as a good genetic reference for middle European Slavic nations. The aim of the present study was to describe the first results of analyses of a newly created national database of Polish genomic variant allele frequencies. Never before has a whole-genome study of the Polish population been conducted on such a large number of individuals (1,079). A wide spectrum of genomic variation was identified and genotyped, such as small and structural variants, runs of homozygosity, mitochondrial haplogroups and Mendelian inconsistencies. The allele frequencies were calculated for 943 unrelated individuals and released publicly as The Thousand Polish Genomes database. 
Precise detection and characterisation of rare variants enriched in the Polish population allowed us to confirm the allele frequencies of known pathogenic variants in diseases such as Smith-Lemli-Opitz syndrome (SLOS) or Nijmegen breakage syndrome (NBS). Additionally, the analysis of OMIM AR genes led to the identification of 22 genes with significantly different cumulative allele frequencies in the Polish (POL) vs. European NFE population. We hope that The Thousand Polish Genomes database will contribute to the worldwide genomic data resources for researchers and clinicians. Competing Interest Statement: The authors have declared no competing interest.