RT Journal Article
SR Electronic
T1 Semantic and population analysis of the genetic targets related to COVID-19 and its association with genes and diseases
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2022.09.16.508278
DO 10.1101/2022.09.16.508278
A1 Louis Papageorgiou
A1 Eleni Papakonstantinou
A1 Io Diakou
A1 Katerina Pierouli
A1 Konstantina Dragoumani
A1 Flora Bacopoulou
A1 George P Chrousos
A1 Elias Eliopoulos
A1 Dimitrios Vlachakis
YR 2022
UL http://biorxiv.org/content/early/2022/09/16/2022.09.16.508278.abstract
AB SARS-CoV-2 is the coronavirus responsible for one of the most serious worldwide pandemics of modern times, with lasting and multi-faceted effects. By late 2021, SARS-CoV-2 had infected more than 180 million people and killed more than 3 million. The virus enters human cells by binding to ACE2 via its surface spike protein and causes a complex disease of the respiratory system, termed COVID-19. Vaccination efforts are under way to hinder the viral spread, and therapeutics are currently under development. To this end, scientific attention is shifting towards variants and SNPs that affect aspects of the disease such as susceptibility and severity. This genomic grammar, tightly related to the dark part of our genome, can be explored with modern methods such as natural language processing. We present a semantic analysis of SARS-CoV-2-related publications, which yielded a repertoire of SNPs, genes and disease ontologies. Population data from the 1000 Genomes Project were subsequently integrated into the pipeline. Data mining approaches of this scale have the potential to elucidate the complex interaction between COVID-19 pathogenesis and host genetic variation; the resulting knowledge can facilitate the management of high-risk groups and aid efforts towards precision medicine. Competing Interest Statement: The authors have declared no competing interest.