Abstract
Copy number variants (CNVs) play important roles in many biological processes, including the development of genetic diseases, making them attractive targets for genetic analysis. This has led to demand for interpretation tools that relieve researchers, laboratory diagnosticians, genetic counselors and clinical geneticists of the laborious process of annotating and classifying CNVs. Here we demonstrate that the prediction of the clinical impact of CNVs can be automated using modern machine learning methods applied to publicly available genomic annotations, requiring only basic input information about the genomic location and structural type (duplication/deletion) of the analyzed CNV. The presented approach achieved a prediction accuracy of 0.95 on deletions and 0.96 on duplications from the ClinVar dataset and therefore has great potential to guide users to more precise conclusions.
Competing Interest Statement
All authors are employees of Geneton Ltd., where they also participate in the development of a commercial application for the annotation and interpretation of CNVs. The presented method was filed as a patent application under the number PCT/EP2020/025292. Apart from the above, the authors declare no conflicts of interest. The presented work was supported by the Slovak Research and Development Agency (grant ID APVV-18-0319) (20% of charges) and the 'REVOGENE - Research centre for molecular genetics' project (ITMS 26240220067) supported by the Operational Programme Research and Development funded by the ERDF (80% of charges).