Abstract
Structural variant (SV) detection in human genomes using short-read sequencing data is hindered by false positives arising from sequencing and mapping artifacts that mimic genuine SV signals. Despite advances, state-of-the-art SV callers such as GRIDSS and Manta exhibit trade-offs between precision and recall, with GRIDSS offering the highest precision and Manta excelling in recall. To address these limitations, we introduce sv-channels, a novel deep learning model designed to improve the precision of SV detection by leveraging read information at call sites. Our method effectively reduces false positives in Manta’s deletion callsets, achieving precision that surpasses GRIDSS while maintaining recall comparable to Manta. This represents a significant improvement in SV detection, pairing Manta’s high recall with deep learning and paving the way for more accurate genomic analyses. The sv-channels codebase is openly accessible on GitHub at https://github.com/GooglingTheCancerGenome/sv-channels, enabling further research and application in the field.
Competing Interest Statement
J.d.R. and W.P.K. are co-founders and directors of Cyclomics, a genomics company; they declare no competing interests. L.S. is an employee of JSR Life Sciences; he declares no competing interests. S.G., A.K., B.S.P., C.S., S.M., and L.R. declare no competing interests.