TY - JOUR
T1 - SVhound: Detection of future Structural Variation hotspots
JF - bioRxiv
DO - 10.1101/2021.04.09.439237
SP - 2021.04.09.439237
AU - Luis F Paulin
AU - Muthuswamy Raveendran
AU - R. Alan Harris
AU - Jeffrey Rogers
AU - Arndt von Haeseler
AU - Fritz J Sedlazeck
Y1 - 2021/01/01
UR - http://biorxiv.org/content/early/2021/04/10/2021.04.09.439237.abstract
N2 - Recent population studies use ever-growing sample sizes to investigate the diversity of a given population or species. These studies continually reveal new polymorphisms that lead to important insights into the mechanisms of evolution and are also important for the interpretation of these variations. Nevertheless, while the full catalog of variations across entire species remains unknown, we can predict which regions harbor additional variations that remain hidden and investigate their properties, thereby enhancing the analysis for potentially missed variants. To achieve this, we implemented SVhound (https://github.com/lfpaulin/SVhound), which, based on a population-level SV dataset, predicts regions that harbor novel SV alleles. We tested SVhound using subsets of the 1000 Genomes Project (1KGP) data and showed that it correlates highly with the full data set (average correlation over 2,800 tests, r=0.7136). Next, we used SVhound to investigate potentially missed or understudied regions across 1KGP and CCDG that included multiple genes. Lastly, we show the applicability of SVhound to a small, novel SV call set for rhesus macaque (Macaca mulatta) and discuss the impact and choice of its parameters. Overall, SVhound is a unique method to identify regions that potentially harbor hidden diversity in model and non-model organisms, and it can potentially also be used to ensure high quality of SV call sets. Competing Interest Statement: Fritz J Sedlazeck has received sponsored travel from Phase Genomics, Oxford Nanopore and PacBio.
ER -