ABSTRACT
Genome graphs are becoming increasingly prominent in genomic research. Despite their advantages, few techniques exist to comprehensively analyse the structural properties of genome graphs and systematically uncover the underlying genomic complexity of the population or species they represent. In this study, we formulated a novel framework to represent and capture the structural complexity of genome graphs. This approach makes it possible to visualise the entire human genome at once and to prioritise sites of interest for in-depth research. We applied the framework to visualise and compare the structural properties of two human pan-genome graphs: one augmented only with the variants commonly present in different human populations, and the other augmented with all variants, including rare ones. We also developed and benchmarked several genome-graph-based variant-calling workflows and used them to analyse human whole genomes. We compared the variant-calling performance of the two constructed graphs with each other and with the linear reference genome. We found that genome graphs are better reference structures than their linear counterparts, and that the proposed structural analysis framework can effectively analyse, visualise and compare the complexities embedded in them.
Competing Interest Statement
The authors have declared no competing interest.