TY - JOUR
T1 - Loose ends in cancer genome structure
JF - bioRxiv
DO - 10.1101/2021.05.26.445837
SP - 2021.05.26.445837
AU - Julie M. Behr
AU - Xiaotong Yao
AU - Kevin Hadi
AU - Huasong Tian
AU - Aditya Deshpande
AU - Joel Rosiene
AU - Titia de Lange
AU - Marcin Imieliński
Y1 - 2021/01/01
UR - http://biorxiv.org/content/early/2021/05/27/2021.05.26.445837.abstract
N2 - Recent pan-cancer studies have delineated patterns of structural genomic variation across thousands of tumor whole genome sequences. It is not known to what extent the shortcomings of short read (≤ 150 bp) whole genome sequencing (WGS) used for structural variant analysis have limited our understanding of cancer genome structure. To formally address this, we introduce the concept of “loose ends” - copy number alterations that cannot be mapped to a rearrangement by WGS but can be indirectly detected through the analysis of junction-balanced genome graphs. Analyzing 2,319 pan-cancer WGS cases across 31 tumor types, we found loose ends were enriched in reference repeats and fusions of the mappable genome to repetitive or foreign sequences. Among these we found genomic footprints of neotelomeres, which were surprisingly enriched in cancers with low telomerase expression and alternate lengthening of telomeres phenotype. Our results also provide a rigorous upper bound on the role of non-allelic homologous recombination (NAHR) in large-scale cancer structural variation, while nominating INO80, FANCA, and ARID1A as positive modulators of somatic NAHR. Taken together, we estimate that short read WGS maps >97% of all large-scale (>10 kbp) cancer structural variation; the rest represent loose ends that require long molecule profiling to unambiguously resolve.
Our results have broad relevance for future research and clinical applications of short read WGS and delineate precise directions where long molecule studies might provide transformative insight into cancer genome structure. Competing Interest Statement: The authors have declared no competing interest.
ER -