RT Journal Article
SR Electronic
T1 Integrated quality control of allele-specific copy numbers, mutations and tumour purity from cancer whole genome sequencing assays
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2021.02.13.429885
DO 10.1101/2021.02.13.429885
A1 Jacob Househam
A1 Riccardo Bergamin
A1 Salvatore Milite
A1 William CH Cross
A1 Giulio Caravagna
YR 2021
UL http://biorxiv.org/content/early/2021/10/21/2021.02.13.429885.abstract
AB Cancer is a global health issue that places enormous demands on healthcare systems. Basic research, the development of targeted treatments, and the utility of DNA sequencing in clinical settings have been significantly improved with the introduction of whole genome sequencing. However, the broad applications of this technology come with complications. To date there has been very little standardisation in how data quality is assessed, leading to inconsistencies in analyses and disparate conclusions. Manual checking and complex consensus calling strategies often do not scale to large sample numbers, which leads to procedural bottlenecks. To address this issue, we present a quality control method that integrates somatic point mutations, allele-specific copy numbers, and tumour purity into a single quantitative score. We demonstrate its power via simulations, and on n = 2778 whole-genomes from PCAWG, on n = 10 multi-region whole-genomes of two colorectal cancers and on n = 48 whole-exomes from TCGA. Our approach significantly improves the generation of cancer mutation data, providing visualisations for cross-referencing with other analyses. The method is fully automated and designed to be compatible with any bioinformatic pipeline, and can automate tool parameterisation, paving the way for fast computational assessment of data quality in the era of whole genome sequencing. Competing Interest Statement: The authors have declared no competing interest.