TY - JOUR
T1 - The Tangent copy-number inference pipeline for cancer genome analyses
JF - bioRxiv
DO - 10.1101/566505
SP - 566505
AU - Barbara Tabak
AU - Gordon Saksena
AU - Coyin Oh
AU - Galen F. Gao
AU - Barbara Hill Meyers
AU - Michael Reich
AU - Steven E. Schumacher
AU - Lindsay C. Westlake
AU - Ashton C. Berger
AU - Scott L. Carter
AU - Andrew D. Cherniack
AU - Matthew Meyerson
AU - Rameen Beroukhim
AU - Gad Getz
Y1 - 2019/01/01
UR - http://biorxiv.org/content/early/2019/03/05/566505.abstract
N2 - Motivation: Somatic copy-number alterations (SCNAs) play an important role in cancer development. Systematic noise in sequencing and array data presents a significant challenge to the inference of SCNAs for cancer genome analyses. As part of The Cancer Genome Atlas (TCGA), the Broad Institute Genome Characterization Center developed the Tangent copy-number inference pipeline to generate copy-number profiles using single-nucleotide polymorphism (SNP) array and whole-exome sequencing (WES) data from over 10,000 pairs of tumors and matched normal samples. Here, we describe the Tangent pipeline, which begins with DNA sequencing data in the form of .bam files or raw SNP array probe-level intensity data, and ends with segmented copy-number calls to facilitate the identification of novel genes potentially targeted by SCNAs. We also describe a modification of Tangent, Pseudo-Tangent, which enables denoising through comparisons between tumor profiles when few normal samples are available. Results: Tangent Normalization offers substantial signal-to-noise ratio (SNR) improvements compared to conventional normalization methods in both SNP array and WES analyses. The improvement in SNRs is achieved primarily through noise reduction with minimal effect on signal. Pseudo-Tangent also reduces noise when few normal samples are available.
Tangent and Pseudo-Tangent are broadly applicable and enable more accurate inference of SCNAs from DNA sequencing and array data. Availability and Implementation: Tangent is available at https://github.com/coyin/tangent and as a Docker image (https://hub.docker.com/r/coyin/tangent). Tangent is also the normalization method for the Copy Number pipeline in Genome Analysis Toolkit 4 (GATK4). Contact: matthew_meyerson{at}dfci.harvard.edu, rameen{at}broadinstitute.org, gadgetz{at}broadinstitute.org
ER -