PT - JOURNAL ARTICLE
AU - N. Servant
AU - N. Varoquaux
AU - E. Heard
AU - JP. Vert
AU - E. Barillot
TI - Effective normalization for copy number variation in Hi-C data
AID - 10.1101/167031
DP - 2017 Jan 01
TA - bioRxiv
PG - 167031
4099 - http://biorxiv.org/content/early/2017/07/21/167031.short
4100 - http://biorxiv.org/content/early/2017/07/21/167031.full
AB - Normalization is essential to ensure accurate analysis and proper interpretation of sequencing data, and chromosome conformation data such as Hi-C are no exception. The most widely used type of normalization of Hi-C data casts the estimation of unwanted effects as a matrix balancing problem, relying on the assumption that every genomic region interacts as much as any other. Here, we show that these approaches, while very effective on fully haploid or diploid genomes, fail to correct for unwanted effects in the presence of copy number variations. We propose a simple extension to matrix balancing methods that properly models copy number variation effects. Our approach can either retain these effects or remove them. We show that this leads to better downstream analysis of the three-dimensional organization of rearranged genomes.
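
The matrix balancing idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a small, dense, symmetric contact matrix and a generic iterative-correction scheme in which each bin is repeatedly divided by its relative row sum. It is not the extension proposed in the article, whose details are not given in this record. The function name `balance_matrix`, its parameters, and the toy matrices are all hypothetical and chosen only for illustration.

```python
def balance_matrix(counts, max_iter=500, tol=1e-12):
    """Rescale a symmetric, non-negative matrix so that all non-empty rows
    share the same sum (generic iterative correction; illustrative only)."""
    n = len(counts)
    work = [[float(value) for value in row] for row in counts]
    bias = [1.0] * n  # accumulated per-bin bias estimate
    for _ in range(max_iter):
        row_sums = [sum(row) for row in work]
        nonzero = [s for s in row_sums if s > 0]
        if not nonzero:
            break
        mean_sum = sum(nonzero) / len(nonzero)
        # Relative visibility of each bin; empty bins are left untouched.
        step = [s / mean_sum if s > 0 else 1.0 for s in row_sums]
        for i in range(n):
            bias[i] *= step[i]
            for j in range(n):
                work[i][j] /= step[i] * step[j]
        # Stop once every bin is as visible as the average bin.
        if max(abs(s - 1.0) for s in step) < tol:
            break
    return work, bias


# Toy example (made-up numbers): a balanced "true" contact map distorted
# by a multiplicative per-bin bias, as the equal-visibility model assumes.
true_contacts = [
    [4, 2, 1, 1],
    [2, 4, 1, 1],
    [1, 1, 4, 2],
    [1, 1, 2, 4],
]
true_bias = [1.0, 2.0, 0.5, 1.5]
observed = [
    [true_bias[i] * true_bias[j] * true_contacts[i][j] for j in range(4)]
    for i in range(4)
]

normalized, estimated_bias = balance_matrix(observed)
row_sums = [sum(row) for row in normalized]
```

In this toy case the bias is purely multiplicative per bin, which is exactly the situation the equal-visibility assumption covers, so the balanced matrix comes back proportional to the undistorted one. The abstract's point is that copy number variations violate this assumption, which plain balancing of this kind does not model.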