Abstract
Hi-C data is commonly normalized using single-sample processing methods, with a focus on comparisons between regions within a given contact map. Here, we aim to compare contact maps across different samples. We demonstrate that unwanted variation, of likely technical origin, is present in Hi-C data with replicates from different individuals, and that the properties of this unwanted variation change across the contact map. We present BNBC, a method for normalization and batch correction of Hi-C data, and show that it substantially improves comparisons across samples.
Footnotes
Extensive revision. 1. Our main comparison is now to ICE instead of HiCNorm. 2. We have added a HiC-QTL analysis. 3. Many figures have been removed as superfluous, and the text has been extensively revised.