PT - JOURNAL ARTICLE
AU - Hamid Alinejad-Rokny
AU - Rassa Ghavami
AU - Hamid R. Rabiee
AU - Narges Rezaei
AU - Kin Tung Tam
AU - Alistair R. R. Forrest
TI - MaxHiC: robust estimation of chromatin interaction frequency in Hi-C and capture Hi-C experiments
AID - 10.1101/2020.04.23.056226
DP - 2020 Jan 01
TA - bioRxiv
PG - 2020.04.23.056226
4099 - http://biorxiv.org/content/early/2020/04/25/2020.04.23.056226.short
4100 - http://biorxiv.org/content/early/2020/04/25/2020.04.23.056226.full
AB - Hi-C is a genome-wide chromosome conformation capture technology that detects interactions between pairs of genomic regions and explores higher-order chromatin structures. Conceptually, Hi-C data count interaction frequencies between every position in the genome and every other position. Biologically functional interactions are expected to occur more frequently than random (background) interactions. To identify biologically relevant interactions, several background models that take biases such as distance, GC content and mappability into account have been proposed. Here we introduce MaxHiC, a background correction tool that deals with these complex biases and robustly identifies statistically significant interactions in both Hi-C and capture Hi-C experiments. MaxHiC uses a negative binomial distribution model and a maximum likelihood technique to correct biases in both Hi-C and capture Hi-C libraries. We systematically benchmark MaxHiC against major Hi-C background correction tools and demonstrate, using published Hi-C and capture Hi-C datasets, that 1) interacting regions identified by MaxHiC have significantly greater levels of overlap with known regulatory features (e.g. active chromatin histone marks, CTCF binding sites, DNase sensitivity) and with disease-associated genome-wide association SNPs than those identified by existing models, and 2) the pairs of interacting regions are more likely to be linked by eQTL pairs and more likely to identify known enhancer-promoter pairs than those from any of the existing methods. We also demonstrate that interactions between different genomic region types have distinct distance distributions that are revealed only by MaxHiC. MaxHiC is publicly available as a Python package for the analysis of Hi-C and capture Hi-C data. Competing Interest Statement: The authors have declared no competing interest.