Abstract
Sequencing-based 3D genome mapping technologies can identify loops formed by interactions between regulatory elements hundreds of kilobases apart. Existing loop-calling tools are mostly restricted to a single data type, with accuracy dependent on a pre-defined resolution contact matrix or called peaks, and can have prohibitive hardware costs. Here we introduce cLoops (‘see loops’) to address these limitations. cLoops is based on the clustering algorithm cDBSCAN that directly analyzes the paired-end tags (PETs) to find candidate loops and uses a permuted local background to estimate statistical significance. These two data-type-independent processes enable loops to be reliably identified for both sharp and broad peak data, including but not limited to ChIA-PET, Hi-C, HiChIP and Trac-looping data. Loops identified by cLoops showed much less distance-dependent bias and higher enrichment relative to local regions than existing tools. Altogether, cLoops improves accuracy of detecting of 3D-genomic loops from sequencing data, is versatile, flexible, efficient, and has modest hardware requirements, and is freely available at: https://github.com/YaqiangCao/cLoops.