ABSTRACT
The chromosome is a fundamental component of cell biology, housing DNA that encodes hierarchical genetic information. DNA compacts itself by forming loops, and these loop regions are associated with proteins such as CTCF, SMC3, and histone H3, as well as with Topologically Associating Domains (TADs). In this work, we conducted a comprehensive study of 22 loop calling methods. We describe the methodology underlying each loop detection algorithm and categorize the methods into five distinct groups based on their fundamental approaches. We also report critical information for each method, such as resolution, input and output formats, and parameters. For this analysis, we used the primary and replicate GM12878 Hi-C datasets at 5 kb and 10 kb resolutions. Our evaluation criteria included loop count, reproducibility, overlap, running time, Aggregated Peak Analysis (APA), and recovery of protein-specific sites such as CTCF, H3K27ac, and RNAPII. This analysis offers insights into the loop detection process of each method, along with its strengths and weaknesses, enabling readers to choose suitable methods for their datasets. We evaluate the capabilities of these tools and introduce a novel Biological, Consistency, and Computational robustness score (BCCscore) that measures their overall robustness, providing a comprehensive evaluation of their performance.
Competing Interest Statement
The authors have declared no competing interest.