ABSTRACT
The metazoan genome is compartmentalized into megabase-scale regions of highly interacting chromatin known as topologically associating domains (TADs), typically identified by computational analysis of Hi-C sequencing data. TADs are demarcated by boundaries that are largely conserved across cell types and even across species. Increasing evidence, however, suggests that these seemingly invariant boundaries may exhibit plasticity and that their insulating strength can vary. Yet a genome-wide characterization of TAD boundary strength in mammals is still lacking, and a systematic classification and characterization of TAD boundaries may generate new insights into their function. In this study, we use the fused two-dimensional lasso, a machine-learning method, first to improve Hi-C contact matrix reproducibility and subsequently to categorize TAD boundaries by strength. We demonstrate that increased boundary strength is associated with elevated CTCF levels and that TAD boundary insulation scores may differ across cell types. Intriguingly, we observe that super-enhancer elements are preferentially insulated by strong boundaries. Furthermore, a pan-cancer analysis reveals that strong TAD boundaries and super-enhancer elements are frequently co-duplicated. Taken together, our findings suggest that super-enhancers insulated by strong TAD boundaries may be exploited, as a functional unit, by cancer cells to promote oncogenesis.
Footnotes
* To whom correspondence should be addressed. Tel: +16465012693; Email: Aristotelis.Tsirigos{at}nyumc.org; Correspondence may also be addressed to Iannis Aifantis. Tel: +1 212 263 9898; Fax: +1 212 263 9210; Email: Ioannis.Aifantis{at}nyumc.org