Abstract
Background Super-enhancers are clusters of transcriptional enhancers densely occupied by Mediator, transcription factors and chromatin regulators. They control the expression of cell identity genes and disease-associated genes. Current studies have demonstrated that multiple factors may play important roles in super-enhancer formation; however, a systematic analysis assessing the relative importance of the chromatin and sequence signatures of super-enhancers and their constituents is still lacking.
Results Here, we integrated diverse types of genomic and epigenomic datasets to identify key signatures of super-enhancers and their constituents and to investigate their relative importance. Through computational modelling, we identified Cdk8, Cdk9 and Smad3 as new key features of super-enhancers, along with many known features such as H3K27ac. Comprehensive analysis of these features in embryonic stem cells and pro-B cells revealed their role in super-enhancer formation and cellular identity. We also observed that super-enhancers are significantly GC-rich compared with typical enhancers. Further, we observed significant correlation among many cofactors at the constituents of super-enhancers.
Conclusions Our analysis and ranking of super-enhancer signatures can serve as a resource to further characterize and understand the formation of super-enhancers. Our observations are consistent with a cooperative or synergistic model underlying the interaction of super-enhancers and their constituents with numerous factors.
Footnotes
Authors’ email addresses: aziz.khan{at}ncmm.uio.no; zhangxg{at}tsinghua.edu.cn
Abbreviations
- TFs: Transcription Factors
- ChIP-seq: Chromatin immunoprecipitation followed by high-throughput sequencing
- ESCs: Embryonic Stem Cells
- SEs: Super-enhancers
- TEs: Typical enhancers
- TSS: Transcription Start Site
- SNPs: Single Nucleotide Polymorphisms
- GEO: Gene Expression Omnibus
- GO: Gene Ontology