High-throughput identification of long-range regulatory elements and their target promoters in the human genome

Nucleic Acids Res. 2013 May;41(9):4835-46. doi: 10.1093/nar/gkt188. Epub 2013 Mar 21.

Abstract

Enhancer elements are essential for tissue-specific gene regulation during mammalian development. Although these regulatory elements are often distant from their target genes, they affect gene expression by recruiting transcription factors to specific promoter regions. Because of this long-range action, the annotation of enhancer element-target promoter pairs remains elusive. Here, we developed a novel analysis methodology that takes advantage of Hi-C data to comprehensively identify these interactions throughout the human genome. To do this, we used a geometric distribution-based model to identify DNA-DNA interaction hotspots that contact gene promoters with high confidence. We observed that these promoter-interacting hotspots significantly overlap with known enhancer-associated histone modifications and DNase I hypersensitive sites. Thus, we defined thousands of candidate enhancer elements by incorporating these features, and found that they have a significant propensity to be bound by p300, an enhancer-binding transcription factor. Furthermore, we revealed that their target genes are significantly bound by RNA Polymerase II and demonstrate tissue-specific expression. Finally, we uncovered that these elements are generally found within 1 Mb of their targets, and often regulate multiple genes. In total, our study presents a novel high-throughput workflow for confident, genome-wide discovery of enhancer-target promoter pairs, which will significantly improve our understanding of these regulatory interactions.
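The abstract does not spell out how the geometric distribution-based model scores interactions, so the following is only an illustrative sketch of one way such a model could flag hotspots, not the authors' implementation. It assumes that each genomic bin carries a count of Hi-C read pairs linking it to a promoter, that background counts follow a geometric distribution on {0, 1, 2, ...}, and that a bin is called a hotspot when its upper-tail probability falls below a cutoff. The function names, the bin labels, the counts, and the `alpha` cutoff are all hypothetical.

```python
from typing import Dict, List


def fit_geometric(counts: List[int]) -> float:
    """Maximum-likelihood estimate of p for a geometric distribution
    on {0, 1, 2, ...}, where P(X = k) = (1 - p)**k * p.
    The estimate is p = 1 / (1 + sample mean)."""
    if not counts:
        raise ValueError("counts must not be empty")
    mean = sum(counts) / len(counts)
    return 1.0 / (1.0 + mean)


def tail_pvalue(k: int, p: float) -> float:
    """Upper-tail probability P(X >= k) = (1 - p)**k under the fitted model."""
    return (1.0 - p) ** k


def call_hotspots(counts: Dict[str, int], alpha: float = 0.01) -> Dict[str, float]:
    """Return the bins whose tail probability is below alpha (assumed cutoff).

    The background is fitted on all bins, including the candidates
    themselves, which is a simplification made for this sketch."""
    p = fit_geometric(list(counts.values()))
    hotspots: Dict[str, float] = {}
    for bin_id, count in counts.items():
        pvalue = tail_pvalue(count, p)
        if pvalue < alpha:
            hotspots[bin_id] = pvalue
    return hotspots


# Hypothetical promoter-contact counts per genomic bin (made-up numbers).
example_counts = {
    "bin_a": 0, "bin_b": 1, "bin_c": 0, "bin_d": 2,
    "bin_e": 1, "bin_f": 0, "bin_g": 12, "bin_h": 0,
}
print(call_hotspots(example_counts))
```

With these made-up counts, only `bin_g` falls below the cutoff. A genome-wide analysis would also need a multiple-testing correction and a background fitted separately from the candidate bins, both of which are left out here for brevity.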

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Base Sequence
  • Binding Sites
  • Conserved Sequence
  • DNA / metabolism
  • Enhancer Elements, Genetic*
  • Gene Expression
  • Genome, Human*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Nucleotide Motifs
  • Promoter Regions, Genetic*
  • RNA Polymerase II / metabolism
  • Sequence Analysis, DNA
  • Vertebrates / genetics
  • p300-CBP Transcription Factors / metabolism

Substances

  • DNA
  • p300-CBP Transcription Factors
  • p300-CBP-associated factor
  • RNA Polymerase II