Transcription factor and chromatin features predict genes associated with eQTLs

Nucleic Acids Res. 2013 Feb 1;41(3):1450-63. doi: 10.1093/nar/gks1339. Epub 2012 Dec 28.

Abstract

Cell type-specific gene expression in humans involves complex interactions between regulatory factors and DNA at enhancers and promoters. Mapping studies for expression quantitative trait loci (eQTLs), transcription factors (TFs) and chromatin markers have become widely used tools for identifying gene regulatory elements, but prediction of target genes remains a major challenge. Here, we integrate genome-wide data on TF-binding sites, chromatin markers and functional annotations to predict genes associated with human eQTLs. Using a random forest classifier, we found that genomic proximity plus five TF and chromatin features can predict >90% of target genes within 1 megabase of eQTLs. Although proximity is regularly used to map target genes, it is not a good indicator of eQTL targets for genes 150 kilobases away; insulators, TF co-occurrence, open chromatin and functional similarities between TFs and genes are better indicators. Using all six features in the classifier achieved an area under the specificity and sensitivity curve of 0.91, compared with at most 0.75 for any single feature. We hope this study will not only help validate eQTL-mapping studies, but also provide insight into the molecular mechanisms by which genetic variation can influence gene expression.
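
The abstract's headline comparison is an area under the specificity and sensitivity curve, which is assumed here to be the standard ROC AUC. The sketch below shows how that metric can be computed for a set of candidate eQTL-gene pairs. It is illustrative only: the function name `roc_auc`, the labels and both score lists are made-up toy values, not data or code from the study, and no classifier is trained.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 for a true eQTL-gene pair, 0 otherwise.
    scores: higher means "more likely to be a true pair".
    The AUC equals the probability that a randomly chosen positive
    outscores a randomly chosen negative; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, hypothetical values: three true pairs followed by three false pairs.
labels = [1, 1, 1, 0, 0, 0]
# Hypothetical score from a single feature (for example, proximity alone).
single_feature_score = [0.9, 0.2, 0.4, 0.8, 0.3, 0.1]
# Hypothetical score from a classifier that combines several features.
combined_score = [0.9, 0.6, 0.7, 0.5, 0.3, 0.1]

auc_single = roc_auc(labels, single_feature_score)
auc_combined = roc_auc(labels, combined_score)
print(f"single feature AUC: {auc_single:.2f}")
print(f"combined AUC:       {auc_combined:.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.91 and 0.75 should be read.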

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Binding Sites
  • Chromatin / chemistry
  • Chromatin / metabolism*
  • Chromosomes
  • Enhancer Elements, Genetic
  • Gene Expression
  • Histones / metabolism
  • Humans
  • Promoter Regions, Genetic
  • Quantitative Trait Loci*
  • Regulatory Elements, Transcriptional*
  • Transcription Factors / metabolism*

Substances

  • Chromatin
  • Histones
  • Transcription Factors