Sequence and chromatin determinants of cell-type–specific transcription factor binding

  1. Christina Leslie1,3
  1. 1Computational Biology Program, Memorial Sloan-Kettering Cancer Center, New York, New York 10065, USA;
  2. 2Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA

    Abstract

    Gene regulatory programs in distinct cell types are maintained in large part through the cell-type–specific binding of transcription factors (TFs). The determinants of TF binding include direct DNA sequence preferences, DNA sequence preferences of cofactors, and the local cell-dependent chromatin context. To explore the contribution of DNA sequence signal, histone modifications, and DNase accessibility to cell-type–specific binding, we analyzed 286 ChIP-seq experiments performed by the ENCODE Consortium. This analysis included experiments for 67 transcriptional regulators, 15 of which were profiled in both the GM12878 (lymphoblastoid) and K562 (erythroleukemic) human hematopoietic cell lines. To model TF-bound regions, we trained support vector machines (SVMs) that use flexible k-mer patterns to capture DNA sequence signals more accurately than traditional motif approaches. In addition, we trained SVM spatial chromatin signatures to model local histone modifications and DNase accessibility, obtaining significantly more accurate TF occupancy predictions than simpler approaches. Consistent with previous studies, we find that DNase accessibility can explain cell-line–specific binding for many factors. However, we also find that of the 10 factors with prominent cell-type–specific binding patterns, four display distinct cell-type–specific DNA sequence preferences according to our models. Moreover, for two factors we identify cell-specific binding sites that are accessible in both cell types but bound only in one. For these sites, cell-type–specific sequence models, rather than DNase accessibility, are better able to explain differential binding. Our results suggest that using a single motif for each TF and filtering for chromatin accessible loci is not always sufficient to accurately account for cell-type–specific binding profiles.

    Footnotes

    • 3 Corresponding author

      E-mail cleslie{at}cbio.mskcc.org

    • [Supplemental material is available for this article.]

    • Article and supplemental material are at http://www.genome.org/cgi/doi/10.1101/gr.127712.111.

      Freely available online through the Genome Research Open Access option.

    • Received June 15, 2011.
    • Accepted March 27, 2012.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.

    Related Articles

    | Table of Contents
    OPEN ACCESS ARTICLE

    Preprint Server