Patterns of regulatory activity across diverse human cell types predict tissue identity, transcription factor binding, and long-range interactions

  1. Terrence S. Furey7,8
  1. Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina 27710, USA;
  2. Institute for Genome Sciences & Policy, Duke University, Durham, North Carolina 27710, USA;
  3. Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA;
  4. Bergen Center for Computational Science and Sars Centre for Marine Molecular Biology, University of Bergen, N-5008 Bergen, Norway;
  5. Department of Molecular Sciences, Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, United Kingdom; and MRC Clinical Sciences Centre, London W12 0NN, United Kingdom;
  6. Department of Pediatrics, Division of Medical Genetics, Duke University, Durham, North Carolina 27710, USA;
  7. Department of Genetics and Department of Biology, Carolina Center for Genome Sciences, Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina 27599, USA

    Abstract

    Regulatory elements recruit transcription factors that modulate gene expression distinctly across cell types, but the relationships among these remain elusive. To address this, we analyzed matched DNase-seq and gene expression data for 112 human samples representing 72 cell types. We first defined more than 1800 clusters of DNase I hypersensitive sites (DHSs) with similar tissue specificity of DNase-seq signal patterns. We then used these to uncover distinct associations between DHSs and promoters, CpG islands, conserved elements, and transcription factor motif enrichment. Motif analysis within clusters identified known and novel motifs in cell-type-specific and ubiquitous regulatory elements and supports a role for AP-1 regulating open chromatin. We developed a classifier that accurately predicts cell-type lineage based on only 43 DHSs and evaluated the tissue of origin for cancer cell types. A similar classifier identified three sex-specific loci on the X chromosome, including the XIST lincRNA locus. By correlating DNase I signal and gene expression, we predicted regulated genes for more than 500K DHSs. Finally, we introduce a web resource to enable researchers to use these results to explore these regulatory patterns and better understand how expression is modulated within and across human cell types.

    Footnotes

    • 8 Corresponding authors

      E-mail b.lenhard{at}csc.mrc.ac.uk

      E-mail greg.crawford{at}duke.edu

      E-mail tsfurey{at}email.unc.edu

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.152140.112.

      Freely available online through the Genome Research Open Access option.

    • Received November 17, 2012.
    • Accepted March 7, 2013.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.
