Spark: A navigational paradigm for genomic data exploration

  1. Steven J.M. Jones1
  1. Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, V5Z 4S6, Canada;
  2. School of Computing Science, Simon Fraser University, Burnaby, British Columbia, V5A 1S6, Canada;
  3. The Genome Center, University of California-Davis, Davis, California 95616, USA;
  4. Epigenome Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA;
  5. Center for Genome Sciences and Systems Biology, Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63108, USA;
  6. Brain Tumor Research Center, Department of Neurosurgery, Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California 94143, USA;
  7. Department of Microbiology and Immunology, Centre for High-Throughput Biology, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
    • 8 Present address: Department of Biochemistry & Molecular Biology, Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California 90089, USA.

    Abstract

    Biologists possess the detailed knowledge critical for extracting biological insight from genome-wide data resources, and yet they are increasingly faced with nontrivial computational analysis challenges posed by genome-scale methodologies. To lower this computational barrier, particularly in the early data exploration phases, we have developed an interactive pattern discovery and visualization approach, Spark, designed with epigenomic data in mind. Here we demonstrate Spark's ability to reveal both known and novel epigenetic signatures, including a previously unappreciated binding association between the YY1 transcription factor and the corepressor CTBP2 in human embryonic stem cells.

    Footnotes

    • Received March 16, 2012.
    • Accepted August 10, 2012.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.
