Abstract
Motivation With the increasing amount of genomic and epigenomic data in the public domain, a pressing challenge is how to integrate these data to investigate the role of epigenetic mechanisms in regulating gene expression and maintaining cell identity. To this end, we have implemented a computational pipeline to systematically study epigenetic variability and uncover DNA sequences that play a role in gene regulation.
Results Haystack is a bioinformatics pipeline that characterizes hotspots of epigenetic variability across different cell types, as well as cell-type-specific cis-regulatory elements and their corresponding transcription factors. Our approach is generally applicable to any epigenetic mark and provides an important tool to investigate cell-type identity and the mechanisms underlying epigenetic switches during development. Additionally, we make available a set of precomputed tracks for a number of epigenetic marks across several cell types. These precomputed results may be used as an independent resource for functional annotation of the human genome.
Availability The Haystack pipeline is implemented as an open-source, multiplatform Python package called haystack_bio, available at https://github.com/pinellolab/haystack_bio.
Contact lpinello{at}mgh.harvard.edu, gcyuan{at}jimmy.harvard.edu