RT Journal Article
SR Electronic
T1 StereoGene: Rapid Estimation of Genomewide Correlation of Continuous or Interval Feature Data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 059584
DO 10.1101/059584
A1 Elena D. Stavrovskaya
A1 Tejasvi Niranjan
A1 Elana J. Fertig
A1 Sarah J. Wheelan
A1 Alexander Favorov
A1 Andrey Mironov
YR 2017
UL http://biorxiv.org/content/early/2017/05/25/059584.abstract
AB Motivation: Genomic features with similar genomewide distributions are generally hypothesized to be functionally related; for example, co-localization of histones and transcription start sites indicates chromatin regulation of transcription factor activity. Therefore, statistical algorithms to perform spatial, genomewide correlation among genomic features are required. Results: Here, we propose a method, StereoGene, that rapidly estimates genomewide correlation among pairs of genomic features. These features may represent high-throughput data mapped to a reference genome or sets of genomic annotations in that reference genome. StereoGene enables correlation of continuous data directly, avoiding data binarization and the subsequent data loss. Correlations are computed among neighboring genomic positions using kernel correlation. Representing the correlation as a function of genome position, StereoGene outputs the local correlation track as part of the analysis. StereoGene also accounts for confounders such as input DNA by partial correlation. We apply our method to numerous comparisons of ChIP-Seq datasets from the Human Epigenome Atlas and FANTOM CAGE to demonstrate its wide applicability. We observe changes in the correlation between epigenomic features across developmental trajectories of several tissue types that are consistent with known biology, and find a novel spatial correlation of CAGE clusters with donor splice sites and with poly(A) sites.
These analyses provide examples for the broad applicability of StereoGene for regulatory genomics. Availability: The StereoGene C++ source code, program documentation, Galaxy integration scripts and examples are available from the project homepage http://stereogene.bioinf.fbb.msu.ru/ Contact: favorov{at}sensi.org Supplementary information: Supplementary data are available online.