RT Journal Article
SR Electronic
T1 Variance-stabilized units for sequencing-based genomic signals
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.01.31.929174
DO 10.1101/2020.01.31.929174
A1 Faezeh Bayat
A1 Maxwell Libbrecht
YR 2020
UL http://biorxiv.org/content/early/2020/02/02/2020.01.31.929174.abstract
AB Sequencing-based genomic signals such as ChIP-seq are widely used to measure many types of genomic biochemical activity, such as transcription factor binding, chromatin accessibility and histone modification. The processing pipeline for these assays usually outputs a real-valued signal for every position in the genome that measures the strength of activity at that position. This signal is used in downstream applications such as visualization and chromatin state annotation. Several representations of signal strength at a given position are currently in use, including the raw read count, the fold enrichment over control, and the log p-value of enrichment relative to control. However, these representations lack the property of variance stabilization. That is, a difference between 100 and 200 reads usually has a very different statistical importance from a difference between 1,100 and 1,200 reads. Here, we propose VSS, variance-stabilized signals for sequencing-based genomic signals. We generate VSS by learning the empirical relationship between the mean and variance of a given signal data set and producing transformed signals that normalize for this dependence. We demonstrate that these variance-stabilized units have several desirable properties, including that differences in ChIP-seq signal at a gene across cell types indicate a difference in that gene’s expression. VSS units will eliminate the need for downstream methods to implement complex mean-variance relationship models, and will enable genomic signals to be easily understood by eye.
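
The idea in the abstract (learn an empirical mean-variance relationship, then transform the signal to remove that dependence) can be illustrated with a minimal sketch. This is not the authors' VSS implementation. It assumes two replicate signals are available, estimates the variance of one replicate within bins ordered by the other, and applies the standard delta-method transform t(x) = ∫ du / sqrt(var(u)). The function names `fit_mean_variance` and `make_stabilizer`, the bin count, and the simulated Poisson replicates are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_mean_variance(base, aux, n_bins=20):
    """Estimate (mean, variance) pairs from two replicate signals.

    Positions are sorted by the base replicate and split into roughly
    equal-size bins; the auxiliary replicate supplies each bin's mean
    and variance.
    """
    order = np.argsort(base)
    bins = np.array_split(np.asarray(aux, dtype=float)[order], n_bins)
    means = np.array([b.mean() for b in bins])
    variances = np.array([b.var() for b in bins])
    return means, variances

def make_stabilizer(means, variances, eps=1e-8):
    """Return t(x) = integral of 1/sqrt(var(u)) du (delta-method transform)."""
    idx = np.argsort(means)
    m = means[idx]
    v = np.maximum(variances[idx], eps)  # guard against zero variance
    integrand = 1.0 / np.sqrt(v)
    # Cumulative trapezoid integration over the sorted bin means.
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(m)
    t = np.concatenate([[0.0], np.cumsum(steps)])
    # Linear interpolation between bin means; values outside the fitted
    # range are clamped to the end points by np.interp.
    return lambda x: np.interp(x, m, t)

# Demo on simulated data (Poisson replicates are an assumption, used
# only to produce a signal whose variance grows with its mean).
rng = np.random.default_rng(0)
rate = rng.uniform(5, 2000, size=100_000)
rep1, rep2 = rng.poisson(rate), rng.poisson(rate)
means, variances = fit_mean_variance(rep1, rep2, n_bins=100)
transform = make_stabilizer(means, variances)

# Compare the two read-count differences mentioned in the abstract.
print(transform(200) - transform(100), transform(1200) - transform(1100))
```

Under these assumptions, the transformed gap between 100 and 200 reads comes out larger than the gap between 1,100 and 1,200 reads, which is the intuition behind the abstract's example: equal raw differences do not carry equal statistical weight when variance grows with the mean.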