TY  - JOUR
T1  - CHIPS: A Snakemake pipeline for quality control and reproducible processing of chromatin profiling data
JF  - bioRxiv
DO  - 10.1101/2021.03.09.434676
SP  - 2021.03.09.434676
AU  - Len Taing
AU  - Clara Cousins
AU  - Gali Bai
AU  - Paloma Cejas
AU  - Xintao Qiu
AU  - Myles Brown
AU  - Clifford A. Meyer
AU  - X. Shirley Liu
AU  - Henry W. Long
AU  - Ming Tang
Y1  - 2021/01/01
UR  - http://biorxiv.org/content/early/2021/03/10/2021.03.09.434676.abstract
N2  - Motivation: The chromatin profile measured by ATAC-seq, ChIP-seq, or DNase-seq experiments can identify genomic regions critical in regulating gene expression and provide insights on biological processes such as diseases and development. However, quality control and processing chromatin profiling data involve many steps, and different bioinformatics tools are used at each step. It can be challenging to manage the analysis. Results: We developed a Snakemake pipeline called CHIPS (CHromatin enrichment Processor) to streamline the processing of ChIP-seq, ATAC-seq, and DNase-seq data. The pipeline supports single- and paired-end data and is flexible to start with FASTQ or BAM files. It includes basic steps such as read trimming, mapping, and peak calling. In addition, it calculates quality control metrics such as contamination profiles, PCR bottleneck coefficient, the fraction of reads in peaks, percentage of peaks overlapping with the union of public DNaseI hypersensitivity sites, and conservation profile of the peaks. For downstream analysis, it carries out peak annotations, motif finding, and regulatory potential calculation for all genes. The pipeline ensures that the processing is robust and reproducible. Availability: CHIPS is available at https://bitbucket.org/plumbers/cidc_chips/src/master/ Contact: mtang{at}ds.dfci.harvard.edu: henry_long{at}dfci.harvard.edu Competing Interest Statement: The authors have declared no competing interest.
ER  -