ABSTRACT
ChIP-seq reveals genomic regions where proteins, e.g. transcription factors (TFs), interact with DNA. A substantial fraction of these regions, however, does not contain the cognate binding site for the TF of interest. Protein-protein interactions, co-binding, and DNA looping can explain this phenomenon. We uniformly processed 3727 human ChIP-seq data sets and determined the cistrome for 292 TFs, together with the distances between the TF binding site (TFBS) centers and the ChIP-seq peak summits. In addition to providing a comprehensive human TFBS collection, the ChIPSummitDB database and web interface allow users to examine the topological arrangements of TF complexes on the DNA.
Footnotes
List of abbreviations
- Bp: base pair
- BWA: Burrows-Wheeler Aligner
- ChIP-seq: Chromatin Immunoprecipitation Sequencing
- CTCF: CCCTC-binding factor
- DB: Database
- dbSNP: Single Nucleotide Polymorphism Database
- ELF1: E74-like factor 1
- ETS1: E26 Oncogene Homolog 1
- GATA1: GATA-binding factor 1
- GM12878: Lymphoblastoid cell line
- HOMER: Hypergeometric Optimization of Motif EnRichment
- HTML: HyperText Markup Language
- HTTP: HyperText Transfer Protocol
- ID: identification number/identifier
- NCBI: National Center for Biotechnology Information
- NFYB: Nuclear transcription factor Y subunit beta
- PHP: Hypertext Preprocessor
- PWM: Position weight matrix
- RAD21: Double-strand-break repair protein (Scc1, Mcd1)
- SA1: Stromal Antigen 1
- SMC1/3: Structural maintenance of chromosomes proteins
- SNP: Single Nucleotide Polymorphism
- SQL: Structured Query Language
- SRA: Sequence Read Archive
- TAL1: T-cell acute lymphocytic leukemia protein 1
- TF: transcription factor
- TFBS: transcription factor binding site
- USF1: Upstream stimulatory factor 1
- XML: Extensible Markup Language
- YY1: Yin Yang 1
- ZNF143: Zinc Finger Protein 143
Copyright
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.