Skip to main content
bioRxiv
  • Home
  • About
  • Submit
  • ALERTS / RSS
Advanced Search
New Results

HBP: an integrative and flexible pipeline for the interaction analysis of Hi-C dataset

Chao He, Ping Li, Minglei Shi, Yan Zhang, Bingyu Ye, Dejian Xie, Wenlong Shen, Zhihu Zhao
doi: https://doi.org/10.1101/083576
Chao He
1Beijing Institute of Biotechnology, Beijing 100071, China.
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Ping Li
1Beijing Institute of Biotechnology, Beijing 100071, China.
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Minglei Shi
1Beijing Institute of Biotechnology, Beijing 100071, China.
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Yan Zhang
1Beijing Institute of Biotechnology, Beijing 100071, China.
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Bingyu Ye
1Beijing Institute of Biotechnology, Beijing 100071, China.
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Dejian Xie
1Beijing Institute of Biotechnology, Beijing 100071, China.
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
Wenlong Shen
1Beijing Institute of Biotechnology, Beijing 100071, China.
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • For correspondence: shenwl1988@gmail.com zhaozh@bmi.ac.cn
Zhihu Zhao
1Beijing Institute of Biotechnology, Beijing 100071, China.
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • For correspondence: shenwl1988@gmail.com zhaozh@bmi.ac.cn
  • Abstract
  • Full Text
  • Info/History
  • Metrics
  • Supplementary material
  • Preview PDF
Loading

Abstract

Background The spatial organization of interphase chromatin in the nucleus play an important role in gene expression regulation and function. With the rapid development of revolutionized chromosome conformation capture technology and its genome-wide derivatives such as Hi-C, investigation of the genome folding becomes more efficient and convenient. How to robustly deal with these massive datasets and infer accurate 3D model and within-nucleus compartmentalization of chromosomes becomes a new challenge.

Result The implemented pipeline HBP (Hi-C BED file analysis Pipeline) integrates existing pipelines focusing on individual steps of Hi-C data processing into an all-in-one package with adjustable parameters to infer the consensus 3D structure of genome from raw Hi-C sequencing data. What’s more, HBP could assign statistical confidence estimation for chromatin interactions, and clustering interaction loci according to enrichment tracks or topological structure automatically.

Conclusion The freely available HBP is an optimized and flexible pipeline for analyzing the folding of whole chromosome and interactions between some specific sites from the Hi-C raw sequencing reads to the partially processed datasets. The other complex genetic and epigenetic datasets from public sources such as GWAS, ENCODE consortiums etc. will also easily be integrated into HBP, hence the final output results of HBP could provide a comprehensive in-depth understanding for the specific chromatin interactions, potential molecular mechanisms and biological significance. We believe that HBP is a reliable tool for the rapidly analysis of Hi-C data and will be very useful for a wide range of researchers, particularly those who lack of background in computational biology. HBP is freely accessible at https://github.com/hechao0407/HBP/blob/master/HBP_1.0.tar.gz.

Background

Numerous light microscopy studies have shown that chromatin is tightly packaged and organized into higher-order dynamic structures, which is essential for the genome function[1–3]. Indeed, the organization of chromatin was found to play important regulation roles in many biological processes such as transcription, DNA replication, repair and even translation [4–6]. With the occurrence and rapid development of DNA-FISH[3], other super-resolution imaging[7–9] and molecular techniques, particularly newly developed 3C (Chromosome Conformation Capture)[10] and its genome-wide derivatives such as 4C[11, 12], 5C[13], ChlA-PET[14], Hi-C[15], series of Capture-C[16] etc., the study of the chromosome folding and dynamics has made great progress in recent years and the resolution achieves an unprecedented high-level[17, 18]. In fact, the resolution of Hi-C for human genome already achieved to 1-kilobase [19], which could even be reached at the 64bp theoretically as the very recent work demonstrated [20].

Combined with the numerous newly developed computational modeling and data processing methods[21], the conventional or ‘ensemble’ C-series studies of cell populations irrefutably revealed that the interphase chromatin is hierarchically and dynamic folded to form the higher-order fractal globule structure and chromosome territories, which is coincided with the results of the DNA-FISH at the single cell level in most cases[17, 18]. It also revealed the TAD (Topological Associated Domain) is the conserved structural and functional units of the chromatin folding [22, 23]. Each TAD is composed of sub-TADs and numerous loop interactions that within the same TADs are much higher than those across TADs. The different TADs could further cluster into larger-scale compartments corresponding to transcription active or inactive chromatin[24]. Although numerous different computational modeling methods such as GenomicInteractions[25], HiTC[26], hiclib[27], ChromContact[28], HiCPlotter[29], HiC-Pro[21, 30] etc. have been developed and promote the systematic, global analysis of Hi-C results, unfortunately, we still do not have an integrated general platform to analyze interactions of global and specific sequences from mapping, filtering, interaction calling, statistical evaluation and visualization of interactions automatically. Recently, Cournac and colleagues developed some program to analyze interactions between repetitive elements[31], but the program only calculate co-localization scores to evaluate genomic imprints of the 3D folding, and thus cannot be used in a single command-line manner either. The GenomicInteractions [25] is also able to analyze the interaction network of specific sites. However, it can neither process data from raw Hi-C reads, add tracks to investigate the enrichment of histones or transcription factor binding sites, nor perform the statistical analysis. HiCPlotter focuses on the association between global Hi-C interaction heatmap and other epigenetic features such as histone modification[29]. But it cannot investigate interactions between specific types of chromatin sequences. ChromContact is a web tool for analyzing spatial contact chromosomes. It can retrieve all the information of loci of interest from the UCSC browser directly[28]. However, it is incompatible with researcher’s user-supplied data, and cannot provide information such as interaction frequency or probability etc.

To fulfill the great challenges and provide an integrative resolution for Hi-C data processing, we present HBP, an open-source, easy-to-use complete pipeline to provide the tools for the sequence mapping of raw reads, extraction and normalization of chromosome interactions, heatmap visualization of interaction frequency, network analysis of specific interactions, statistical evaluation with adjustable parameters. HBP is an intact, integrative all-in-one workflow for the whole process of Hi-C data mapping, filtering and visualizing. HBP can start either from Hi-C raw dataset or other existing interaction frequency matrix. Here, we illustrate features of HBP by applying it to some publicly available datasets, and also show some interesting findings.

Implementation

HBP is a publicly available R package for processing Hi-C datasets. It uses Hi-C raw datasets or interaction matrix files from hiclib[27] or HiC-Pro [30], a BED format file containing specific sites information. HBP is also compatible with the tracks files of histone modification or transcription factors DNA binding motif enrichment information of ChIP-seq which can be downloaded from the UCSC [32] or user-own-supplied files. Users can examine the interaction frequency distribution, explore the network topology information such as the list of name, degree, closeness, betweenness, local cluster coefficient, eigenvector centrality of network nodes etc. Besides providing the global scatter diagram to pinpoint locations of specific sites, circos[33] pictures with different tracks, network clusters according to these tracks or topological structure of the global genome and statistical evaluation, HBP could also show the frequency, number and distribution of interactions between specific sites in the genome. These information are very important for the understanding of the chromatin interaction networks. Here we provide vignettes detailing the use of HBP in Hi-C data.

Data import

HBP can import different categories datasets such as the raw Hi-C sequencing files or the output files from HiC-Pro [30], specific sites information of standard BED format and tracks information of WIG format etc. This flexible feature allows users to easily integrate the datasets from many different sources such as ENCODE[34], modENCODE [35], GWAS [36]consortium and UCSC [32] etc.

Interaction frequency distribution

Following the Gu’s work [37], HBP sets up and down thresholds, and discards the interaction-pairs whose frequencies are lower than the down threshold or higher than the up threshold. Then HBP separates chromatin interaction frequencies into many bins (referred to as frequency bin), and computes density for specific sites in each frequency bins. Meanwhile, HBP will compare the results with that of random control groups which were generated according to the distribution of these imported specific sites. Based on these imported specific region, every member of random control groups was calculated and made in the similar region. This comparison results make it easier for users to ascertain whether the observed interactions were biologically significant and whether these specific sites have effect on the conformation of chromatin or not.

Interaction network topological analysis

HBP maps sequence sites of interest to the probability matrix, and calculate the interaction frequency and pinpoint the interaction networks mediated by these sites. The igraph package was used to plot an interaction network graph and make topological cluster analysis. According to the clustering results, users can classify all sites of interest into different types or clusters, and examine their potential differential properties respectively.

Visualization of interactions and tracks

The package can visualize the interaction of interested sequences. It is compatible with the adding of some tracks such as histone modifications or transcription factors binding sites enrichment. From the graph, users can easily find out the association rules of the chromatin interactions and other genetic and epigenetic features of the interested sequences, put forward hypotheses and further design related experiments to verify.

Statistical significance tests

HBP can also perform cluster analysis using histone modifications or transcription factors binding tracks, and make Kruskal-Wallis rank sum or T test to evaluate the statistical difference of interaction frequency, or the interaction strength of the specific sites respectively. These tests will tell us whether there exist significant differences, suggesting different properties of these sites and the interactions.

RESULT

Usage examples

Investigating effect of different repeat elements on the chromatin conformation

The repeat sequence accounts of around 50% of human genome and is reported to be related to many different physiological process of genome [34, 38]. We and others previously reported some repeat sequences are involved in chromosome interactions [11, 31, 37]. Here we examine whether all the different kinds of repeat sequence might mediate extensive chromatin interactions. To this aim, we provide a demonstration of using HBP to perform an example analysis of Hi-C data from human IMR90 cells (GEO dataset GSM1551600) [19]. With the built-in function to import data from HiC-Pro interaction file format, HBP pre-processed the Hi-C data by HiC-Pro package, and the result was normalized by the iterative correction algorithm of HiC-Pro to remove systematic noise and bias. After downloaded from Repbase in the form of FASTA file [39], and aligned to hg19 by Blat [40], the datasets of repeat elements of L1, LTR retrotransposon, SINE, satellite, ERV2 and DNA transposons etc. were inputted into our pipeline. Then the interactions frequency distribution associated with these repeat elements was estimated. To confirm the difference, HBP also set up random control groups corresponding to the distribution of different repeat elements. Figure 2 shows the distribution of six repeat elements families. While the interaction frequency of satellite and SINE involved is much lower than random controls, that of L1 is higher than random groups. These results indicated as a family, L1 but not the satellite and SINE repeat elements, is globally mediated or maintained chromatin interactions, which is agreement with a previous model [41]. Compared with control, the frequency of other elements is fluctuated, suggesting these elements maybe need to be classified into different sub-families for further accurate analysis.

Figure 1.
  • Download figure
  • Open in new tab
Figure 1.

The workflow of HBP. This workflow contains the main content of the software, can be used to analyze the interaction between some specific sites.

Figure 2.
  • Download figure
  • Open in new tab
Figure 2.

Interactions frequency distribution of different repeat elements. The red line represents interactions frequency distribution of repeat element, and the box plot represents distribution of random control.

Investigating the CTCF interaction network in human GM12878 cells

Alone or in combination with other protein and even RNA regulators, CTCF mediated extensive chromatin interactions [42, 43], but the topological features of CTCF-involved chromatin interaction networks remain elusive. Here we examined this issue by inputting the human CTCF DNA binding peaks of interphase chromatin into our pipeline firstly. GM12878 CTCF peaks were downloaded from ENCODE project [34, 38], GM12878 Hi-C data was downloaded from Rao’s work (GEO dataset GSM1551591) [19]. HBP processed the Hi-C dataset by HiC-Pro, invalid pairs and duplicates were removed automatically.

All CTCF-involved interactions, defined as both sites of interactions with CTCF-binding peaks around, were filtered from the dataset, and the interaction network and interaction frequency distribution were checked by HBP. As an example, we just examined the chosen region of chr8:69,300,000-74,300,000. Firstly, it was found that both the enrichments (Figure 3 a,b) and peaks (Figure 3 c,d) of the global chromosome interactions correlated with CTCF-involved specific interactions very well. These results, along with the previous comparison of ChlA-PET with Hi-C [44], strongly indicated CTCF is the main architectural protein bridging genome topology[42]. Secondly, it was found all CTCF peaks within this region (98 in total) were involved in the interaction network and have either higher or lower interactions with each other (Figure 4 a), additionally, the interaction network plotted by HBP could be sorted into two clusters (contained 46, 52 nodes respectively) according to network topological structure (Figure 4 b,c). These interaction clusters were defined as of TSC (Topological Structure Cluster), which had different distribution relative to genes position. For example, while majority of TSC1 peaks from intron, TSC2 didn’t have peaks in the 3’UTR and exons regions (Fig 4 b). The different distribution of two TSCs relative to gene position implied each of these TSCs might has different regulation function, which needs further examinations. What’s more, just like Fig 3 d showed, the two clusters just belonged to two different peaks respectively, and they were separated by the TAD boundary at chr8:71,760,000-71,800,000. Additionally, we also found the node in chr8:73,199,355-73,200,024 had the highest degree (degree = 79) and connected with more than half nodes, which suggested that this node might function as a connector or hub of the whole network. To further characterize this node, we looked into this area by UCSC genome browser. We found that it was a DNase I Hypersensitivity sites cluster with many transcription factors binding sites of EBF1, IRF4, ZNF143, RAD21, SMC3, etc. within (Figure 4 c). This result provide a potential molecular mechanism explanation for the network formation.

Figure 3.
  • Download figure
  • Open in new tab
Figure 3.

Interaction distribution of chr8: 69,300,000-74,300,000. a Hi-C interaction heatmap of this region. b CTCF involved interaction distribution of this region. c The reads counts of each bin in Hi-C interaction heatmap. d The reads counts of each bin in CTCF involved interaction distribution.

Figure 4.
  • Download figure
  • Open in new tab
Figure 4.

Interaction network of chr8:69,300,000-74,300,000. a The network picture without cluster. The color of interaction line represent the read counts, the size of each node represent the degree, and the color of each node represent topological structure clusters. b c The distribution of TSC1 and TSC2. The two clusters have different characteristics in relation to genes. d The highest degree site in chr8:69,300,000-74,300,000 plotted by UCSC. It is a DNase I Hypersensitivity sites cluster with many transcription factors binding sites of EBF1, IRF4, ZNF143, RAD21, SMC3, etc.

In order to learn more about the characteristics of CTCF involved interactions, we used HBP to plot a circos picture (Figure 5 a) by adding 6 histone modification features of H3K4me1, H3K4me2, H3K4me3, H3K27ac, H3K36me3 and H3K79me2. In the picture, the interaction stronger, the color deeper. We could clearly found that some interaction nodes have H3K4me1 and H3K4me2 enrichment from the tracks heatmap, which are posttranslational marks associated with enhancer regions. To know more details, all above mentioned six histone modification datasets were inputted and histone modification clusters were made by HBP, a cluster tree and a heatmap sorted by these clusters was plotted (Figure 5 b). These clusters had different modification enrichment according to the heatmap. While cluster 2 has very lower even no any enrichment of all 6 histone modification, cluster 1 and 3 have relative higher enrichment of all 6 histone modifications. Additionally, compared with cluster 3, cluster1 had higher H3K27ac enrichment but lower H3K36me3 enrichment, cluster3 was just on the contrary. As the Histone H3K27ac separates active from poised enhancers[45, 46] and higher H3K36me3 enrichment marks the active exon[46, 47], these results strongly suggested CTCF cluster 1 is active enhancer whereas cluster 3 is active expressed gene bodies, especially exons.

Figure 5.
  • Download figure
  • Open in new tab
Figure 5.

Circos plot and tracks clusters heatmap of chr8:69,300,000-74,300,000. a Circos plot of the defined region. From outside to inside, the six tracks are H3K27ac, H3k4me1, H3k4me2, H3K4me3, H3k9ac, and H3k9me3 respectively. b Tracks clusters heatmap of the defined region. The three clusters are separated by the black line in this picture. The blue line represents the value of corresponding site. c The histone modification picture plotted by UCSC. An example node from cluster 1 of b with the black square line. It is a strong active enhancer with H3K4me1, H3K4me2, H3K27ac enrichment peaks.

To further quantitatively determine whether there existed statistical differences in these histone modifications between these CTCF binding sites and other sites, Kruskal-Wallis rank sum tests were made by HBP. The test shown that H3k4me1, H3k4me2, H3k4me3, H3K27ac, H3K36me3 and H3K79me2 markers existed significant differences (p = 4.17e-05, 4.93e-03, 1.67e-07, 1.21e-03, 4.55e-05, 5.30e-04 respectively). Then T Test was carried out to estimate histone modification level of each binding site or node, and a node list containing information about modification level higher or lower than average level was obtained (Supple Table 1). What’s interesting is that 61.2% of the node shad the coherent tendency of H3K4me1, H3K4me2 and H3K27ac relative high enrichment level, which is the typical markers of active enhancers [45]. We chose one node for further analysis. It indeed just located within a super-enhancer (Figure 5 c). Super-enhancers are clusters of transcriptional enhancers that usually drive expression of genes important for defining cell identity during development[48]. What’s more, the H3K36me3 and H3K79me2 modification level in 73.5% nodes showed similar tendency, it was suggested that these site might work together in the boundary of extron-intron regions [45]. At last, T Test was also used to evaluate the interaction number between CTCF binding sites. HBP selected 100 random group with the similar distribution of CTCF binding sites to set up control. It was found that while the interaction number of random group was 1393 ˜ 1409 at 95 percent confidence interval, that of the CTCF binding sites involved in this region is 1423 (p = 1.21e-05). These results showed that in chr8:69,300,000-74,300,000, CTCF involved interaction number was much higher than random group, in other words, CTCF binding sites were more likely to interact with each other in this region.

Conclusions

We proposed a new pipeline, HBP, to analyze global and defined interactions between specific sites from Hi-C data. Compared with other software, HBP is more optimized and flexible by providing many user-adjustable parameters, and can process Hi-C data from raw sequence or already partially processed data in an all-in-one manner. All these unique features make HBP will be a robust and useful tools for the investigation of genome hierarchical folding and functions, particularly for researchers without strong background in computational biology.

Availability and requirements

HBP is a publicly available R package available from https://github.com/hechao0407/HBP/blob/master/HBP_1.0.tar.gz. This package requires R > = 3.3.0 and depend on several R packages including optparse, grid, lattice, IDPmisc, OmicCircos, stringr, ggplot2, igraph, reshape2, pgirmess, coin, multcomp, flexclust, HiTC, rtracklayer and gplots. All of the analyses and figures presented in the paper can be reproduced by using HBP.

Ethics statement

Ethics statement is not applicable to our study as this study only uses publicly available data.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

CH, WLS and ZhZ conceived the research. CH designed the software. CH, PL, MIS, YZ, WIS and ZhZ contributed to the development of software and documentation. CH is responsible for the maintenance of software. All authors contributed to the writing of the manuscript. All authors read and approved the final manuscript.

Acknowledgements

The authors would like to thank Zhihu Zhao, Wenlong Shen, and Jing Zeng for their helpful discussions during the development of this package. This work was supported by National Basic Research Program of China [2013CB966802], National Natural Science Foundation of China [31370762, 31030026, 31272416, 81372218] and National Key Technology R&D Program of China [2012BAI01B07].

Footnotes

  • Email address: Chao He: hechao2010{at}tsinghua.org.cn, Ping Li: lipingtry{at}163.com, Minglei Shi: shiml79{at}126.com, Yan Zhang: zany1983{at}163.com, Bingyu Ye: bby0804{at}gmail.com, Dejian Xie: ncuskxiedejian{at}163.com, Wenlong Shen: shenwl1988{at}gmail.com, Zhihu Zhao: zhaozh{at}bmi.ac.cn

References

  1. 1.↵
    Cremer T, Cremer C: Rise, fall and resurrection of chromosome territories: a historical perspective. Part I. The rise of chromosome territories. Eur J Histochem 2006, 50(3):161–176.
    OpenUrlPubMed
  2. 2.
    Cremer T, Cremer C: Rise, fall and resurrection of chromosome territories: a historical perspective. Part II. Fall and resurrection of chromosome territories during the 1950s to 1980s. Part III. Chromosome territories and the functional nuclear architecture: experiments and models from the 1990s to the present. Eur J Histochem 2006, 50(4):223–272.
    OpenUrlPubMed
  3. 3.↵
    Rouquette J, Cremer C, Cremer T, Fakan S: Functional nuclear architecture studied by microscopy: present and future. Int Rev Cell Mol Biol 2010, 282:1–90.
    OpenUrlCrossRefPubMed
  4. 4.↵
    Diament A, Pinter RY, Tuller T: Three-dimensional eukaryotic genomic organization is strongly correlated with codon usage expression and function. Nat Commun 2014, 5:5876.
    OpenUrlCrossRefPubMed
  5. 5.
    Shen W, Wang D, Ye B, Shi M, Ma L, Zhang Y, Zhao Z: GC3-biased gene domains in mammalian genomes. Bioinformatics 2015, 31(19):3081–3084.
    OpenUrlCrossRefPubMed
  6. 6.↵
    Cavalli G, Misteli T: Functional implications of genome topology. Nat Struct Mol Biol 2013, 20(3):290–299.
    OpenUrlCrossRefPubMed
  7. 7.↵
    Klar TA, Jakobs S, Dyba M, Egner A, Hell SW: Fluorescence microscopy with diffraction resolution barrier broken by stimulated emission. Proc Natl Acad Sci U S A 2000, 97(15):8206–8210.
    OpenUrlAbstract/FREE Full Text
  8. 8.
    Rust MJ, Bates M, Zhuang X: Sub-diffraction-limit imaging by stochastic optical reconstruction microscopy (STORM). Nat Methods 2006, 3(10):793–795.
    OpenUrlCrossRefPubMedWeb of Science
  9. 9.↵
    Boettiger AN, Bintu B, Moffitt JR, Wang S, Beliveau BJ, Fudenberg G, Imakaev M, Mirny LA, Wu CT, Zhuang X: Super-resolution imaging reveals distinct chromatin folding for different epigenetic states. Nature 2016, 529(7586):418–422.
    OpenUrlCrossRefPubMed
  10. 10.↵
    Dekker J, Rippe K, Dekker M, Kleckner N: Capturing chromosome conformation. Science 2002, 295(5558):1306–1311.
    OpenUrlAbstract/FREE Full Text
  11. 11.↵
    Zhao Z, Tavoosidana G, Sjolinder M, Gondor A, Mariano P, Wang S, Kanduri C, Lezcano M, Sandhu KS, Singh U et al: Circular chromosome conformation capture (4C) uncovers extensive networks of epigenetically regulated intra-and interchromosomal interactions. Nat Genet 2006, 38(11):1341–1347.
    OpenUrlCrossRefPubMedWeb of Science
  12. 12.↵
    Simonis M, Klous P, Splinter E, Moshkin Y, Willemsen R, de Wit E, van Steensel B, de Laat W: Nuclear organization of active and inactive chromatin domains uncovered by chromosome conformation capture-on-chip (4C). Nat Genet 2006, 38(11):1348–1354.
    OpenUrlCrossRefPubMedWeb of Science
  13. 13.↵
    Dostie J, Richmond TA, Arnaout RA, Selzer RR, Lee WL, Honan TA, Rubio ED, Krumm A, Lamb J, Nusbaum C et al: Chromosome Conformation Capture Carbon Copy (5C): a massively parallel solution for mapping interactions between genomic elements. Genome Res 2006, 16(10):1299–1309.
    OpenUrlAbstract/FREE Full Text
  14. 14.↵
    Fullwood MJ, Liu MH, Pan YF, Liu J, Xu H, Mohamed YB, Orlov YL, Velkov S, Ho A, Mei PH et al: An oestrogen-receptor-alpha-bound human chromatin interactome. Nature 2009, 462(7269):58–64.
    OpenUrlCrossRefPubMedWeb of Science
  15. 15.↵
    Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, Amit I, Lajoie BR, Sabo PJ, Dorschner MO et al: Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 2009, 326(5950):289–293.
    OpenUrlAbstract/FREE Full Text
  16. 16.↵
    Mifsud B, Tavares-Cadete F, Young AN, Sugar R, Schoenfelder S, Ferreira L, Wingett SW, Andrews S, Grey W, Ewels PA et al: Mapping long-range promoter contacts in human cells with high-resolution capture Hi-C. Nat Genet 2015, 47(6):598–606.
    OpenUrlCrossRefPubMed
  17. 17.↵
    Dekker J, Mirny L: The 3D Genome as Moderator of Chromosomal Communication. Cell 2016, 164(6):1110–1121.
    OpenUrlCrossRefPubMed
  18. 18.↵
    Denker A, de Laat W: The second decade of 3C technologies: detailed insights into nuclear organization. Genes Dev 2016, 30(12):1357–1382.
    OpenUrlAbstract/FREE Full Text
  19. 19.↵
    Rao SS, Huntley MH, Durand NC, Stamenova EK, Bochkov ID, Robinson JT, Sanborn AL, Machol I, Omer AD, Lander ES et al: A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 2014, 159(7):1665–1680.
    OpenUrlCrossRefPubMedWeb of Science
  20. 20.↵
    Darrow EM, Huntley MH, Dudchenko O, Stamenova EK, Durand NC, Sun Z, Huang SC, Sanborn AL, Machol I, Shamim M et al: Deletion of DXZ4 on the human inactive X chromosome alters higher-order genome architecture. Proc Natl Acad Sci U S A 2016, 113(31):E4504–4512.
    OpenUrlAbstract/FREE Full Text
  21. 21.↵
    Ay F, Noble WS: Analysis methods for studying the 3D architecture of the genome. Genome Biol 2015, 16:183.
    OpenUrlCrossRefPubMed
  22. 22.↵
    Nora EP, Lajoie BR, Schulz EG, Giorgetti L, Okamoto I, Servant N, Piolot T, van Berkum NL, Meisig J, Sedat J et al: Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 2012, 485(7398):381–385.
    OpenUrlCrossRefPubMedWeb of Science
  23. 23.↵
    Dixon JR, Selvaraj S, Yue F, Kim A, Li Y, Shen Y, Hu M, Liu JS, Ren B: Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 2012, 485(7398):376–380.
    OpenUrlCrossRefPubMedWeb of Science
  24. 24.↵
    Gibcus JH, Dekker J: The hierarchy of the 3D genome. Mol Cell 2013, 49(5):773–782.
    OpenUrlCrossRefPubMedWeb of Science
  25. 25.↵
    Harmston N, Ing-Simmons E, Perry M, Baresic A, Lenhard B: Genomiclnteractions: An R/Bioconductor package for manipulating and investigating chromatin interaction data. BMC Genomics 2015, 16(1):963.
    OpenUrl
  26. 26.↵
    Servant N, Lajoie BR, Nora EP, Giorgetti L, Chen CJ, Heard E, Dekker J, Barillot E: HiTC: exploration of high-throughput ‘C’ experiments. Bioinformatics 2012, 28(21):2843–2844.
    OpenUrlCrossRefPubMedWeb of Science
  27. 27.↵
    Naumova N, Imakaev M, Fudenberg G, Zhan Y, Lajoie BR, Mirny LA, Dekker J: Organization of the mitotic chromosome. Science 2013, 342(6161):948–953.
    OpenUrlAbstract/FREE Full Text
  28. 28.↵
    Sato T, Suyama M: ChromContact: A web tool for analyzing spatial contact of chromosomes from Hi-C data. BMC Genomics 2015, 16:1060.
    OpenUrl
  29. 29.↵
    Akdemir KC, Chin L: HiCPlotter integrates genomic data with interaction matrices. Genome Biol 2015, 16:198.
    OpenUrlCrossRefPubMed
  30. 30.↵
    Servant N, Varoquaux N, Lajoie BR, Viara E, Chen CJ, Vert JP, Heard E, Dekker J, Barillot E: HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol 2015, 16:259.
    OpenUrlCrossRefPubMed
  31. 31.↵
    Cournac A, Koszul R, Mozziconacci J: The 3D folding of metazoan genomes correlates with the association of similar repetitive elements. Nucleic Acids Res 2016, 44(1):245–255.
    OpenUrlCrossRefPubMed
  32. 32.↵
    Rosenbloom KR, Armstrong J, Barber GP, Casper J, Clawson H, Diekhans M, Dreszer TR, Fujita PA, Guruvadoo L, Haeussler M et al: The UCSC Genome Browser database: 2015 update. Nucleic Acids Res 2015, 43(Database issue):D670–681.
    OpenUrlCrossRefPubMed
  33. 33.↵
    Krzywinski M, Schein J, Birol I, Connors J, Gascoyne R, Horsman D, Jones SJ, Marra MA: Circos: an information aesthetic for comparative genomics. Genome Res 2009, 19(9):1639–1645.
    OpenUrlAbstract/FREE Full Text
  34. 34.↵
    Consortium EP: An integrated encyclopedia of DNA elements in the human genome. Nature 2012, 489(7414):57–74.
    OpenUrlCrossRefPubMedWeb of Science
  35. 35.↵
    mod EC, Roy S, Ernst J, Kharchenko PV, Kheradpour P, Negre N, Eaton ML, Landolin JM, Bristow CA, Ma L et al: Identification of functional elements and regulatory circuits by Drosophila modENCODE. Science 2010, 330(6012):1787–1797.
    OpenUrlAbstract/FREE Full Text
  36. 36.↵
    Welter D, MacArthur J, Morales J, Burdett T, Hall P, Junkins H, Klemm A, Flicek P, Manolio T, Hindorff L et al: The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res 2014, 42(Database issue):D1001–1006.
    OpenUrlCrossRefPubMedWeb of Science
  37. 37.↵
    Gu Z, Jin K, Crabbe MJ, Zhang Y, Liu X, Huang Y, Hua M, Nan P, Zhang Z, Zhong Y: Enrichment analysis of Alu elements with different spatial chromatin proximity in the human genome. Protein Cell 2016, 7(4):250–266.
    OpenUrlCrossRef
  38. 38.↵
    Gerstein MB, Kundaje A, Hariharan M, Landt SG, Yan KK, Cheng C, Mu XJ, Khurana E, Rozowsky J, Alexander R et al: Architecture of the human regulatory network derived from ENCODE data. Nature 2012, 489(7414):91–100.
    OpenUrlCrossRefPubMedWeb of Science
  39. 39.↵
    Kohany O, Gentles AJ, Hankus L, Jurka J: Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor. BMC Bioinformatics 2006, 7:474.
    OpenUrlCrossRefPubMed
  40. 40.↵
    Kent WJ: BLAT--the BLAST-like alignment tool. Genome Res 2002, 12(4):656–664.
    OpenUrlAbstract/FREE Full Text
  41. 41.↵
    Tang SJ: A Model of Repetitive-DNA-Organized Chromatin Network of Interphase Chromosomes. Genes (Basel) 2012, 3(1): 167–175.
    OpenUrl
  42. 42.↵
    Ong CT, Corces VG: CTCF: an architectural protein bridging genome topology and function. Nat Rev Genet 2014, 15(4):234–246.
    OpenUrlCrossRefPubMed
  43. 43.↵
    Ye BY, Shen WL, Wang D, Li P, Zhang Z, Shi ML, Zhang Y, Zhang FX, Zhao ZH: ZNF143 is involved in CTCF-mediated chromatin interactions by cooperation with cohesin and other partners. Mol Biol (Mosk) 2016, 50(3):496–503.
    OpenUrl
  44. 44.↵
    Tang Z, Luo OJ, Li X, Zheng M, Zhu JJ, Szalaj P, Trzaskoma P, Magalska A, Wlodarczyk J, Ruszczycki B et al: CTCF-Mediated Human 3D Genome Architecture Reveals Chromatin Topology for Transcription. Cell 2015, 163(7):1611–1627.
    OpenUrlCrossRefPubMed
  45. 45.↵
    Creyghton MP, Cheng AW, Welstead GG, Kooistra T, Carey BW, Steine EJ, Hanna J, Lodato MA, Frampton GM, Sharp PA et al: Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc Natl Acad Sci U S A 2010, 107(50):21931–21936.
    OpenUrlAbstract/FREE Full Text
  46. 46.↵
    Zhou VW, Goren A, Bernstein BE: Charting histone modifications and the functional organization of mammalian genomes. Nat Rev Genet 2011, 12(1):7–18.
    OpenUrlCrossRefPubMed
  47. 47.↵
    Mikkelsen TS, Ku M, Jaffe DB, Issac B, Lieberman E, Giannoukos G, Alvarez P, Brockman W, Kim TK, Koche RP et al: Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 2007, 448(7153):553–560.
    OpenUrlCrossRefPubMedWeb of Science
  48. 48.↵
    Hnisz D, Abraham BJ, Lee Tl, Lau A, Saint-Andre V, Sigova AA, Hoke HA, Young RA: Super-enhancers in the control of cell identity and disease. Cell 2013, 155(4):934–947.
    OpenUrlCrossRefPubMedWeb of Science
Back to top
PreviousNext
Posted October 26, 2016.
Download PDF

Supplementary Material

Email

Thank you for your interest in spreading the word about bioRxiv.

NOTE: Your email address is requested solely to identify you as the sender of this article.

Enter multiple addresses on separate lines or separate them with commas.
HBP: an integrative and flexible pipeline for the interaction analysis of Hi-C dataset
(Your Name) has forwarded a page to you from bioRxiv
(Your Name) thought you would like to see this page from the bioRxiv website.
CAPTCHA
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.
Share
HBP: an integrative and flexible pipeline for the interaction analysis of Hi-C dataset
Chao He, Ping Li, Minglei Shi, Yan Zhang, Bingyu Ye, Dejian Xie, Wenlong Shen, Zhihu Zhao
bioRxiv 083576; doi: https://doi.org/10.1101/083576
Reddit logo Twitter logo Facebook logo LinkedIn logo Mendeley logo
Citation Tools
HBP: an integrative and flexible pipeline for the interaction analysis of Hi-C dataset
Chao He, Ping Li, Minglei Shi, Yan Zhang, Bingyu Ye, Dejian Xie, Wenlong Shen, Zhihu Zhao
bioRxiv 083576; doi: https://doi.org/10.1101/083576

Citation Manager Formats

  • BibTeX
  • Bookends
  • EasyBib
  • EndNote (tagged)
  • EndNote 8 (xml)
  • Medlars
  • Mendeley
  • Papers
  • RefWorks Tagged
  • Ref Manager
  • RIS
  • Zotero
  • Tweet Widget
  • Facebook Like
  • Google Plus One

Subject Area

  • Bioinformatics
Subject Areas
All Articles
  • Animal Behavior and Cognition (4370)
  • Biochemistry (9559)
  • Bioengineering (7075)
  • Bioinformatics (24789)
  • Biophysics (12577)
  • Cancer Biology (9927)
  • Cell Biology (14304)
  • Clinical Trials (138)
  • Developmental Biology (7934)
  • Ecology (12084)
  • Epidemiology (2067)
  • Evolutionary Biology (15963)
  • Genetics (10906)
  • Genomics (14714)
  • Immunology (9849)
  • Microbiology (23595)
  • Molecular Biology (9461)
  • Neuroscience (50730)
  • Paleontology (369)
  • Pathology (1537)
  • Pharmacology and Toxicology (2675)
  • Physiology (4001)
  • Plant Biology (8646)
  • Scientific Communication and Education (1505)
  • Synthetic Biology (2388)
  • Systems Biology (6417)
  • Zoology (1345)