A comprehensive catalog of 3D genome organization in diverse human genomes facilitates understanding of the impact of structural variation on chromatin structure

Chong Li; Marc Jan Bonder; Sabriya Syed; Human Genome Structural Variation Consortium (HGSVC); HGSVC Functional Analysis Working Group; Michael C. Zody; Mark J.P. Chaisson; Michael E Talkowski; Tobias Marschall; Jan O Korbel; Evan E Eichler; Charles Lee; Xinghua Shi

doi:10.1101/2023.05.15.540856

Summary

The human genome is packaged into the three-dimensional (3D) nucleus and organized into functional units known as topologically associating domains (TADs) and chromatin loops. Recent studies show that the 3D genome can be modified by genome structural variants (SVs) through disrupting higher-order chromatin organizations such as TADs, which play an essential role in insulating genes from aberrant regulation by regulatory elements outside TADs. Here, we have developed an integrative Hi-C analysis pipeline to generate a comprehensive catalog of TADs, TAD boundaries, and loops in human genomes to fill the gap of limited resources. We identified 2,293 TADs and 6,810 sub-TADs missing in the previously released TADs of GM12878. We then quantified the impact of SVs overlapping with TAD boundaries and observed that two SVs could significantly alter chromatin architecture leading to abnormal expression and splicing of genes associated with human diseases.

Introduction

To better understand the genotype-to-phenotype relationship in the human genome, functional annotation of genomes is critical. As an intermediate step, it is vital to determine how the spatial arrangement of DNA impacts genome functionality and gene regulation through the characterization of the three-dimensional (3D) structure of chromatin inside the nucleus¹⁴. Chromosome conformation capture sequencing (Hi-C) is a genome-wide sequencing technique that combines proximity-based DNA ligation with high-throughput sequencing to measure the spatial proximity of potentially any pair of genomic loci¹. Techniques such as Hi-C have been widely used to characterize the 3D structure of the genome and uncover folding principles of chromatin, including topologically associating domains (TADs) and chromatin loops. TADs are stable genomic regions separated by insulating proteins, e.g., CCCTC-binding factor (CTCF), and provide an encapsulating domain for constraining the chromatin contacts between regulatory elements and genes^{2, 28, 29, 79}. TAD boundaries, which separate adjacent TADs, have been found to be well conserved across cell types and more evolutionarily constrained than TADs themselves⁴⁰. At kilobase to megabase scale, Hi-C data have been utilized to pinpoint specific interactions between distant chromatin regions, known as chromatin loops, which are detected as regions of enriched contact frequency relative to their surroundings⁵². Chromatin loops typically connect promoters and enhancers and correlate with the activation of genes. Chromatin loops have been shown to be conserved across different cell types and species⁵, and the rewiring of such loops contributes to developmental disorders and tumorigenesis^{13, 53}.

Recently, there has been a growing interest in uncovering the disruption of gene regulation resulting from pathogenic large genomic rearrangements or structural variants (SVs), including deletions, insertions, inversions, and duplications. These genomic rearrangements can disturb the normal 3D structure of the genome and lead to aberrant interactions between chromatin-regulatory elements⁴⁴, with such alterations bringing about the abnormal expression of oncogenic and disease-causing genes⁴². Studies have shown that disruptions to TADs can change the long-range regulatory architecture, result in alterations of gene expression levels, and lead to diseases³². In particular, SVs can affect CTCF-associated border elements, alter gene expression and enhancer-promoter interactions, and eventually result in human diseases^{32, 34}. Another study found that SVs can cause distinct TADs to fuse, and complex rearrangements significantly alter the chromatin folding maps in cancer genomes⁴⁴. Additionally, integrating TAD information with gene expression data and somatic copy-number alterations (SCNAs) can help to identify cancer-related gene overexpression resulting from the reorganization of cis-regulatory elements, such as enhancer hijacking⁴³. A recent study identified SV-induced enhancer-hijacking and silencer-hijacking events and reported the role of repressive loops on their target genes in patients with acute myeloid leukemia⁵¹.

Hence, there is a great need to characterize the 3D genome organization of human genomes to facilitate the understanding of the 3D structure landscape of human genomes and the impact of genetic variants on chromatin structures⁶⁶. For this purpose, Hi-C has proven to be a powerful tool to chart the landscape of chromatin structures, including TADs and chromatin loops at the genome level². To date, the most comprehensive characterization of 3D genome organization from Hi-C sequencing comes from GM12878, with the densest Hi-C data so far⁵, and individuals from the 4D Nucleome (4DN) project²⁴. Despite these efforts, the utility of Hi-C data is typically limited by relatively low resolution, insufficient read coverage, and sparse contact matrices, which result in a lack of detailed characterization of chromatin structures in human genomes. Given the challenges in producing high-coverage Hi-C data due to sequencing costs and library preparation, various computational methods have been proposed to augment or enhance the quality of Hi-C data. For example, HiCPlus⁴⁵ applies a straightforward neural network optimized using mean squared error, and an improved tool named HiCNN⁴⁶ was proposed to improve Hi-C data quality. Generative adversarial networks (GANs) were then introduced in hicGAN⁴⁷, which produces high-resolution contact maps from low-resolution versions of such contact maps from Hi-C data. Another GAN-based method, DeepHiC⁴⁸, was proposed with a user-friendly web interface. In HiCSR⁴⁹, a novel loss function is tailored to optimize an adversarial loss in a GAN to improve GAN-based Hi-C data augmentation. Another method, VEHiCLE⁵⁰, was presented to enhance Hi-C contact maps utilizing variational autoencoder generative models incorporating multiple loss functions.

These observations and studies highlight the importance of providing a comprehensive map of 3D chromatin structure to represent the spatial genomic organization in diverse human genomes in healthy populations. This comprehensive map in humans will facilitate the investigation of genomic variation and the disrupted 3D genome structures in health and disease. Therefore, in this study, we collected Hi-C data on the 1000 Genomes individuals⁶ from various projects to construct a comprehensive resource of TADs and loops in lymphoblastoid cell lines (LCLs) in 44 different individuals from five super populations including Africa (AFR), East Asia (EAS), Europe (EUR), South Asia (SAS), and the Americas (AMR) (Table S12). This Hi-C data set includes Hi-C sequencing of 27 individuals from HGSVC2⁴, six individuals for three trios from HGSVC1²², ten individuals from 4DN project^{22, 24}, and GM12878⁵. We derived the TAD map of these samples by integrating two TAD/TAD boundary-calling methods, Arrowhead⁵ and Insulation Score⁸ (IS), to take advantage of their respective strengths. Arrowhead is able to call TADs that exhibit TAD-associated biological features and can output hierarchical TADs⁵⁸. Meanwhile, TADs detected by IS are demonstrated to be robust against low coverage levels and capable of detecting dynamics of TAD boundary strength under different conditions⁵⁹. We also generated a comprehensive catalog of chromatin loops in these individuals by combining loops produced from two different loop callers (HiCCUPs⁵ and cooltools¹²).

To demonstrate the utility of this map of 3D chromatin structure, we utilized the up-to-date genetic variation, particularly SVs of all major types, on the 64 haplotypes from 32 diverse human genomes produced by the Human Genome Structural Variation Consortium (HGSVC)⁴. Equipped with the comprehensive catalog of TADs and TAD boundaries resulting from this study, we assessed the impact of SVs on chromatin organization in a variety of human genomes that come from five super populations in the 1000 Genomes samples⁶. First, we identified 11,653 SVs that intersect with TAD boundaries with the possibility of disrupting local chromatin structures. Among these, two SVs, including one deletion (chr8-644401-DEL-5014) and one insertion (chr6-130293639-INS-295), were found to significantly alter TAD boundaries as well as gene expression and splicing, overlapping with eQTL and sQTL signals previously reported on these samples⁴. In addition to TADs, we conducted individual loop calling and generated a comprehensive catalog of loops on the same samples. Our analysis illustrates the utility of this up-to-date 3D genome structure in diverse humans as a resource to help elucidate the links between genetic variation and 3D genome structure, as well as their disruptive impact on gene regulation.

Results

1. A comprehensive TAD catalog identified across diverse individuals in human genomes

We first sought to construct a 3D chromatin structure map of human genomes in LCLs by combining Hi-C data from 44 individuals and comparing the 3D chromatin structure features across these diverse individuals (Table 1). Hi-C data collected on these 44 individuals were pre-processed using the same pipeline to generate the final map, including characterization of TADs, TAD boundaries, and chromatin loops. By an integrative analysis of the Hi-C data from these 44 individuals, we identified 18,972 TADs in our Integrative Catalog using a customized pipeline that combines two TAD calling algorithms, namely Arrowhead and Insulation Score (IS), for calling TADs and quantifying TAD boundaries (Table S2-S5). We compared the results between our Integrative Catalog and the previously published largest map of GM12878 human LCLs (Table S1, Figure S3, Figure S4)⁵. We found that our Integrative Catalog covers all of the TADs of GM12878 released by ENCODE⁶⁰ (Fig. 1a). Specifically, our Integrative Catalog identified 2,293 novel TADs and 6,810 smaller sub-TADs^{5, 27, 76} within 3,605 meta-TADs of GM12878 (ENCODE), that were missing in the GM12878 (ENCODE) (Fig. 1b). Note that, for any of the data and results on GM12878 included in this study, we did not use the original results published from Rao et al.⁵, but used the results re-processed by ENCODE⁶⁰ that mapped to hg38 as the reference genome (Methods). Explicitly, we considered two TADs shared when their regions are over 50% reciprocally overlapped. Thus, around 95.34% and 74.30% of the TADs released by ENCODE for GM12878 coincided with the TADs separately detected by Arrowhead and IS in our study, respectively. Approximately 66.43% of the TADs called by Arrowhead can also be detected by IS in our dataset (Figure S2). 
We observed that the relatively low overlap between Arrowhead and IS arises because Arrowhead tends to call larger TADs than IS: a pair cannot satisfy the 50% reciprocal-overlap criterion once the TAD called by IS covers less than half of the overlapping TAD called by Arrowhead (Figure S1). Under the stricter reciprocal-overlap comparison, over 97.5% of the TADs released by ENCODE for GM12878 overlap reciprocally by more than 50% with our Integrative Catalog call set (Figure S1). Since no TAD boundary locations were released for GM12878, we focused only on comparing the sizes and numbers of the TADs in this study (Table S1).
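The 50% reciprocal-overlap criterion can be illustrated with a minimal sketch. The function names and the coordinates below are hypothetical and are not taken from our pipeline; intervals are assumed to be half-open (start, end) pairs in base pairs on the same chromosome.

```python
def reciprocal_overlap(a, b):
    """Return the smaller of the two overlap fractions of intervals a and b.

    Each interval is a (start, end) pair in base pairs on the same chromosome.
    """
    overlap = min(a[1], b[1]) - max(a[0], b[0])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[1] - a[0]), overlap / (b[1] - b[0]))


def is_shared(a, b, threshold=0.5):
    """Two TADs count as shared when the reciprocal overlap exceeds the threshold."""
    return reciprocal_overlap(a, b) > threshold


# Hypothetical coordinates: a large TAD, a small TAD nested in it,
# and a TAD of similar size with slightly shifted boundaries.
large_tad = (1_000_000, 2_200_000)
small_tad = (1_000_000, 1_500_000)
similar_tad = (1_050_000, 2_150_000)

nested_shared = is_shared(large_tad, small_tad)
similar_shared = is_shared(large_tad, similar_tad)
print(nested_shared, similar_shared)
```

In this toy case the nested TAD covers less than half of the large TAD, so the pair is not counted as shared even though the small TAD lies entirely inside the large one.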

Fig. 1. The comparison of TADs between GM12878 released by ENCODE and the Integrative TAD Catalog released in our study.

a). The overlap of TADs between GM12878 released by ENCODE and the Integrative TAD Catalog generated using our customized pipeline in this study (one bp overlapped). Note that the 10,044 TADs of GM12878 released by ENCODE are at least one bp overlapped with 16,679 TADs in our Integrative Catalog (containing sub-TADs). Therefore, only 2,293 remaining TADs that were uniquely released in our Integrative Catalog are shown in the Venn diagram. b). Visualization of one region containing TADs that were identified in our Integrative Catalog but were missing in the GM12878 released by ENCODE. From top to bottom, these plots show the Hi-C contact maps, the insulation scores, and the corresponding boundaries with the boundary scores over this region. Green regions represent TADs identified by both GM12878 (ENCODE) and our Integrative Catalog, while yellow regions highlight TADs that were identified by our pipeline, not in GM12878 (ENCODE).

Table 1. The Integrative Catalog in human 3D structure generated from this study aggregates Hi-C data from multiple projects.

The Hi-C data from 44 individuals collected in this study come from HGSVC2⁴, HGSVC1²², 4DN project^{22, 24}, and GM12878⁵ with the highest resolution. Three samples (HG00514, HG00733, NA19240) from HGSVC1 are re-sequenced in HGSVC2 with a higher resolution; thus, we kept those samples from HGSVC2 in our further analysis. The resolutions were calculated for each sample, and the TADs and loops were called on the 44 merged samples and 27 merged samples in HGSVC2, respectively.

2. Identification of TAD boundaries for individuals in the HGSVC2 cohort

Next, we characterized and quantified TAD boundaries to explore how genetic variations like SVs disrupt TAD boundaries and bring about potential downstream changes in gene regulation and phenotypes (Table S6). Twenty-seven of the 44 individuals have been recently characterized by HGSVC2 with haplotypes containing genetic variation, including SVs⁴, which allowed us to detect the effect of SVs on 3D human genome chromatin conformation. We examined the map resolutions (compared with GM12878 in the study by Rao et al.), read numbers, and contact numbers (Fig. 2) for each of the 27 samples, which motivated the merging strategy used in our Integrative Catalog TAD-calling pipeline. In the end, we identified 14,612 boundaries (lenient set) and 5,884 strong boundaries (stringent set) in our Integrative Catalog using different boundary score cutoffs obtained from IS (Methods, Table S4, S5). For the 27 subjects in the HGSVC2 cohort, we identified on average 12,410 boundaries (lenient set) and 5,299 strong boundaries (stringent set) (Fig. 3a, 3b). The 27 HGSVC2 individuals come from five different super populations: Africa (AFR), East Asia (EAS), Europe (EUR), South Asia (SAS), and the Americas (AMR). We grouped the individuals by super population and compared the numbers of TAD boundaries per individual genome (Fig. 3c, 3d). We did not observe any significant relationship between super population and insulation scores or boundary scores, which indicates that TAD features appear to be conserved across diverse human populations^{16, 27}.

Fig. 2. Map resolutions, read numbers, and contact numbers of each sample.

a). The map resolutions of our 27 samples compared with GM12878 in the last column. The x-axis represents the sample ID, ordered from left to right by resolution from low to high; the y-axis shows the resolution in base pairs (bp). b). The number of read pairs of each sample after filtering on alignment quality with a mapping quality score (MAPQ) ≥ 30. c). The number of Hi-C contacts of each sample after filtering on alignment quality with MAPQ ≥ 30.

Fig. 3. Distribution of TAD boundaries in 27 samples of HGSVC2.

The x-axis represents sample IDs, with resolution ordered low to high from left to right in a and b, and with the super population ordered and represented by a different color as displayed in the color key in c and d. The y-axis shows the number of all TAD boundaries (lenient set in a and c), and that from the stringent set of TAD boundaries (stringent set, in b and d) detected with our pipeline in the 27 samples. All of these TAD boundaries were called with 10 kilobases (kb) resolution for each individual using IS.

3. SVs’ impact on 3D chromatin structure, gene expression, and splicing levels

We hypothesized that SVs in the human population disrupting TAD boundaries would have a functional impact on gene regulation³⁹. To investigate the effects of SVs disrupting TAD boundary on the landscape of chromatin interaction, we intersected the identified TAD boundaries on the HGSVC2 samples with the SVs that are expression quantitative trait loci (SV-eQTLs) and splicing quantitative trait loci (SV-sQTLs) characterized in Ebert’s study⁴.

In total, we found 4,047 deletions and 5,512 insertions that overlapped with the TAD boundaries identified in our Integrative Catalog. We used the boundary strength value (boundary score, BS) of each individual as a measure of disrupted chromatin structure and compared BS between individuals with the homozygous reference genotype (0/0) for an SV and individuals with genotypes containing at least one alternative allele (1/1, 0/1, and 1/0). BS is defined as the difference in the delta vector (the difference between the amount of insulation change 100 kb to the left and right of the central bin) between the nearest 5’ local maximum and 3’ local minimum relative to the boundary, and it can be used to filter candidate boundaries⁸. A higher BS indicates a stronger boundary, while a lower BS indicates a weaker one. Significant changes in BS were observed for two deletions and four insertions (FDR < 0.2, Wilcoxon rank-sum test, two-sided) (Table S8). These findings suggest that removing or inserting particular TAD border sequences could alter the insulation between nearby TADs, thereby changing the frequency of interactions between sequences in typically isolated domains.
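The BS definition above can be read as the following minimal sketch. It is one possible interpretation written for illustration, not the code of the IS tool: the function names, the toy insulation profile, and the 5-bin window are assumptions (at 10 kb bins, a 100 kb window would correspond to 10 bins), and the delta value of a bin is taken to be the mean insulation in the window to its left minus the mean insulation in the window to its right.

```python
from statistics import mean


def delta_vector(insulation, window):
    """delta[i] = mean insulation in the `window` bins left of bin i
    minus the mean insulation in the `window` bins right of bin i."""
    delta = [0.0] * len(insulation)
    for i in range(len(insulation)):
        left = insulation[max(0, i - window):i]
        right = insulation[i + 1:i + 1 + window]
        if left and right:
            delta[i] = mean(left) - mean(right)
    return delta


def boundary_strength(delta, i):
    """Delta at the nearest local maximum 5' of bin i minus
    delta at the nearest local minimum 3' of bin i."""
    left = i
    while left > 0 and delta[left - 1] >= delta[left]:
        left -= 1  # climb towards the 5' local maximum
    right = i
    while right < len(delta) - 1 and delta[right + 1] <= delta[right]:
        right += 1  # descend towards the 3' local minimum
    return delta[left] - delta[right]


# Toy V-shaped insulation profile (hypothetical values) with its minimum at bin 20.
window = 5
insulation = [abs(i - 20) / 20 for i in range(41)]
delta = delta_vector(insulation, window)
boundary_bin = min(range(len(insulation)), key=lambda i: insulation[i])
strength = boundary_strength(delta, boundary_bin)
print(boundary_bin, round(strength, 3))
```

Under this reading, a deeper or sharper dip in insulation yields a larger swing of the delta vector around the boundary and therefore a higher BS, while a flat profile yields a BS near zero.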

We next examined whether the SVs found above to disrupt TAD boundaries have a potential effect on gene regulation, using gene expression and splicing quantification from RNA sequencing (RNA-seq) of HGSVC2 samples. In the 26 HGSVC2 samples (the GM12329 sample was excluded due to a relatively low sequencing quality), we intersected the TAD boundaries with overlapped SVs that are also SV-eQTLs and SV-sQTLs (113 deletions and 120 insertions) and identified ten genes and 37 genes whose expression values were significantly changed for ten deletions and 32 insertions, respectively (FDR < 0.2, Wilcoxon rank-sum test, two-sided) (Table S9 and S10). We also observed 173 splice junctions and 169 splice junctions whose splicing ratios were significantly changed for 71 deletions and 83 insertions, respectively (FDR < 0.205, Wilcoxon rank-sum test, two-sided) (Table S9 and S11). These results reveal that the deletion and insertion of TAD boundaries can result in significant changes in gene expression and splicing in human genomes.
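The genotype grouping and the FDR filtering used in these comparisons can be sketched as follows. The text does not name the FDR procedure, so the Benjamini-Hochberg adjustment shown here is an assumption; the function names and the p-values are hypothetical, and the rank-sum test itself is left out.

```python
def carrier_group(genotype):
    """Map a genotype string to 0 (homozygous reference, 0/0)
    or 1 (at least one alternative allele: 0/1, 1/0, or 1/1)."""
    alleles = genotype.replace("|", "/").split("/")
    return 0 if all(allele == "0" for allele in alleles) else 1


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


groups = [carrier_group(g) for g in ["0/0", "0/1", "1/0", "1/1", "0|0"]]

# Hypothetical rank-sum p-values for four SV-gene pairs.
pvalues = [0.01, 0.04, 0.03, 0.5]
fdr = benjamini_hochberg(pvalues)
significant = [i for i, q in enumerate(fdr) if q < 0.2]
print(groups, significant)
```

Collapsing all carrier genotypes into one group matches the two categories shown in the boxplots of Fig. 4, where category 0 is genotype 0/0 and category 1 is any genotype with an alternative allele.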

By integrating these findings, we found that one deletion (chr8-644401-DEL-5014) and one insertion (chr6-130293639-INS-295) within our 26-sample HGSVC2 call set disrupted regulatory sequences, influencing local chromatin architecture while also altering genes at both the expression and splicing levels. We observed that deletions tend to remove the TAD boundaries and fuse the adjacent TADs³³, while insertion of sequence can split a single TAD into two adjacent TADs (neo-TAD)⁵⁷ (Fig. 4).

Fig. 4. Visualization of two SVs that disrupt TAD boundaries with significant changes in boundary strength, gene expression, and splicing levels

a). A deletion (chr8-644401-DEL-5014) that disrupts the TAD boundary shows differences in Hi-C contact maps, boundary scores, and insulation scores for individuals with (genotype 1/1) and without (genotype 0/0) the deletion. The dark red rectangle represents the location of this deletion, and the blue arrow shows where the ERICH1 gene is. The green rectangle highlights the TAD boundary’s location and corresponding boundary strength. The left figure is the GM19036 sample, whose genotype is 0/0, i.e., does not have this deletion, compared with the right sample GM19650, whose genotype is 1/1, i.e., carries this deletion. The boundary score panel shows that the right sample lacks the TAD boundary where it carries that genomic deletion. b). Boxplots demonstrate the significant differences between two genotype categories, 0 (genotype 0/0) and 1 (genotypes 0/1, 1/0, or 1/1). The boxplot on the left shows the significant difference in boundary score between the genotype categories of this deletion, while the right boxplot shows the significant difference in the associated ERICH1 gene expression values between the genotype categories of this deletion. c). Boxplots show the significant difference in splice ratios (after log transformation and quantile normalization) of the splice junction clusters between genotypes of the same deletion. Specifically, we found two splicing quantitative trait loci (sQTLs) (8:640834:668598:clu_8652_- and 8:664676:668598:clu_8652_-) overlapped with this deletion, both of which are associated with the ERICH1 gene. Subfigures d to f show the same comparisons for an example of insertion (chr6-130293639-INS-295) between the individual HG03371 (genotype 0/0) and GM18534 (genotype 0/1).

4. Catalog of chromatin loops across diverse individuals in human genomes

Besides TADs, we derived a comprehensive list of chromatin loops with the Hi-C datasets collected in this study⁵ (Table S6 and Table S14-S19). We constructed an integrated list of chromatin loops combining loops generated by two methods based on different overlap criteria. Chromatin loops of the 27 HGSVC2 merged set were predicted by both HiCCUPS GPU using the SCALE normalized matrix⁵ and cooltools call-dots¹², which uses an iterative correction algorithm for matrix balancing (Imakaev et al., 2012, Nature Methods 9, 999–1003). Loop calling was performed at both 5 kb and 10 kb resolution. Our results for the merged loop set from these two loop-calling methods show very high consistency for the HGSVC2 samples (Fig. 5a and 5b): about 74% of loops overlap exactly (17,623 loops; 74.0/75.6% HiCCUPS/call-dots) (Table S14, S15, and S16), and more than 80% of loops overlap after adding a 5 kb flanking length to the start and end sites of the detected anchors (19,102 loops; 82.0/80.2% HiCCUPS/call-dots) (Fig. 5). The addition of 4DN and HGSVC1 dilution Hi-C experiments to the HGSVC2 merged loop list led to a drop in the HiCCUPS and call-dots loop overlap (21,838 loops; 36.1/25.4% HiCCUPS/call-dots). HiCCUPS identified 60,494 merged loops, while cooltools call-dots identified 86,024 loops (Table S17 and S18). Overall, our integrated list of chromatin loops combining two loop calling methods amounts to 21,838 loops (Table S19). By adding a 5 kb flanking length to the start and end sites of the detected anchors, 25,994 overlapped loops are found between HiCCUPS and call-dots (25,994 loops; 43.0/30.2% HiCCUPS/call-dots). The average distance between loop anchors was calculated for HiCCUPS and call-dots loops, with call-dots showing a slightly higher average distance between anchors and a wider distribution of loop lengths in bp (Figure S5). Integration of the 44 Hi-C datasets has allowed us to generate a comprehensive chromatin loop catalog, which we will use in the future to study how genetic variation impacts specific chromatin contacts.
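The two overlap criteria can be sketched as below. This is an illustration under stated assumptions rather than the exact matching code used in this study: a loop is represented as a chromosome plus two anchor intervals, an exact match means identical coordinates, and a flanked match means that both anchors overlap after each is extended by the flanking length. All names and coordinates are hypothetical.

```python
def anchors_overlap(a, b, flank):
    """True when anchors a and b, given as (start, end), overlap
    after each is extended by `flank` bp on both sides."""
    return a[0] - flank < b[1] + flank and b[0] - flank < a[1] + flank


def loops_match(loop_a, loop_b, flank):
    """Loops are (chrom, anchor1, anchor2); both anchors must overlap."""
    chrom_a, a1, a2 = loop_a
    chrom_b, b1, b2 = loop_b
    return (chrom_a == chrom_b
            and anchors_overlap(a1, b1, flank)
            and anchors_overlap(a2, b2, flank))


def shared_loops(loops_a, loops_b, flank=None):
    """Loops of loops_a that are found in loops_b, exactly (flank=None)
    or after flanking the anchors."""
    if flank is None:
        lookup = set(loops_b)
        return [loop for loop in loops_a if loop in lookup]
    return [la for la in loops_a
            if any(loops_match(la, lb, flank) for lb in loops_b)]


# Hypothetical 5 kb loop calls from two callers.
caller_one = [("chr1", (100_000, 105_000), (300_000, 305_000)),
              ("chr1", (500_000, 505_000), (900_000, 905_000))]
caller_two = [("chr1", (100_000, 105_000), (300_000, 305_000)),
              ("chr1", (505_000, 510_000), (900_000, 905_000))]

exact = shared_loops(caller_one, caller_two)
flanked = shared_loops(caller_one, caller_two, flank=5_000)
print(len(exact), len(flanked))
```

Dividing the number of shared loops by the size of each call set gives the per-method percentages of the kind reported above. In the toy data, the second loop is shifted by one bin in one caller, so it is recovered only under the flanked criterion.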

Fig. 5. Comparison of loops detected by HiCCUPS and cooltools from the HGSVC2 merged Hi-C sample under 5 kb and 10 kb merged resolutions.

The agreement between HiCCUPS and cooltools loops is shown as Venn diagrams for our 27 merged Hi-C contact maps. The overlap between the two loop lists is colored in light blue, and the percentages of overlap with respect to each method are shown separately. a). Loops with exactly the same coordinates detected by both methods. b). Loops that overlap after adding a 5 kb flanking region to the start and end sites of the anchors detected by both methods.

Discussion

The 3D genome has revolutionized the field of genomics by providing a comprehensive view of the spatial organization of the genome. The availability of a comprehensive 3D genome map provides a valuable resource to assist with the understanding of human genome structure and function. Our study thus provides a comprehensive 3D genome catalog in diverse human genomes that serves as a valuable resource to elucidate chromatin organization in human genomes, by tackling current limitations of low resolution and lack of diversity in providing a map of 3D genome structure in healthy human individuals. To the best of our knowledge, our results provide the most comprehensive catalog of TADs and TAD boundaries in human LCLs (Table S2 to S6), which were produced by a customized Hi-C analysis pipeline that integrates Hi-C data generated from different platforms, with various sequencing resolutions aggregated from multiple data resources. We identified several significant examples of SVs that disrupt the genome 3D structure via genomic deletions or insertions in TAD boundaries and lead to gene regulation alterations; such alterations have also been reported to contribute to pathological phenotypes such as developmental abnormalities, neurodegenerative disorders, and cancer^{3, 13, 30–32, 77, 78}.

Previous research has revealed that alterations in the genome’s sequence can result in chromatin structural modifications, which might regulate gene expression and lead to various pathogenic phenotypes, such as aging-related disease, Alzheimer’s disease, cancers, and developmental disorders^{3, 13, 35–38}. In recent studies, cancer development and progression have been shown to be influenced by allele-specific expression (ASE)^{18, 19}. Thus, we sought to investigate whether those two SVs and the associated eGenes and sGenes could contribute to the risk of complex diseases (Table S7). We found that two of the genes (ERICH1, TMEM200A) were previously reported to be associated with various cancers (e.g., pancreatic cancer and gastric cancer) and tumor progression^{15, 61–64}. Among them, ERICH1 was reported to be related to the genomic imprinting regulatory mechanism (which is frequently linked to ASE^{20, 21}) and differential allelic expression^{17, 65}. Interestingly, one recent study reported that the expression of TMEM200A was notably high in gastric cancer, and overexpression of TMEM200A would shorten the overall survival of gastric cancer patients⁶³. They concluded that upregulated TMEM200A would serve as a promising prognostic marker for gastric cancer and is closely associated with the tumor microenvironment.

However, our current analysis leaves room for improvement. As more Hi-C data become available (e.g., HGSVC3 Hi-C sequencing data), these data should be added to our current Integrative Catalog, and our pipeline is readily available to support such analysis. Another limitation in our analysis is that we only investigated SVs in the current HGSVC2 eQTL/sQTL results (deletions and insertions)⁴. We will extend our investigation of the impact of other types of SVs (e.g., duplications, inversions) on the chromatin structure and gene regulation in the future. In addition, the Telomere-to-Telomere (T2T) Consortium has released the first complete and gapless sequence of the human genome, named T2T-CHM13⁵⁵, which will complement the standard human reference genome that was used in our study, known as Genome Reference Consortium build 38 (GRCh38). With this new reference genome available, we will be able to map our Hi-C reads to T2T-CHM13 and potentially to personal assemblies with no gaps and unprecedented accuracy.

We envision that the comprehensive catalog of TADs, TAD boundaries, and chromatin loops produced from this study will provide a reference map of known 3D genome structures of human genomes. Our findings have critical implications for understanding human whole-genome sequencing data in disease diagnosis and precision medicine. Genomic variation, including all types of SVs that overlap or intersect with TAD boundaries in human genomes, should be further investigated to better understand the molecular basis of human genetic diseases²⁶.

In summary, we generated a high-resolution 3D genome profile of TADs, TAD boundaries, and chromatin loops of diverse human genomes. This comprehensive catalog allows us to associate SVs with 3D genome structure to reveal the importance of TAD boundary sequences for genome function and regulatory mechanisms. Our study provides insight into the significance of integrative analysis of SVs and TADs, in that deletions and insertions overlapping with and proximal to TAD boundaries should be carefully examined in disease studies. We believe that the released Integrative Catalog of TADs and TAD boundaries in the human genome will assist the investigation of genomic variation, gene regulation, 3D genome structure, and their impact on and association with human diseases, with promises to provide insights into disease etiology and therapeutic discovery⁴¹.

STAR Methods

Resource availability

Lead contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Xinghua Shi (mindyshi{at}temple.edu).

Materials availability

This study did not generate new unique reagents.

Method details

Hi-C data collection

We have collected Hi-C data on 44 samples from HGSVC2, HGSVC1, 4DN, and GM12878 (Table S12). The first dataset (HGSVC2) is the Hi-C data for the 27 HGSVC2 samples generated by the Human Genome Structural Variation Consortium. Hi-C libraries were generated with 1.5 M human cells as input using Proximo Hi-C kits v4.0 (Phase Genomics, Seattle, WA) according to the manufacturer’s protocol with the following change: cells were crosslinked, quenched, and lysed sequentially with Lysis Buffers 1 and 2, and the liberated chromatin was immobilized on magnetic recovery beads. During the fragmentation process, a cocktail of 4-cutter restriction enzymes (DpnII (GATC), DdeI (CTNAG), HinfI (GANTC), and MseI (TTAA)) was utilized to enhance coverage and facilitate haplotype phasing. Fragmented chromatin was proximity ligated for 4 hours at 25°C after fragmentation and fill-in with biotinylated nucleotides. The crosslinks were then reversed. Magnetic streptavidin beads were used to purify DNA and retrieve biotinylated junctions. The dual-unique indexed library was created using bead-bound proximity ligated fragments and Illumina sequencing chemistry. Fluorescence-based assays, such as qPCR with the Universal KAPA Library Quantification Kit and Tapestation (Agilent), were used to assess the Hi-C libraries. The libraries were sequenced at New York Genome Center (NYGC) in a paired-end 150 bp format on an Illumina Novaseq 6000.

The second (HGSVC1) and third (4DN) pilot datasets are Hi-C sequencing data generated for the 1000 Genomes Project SV group by the laboratory of Bing Ren using Illumina HiSeq (2000, 2500 or 3000) paired-end sequencing^{22, 24, 54} with a “dilution” 6-cutter HindIII (AAGCTT) protocol. The second pilot dataset came from Hi-C sequencing of three trios: Yoruba (NA19238, NA19239, NA19240), Han Chinese (HG00512, HG00513, HG00514), and Puerto Rican (HG00731, HG00732, HG00733). HG00514, HG00733, and NA19240 were excluded from this dataset since these samples were re-sequenced in the HGSVC2 study. For the same reason, GM19238 from the third (4DN) pilot dataset was also excluded.

We also included the Hi-C data from GM12878 B-lymphoblastoid cells in our analysis from Rao et al., which has almost 5 billion mapped paired-end read pairs and is considered the largest coverage for the accurate identification of 3D chromatin structures. The sequence data were produced as in Rao et al. (2014)⁵ and can be downloaded from ENCODE.

Hi-C Data Processing

We required a minimum alignment quality for each read included in our Hi-C maps. We used the mapping quality score (MAPQ), which quantifies the probability that a read is misplaced, to filter out read pairs in which one or both reads fail to meet a threshold, and we considered two thresholds: MAPQ > 0 and MAPQ ≥ 30 (Fig. 2b)⁵. In our study, we used the MAPQ ≥ 30 filtered data to be stringent about avoiding false positives caused by poor alignments. The result of filtering low-quality alignments is a list of Hi-C contacts (Fig. 2c).
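As a minimal sketch of the pair-level filter described above (it is not the Juicer implementation), the following keeps a read pair only when both mates reach the threshold. The `ReadPair` record, its field names, and the toy values are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class ReadPair(NamedTuple):
    """Hypothetical record for one Hi-C read pair (illustrative field names)."""
    name: str
    mapq1: int  # mapping quality of the first mate
    mapq2: int  # mapping quality of the second mate


def filter_pairs(pairs: Iterable[ReadPair], min_mapq: int = 30) -> List[ReadPair]:
    """Keep a pair only if BOTH mates have MAPQ >= min_mapq."""
    return [p for p in pairs if min(p.mapq1, p.mapq2) >= min_mapq]


# Toy input: only the MAPQ values matter here.
pairs = [
    ReadPair("a", 60, 60),
    ReadPair("b", 60, 12),
    ReadPair("c", 0, 40),
    ReadPair("d", 30, 31),
]

kept_strict = filter_pairs(pairs, min_mapq=30)  # the MAPQ >= 30 setting
kept_loose = filter_pairs(pairs, min_mapq=1)    # equivalent to MAPQ > 0
```

Taking the minimum of the two mate qualities expresses the "one or both reads" rule in a single comparison.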

For each sample, the raw reads were mapped to the GRCh38 reference genome and processed using Juicer (version 1.6) with its default aligner, BWA-MEM^{7, 73}. Unmapped reads, abnormal split reads, and duplicate reads were removed, and read pairs were filtered out if their mapping quality (MAPQ) was less than 30. The remaining read pairs (Hi-C contacts) were subsequently used by Juicer to construct chromatin contact maps for each sample. To build a Hi-C contact map for the Integrative Catalog of LCLs, contacts were pooled across all 44 individuals using the mega.sh script provided with Juicer. These steps produce two sets of .hic files, a highly compressed binary format that stores contact matrices at various resolutions (i.e., bin sizes) and is supported mainly by the Juicer, JuicerTools, and Juicebox command-line tools for downstream analysis and visualization⁹. All analyses and results reported in our study use contact frequency matrices normalized with SCALE matrix balancing⁷, which remedies the issue that KR normalization sometimes does not provide coverage for a particular region or chromosome²³.

Calculation of Hi-C map resolution

To decide the resolution at which TADs should be called for each sample, we first applied the script from Juicer to calculate the map resolution of each sample⁷. The map resolution is intended to reflect the finest scale at which local features can be reliably detected. The coarsest map resolution is 17,800 bp (GM12329), while the finest is 4,650 bp (GM19650). The .hic file format created by Juicer stores the input reads at nine bin sizes (base-pair-delimited resolutions: 2500000, 1000000, 500000, 250000, 100000, 50000, 25000, 10000, 5000), and depending on the sequencing depth of the Hi-C file, Arrowhead is usually run at 5 kb or 10 kb resolution⁷. Considering that GM12878, which has the largest contact map and the deepest sequencing to date (map resolution of 950 bp), was processed with Arrowhead at 5 kb resolution⁵, we chose the same 5 kb as the practical resolution for TAD calling on the merged map of all 44 samples (map resolution of 200 bp).

We applied the method used by Rao et al.⁵ to calculate the Hi-C map resolution of our 27 samples (Fig. 2a). The map resolution is defined as the smallest bin size such that 80% of bins have at least 1,000 contacts, which is intended to reflect the finest scale at which local features can be detected reliably. The scripts for calculating the Hi-C resolution of each sample were downloaded directly from the study by Rao et al.⁵.
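The definition above can be illustrated with a short sketch. It is not the published script: it assumes the per-bin contact counts have already been tabulated for each candidate bin size, and the function name, input layout, and toy counts are hypothetical.

```python
from typing import Dict, List, Optional


def map_resolution(
    bin_counts_by_size: Dict[int, List[int]],
    min_contacts: int = 1000,
    min_fraction: float = 0.8,
) -> Optional[int]:
    """Return the smallest bin size at which at least `min_fraction` of bins
    hold at least `min_contacts` contacts, or None if no size qualifies."""
    for bin_size in sorted(bin_counts_by_size):
        counts = bin_counts_by_size[bin_size]
        if not counts:
            continue  # nothing tabulated at this bin size
        covered = sum(1 for c in counts if c >= min_contacts)
        if covered >= min_fraction * len(counts):
            return bin_size
    return None


# Toy counts (illustrative only): contacts per bin at three bin sizes.
toy_counts = {
    5000: [1200, 300, 900, 1500, 400],
    10000: [1500, 1900, 1800, 2400, 2100],
    25000: [4000, 5200, 4700, 6100, 5600],
}
resolution = map_resolution(toy_counts)
```

Scanning the bin sizes in ascending order means the first size that passes is, by construction, the finest one that satisfies the criterion.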

Identification and visualization of TAD and TAD boundaries

Two TAD callers, Arrowhead from JuicerTools (version 1.22.01) and Insulation Score (IS), were used and compared to call TADs at 10 kb resolution for each of the 43 human lymphoblastoid samples other than GM12878^{7, 8}. Arrowhead is more accurate and sensitive on ultra-high-resolution data and locates TAD boundaries by detecting the corners of domains, whereas the IS algorithm was originally created to find TAD boundaries in Hi-C data of relatively low resolution²⁵. For pooled calling, the SCALE-normalized merged contact matrix (44 samples) at 5 kb resolution was used to calculate the insulation score and the corresponding boundary score (BS) with the FAN-C toolkit (version 0.9.26b2) using default parameters, and TAD boundaries were detected at a 100 kb window size (taken from the 4DN domain calling protocol)^{10, 24}. The same method was applied for per-sample calling, using the SCALE-normalized contact matrix of each sample at 10 kb as input. The IS algorithm slides a window along the diagonal of the Hi-C matrix and sums the contacts within that window. Regions with low insulation scores (high boundary scores) are insulating and are referred to as TAD boundaries, whereas regions with high insulation scores (low boundary scores) typically lie inside domains and are referred to as TAD regions, which we also treated as the regions between two neighboring TAD boundaries¹⁰.
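To make the sliding-window idea concrete, here is a simplified sketch of an insulation-style score on a toy matrix. It is not the FAN-C implementation: the window convention (scoring the junction after each bin), the normalization by the mean, and the strict-minimum boundary rule are simplifying assumptions, and all names are hypothetical.

```python
import math
from typing import List, Optional

Matrix = List[List[float]]


def insulation_scores(matrix: Matrix, window: int) -> List[Optional[float]]:
    """Score the junction after each bin i: average the contacts between the
    `window` bins ending at i and the `window` bins starting at i + 1, then
    report log2(average / mean of all averages). Edge bins get None."""
    n = len(matrix)
    raw: List[Optional[float]] = []
    for i in range(n):
        if i - window + 1 < 0 or i + window > n - 1:
            raw.append(None)  # the window would run off the matrix
            continue
        total = sum(
            matrix[r][c]
            for r in range(i - window + 1, i + 1)
            for c in range(i + 1, i + window + 1)
        )
        raw.append(total / (window * window))
    valid = [v for v in raw if v is not None]
    if not valid:
        return raw
    mean = sum(valid) / len(valid)
    return [None if v is None or v <= 0 else math.log2(v / mean) for v in raw]


def boundary_bins(scores: List[Optional[float]]) -> List[int]:
    """Bins whose score is a strict local minimum among scored neighbours."""
    hits = []
    for i in range(1, len(scores) - 1):
        left, mid, right = scores[i - 1], scores[i], scores[i + 1]
        if None not in (left, mid, right) and mid < left and mid < right:
            hits.append(i)
    return hits


def toy_matrix(domain_sizes: List[int], inside: float = 10.0,
               outside: float = 1.0) -> Matrix:
    """Block matrix: strong contacts within a domain, weak contacts between."""
    labels = [d for d, size in enumerate(domain_sizes) for _ in range(size)]
    return [[inside if a == b else outside for b in labels] for a in labels]


scores = insulation_scores(toy_matrix([4, 4]), window=2)
boundaries = boundary_bins(scores)  # the junction after bin 3 separates the domains
```

In the toy matrix the two 4-bin domains meet after bin 3, which is where the score dips, mirroring the statement that low insulation scores mark boundaries.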

The sex chromosomes X and Y were excluded from all analyses because of the sex differences among our samples. ENCODE has processed the GM12878 files for hg38 and released the TADs called by Arrowhead on the SCALE-normalized Hi-C matrices⁶⁰. We therefore compared their TAD results with the GM12878 result processed by our pipeline to determine a minimum boundary strength cut-off value (0.17) for GM12878, which gave the most consistent results. Because our minimum boundary score cut-off of 0.17 is 0.03 lower than the value of 0.2 used in the 4DN TAD calling protocol, we likewise subtracted 0.03 from the 4DN strong boundary score cut-off of 0.5 to obtain our strong cut-off of 0.47. These values were then converted to percentiles to find the corresponding cut-off scores in our LCL call set. After removal of missing and duplicated boundary scores, a minimum boundary score cut-off of 0.1880268762332596 and a strong boundary score cut-off of 0.5477218522527745 were chosen. The insulation scores, TAD boundaries, and corresponding boundary scores were visualized using Juicebox and the FAN-C toolkit in Python 3.7^{9, 10}.
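One way to read the percentile step is sketched below. This is an illustration under assumptions, not the authors' code: it uses a nearest-rank rule without interpolation, and the function name and toy score lists are hypothetical.

```python
from bisect import bisect_left
from typing import List, Optional, Sequence


def clean_scores(scores: Sequence[Optional[float]]) -> List[float]:
    """Drop missing values and duplicates, then sort ascending."""
    return sorted({s for s in scores if s is not None})


def transfer_cutoff(
    reference: Sequence[Optional[float]],
    target: Sequence[Optional[float]],
    cutoff: float,
) -> float:
    """Locate `cutoff` as a rank in the reference scores, then return the
    target score at the same relative rank (nearest-rank, no interpolation)."""
    ref_sorted = clean_scores(reference)
    tgt_sorted = clean_scores(target)
    below = bisect_left(ref_sorted, cutoff)  # reference scores strictly below cutoff
    # Integer arithmetic avoids floating-point surprises when scaling the rank.
    index = min(below * len(tgt_sorted) // len(ref_sorted), len(tgt_sorted) - 1)
    return tgt_sorted[index]


# Toy boundary scores (illustrative only, not values from the study).
reference_scores = [0.05, 0.10, 0.17, 0.30, 0.60, None, 0.10]
target_scores = [0.02, 0.08, 0.19, 0.33, 0.55, 0.70, 0.12, 0.25, 0.41, 0.90]

target_cutoff = transfer_cutoff(reference_scores, target_scores, cutoff=0.17)
```

Here two of the five distinct reference scores fall below 0.17, so the sketch returns the target score sitting at the same 40% rank.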

We next sought to generate the most comprehensive TAD call set aligned with our TAD boundary locations for downstream analysis (Fig. 6). Specifically, 1) we first called TADs directly with Arrowhead at 5 kb resolution to obtain TAD callset1; 2) the boundaries previously obtained from IS were converted to TADs and then filtered by a) excluding TADs in which more than ten percent of the length has missing insulation scores and b) removing TADs that do not contain any insulation score maximum, yielding TAD callset2; 3) we then used “bedtools intersect”⁷¹ to find the overlapped regions detected by both Arrowhead (TAD callset1) and IS (TAD callset2) with a reciprocal overlap above 50%. For overlapped regions that contain more than one TAD detected by Arrowhead, we kept the TAD locations identified by Arrowhead instead of IS, since Arrowhead can find sub-TADs nested or overlapping within larger TADs, which have been shown to carry more cell- or species-specific gene regulatory activity^{5, 16, 56}. We subtracted those regions from both callset1 and callset2 to avoid duplicates; 4) we then merged the remaining overlapped regions from step 3 by taking the smallest start position and the largest end position for each TAD; 5) finally, we combined the TADs detected only by Arrowhead or only by IS, the sub-TAD regions from step 3, and the merged regions from step 4 to generate our final comprehensive TAD catalog.
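The reciprocal-overlap test in step 3 and the merge rule in step 4 can be sketched in a few lines. This is an illustration rather than the bedtools-based pipeline: intervals are assumed to lie on one chromosome with half-open coordinates, the threshold is treated as inclusive, and the names and toy coordinates are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), half-open, same chromosome assumed


def reciprocal_overlap(a: Interval, b: Interval) -> float:
    """Smaller of the two overlap fractions (overlap / len(a), overlap / len(b))."""
    overlap = min(a[1], b[1]) - max(a[0], b[0])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[1] - a[0]), overlap / (b[1] - b[0]))


def match_and_merge(
    callset1: List[Interval],
    callset2: List[Interval],
    min_fraction: float = 0.5,
) -> List[Tuple[Interval, Interval, Interval]]:
    """Return (tad1, tad2, merged) for every pair passing the reciprocal test.
    The merged span takes the smallest start and the largest end."""
    matches = []
    for a in callset1:
        for b in callset2:
            if reciprocal_overlap(a, b) >= min_fraction:
                merged = (min(a[0], b[0]), max(a[1], b[1]))
                matches.append((a, b, merged))
    return matches


# Toy TAD calls (illustrative coordinates only).
arrowhead_tads = [(100_000, 500_000), (600_000, 900_000)]
insulation_tads = [(150_000, 520_000), (880_000, 1_200_000)]

matches = match_and_merge(arrowhead_tads, insulation_tads)
```

Requiring the overlap fraction to hold for both intervals prevents a small TAD nested in a much larger one from counting as the same domain, which is why nested sub-TADs are handled separately in step 3.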

Fig. 6. The step-by-step workflow to process raw Hi-C data into TADs and TAD boundaries in our Hi-C analysis pipeline.

Raw read files from the 44 samples were used as input to Juicer to preprocess the data and create Hi-C maps, which were binned at multiple resolutions. The Insulation Score algorithm was applied to call initial TAD boundaries for each sample. All 44 .hic files were then merged to create a “mega” map, which was used as input to the Arrowhead and Insulation Score algorithms to call TADs and TAD boundaries for the merged LCL call set. The final TAD boundaries for each individual were defined as those sample boundaries located within a merged boundary extended by 25 kb flanking regions to the left of the boundary start site and to the right of the boundary end site¹¹. The two panels in the bottom left corner compare the merged level with the single-sample level, showing the Hi-C contact maps, the insulation scores, and the boundary strengths for the merged call set (5 kb) and the GM19036 sample (10 kb) over the region chr14:35–35.8 Mb.

Individual genotype verification

We used verifyBamID to verify whether the Hi-C sequencing reads in our aligned files match the PanGenie-genotyped SV calls of the 44 individuals used in this study^{67, 68}. The tool can also identify whether the reads have been swapped or contaminated by a mixture of two samples. We ran this check for each of our samples (Table S13) and observed that one sample (GM19204) has 12% or more non-reference bases observed at reference sites, which is much greater than the usual 2% criterion and indicates that this sample is very likely contaminated. For this reason, we excluded this sample from all genotype-related downstream analyses.

SV-eQTL and SV-sQTL candidate selection

Gene expression and splicing quantifications for 26 HGSVC2 samples (excluding GM12329) were derived from RNA-seq and produced as previously described in our HGSVC2 study⁴. The SV-eQTL results, the SV-sQTL results, and the PanGenie-genotyped SV calls used in this study were obtained for the same 26 samples from the HGSVC2 project⁴. The SV-eQTL result includes 34,745 deletions and 25,516 insertions; the SV-sQTL result contains 44,945 deletions and 33,950 insertions. Only deletion and insertion alleles that 1) were genotyped in our PanGenie SV calls, 2) were present in at least five samples, and 3) had the homozygous reference genotype (0/0) present at least once among our 26 individuals were kept in the final analysis.
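The three selection criteria can be expressed as a small filter. The sketch below is illustrative only: it assumes genotypes are stored as strings such as "0/0" or "0/1", interprets "present in at least five samples" as at least five carriers of the alternate allele, and uses hypothetical names and toy data.

```python
from typing import Dict, List

HOM_REF = "0/0"
CARRIER_GENOTYPES = {"0/1", "1/0", "1/1"}


def passes_filters(genotypes: List[str], min_carriers: int = 5) -> bool:
    """Criteria 2 and 3: enough carriers, and at least one 0/0 sample."""
    carriers = sum(1 for g in genotypes if g in CARRIER_GENOTYPES)
    return carriers >= min_carriers and HOM_REF in genotypes


def select_candidates(
    qtl_sv_ids: List[str],
    genotype_table: Dict[str, List[str]],
) -> List[str]:
    """Criterion 1 (SV is present in the genotype table) plus criteria 2 and 3."""
    return [
        sv for sv in qtl_sv_ids
        if sv in genotype_table and passes_filters(genotype_table[sv])
    ]


# Toy genotype table (hypothetical SV identifiers, 8 samples each).
genotype_table = {
    "sv1": ["0/1"] * 5 + ["0/0"] * 3,  # passes all criteria
    "sv2": ["1/1"] * 8,                # no homozygous-reference sample
    "sv3": ["0/1"] * 2 + ["0/0"] * 6,  # too few carriers
}
candidates = select_candidates(["sv1", "sv2", "sv3", "sv4"], genotype_table)
```

Requiring at least one 0/0 sample guarantees that a reference group exists for the genotype-based comparison applied to the retained SVs.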

Quantification of the effects of candidate SVs on TAD boundaries

A Wilcoxon rank-sum test (Mann–Whitney U test) was performed for each SV located within the flanking TAD boundaries, comparing samples with the homozygous reference genotype (0/0) against samples with heterozygous or homozygous deletion genotypes (0/1, 1/0, and 1/1) among the 26 samples, using the boundary scores calculated by IS; an FDR < 0.2 was considered significant. If multiple per-sample TAD boundaries were located inside a single TAD boundary from the merged call set, we used the median value of those sample boundaries. All test statistics were computed using the SciPy library, and box plots were generated using Matplotlib in Python 3.7.3. False discovery rate (FDR) correction was performed with the qvalue package (version 2.28.0) in R 4.2.1.
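For readers unfamiliar with the test, the sketch below computes an exact two-sided rank-sum p-value with the standard library only. It stands in for the SciPy routine used in the study and is not that routine: it enumerates every possible group labeling, so it is practical only for small cohorts, and it does not perform the FDR correction. The function names and toy boundary scores are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence


def midranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average rank of the tied run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def rank_sum_exact(ref: Sequence[float], alt: Sequence[float]) -> float:
    """Two-sided exact p-value: the share of all group labelings whose rank
    sum is at least as far from its expectation as the observed one."""
    ranks = midranks(list(ref) + list(alt))
    n_ref, n_all = len(ref), len(ref) + len(alt)
    expected = n_ref * (n_all + 1) / 2
    observed = abs(sum(ranks[:n_ref]) - expected)
    hits = total = 0
    for combo in combinations(ranks, n_ref):
        total += 1
        if abs(sum(combo) - expected) >= observed - 1e-9:
            hits += 1
    return hits / total


# Toy boundary scores (illustrative): 0/0 samples versus carrier samples.
ref_scores = [0.10, 0.12, 0.15]
alt_scores = [0.40, 0.45, 0.50]
p_value = rank_sum_exact(ref_scores, alt_scores)
```

With three samples per group the smallest attainable two-sided p-value is 2/20, which illustrates why small genotype groups limit the power of this comparison.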

HGSVC chromatin loop calling

The chromatin loops for the merged set and the individual samples were identified by HiCCUPS (GPU version)⁵ at 5 kb and 10 kb resolution, merged across resolutions, on the SCALE-normalized Hi-C matrix with all other parameters at default. For comparison, an alternative modular Hi-C analysis pipeline, distiller-nf⁷², was used to analyze the individual Hi-C samples. Cooler balance, an iterative correction algorithm for matrix balancing, was used to balance the .cool matrix files generated by distiller⁶⁹. The cooltools Hi-C data analysis package has a loop-calling function named call-dots¹². Call-dots is a re-implementation of the HiCCUPS loop-calling algorithm and was used as an additional model for loop calling on all individual Hi-C samples. To generate the merged loop list, HiCCUPS was run on the merged .hic file at 5 kb and 10 kb merged resolutions. The merged cooltools loop list was generated by converting the .hic file to a .cool file using hic2cool convert⁷⁴. The cooler function cooler balance was used for matrix balancing⁶⁹. Dots were called at 5 kb and 10 kb resolution with the following parameters: fdr = 0.1, tile_size = 20000000, max-nans-tolerated = 7, kernel-width = 7, kernel-peak = 4, dot-clustering-radius = 20,000. The 5 kb and 10 kb resolution loop lists were merged using hicMergeLoops from HiCExplorer^{70, 75}. Bedtools pairtopair⁷¹ was used to identify loops that overlap between HiCCUPS and call-dots. Loops were called with both models, and only those detected by both were included in the final reported loop lists to reduce false positives (using the stringent and robust overlap criteria introduced in the Results section).
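The consensus step, keeping only loops found by both callers, can be sketched as a paired-interval comparison. This is an illustration and not the bedtools-based procedure: two loops are treated as the same when both anchors overlap, the optional `slop` padding is an assumption, and the `Loop` record and toy coordinates are hypothetical.

```python
from typing import List, NamedTuple


class Loop(NamedTuple):
    """Hypothetical loop record: two anchors on one chromosome."""
    chrom: str
    start1: int
    end1: int
    start2: int
    end2: int


def intervals_overlap(s1: int, e1: int, s2: int, e2: int, slop: int = 0) -> bool:
    """True if the half-open intervals overlap or lie closer than `slop` bp."""
    return s1 - slop < e2 and s2 - slop < e1


def shared_loops(calls_a: List[Loop], calls_b: List[Loop], slop: int = 0) -> List[Loop]:
    """Loops from calls_a whose BOTH anchors match the anchors of a loop in calls_b."""
    shared = []
    for a in calls_a:
        for b in calls_b:
            if (a.chrom == b.chrom
                    and intervals_overlap(a.start1, a.end1, b.start1, b.end1, slop)
                    and intervals_overlap(a.start2, a.end2, b.start2, b.end2, slop)):
                shared.append(a)
                break  # one supporting loop is enough
    return shared


# Toy loop calls from two callers (illustrative coordinates only).
caller_one = [
    Loop("chr1", 100_000, 105_000, 300_000, 305_000),
    Loop("chr1", 500_000, 505_000, 900_000, 905_000),
]
caller_two = [
    Loop("chr1", 100_000, 110_000, 300_000, 310_000),
    Loop("chr1", 500_000, 505_000, 950_000, 955_000),
]

consensus = shared_loops(caller_one, caller_two)
```

Requiring both anchors to match is what makes the intersection stringent: the second toy loop shares one anchor across callers but is still dropped.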

Data availability

The raw sequencing Hi-C data generated by HGSVC2 discussed in this publication can be downloaded directly at the following link: (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/HGSVC2/working/20200512_Hi-C/)⁴. The raw sequencing Hi-C data of HGSVC1 generated by Bing Ren can be found at (http://www.ebi.ac.uk/ena/data/view/PRJEB11418). The raw sequencing Hi-C and other processed data files generated by the Ren lab are available through the 4D Nucleome data portal (https://data.4dnucleome.org/publications/b8c7c5f5-c76f-457f-9a0d-6c567924b816/#expsets-table). The raw sequencing GM12878 Hi-C data have been deposited in NCBI’s Gene Expression Omnibus (GEO) under accession GSE63525⁵ and are also available through the ENCODE project portal (https://www.encodeproject.org/experiments/ENCSR410MDC/). The RNA-seq data discussed in this study have been deposited at the following link: (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/HGSVC2/working/20200627_RNAseq_JAX/). The PanGenie-genotyped SV calls can be downloaded directly at the following link (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/HGSVC2/release/v2.0/PanGenie_results/)⁴. The eQTL calls can be downloaded from https://drive.google.com/file/d/1S0nARNKakaQ6vAdRr0dsMIv3_dZzD54E/view?usp=share_link, and the sQTL calls can be downloaded from https://drive.google.com/file/d/1L6KhNn-RkC2GA0XrJmmqNbk2urNCH9JZ/view?usp=share_link. On reasonable request, the respective authors can provide additional data to support the conclusions of this study.

Data and code availability

All data reported in this paper will be shared by the lead contact upon request.

All original code for the statistical analysis and pipeline has been deposited on GitHub https://github.com/caragraduate/Hi-C-integrative-catalog and is publicly available as of the date of publication.

Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

Author contributions

The study was designed by C.Li. and overseen by X.S., HGSVC (co-chaired by T.B., J.O.K., E.E.E., and C.L.), and the HGSVC Functional Analysis Working Group. C.Li. conducted data analysis, statistical analysis, and visualization of the Hi-C and associated data and results.

M.J.B. assisted in the analysis of Hi-C data. C.Li. and S.S. performed loop calling. C.Li. and X.S. wrote the manuscript, incorporating contributions from all other co-authors. All authors read and approved the final manuscript.

Declaration of interests

E.E.E. is a scientific advisory board (SAB) member of Variant Bio. C.L. is a SAB member of Nabsys and Genome Insight. The other authors declare no competing interests.

Consortia

The members of the Human Genome Structural Variation Consortium (HGSVC) are Haley J. Abel, Hufsah Ashraf, Peter A. Audano, Anna O. Basile, Christine Beck, Marc Jan Bonder, Harrison Brand, Marta Byrska-Bishop, Mark J.P. Chaisson, Yu Chen, Ken Chen, Zechen Chong, Nelson T. Chuang, Wayne E. Clarke, André Corvelo, Scott E. Devine, Peter Ebert, Jana Ebler, Evan E. Eichler, Uday S. Evani, Susan Fairley, Paul Flicek, Mark B. Gerstein, Maryam Ghareghani, Ira M. Hall, Pille Hallast, William T. Harvey, Patrick Hasenfeld, Alex R. Hastie, Wolfram Höps, PingHsun Hsieh, Sarah Hunt, Miriam K. Konkel, Jan O. Korbel, Sushant Kumar, Charles Lee, Alexandra P. Lewis, Chong Li, Bin Li, Yang I. Li, Jiadong Lin, Mark Loftus, Tsung-Yu Lu, Rebecca Serra Mari, Tobias Marschall, Ryan E. Mills, Zepeng Mu, Katherine M. Munson, David Porubsky, Benjamin Raeder, Tobias Rausch, Allison A. Regier, Jingwen Ren, Bernardo Rodriguez-Martin, Ashley D. Sanders, Martin Santamarina, Xinghua Shi, Chen Song, Oliver Stegle, Michael E. Talkowski, Luke J. Tallon, Jose M.C. Tubio, Aaron M. Wenger, Xiaofei Yang, Kai Ye, Feyza Yilmaz, Xuefang Zhao, Weichen Zhou, Qihui Zhu, and Michael C. Zody.

HGSVC Functional Analysis Working Group

The members of the Human Genome Structural Variation Consortium (HGSVC) functional analysis working group are Anna O. Basile, Bernardo Rodriguez-Martin, Chong Li, Marc Jan Bonder, Marta Byrska-Bishop, Mark J.P. Chaisson, Ken Chen, Wolfram Höps, Yang I. Li, Ryan E. Mills, Zepeng Mu, Sabriya Syed, Qingnan Liang, Michael E. Talkowski, Yukun Tan, Matthew Jensen, Weichen Zhou, and Michael C. Zody.

Acknowledgment

This research is supported by National Institutes of Health (NIH) grants U24HG007497 (to C.L., E.E.E., J.O.K., and T.M.), U01HG010973 (to T.M., E.E.E., and J.O.K.), and R01HG002385 and R01HG010169 (to E.E.E.); the German Federal Ministry for Research and Education (BMBF 031L0184 to J.O.K. and T.M.); the German Research Foundation (DFG 391137747 to T.M.); the German Human Genome-Phenome Archive (DFG [NFDI 1/1] to J.O.K.); and the European Research Council (ERC Consolidator grant 773026 to J.O.K.). E.E.E. is an investigator of the Howard Hughes Medical Institute.

We want to thank Michael Zody, Michael Talkowski, Mark Chaisson, and Weichen Zhou from HGSVC Functional Analysis Working Group for providing critical and constructive advice and discussion about Hi-C analysis. We thank Mohammad Erfan Mowlaei and Emily Thyrum for their feedback on the project. We sincerely extend our gratitude to the people who contributed samples to the 1000 Genomes Project.

References

1.
Forcato, M., Nicoletti, C., Pal, K., Livi, C.M., Ferrari, F., and Bicciato, S. (2017). Comparison of computational methods for Hi-C data analysis. Nat. Methods 14, 679–685. doi:10.1038/nmeth.4325.
2.
Melo, U.S., Schöpflin, R., Acuna-Hidalgo, R., Mensah, M.A., Fischer-Zirnsak, B., Holtgrewe, M., Klever, M.-K., Türkmen, S., Heinrich, V., Pluym, I.D., et al. (2020). Hi-C Identifies Complex Genomic Rearrangements and TAD-Shuffling in Developmental Diseases. Am. J. Hum. Genet. 106, 872–884. doi:10.1016/j.ajhg.2020.04.016.
3.
Spielmann, M., Lupiáñez, D.G., and Mundlos, S. (2018). Structural variation in the 3D genome. Nat. Rev. Genet. 19, 453–467. doi:10.1038/s41576-018-0007-0.
4.
Ebert, P., Audano, P.A., Zhu, Q., Rodriguez-Martin, B., Porubsky, D., Bonder, M.J., Sulovari, A., Ebler, J., Zhou, W., Serra Mari, R., et al. (2021). Haplotype-resolved diverse human genomes and integrated analysis of structural variation. Science 372, eabf7117. doi:10.1126/science.abf7117.
5.
Rao, S.S.P., Huntley, M.H., Durand, N.C., Stamenova, E.K., Bochkov, I.D., Robinson, J.T., Sanborn, A.L., Machol, I., Omer, A.D., Lander, E.S., et al. (2014). A 3D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping. Cell 159, 1665–1680. doi:10.1016/j.cell.2014.11.021.
6.
The 1000 Genomes Project Consortium, Corresponding authors, Auton, A., Abecasis, G.R., Steering committee, Altshuler, D.M., Durbin, R.M., Abecasis, G.R., Bentley, D.R., Chakravarti, A., et al. (2015). A global reference for human genetic variation. Nature 526, 68–74. doi:10.1038/nature15393.
7.
Durand, N.C., Shamim, M.S., Machol, I., Rao, S.S.P., Huntley, M.H., Lander, E.S., and Aiden, E.L. (2016). Juicer Provides a One-Click System for Analyzing Loop-Resolution Hi-C Experiments. Cell Syst. 3, 95–98. doi:10.1016/j.cels.2016.07.002.
8.
Crane, E., Bian, Q., McCord, R.P., Lajoie, B.R., Wheeler, B.S., Ralston, E.J., Uzawa, S., Dekker, J., and Meyer, B.J. (2015). Condensin-driven remodelling of X chromosome topology during dosage compensation. Nature 523, 240–244. doi:10.1038/nature14450.
9.
Durand, N.C., Robinson, J.T., Shamim, M.S., Machol, I., Mesirov, J.P., Lander, E.S., and Aiden, E.L. (2016). Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom. Cell Syst. 3, 99–101. doi:10.1016/j.cels.2015.07.012.
10.
Kruse, K., Hug, C.B., and Vaquerizas, J.M. (2020). FAN-C: a feature-rich framework for the analysis and visualisation of chromosome conformation capture data. Genome Biol. 21, 303. doi:10.1186/s13059-020-02215-9.
11.
Yu, W., He, B., and Tan, K. (2017). Identifying topologically associating domains and subdomains by Gaussian Mixture model And Proportion test. Nat. Commun. 8, 535. doi:10.1038/s41467-017-00478-8.
12.
Open2C, Abdennur, N., Abraham, S., Fudenberg, G., Flyamer, I.M., Galitsyna, A.A., Goloborodko, A., Imakaev, M., Oksuz, B.A., and Venev, S.V. (2022). Cooltools: enabling high-resolution Hi-C analysis in Python. Preprint at bioRxiv, 2022.10.31.514564. doi:10.1101/2022.10.31.514564.
13.
Luo, X., Liu, Y., Dang, D., Hu, T., Hou, Y., Meng, X., Zhang, F., Li, T., Wang, C., Li, M., et al. (2021). 3D Genome of macaque fetal brain reveals evolutionary innovations during primate corticogenesis. Cell 184, 723–740.e21. doi:10.1016/j.cell.2021.01.001.
14.
Szabo, Q., Bantignies, F., and Cavalli, G. (2019). Principles of genome folding into topologically associating domains. Sci. Adv. 5, eaaw1668. doi:10.1126/sciadv.aaw1668.
15.
Ochoa, D., Hercules, A., Carmona, M., Suveges, D., Gonzalez-Uriarte, A., Malangone, C., Miranda, A., Fumis, L., Carvalho-Silva, D., Spitzer, M., et al. (2021). Open Targets Platform: supporting systematic drug–target identification and prioritisation. Nucleic Acids Res. 49, D1302–D1310. doi:10.1093/nar/gkaa1027.
16.
Dixon, J.R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J.S., and Ren, B. (2012). Topological Domains in Mammalian Genomes Identified by Analysis of Chromatin Interactions. Nature 485, 376–380. doi:10.1038/nature11082.
17.
Baran, Y., Subramaniam, M., Biton, A., Tukiainen, T., Tsang, E.K., Rivas, M.A., Pirinen, M., Gutierrez-Arcelus, M., Smith, K.S., Kukurba, K.R., et al. (2015). The landscape of genomic imprinting across diverse adult human tissues. Genome Res. 25, 927–936. doi:10.1101/gr.192278.115.
18.
Robles-Espinoza, C.D., Mohammadi, P., Bonilla, X., and Gutierrez-Arcelus, M. (2021). Allele-specific expression: applications in cancer and technical considerations. Curr. Opin. Genet. Dev. 66, 10–19. doi:10.1016/j.gde.2020.10.007.
19.
de Souza, M.M., Zerlotini, A., Rocha, M.I.P., Bruscadin, J.J., Diniz, W.J. da S., Cardoso, T.F., Cesar, A.S.M., Afonso, J., Andrade, B.G.N., Mudadu, M. de A., et al. (2020). Allele-specific expression is widespread in Bos indicus muscle and affects meat quality candidate genes. Sci. Rep. 10, 10204. doi:10.1038/s41598-020-67089-0.
20.
Shao, L., Xing, F., Xu, C., Zhang, Q., Che, J., Wang, X., Song, J., Li, X., Xiao, J., Chen, L.-L., et al. (2019). Patterns of genome-wide allele-specific expression in hybrid rice and the implications on the genetic basis of heterosis. Proc. Natl. Acad. Sci. U. S. A. 116, 5653–5658. doi:10.1073/pnas.1820513116.
21.
Knight, J.C. (2004). Allele-specific gene expression uncovered. Trends Genet. TIG 20, 113–116. doi:10.1016/j.tig.2004.01.001.
22.
Gorkin, D.U., Qiu, Y., Hu, M., Fletez-Brant, K., Liu, T., Schmitt, A.D., Noor, A., Chiou, J., Gaulton, K.J., Sebat, J., et al. (2019). Common DNA sequence variation influences 3-dimensional conformation of the human genome. Genome Biol. 20, 255. doi:10.1186/s13059-019-1855-4.
23.
Knight, P.A., and Ruiz, D. (2013). A fast algorithm for matrix balancing. IMA J. Numer. Anal. 33, 1029–1047. doi:10.1093/imanum/drs019.
24.
Dekker, J., Belmont, A.S., Guttman, M., Leshyk, V.O., Lis, J.T., Lomvardas, S., Mirny, L.A., O’Shea, C.C., Park, P.J., Ren, B., et al. (2017). The 4D nucleome project. Nature 549, 219–226. doi:10.1038/nature23884.
25.
Chen, F., Li, G., Zhang, M.Q., and Chen, Y. (2018). HiCDB: a sensitive and robust method for detecting contact domain boundaries. Nucleic Acids Res. 46, 11239–11250. doi:10.1093/nar/gky789.
26.
Rajderkar, S., Barozzi, I., Zhu, Y., Hu, R., Zhang, Y., Li, B., Alcaina Caro, A., Fukuda-Yuzawa, Y., Kelman, G., Akeza, A., et al. (2023). Topologically associating domain boundaries are required for normal genome function. Commun. Biol. 6, 1–10. doi:10.1038/s42003-023-04819-w.
27.
Dixon, J.R., Gorkin, D.U., and Ren, B. (2016). Chromatin Domains: The Unit of Chromosome Organization. Mol. Cell 62, 668–680. doi:10.1016/j.molcel.2016.05.018.
28.
Phillips, J.E., and Corces, V.G. (2009). CTCF: master weaver of the genome. Cell 137, 1194–1211. doi:10.1016/j.cell.2009.06.001.
29.
Merkenschlager, M., and Nora, E.P. (2016). CTCF and Cohesin in Genome Folding and Transcriptional Gene Regulation. Annu. Rev. Genomics Hum. Genet. 17, 17–43. doi:10.1146/annurev-genom-083115-022339.
30.
Ibrahim, D.M., and Mundlos, S. (2020). Three-dimensional chromatin in disease: What holds us together and what drives us apart? Curr. Opin. Cell Biol. 64, 1–9. doi:10.1016/j.ceb.2020.01.003.
31.
Tena, J.J., and Santos-Pereira, J.M. (2021). Topologically Associating Domains and Regulatory Landscapes in Development, Evolution and Disease. Front. Cell Dev. Biol. 9, 702787. doi:10.3389/fcell.2021.702787.
32.
Lupiáñez, D.G., Kraft, K., Heinrich, V., Krawitz, P., Brancati, F., Klopocki, E., Horn, D., Kayserili, H., Opitz, J.M., Laxova, R., et al. (2015). Disruptions of topological chromatin domains cause pathogenic rewiring of gene-enhancer interactions. Cell 161, 1012–1025. doi:10.1016/j.cell.2015.04.004.
33.
Shanta, O., Noor, A., Chaisson, M.J.P., Sanders, A.D., Zhao, X., Malhotra, A., Porubsky, D., Rausch, T., Gardner, E.J., Rodriguez, O.L., et al. (2020). The effects of common structural variants on 3D chromatin structure. BMC Genomics 21, 95. doi:10.1186/s12864-020-6516-1.
34.
Ibn-Salem, J., Köhler, S., Love, M.I., Chung, H.-R., Huang, N., Hurles, M.E., Haendel, M., Washington, N.L., Smedley, D., Mungall, C.J., et al. (2014). Deletions of chromosomal regulatory boundaries are associated with congenital disease. Genome Biol. 15, 423. doi:10.1186/s13059-014-0423-1.
35.
Chen, H., Li, C., Zhou, Z., and Liang, H. (2018). Fast-Evolving Human-Specific Neural Enhancers Are Associated with Aging-Related Diseases. Cell Syst. 6, 604–611.e4. doi:10.1016/j.cels.2018.04.002.
36.
de la Torre-Ubieta, L., Stein, J.L., Won, H., Opland, C.K., Liang, D., Lu, D., and Geschwind, D.H. (2018). The Dynamic Landscape of Open Chromatin during Human Cortical Neurogenesis. Cell 172, 289–304.e18. doi:10.1016/j.cell.2017.12.014.
37.
Won, H., de la Torre-Ubieta, L., Stein, J.L., Parikshak, N.N., Huang, J., Opland, C.K., Gandal, M.J., Sutton, G.J., Hormozdiari, F., Lu, D., et al. (2016). Chromosome conformation elucidates regulatory relationships in developing human brain. Nature 538, 523–527. doi:10.1038/nature19847.
38.
Won, H., Huang, J., Opland, C.K., Hartl, C.L., and Geschwind, D.H. (2019). Human evolved regulatory elements modulate genes involved in cortical expansion and neurodevelopmental disease susceptibility. Nat. Commun. 10, 2396. doi:10.1038/s41467-019-10248-3.
39.
Krefting, J., Andrade-Navarro, M.A., and Ibn-Salem, J. (2018). Evolutionary stability of topologically associating domains is associated with conserved gene regulation. BMC Biol. 16, 87. doi:10.1186/s12915-018-0556-x.
40.
McArthur, E., and Capra, J.A. (2021). Topologically associating domain boundaries that are stable across diverse cell types are evolutionarily constrained and enriched for heritability. Am. J. Hum. Genet. 108, 269–283. doi:10.1016/j.ajhg.2021.01.001.
41.
Boltsis, I., Grosveld, F., Giraud, G., and Kolovos, P. (2021). Chromatin Conformation in Development and Disease. Front. Cell Dev. Biol. 9, 723859. doi:10.3389/fcell.2021.723859.
42.
Kim, K., Kim, M., Kim, Y., Lee, D., and Jung, I. (2022). Hi-C as a molecular rangefinder to examine genomic rearrangements. Semin. Cell Dev. Biol. 121, 161–170. doi:10.1016/j.semcdb.2021.04.024.
43.
Weischenfeldt, J., Dubash, T., Drainas, A.P., Mardin, B.R., Chen, Y., Stütz, A.M., Waszak, S.M., Bosco, G., Halvorsen, A.R., Raeder, B., et al. (2017). Pan-cancer analysis of somatic copy-number alterations implicates IRS4 and IGF2 in enhancer hijacking. Nat. Genet. 49, 65–74. doi:10.1038/ng.3722.
44.
Akdemir, K.C., Le, V.T., Chandran, S., Li, Y., Verhaak, R.G., Beroukhim, R., Campbell, P.J., Chin, L., Dixon, J.R., Futreal, P.A., et al. (2020). Disruption of chromatin folding domains by somatic genomic rearrangements in human cancer. Nat. Genet. 52, 294–305. doi:10.1038/s41588-019-0564-y.
45.
Zhang, Y., An, L., Xu, J., Zhang, B., Zheng, W.J., Hu, M., Tang, J., and Yue, F. (2018). Enhancing Hi-C data resolution with deep convolutional neural network HiCPlus. Nat. Commun. 9, 750. doi:10.1038/s41467-018-03113-2.
46.
Liu, T., and Wang, Z. (2019). HiCNN: a very deep convolutional neural network to better enhance the resolution of Hi-C data. Bioinforma. Oxf. Engl. 35, 4222–4228. doi:10.1093/bioinformatics/btz251.
47.
Liu, Q., Lv, H., and Jiang, R. (2019). hicGAN infers super resolution Hi-C data with generative adversarial networks. Bioinforma. Oxf. Engl. 35, i99–i107. doi:10.1093/bioinformatics/btz317.
48.
Hong, H., Jiang, S., Li, H., Du, G., Sun, Y., Tao, H., Quan, C., Zhao, C., Li, R., Li, W., et al. (2020). DeepHiC: A generative adversarial network for enhancing Hi-C data resolution. PLoS Comput. Biol. 16, e1007287. doi:10.1371/journal.pcbi.1007287.
49.
Dimmick, M.C., Lee, L.J., and Frey, B.J. (2020). HiCSR: a Hi-C super-resolution framework for producing highly realistic contact maps. Preprint at bioRxiv, 2020.02.24.961714. doi:10.1101/2020.02.24.961714.
50.
Highsmith, M., and Cheng, J. (2021). VEHiCLE: a Variationally Encoded Hi-C Loss Enhancement algorithm for improving and generating Hi-C data. Sci. Rep. 11, 8880. doi:10.1038/s41598-021-88115-9.
51.
Xu, J., Song, F., Lyu, H., Kobayashi, M., Zhang, B., Zhao, Z., Hou, Y., Wang, X., Luan, Y., Jia, B., et al. (2022). Subtype-specific 3D genome alteration in acute myeloid leukemia. Nature 611, 387–398. doi:10.1038/s41586-022-05365-x.
52.
Pal, K., Forcato, M., and Ferrari, F. (2019). Hi-C analysis: from data generation to integration. Biophys. Rev. 11, 67–78. doi:10.1007/s12551-018-0489-1.
53.
Dixon, J.R., Xu, J., Dileep, V., Zhan, Y., Song, F., Le, V.T., Yardımcı, G.G., Chakraborty, A., Bann, D.V., Wang, Y., et al. (2018). Integrative detection and analysis of structural variation in cancer genomes. Nat. Genet. 50, 1388–1398. doi:10.1038/s41588-018-0195-8.
54.
Chaisson, M.J.P., Sanders, A.D., Zhao, X., Malhotra, A., Porubsky, D., Rausch, T., Gardner, E.J., Rodriguez, O.L., Guo, L., Collins, R.L., et al. (2019). Multi-platform discovery of haplotype-resolved structural variation in human genomes. Nat. Commun. 10, 1784. doi:10.1038/s41467-018-08148-z.
55.
Nurk, S., Koren, S., Rhie, A., Rautiainen, M., Bzikadze, A.V., Mikheenko, A., Vollger, M.R., Altemose, N., Uralsky, L., Gershman, A., et al. (2022). The complete sequence of a human genome. Science 376, 44–53. doi:10.1126/science.abj6987.
56.
Beagan, J.A., and Phillips-Cremins, J.E. (2020). On the existence and functionality of topologically associating domains. Nat. Genet. 52, 8–16. doi:10.1038/s41588-019-0561-1.
57.
Willemin, A., Lopez-Delisle, L., Bolt, C.C., Gadolini, M.-L., Duboule, D., and Rodriguez-Carballo, E. (2021). Induction of a chromatin boundary in vivo upon insertion of a TAD border. PLoS Genet. 17, e1009691. doi:10.1371/journal.pgen.1009691.
58.
Zufferey, M., Tavernari, D., Oricchio, E., and Ciriello, G. (2018). Comparison of computational methods for the identification of topologically associating domains. Genome Biol. 19, 217. doi:10.1186/s13059-018-1596-9.
59.
Yardımcı, G.G., Ozadam, H., Sauria, M.E.G., Ursu, O., Yan, K.-K., Yang, T., Chakraborty, A., Kaul, A., Lajoie, B.R., Song, F., et al. (2019). Measuring the reproducibility and quality of Hi-C data. Genome Biol. 20, 57. doi:10.1186/s13059-019-1658-7.
60.
ENCODE Project Consortium (2004). The ENCODE (ENCyclopedia Of DNA Elements) Project. Science 306, 636–640. doi:10.1126/science.1105136.
61.
Dharanipragada, P., and Parekh, N. (2019). Genome-wide characterization of copy number variations in diffuse large B-cell lymphoma with implications in targeted therapy. Precis. Clin. Med. 2, 246–258. doi:10.1093/pcmedi/pbz024.
62.
Li, Y., Pang, X., Cui, Z., Zhou, Y., Mao, F., Lin, Y., Zhang, X., Shen, S., Zhu, P., Zhao, T., et al. (2020). Genetic factors associated with cancer racial disparity – an integrative study across twenty-one cancer types. Mol. Oncol. 14, 2775–2786. doi:10.1002/1878-0261.12799.
63.
Deng, H., Li, T., Wei, F., Han, W., Xu, X., and Zhang, Y. (2023). High expression of TMEM200A is associated with a poor prognosis and immune infiltration in gastric cancer. Pathol. Oncol. Res. 29, 1610893. doi:10.3389/pore.2023.1610893.
64.
Brown, B.C., Bray, N.L., and Pachter, L. (2018). Expression reflects population structure. PLoS Genet. 14, e1007841. doi:10.1371/journal.pgen.1007841.
65.
Morley, M., Molony, C.M., Weber, T.M., Devlin, J.L., Ewens, K.G., Spielman, R.S., and Cheung, V.G. (2004). Genetic analysis of genome-wide variation in human gene expression. Nature 430, 743–747. doi:10.1038/nature02797.
66.
Dang, D., Zhang, S.-W., Duan, R., and Zhang, S. (2023). Defining the separation landscape of topological domains for decoding consensus domain organization of the 3D genome. Genome Res. 33, 386–400. doi:10.1101/gr.277187.122.
67.
Ebler, J., Ebert, P., Clarke, W.E., Rausch, T., Audano, P.A., Houwaart, T., Mao, Y., Korbel, J.O., Eichler, E.E., Zody, M.C., et al. (2022). Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes. Nat. Genet. 54, 518–525. doi:10.1038/s41588-022-01043-w.
68.
Jun, G., Flickinger, M., Hetrick, K.N., Romm, J.M., Doheny, K.F., Abecasis, G.R., Boehnke, M., and Kang, H.M. (2012). Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data. Am. J. Hum. Genet. 91, 839–848. doi:10.1016/j.ajhg.2012.09.004.
69.
Abdennur, N., and Mirny, L.A. (2020). Cooler: scalable storage for Hi-C data and other genomically labeled arrays. Bioinformatics 36, 311–316. doi:10.1093/bioinformatics/btz540.
70.
Wolff, J., Rabbani, L., Gilsbach, R., Richard, G., Manke, T., Backofen, R., and Grüning, B.A. (2020). Galaxy HiCExplorer 3: a web server for reproducible Hi-C, capture Hi-C and single-cell Hi-C data analysis, quality control and visualization. Nucleic Acids Res. 48, W177–W184. doi:10.1093/nar/gkaa220.
71.
Quinlan, A.R., and Hall, I.M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842. doi:10.1093/bioinformatics/btq033.
72.
Goloborodko, A., Venev, S., Abdennur, N., azkalot1, and Di Tommaso, P. (2019). mirnylab/distiller-nf: v0.3.3rc2. Zenodo. doi:10.5281/zenodo.3350925.
73.
Li, H., and Durbin, R. (2009). Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760. doi:10.1093/bioinformatics/btp324.
74.
Reiff, S.B., Schroeder, A.J., Kırlı, K., Cosolo, A., Bakker, C., Mercado, L., Lee, S., Veit, A.D., Balashov, A.K., Vitzthum, C., et al. (2022). The 4D Nucleome Data Portal as a resource for searching and visualizing curated nucleomics data. Nat. Commun. 13, 2365. doi:10.1038/s41467-022-29697-4.
75.
Ramírez, F., Bhardwaj, V., Arrigoni, L., Lam, K.C., Grüning, B.A., Villaveces, J., Habermann, B., Akhtar, A., and Manke, T. (2018). High-resolution TADs reveal DNA sequences underlying genome organization in flies. Nat. Commun. 9, 189. doi:10.1038/s41467-017-02525-w.
76.
Phillips-Cremins, J.E., Sauria, M.E.G., Sanyal, A., Gerasimova, T.I., Lajoie, B.R., Bell, J.S.K., Ong, C.-T., Hookway, T.A., Guo, C., Sun, Y., et al. (2013). Architectural protein subclasses shape 3D organization of genomes during lineage commitment. Cell 153, 1281–1295. doi:10.1016/j.cell.2013.04.053.
77.
Zhu, X., Qi, C., Wang, R., Lee, J.-H., Shao, J., Bei, L., Xiong, F., Nguyen, P.T., Li, G., Krakowiak, J., et al. (2022). Acute depletion of human core nucleoporin reveals direct roles in transcription control but dispensability for 3D genome organization. Cell Rep. 41, 111576. doi:10.1016/j.celrep.2022.111576.
78.
Mourad, R., Hsu, P.-Y., Juan, L., Shen, C., Koneru, P., Lin, H., Liu, Y., Nephew, K., Huang, T.H., and Li, L. (2014). Estrogen Induces Global Reorganization of Chromatin Structure in Human Breast Cancer Cells. PLoS ONE 9, e113354. doi:10.1371/journal.pone.0113354.
79.
Shrestha, D., Bag, A., Wu, R., Zhang, Y., Tang, X., Qi, Q., Xing, J., and Cheng, Y. (2022). Genomics and epigenetics guided identification of tissue-specific genomic safe harbors. Genome Biol. 23, 199. doi:10.1186/s13059-022-02770-3.

Posted May 15, 2023.

Subject Area

Genomics

[1] 1.↵
Forcato, M., Nicoletti, C., Pal, K., Livi, C.M., Ferrari, F., and Bicciato, S. (2017). Comparison of computational methods for Hi-C data analysis. Nat. Methods 14, 679–685. doi:10.1038/nmeth.4325.
OpenUrl CrossRef PubMed

[2] 2.↵
Melo, U.S., Schöpflin, R., Acuna-Hidalgo, R., Mensah, M.A., Fischer-Zirnsak, B., Holtgrewe, M., Klever, M.-K., Türkmen, S., Heinrich, V., Pluym, I.D., et al. (2020). Hi-C Identifies Complex Genomic Rearrangements and TAD-Shuffling in Developmental Diseases. Am. J. Hum. Genet. 106, 872–884. doi:10.1016/j.ajhg.2020.04.016.
OpenUrl CrossRef PubMed

[3] 3.↵
Spielmann, M., Lupiáñez, D.G., and Mundlos, S. (2018). Structural variation in the 3D genome. Nat. Rev. Genet. 19, 453–467. doi:10.1038/s41576-018-0007-0.
OpenUrl CrossRef PubMed

[4] 4.↵
Ebert, P., Audano, P.A., Zhu, Q., Rodriguez-Martin, B., Porubsky, D., Bonder, M.J., Sulovari, A., Ebler, J., Zhou, W., Serra Mari, R., et al. (2021). Haplotype-resolved diverse human genomes and integrated analysis of structural variation. Science 372, eabf7117. doi:10.1126/science.abf7117.
OpenUrl Abstract/FREE Full Text

[5] 5.↵
Rao, S.S.P., Huntley, M.H., Durand, N.C., Stamenova, E.K., Bochkov, I.D., Robinson, J.T., Sanborn, A.L., Machol, I., Omer, A.D., Lander, E.S., et al. (2014). A 3D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping. Cell 159, 1665–1680. doi:10.1016/j.cell.2014.11.021.
OpenUrl CrossRef PubMed Web of Science

[6] 6.↵
The 1000 Genomes Project Consortium, Corresponding authors, Auton, A., Abecasis, G.R., Steering committee, Altshuler, D.M., Durbin, R.M., Abecasis, G.R., Bentley, D.R., Chakravarti, A., et al. (2015). A global reference for human genetic variation. Nature 526, 68–74. doi:10.1038/nature15393.
OpenUrl CrossRef PubMed

[7] 7.↵
Durand, N.C., Shamim, M.S., Machol, I., Rao, S.S.P., Huntley, M.H., Lander, E.S., and Aiden, E.L. (2016). Juicer Provides a One-Click System for Analyzing Loop-Resolution Hi-C Experiments. Cell Syst. 3, 95–98. doi:10.1016/j.cels.2016.07.002.
OpenUrl CrossRef PubMed

[8] 8.↵
Crane, E., Bian, Q., McCord, R.P., Lajoie, B.R., Wheeler, B.S., Ralston, E.J., Uzawa, S., Dekker, J., and Meyer, B.J. (2015). Condensin-driven remodelling of X chromosome topology during dosage compensation. Nature 523, 240–244. doi:10.1038/nature14450.
OpenUrl CrossRef PubMed

[9] 9.↵
Durand, N.C., Robinson, J.T., Shamim, M.S., Machol, I., Mesirov, J.P., Lander, E.S., and Aiden, E.L. (2016). Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom. Cell Syst. 3, 99–101. doi:10.1016/j.cels.2015.07.012.
OpenUrl CrossRef PubMed

[10] 10.↵
Kruse, K., Hug, C.B., and Vaquerizas, J.M. (2020). FAN-C: a feature-rich framework for the analysis and visualisation of chromosome conformation capture data. Genome Biol. 21, 303. doi:10.1186/s13059-020-02215-9.
OpenUrl CrossRef

[11] 11.↵
Yu, W., He, B., and Tan, K. (2017). Identifying topologically associating domains and subdomains by Gaussian Mixture model And Proportion test. Nat. Commun. 8, 535. doi:10.1038/s41467-017-00478-8.
OpenUrl CrossRef

[12] 12.↵
Open2C, Abdennur, N., Abraham, S., Fudenberg, G., Flyamer, I.M., Galitsyna, A.A., Goloborodko, A., Imakaev, M., Oksuz, B.A., and Venev, S.V. (2022). Cooltools: enabling high-resolution Hi-C analysis in Python. Preprint at bioRxiv, 2022.10.31.514564. doi:10.1101/2022.10.31.514564.
OpenUrl Abstract/FREE Full Text

[13] 13.↵
Luo, X., Liu, Y., Dang, D., Hu, T., Hou, Y., Meng, X., Zhang, F., Li, T., Wang, C., Li, M., et al. (2021). 3D Genome of macaque fetal brain reveals evolutionary innovations during primate corticogenesis. Cell 184, 723–740.e21. doi:10.1016/j.cell.2021.01.001.
OpenUrl CrossRef

[14] 14.↵
Szabo, Q., Bantignies, F., and Cavalli, G. (2019). Principles of genome folding into topologically associating domains. Sci. Adv. 5, eaaw1668. doi:10.1126/sciadv.aaw1668.
OpenUrl FREE Full Text

[15] 15.↵
Ochoa, D., Hercules, A., Carmona, M., Suveges, D., Gonzalez-Uriarte, A., Malangone, C., Miranda, A., Fumis, L., Carvalho-Silva, D., Spitzer, M., et al. (2021). Open Targets Platform: supporting systematic drug–target identification and prioritisation. Nucleic Acids Res. 49, D1302–D1310. doi:10.1093/nar/gkaa1027.
OpenUrl CrossRef

[16] 16.↵
Dixon, J.R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J.S., and Ren, B. (2012). Topological Domains in Mammalian Genomes Identified by Analysis of Chromatin Interactions. Nature 485, 376–380. doi:10.1038/nature11082.
OpenUrl CrossRef PubMed Web of Science

[17] 17.↵
Baran, Y., Subramaniam, M., Biton, A., Tukiainen, T., Tsang, E.K., Rivas, M.A., Pirinen, M., Gutierrez-Arcelus, M., Smith, K.S., Kukurba, K.R., et al. (2015). The landscape of genomic imprinting across diverse adult human tissues. Genome Res. 25, 927–936. doi:10.1101/gr.192278.115.
OpenUrl Abstract/FREE Full Text

[18] 18.↵
Robles-Espinoza, C.D., Mohammadi, P., Bonilla, X., and Gutierrez-Arcelus, M. (2021). Allele-specific expression: applications in cancer and technical considerations. Curr. Opin. Genet. Dev. 66, 10–19. doi:10.1016/j.gde.2020.10.007.
OpenUrl CrossRef

[19] 19.↵
de Souza, M.M., Zerlotini, A., Rocha, M.I.P., Bruscadin, J.J., Diniz, W.J. da S., Cardoso, T.F., Cesar, A.S.M., Afonso, J., Andrade, B.G.N., Mudadu, M. de A., et al. (2020). Allele-specific expression is widespread in Bos indicus muscle and affects meat quality candidate genes. Sci. Rep. 10, 10204. doi:10.1038/s41598-020-67089-0.
OpenUrl CrossRef

[20] 20.↵
Shao, L., Xing, F., Xu, C., Zhang, Q., Che, J., Wang, X., Song, J., Li, X., Xiao, J., Chen, L.-L., et al. (2019). Patterns of genome-wide allele-specific expression in hybrid rice and the implications on the genetic basis of heterosis. Proc. Natl. Acad. Sci. U. S. A. 116, 5653–5658. doi:10.1073/pnas.1820513116.
OpenUrl Abstract/FREE Full Text

[21] 21.↵
Knight, J.C. (2004). Allele-specific gene expression uncovered. Trends Genet. TIG 20, 113–116. doi:10.1016/j.tig.2004.01.001.
OpenUrl CrossRef PubMed Web of Science

[22] 22.↵
Gorkin, D.U., Qiu, Y., Hu, M., Fletez-Brant, K., Liu, T., Schmitt, A.D., Noor, A., Chiou, J., Gaulton, K.J., Sebat, J., et al. (2019). Common DNA sequence variation influences 3-dimensional conformation of the human genome. Genome Biol. 20, 255. doi:10.1186/s13059-019-1855-4.
OpenUrl CrossRef

[23] 23.↵
Knight, P.A., and Ruiz, D. (2013). A fast algorithm for matrix balancing. IMA J. Numer. Anal. 33, 1029–1047. doi:10.1093/imanum/drs019.
OpenUrl CrossRef PubMed

[24] 24.↵
Dekker, J., Belmont, A.S., Guttman, M., Leshyk, V.O., Lis, J.T., Lomvardas, S., Mirny, L.A., O’Shea, C.C., Park, P.J., Ren, B., et al. (2017). The 4D nucleome project. Nature 549, 219–226. doi:10.1038/nature23884.
OpenUrl CrossRef PubMed

[25] 25.↵
Chen, F., Li, G., Zhang, M.Q., and Chen, Y. (2018). HiCDB: a sensitive and robust method for detecting contact domain boundaries. Nucleic Acids Res. 46, 11239–11250. doi:10.1093/nar/gky789.
OpenUrl CrossRef

[26] 26.↵
Rajderkar, S., Barozzi, I., Zhu, Y., Hu, R., Zhang, Y., Li, B., Alcaina Caro, A., Fukuda-Yuzawa, Y., Kelman, G., Akeza, A., et al. (2023). Topologically associating domain boundaries are required for normal genome function. Commun. Biol. 6, 1–10. doi:10.1038/s42003-023-04819-w.
OpenUrl CrossRef

[27] 27.↵
Dixon, J.R., Gorkin, D.U., and Ren, B. (2016). Chromatin Domains: The Unit of Chromosome Organization. Mol. Cell 62, 668–680. doi:10.1016/j.molcel.2016.05.018.
OpenUrl CrossRef PubMed

[28] 28.↵
Phillips, J.E., and Corces, V.G. (2009). CTCF: master weaver of the genome. Cell 137, 1194–1211. doi:10.1016/j.cell.2009.06.001.
OpenUrl CrossRef PubMed Web of Science

[29] 29.↵
Merkenschlager, M., and Nora, E.P. (2016). CTCF and Cohesin in Genome Folding and Transcriptional Gene Regulation. Annu. Rev. Genomics Hum. Genet. 17, 17–43. doi:10.1146/annurev-genom-083115-022339.
OpenUrl CrossRef PubMed

[30] 30.
Ibrahim, D.M., and Mundlos, S. (2020). Three-dimensional chromatin in disease: What holds us together and what drives us apart? Curr. Opin. Cell Biol. 64, 1–9. doi:10.1016/j.ceb.2020.01.003.
OpenUrl CrossRef

[31] 31.
Tena, J.J., and Santos-Pereira, J.M. (2021). Topologically Associating Domains and Regulatory Landscapes in Development, Evolution and Disease. Front. Cell Dev. Biol. 9, 702787. doi:10.3389/fcell.2021.702787.
OpenUrl CrossRef

[32] 32.↵
Lupiáñez, D.G., Kraft, K., Heinrich, V., Krawitz, P., Brancati, F., Klopocki, E., Horn, D., Kayserili, H., Opitz, J.M., Laxova, R., et al. (2015). Disruptions of topological chromatin domains cause pathogenic rewiring of gene-enhancer interactions. Cell 161, 1012–1025. doi:10.1016/j.cell.2015.04.004.
OpenUrl CrossRef PubMed

[33] 33.↵
Shanta, O., Noor, A., Chaisson, M.J.P., Sanders, A.D., Zhao, X., Malhotra, A., Porubsky, D., Rausch, T., Gardner, E.J., Rodriguez, O.L., et al. (2020). The effects of common structural variants on 3D chromatin structure. BMC Genomics 21, 95. doi:10.1186/s12864-020-6516-1.
OpenUrl CrossRef

[34] 34.↵
Ibn-Salem, J., Köhler, S., Love, M.I., Chung, H.-R., Huang, N., Hurles, M.E., Haendel, M., Washington, N.L., Smedley, D., Mungall, C.J., et al. (2014). Deletions of chromosomal regulatory boundaries are associated with congenital disease. Genome Biol. 15, 423. doi:10.1186/s13059-014-0423-1.
OpenUrl CrossRef PubMed

[35] 35.
Chen, H., Li, C., Zhou, Z., and Liang, H. (2018). Fast-Evolving Human-Specific Neural Enhancers Are Associated with Aging-Related Diseases. Cell Syst. 6, 604–611.e4. doi:10.1016/j.cels.2018.04.002.
OpenUrl CrossRef

[36] 36.
de la Torre-Ubieta, L., Stein, J.L., Won, H., Opland, C.K., Liang, D., Lu, D., and Geschwind, D.H. (2018). The Dynamic Landscape of Open Chromatin during Human Cortical Neurogenesis. Cell 172, 289–304.e18. doi:10.1016/j.cell.2017.12.014.
OpenUrl CrossRef PubMed

[37] 37.
Won, H., de la Torre-Ubieta, L., Stein, J.L., Parikshak, N.N., Huang, J., Opland, C.K., Gandal, M.J., Sutton, G.J., Hormozdiari, F., Lu, D., et al. (2016). Chromosome conformation elucidates regulatory relationships in developing human brain. Nature 538, 523–527. doi:10.1038/nature19847.
OpenUrl CrossRef PubMed

[38] 38.↵
Won, H., Huang, J., Opland, C.K., Hartl, C.L., and Geschwind, D.H. (2019). Human evolved regulatory elements modulate genes involved in cortical expansion and neurodevelopmental disease susceptibility. Nat. Commun. 10, 2396. doi:10.1038/s41467-019-10248-3.
OpenUrl CrossRef

[39] 39.↵
Krefting, J., Andrade-Navarro, M.A., and Ibn-Salem, J. (2018). Evolutionary stability of topologically associating domains is associated with conserved gene regulation. BMC Biol. 16, 87. doi:10.1186/s12915-018-0556-x.
OpenUrl CrossRef

[40] 40.↵
McArthur, E., and Capra, J.A. (2021). Topologically associating domain boundaries that are stable across diverse cell types are evolutionarily constrained and enriched for heritability. Am. J. Hum. Genet. 108, 269–283. doi:10.1016/j.ajhg.2021.01.001.
OpenUrl CrossRef PubMed

[41] 41.↵
Boltsis, I., Grosveld, F., Giraud, G., and Kolovos, P. (2021). Chromatin Conformation in Development and Disease. Front. Cell Dev. Biol. 9, 723859. doi:10.3389/fcell.2021.723859.
OpenUrl CrossRef

[42] 42.↵
Kim, K., Kim, M., Kim, Y., Lee, D., and Jung, I. (2022). Hi-C as a molecular rangefinder to examine genomic rearrangements. Semin. Cell Dev. Biol. 121, 161–170. doi:10.1016/j.semcdb.2021.04.024.
OpenUrl CrossRef

[43] 43.↵
Weischenfeldt, J., Dubash, T., Drainas, A.P., Mardin, B.R., Chen, Y., Stütz, A.M., Waszak, S.M., Bosco, G., Halvorsen, A.R., Raeder, B., et al. (2017). Pan-cancer analysis of somatic copy-number alterations implicates IRS4 and IGF2 in enhancer hijacking. Nat. Genet. 49, 65–74. doi:10.1038/ng.3722.
OpenUrl CrossRef PubMed

[44] 44.↵
Akdemir, K.C., Le, V.T., Chandran, S., Li, Y., Verhaak, R.G., Beroukhim, R., Campbell, P.J., Chin, L., Dixon, J.R., Futreal, P.A., et al. (2020). Disruption of chromatin folding domains by somatic genomic rearrangements in human cancer. Nat. Genet. 52, 294–305. doi:10.1038/s41588-019-0564-y.
OpenUrl CrossRef

[45] 45.↵
Zhang, Y., An, L., Xu, J., Zhang, B., Zheng, W.J., Hu, M., Tang, J., and Yue, F. (2018). Enhancing Hi-C data resolution with deep convolutional neural network HiCPlus. Nat. Commun. 9, 750. doi:10.1038/s41467-018-03113-2.
OpenUrl CrossRef

[46] 46.↵
Liu, T., and Wang, Z. (2019). HiCNN: a very deep convolutional neural network to better enhance the resolution of Hi-C data. Bioinforma. Oxf. Engl. 35, 4222–4228. doi:10.1093/bioinformatics/btz251.
OpenUrl CrossRef

[47] 47.↵
Liu, Q., Lv, H., and Jiang, R. (2019). hicGAN infers super resolution Hi-C data with generative adversarial networks. Bioinforma. Oxf. Engl. 35, i99–i107. doi:10.1093/bioinformatics/btz317.
OpenUrl CrossRef

[48] 48.↵
Hong, H., Jiang, S., Li, H., Du, G., Sun, Y., Tao, H., Quan, C., Zhao, C., Li, R., Li, W., et al. (2020). DeepHiC: A generative adversarial network for enhancing Hi-C data resolution. PLoS Comput. Biol. 16, e1007287. doi:10.1371/journal.pcbi.1007287.
OpenUrl CrossRef

[49] 49.↵
Dimmick, M.C., Lee, L.J., and Frey, B.J. (2020). HiCSR: a Hi-C super-resolution framework for producing highly realistic contact maps. 2020.02.24.961714. doi:10.1101/2020.02.24.961714.
OpenUrl Abstract/FREE Full Text

[50] 50.↵
Highsmith, M., and Cheng, J. (2021). VEHiCLE: a Variationally Encoded Hi-C Loss Enhancement algorithm for improving and generating Hi-C data. Sci. Rep. 11, 8880. doi:10.1038/s41598-021-88115-9.
OpenUrl CrossRef

[51] 51.↵
Xu, J., Song, F., Lyu, H., Kobayashi, M., Zhang, B., Zhao, Z., Hou, Y., Wang, X., Luan, Y., Jia, B., et al. (2022). Subtype-specific 3D genome alteration in acute myeloid leukemia. Nature 611, 387–398. doi:10.1038/s41586-022-05365-x.
OpenUrl CrossRef

[52] 52.↵
Pal, K., Forcato, M., and Ferrari, F. (2019). Hi-C analysis: from data generation to integration. Biophys. Rev. 11, 67–78. doi:10.1007/s12551-018-0489-1.
OpenUrl CrossRef

[53] 53.↵
Dixon, J.R., Xu, J., Dileep, V., Zhan, Y., Song, F., Le, V.T., Yardımcı, G.G., Chakraborty, A., Bann, D.V., Wang, Y., et al. (2018). Integrative detection and analysis of structural variation in cancer genomes. Nat. Genet. 50, 1388–1398. doi:10.1038/s41588-018-0195-8.
OpenUrl CrossRef PubMed

[54] 54.↵
Chaisson, M.J.P., Sanders, A.D., Zhao, X., Malhotra, A., Porubsky, D., Rausch, T., Gardner, E.J., Rodriguez, O.L., Guo, L., Collins, R.L., et al. (2019). Multi-platform discovery of haplotype-resolved structural variation in human genomes. Nat. Commun. 10, 1784. doi:10.1038/s41467-018-08148-z.
OpenUrl CrossRef PubMed

[55] 55.↵
Nurk, S., Koren, S., Rhie, A., Rautiainen, M., Bzikadze, A.V., Mikheenko, A., Vollger, M.R., Altemose, N., Uralsky, L., Gershman, A., et al. (2022). The complete sequence of a human genome. Science 376, 44–53. doi:10.1126/science.abj6987.
OpenUrl CrossRef PubMed

[56] 56.↵
Beagan, J.A., and Phillips-Cremins, J.E. (2020). On the existence and functionality of topologically associating domains. Nat. Genet. 52, 8–16. doi:10.1038/s41588-019-0561-1.
OpenUrl CrossRef PubMed

[57] 57.↵
Willemin, A., Lopez-Delisle, L., Bolt, C.C., Gadolini, M.-L., Duboule, D., and Rodriguez-Carballo, E. (2021). Induction of a chromatin boundary in vivo upon insertion of a TAD border. PLoS Genet. 17, e1009691. doi:10.1371/journal.pgen.1009691.
OpenUrl CrossRef

[58] 58.↵
Zufferey, M., Tavernari, D., Oricchio, E., and Ciriello, G. (2018). Comparison of computational methods for the identification of topologically associating domains. Genome Biol. 19, 217. doi:10.1186/s13059-018-1596-9.
OpenUrl CrossRef

[59] 59.↵
Yardımcı, G.G., Ozadam, H., Sauria, M.E.G., Ursu, O., Yan, K.-K., Yang, T., Chakraborty, A., Kaul, A., Lajoie, B.R., Song, F., et al. (2019). Measuring the reproducibility and quality of Hi-C data. Genome Biol. 20, 57. doi:10.1186/s13059-019-1658-7.
OpenUrl CrossRef PubMed

[60] 60.↵
The ENCODE (ENCyclopedia Of DNA Elements) Project (2004). Science 306, 636–640. doi:10.1126/science.1105136.
OpenUrl Abstract/FREE Full Text

[61] 61.↵
Dharanipragada, P., and Parekh, N. (2019). Genome-wide characterization of copy number variations in diffuse large B-cell lymphoma with implications in targeted therapy. Precis. Clin. Med. 2, 246–258. doi:10.1093/pcmedi/pbz024.
OpenUrl CrossRef

[62] 62.
Li, Y., Pang, X., Cui, Z., Zhou, Y., Mao, F., Lin, Y., Zhang, X., Shen, S., Zhu, P., Zhao, T., et al. (2020). Genetic factors associated with cancer racial disparity – an integrative study across twenty-one cancer types. Mol. Oncol. 14, 2775–2786. doi:10.1002/1878-0261.12799.
OpenUrl CrossRef

[63] 63.↵
Deng, H., Li, T., Wei, F., Han, W., Xu, X., and Zhang, Y. (2023). High expression of TMEM200A is associated with a poor prognosis and immune infiltration in gastric cancer. Pathol. Oncol. Res. 29, 1610893. doi:10.3389/pore.2023.1610893.
OpenUrl CrossRef

[64] 64.↵
Brown, B.C., Bray, N.L., and Pachter, L. (2018). Expression reflects population structure. PLoS Genet. 14, e1007841. doi:10.1371/journal.pgen.1007841.
OpenUrl CrossRef

[65] 65.↵
Morley, M., Molony, C.M., Weber, T.M., Devlin, J.L., Ewens, K.G., Spielman, R.S., and Cheung, V.G. (2004). Genetic analysis of genome-wide variation in human gene expression. Nature 430, 743–747. doi:10.1038/nature02797.
OpenUrl CrossRef PubMed Web of Science

[66] 66.↵
Dang, D., Zhang, S.-W., Duan, R., and Zhang, S. (2023). Defining the separation landscape of topological domains for decoding consensus domain organization of the 3D genome. Genome Res. 33, 386–400. doi:10.1101/gr.277187.122.
OpenUrl Abstract/FREE Full Text

[67] 67.↵
Ebler, J., Ebert, P., Clarke, W.E., Rausch. T., Audano. P. A., Houwaart, T., Mao, Y., Korbel J. O., Eichler, E. E., Zody, M. C., et al. (2022). Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes. Nat Genet 54, 518–525. doi:10.1038/s41588-022-01043-w
OpenUrl CrossRef PubMed

[68] 68.↵
Jun, G., Flickinger, M., Hetrick, K. N., Romm, J. M., Doheny, K. F., Abecasis, G. R., Boehnke, M., and Kang, H. M. (2012). Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data. Am J Hum Genet., 91(5), 839–848. doi:10.1016/j.ajhg.2012.09.004.
OpenUrl CrossRef PubMed

[69] 69.↵
Abdennur, N., & Mirny, L. A. (2020). Cooler: scalable storage for Hi-C data and other genomically labeled arrays. Bioinformatics, 36(1), 311–316. doi:10.1093/bioinformatics/btz540.
OpenUrl CrossRef PubMed

[70] 70.↵
Wolff, J., Rabbani, L., Gilsbach, R., Richard, G., Manke, T., Backofen, R., & Grüning, B. A. (2020). Galaxy HiCExplorer 3: a web server for reproducible Hi-C, capture Hi-C and single-cell Hi-C data analysis, quality control and visualization. Nucleic Acids Res, 48(W1), W177–W184. doi:10.1093/nar/gkaa220.
OpenUrl CrossRef

[71] 71.↵
Quinlan, A. R., & Hall, I. M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 26(6), 841–842. /doi:10.1093/bioinformatics/btq033
OpenUrl CrossRef PubMed Web of Science

[72] 72.↵
Anton Goloborodko, Sergey Venev, Nezar Abdennur, azkalot1, & Paolo Di Tommaso. (2019). mirnylab/distiller-nf: v0.3.3rc2 (v0.3.3rc2). Zenodo. doi:10.5281/zenodo.3350925.
OpenUrl CrossRef

[73] 73.↵
Li, H., and Durbin, R.,(2009). Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25.14: 1754–1760. doi:10.1093/bioinformatics/btp324
OpenUrl CrossRef PubMed Web of Science

[74] 74.↵
Reiff, S.B., Schroeder, A.J., Kırlı, K., Cosolo, A., Bakker, C., Mercado, L., Lee, S., Veit, A. D., Balashov, A. K., Vitzthum, C., et al. (2022). The 4D Nucleome Data Portal as a resource for searching and visualizing curated nucleomics data. Nat Commun 13, 2365. doi:10.1038/s41467-022-29697-4
OpenUrl CrossRef

[75] 75.↵
Ramírez, F., Bhardwaj, V., Arrigoni, L., Lam K. C., Grüning, B. A., Villaveces, J., Habermann, B., Akhtar, A., & Manke, T. (2018). High-resolution TADs reveal DNA sequences underlying genome organization in flies. Nat Commun 9, 189. doi:10.1038/s41467-017-02525-w
OpenUrl CrossRef PubMed

[76] 76.↵
Phillips-Cremins, J.E., Sauria, M.E.G., Sanyal, A., Gerasimova, T.I., Lajoie, B.R., Bell, J.S.K., Ong, C.-T., Hookway, T.A., Guo, C., Sun, Y., et al. (2013). Architectural protein subclasses shape 3D organization of genomes during lineage commitment. Cell 153, 1281–1295. doi:10.1016/j.cell.2013.04.053.
OpenUrl CrossRef PubMed Web of Science

[77] 77.↵
Zhu, X., Qi, C., Wang, R., Lee, J.-H., Shao, J., Bei, L., Xiong, F., Nguyen, P.T., Li, G., Krakowiak, J., et al. (2022). Acute depletion of human core nucleoporin reveals direct roles in transcription control but dispensability for 3D genome organization. Cell Rep. 41, 111576. doi:10.1016/j.celrep.2022.111576.
OpenUrl CrossRef

[78] 78.↵
Mourad, R., Hsu, P.-Y., Juan, L., Shen, C., Koneru, P., Lin, H., Liu, Y., Nephew, K., Huang, T.H., and Li, L. (2014). Estrogen Induces Global Reorganization of Chromatin Structure in Human Breast Cancer Cells. PLoS ONE 9, e113354. doi:10.1371/journal.pone.0113354.
OpenUrl CrossRef PubMed

[79] 79.↵
Shrestha, D., Bag, A., Wu, R., Zhang, Y., Tang, X., Qi, Q., Xing, J., and Cheng, Y. (2022). Genomics and epigenetics guided identification of tissue-specific genomic safe harbors. Genome Biol. 23, 199. doi:10.1186/s13059-022-02770-3.
OpenUrl CrossRef