Abstract
Three-dimensional genome organization plays a critical role in DNA function. Flexible chromatin structure suggested that the genome is phase-separated to form A/B compartments in interphase nuclei. Here, we examined this hypothesis by developing a polymer model of the whole genome of human cells and assessing the impact of phase separation on genome structure. Upon entry to the G1 phase, the simulated genome expanded according to heterogeneous repulsion among chromatin chains, which moved chromatin heterogeneously, inducing phase separation. This repulsion-driven phase separation quantitatively reproduces the experimentally observed chromatin domains, A/B compartments, lamina-associated domains, and nucleolus-associated domains, consistently explaining nuclei of different human cells and predicting their dynamic fluctuations. We propose that phase separation induced by heterogeneous repulsive interactions among chromatin chains largely determines dynamic genome organization.
One-Sentence Summary
The whole-genome simulation showed the importance of repulsion-driven phase separation of chromatin in genome organization.
Three-dimensional genome structure and its dynamics play a crucial role in regulating eukaryotic DNA function (1–3). The recent development of techniques, such as high-throughput chromosome conformation capture (Hi-C) (4, 5), electron microscopy (6), and super-resolution microscopy (7–10), has enhanced our understanding of genome organization. For further clarifying the mechanisms of genome organization, it is necessary to develop reliable computational models that can bridge these different experimental paradigms. Computational polymer models of individual chromosomes and their complexes were developed using the Hi-C data of global chromatin contacts as the input to deduce the knowledge-based forces on chromatin (11–14). More refined input data such as the global genome-wide contact pattern in the single-cell Hi-C data was necessary for modeling the whole-genome structure of mouse and human cells (15–17). For further elucidating the principles of genome organization, it is highly desirable to develop a physical model of the whole genome using straightforward assumptions instead of fitting the model to the vast amount of experimental data on the global genome conformation.
In physical modeling of the whole genome, an important subject is to examine the possible phase separation of chromatin and assess its impact on genome structure. Chromatin shows flexible configurations (6–8) and movements (9, 10, 18), suggesting that chromatin in interphase nuclei is dynamically phase-separated to determine the genome structure (19–22). In the present study, we tested this hypothesis by examining the mechanism of phase separation. A previously proposed mechanism, which is still under debate (23), is the droplet-like condensation of factors such as HP1; these condensates may mediate attraction between heterochromatin regions, leading to phase separation of heterochromatin from euchromatin (24, 25). Following this idea, the previous whole-genome models assumed attractive interactions, but these interactions spontaneously gathered heterochromatin toward the nuclear center, leading to an unusual genomic configuration. This anomalous chromatin distribution was remedied in the models by assuming counteracting attractive interactions between chromatin and the nuclear lamina (21, 26). However, the mechanism to establish such a balance among competing interactions in the nucleus was unclear. In the present study, we resolved this difficulty by focusing on repulsion rather than attraction in chromatin interactions. We consider a polymer model that describes the heterogeneously distributed physical properties of chromatin. With heterogeneous repulsive interactions among chromatin regions, the simulated genome unfolded from the mitotic chromosomes, which generated heterogeneous movement of chromatin chains, leading to phase separation of chromatin in the G1 phase. This repulsion-driven phase separation quantitatively explains the genome organization of human fibroblast (IMR90) and lymphoblastoid (GM12878) cells and predicts the dynamic fluctuations that remain after the genome reaches the G1 phase.
Neighboring region contact index
The interactions between chromatin regions depend on the local physical properties of chromatin. We inferred these physical properties from the local chromatin contacts. Fig. 1A shows a distribution of the ratio of observed/expected contact frequencies obtained from the Hi-C data (5), $m_{k,k+s}/F(s)$, where $m_{k,k+s}$ is the observed contact frequency between the kth and (k + s)th positions along the sequence, and F(s) is the mean contact frequency for the sequence separation s. Contacts between chromatin loci with the sequence separation s < 300 kb are more frequent in compartment A than in compartment B, while contacts with s > 300 kb are more frequent in compartment B (Fig. 1A). Here, compartment A/B is identified by the principal component analysis (PCA) of the Hi-C contact matrix of the genome (4). In other words, the properties of chromatin in a few hundred kb scale are correlated to the compartments defined in hundreds of Mb or the larger scale. We quantified this correlation by defining the neighboring region contact index (NCI) in terms of $C_{i,j} = \sum_{k \in (i\text{th region})} \sum_{l \in (j\text{th region})} m_{k,l}$, the sum of the frequencies of contacts between the 50-kb chromatin regions labeled i and j along the sequence. As shown in Figs. 1B and 1C, NCI correlates with the compartment signal defined by PCA. This finding suggests that the A/B compartmentalization originates from the heterogeneity of local chromatin properties as captured by NCI. We examined this hypothesis by performing the polymer simulation.
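The two quantities used above can be illustrated with a minimal sketch, assuming the Hi-C contacts of one chromosome are held in a dense symmetric NumPy array `m`. The function names and the `bins_per_region` parameter are hypothetical and are not taken from the source code of this study; the NCI itself is not computed here.

```python
import numpy as np


def observed_over_expected(m):
    """Return m[k, k+s] / F(s), where F(s) is the mean contact
    frequency over all pairs with sequence separation s."""
    n = m.shape[0]
    oe = np.zeros((n, n), dtype=float)
    for s in range(n):
        diag = np.diagonal(m, offset=s).astype(float)
        f_s = diag.mean()  # F(s): mean contact frequency at separation s
        if f_s > 0:
            idx = np.arange(n - s)
            oe[idx, idx + s] = diag / f_s
            oe[idx + s, idx] = diag / f_s  # keep the matrix symmetric
    return oe


def region_contact_sums(m, bins_per_region):
    """Return C[i, j], the sum of m[k, l] over bins k in region i
    and bins l in region j (trailing bins that do not fill a whole
    region are dropped; this trimming is an assumption)."""
    n_regions = m.shape[0] // bins_per_region
    n = n_regions * bins_per_region
    blocks = m[:n, :n].reshape(n_regions, bins_per_region,
                               n_regions, bins_per_region)
    return blocks.sum(axis=(1, 3))
```

For example, with contacts binned at 10 kb, `region_contact_sums(m, 5)` would give the sums between 50-kb regions.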
First, we defined the property of each chromatin region using NCI; the 100 kb region with Zw ≥ 0.3 is called the type-A region, where Zw is the Z score of NCI. The region with Zw ≤ −0.3 is called the type-B region, and the region with −0.3 < Zw < 0.3 is called the type-u region. The type-A/B/u sequence (Fig. 2A), thus derived from the local Hi-C data, was used as the input into our polymer simulation. We consider heteropolymer chains connecting type-A, B, and u beads by springs, with each bead representing a 100-kb chromatin region. Then, we compare the predicted results from the polymer simulation with the experimentally observed global Hi-C contact data. These global data have a much larger size than the input; therefore, this comparison between the simulated and observed data should give a stringent test on the physical assumptions adopted in the simulation. We note that the other definitions of the sequence of chromatin properties, such as the histone modification patterns (20, 27, 28), are compatible with the present polymer model as the alternative input into the simulation. Here, we used the type-A/B/u sequence instead of the other definitions to restrict ourselves to using only the local physical chromatin properties as the input.
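The labeling rule above can be sketched as follows. This is a minimal illustration that assumes the NCI values are already available as a one-dimensional array; whether the Z score is taken genome-wide or per chromosome is not stated here, so the single global normalization below is an assumption, and the function name is hypothetical.

```python
import numpy as np


def classify_regions(nci, threshold=0.3):
    """Label each region A, B, or u from the Z score of its NCI.

    Zw >= threshold -> "A"; Zw <= -threshold -> "B"; otherwise "u".
    """
    nci = np.asarray(nci, dtype=float)
    # Z score over all supplied regions (assumed normalization)
    z = (nci - nci.mean()) / nci.std()
    labels = np.full(nci.shape, "u", dtype="<U1")
    labels[z >= threshold] = "A"
    labels[z <= -threshold] = "B"
    return z, labels
```

The returned label array plays the role of the type-A/B/u sequence that is fed into the polymer simulation.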
Repulsion-driven phase separation of chromatin
Because the typical size of the chromatin loop is ~ 200 kb (5), the larger NCI in a type-A region implies the more frequent intraloop contacts. These frequent contacts in the type-A region should be possible when the chromatin configuration is flexible enough to allow various contacts within the loop. This flexibility correlates with the observation that type-A (type-B) regions are abundant in euchromatin (heterochromatin). Therefore, we expected that the chromatin configuration in type-A regions is looser (7) and more flexible (9) than type-B regions. The loose configuration should allow a spatial overlap of two type-A regions. This feasible overlapping was modeled with a soft Gaussian-like repulsive potential UAA(d) between two type-A regions whose centers are separated by distance d (Fig. 2B). A harder repulsive interaction is expected between type-B regions having a tighter configuration. However, chromatin chains should be intermingled for small enough d, inducing an effective attraction between type-B chains to compensate for the repulsion. These features of the interaction were modeled by a box-like repulsive potential UBB(d) (Fig. 2B); UBB(d) shows strong repulsion between two type-B regions that are in contact, while UBB(d) shows weak repulsion when the type-B regions largely overlap (Supplementary Text). The repulsion between two type-u regions was assumed to be , and we set with α and β being A, B, or u.
This set of interactions induces phase separation of type-A and B regions when we confine them in a high-density space. For example, in a simple model of the polymer blend consisting of polymers of type-A segments and polymers of type-B segments, the Brownian motion of polymers induces phase separation (Fig. 2C). The mean square displacement (MSD) analysis (grayscale in Fig. 2D) shows that the type-B segments are packed in a more solid-like manner, whereas the type-A segments show a more fluid behavior. The large fluctuations of type-A segments allow the type-A segments to merge into the phase-A domain. In particular, in the system confined in a rigid sphere, the type-A segments occupy the inner region to acquire the volume allowing the motion, while the type-B segments are packed at the periphery, analogously to the heterochromatin/euchromatin separation in cells (Figs. 2C and 2D). We used these repulsive potentials Uαβ to simulate the interphase genome structure.
During interphase in the human nucleus, a chromosome is displaced at most ~2 μm (29), a much shorter distance than the nuclear size; the system is neither stirred nor equilibrated during interphase. Therefore, to explain genome organization in interphase, the process of structure formation at the entry to the G1 phase needs to be carefully considered (30). We simulated an anatelophase genome by pulling a centromere locus of each condensed chromosome chain in one direction in the model space (Fig. 3A). Then, from the configuration thus obtained, we started the simulation of decompression by assuming the disappearance of the condensin constraints at this stage, which allowed chromosomes to expand with repulsion among chromatin chains (Fig. 3A, Movies S1–S3). The nuclear envelope forms during this expansion (31). We simulated this envelope formation by assuming a spheroid (IMR90) or sphere (GM12878) surface, whose radii varied dynamically by balancing the pressure from outside the nucleus and the one arising from the repulsion among chromatin chains. The nucleoli also form during this expansion (32). We represented the nucleoli by assemblies of particles adhering to rDNA, into which the transcription products and the related factors are condensed. Weak, short-range attractive interactions arising from the exchange of diffusive molecules were assumed between the particles, which spontaneously assembled to form nucleoli through the genome expansion process. Each chromosome gained a V-shaped conformation during anatelophase, whose effects remained through the expansion process, leading to the long-range contacts between p- and q-arms of chromosomes in interphase.
Fig. 3B shows a snapshot of the simulated structure obtained after the nucleus reached a stationary size. In this state, the genome is phase-separated into type-A, B, and u regions and nucleoli, where the type-u regions reside at the boundary of the type-A and B regions. Phase-separated A/B regions are formed in each chromosome and across chromosomes.
Genome organization generated through phase separation of chromatin
We analyzed the generated structures by comparing the experimentally observed (5) and simulated Hi-C contact matrices. The simulated data reproduce the observed features of contacts among several chromosomes (Fig. 4A) and in the whole genome (Fig. S1), while the noise arising from the small sample size (n = 200) of simulated cells remained, reflecting the intense cell-to-cell fluctuation in inter-chromosome contacts. The noise was removed by enhancing the signal/noise ratio with the correlation-coefficient representation of the contact matrix (4), clarifying the agreement between the simulated and observed results (Fig. 4B). Comparisons for the intra-chromosome contacts show a further agreement between the simulated and observed results (Fig. 4C, Figs. S2–S5). Diagonal blocks in the intra-chromosomal matrices represent chromatin domains of several Mb or larger. The simulated and observed block patterns agree with each other, suggesting that these chromatin domains arise from phase separation (Movies S4, S5). The observed dependence of the contact frequency on the sequence separation is well reproduced in the simulation (Fig. 4D), showing that the simulation explains both the global and local chromosome structures in a balanced way. The plaid pattern of the inter- (Figs. 4A, 4B) and intra-chromosomal (Fig. 4C) contact matrices represents A/B compartmentalization. This A/B compartmentalization was quantified using the compartment signal. The compartment signals derived from the simulated data were superposed with the observed data with a Pearson’s correlation coefficient of r ≈ 0.8 (Fig. 4E). The genome-wide compartment signals for IMR90 and GM12878 show that the simulation reproduces the genome-wide data (Fig. 4F). Therefore, our model explains the A/B compartments and the other features of the intra- and inter-chromosome Hi-C contacts in different cells using the same single set of model parameters.
These features were lost when type-A, B, and u loci were randomly assigned along the polymer chains (Fig. S6), showing that the arrangement of local properties along the chromatin chain is essential for proper phase separation and genome organization.
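The comparison of simulated and observed compartment signals can be sketched as below. This is a minimal illustration in the spirit of the correlation-matrix PCA of ref. (4), not the exact pipeline of this study: the input is assumed to be an observed/expected contact matrix with no constant rows, and both function names are hypothetical.

```python
import numpy as np


def compartment_signal(oe):
    """First principal component of the correlation-coefficient
    representation of an observed/expected contact matrix.

    The overall sign of the returned signal is arbitrary.
    """
    corr = np.corrcoef(oe)  # row-by-row Pearson correlations
    centered = corr - corr.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, 0] * s[0]  # scores along the leading component


def pearson_r(x, y):
    """Pearson's correlation coefficient between two signals."""
    return float(np.corrcoef(x, y)[0, 1])
```

Because the sign of a principal component is arbitrary, one signal may need to be flipped (for example, so that positive values mark type-A-rich regions) before computing `pearson_r` between the simulated and observed signals.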
We further analyzed the generated genome structure by examining the lamina-chromatin association. Fig. 5A shows that type-B chromatin accumulates, while type-A chromatin depletes near the nuclear envelope (Materials and Methods). The specific lamina-chromatin attractive interactions are not considered in the present model; therefore, this accumulation was not due to the tethering of type-B chromatin to the lamina but was induced dynamically, similarly to the simulated polymer-blend system (Figs. 2C, 2D). Fig. 5B compares the simulated and experimentally observed (33) lamina-chromatin association data for an example chromosome. The loci showing a large frequency of association, i.e., the lamina-associated domains (LADs), are reproduced by the present simulation. Fig. 5C compares the genome-wide data showing an agreement between the simulated and observed (33) lamina-chromatin association with r = 0.55. Specific factors that tether the LADs to the lamina should delay the dissociation kinetics of the LADs from the lamina, but our simulation revealed that the lamina-LAD association resulted as a consequence of the dynamic genome-wide phase separation process. We found similar results around the nucleoli. The simulation reproduced the observed (34) nucleolus-associated domains (NADs), though the chromatin chains except for the rDNA loci were not tethered to the nucleoli in the model (Fig. 5D). The simulated and observed genome-wide nucleoli-chromatin associations show an agreement with r = 0.70 (Fig. 5E). We note that the A/B compartments were well reproduced even when the system was simulated in the absence of nucleoli (Fig. S7), showing that the nucleoli are not the driving force of phase separation of chromatin, but a perturbation to the genome structure.
Dynamic fluctuations of the genome
After the simulated genome reached a stationary G1 phase, there remained fluctuations in genome movement such as the dynamic feature of the simulated lamina-chromatin interactions (Fig. 6A). The lamina-chromatin contacts spread as the genome expanded from t = 0 min. After the genome became stationary at t ≈ 200 min, dynamic fluctuations including association, dissociation, and positional shift of lamina-chromatin contacts continued to take place. A pair of homologous chromosomes showed fluctuation patterns that differed from each other. We analyzed these fluctuations by plotting the temporal change of the normalized spatial distribution of chromatin loci after they adhered to the lamina (Fig. 6B, Materials and Methods). The distribution became broad over time, reflecting the dynamic dissociation of chromatin from the lamina as observed in the single-cell experiment (35). The root-mean-square distance (RMSD) of chromatin loci from the lamina is plotted as a function of time passed after each chromatin locus attached to the lamina, showing RMSD ~ t^α with α ≈ 0.35 for both type-A and B chromatin but with the larger RMSD for the type-A chromatin (Fig. 6C). With this small α, once the loci were attached to the lamina, they stayed within 1 μm of the lamina for a long time as experimentally observed (35). This reflects the tendency of the phase-separated compartment to retain LADs near the lamina.
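An exponent such as α in the power-law scaling of the RMSD with time can be estimated by a straight-line fit in log-log coordinates. The sketch below assumes this ordinary least-squares approach, which is not necessarily the procedure used to obtain the value quoted above; the function name is hypothetical.

```python
import numpy as np


def fit_power_law_exponent(t, rmsd):
    """Fit rmsd ≈ prefactor * t**alpha by least squares on
    log(rmsd) versus log(t); t and rmsd must be positive."""
    log_t = np.log(np.asarray(t, dtype=float))
    log_r = np.log(np.asarray(rmsd, dtype=float))
    alpha, log_prefactor = np.polyfit(log_t, log_r, 1)
    return float(alpha), float(np.exp(log_prefactor))
```

Fitting type-A and type-B loci separately would give one exponent and one prefactor per chromatin type, so that equal exponents with different prefactors correspond to parallel lines on a log-log plot.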
Dynamic fluctuations were found in the entire nucleus. Fig. 6D shows a snapshot of the distribution of square displacement of each 100-kb chromatin region during Δt = 4 s. The distribution is heterogeneous with the slow movement at the nuclear periphery and fast movement in the inner regions. Similar heterogeneous distributions were observed using live-cell imaging (9,10,36). Fig. 6E is the distribution of MSD of each chromatin region during Δt = 4 s. The distribution was bimodal as observed with live-cell imaging (37), showing contributions from the fast and slow components. The fast chromatin is mostly type-A, whereas the slow chromatin tends to be type-B (Fig. S8). The pair-correlation functions showed that the positions of fast chromatin are correlated within 1 μm, constituting the fast-moving domains (Fig. S9). The difference in the movements between type-A and B regions was a driving force of the phase separation during the genome expansion process at the entry to the G1 phase, which remains as differences in dynamic fluctuations after the genome reached the G1 phase.
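The per-region MSD over a fixed time lag can be computed from a stored trajectory as in the following minimal sketch. The array layout (frames, loci, three coordinates), the uniform frame spacing, and the function name are assumptions made for illustration.

```python
import numpy as np


def msd_per_locus(traj, lag):
    """Mean square displacement of each locus over `lag` frames.

    traj has shape (n_frames, n_loci, 3); the result has shape
    (n_loci,) and is averaged over all available start frames.
    """
    traj = np.asarray(traj, dtype=float)
    disp = traj[lag:] - traj[:-lag]      # displacements over the lag
    sq = (disp ** 2).sum(axis=-1)        # squared displacement per frame
    return sq.mean(axis=0)               # average over start frames
```

With frames saved every second, `msd_per_locus(traj, 4)` would correspond to Δt = 4 s, and a histogram of the returned values could then be inspected for fast and slow components.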
In conclusion, the present polymer model helps bridge different experimental analyses including Hi-C and other high-throughput measurements, as well as live-cell imaging. The model explained various genomic data quantitatively, suggesting that the repulsion-driven phase separation largely determines the genome organization. The determinant role of the mechanism implies that tethering of chromatin chains to the lamina, nucleoli, and other droplet-like condensates should work as perturbations to the genome structure/dynamics. Thus, the repulsion-driven phase-separated structure of the genome provides a platform for analyzing these chain-constraining effects and other perturbative effects for comprehensive analyses of the genome organization.
Author contributions
Conceptualization: SF, MS; Methodology: SF; Investigation: SF; Data curation: SF; Formal analysis: SF; Visualization: SF; Funding acquisition: MS; Project administration: MS; Supervision: MS; Writing - original draft: SF, MS; Writing - review & editing: SF, MS.
Competing interests
There is no competing interest to declare.
Data and materials availability
All data are available in the main text or the supplementary materials. The source codes used in this study are available from the GitHub site.
Supplementary Materials
Materials and Methods
Supplementary Text
References 1 to 16
Figs. S1 to S9
Table S1
Movies S1 to S5
Acknowledgements
We are grateful to Dr. Kazuhiro Maeshima for critical reading of the manuscript. This work was supported by JST-CREST Grant JPMJCR15G2 (MS); the Riken Pioneering Project (MS); JSPS-KAKENHI Grants 19H01860, 19H05258, 20H05530, and 21H00248 (MS).