Abstract
We propose a simple model for chromatin organization based on the interaction of chromatin fibres with lamin proteins along the nuclear membrane. Lamin proteins are known to be a major factor influencing chromatin organization, and hence gene expression, in cells. Our polymer model explains the formation of lamin-associated domains (LADs) and, for heteropolymers with sequence control, can reproduce the observed length distributions of LADs. In particular, lamin-mediated interactions can enhance the formation of chromosome territories as well as the organization of chromatin into tightly packed heterochromatin and loosely packed, gene-rich euchromatin regions.
Introduction
The link between microscopic and macroscopic descriptions of genome structure and function is one of the key questions of present-day biology [1–3]. In particular, the structure and folding principles of interphase chromosomes have been the subject of much debate over the last decade [4–6]. While the structure of the DNA double helix and of the histone proteins that form the nucleosome is well understood [7–9], how the nucleosomes finally organise to form the interphase chromosomes remains an open question [10–12]. The organization of the chromatin determines the function of the cell type, with the epigenetic state of a differentiated cell correlated with differential packaging of the chromosome. Understanding the principles behind chromatin organization thus has important implications for the proper functioning of the cell, as misfolding errors lead to several human pathologies.
The advent of experimental techniques such as chromosome conformation capture (3C/4C/5C/Hi-C) [13–16] and FISH [17, 18] has provided a wealth of information on the large-scale structure of the chromosome. A key experimental observation has been that different chromosomes segregate into different territories with minimal contact between them [19–21]. Additionally, the chromatin can be classified into transcriptionally silent heterochromatin and transcriptionally active euchromatin, with heterochromatin regions being more tightly packed and located preferentially at the nuclear periphery. The euchromatin, in contrast, is relatively loosely packed and located in the interior [22–25].
One of the key mediators in the organization of the genome is the interaction between the nuclear lamina and chromatin. The nuclear lamina (NL) is a complex structure that acts as a scaffold for various proteins that regulate nuclear structure and function [26, 27]. Mammalian cells contain two B-type lamins, B1 and B2, and two A-type lamins, lamins A and C. In most cells, lamins B1 and B2 are concentrated along the NL at the nuclear periphery, while A-type lamins are also found in the nuclear interior [28, 29]. The lamin proteins in the NL play an important role in the organization of the chromosomes within the nucleus. Experimental studies have shown that certain regions of the chromosomes lie in close proximity to the lamina, and DamID experiments have been used to build a map of chromosome-lamin interactions [26, 27, 30, 31]. These experiments reveal that there are large chromosomal domains with a high affinity for nuclear lamins (called lamina-associated domains or LADs), alternating with regions of very low affinity. The LADs in the human genome can be very large, ranging from 0.1 Mb to 10 Mb in length [30, 31]. The LADs are associated with gene-poor regions of the chromosome, with a mean gene density around half of that in the non-LAD regions. Additionally, other markers of gene activity, such as Pol II and the histone mark H3K4me2, are also depleted within LADs, indicating that LADs represent a strongly repressive chromatin environment [26, 27, 30]. A large fraction of the human genome (≈ 40%) consists of LADs [30].
Subsequent experiments have also shown that the interaction between LADs and lamin proteins is stochastic in nature, with only a fraction of the total LADs being in contact with the NL in a given cell [31]. After cell division, a new subset of LADs can be in contact with the NL. ChIP-DamID experiments have shown that only those LADs which interact with the lamin proteins have enhanced levels of H3K9me2, which implies that the status of this methylation mark is also stochastic and directly correlated with the LAD position [31]. These experiments show that LADs are positioned stochastically within the nucleus.
While there has been considerable theoretical progress in understanding the origins of the three-dimensional organization of chromosomes [32–37], an important missing ingredient in the proposed models has been the effect of the NL-chromatin interactions. In this Letter we propose a simple polymer model that incorporates confinement as well as the attractive interactions between the chromosomes and the lamin proteins. The main results are as follows: (i) for a homopolymer we reproduce the experimentally observed scaling of chromatin size as a function of base pair separation and the associated contact probabilities, (ii) for a heteropolymer with sequence control we obtain length distributions of LADs, and (iii) for a mixture of homopolymers and heteropolymers we observe phase separation of chromosomes and the formation of distinct territories. We thus demonstrate that a complete understanding of the folding principles of the chromosome needs to incorporate this interaction for a cohesive picture.
Model
We model the chromatin as a self-avoiding polymer chain of N beads, each of diameter b0, connected by harmonic springs. The polymer is confined within a sphere of radius R. The polymer consists of two types of beads, type A and type B (see Fig. 1). The inner surface of the sphere attracts beads of type A, mimicking lamin-chromatin interactions. The fraction of type A beads is denoted by f. The lamin interaction occurs if a bead of type A lies within a cut-off distance Rc = b0 from the inner wall of the sphere. The energy of the polymer chain is thus given by a Hamiltonian that combines the harmonic bond energy, the self-avoidance of the beads, and the lamin binding energy. Here k = 30 kBT denotes the spring constant of the bead-spring polymer, taken to be in the same range as in previous studies [32, 37], and tj denotes the type of bead j (0 for A-type beads, and 1 for B-type beads). The lamin-chromatin binding energy is given by EB. Equilibrium conformations of this confined polymer system in the presence of an attractive boundary are then simulated using a Metropolis Monte Carlo scheme, and the statistical averages are calculated over a time of T_MC = 10⁹ MC steps after allowing an equilibration time of T_eq ∼ 10N² ∼ 10⁶ MC steps.
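As an illustration of the simulation scheme described above, the following Python sketch performs single-bead Metropolis moves for a confined bead-spring chain with an attractive wall. The explicit form of the energy is an assumption made for this sketch (harmonic bonds of rest length b0, hard-core exclusion between beads, and a square-well gain of EB for each type-A bead whose centre lies within Rc of the wall). The function names, the trial step size, and the short chain are likewise illustrative and not taken from the text.

```python
import numpy as np

def energy(pos, types, R, b0=1.0, k=30.0, E_B=2.0, R_c=1.0):
    """Energy in units of k_B T; np.inf marks a forbidden configuration.

    Assumed form: harmonic bonds + hard-core exclusion + square-well wall binding.
    """
    r = np.linalg.norm(pos, axis=1)
    if np.any(r > R):                       # bead centres must stay inside the sphere
        return np.inf
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    if np.any(d[np.triu_indices(len(pos), k=1)] < b0):
        return np.inf                       # overlapping beads are not allowed
    bond = np.linalg.norm(np.diff(pos, axis=0), axis=1)
    e_spring = 0.5 * k * np.sum((bond - b0) ** 2)
    bound = (types == 0) & (R - r <= R_c)   # type A (t_j = 0) within R_c of the wall
    return e_spring - E_B * np.count_nonzero(bound)

def metropolis_step(pos, types, R, rng, step=0.2, **params):
    """Displace one random bead and accept with probability min(1, exp(-dE))."""
    i = rng.integers(len(pos))
    trial = pos.copy()
    trial[i] += rng.uniform(-step, step, size=3)
    # Recomputing the full energy is O(N^2); adequate for a sketch only.
    dE = energy(trial, types, R, **params) - energy(pos, types, R, **params)
    if dE <= 0 or rng.random() < np.exp(-dE):
        return trial, True
    return pos, False

# Small demonstration: a 12-bead homopolymer (f = 1) in a sphere of radius 7.
rng = np.random.default_rng(0)
N, R = 12, 7.0
types = np.zeros(N, dtype=int)
pos = np.column_stack([np.arange(N) - (N - 1) / 2.0, np.zeros(N), np.zeros(N)])
accepted = 0
for _ in range(2000):
    pos, ok = metropolis_step(pos, types, R, rng)
    accepted += ok
print("acceptance ratio:", accepted / 2000)
```

The demonstration starts from a straight rod, which is a valid configuration only for short chains. A production run for N = 512 would need a compact self-avoiding initial state and an incremental energy update.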
The radius R of the nucleus and the polymer volume fraction are chosen in our model to conform to a biologically relevant scenario. Human nuclei range in diameter from 6 µm to 11 µm [38, 39], while the total number of base pairs in the human genome is of the order of 6 billion (6 × 10⁹ bp) [38]. Assuming that a 30 nm chromatin bead contains 3000 bp [40, 41], the chromatin volume fraction of human nuclei ranges from about 0.04 to about 0.25. For our simulated confined polymer with N = 512 monomers, the radius of the confining sphere then corresponds to R ≃ 12 for a volume fraction of 0.04 and R = 7 for a volume fraction of 0.25. We look at physical quantities of interest for these two values of the radius, R = 12 and R = 7, in this Letter.
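The volume fraction estimate can be checked with a few lines of arithmetic. The sketch below treats every bead as a sphere of diameter 30 nm (or b0 = 1 in simulation units) and the nucleus as a sphere. The helper names are illustrative, and the convention that the bead volume is measured against the full sphere of radius R is an assumption.

```python
import math

def sphere_volume(radius):
    return 4.0 / 3.0 * math.pi * radius ** 3

def volume_fraction(n_beads, bead_diameter, container_radius):
    """Total bead volume divided by the volume of the confining sphere."""
    return n_beads * sphere_volume(bead_diameter / 2.0) / sphere_volume(container_radius)

# Real nucleus: 6e9 bp at 3000 bp per 30 nm bead, lengths in micrometres.
n_beads_nucleus = 6e9 / 3000
bead_diameter_um = 0.030
phi_small_nucleus = volume_fraction(n_beads_nucleus, bead_diameter_um, 6.0 / 2)
phi_large_nucleus = volume_fraction(n_beads_nucleus, bead_diameter_um, 11.0 / 2)

# Simulated polymer: N = 512 beads of diameter b0 = 1.
phi_R7 = volume_fraction(512, 1.0, 7.0)
phi_R12 = volume_fraction(512, 1.0, 12.0)

print(phi_small_nucleus, phi_large_nucleus, phi_R7, phi_R12)
```

Under these assumptions the nuclear estimates come out near 0.25 (6 µm diameter) and 0.04 (11 µm diameter), and the simulated spheres give roughly 0.19 for R = 7 and 0.04 for R = 12. The exact simulated values depend on how the confinement radius is defined relative to the bead centres.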
Results
We first consider a homopolymer in which all beads are of type A and can bind to the lamin proteins, i.e. f = 1. This is done for both values of the nuclear radius used in our investigations, R = 7 and R = 12, for a range of binding energies EB. As shown in Fig. 2(a), the mean square displacement 〈R²(s)〉 as a function of the contour length s of the polymer scales as 〈R²(s)〉 ∼ s^(2ν), with an exponent ν between 0.4 and 0.7 for short genomic distances, and saturates for higher values of s. The saturation of the mean square separation is a simple consequence of the confinement of the polymer. For R = 12, ν ≈ 0.5−0.7, while for R = 7, ν ≈ 0.4−0.6, which agrees well with measured values from FISH data [42–44].
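A minimal sketch of how the exponent ν could be extracted from sampled configurations is given below. It assumes the configurations are stored as an array of shape (number of samples, N, 3) and fits a straight line to log〈R²(s)〉 against log s over the short-distance regime. The function names and the fitting window s_max are illustrative choices, not values taken from the text.

```python
import numpy as np

def mean_square_separation(configs):
    """Return s = 1..N-1 and <R^2(s)> averaged over samples and chain positions."""
    configs = np.asarray(configs, dtype=float)
    n = configs.shape[1]
    s_values = np.arange(1, n)
    r2 = np.empty(n - 1)
    for s in s_values:
        diff = configs[:, s:, :] - configs[:, :-s, :]
        r2[s - 1] = np.mean(np.sum(diff ** 2, axis=-1))
    return s_values, r2

def fit_nu(s_values, r2, s_max):
    """Fit <R^2(s)> ~ s^(2 nu) for s <= s_max and return nu."""
    mask = s_values <= s_max
    slope, _ = np.polyfit(np.log(s_values[mask]), np.log(r2[mask]), 1)
    return 0.5 * slope

# Check on unconfined ideal random walks, for which nu should be close to 0.5.
rng = np.random.default_rng(1)
walks = np.cumsum(rng.normal(size=(400, 64, 3)), axis=1)
s_values, r2 = mean_square_separation(walks)
nu = fit_nu(s_values, r2, s_max=20)
print("fitted nu for ideal walks:", nu)
```

For confined chains the fit window should end before the saturation regime, since including the plateau lowers the apparent exponent.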
We also compute the contact probability for two monomers separated by s base pairs to approach within a certain cutoff distance rcut of each other. Consistent with previous studies, we choose rcut = 2.5b0 [34]. The contact probability decreases with increasing base pair separation before tapering off at large separations (see Fig. 2(b)). For small values of s the decrease follows a power law of the form pc(s) ∼ s^(−β1). The exponent is β1 ≈ 1.5−1.6 for R = 12 and β1 ≈ 0.9−1.0 for R = 7, consistent with Hi-C experiments [45, 46].
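The contact probability can be estimated from the same kind of configuration array. The sketch below counts, for each separation s, the fraction of monomer pairs closer than rcut = 2.5 b0 and fits the decay exponent over a chosen window. The function names, the fitting window, and the ideal-walk test data are assumptions made for illustration.

```python
import numpy as np

def contact_probability(configs, r_cut=2.5):
    """Fraction of monomer pairs at separation s that lie closer than r_cut."""
    configs = np.asarray(configs, dtype=float)
    n = configs.shape[1]
    s_values = np.arange(1, n)
    p_c = np.empty(n - 1)
    for s in s_values:
        dist = np.linalg.norm(configs[:, s:, :] - configs[:, :-s, :], axis=-1)
        p_c[s - 1] = np.mean(dist < r_cut)
    return s_values, p_c

def fit_beta(s_values, p_c, s_min, s_max):
    """Fit p_c(s) ~ s^(-beta) on s_min <= s <= s_max (requires p_c > 0 there)."""
    mask = (s_values >= s_min) & (s_values <= s_max)
    slope, _ = np.polyfit(np.log(s_values[mask]), np.log(p_c[mask]), 1)
    return -slope

# Illustration on unconfined ideal random walks (not the confined model itself).
rng = np.random.default_rng(4)
walks = np.cumsum(rng.normal(size=(500, 100, 3)), axis=1)
s_values, p_c = contact_probability(walks)
beta = fit_beta(s_values, p_c, s_min=5, s_max=30)
print("fitted beta for ideal walks:", beta)
```

An ideal chain approaches β = 3/2 only at large s, so the value fitted over a finite window is somewhat smaller. The same caution applies when comparing fitted exponents across different windows.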
Further, we compute the volume fraction of monomers as a function of the radial distance from the centre, i.e. we count the number of monomers nr in a thin shell r → r + ∆r, with ∆r = 1, and compute the volume occupied by these nr monomers normalized by the volume of the shell. This is shown in Fig. 2(c). The corresponding number of monomers (normalised by the total number N) in each shell is shown in Fig. 2(d). Entropic confinement, in the absence of lamin-protein interactions (i.e. EB = 0), leads to a low volume fraction near the surface and a constant value throughout the rest of the nucleus. For EB ≥ 2 kBT, the chromatin volume fraction at the nuclear periphery increases such that the outermost shell is more densely packed than the inner ones. This is consistent with observations of tighter chromatin packing in heterochromatin regions [22–25], and also correlates with the hypothesis that lamin-chromatin interactions are more prominent in heterochromatin regions, which explains their tighter packing and, consequently, higher volume fractions.
Next, we investigate domain formation within our model to compare against DamID experimental data. We define the lamin proximity index (LPI) for the ith monomer, which measures the position of the centre of the monomer relative to the cut-off distance Rc beyond which it interacts with the lamin. The LPI ranges from −1 to 1, with positive values indicating bond formation with the lamin. Figures 3(a,b) and (d,e) show the variation of the LPI as a function of the binding energy EB for spheres of radii R = 12 and R = 7, respectively.
The lamin-associated chromatin domains for the homopolymer have a size distribution P(ℓ) ∼ exp[−ℓ/ℓ0], peaked around ℓ = 0, with the characteristic length scale ℓ0 increasing monotonically as a function of the binding energy EB (Fig. 3(c,f)). In contrast, the size distribution obtained from DamID data is peaked around a non-zero value of ℓ [30]. This difference arises because, in chromatin, only roughly 40% of the monomers can associate with the lamin. This necessitates a heteropolymer model of chromatin in which only a fraction f of the beads (type A) can associate with the lamin proteins.
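The shell-counting procedure can be sketched as follows. The code bins monomer centres into shells of width ∆r = 1, converts the mean count per shell into a volume fraction using the bead volume πb0³/6, and also returns the number fraction per shell. The function name and the uniformly filled test sphere are assumptions for illustration only.

```python
import numpy as np

def radial_profiles(configs, R, b0=1.0, dr=1.0):
    """Volume fraction and number fraction of monomers in shells r -> r + dr."""
    configs = np.asarray(configs, dtype=float)
    n_conf, n_beads = configs.shape[0], configs.shape[1]
    r = np.linalg.norm(configs, axis=-1).ravel()
    n_shells = int(np.ceil(R / dr))
    edges = dr * np.arange(n_shells + 1)
    counts, _ = np.histogram(r, bins=edges)   # centres beyond the last edge are ignored
    mean_counts = counts / n_conf             # n_r averaged over configurations
    shell_volume = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    bead_volume = np.pi * b0 ** 3 / 6.0
    volume_fraction = mean_counts * bead_volume / shell_volume
    number_fraction = mean_counts / n_beads
    centres = 0.5 * (edges[1:] + edges[:-1])
    return centres, volume_fraction, number_fraction

# Test data: bead centres placed uniformly at random inside a sphere of radius 7.
rng = np.random.default_rng(2)
R = 7.0
n_conf, n_beads = 200, 512
directions = rng.normal(size=(n_conf, n_beads, 3))
directions /= np.linalg.norm(directions, axis=-1, keepdims=True)
radii = R * rng.random((n_conf, n_beads, 1)) ** (1.0 / 3.0)
centres, phi, frac = radial_profiles(radii * directions, R)
print(np.round(phi, 3))
```

For uniformly distributed centres the profile is flat, so any surface depletion or enrichment in simulated data shows up as a deviation of the outer shells from the interior value.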
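Domain lengths can be read off an LPI profile by treating each maximal run of consecutive monomers with positive LPI as one lamin-associated domain. This run-based definition, and the helper names below, are assumptions made for this sketch; the text does not spell out how domain boundaries were identified.

```python
import numpy as np

def domain_lengths(lpi):
    """Lengths of maximal runs of consecutive monomers with LPI > 0."""
    bound = np.asarray(lpi) > 0
    padded = np.concatenate(([False], bound, [False])).astype(int)
    change = np.diff(padded)
    starts = np.flatnonzero(change == 1)    # index where a run begins
    ends = np.flatnonzero(change == -1)     # index just past where a run ends
    return ends - starts

def length_distribution(lengths):
    """Normalised histogram P(l) for l = 0, 1, 2, ... (P(0) is always zero)."""
    counts = np.bincount(lengths)
    return counts / counts.sum()

lpi_example = [0.5, 0.2, -0.3, 0.1, -1.0, -1.0, 0.4, 0.4, 0.4]
lengths = domain_lengths(lpi_example)
print(lengths, length_distribution(lengths))
```

Pooling the lengths from many sampled configurations before calling `length_distribution` gives an estimate of P(ℓ) that can be compared with an exponential form.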
Heteropolymer
A heteropolymer model for chromatin assumes NA beads of type A and NB beads of type B, such that NA/(NA + NB) = f. We consider three heteropolymer models: (i) random heteropolymer, where the NA type-A beads are chosen randomly; (ii) uniform block copolymer, where the NA beads are divided into uniform patches of size p = N/4; and (iii) Gaussian block copolymer, where the NA beads are divided into patches with sizes chosen from a Gaussian distribution (µ = 20, σ = 5). For polymers with quenched randomness of the type studied here, the fraction f as well as the disorder-correlation length plays a role in dictating their equilibrium properties. We investigate the statistical properties of the confined polymer as a function of both the lamin binding energy and the fraction f of binding monomers.
The mean square displacement between any two monomers 〈R²(s)〉 has the same statistical features as for a homopolymer, with 〈R²(s)〉 ∼ s^(2ν) for small separations followed by a saturation at larger values. The exponent ν increases as a function of EB for all fractions considered, as shown in Fig. 4(a). Increasing the binding energy EB allows the monomers to spread out on the surface, as lamin attachments become more favourable, leading to an increase in the exponent ν. As expected, ν also increases with increasing fraction f of binding monomers. A similar behaviour is observed for the contact probability exponent β (Fig. 4(b)). The values of the exponents ν and β are similar for the different disorder realisations studied.
We also plot the volume fraction (Fig. 4(c)) as a function of the normalised distance from the centre of the sphere, for a binding fraction of f = 0.4 corresponding to the biologically relevant situation. For small binding energies, the volume fraction drops close to the surface of the nucleus. With increasing binding energy the volume fraction shows a maximum near the lamina, indicating an increased density of monomers there.
We now turn to the lamin contact maps in order to investigate domain formation for the heteropolymer case. A representative plot of the lamin proximity index (LPI) for a random heteropolymer with f = 0.4 is shown in Fig. 5(a) for R = 7 and EB = 3 kBT. Domains of lamin-associated chromatin alternate with ones that do not come in contact with the NL. The length distributions of the domains P(ℓ) for R = 7 and R = 12 are shown in Figs. 5(b) and 5(c), respectively. The domain sizes are distributed exponentially for both values of the nuclear size, indicating that a random heteropolymer model does not reproduce the characteristic distributions of LADs observed in experiments [30]. Fig. 5(d) shows the LPI for the uniform block copolymer with f = 0.5, which has 4 equal-length alternating blocks of domains that do and do not interact with the lamin. The block copolymeric nature is reflected in the LPI plot (Fig. 5(d)), with larger domain sizes than those formed in the case of the random heteropolymer. The associated domain length distributions for R = 7 and R = 12 are shown in Figs. 5(e) and 5(f), respectively. In contrast to the single exponential fit for both the homopolymer and the random heteropolymer, the domain size distribution in this case is a double exponential, with an enhanced probability for larger domains. For high enough binding energies, there is additionally a peak corresponding to the block size. Finally, we consider the case of the Gaussian block heteropolymer with f = 0.4, with the corresponding LPI shown in Fig. 5(g). The distributions of the LAD sizes reflect the Gaussian nature of the block copolymer for higher values of the binding energy, EB ≳ 2 kBT, as can be seen from Figs. 5(h) and 5(i) for both values of the radius. This qualitatively agrees with the domain size distribution observed in DamID measurements [30].
The heteropolymer model thus illustrates that the domain length distributions observed in experimental studies must correlate with the distribution of chromatin regions that can interact with lamin proteins, and that they also set a scale for the strength of the lamin-chromatin interactions. Weak interactions give rise to exponential domain length distributions, and hence the interaction energies must be beyond a certain strength in order to explain the experimental observations.
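The three sequence types can be generated as arrays of bead types (0 for A, 1 for B). The sketch below is one possible construction. In particular, the text specifies only the size distribution of the A patches in the Gaussian case, so drawing the intervening B gaps from a Gaussian with mean µ(1 − f)/f and the same σ is an assumption made here to keep the overall fraction close to f. All function names are illustrative.

```python
import numpy as np

def random_heteropolymer(n, f, rng):
    """Place round(f * n) type-A beads (0) at random positions; the rest are B (1)."""
    types = np.ones(n, dtype=int)
    n_a = int(round(f * n))
    types[rng.choice(n, size=n_a, replace=False)] = 0
    return types

def uniform_block_copolymer(n, patch):
    """Alternate A and B blocks of equal length `patch`, starting with A."""
    return ((np.arange(n) // patch) % 2).astype(int)

def gaussian_block_copolymer(n, f, rng, mu=20.0, sigma=5.0):
    """A patches with Gaussian sizes (mu, sigma); B gap sizes are an assumption."""
    gap_mu = mu * (1.0 - f) / f        # assumed mean gap so that the A fraction is ~ f
    types = []
    is_a = True
    while len(types) < n:
        mean = mu if is_a else gap_mu
        size = max(1, int(round(rng.normal(mean, sigma))))
        types.extend([0 if is_a else 1] * size)
        is_a = not is_a
    return np.array(types[:n], dtype=int)

rng = np.random.default_rng(3)
t_rand = random_heteropolymer(512, 0.4, rng)
t_unif = uniform_block_copolymer(512, 512 // 4)
t_gauss = gaussian_block_copolymer(512, 0.4, rng)
for name, t in [("random", t_rand), ("uniform", t_unif), ("gaussian", t_gauss)]:
    print(name, "fraction of A beads:", np.mean(t == 0))
```

With a patch size of N/4 the uniform construction yields four alternating blocks and f = 0.5. The realised fraction for the Gaussian case fluctuates from one sequence to the next because the number of patches is small.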
Multiple polymers
We next explore the formation of chromatin territories within our model, using four polymers of 128 monomers each confined within the sphere. In the absence of any lamin interactions (EB = 0), the contact map (Fig. 6(a)) shows negligible territory formation. A sample equilibrium configuration, shown in Fig. 6(c), illustrates that there is significant interpenetration between the four polymer strands. If we allow two of the four polymers to interact with the lamin, these two polymers show extremely low levels of interpenetration. This can be seen from the contact map shown in Fig. 6(b), where the first polymer (monomers 0−127) and the third polymer (monomers 256−383) interact with the lamin (EB ≠ 0), while the other two do not. The regions of the contact map corresponding to overlaps between the first and third polymers clearly show a reduced intensity, indicating that the two polymers seldom approach each other. This is also clear in the sample equilibrium configuration shown in Fig. 6(d), where the first (brown) and third (red) polymers stay close to the attractive surface on two distal sides of the sphere. The two non-interacting polymers (shown here in cyan and blue), on the other hand, occupy the central region of the sphere and have increased interpenetration. The interactions of the lamin proteins with the chromatin thus enhance the ability of the chromatin to segregate into individual territories. The attractive surface interaction makes it energetically favourable for the chromatin polymers to occupy different regions of the nucleus, leading to the formation of territories. This provides a candidate mechanism by which chromosome territories form within the eukaryotic cell nucleus.
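One way to quantify interpenetration from sampled configurations is to coarse-grain the monomer contact map into a chain-by-chain matrix. The sketch below assumes that the chains are stored consecutively along the monomer index (for example four chains of 128 monomers) and reuses the cut-off rcut = 2.5 b0 from the contact probability analysis. The text does not state which cut-off underlies Fig. 6, so that choice and the function name are assumptions.

```python
import numpy as np

def chain_contact_matrix(configs, chain_length, r_cut=2.5):
    """Mean fraction of monomer pairs in contact for every pair of chains.

    Diagonal entries include the trivial self-contacts of each monomer.
    """
    configs = np.asarray(configs, dtype=float)
    n = configs.shape[1]
    n_chains = n // chain_length
    labels = np.arange(n) // chain_length
    contact = np.zeros((n_chains, n_chains))
    for pos in configs:
        d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
        close = d < r_cut
        for a in range(n_chains):
            for b in range(n_chains):
                contact[a, b] += close[np.ix_(labels == a, labels == b)].mean()
    return contact / len(configs)

# Toy example: two straight 4-bead chains placed far apart never touch.
chain_a = np.column_stack([np.arange(4.0), np.zeros(4), np.zeros(4)])
chain_b = chain_a + np.array([100.0, 0.0, 0.0])
toy = np.concatenate([chain_a, chain_b])[None, :, :]
print(chain_contact_matrix(toy, chain_length=4))
```

Small off-diagonal entries between two chains indicate that they occupy separate regions, which is the signature of territory formation discussed above.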
Discussion
In conclusion, we model chromatin packing in the cell nucleus as a polymer confined in a spherical cavity with an attractive interaction with the inner walls. We compute statistically measurable quantities such as the mean square displacement as a function of base pair separation 〈R²(s)〉, the lamin proximity index (LPI), and the contact map for homopolymers and heteropolymers with different disorder realisations and different nuclear radii. Our computational results are consistent with the data observed in DamID and FISH experiments.
A complete map of chromatin-nuclear lamina interactions generated by DamID experiments has identified a large number of LADs in the human genome, and understanding the origin of the observed distribution of LADs is thus important for understanding the 3D organization of the genome. Our work shows that a heteropolymeric chromatin, with lamin-binding domains whose sizes are drawn from a distribution, reflects this structural heterogeneity. We show, for example, that for Gaussian-distributed domains, the lengths of lamin-associated regions also follow a Gaussian distribution. Further, for low binding energies, the distribution of lengths follows a single exponential. Thus our analysis shows that lamin-chromatin interactions need to exceed a certain critical strength (a few kBT) in order to reflect the underlying domain structure, while remaining weak enough that thermal energies can compete with the interaction energy, leading to dynamic chromosomes.
For the case of multiple polymers, while it is known that confinement can induce territory formation [47], we show that lamin-mediated interactions can effectively strengthen the formation of chromosome territories, even at densities where territory formation would not be expected from simple confinement.
While our model shows several interesting features, a quantitative understanding of genome organization is far from complete. We assumed a uniformly attractive surface, corresponding to a uniform distribution of lamin proteins. In reality, there is a complex and heterogeneous distribution of lamin filaments on the nuclear envelope, and this would lead to tighter heterochromatin packaging in regions of high lamin concentration [48]. Further, A-type lamins are known to be present on the nucleolus as well [28, 29]. Thus LADs on chromatin can interact with the nucleolar surface in the interior of the nucleus. The effect of this attractive surface in the nuclear interior can compete with the attractive interaction at the nuclear periphery to give rise to non-trivial structures. The current model focuses solely on the effect of nuclear lamin interactions. However, it is known that various other proteins such as CTCF [49, 50], cohesins and condensins [51–53] play a role in the large-scale organization of the genome. An open question is how protein-mediated interactions in the bulk compete with lamin interactions on the surface to determine 3D organization. Non-equilibrium active forces may also play a role, although their contribution to large-scale organization remains an open question. While preparing this manuscript for publication we became aware of a similar model by Chiang et al. [54]. While our model does not include sequence-specific data as an input for the hetero/euchromatin track lengths, it is more generic, allowing for systematic tuning of the disorder correlation length of heterochromatin tracks along the backbone. Despite its conceptual simplicity, our model captures features of the conformational statistics of chromatin. This calls for a systematic study of specific interactions among beads and lamin to understand the robustness of the genetic landscape to changes in mesoscale parameters.
We hope that our simple theoretical model will encourage experimental work in this direction. Our study predicts that tuning the strength of lamin-chromatin interactions can change the distributions of lamin-associated domain lengths. If the interactions can be made sufficiently weak, the theory predicts a transition to a single exponential distribution of domain sizes. Further, tuning the interaction strength will also change the radial volume fraction, with volume fraction at the nuclear periphery increasing monotonically with increasing interaction strength. These specific predictions can be tested against experimental data.
Our study emphasizes how sequence heterogeneity in the genome affects the three-dimensional genome organization. In particular, this heterogeneity is crucial to a complete understanding of the chromosome packaging problem.
Acknowledgements
Financial support is acknowledged by MKM for Ramanujan Fellowship (13DST052), DST and IITB (14IRCCSG009).
Footnotes
∗ b.chakrabarti@sheffield.ac.uk
References
- [1].
- [2].
- [3].
- [4].
- [5].
- [6].
- [7].
- [8].
- [9].
- [10].
- [11].
- [12].
- [13].
- [14].
- [15].
- [16].
- [17].
- [18].
- [19].
- [20].
- [21].
- [22].
- [23].
- [24].
- [25].
- [26].
- [27].
- [28].
- [29].
- [30].
- [31].
- [32].
- [33].
- [34].
- [35].
- [36].
- [37].
- [38].
- [39].
- [40].
- [41].
- [42].
- [43].
- [44].
- [45].
- [46].
- [47].
- [48].
- [49].
- [50].
- [51].
- [52].
- [53].
- [54].