ABSTRACT
Bulk chromatin motion has not been analysed at high resolution. We present Hi-D, a method to quantitatively map dynamics of chromatin and abundant nuclear proteins for every pixel simultaneously over the entire nucleus from fluorescence image series. Hi-D combines reconstruction of chromatin motion, and classification of local diffusion processes by Bayesian inference. We show that DNA dynamics in the nuclear interior are spatially partitioned into 0.3 – 3 μm domains in a mosaic-like pattern, uncoupled from chromatin compaction. This pattern was remodelled in response to transcriptional activity. Hi-D can be applied to any dense and bulk structures opening new perspectives towards understanding motion of nuclear molecules.
INTRODUCTION
Spatial organization and dynamics of chromatin correlate with cell function and fate [1]. On the coarsest level, chromosomes occupy territories in mammalian cells [2]. The relative proportion of dense heterochromatin and open euchromatin regions reflect cellular activity [3]. Transitions within and between eu- and heterochromatin involve the remodeling of multiple hierarchical levels of chromatin organization from domain folding and long-range looping to nucleosome density to adapt to and enable DNA processing [4]. Structural models derived from contact and crosslinking frequencies [5–7] are consistent with the view that the genome is partitioned into functional compartments and sub-compartments [8]. It is now becoming increasingly clear that such nuclear compartments are also dynamic entities whose conformational changes impact mechanisms and function of genome folding [9]. Tracking of labelled single DNA loci [10–14] or chromatin domains [15,16] demonstrated that chromatin motion is highly heterogeneous at short time intervals. Sparse loci, however, are difficult to place in the context of global chromatin organization [17] and locally restrained genomic processes can also not be inferred from quantities averaged over the entire nucleus [18,19]. A first study analyzing bulk chromatin motion at nanoscale resolution revealed dynamic partitioning of chromatin into a number of nuclear sub-regions with correlated motion of chromatin in the micrometer range [20]. However, the cause and/or effect of (correlated) chromatin dynamics is not yet clear. Likewise, whether compaction of chromatin determines its spatial coherence or whether chromatin dynamics are distinct in open and closed chromatin is still a matter of debate.
To tackle this need, we developed a new approach called High-resolution Diffusion mapping (Hi-D) that overcomes the limitations of sparse and ensemble approaches. Hi-D combines a dense Optical Flow reconstruction to first quantify the local motion of chromatin and other abundant nuclear constituents at sub-pixel accuracy within a series of images, and a Bayesian inference approach in the second step to precisely classify local types of diffusion. Biophysical properties such as diffusion constants and anomalous exponents are determined for each pixel to create two dimensional maps of chromatin dynamics at single pixel resolution in living single cells. Hi-D created spatially resolved maps show that DNA compaction and dynamics do not necessarily correlate. Instead, the maps suggest that chromatin dynamics are dictated by DNA-DNA contacts and protein binding to DNA, rather than chromatin density.
RESULTS
Hi-D maps genome dynamic properties at nanoscale resolution in living cells
Motion of densely distributed fluorescent molecules was quantitatively reconstructed from a series of conventional confocal fluorescence microscopy images by a dense Optical Flow method [20]. By integrating the resulting flow fields, a trajectory was computed for each pixel (Fig 1a; Additional file 1: Note S1; Additional file 1: Fig S1). The type of diffusion characterizing each pixel’s chromatin motion was chosen in an unbiased manner using a Bayesian inference from a set of five common models to fit each trajectory’s Mean Squared Displacement (MSD) [21] (Fig 1b, left panel; Additional file 1: Fig S2). The best fitting models were directly mapped onto the nucleus (Fig 1b; right panel) (Methods section). We found that only a small fraction of trajectories displayed directed diffusion (Fig 1b), while the bulk of chromatin exhibited sub-diffusive behaviour. Distinguishing between the comparable cases of anomalous and confined diffusion is a challenging task, given the limited duration of the experiment. After examination of a range of parameters governing these different types of diffusion, our results suggest that chromatin diffusion in human U2OS cells can be adequately described as anomalous to avoid misinterpretation (Additional file 1: Note S2; Additional file 1: Fig S3). The resulting biophysical parameters calculated for each pixel by Hi-D (diffusion constant D, anomalous exponent α and drift velocity V) are presented in color-coded 2D heat-maps (Fig 1c) (Methods section). They are distributed in a mosaic of irregular shape and dimensions of similar values (Fig 1c). These parameter maps clearly demonstrate that chromatin dynamics are spatially heterogeneous and partitioned. These maps also illustrate the notion that chromatin dynamics are spatially correlated in the micrometer-range [18,20]. To further characterize this heterogeneous distribution, the parameter distributions were deconvolved into discrete sub-populations using a General Mixture model (GMM) (Fig 1c; Methods; and Additional file 1: Fig S4). The GMM identified three populations of chromatin mobility referred to as slow, intermediate and fast (Methods; exemplary in Fig 1c), irrespective of the parameter under consideration (diffusion constant or anomalous exponent) or transcriptional state of the cell. We found that chromatin dynamics characterized by directed motion involving a drift velocity (V) was less present than free and anomalous diffusion and provided significantly less data for V than for the other two parameters (Fig 1c). Hence, drift velocity was not retained for further analysis.
Validation of the Hi-D approach in simulation and experiment
In order to examine the suitability of calculated trajectories and associated diffusion constants by Hi-D to whole-chromatin imaging conditions, we compared Hi-D to dynamic multiple-target tracing (MTT), a Single Particle Tracking (SPT) method which is commonly used for dense molecule tracking [22] (Fig 2a,b; Additional file 1: Note S3; Additional file 1: Fig S5; Additional file 1: Fig S7). While the SPT method outperforms the Hi-D approach in scenarios of sparsely labelled molecules (Fig 2a), Hi-D analysis provided considerably more accurate estimates of local diffusion constants than SPT in scenarios of densely labelled molecules or structures of heterogeneous label density such as chromatin (Fig 2b). Hi-D therefore constitutes an approach to extract dynamic parameters of biomolecules with dense labeling where SPT is unsuitable. One should, however, keep in mind that SPT and Hi-D are meant to analyze images from drastically different labeling conditions and should thus refrain from a direct comparison between single-locus dynamics analyzed by SPT and local bulk chromatin dynamics by Hi-D (Additional file 1: Note S4; Additional file 1: Movie S1).
To ensure that the calculated dynamics are not a consequence of imaging noise, we experimentally validated the sensitivity of the approach by calculating the MSD for formaldehyde fixed and living U2OS cells labelled by DNA-SiR in quiescence (- serum) or normal growth (+ serum). Diffusion constants derived from the MSD curves by Bayesian inference were about two orders of magnitude greater in living cells than in fixed cells (Fig 2c) confirming that Hi-D enables quantifying DNA dynamics well above the noise background. To confirm the robustness of extracted parameter values with respect to varying levels of imaging noise, Hi-D was applied to nuclei to which noise was artificially added. The signal-to-noise ratio (SNR) of the original nuclei were about 26 dB and subsequently reduced stepwise down to 16 dB. The distributions of computed diffusion constants were consistent up to a lower limit of about 20 dB, below which the distribution is considerably biased towards larger values and broadens (Fig 2d, Additional file 1: Fig S8). Likewise, features of the spatial map of diffusion constants were equally conserved for SNR values as low as 20 dB, demonstrating the robustness of Hi-D for varying imaging noise (Additional file 1: Fig S8). In analogy to the robustness to varying SNR levels, Hi-D is thus robust to photobleaching effects (if SNR ≥ 20dB) since flow fields are only estimated between consecutive images, for which illumination changes due to photobleaching are usually negligible. Furthermore, Hi-D was also shown to be robust to small variations in time intervals of acquired time series as long as the expected motion between frames was in the order of the pixel size (Additional file 1: S9). We further validated Hi-D against iMSD, a well-established method to extract dynamic information of dense molecules, based on the spatial correlation function of intensity fluctuations caused by diffusing molecules, which are recorded using camera-based systems [23]. Using successive calculations of iMSD to overlap regions of interest, we computed a diffusion map similar to Hi-D derived maps (Additional file 1: Note S5). Quantitatively, both methods yield diffusion constants of the same order of magnitude (Hi-D (1.6 ± 0.8) · 10−3 μm2/s, iMSD (2.2 ± 4.5) · 10−3 μm2/s, mean ± standard deviation), which are consistent with reported values using SPT and correlation spectroscopy methods applied to interphase chromatin [14,18]. However, the distribution of values derived by iMSD was considerably broader than the distribution revealed by Hi-D (Fig 2f). The distribution of anomalous exponents computed by iMSD showed many spurious values at the limit of the scale, while Hi-D consistently returns reasonable values (Fig 2g). We thus conclude that Hi-D reveals dynamic parameters of the same order of magnitude as iMSD but is advantageous in the estimation of multiple parameters simultaneously, by virtue of the featured Bayesian model selection. Hi-D is thus an accurate, robust, fast and easy to use tool to determine dynamics of macromolecules nucleus-wide.
Single-cell biophysical property maps of genome conformation and behaviour
To concomitantly monitor position and distribution of the DNA mobility populations under different biological conditions, we determined Hi-D maps of the same serum-starved and then stimulated cell (Fig 3a). Transcription is largely inhibited in cell-cycle arrested cells grown in a serum-free medium (Additional file 1: Fig S10). Adding serum to the medium stimulates mRNA production through transcriptional activity [20,24–26]. As above, diffusion constants of DNA motion were calculated for each pixel based on the model selected by Bayesian inference. Small diffusion constants characterized motion of chromatin prominently located at the nuclear envelope (dark blue). Plotting the average diffusion constants versus the distance from the nuclear periphery showed that the mobility within a rim of 1 µm from the periphery increases linearly before adopting a nearly constant value in the inner volume of the nucleus (Additional file 1: Fig S11). At numerous sites across the remaining nuclear volume, fast diffusive areas of irregular dimensions spanning 0.3 − 3 μm in diameter (yellow areas in Fig 3a) are embedded in the bulk of moderately dynamic chromatin. Areas of different parameter values seamlessly transition into one another without clearly defined boundaries, reminiscent of spatially correlated chromatin dynamics [20]. Upon serum stimulation, the spatial distribution of high and low diffusion constants was largely conserved (compare the presence of yellow regions in the quiescent and serum stimulated cell in Fig 3a), but the diffusion constant was globally strongly reduced by nearly one order of magnitude. Deconvolution of the distribution of diffusion constants and labelling of pixels according to the mobility population determined by the GMM (Fig 3b; slow - red, intermediate – orange and fast - yellow) yields a map in agreement with this observation. Deconvolution hence classifies regions according to the values of a given parameter compared to other regions within the same nucleus. When nuclear activity is modulated, changes in this classification can be measured. In particular, the fast diffusing population with respect to the bulk chromatin in quiescent cells is reduced upon serum stimulation (Fig 3c) and re-classified as intermediate population. Connected areas with high mobility appear eroded (Fig 3b). In contrast, the slow population occupying ∼ 6 % of the nuclear area, which is almost exclusively located at the nuclear periphery, was invariant to transcriptional changes (Fig 3b, c). Despite considerable reorganization of the relative distribution of mobility populations and overall reduced intensity of motion, the type of diffusion governing the nuclear parameter maps showed only moderate changes upon stimulation of transcriptional activity (Fig 3d).
Anomalous diffusion dominated across the entire nucleus (0.3 ≤ α ≤ 0.73) forming a mosaic-like pattern, which underwent, compared to maps of diffusion constants, considerable remodeling upon transcriptional activation (Fig 3e, f). Within this pattern, patches of super-diffusive (red: α > 1) motion segregated into distinct islands which became more fragmented upon serum stimulation. Random contacts or re-distribution of existing contacts of the chromatin with itself may give rise to such variations in anomalous exponent upon serum stimulation [27]. Because the diffusion constant of chromatin fibers appears unaffected for moderate degrees of cross-linking [28], we expect that association of proteins with DNA upon serum stimulation could favor global decrease of mobility in vivo. Hi-D reveals high-resolution spatial changes in mobility and in anomaly of chromatin diffusion in single cells. Further investigation may tell us if all or a subset of visible physical domains correspond to the ones determined using chromosome conformation capture (Hi-C).
Transcription modulates chromatin and RNA polymerase II motion
To further explore the relationship between global chromatin dynamics and transcriptional activity, we examined the dynamics of RNA polymerase II (RPB1-Dendra2; RNA Pol II) in live U2OS cell nuclei (Fig 4a) at different transcriptional states. Hi-D analysis resolved three mobility populations of RNA Pol II (Fig 4b), which is consistent with the existence of three kinetically different groups of RNA Pol II based on the half-life of chromatin-binding [29,30]. Diffusion constants of the three dynamic populations in actively transcribing cells (grown in normal condition) were significantly greater compared to transcriptionally less active cells (serum-starved cells) (Fig 4b). In quiescent cells, the fraction of quickly diffusing RNA Pol II complexes was reduced compared to actively transcribing cells. Upon elongation inhibition using 5,6-Dichloro-1-β-D-ribofuranosylbenzimidazole (DRB), the slowly diffusing fraction was greater than in untreated cells, indicating tenacious immobilization of RNA Pol II on the DNA template after initiation (Fig 4c). The average diffusion constants in serum starved and DRB treated cells stayed roughly unchanged in all three populations, suggesting that RNA Pol II is unbound in the absence of serum [25].
We then compared the effect of transcriptional activity on chromatin dynamics in serum-starved and –stimulated cells. In contrast to RNA Pol II mobility, the average diffusion constant of DNA in serum-starved U2OS nuclei decreased by nearly one order of magnitude for all three populations upon addition of serum. Arresting RNA Pol II before elongation did not change the observed diffusion constants, compared to undisturbed transcription (Fig 4d). These cell-population results are consistent with results from single-cell analyses (Fig 3) and strengthen our hypothesis that nuclear processes considerably hamper diffusion of chromatin. In a quiescent state, only essential nuclear processes are maintained. Fewer protein complexes acting upon DNA in a could facilitate motion of the chromatin fiber. Upon serum stimulation, binding of transcription factor complexes and other proteins to DNA increase crowding, reduce the freedom to move and hence the apparent chromatin dynamics, at least in a subset of domains. Increased DNA-protein interactions and interchromatin contacts also enhance spatial correlation of chromatin dynamics in serum supplemented compared to quiescent cells [20]. Serum addition to starved cells likely stimulated RNA Pol II binding to DNA. When inhibiting elongation, transcription factories are still present [24] and, in agreement with chromatin coherence, DNA mobility remains constrained [20].
Independently of the culture conditions, a ground-state Rouse-like behavior characterizes chromatin in the examined nuclei (MSD fit with α close to 0.5) [31,32]. Upon serum stimulation of starved cells, anomalous diffusion became predominant and its value (α ∼ 0.33) is indicative of entangled polymers [33]. This behavior was also determined for a single labelled site next to an actively transcribing gene [13]. Entanglement could stem from random DNA-protein contacts, a model coherent with polymer simulations inspired by chromosomal capture data [34]. Hindered motion of chromatin and RNA Pol II is thus a direct consequence of forming transcription ‘hubs’ or factories to which chromatin is tethered [25].
Chromatin dynamics is uncoupled from compaction
We next asked if chromatin dynamics are influenced by the compaction of chromatin since heterochromatin is widely believed to be less dynamic than euchromatin [16]. Eu- and heterochromatin domains were determined in serum-starved and –stimulated cells by quantifying fluorescence intensity as described in [35] (Fig 5a). We found that the average flow magnitude between successive frames was independent of the compaction state of chromatin (Fig 5b). Likewise, the distribution of diffusion constants did not correlate with chromatin density or euchromatin and heterochromatin (Fig 5c). Peripheral heterochromatin overlapped with the slow motion domain at the nuclear rim (Fig 5d) consistent with previous findings [16]. In the inner nuclear volume however, we did not observe any tendency of heterochromatin being associated with a specific mobility population. Instead, we observed that mobility populations were distributed randomly among euchromatin and heterochromatin regions (Fig 5e) with the exception that in serum-starved cells, the heterochromatin fraction was slightly enriched in the slow diffusing population at the nuclear periphery. Furthermore, we found that regions characterized by a specific anomalous exponent did not preferentially overlap with either eu- or heterochromatin (Fig 5f, g). These results also hold for MCF7 cells and different fluorescent markers for chromatin (Additional file 1: Fig S12). These findings were confirmed in NIH3T3 cells expressing GFP-HP1α, a well-established marker for heterochromatin (Additional file 1: Fig S13). In addition, Hi-D analysis of HP1α hints towards previously proposed liquid phase-separation [36]. Our results thus suggest that chromatin undergoes diffusion processes which are, in general, unrelated to the compaction level of chromatin. However, compact chromatin is characterized by increased contact frequency of the chromatin fiber with itself, which could enhance the extent of coherent chromatin motion. To test this hypothesis, we calculated Moran’s Index of Spatial Autocorrelation [37] for the flow magnitude assessed at different time lags in eu- or heterochromatin (Fig 5h). We found that heterochromatin exhibits enhanced spatial autocorrelation compared to euchromatin across all accessible time lags. Furthermore, the spatial autocorrelation decreases with increasing time lags in serum-starved cells, while in serum-stimulated cells, autocorrelation is enhanced in the long-time limit (over 30 seconds). This finding points to active processes establishing spatial coherence in the long-term [18,20] while random processes such as thermal fluctuations decrease autocorrelation at time scales greater than 10 seconds in serum-starved cells.
DISCUSSION
Hi-D analysing the dynamics of dense structures such as chromatin and RNA Pol II directly in single cells without losing active fluorophore density and with no need for prior experience in sophisticated labelling preparations or advanced microscopy [38]. We show that Hi-D is an accurate and robust read-out of chromatin dynamics and that the information gain through image analysis afforded by Hi-D alleviates the incompatibility of conventional microscopy for nanoscale mapping of properties of nuclear dynamics in living cells.
Hi-D analysis revealed that DNA dynamics can roughly be classified into three sub-populations within the mammalian nucleus. The use of three distinct populations does not reflect real, likely fluctuating transitions between the populations, but is a handy means to characterize the highly complex and heterogeneous dynamic landscape of chromatin. The first population, a slow mobility fraction prominently located at the nuclear rim, is reminiscent of lamina associated domains (LADs) [39]. The dynamic response of this population to transcriptional stimuli supports the hypothesis that LADs play an important role in attenuating transcription activity and in controlling gene expression [40]. Although less mobile chromatin at the nuclear periphery largely overlapped with long known perinuclear heterochromatin [41,42], Hi-D analysis remarkably points to an overall absence of correlation between chromatin compaction and mobility. Indeed, two non-peripheral chromatin subpopulations which display intermediate and highly diffusive regimes, are distributed in a mosaic-like pattern throughout the nucleus and include sections of intranuclear heterochromatin. Heterochromatin therefore does not exhibit low mobility in nuclear space in general, but may be divided into a more viscous component with reduced mobility due to anchoring to a nuclear lamina and a more rigid LAD component tethered to nuclear structures [39,43–45]. These results are coherent with the hypothesis that heterochromatin domains are formed by a liquid phase-separation mechanism characteristic of mobility reduction across phase boundaries [36]. The extent of the third, highly mobile fraction, which dominated in the quiescent state, decreased dramatically when cells were serum stimulated. This switch in chromatin mobility is suggestive of altered DNA-protein interactions and accrued local concentration of proteins in a highly transcribing nucleus [25].
Heterogeneous chromatin motion arises due to irregular protein binding along the chromosomes and can lead to thermodynamic or electrostatic self-organisation of nuclear compartments [46,47]. Local patches of large anomalous exponents indicate super-diffusive behaviour of chromatin which may result, among others, from active noise acting on the chromatin fiber [48,49], even in quiescent cells. Chromatin patches with α < 0.3 and α > 1 respectively correspond in size to one or a few DNA loops [50]. These two types of patches are present as islands within the general chromatin fraction governed by an anomalous exponent 0.3 < α < 1. This organization is also in good agreement with the chromosome territory – interchromatin compartment model [2]. In conclusion, the combination of diffusion constant and anomalous exponent maps provides an integrated view on chromatin dynamics and yields insights into possible mechanisms driving dynamics as well as the local chromosomal organization and re-organization during genomic processes.
Our results support the hypothesis that a short treatment of DRB is sufficient to halt most of a cell’s transcription [51], but it is believed to have a limited effect on chromatin configuration of promoters, or on upstream events and assembly dynamics of the preinitiation complex (PIC) [52]. Our results suggest that a few minutes of DRB treatment affects only the assembly dynamics of the RNA Pol II itself. In contrast, longer treatment with DRB as well as prolonged serum starvation, are expected to result in massive PIC disassembly. We believe therefore that increased chromatin mobility upon serum starvation correlates with PIC breakdown. Conversely, when serum is present in the medium, activity of RNA Pol II and of PIC enzymes is restored and PIC will assemble onto DNA, both pervasively across the genome at background levels (pervasive transcription) and more stably at genes [53,54]. Reduced mobility of chromatin in the presence of serum suggests that stable PIC binding may serve as an anchoring function for individual chromatin fibres, on top of its essential function in transcription initiation.
Hi-D can be applied to real-time imaging of any abundant fluorescent molecule to obtain comprehensive maps of their dynamic behaviour in response to stimuli, inhibitors or disruptors of nuclear functions and integrity.
METHODS
Cell Culture
A Human U2OS osterosarcoma cell line (for DNA imaging) and MCF-7 cells (ATCC) were maintained in Dulbecco’s modified Eagle’s medium (DMEM) containing phenol red-free and DMEM-12 (Sigma-Aldrich), respectively. For RNA Pol II imaging, a U2OS cell line stably expressing RPB1 (subunit of RNA Pol II) fused with Dendra2 was constructed as previously described in [55]. Medium was supplemented with Glutamax containing 50 μg/ml gentamicin (Sigma-Aldrich), 10% Fetal bovine serum (FBS), 1 mM sodium pyruvate (Sigma-Aldrich) and G418 0.5 mg/ml (Sigma-Aldrich) at 37°C with 5% CO2. Cells were plated for 24 h on 35 mm petri dishes with a #1.5 coverslip like bottom (μ-Dish, Ibidi, Biovalley) with a density of about 105 cells/dish.
DNA staining
U2OS and MCF-7 cell lines were labelled by using SiR-DNA (SiR-Hoechst) kit (Spirochrome AG). For DNA was labelled as described in [56]. Briefly, we diluted 1 mM stock solution in cell culture medium to concentration of 2 μM and vortexed briefly. On the day of the imaging, the culture medium was changed to medium containing SiR-fluorophores and incubated at 37°C for 30-60 minutes. Before imaging, the medium was changed to L-15 medium (Liebovitz’s, Gibco) for live imaging.
Cell starvation and stimulation
For starvation mode, cells were incubated for 24 h at 37°C before imaging with serum-free medium (DMEM, Glutamax containing 50 μg/ml gentamicin, 1 mM sodium pyruvate, and G418 0.5 mg/ml). Just before imaging, cells were mounted in L-15 medium. For stimulation, 10% FBS was added to the L-15 medium for 10-15 minutes.
Flow cytometry analysis with Hoechst and Pyronin Y staining
To differentiate cells in G0 versus G1, double staining of Hoechst 33324 and Pyronin Y was used, as previously described [26,57]. Briefly, U2OS cells representing each condition (with serum, without serum during 24h and without serum during 24h + 15 min serum) were trypsinized, cells were collected into phosphate-buffer saline (PBS) at a concentration of 2×106 cells/ml, then added to a fixative of ice-cold 70% ethanol. Cells were fixed for at least 2h and washed twice with FACS buffer (1xPBS supplemented with 2% (v/v) heat-inactivated, sterile-filtered fatal bovine serum, 1 mM EDTA). Cells were then incubated in a water bath pre-adjusted to 37°C for 45 min in the dark with 2 µg/ml of Hoechst 33324 (Invitrogen, H3570) diluted in FACS buffer, then incubated to 37°C for 30 min in the dark with 4 µg/ml of Pyronin Y (abcam, ab146350), diluted in FACS buffer. The samples were kept in the dark at 4°C until analyzed using a LSR II flow cytometer (BD Biosciences). Hoechst 33342 and Pyronin Y staining were measured with a UV (350 nm) and yellow green (561 nm) lasers, respectively. DNA content was determined by Hoechst 33342 and RNA content was determined by Pyronin Y. Cells in G0 were identified as the population with 2N DNA content and RNA content lower than the level of cells assigned to G1 phase. Medians of Pyronin Y staining for each condition were compared to assess the increase in mRNA production after serum addition.
DRB treatment
The U2OS cells for both DNA and RNA Pol II images were treated with 100 μM 5,6 Dichlorobenzimidazole 1-β-D-ribofuranoside (DRB; Sigma-Aldrich) for transcription inhibition prior to live-cell image acquisition. DRB was diluted in the L-15 (Leibovitz) imaging medium that was supplemented with 10% FBS, DMEM, Glutamax containing 50 μg/ml gentamicin, 1 mM sodium pyruvate, and G418 0.5 mg/ml. The imaging medium was changed with fresh L-15 medium containing DRB and incubated under the microscope for 15 min before imaging.
Cell fixation
U2OS cells were washed with a pre-warmed (37 °C) phosphate buffered saline (PBS) and followed by fixation with 4% (vol/vol) Paraformaldehyde in PBS for 10-20 min at room temperature. Images were recorded at room temperature in PBS, after washing the cells with PBS (three times, 5 min each).
DNA live cell imaging
Cells were placed in a 37 °C humid incubator by controlling the temperature and CO2 flow using H201-couple with temperature and CO2 units. Live chromatin imaging was acquired using a DMI8 inverted automated microscope (Leica Microsystems) featuring a confocal spinning disk unit (CSU-X1-M1N, Yokogawa). An integrated laser engine (ILE 400, Andor) was used for excitation with a selected wavelength of 647 nm and 140 mW as excitation power. A 100x oil immersion objective (Leica HCX-PL-APO) with a 1.4 NA was chosen for a high-resolution imaging. Fluorescence emission of the SiR–Hoechst was filtered by a single-band bandpass filter (FF01-650/13-25, Semrock, Inc.). Image series of 150 frames (5 fps), with exposure time of 150ms per frame were acquired using Metamorph software (Molecular Devices) and detected using sCMOS cameras (ORCA-Flash4.0 V2) and (1×1 binning), with sample pixel size of 65 nm. All series were recorded at 37°C.
RNA Pol II live cell imaging
Image series of 150 frames were recorded with an exposure time of 200 ms using a Nipkow-disk confocal system (Revolution, Andor) featuring a confocal spinning disk unit (CSU22, Yokogawa). A diode-pumped solid-state laser with a single wavelength of a 488 nm (Coherent) at 5-10% laser power was used for excitation of RPB1 fused to Dendra2. A 100x an oil immersion objective (Plan Apo 1.42, Nikon) was used for imaging. The fluorescent emission signal was filtered through an emission filter (ET525/30-25, Semrock, Inc.), and detected at 512/18 nm on a cooled electron multiplying charge-coupled device camera (iXon Ultra 888), with sample pixel size of 88 nm.
Image processing
Denoising
Raw images were denoised using non-iterative bilateral filtering [58]. While Gaussian blurring only accounts for the spatial distance of a pixel and its neighbourhood, bilateral filtering additionally takes the difference in intensity values into account and is therefore an edge-preserving method. Abrupt transitions from high- to low-intensity regions (e.g. heterochromatin to euchromatin) are not over-smoothed. Images of varying noise levels were treated with a bilateral filter with half-size of the Gaussian bilateral filter window of 5 pixels, the spatial-domain standard deviation value was set to 5 pixels and the intensity-domain standard deviation was varied from 0.3 to 0.8 for decreasing levels of the signal-to-noise ratio from 26 dB to 16 dB (SNR = 10 log10(I2/σ2), with the signal power I2 and the noise variance σ2).
MSD analysis and model selection by using Bayesian inference
In order to carry out a MSD analysis locally, the spatial dependency of the Mean Squared Displacement (MSD) can be written explicitly: where is the position at time t of a virtual particle with initial position , τ = {Δt, 2Δt, …, (N − 1)Δt} are time lags where Δt is the time difference between subsequent images and the average <·>t is taken over time. The resulting MSD is a function of the initial position and the time lag τ.
MSD models
The MSD can be expressed analytically for anomalous diffusion (DA), confined diffusion (DR) and directed motion (V) in two dimensions as where Dα is the diffusion constant in units of μm2/sα, α is its anomalous exponent, v [μm/s] its velocity and RC [μm] is the radius of a sphere within the particle is confined [59]. The case α = 1 is known as free diffusion, 0 < α < 1 corresponds to anomalous diffusion and 1 < α ≤ 2 corresponds to superdiffusion. Strictly speaking, each generalized diffusion constant Dα has different units, corresponding to the specific value of α. However, we refer to it as the diffusion constant D throughout the text for simplicity. Additionally to eq. (1)-(3), different types of motion can appear overlaying, resulting in a linear combination of the equations above. For example, anomalous motion can be superimposed on an underlying drift and the resulting MSD reads MSDDAV (τ) = MSDDA(τ) + MSDV (τ). We found that anomalous and confined diffusion appears very similar in experimental data and therefore decided in favor for anomalous diffusion to describe our data (Additional file 1: Note S3). The abbreviations used in this study are summarized in Table 1. As experimental data is usually subject to noise, a constant offset o is added to every model.
MSD model selection
The MSD is calculated for every pixel independently, resulting in a space- and time lag-dependent MSD. It is known that living cells can behave largely heterogeneous [3,60]. Ad-hoc, it is not known which diffusion model is appropriate. Fitting an MSD curve with a wrong model might result in poor fits and highly inaccurate determination of the mentioned parameters. For this reason, we use a Bayesian inference approach to test different models for any given MSD curve as proposed by Monnier et al. [21]. Given the data Y = {Y1, …, Yn} and K model candidates M = {M1, …, MK}, each with its own (multidimensional) parameter set θ = {θ1, …, θK}, we want to find the model Mk(Y, θk) such that the probability that Mk(Y, θk) describes the data, given the set of models to test, is maximal. By Bayes’ theorem, the probability for each model is given by
If there is no reason to prefer one model over the other, the prior probability of each model P(Mk) is equal. The parameter set which is used to describe the data, given a fixed model, strongly influences the probability. Therefore, it is crucial to estimate the optimal parameters for every model in order to calculate the model probabilities. The probability that the data Y is observed, given the model Mk described by the model function Mk(x; θk) and any parameter set θk is approximated by a general multivariate Gaussian function [61] where C is the empirical covariance matrix of the data and the prefactor is a normalizing factor. This equation has an intuitive meaning. Assume we test a model Mk parametrized by θk to find out if it describes the data Y. The exponential function consists of the term [Y − Mk(x; θk)], i.e. the residuals of the data and the describing model. If the residuals are small, i.e. the model describes the data well, the exponent is small and the probability P(Y|θk, Mk) seeks 1. On the other hand, the worse the fit, the greater the resulting residuals and the probability seeks asymptotically to 0. The factor C−1 accounts for the covariance in the data. The covariance matrix for a set of MSD curves normally shows large values for large time lags as the uncertainty increases and MSD curves diverge. The covariance matrix implicitly introduces a weight to the data, which is small for large variances and large where the data spreads little. This fact avoids cutting of the MSD curve after a specific number of time lags, but instead includes all available time lags weighted by the covariance matrix. The approach is illustrated in (Additional file 1: Fig S2b) with the covariance matrix exemplary shown in the inset. In case of uncorrelated errors, non-diagonal elements are zero, but the approach keeps its validity [62] and follows an ordinary least-squares regression.
Given the best estimate of the parameter set for a model, the model and its corresponding parameters are chosen so that their probability to describe the data is maximal: . It has to be stressed that values of the anomalous exponent scatter around 1, but do not assume the value 1 (e.g. Fig 1c, middle panel). This is due to the model selection procedure, selecting the simplest model which is consistent with the data. In the case that the underlying motion is well described by free diffusion, α is inherently set to 1 and classified as free diffusion rather than anomalous diffusion. The descriptions of free diffusion or anomalous diffusion with α = 1 are equivalent, but the free diffusion model contains one parameter less and is therefore preferred leading to ‘missing’ α values close to 1 in the parameter maps and histograms. To carry out the MSD analysis locally, we choose to take the 3×3 neighborhood of a pixel, detect possible outliers therein by the interquartile range criterion [63] and calculate the error covariance matrix of the data within the pixel’s neighborhood. The restriction to a single pixel and its neighborhood allows us to carry out the MSD analysis of trajectories locally, in contrast to an ensemble MSD in previous studies [18], revealing only average information over many trajectories. The choice of a 3×3 window is reasonable with regard to the equivalently chosen filter size in the Optical Flow estimation. The flow field in this region is therefore assumed to be sufficiently smooth. All calculations, except for the General Mixture Model analysis, were carried out using MATLAB (MATLAB Release 2017a, The MathWorks, Inc., Natick, Massachusetts, United States) on a 64-bit Intel Xeon CPUE5-2609 1.90 GHz workstation with 64 GB RAM and running Microsoft Windows 10 Professional.
Deconvolution of sub-populations
Regarding the distribution of diffusion constants, an analytical expression can be found assuming that the diffusion constant was calculated from a freely diffusing particle (α = 1)[64]. However, we find anomalous diffusion to a large extent in our data (e.g. Fig 1c, Fig 2g, Fig 3f, Fig 4e and Fig 5g) and, to our knowledge, an analytical expression cannot be found for distributions of anomalous exponent, radius of confinement and drift velocity. We therefore deconvolved the parameter sets in a rather general manner, for which we use a General Mixture model (GMM), a probabilistic model composed of multiple distributions and corresponding weights. We describe each data point as a portion of a normal or log-normal distribution described by respectively. The logarithmic mean m and standard deviation s are related to the mean and standard deviation of the normal distribution via [65].
We consider up to three subpopulations to be found in our data and model the total density estimate as a superposition of one, two or three subpopulations, i.e. the Mixture Model reads for both normal and log-normal distributions, where to sum goes to 1, 2 or 3, respectively. The variable wk describes the weights for each population (or component), which satisfy 0 ≤ wk ≤ 1 and sum up to unity. The weights of each component are directly proportional to the area of the histogram covered by this component and therefore its existence in the data set.
General Mixture Model analysis
Let Y = {Y1, …, Yn} denote n data points. For the scope of this description, assume Y to be a one-dimensional variable. Further assume that the data cannot be described by a single distribution, but by a mixture of distributions. A deconvolution of the data into sub-populations faces the following problem: Given a label for each data point, denoting the affiliation to a population, one could group corresponding data points and find the parameters of each population separately using a maximum likelihood estimation or other methods. On the other hand, if we had given the model parameters for each population, labels could in principle be inferred from the likelihood of a data point being described by a population or another. The problem can be formulated by Bayes’ rule (M indicates model, D indicates data)
Here, P(M|D) is the posterior probability of the model given the data, which is the aim to calculate. We assign a data point to the component, which maximizes P(M|D). The probability to observe the data given a model is described by P(D|M), i.e. the likelihood function. P(M) is the prior for the models to be chosen from. In our case, we have no prior beliefs on the models (all models are equally likely) such that P(M) is uniform. Lastly, the probability P(D) does not depend on the models and can therefore be dropped.
Neither labels, that is P(M|D), nor model parameters and weights are known a priori. The problem can be approached by an Expectation-Maximization (EM) scheme: Without any prior beliefs about the data distribution, one starts with a simple estimate of model parameters, e.g. a k-means clustering estimate and iterates subsequently between the two following steps until convergence:
Expectation step
Calculation of the probability that the component with the current parameter estimate generated the sample, i.e. P(D|M).
Maximization step
Update the current parameter estimate for each component by means of a weighted maximum likelihood estimate, where the weight is the probability that the component generated the sample.
We illustrate the results of the EM algorithm exemplary in Additional file 1: Fig S4. From the input data (Additional file 1: Fig S4a), represented as histogram, both the likelihood P(D|M) (Additional file 1: Fig S4b) and the posterior (Additional file 1: Fig S4c) is obtained. The sum of sub-populations corresponds to the overall probability distribution (shown in black) with different model parameters and weights found by maximizing the likelihood function. The posterior describes the probability of data points to fall under each population, i.e. ∑k P(Mk|D) = 1. The data points are assigned to those population, for which P(Mk|D) is maximum, resulting in labeled data. The labels are subsequently mapped in two dimensions, visualizing spatial correspondence of slow, intermediate and fast sub-populations (Additional file 1: Fig S4d). The GMM analysis is carried out using the pomegranate machine learning package for probabilistic modeling in Python [66].
Selection of subpopulations by the Bayesian Information Criterion (BIC)
A priori, it is not unambiguously clear from how many populations the data is sampled and which form the subpopulations take. We therefore assess the suitability of each model by means of the Bayesian Information Criterion (BIC), which is calculated by [67] where is the maximum likelihood of the maximum likelihood estimation (MLE) estimate, p denotes the number of parameters in the model and n is the number of data points used for the fit. Among a family of models, the one with the lowest BIC is considered to describe the data best, taking into account competing complexity of models. A large likelihood of a model favors it to describe the data well, while on the other hand the model is penalized if many parameters are involved in the model by the second term in eq. (4). Therefore, the BIC prevents overfitting. In order to judge which model is appropriate for our data, we tested all considered models for each histogram and assessed the optimal model by means of the BIC. The fraction of all histograms which described best by one of the six models considered is given in Table 2. Based on the objective judgement of the fit using the BIC, we chose for each parameter the model which best describes the largest fraction of histograms (Table 2, bold cells).
AVAILABILITY OF DATA AND MATERIALS
Raw microscopy data is made publicly available [68]. The source code of the Hi-D method is available under the GNU General Public License [69].
FUNDING
This work was supported by grants (to K. B.) from the ANR, IDEX Toulouse strategic actions, the Foundation ARC and INSERM Cancer and Epigenetics.
ETHICAL APPROVAL
Not applicable.
AUTHOR CONTRIBUTIONS
H.A.S. conceived the project, performed the experimental work, developed the data processing algorithm and analyzed the data; R.B, developed the data processing algorithm, wrote the code, carried out simulations and analyzed the data; L.R. performed flow cytometry experiment; H.A.S. and K.B. interpreted the results; H.A.S., R.B. and K.B. wrote the manuscript.
CONFLICT OF INTEREST
The authors declare no competing financial interests.
ADDITIONAL FILES
Additional File 1 (AdditionalFile1.pdf 3.63 Mb) contains details on the conversion from flow fields to trajectories, detailed discussion on the distinction between confined and anomalous diffusion and a comparison of Single Particle Tracking and Optical Flow for reconstruction of motion in intermediate to high labelling density scenarios and further discussion on the comparison of Hi-D and iMSD.
ACKNOWLEDGMENTS
We thank Maxime Dahan (Curie Institute) for gifting the U2OS cell line expressing RPB1, Amy Strom (Gary Karpen Lab, UC Berkeley, USA) for providing images of HP1 in NIH3T3 cells, Silvia Kocanova (CBI Toulouse) for assistance with image acquisition, Raphael Mourad (CBI Toulouse) and Genevieve Fourel (ENS Lyon) for fruitful discussions, Alain Kamgoué (CBI Toulouse) for assistance with computer cluster set-up, and Fatima-Ezzahra L’Faqihi (CPTP-INSERM U1043, TRI) for technical assistance with the Flow cytometry experiments. We acknowledge support from the LITC imaging platform, CBI Toulouse.